
Private AI Doesn't Mean What Most Vendors Say It Means

A $1.4M lesson from one industrial firm — and the three tests your compliance officer should apply to every AI vendor in your building.

A Fortune 500 industrial firm I worked with spent eighteen months and $1.4M on an AI deployment before their compliance team discovered what the vendor's Data Processing Addendum didn't say in plain English.

The prompts — which included vendor pricing, project cost codes, and personally identifiable employee data — were traveling from the firm's network to a third-party data center the firm did not own, did not audit, and in two of the three months they checked, did not know the physical location of. The DPA said "prompts are processed in secure infrastructure with industry-standard encryption." That sentence was technically true. It was also, in every meaningful sense, a lie by omission.

The contract paused the day the network trace landed on the CIO's desk. Eighteen months of work. $1.4M of spend. A pilot the operations team actually liked. All of it frozen because nobody had asked the question hard enough at procurement.

Here's the part that matters. The vendor was not some fly-by-night AI startup. It was a major, public, widely-deployed platform you've heard of. It was doing exactly what its DPA said it would do. The firm's compliance team was not surprised that they caught a vendor being loose with data; they were surprised that nobody in the evaluation process had known to check.

This is the conversation every mechanical, industrial, and manufacturing contractor is going to have with their legal counsel in 2026. Most of them will have it late. These are the three tests I'd use if I were sitting in the evaluation seat.

Test 1: Does the prompt leave your VPC at any point during inference?

This is the first question and it's the one most vendors won't answer in a straight sentence.

A prompt — what you type in, what your agent sends, what your ERP integration passes — contains your business data. For a mechanical contractor, a single prompt might include a project number, a vendor's bid pricing, a change order narrative mentioning a client's name, and the cost-code breakdown tied to your margins. That data is not generic business language. It is the operational substance of your company.

If the prompt leaves your VPC — even for a millisecond, even "with encryption in transit," even to "a secure region" — it has entered infrastructure you do not control. At that point, three things become true. You are dependent on the vendor's access controls rather than your own. You are dependent on the vendor's data retention promises rather than your policy. And you are subject to whatever subpoena, court order, or state-level compulsion process applies to the vendor's jurisdiction, not yours.

"We don't retain the data" is not a mitigation. It is a promise. Promises are not an auditable control.

The right architecture runs inference inside your VPC — on infrastructure you own or that the vendor operates as a single-tenant dedicated deployment inside your network. Prompts don't leave. Responses don't leave. If a compliance officer wants to verify this, they can. Not with a DPA. With a network egress monitor.

When you're evaluating a vendor, the straight form of this question is: "Show me the network diagram of a production deployment and point to the boundary where prompt data crosses." If the vendor hesitates, your answer is already in.
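To make "verify it with a network egress monitor" concrete, here is a minimal sketch of the kind of check a compliance team could run against a captured connection list or flow log: given your VPC's CIDR blocks, flag any destination outside them. The CIDR ranges and connection tuples below are hypothetical stand-ins, not a real deployment's values.

```python
import ipaddress

# Hypothetical VPC CIDR blocks -- substitute your own network ranges.
VPC_CIDRS = [ipaddress.ip_network("10.20.0.0/16"),
             ipaddress.ip_network("172.31.0.0/16")]

def offending_destinations(connections):
    """Return the (dest_ip, dest_port) pairs that fall outside the VPC.

    `connections` is a list of destination tuples, e.g. exported from a
    VPC flow log or an `ss -tn` capture on the inference host.
    """
    leaks = []
    for ip, port in connections:
        addr = ipaddress.ip_address(ip)
        if not any(addr in net for net in VPC_CIDRS):
            leaks.append((ip, port))
    return leaks

# Example: two in-VPC connections and one external one.
observed = [("10.20.4.17", 8443), ("172.31.9.2", 443), ("52.94.133.10", 443)]
print(offending_destinations(observed))  # -> [('52.94.133.10', 443)]
```

If a vendor's "prompts never leave your network" claim is true, this list stays empty during inference. If it doesn't, you have your answer without reading a single page of the DPA.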

Test 2: Who controls the model weights?

The second test is the one that gets missed even in careful evaluations, because it sounds abstract until it becomes urgent.

When you deploy an AI system, you're not just deploying software. You're deploying a specific model version with specific behavior, validated — ideally — against your business processes and your compliance requirements. That validation is expensive and slow. It's also load-bearing. Your estimating workflow, your AP reconciliation logic, your submittal triage process all get tuned to the model's behavior.

Now ask: who can change the model?

For most cloud-hosted AI platforms, the answer is the vendor. On their schedule. With "deprecation notice" windows that might be 30 days, might be 90. When they flip the switch, your validated workflow is now running against a model whose behavior you haven't tested. Your compliance posture — which you signed documents about — is invalidated the moment the model version increments.

This is not a theoretical risk. It happens every few months at every major foundation-model vendor. Large, widely-deployed models are updated. Previously working prompts start producing slightly different outputs. The edges of behavior shift. Your audit trail, if you have one, now spans multiple model versions that your team can no longer run side-by-side because the old one is no longer available.

The architecture that actually survives contact with compliance is one where model weights are frozen, versioned, and under your control. You decide when to move to a new model. You decide when to deprecate an old one. The version running on November 14th, 2026 is the same bit-for-bit version you can run six months later when your auditor asks what it would have produced.

For mechanical contractors doing federal work, doing prevailing-wage work, running under GSA schedules or DoD subcontracts, this isn't optional. The audit standard assumes you can reconstruct a specific output with the specific model weights and retrieval context that produced it. Cloud-hosted platforms with vendor-controlled model rotation cannot meet that standard. They can approximate it in policy. They cannot deliver it in evidence.

Test 3: Can you reconstruct any output, in under 30 seconds, six months later?

This is the test that turns policy into evidence.

A private AI deployment that takes compliance seriously produces a tamper-evident audit trail on every inference. Who prompted. What they prompted. Which retrieval sources the agent pulled — with hashes, so the retrieval itself is verifiable. Which model version responded. What the response was. When all of this happened, to the millisecond, on a clock you trust.

That audit trail is not optional telemetry. It is the substrate that lets you answer the question an auditor, a plaintiff's lawyer, or a federal reviewer will ask: "For this specific decision on July 12th, what did the AI see and what did it say?"

Most AI platforms marketed as enterprise-ready have some form of logging. Very few produce audit trails that survive a hostile forensic review. The difference matters because the discovery standard in a construction-related dispute — particularly one involving federal funding, prevailing-wage compliance, or safety — is not "the vendor had logs." It is "you can reproduce the decision with its inputs."

If you can't reproduce, you can't defend. If you can't defend, the decision lives in a legal penumbra where the AI's involvement becomes a liability rather than a productivity gain.

The straight form of this question during evaluation: "Show me how I'd reconstruct the exact output produced by a prompt from six months ago, including the model version and every retrieval source." If the vendor demonstrates this in a live session, they pass. If they send you a whitepaper, they don't.

Why this matters more for mechanical contractors than most industries

Most industries get to treat these tests as risk management. Mechanical contractors increasingly don't have that luxury. Three specific regulatory and business realities are converging.

First, federal and state-funded work is growing as a share of the mechanical contractor pipeline. Infrastructure-adjacent work under IIJA, energy-project work, GSA buildings, DoD installations. All of it carries data-handling requirements that were written assuming your AI either didn't exist or was fully on-premises. Vendor-cloud AI introduces a category of risk the regulators haven't fully caught up to, but the contracting officer will catch it when they audit.

Second, the joint-venture and partnership structures common in large mechanical projects create NDA stacks where your AI's data handling becomes a question for your JV partner's compliance team, not just yours. A partner's veto on your AI vendor choice is a real thing. It has stopped deployments mid-flight.

Third, the prevailing-wage data and personnel data that flows through your ERP — WH-347 submissions, certified payroll — is regulated with line-by-line specificity. It is not "business data." It is classified under specific statutes with specific penalty structures. Any AI system that touches that data needs to be evaluated against those statutes directly, not against a general "enterprise-grade security" claim.

For contractors who cross these regulatory thresholds, the vendor-cloud AI path closes faster than you'd think.

What the alternative actually looks like in production

The architecture that survives these tests is not theoretical. DKubeX 2.0, which is what we build, runs inference inside customer VPCs or on-premises in customer data centers. Model weights are frozen, versioned, and under the customer's control. The audit trail is cryptographically chained with tamper-evident hashes. A production deployment at a mid-sized mechanical contractor looks like a single rack in their server room, or a dedicated enclave in their existing cloud account, serving 300-500 users.

The twelve-week pilot cost is $25K for a scoped wedge use-case. The ROI test is agreed on at kickoff. The audit trail works from day one of the pilot, not after a compliance hardening phase at production cutover.

The contrast that matters to a CFO: instead of paying $200-$800K per year in cloud-AI subscription fees with an audit posture that doesn't actually audit, you pay a one-time pilot fee plus production licensing on infrastructure you already own — with an audit posture your compliance officer can defend in a deposition.

The contrast that matters to a CIO: instead of managing a vendor relationship where the model can change under you, you manage infrastructure where the model you validated is the model you run until you decide to change it.

Apply the three tests, starting today

If you take one thing from this, ask your current AI vendor the three questions in a single email: Does the prompt leave our VPC during inference? Can you change the model version without our approval? Can you demonstrate reconstruction of a six-month-old output in a live session?

If the answers are unclear, you've found your $1.4M problem before it's $1.4M.

If you want the faster version, the AI Readiness Index runs you through thirty questions that produce a scoped report covering your current AI posture, the three fastest-payback wedge use cases for your operation, and a rough compliance assessment you can hand to your legal counsel. Five minutes. The output is yours to keep, sales-call-free.

The reclaimed time from getting private AI right isn't just avoided risk. It's operational capacity that compounds. Every hour your operations team is freed from cognitive paperwork is an hour your company can redeploy into bid pursuit, service density, or margin protection. That's the math that turns this from an IT line item into a revenue conversation.

What's your compliance officer going to find?

---

*Ajay Tyagi is Senior Director at DKube, where he works with mechanical, industrial, and manufacturing contractors on private, agentic AI deployments. Twenty years of enterprise software taught him that every vendor's DPA is a contract and every production network trace is a fact.*