On every architecture call, the question that produces the longest pause is the same: do we want tenant-owned weights, or are shared embeddings sufficient? The pause is diagnostic. In nearly every case, the CIO has been operating on shared embeddings — the default for hosted AI services — without having made the decision to do so. The decision was made by procurement, or by the engineering team that integrated the first vendor, or by the security team that approved the data-handling addendum. It was not made at the architectural level it deserves.
The cost of an unconsidered choice on this question shows up around month nine. By then the deployment has accumulated enough institutional context that the difference between owning the weights and renting access to a shared embedding space starts to determine what the deployment can and cannot do. The decisions that should have been made at procurement become significantly harder to reverse.
What follows is the framework we walk every CIO through. It is not a rule. It is a decision tree that, in our experience, makes the trade-offs explicit enough that an actual choice can be made.
What "tenant-owned" actually means.
"Tenant-owned weights" is shorthand for an architecture in which the model — whether base, fine-tuned, or both — has its weight files written to infrastructure inside the buyer's tenancy, under the buyer's control. The buyer can mount the weights into a different runtime, export them, audit them, and ship them to a new cloud region without renegotiating the vendor contract. Crucially, the buyer's data never left the tenancy boundary in order to produce those weights. The fine-tune was performed inside the buyer's environment, against the buyer's corpus, and the resulting artifacts inherit the buyer's data classification.
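To make the portability and audit claims concrete, here is a minimal sketch of a weight audit, assuming the weight shards sit in an S3-compatible bucket inside the buyer's tenancy. The bucket name, prefix, and manifest shape are illustrative, not any specific vendor's layout.

```python
import hashlib

import boto3  # any S3-compatible endpoint inside the tenancy works


def audit_weights(bucket: str, prefix: str) -> dict:
    """Hash every weight shard under the prefix, producing an audit manifest."""
    s3 = boto3.client("s3")
    manifest = {}
    paginator = s3.get_paginator("list_objects_v2")
    for page in paginator.paginate(Bucket=bucket, Prefix=prefix):
        for obj in page.get("Contents", []):
            digest = hashlib.sha256()
            body = s3.get_object(Bucket=bucket, Key=obj["Key"])["Body"]
            # Stream in 8 MB chunks; weight shards are too large to read whole.
            for chunk in iter(lambda: body.read(8 * 1024 * 1024), b""):
                digest.update(chunk)
            manifest[obj["Key"]] = digest.hexdigest()
    return manifest


# Hypothetical bucket and prefix. The point is that the audit completes
# without a vendor API anywhere in the loop.
manifest = audit_weights("tenant-models", "ft-v1/")
```

The hashing is incidental. What matters is that every byte the audit touches lives in buyer-controlled storage, which is the portability claim in executable form.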
"Shared embeddings" is the opposite default. The buyer ingests data into a vendor's environment, the vendor produces embeddings against a model the vendor owns, and the resulting vectors live in a shared embedding space — sometimes a literal shared space across customers, more often a logically partitioned slice of one. The vendor controls the model used to produce future embeddings, the schema, and the upgrade path. The buyer's data has crossed the tenancy boundary, and the embeddings produced are no longer the buyer's portable asset.
The two architectures are not equally portable, equally auditable, or equally defensible in front of a regulator. Pretending they are produces the unconsidered choice we keep finding on architecture calls.
Three questions that resolve the decision.
We use three questions, in order, to make the choice explicit. They are not rhetorical; the answer to any one of them can flip the recommendation. A short sketch encoding the full tree follows the third question.
01. Does your data classification permit cross-tenancy embedding generation?
If the answer is no — typically because the data is bound by sector-specific rules, regional transfer restrictions, or an internal classification that prohibits processing in vendor environments — the decision is made. Tenant-owned weights are not a preference; they are the only architecture that survives a serious procurement review. We see this most often in healthcare, public sector, and the parts of financial services that handle non-public material.
If the answer is yes, the decision tree continues. Permission is necessary but not sufficient.
02. Will the deployment depth justify the operational cost?
Tenant-owned weights have an operational tax. Somebody has to host the model, monitor it, schedule fine-tunes, and version the weight files. The tax is not high — modern inference stacks make it manageable for any team that runs serious infrastructure — but it is non-zero. For shallow deployments where the AI is used as a thin layer over a few workflows, the tax is not justified, and shared embeddings inside a vendor's environment are the right call.
The threshold we use is workflow depth: if three or more workflows of strategic importance will run against the model, the deployment is deep enough to justify ownership. Below that, rent. The break-even is not a precise number, but in our experience it is closer to three workflows than to ten.
03. How long is your strategic horizon for this capability?
If the AI capability is being built for a horizon of less than eighteen months — a tactical bridge while a different platform is being procured, a one-off campaign automation, an experimental pilot — shared embeddings are likely fine. The deployment will not run long enough for the portability cost of vendor lock-in to compound.
If the horizon is longer than three years, which is the case for any deployment that purports to be infrastructure, tenant-owned weights are the only architecture that survives it. By year three the vendor will have changed pricing, changed model defaults, or been acquired, and the deployment that depended on shared embeddings will need to be re-platformed under conditions the buyer no longer controls. Between eighteen months and three years the framework gives no hard answer; in our experience the depth question breaks the tie, with deeper deployments tipping toward ownership.
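The whole tree is small enough to write down. A minimal sketch, using the thresholds stated above (three strategic workflows, eighteen months, three years); the function name, parameters, and return labels are illustrative, and the middle horizon band is labeled as the judgment call it is.

```python
def recommend_architecture(
    classification_permits_cross_tenancy: bool,
    strategic_workflows: int,
    horizon_months: int,
) -> str:
    """Encode the three-question decision tree described above."""
    # 01. Data classification. A "no" ends the tree immediately: tenant-owned
    # is the only architecture that survives a serious procurement review.
    if not classification_permits_cross_tenancy:
        return "tenant-owned weights"
    # 02. Deployment depth. Below roughly three strategic workflows, the
    # operational tax of ownership is not justified.
    if strategic_workflows < 3:
        return "shared embeddings"
    # 03. Strategic horizon.
    if horizon_months < 18:
        return "shared embeddings"  # tactical bridge; lock-in never compounds
    if horizon_months > 36:
        return "tenant-owned weights"  # infrastructure-grade horizon
    # Between 18 and 36 months the framework gives no hard answer;
    # in practice, deeper deployments tip toward ownership.
    return "judgment call"


# A regulated deployment resolves on the first question alone:
assert recommend_architecture(False, 1, 6) == "tenant-owned weights"
```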
The hidden cost most CIOs underprice.
There is a fourth consideration that does not fit neatly into the decision tree but that determines whether the deployment compounds: the cost of corrective edits. Every editor-in-the-loop correction is a piece of institutional judgment. Over months, those corrections accumulate into something that looks like brand voice, decision policy, or domain expertise. The question is who owns the accumulated judgment.
On tenant-owned weights, the corrections are folded into the next fine-tune cycle, and the model that ships in month six embodies more institutional judgment than the model that shipped in month one. The output gets better not because the underlying model improved but because the buyer's specific judgment got encoded.
On shared embeddings, the corrections are typically applied at retrieval time — re-ranking, prompt steering, or vendor-side feedback. The accumulated judgment lives as a configuration on top of a model owned by someone else. When the vendor changes the underlying model, the configuration may or may not transfer. Often it does not. The buyer is left to re-encode the institutional judgment from scratch.
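The tenant-owned half of that asymmetry has a concrete shape. A minimal sketch, assuming corrections are captured as supervised pairs in a JSONL file on buyer-controlled storage; the schema, field names, and path are illustrative, not any vendor's format.

```python
import json
import time
from pathlib import Path

# Hypothetical mount point for buyer-controlled storage.
CORRECTIONS = Path("/mnt/tenant/finetune/corrections.jsonl")


def record_correction(prompt: str, model_output: str, edited_output: str) -> None:
    """Append one editor-in-the-loop correction as a supervised training pair."""
    CORRECTIONS.parent.mkdir(parents=True, exist_ok=True)
    with CORRECTIONS.open("a") as f:
        f.write(json.dumps({
            "ts": time.time(),
            "prompt": prompt,
            "rejected": model_output,   # what the model produced
            "chosen": edited_output,    # the institutional judgment
        }) + "\n")
```

When the next fine-tune cycle consumes this file, the accumulated judgment lands in weights the buyer owns, not in retrieval configuration that a vendor migration can silently reset.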
That asymmetry, judgment encoded in owned weights versus judgment stranded in rented configuration, is what determines whether a deployment compounds. We wrote about why compounding is the only AI ROI metric that survives audit precisely because this is where the gap shows up.
What this decision looks like in practice.
The Knyte install pattern is tenant-owned by default. We deploy a private 70-billion-parameter base model into the buyer's tenancy, fine-tune it against the buyer's corpus inside the buyer's environment, and write the resulting weights to storage the buyer controls. The trade-off is operational complexity, which we absorb on behalf of the install team during the ninety-day rollout. The benefit is that month nine looks materially different from month one, and the deployment survives any vendor's roadmap change.
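The shape of that pattern fits in a few lines. An illustrative sketch only, not Knyte's actual install tooling: the invariant it expresses is that every artifact, from base weights to corpus to fine-tuned output, resolves to a path inside the buyer's tenancy.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class InstallConfig:
    base_model_path: str      # 70B base weights, copied into the tenancy
    corpus_path: str          # buyer's corpus; never crosses the boundary
    output_weights_path: str  # fine-tuned artifacts, buyer-controlled storage
    data_classification: str  # inherited by every output artifact


cfg = InstallConfig(
    base_model_path="/mnt/tenant/models/base-70b",
    corpus_path="/mnt/tenant/corpora/institutional",
    output_weights_path="/mnt/tenant/models/ft-v1",
    data_classification="internal-restricted",
)

# The tenancy invariant, checkable before any job runs:
paths = (cfg.base_model_path, cfg.corpus_path, cfg.output_weights_path)
assert all(p.startswith("/mnt/tenant/") for p in paths), \
    "every artifact must stay inside the tenancy boundary"
```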
We do not think this architecture is the right answer for every deployment. Shallow workflows that genuinely live and die on the vendor's roadmap should sit on the vendor's roadmap. The point is to make the choice deliberately, with the data classification, the deployment depth, and the strategic horizon all on the table at the same time.
If you cannot remember the meeting where this decision was made on your current AI portfolio, the meeting did not happen. That is the most common pattern we find. The remediation is not heroic — it is a structured architecture review that puts the three questions on the table and forces an answer. It is the conversation we run on every install call. The answer is sometimes shared embeddings. More often, given the depth of the workflows our customers run, it is tenant-owned. The point is that the answer is reached, not inherited.