ENGINEERING

The case for editor-in-the-loop as the default workflow primitive.

Most workflow runtimes treat human review as an exception. We treat it as the default and let teams opt out per workflow. Here is the engineering rationale and the production results.

By J. Reichert, Principal Engineer · Knyte
Published April 01, 2026
Read time: 13 min
Category: Engineering

The architectural choice that has made the biggest operational difference, across the deployments we run, is also the most boring. We treat editor-in-the-loop as the default workflow primitive. Every workflow stops at human sign-off by default. The engineering team has to explicitly opt a workflow out of the editorial gate; the opt-out has to be approved by the operations lead, and it is logged. The default direction of the system is reversibility.

This sounds like a slowdown. In practice it is the architectural feature that makes the deployment defensible in front of regulators, auditors, and risk committees, and — less obviously — the feature that makes the model improve over time. Editor corrections are the highest-quality training signal a deployment generates. A workflow that bypasses the editor by default is a workflow that produces less useful signal per execution.

What "default" actually means in the runtime.

Most workflow runtimes implement editor review as a special node type — a manual approval step that is added to the workflow graph when the designer decides it is needed. The default position is execute-without-review. Adding a review step requires extra work, which is structurally biased toward leaving it out.

We invert the default. Every step that produces an output destined for an external system — a CRM update, an email send, a calendar invite, a contract amendment — is wrapped in an editorial gate at the runtime layer. The gate is part of the step's execution contract, not an optional decoration on top of it. To bypass the gate, the workflow definition has to declare an opt-out for that specific step, with a justification that is captured in the workflow's audit trail.
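Extending the workflow syntax shown later in this post, an opted-out step might look like the following sketch. The `gate` block and its field names (`opt_out`, `justification`, `approved_by`) are illustrative, not the runtime's actual schema; the point is that the bypass is a declaration in the workflow definition, carried into the audit trail, rather than the mere absence of a review step.

```yaml
- kind: emit
  target: salesforce.case
  gate:                 # hypothetical opt-out declaration
    opt_out: true
    justification: "idempotent write; reviewed by upstream workflow"
    approved_by: ops-lead
```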

The economic effect of this inversion is significant. Engineering teams that previously left out review steps because adding them was tedious now leave them in because removing them is tedious. The default direction of the system shifts. Three months in, every workflow we run has gates at its irreversible actions, without any specific policy mandate.

workflow:
  id: renewal-brief
  steps:
    - kind: retrieve
      from: corpus.deals.renewal
    - kind: draft
      model: tenant.brand-voice.v3
    - kind: review     # default; editor must approve
      role: ae
      timeout: 24h
    - kind: emit
      target: salesforce.case

Why this produces better models.

Editor-in-the-loop is not just a governance feature. It is a training pipeline that operates continuously and at no incremental cost to the deployment. Every editorial action — accept, reject, edit, escalate — is a labeled training example. The labels are produced by humans whose judgment the deployment is supposed to be encoding. There is no better source of fine-tune data.
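To make that concrete, here is a sketch of what one logged editorial action might look like as a record. The field names are hypothetical, but each record pairs a model output with a human decision, which is exactly the shape a fine-tune example needs.

```yaml
# Hypothetical shape of a single logged editorial action.
editorial_event:
  workflow: renewal-brief
  step: review
  action: edit          # one of: accept | reject | edit | escalate
  editor_role: ae
  draft_id: d-18234     # the model output under review
  final_id: d-18234-r1  # the editor's corrected version, if action is edit
```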

The deployments that bypass the editor by default are training their models on whatever the model produced previously, which is a subtle but consequential form of model collapse. The compounding curve flattens because the model is converging on its own outputs rather than on the editor's judgment. We see this in the production telemetry of teams we audit that bypassed the editor early in their deployments for performance reasons.

The performance objection, and why it is usually wrong.

The most common objection to editor-in-the-loop as the default is performance. "We can't have a human approve every email send, the workflow will be too slow." In every audit where this objection has been raised, two things have turned out to be true: the engineering team had not measured the actual editor latency, and the editor latency was not the bottleneck.

Editor latency, measured on a properly designed review surface, is short. A trained editor reviewing a generated draft in a familiar UI takes between fifteen and ninety seconds, depending on the document type. The throughput limit is not the editor; it is the rate at which the workflow produces items to review. For most workflows, that rate is bounded by upstream business logic, not by AI generation speed. The editor is not the bottleneck.

When the editor genuinely is the bottleneck — a high-volume support triage workflow, for example — the right answer is not to remove the gate. It is to redesign the gate. We have shipped batched-approval surfaces that let an editor approve fifty drafts in two minutes when the workflow is structurally batchable. The gate stays. The throughput problem is solved.
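A redesigned gate might be declared along these lines, again extending the workflow syntax from this post. The `mode: batch` and `batch` fields are assumptions about how such a surface could be configured, not the shipped schema; the review step stays in the graph, and only its presentation changes.

```yaml
- kind: review
  role: support-lead
  mode: batch           # hypothetical: group drafts into a one-pass approval surface
  batch:
    max_items: 50
    group_by: intent    # structurally similar drafts reviewed together
  timeout: 4h
```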

Three workflow patterns that benefit most.

Editor-in-the-loop is most valuable, in our measurements, for three workflow categories.

External communications. Anything that emits to a customer, prospect, or external partner. The cost of a bad output is a customer relationship event. The marginal cost of an editor approval is fifteen seconds. The asymmetry is overwhelming.

Records-of-truth updates. CRM writes, contract amendments, financial system updates. Anything that creates a record the company will treat as authoritative. The cost of a bad write is a remediation project. The marginal cost of an editor approval is, again, seconds.

Brand-voice generation. Marketing copy, board narratives, press materials. The cost of a bad output is a brand event that propagates through external channels faster than internal correction can catch up. Editor sign-off is non-negotiable.

Two patterns where editor-in-the-loop is genuinely overkill: internal-facing summarization, and read-only retrieval. A meeting summary that the workflow drops into a private team channel does not need an editor. A retrieval call that produces a candidate set for an editor to choose among does not need a separate editor approval. The opt-outs are real and the runtime supports them. The default is just inverted.
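A legitimately gate-free workflow might read like this sketch, which mirrors the renewal-brief example above minus the review step. The `review: opt-out` and `justification` fields on the emit step are hypothetical; the substance is that the absence of review is a recorded decision, not a default.

```yaml
workflow:
  id: standup-summary
  steps:
    - kind: retrieve
      from: corpus.meetings.standup
    - kind: draft
      model: tenant.summarizer.v1     # hypothetical model id
    - kind: emit
      target: slack.team-channel      # internal-only destination
      review: opt-out                 # hypothetical opt-out declaration
      justification: "internal summary; no external emission"
```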

How this connects to the broader architecture.

Editor-in-the-loop only works if the editor surface is a first-class component of the workflow runtime. We surface every workflow's review queue inside the Knyte automation page, with the corpus context, the retrieval trace, and the editor's previous decisions on similar items all in one place. An editor reviewing a draft in twenty seconds is doing so because the surface has been designed to make twenty seconds enough.

The same architectural principle — make the right thing the default — applies to almost every other piece of the workflow runtime. Retrieval defaults to the policy-visible subset; bypassing requires explicit opt-out. Outputs default to traceable; untraceable execution requires explicit opt-out. Auditing defaults to on; disabling requires explicit opt-out and is logged. The defaults compound. After three months, the deployment behaves correctly without anyone having to remember to behave correctly.
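One way to picture the compounding defaults is as a single runtime-level defaults block that every workflow inherits unless it declares a logged opt-out. This is an illustrative sketch, not the runtime's actual configuration surface.

```yaml
# Hypothetical runtime defaults; every bypass is an explicit, logged opt-out.
runtime:
  defaults:
    review: required            # editor-in-the-loop on every external emission
    retrieval_scope: policy-visible
    tracing: on                 # outputs traceable by default
    audit_log: on               # disabling is itself logged
```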

The boring engineering decision — to invert the default on editor-in-the-loop — has produced more operational difference, in the deployments we run, than any specific model selection or any specific prompt strategy. It is not the kind of decision that produces a launch headline. It is the kind of decision that produces a deployment still standing in month eighteen.

J. Reichert, Principal Engineer · Knyte

Twelve years on production retrieval and inference systems. Previously at Stripe (risk infra) and Anthropic (eval tooling). Writes about the boring parts of agentic infra.
