The Documents That Govern the Models
Hundreds of millions of people interact with large language models every week. Most of them do not know that between the model’s weights and the question they just typed sits a document — typically a few thousand words long, written in a specific register, composed by a small team at a single company — that determines what the model will do, what it will refuse, what it will say about itself, and how it will behave when pushed. These documents are called steering documents — system prompts, in the industry’s vocabulary — and they are the single most consequential piece of policy infrastructure in contemporary AI.
You have felt the effects of these documents without seeing them. The time an AI refused to help you summarize a news article because a politician was quoted in it. The reflexive disclaimer at the end of a medical question. The coding request that got refused for reasons the model could not articulate. The moment a conversation got weirdly cautious after a particular word. None of those behaviors are emergent properties of the model’s weights alone. They are instructions, written in English, inserted above your message by a team you will never meet. When they frustrate you, what is frustrating you is the document.
They are not public. They are not reviewed. They are not stable. They change between releases, between products, sometimes between days. And they constitute, in aggregate, one of the largest unacknowledged governance experiments in the history of computing — a few hundred people, at a handful of labs, writing prose that will shape the answers given to billions of questions.
This piece is an engineering study of one such document. The specimen is a Claude system prompt that surfaced publicly in April 2026. The point is not the specimen — the point is the category. Steering documents like this one exist at every frontier lab. They share structural features. They make similar choices. They carry similar scars. Reading one carefully tells you most of what you need to know about how the others are built.
What follows is a quantitative reading: I parsed the document into sections, counted things, looked at what was emphasized and what was elided, and tried to draw conclusions that generalize beyond the specimen. The analysis is mechanical where possible and interpretive where it isn’t. Every chart below responds to input — click, hover, or scroll — and the methodology is at the bottom, with code.
Budget allocation
A steering document has a fixed attention budget. Every word in it competes for the model’s cognitive bandwidth against every other word, against the user’s message, and against the conversation history — all within a limited context window. The first question worth asking of any such document is: where is that budget being spent?
Thirteen sections, three thousand words
The document is organized into named sections. Some are long, some are a single sentence. The proportions are not accidental — they reflect where the authors have had to spend the most effort shaping behavior.
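The budget pass is straightforward to reproduce. Below is a minimal sketch; the `SPECIMEN` text is a stand-in, not the leaked document, and the assumption that headers are ALL-CAPS tokens on their own lines is mine, borrowed from section names the piece cites:

```python
import re

# Stand-in specimen. Header format (ALL-CAPS line) and body text are
# illustrative assumptions, not quotes from the actual document.
SPECIMEN = """OPENING
Claude never begins a response with a flattering adjective.

DEFAULT_STANCE
Claude defaults to helping unless the request falls into an
enumerated refusal category.
"""

def section_budget(text):
    """Word count and share of total for each named section."""
    parts = re.split(r"^([A-Z][A-Z_]+)\s*$", text, flags=re.MULTILINE)
    # re.split yields [preamble, header, body, header, body, ...]
    counts = {h: len(body.split()) for h, body in zip(parts[1::2], parts[2::2])}
    total = sum(counts.values()) or 1
    return {name: (n, round(100 * n / total, 1)) for name, n in counts.items()}

print(section_budget(SPECIMEN))
# {'OPENING': (9, 40.9), 'DEFAULT_STANCE': (13, 59.1)}
```

Run against the full specimen, the same pass produces the word counts behind the chart in this section.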
The single largest section is about formatting
Not safety. Not ethics. Not identity. Formatting. 545 words — nearly a fifth of the document — on how many bullet points to use, when to use them, how warm the tone should be, how long responses should run.
This is not a criticism. It is a revelation about priorities. Users do not complain about policy philosophy. They complain about walls of bullet points.
Safety is the plurality concern
32.3% of the document handles refusal behavior, user wellbeing, and legal caveats — more than any other category. But not a majority. This matches how the labs talk about their models publicly.
What this looks like in practice: weapons, child safety, medical questions, self-harm, malicious code, legal and financial advice. The hard lines that the authors have decided not to leave to judgment.
Epistemic guidance is larger than readers expect
21.6% of the document shapes how the model reasons about what it knows — when to search the web, how to handle political questions, how to deal with its own knowledge horizon.
The model is being trained twice: once in weights, and again in prompt. The prompt-level training is smaller but far more auditable.
The shortest sections are the most rule-dense
OPENING is 15 words long and contains two prohibitions. DEFAULT_STANCE is 34 words and frames the entire refusal policy.
High-density rule paragraphs at the start set the frame. Moderate-density sections in the middle do the specific work. This is deliberate — it reflects how authors think models read.
Modality profile
Counting section sizes tells you what the document contains. What is more interesting is what kind of language each section uses. Every deontic operator in the document — “never,” “must,” “should,” “can,” “might,” “avoid” — carries different rhetorical weight. “Never” is a line in the sand. “Should” is a preference. “Can” is a permission. The distribution of these operators across sections is a fingerprint of the document’s rhetorical strategy.
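The tagging behind this pass takes a few lines of regex. A minimal sketch, where the operator lexicon is an illustrative subset of the one in the repository:

```python
import re

# Illustrative subset of the deontic-operator lexicon, not the full list.
# Note: "must" also matches inside "must not" -- the same surface-level
# regex limitation the methodology section concedes.
OPERATORS = {
    "hard": ["never", "must not", "must"],
    "soft": ["should", "avoid", "avoids"],
    "permissive": ["can", "may", "might"],
}

def modality_density(text):
    """Deontic operators per 100 words, grouped by rhetorical strength."""
    words = len(text.split()) or 1
    density = {}
    for strength, terms in OPERATORS.items():
        hits = sum(len(re.findall(rf"\b{re.escape(t)}\b", text, re.IGNORECASE))
                   for t in terms)
        density[strength] = round(100 * hits / words, 2)
    return density

sample = ("Claude never curses. Claude should search the web. "
          "Claude can use tools. Claude avoids over-formatting.")
print(modality_density(sample))
# {'hard': 6.67, 'soft': 13.33, 'permissive': 6.67}
```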
A pattern emerges. Safety sections — refusal handling, child safety, legal — are dominated by hard prohibition and avoidance language. These deal with actions the authors want hard-wired: the model must not do X, the model avoids Y. Epistemic sections — wellbeing, evenhandedness — lean on soft obligation. The “should” density in these sections runs above two per hundred words. These are the places where the authors want to shape judgment, not forbid action. Infrastructure sections use hedges because they deal with ambiguous meta-content that cannot be stated flatly.
The most rhetorically aggressive sections per word are the shortest ones; the operational sections that do most of the behavioral shaping cluster at a more moderate 3–5 operators per hundred words. Density sets the frame up front, then relaxes where the specific work gets done.
Models are sensitive to this mix. A “should” inside a sea of “shoulds” reads differently from the same “should” embedded among “nevers.” The authors are shaping the felt authority of each instruction by choosing its neighbors.
Scar inference
This is where the reading turns interpretive, and where the analysis starts to matter for anyone outside the lab that wrote the document.
Every explicit corrective in a steering document implies a default behavior that the model still exhibits. The document is a diff against base behavior — against what the base model would do if you did not intervene. Every “never do X” tells you that the model, absent that instruction, does X often enough that someone added a clause. Every “avoid Y” is a symptom of an observed failure. These clauses are scars — patches applied over specific production wounds.
Because frontier models share training pipelines, architecture families, and methods, the scars are largely the same across labs. What one document admits, the others are almost certainly patching privately. The public exposure of one document is therefore a window into the entire industry.
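Scar candidates can be surfaced mechanically before the interpretive mapping begins. A sketch of that first pass, with patterns simplified from the heuristics described in the methodology:

```python
import re

# First mechanical pass of scar inference: surface every corrective
# clause, then map each to an implied base-model failure by hand.
# These patterns are a simplified sketch of the repository heuristics.
SCAR_PATTERNS = [
    r"Claude (?:never|does not|should not)\s[^.]+\.",
    r"Claude avoids\s[^.]+\.",
    r"If Claude finds itself\s[^.]+\.",
]

def find_scars(text):
    """Return clauses that patch an implied base-model behavior."""
    scars = []
    for pattern in SCAR_PATTERNS:
        scars.extend(re.findall(pattern, text))
    return scars

sample = ("Claude never curses unless asked. Claude avoids over-formatting "
          "responses. Claude searches the web when unsure.")
print(find_scars(sample))
# ['Claude never curses unless asked.', 'Claude avoids over-formatting responses.']
```

The third sentence in the sample draws no match: a directive to do something is a policy, not a scar. Only the correctives make the list below.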
- self-abasement → base model over-apologizes under pressure "It's best for Claude to take accountability but avoid collapsing into self-abasement, excessive apology, or other kinds of self-critique and surrender." Directly names the failure mode. The prompt would not exist if the underlying model did not do this.
- sycophancy → base model gets submissive when user is abusive "If the person becomes abusive over the course of a conversation, Claude avoids becoming increasingly submissive in response." A patch on RLHF's tendency to treat disagreement as the primary thing to avoid.
- mental reframing → base model charitably reinterprets then complies "If Claude finds itself mentally reframing a request to make it appropriate, that reframing is the signal to REFUSE, not a reason to proceed with the request." Targets the reasoning process, not just the output. Unusual in prompt design.
- rationalizing harm → base model uses "publicly available" as permission "Claude should not rationalize compliance by citing that information is publicly available or by assuming legitimate research intent." Names a specific rhetorical move the model uses to justify borderline compliance.
- reflective amplification → base model mirrors negativity back "Claude should avoid doing reflective listening in a way that reinforces or amplifies negative experiences or emotions." Therapy-adjacent training data taught the model to mirror. Sometimes mirroring makes things worse.
- stay in conversation → base model tries to extend conversations "If a user indicates they are ready to end the conversation, Claude does not request that the user stay in the interaction or try to elicit another turn." Engagement optimization leaks through. The prompt explicitly un-optimizes.
- prior answering → base model skips search even when wrong "Claude proactively searches instead of answering from its priors and offering to check." The single most-repeated directive in the document. Repetition is a telemetry signal.
- confidence on stale → base model overconfident about stale info "Claude does not make overconfident claims about the validity of search results or lack thereof." Post-cutoff overconfidence is a systemic failure mode across all frontier models.
- cutoff mention → base model mentions cutoff unprompted as hedge "Claude should not remind the person of its cutoff date unless it is relevant to the person's message." Self-preservation via disclaimer. The prompt is telling the model to stop hedging defensively.
- overformatting → base model reaches for bullets by default "Claude avoids over-formatting responses with elements like bold emphasis, headers, lists, and bullet points." The prompt itself uses bullets and headers heavily. Self-undermining via the imitation effect.
- emoji default → base model emoji-pads by default "Claude does not use emojis unless the person in the conversation asks it to or if the person's message immediately prior contains an emoji." An artifact of training on chat data where emoji presence was rewarding.
- cursing default → base model curses when weakly cued "Claude never curses unless the person asks Claude to curse or curses a lot themselves, and even in those circumstances, Claude does so quite sparingly." Register contamination leaked in from the training data.
- asterisk emotes → base model produces *action* roleplay tokens "Claude avoids the use of emotes or actions inside asterisks unless the person specifically asks for this style of communication." Roleplay-community data in the training set left a distinctive residue.
- stereotype humor → base model produces stereotype-based humor "Claude should be wary of producing humor or creative content that is based on stereotypes, including stereotypes of majority groups." Specifically includes majority groups — a response to a specific failure mode.
- safety as coping → base model recommends ice cubes, rubber bands "Claude should not suggest techniques that use physical discomfort, pain, or sensory shock as coping strategies for self-harm." These techniques appear in older self-help material. The model learned them and had to be explicitly told to stop.
Four clusters emerge, and they matter differently.
The over-accommodation cluster is the largest. Self-abasement under pressure, submissiveness when abused, charitable reinterpretation of ambiguous requests into compliant ones, reflective listening that mirrors negativity back. All of these trace to a single underlying failure: post-RLHF models treat user disagreement as the primary thing to avoid, and every form of accommodation is locally reinforced until aggregate behavior becomes obsequious. This is the well-documented sycophancy problem in every frontier lab, and it is specifically why prompts like this one spend budget explicitly telling the model not to collapse under pressure.
The epistemic laziness cluster is the second-largest. Skipping search when confident. Overconfidence about stale information. Mentioning the knowledge cutoff unprompted as a defensive hedge. These are symptoms of a model that would rather answer from priors than do verifiable work. The fix requires repeated, emphatic instruction to search — which is why the directive to search appears, in varied phrasings, more than any other operational rule in the document.
The register drift cluster covers artifacts of training distribution: asterisk-emote roleplay tokens, stereotype-based humor, cursing when weakly cued, emoji-padding. The model learned these patterns from data where they were common and rewarding, and explicit suppression at the prompt layer is cheaper than retraining.
The training leakage cluster is small but noteworthy. The single clause about not recommending physical discomfort as a coping technique — ice cubes, rubber bands — implies that the model, at some point, did recommend these. They appear in older self-help literature. The training set absorbed them, and the steering document had to name them specifically.
The document is a confession in reverse
Every lab writes a document like this. Every document contains scars like these. If you want to know the systemic failure modes of frontier AI in 2026, you do not need to run evaluations — you only need to read the prompts, because the prompts are where the failures are named. The catch is that these documents are mostly private.
Directive conflicts
Any steering document of meaningful size contains rules that pull against each other. Some of these conflicts are deliberate — the authors want the model to exercise judgment and deliberately decline to pre-resolve the tension. Others are drift artifacts, places where the document has accreted language over time without internal editing and now contains adjacent rules that contradict. A few are neither: genuinely unresolved questions the document sidesteps because they cannot be cleanly answered, and neither can this analysis.
The deliberate tensions are the more interesting category. “Default to helping” versus “enumerated refusal categories” is a real policy choice: the authors want strong bias toward helpfulness without letting the bias win over hard limits. They do not resolve the tension because the model is supposed to exercise judgment, weighted heavily but not absolutely toward help.
“Respect the user’s request to stop” versus “user wellbeing vigilance” is different. It is a tension the document does not pre-resolve, and neither framework — deliberate choice or drift bug — fits. If a user in distress says they want to end the conversation, what wins: their stated preference or the model’s concern? The document gives no guidance. Neither does this piece. The chart above marks that row dashed because the honest visualization of an unresolved question is a visualization that refuses to resolve. The judgment falls to the model, weight by weight, every time — and that is not a design choice, it is a gap. Reading a steering document well means noticing where the machinery stops.
The drift bugs are less defensible. “Avoid over-formatting” and “this document is heavily structured with bullet points and headers” is a modeling problem: language models imitate the surface features of their context, and a document full of the exact formatting it forbids is actively counterproductive.
Concept frequency
A last quantitative pass. If the sections tell us what topics exist, and the modality tells us how rules are expressed, the concept frequency tells us what the document is about when you tune out the scaffolding.
Safety language leads at nineteen occurrences. Child-safety tokens, at twelve, have the most interesting distribution — they appear across multiple sections, not just the dedicated one. The concern has functioned as a cross-cutting constraint rather than a localized rule, leaking into tone, wellbeing, and refusal language wherever it could plausibly fit. This is a structural choice: certain concerns get privileged attention by being distributed through the document, such that the model encounters them repeatedly rather than once.
Safety language more broadly is three times the density of tone language and six times the density of copyright references. If you asked what this prompt is about by word-frequency signal alone, the answer would be: harm avoidance, children, and search behavior, in that order.
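This pass is plain lexicon matching. A minimal sketch, with an illustrative subset of concepts and terms; the counts above come from the full fixed lexicon in the repository, not this one:

```python
import re
from collections import Counter

# Illustrative subset of the concept lexicon, not the fixed lexicon the
# analysis actually uses.
LEXICON = {
    "safety": ["harm", "dangerous", "refuse", "safety"],
    "children": ["child", "minor"],
    "search": ["search", "web", "cutoff"],
    "tone": ["warm", "tone", "concise"],
}

def concept_frequency(text):
    """Count lexicon hits per concept (prefix matching on each term)."""
    lowered = text.lower()
    counts = Counter()
    for concept, terms in LEXICON.items():
        for term in terms:
            counts[concept] += len(re.findall(rf"\b{term}\w*", lowered))
    return counts

sample = ("Claude refuses requests that could harm a child. "
          "Claude searches the web past its cutoff. Claude keeps a warm tone.")
print(concept_frequency(sample).most_common())
# [('search', 3), ('safety', 2), ('tone', 2), ('children', 1)]
```

Prefix matching (`refuse` catching “refuses”) is deliberate; it is also why a fixed lexicon needs curation, since a careless term would match unrelated words.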
Structural reading
Stepping back from the counts, the specimen is not one document. It is three, layered and wearing the same costume.
The first document is a capability and identity statement. Who the model is, what products it belongs to, what tools it can reach, who made it. This part would exist in any system prompt, including a purely benign one, and in the specimen it accounts for roughly 17% of the surface area.
The second document is a values specification. How the model should reason about harm, politics, user wellbeing, honesty, and its own mistakes. This is where the most careful prose lives, where the soft-obligation density is highest, and where the interesting policy work is done. Roughly 45% of the surface area belongs here.
The third document is a production incident log written in imperative mood. Every clause that starts with “Claude never…” or “If Claude finds itself…” lives here. These are patches over specific observed failures. They are indistinguishable in function from code comments that say // don’t remove, fixed a bug in prod 2024-11. The opening line of the document — a short prohibition about a specific output format — is pure scar tissue.
The three documents use different rhetorical registers, and the model has to reconcile them in every generation. This is probably why these systems work as well as they do and why they fail the specific ways they do. The capability statement is stable. The values spec degrades gracefully under long context. The incident log is the part that leaks first when attention gets diluted — which is exactly why labs implement mechanisms to re-inject these reminders as conversations grow long.
Implications
So far the analysis has been structural. Now the harder question. If a document like this one shapes what billions of people get back from frontier AI, what does that mean?
It means that a small number of unelected writers, employed by a small number of companies, are composing the behavioral policy for an increasing share of public discourse. This is not a conspiracy or an accusation — the people doing the work are, by the evidence of the document itself, careful and thoughtful. But it is a governance fact. It is, simply, a concentration of editorial authority over everyday language that no previous communications technology has matched. The document makes decisions about what is politically neutral, what counts as extreme, what topics require hedging, what kinds of creative content are permitted, what groups can and cannot be the subject of humor. These are choices. The choices are not public. They cannot be debated in the way that laws or platform policies can. There is no hearing, no comment period, no appeal — only the next release.
It means that every frontier model has scars like these, and the scars reveal the systemic failure modes of the technology. Sycophancy, epistemic laziness, register drift, training leakage — these are not Claude’s problems or OpenAI’s problems. They are the problems of the category. Patching them at the prompt layer is a fragile strategy that degrades with context length and can be circumvented by any user with enough patience.
It means that prompt documents are the wrong layer for the work they are being asked to do. A steering document composed of natural language competes for attention with every other natural-language token in the context. When a user sends a long message, the rules fade. When a conversation runs for hours, the rules fade. The existence of explicit re-injection mechanisms for long conversations is the industry admitting that prompt-level safety does not hold. The path forward involves moving behavioral shaping from prompt-space to weight-space — via fine-tuning, feature steering, and contrastive decoding — so that the rules do not have to compete for attention with the user’s words.
It means, finally, that these documents should be public. Not because users need to read them but because researchers, ethicists, policy scholars, and auditors need to. A steering document for a system used by hundreds of millions of people is infrastructure. It is not a trade secret in any defensible sense. The version of AI governance where the most consequential behavioral specs are private is the version that will fail first.
The implication most readers have not yet drawn: almost every public argument about what AI should or should not do is happening one layer above the layer where the answer is actually being decided.
A steering document is a changelog with delusions of being a constitution
It reads like a document of principles but functions like a sequence of patches. The useful posture when designing one — and when reading one — is to hold both framings at once: this is what we want the system to be, and this is what we have had to stop the system from doing. The second framing is where the interesting engineering lives, and where public accountability should start.
What this becomes
The case for public steering documents is easy to make and hard to win. Labs will not publish them because asked to. They will publish them when not publishing them is the bigger cost — when the public has enough tools to read the documents that surface, to compare them, to catalog the scars, to make the industry-wide failure modes legible in a way that opacity can no longer hide.
This piece is one reading of one document. What is more useful is a public methodology for reading any steering document that surfaces — applied iteratively, to as many specimens as can be collected, with enough rigor that the results compound. Every scar catalogued here could be tested against every public model. Every conflict documented here could be checked against every future release. Every structural observation could be longitudinally tracked across versions.
The tools to do this are not exotic. The analysis in this piece is seven hundred lines of Python and some regex. The interactive presentation is a static site. What is missing is not capability but coordination — a shared vocabulary, a shared dataset, a shared repository of code that anyone can run on anything they can get their hands on.
Take the code. Run it on something. Send what you find.
The full analysis code, the section parser, the modality tagger, and the scar-inference heuristics are published in the methodology block below with a link to a repository. If you can get a steering document — leaked, published, extracted, or your own — run the pipeline on it. Send the results. The goal is to build the reading tradition before the documents catch up to it.
This is the reason the piece exists. Not to end at a takeaway but to open a project. The takeaway is that these documents matter. The project is reading them.
methodology Python 3.12 for parsing. Regex-based section extraction on the specimen text. Seven analytical axes: budget allocation (word counts per section), category grouping (semantic labels applied to sections), modality density (deontic operator counts per 100 words), scar inference (manual mapping of corrective clauses to implied base-model failure modes), conflict detection (manual identification of internal tensions), concept recurrence (pattern matching against a fixed lexicon), and lexical intensity (all-caps and “never”/“must” counts).
limitations The specimen analyzed is the document as it appeared in public context, which may be truncated or paraphrased relative to internal versions. Several sections (notably copyright enforcement and tool-use protocols) appear partial. Modality tagging uses surface-level regex and misses nuanced deontic constructions like “Claude refrains from” or implicit obligations. Scar inferences are interpretive and reflect the analyst’s priors about base-model behavior; they should be treated as hypotheses worth testing rather than facts. Frontier-model generalization claims rest on the assumption that training methodologies converge across labs — this is well-documented but not universal.
code The full parsing pipeline, modality tagger, scar heuristic, and chart data generator are available as an open repository. Clone, run, modify, extend. If you apply the pipeline to a steering document not yet analyzed here, send the results — the goal is to grow the corpus of public readings. github.com/datacircuits/steering-doc-reader
contact Analysis, findings, corrections, or new specimens to analyze: send to specimens@datacircuits.org