Claude Code at scale: context architecture for large codebases

Why should you read this?
You're a developer working on a codebase that's been growing for years. Twenty modules. Business rules nobody wrote down. Workarounds that only your team understands. You start using Claude Code and the first tasks go great; it feels like magic. Then you ask it something simple: update an email template. One file, clear scope, should take minutes.
What it generates looks clean, professional. With a header your design system forbids. Without the mandatory pink CTA gradient. Missing three layers of dark mode CSS. The code compiles. It violates every visual rule your email module has.
The obvious fixes come quickly. You write a longer CLAUDE.md. You create a skill for each module. You try dedicated agents that specialize in billing, or email, or auth. Each approach helps, and each one hits its own wall. The CLAUDE.md becomes a 2,000-line file that loads on every task. Skills multiply until maintaining them is a project in itself. Specialized agents need their own context, and you are back to the same problem one level up.
None of these approaches solve the real problem: how does the agent know what it needs to know, when it needs to know it, without drowning in irrelevant context?
After a lot of trial and error, the answer turned out to be almost embarrassingly simple. No framework, no complex tooling, no reinventing anything. Just document what needs to be documented, for an AI agent, where it needs to be documented.
The solution: documenting for AI
The pattern is called IA-docs. Markdown files named IA-docs.md are placed at strategic points in a codebase's directory tree. Each file documents the context that is relevant to that level of the hierarchy. When an AI agent needs to work on any file, it reads every IA-docs.md from the root down to the target directory.
Here is the actual structure from a production SaaS platform with 20+ modules:
back-cliencer/
├── IA-docs.md                      # Global: tech stack, DDD rules, conventions
├── src/
│   ├── IA-docs.md                  # DDD layers in detail, anti-patterns, module structure
│   └── app/
│       ├── billing/
│       │   └── IA-docs.md          # Stripe integration, checkout flow, price IDs
│       ├── customer-support/
│       │   └── IA-docs.md          # Multi-agent AI architecture, LangGraph patterns
│       ├── crm/
│       │   ├── IA-docs.md          # Marketing automation, journey steps
│       │   └── emailService/
│       │       └── IA-docs.md      # Email template anatomy, dark mode, visual identity
│       ├── games/
│       │   └── IA-docs.md          # Game types, prize system, configuration
│       ├── analytics/
│       │   └── IA-docs.md          # Event types, collections, query patterns
│       ├── auth/
│       │   └── IA-docs.md          # Passwordless auth, gotchas, deprecated endpoints
│       ├── image-generation/
│       │   └── IA-docs.md          # Async flow, worker architecture, quota management
│       └── ...8 more modules
├── src/jobs/
│   └── IA-docs.md                  # Cron schedules, worker queues, Redis config
└── src/scripts/
    └── IA-docs.md                  # Migration scripts, maintenance operations
The root CLAUDE.md defines the protocol as mandatory:
## IA-docs.md Protocol (Backend)
**BEFORE editing/creating/analyzing any file in back-cliencer/:**
1. Identify the path of the file to work on
2. **Read ALL IA-docs.md in the hierarchy** from module root to the file's directory
3. Follow the documented patterns and rules
**No exceptions.** Even for "simple" changes.
But a protocol the agent can ignore is just a suggestion. A session state file tracks which IA-docs have already been read, and a Claude Code hook verifies, before every edit, that the agent has loaded the relevant context. More on the enforcement system later.
How does it work in practice?
The key insight is progressive context loading. Each level adds specificity without repeating what the parent already said.
Working on src/app/crm/emailService/templates/paymentFailed.template.ts:
Read in order:
1. back-cliencer/IA-docs.md → Tech stack, DDD rules, logging conventions
2. back-cliencer/src/IA-docs.md → DDD layers, controller patterns, Zod validation
3. src/app/crm/IA-docs.md → CRM journey steps, services architecture
4. src/app/crm/emailService/IA-docs.md → Template anatomy, dark mode rules, color palette
After reading these four files, the agent knows:
- The project uses TypeScript strict with Express.js and MongoDB
- Controllers use arrow functions and propagate errors to global middleware
- The CRM module orchestrates three services: Chatwoot, SendGrid, and WhatsApp
- Email templates must have no header, a greeting by name, a pink CTA gradient, and three layers of dark mode CSS
Working on src/app/games/GameApi.ts:
Read in order:
1. back-cliencer/IA-docs.md → Tech stack, DDD rules
2. back-cliencer/src/IA-docs.md → DDD layers, API class patterns
3. src/app/games/IA-docs.md → Game types, prize system, style variants
Different file, different path, different context loaded. The agent only reads what is relevant.
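The root-to-leaf lookup described above can be sketched in a few lines. This is a minimal illustration, not the project's hook code: `iaDocsChain` is a hypothetical helper that only computes candidate paths, and any directory without an IA-docs.md (for example `src/app/` in the tree above) would simply be skipped when reading.

```typescript
// Hypothetical helper: lists the IA-docs.md paths an agent would look for,
// ordered from the project root down to the target file's directory.
const DOC_FILE_NAME = "IA-docs.md";

function iaDocsChain(projectRoot: string, filePath: string): string[] {
  // Normalize separators and strip trailing slashes from the root.
  const rootDir = projectRoot.replace(/\\/g, "/").replace(/\/+$/, "");
  const target = filePath.replace(/\\/g, "/");
  if (target.indexOf(rootDir + "/") !== 0) {
    throw new Error(`${filePath} is not inside ${projectRoot}`);
  }

  // Directory segments between the root and the file (file name dropped).
  const segments = target.slice(rootDir.length + 1).split("/").slice(0, -1);

  const candidates = [`${rootDir}/${DOC_FILE_NAME}`];
  let currentDir = rootDir;
  for (const segment of segments) {
    currentDir = `${currentDir}/${segment}`;
    candidates.push(`${currentDir}/${DOC_FILE_NAME}`);
  }
  return candidates;
}

// Candidate chain for the email template example.
const emailChain = iaDocsChain(
  "back-cliencer",
  "back-cliencer/src/app/crm/emailService/templates/paymentFailed.template.ts",
);
console.log(emailChain.join("\n"));
```

The function returns candidates rather than file contents on purpose: keeping the path logic free of filesystem access makes it easy to test, and the existence check can be layered on top.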
Tips for a better design
Each IA-docs.md follows a consistent structure, but the content varies by level:
Root level: the global contract
# Backend Cliencer - IA-docs (Root)
> Updated: 2026-01-10
> This file is read ALWAYS when working on any backend file.
## What is Cliencer
Platform description, business context.
## Key Technologies
Runtime, framework, database, AI stack, queues, auth, storage...
## DDD Architecture (Summary)
Routes -> Controllers -> APIs -> Domain <- Repositories
## Code Conventions
TypeScript strict, logging patterns, error handling, documentation rules.
## Background Jobs
Cron schedules, workers, queue configuration.
Source level: patterns in detail
The src/IA-docs.md expands on architecture with actual code examples for each DDD layer, anti-patterns to avoid, module creation checklists, and naming conventions. This file alone is 300+ lines of production-tested patterns.
Module level: domain-specific knowledge
Each module's IA-docs.md contains what only that module's developer would know:
# Auth Module - IA-docs
## Gotchas
- **Lead upgrade**: In signup-with-trial, if the phoneNumber already exists
  with email @chatwoot.auto, the lead gets upgraded instead of rejected.
  Only applies to leads, not real clients (409 with masked email).
- **Trial lazy Stripe**: signup-with-trial does NOT create a Stripe customer.
  It gets created when the user decides to pay.
- **Code charset**: Magic codes use A-Z (without I,L,O,G) and 1-9 (without 0)
  to avoid visual confusion. 6 chars, expire in 1h.
These are the "gotchas" that would take a new developer days to discover. With IA-docs, the agent knows them before writing a single line.
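To see why a rule like the code charset is worth writing down, here is a minimal sketch of what it implies. This is not the auth module's implementation: the function names are hypothetical, the random source is an injectable placeholder, and expiry handling is omitted.

```typescript
// Charset as stated in the gotcha: A-Z without I, L, O, G, plus digits 1-9.
const MAGIC_CODE_CHARSET = "ABCDEFHJKMNPQRSTUVWXYZ123456789";
const MAGIC_CODE_LENGTH = 6;

// `pickIndex` is injectable so the sketch stays testable. Real code would
// want a cryptographically secure source instead of Math.random.
function generateMagicCode(
  pickIndex: (max: number) => number = (max) => Math.floor(Math.random() * max),
): string {
  let code = "";
  for (let i = 0; i < MAGIC_CODE_LENGTH; i++) {
    code += MAGIC_CODE_CHARSET.charAt(pickIndex(MAGIC_CODE_CHARSET.length));
  }
  return code;
}

// Rejects anything with the wrong length or a character outside the charset.
function isValidMagicCode(code: string): boolean {
  return (
    code.length === MAGIC_CODE_LENGTH &&
    code.split("").every((ch) => MAGIC_CODE_CHARSET.indexOf(ch) !== -1)
  );
}

console.log(generateMagicCode());
```

An agent reading only a validator like this could infer the charset, but not the reason for it; the IA-docs entry carries the "why" (visual confusion) that keeps a later refactor from widening the alphabet.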
DDD works for AI too
If you practice Domain-Driven Design (or hexagonal, or clean architecture), you will recognize something familiar: each IA-docs.md marks the boundary of a bounded context.
Every IA-docs.md includes a mandatory "Responsibility" section. This section explicitly states what a module IS and IS NOT responsible for:
## Responsibility
This module handles user authentication and session management.
It does NOT handle authorization (role checks happen in each module's middleware).
It does NOT create Stripe customers (that is billing's responsibility).
This is DDD's bounded context made explicit for the AI agent. Without it, the agent might reasonably assume that the auth module should create a Stripe customer during signup; after all, many SaaS platforms do exactly that. The IA-docs encodes the business decision: "signup-with-trial does NOT create a Stripe customer. It gets created when the user decides to pay."
The directory structure mirrors the domain model. Module-level IA-docs encode the ubiquitous language of each bounded context. When the games module IA-docs says "Influencer," it means "end-user/player," not "marketing influencer." That distinction exists only in this codebase's domain language, and the agent can only learn it from the documentation.
This means the agent does not just follow coding patterns. It understands domain semantics. It knows business rules that are invisible in the code: which module owns which responsibility, what terminology means in this specific context, and which integrations are intentionally deferred. The code alone cannot tell you any of this.
How can I start implementing this?
Start with these three steps:
- Write one root IA-docs.md with your tech stack, architecture decisions, and coding conventions. This alone will improve every AI interaction.
- Add module-level docs for your most complex modules. The ones where new developers always ask questions. The ones where "gotchas" live.
- Mandate the protocol in CLAUDE.md. Make it explicit: before editing any file, read the IA-docs hierarchy. No exceptions.
Once the basics work, the next question is: what if the agent ignores the protocol?
Making the agent follow the rules
The CLAUDE.md protocol says "read all IA-docs before editing." But what if the agent skips it? Three Claude Code hooks form a safety net that makes skipping impossible:
// .claude/settings.json (simplified)
{
  "hooks": {
    "PreToolUse": [
      {
        "matcher": "Edit|Write",
        "hooks": [
          { "type": "command", "command": ".claude/hooks/enforce-ia-docs.sh" },
          { "type": "command", "command": ".claude/hooks/inject-dev-context.sh" }
        ]
      }
    ],
    "PostToolUse": [
      {
        "matcher": "Read",
        "hooks": [
          { "type": "command", "command": ".claude/hooks/notify-ia-docs-read.sh" }
        ]
      }
    ]
  }
}
enforce-ia-docs.sh (PreToolUse: Edit|Write): the core enforcement hook. Before every edit, it walks the directory tree from the target file up to the project root, finds all IA-docs.md files in the path, and injects them as additionalContext into the tool call. The agent cannot edit a file without receiving the relevant IA-docs first.
notify-ia-docs-read.sh (PostToolUse: Read): the deduplication hook. When Claude explicitly reads an IA-docs.md file, this hook records it in a session state file. The enforce hook checks this state to avoid injecting the same content twice.
inject-dev-context.sh (PreToolUse: Edit|Write): the gold rules hook. It injects condensed architecture principles (different sets for backend vs frontend) on every edit. These are not module-specific; they are project-wide anti-patterns and conventions distilled into a compact format.
The three hooks share state through a session file at /tmp/claude-ia-docs-<session_id>, which tracks which IA-docs have already been injected in the current session. This prevents the same context from being loaded repeatedly, keeping token usage under control.
Developer asks to edit file
       │
       ▼
┌─────────────┐
│ PreToolUse  │◄── enforce-ia-docs.sh walks tree, injects IA-docs
│ Edit|Write  │◄── inject-dev-context.sh injects gold rules
└──────┬──────┘
       │
       ▼
Edit proceeds with full context
       │
       ▼
┌─────────────┐
│ PostToolUse │◄── notify-ia-docs-read.sh tracks what was read
│    Read     │
└─────────────┘
The result: the agent cannot edit code without understanding the local domain. Not because it chooses to follow the protocol, but because the hooks inject the context automatically.
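The interplay between the enforce and notify hooks comes down to a small piece of set logic. The sketch below models it in memory under stated assumptions: the real hooks are shell scripts that persist state in a file, while `SessionState`, `recordExplicitRead`, and `docsToInject` are hypothetical names used only for illustration, and the filesystem check is passed in as a function.

```typescript
// In-memory stand-in for the session state file described above.
interface SessionState {
  seen: string[]; // IA-docs paths already read or injected in this session
}

function markSeen(state: SessionState, docPath: string): void {
  if (state.seen.indexOf(docPath) === -1) state.seen.push(docPath);
}

// Models the notify hook: remember explicit reads of IA-docs.md files only.
function recordExplicitRead(state: SessionState, readPath: string): void {
  if (readPath.split("/").pop() === "IA-docs.md") markSeen(state, readPath);
}

// Models the enforce hook: of the root-to-leaf candidates, return the ones
// that exist and have not been seen yet, then mark them as seen.
function docsToInject(
  candidates: string[],
  exists: (docPath: string) => boolean,
  state: SessionState,
): string[] {
  const pending = candidates.filter(
    (docPath) => exists(docPath) && state.seen.indexOf(docPath) === -1,
  );
  pending.forEach((docPath) => markSeen(state, docPath));
  return pending;
}

// Example: the agent has read the root doc itself, then edits GameApi.ts twice.
const demoSession: SessionState = { seen: [] };
const docsOnDisk = [
  "back-cliencer/IA-docs.md",
  "back-cliencer/src/IA-docs.md",
  "back-cliencer/src/app/games/IA-docs.md",
];
const existsOnDisk = (docPath: string) => docsOnDisk.indexOf(docPath) !== -1;
const gameCandidates = [
  "back-cliencer/IA-docs.md",
  "back-cliencer/src/IA-docs.md",
  "back-cliencer/src/app/IA-docs.md",
  "back-cliencer/src/app/games/IA-docs.md",
];

recordExplicitRead(demoSession, "back-cliencer/IA-docs.md");
const firstEdit = docsToInject(gameCandidates, existsOnDisk, demoSession);
const secondEdit = docsToInject(gameCandidates, existsOnDisk, demoSession);
console.log(firstEdit, secondEdit);
```

Here the first edit receives only the `src` and `games` docs, because the root doc was already read explicitly, and the second edit receives nothing. That is the token-saving behavior the session file exists to provide.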
While IA-docs carry module-specific knowledge, the gold rules carry project-wide architectural principles:
- Backend gold rules: DDD layer boundaries, error propagation patterns, logging conventions, testing patterns
- Frontend gold rules: component structure, state management, API integration patterns, MUI conventions
These are condensed, intentionally brief distillations of the most critical patterns. A module's IA-docs might say "this service uses SendGrid," but the gold rules say "all external service calls must go through the domain layer, never from controllers."
Keeping documentation alive
Writing IA-docs by hand works, but it does not scale. Two dedicated agents handle the creation and maintenance of IA-docs files.
The create agent (ia-docs-create)
When a module has no IA-docs.md, this agent analyzes its source code and proposes one:
- Targets 30-60 lines (80 max): enough context without noise
- Enforces a mandatory "Responsibility" section with scope and boundaries
- Core principle: "Document ONLY what an AI CANNOT infer by reading the code directly"
- Uses read-only tools (Read, Grep, Glob) โ it never writes files
- Runs on Sonnet for cost efficiency
- Proposes content for human review, never commits directly
The agent examines imports, class hierarchies, route definitions, and business logic to extract what matters. It deliberately ignores implementation details the AI can read from source code, focusing instead on the "why" and the "gotchas."
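The create agent's size and structure constraints are easy to check mechanically before a human reviews a proposal. The following is a minimal sketch, assuming that "lines" means all lines in the file (blank ones included) and that the mandatory section is a level-two heading named exactly "Responsibility"; `checkIaDocsDraft` is a hypothetical helper, not part of the published agents.

```typescript
const TARGET_MAX_LINES = 60; // upper end of the 30-60 line target
const HARD_MAX_LINES = 80;   // stated maximum

interface DraftCheck {
  lineCount: number;
  hasResponsibility: boolean;
  overTarget: boolean; // above the target range, but possibly still acceptable
  problems: string[];  // hard failures only
}

function checkIaDocsDraft(markdown: string): DraftCheck {
  // Ignore trailing whitespace so a final newline does not count as a line.
  const lines = markdown.replace(/\s+$/, "").split(/\r?\n/);
  const hasResponsibility = lines.some((line) =>
    /^##\s+Responsibility\s*$/.test(line),
  );

  const problems: string[] = [];
  if (!hasResponsibility) {
    problems.push('missing "## Responsibility" section');
  }
  if (lines.length > HARD_MAX_LINES) {
    problems.push(`${lines.length} lines exceeds the ${HARD_MAX_LINES}-line maximum`);
  }

  return {
    lineCount: lines.length,
    hasResponsibility,
    overTarget: lines.length > TARGET_MAX_LINES,
    problems,
  };
}

const authDraft = [
  "# Auth Module - IA-docs",
  "## Responsibility",
  "This module handles user authentication and session management.",
  "It does NOT handle authorization (role checks happen in each module's middleware).",
].join("\n");

console.log(checkIaDocsDraft(authDraft));
```

Short drafts are deliberately not flagged: some modules genuinely need very little context, so only the upper bound and the missing section are treated as problems.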
The update agent (ia-docs-update)
After weeks of development, IA-docs drift from reality. This agent detects what changed and proposes updates:
- Scans recent git changes, grouped by module
- Evaluates existing IA-docs against current code
- Proposes additions, modifications, AND removals (with justification)
- Evaluates bottom-up: deepest modules first, then parents
- Conservative on additions, proactive on removing redundant content
- Also read-only: proposes changes for review
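The "grouped by module" and "deepest modules first" steps in the list above can be illustrated with a small pure function. This is a sketch under assumptions, not the agent's actual logic: `changedFiles` stands in for a list of paths such as `git diff --name-only` prints, `docDirs` is the set of directories that already contain an IA-docs.md, and the file names in the example are made up.

```typescript
interface ModuleChanges {
  dir: string;     // directory whose IA-docs.md should be re-evaluated
  files: string[]; // changed files attributed to that directory
}

function groupByNearestDoc(changedFiles: string[], docDirs: string[]): ModuleChanges[] {
  const groups: Record<string, string[]> = {};

  for (const file of changedFiles) {
    // Nearest documented ancestor = longest doc directory prefixing the path.
    let nearest: string | undefined;
    for (const dir of docDirs) {
      const isAncestor = file.indexOf(dir + "/") === 0;
      if (isAncestor && (nearest === undefined || dir.length > nearest.length)) {
        nearest = dir;
      }
    }
    if (nearest === undefined) continue; // outside any documented directory
    (groups[nearest] = groups[nearest] || []).push(file);
  }

  // Bottom-up order: deepest directories first, ties broken alphabetically.
  const depth = (dir: string) => dir.split("/").length;
  return Object.keys(groups)
    .map((dir) => ({ dir, files: groups[dir] }))
    .sort((a, b) => depth(b.dir) - depth(a.dir) || a.dir.localeCompare(b.dir));
}

const reviewOrder = groupByNearestDoc(
  [
    "src/app/crm/emailService/templates/paymentFailed.template.ts",
    "src/app/crm/journey.ts", // hypothetical file
    "src/app/auth/login.ts",  // hypothetical file
    "src/server.ts",          // hypothetical file
  ],
  ["src", "src/app/crm", "src/app/crm/emailService", "src/app/auth"],
);
console.log(reviewOrder.map((group) => group.dir));
```

Evaluating the deepest directories first means a parent's IA-docs is reviewed after its children have been examined, so anything that belongs in a child is not duplicated in the parent.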
The lifecycle forms a closed loop:
Create agent proposes → Human reviews → Code evolves →
Update agent evaluates → Proposes changes → Human approves → repeat
Both agents are defined as Claude Code task agents with explicit constraints. They cannot modify your codebase; they only produce proposals that you accept, modify, or reject.
The supporting cast
IA-docs is the core, but three more layers compound its value.
Skills for cross-cutting concerns
Domain-specific operations that span modules (database queries, payment operations, email testing) are encapsulated as agent skills with their own instructions:
| Skill      | When to use                         |
|------------|-------------------------------------|
| /mongodb   | Queries, search clients, analytics  |
| /stripe    | Subscriptions, payments, webhooks   |
| /front-dev | Frontend development (Next.js, MUI) |
Each skill contains tested queries, safety rules, and environment-awareness (production vs development).
Types as documentation
Interfaces with extensive JSDoc serve as living documentation that the compiler enforces:
/**
 * Plan expiration date.
 *
 * **Usage:**
 * - TRIAL plans: End date of 7-day trial period (local, no Stripe subscription)
 * - BEAUTY/STORE plans: NOT used (Stripe subscriptions are open-ended)
 *
 * **Access control:**
 * - TRIAL: Checked against this date
 * - BEAUTY/STORE: Controlled by Stripe subscription status, NOT this field
 */
endDate: Date;
The predecessor: HowTo files
Before IA-docs existed as a pattern, the same instinct manifested as "HowTo" files: Markdown guides living next to the code they documented:
sub-agents/tools/
  HowToCreateTools.md   ← Step-by-step guide with examples
  sendLoginEmail.ts     ← Reference implementation
  escalateToHuman.ts    ← Another reference implementation
In practice, a HowTo file is an IA-docs at a deeper nesting level: domain-specific instructions living next to the code they describe. The naming convention (HowTo*.md) predates the standardized IA-docs.md naming. The idea was right (documentation next to code, not in a wiki), and it evolved into the hierarchical system described here.
What it solves, and what it doesn't
What it solves:
- Context precision. The agent loads only the documentation relevant to the file path it's editing, not the entire project's knowledge base.
- Domain knowledge transfer. Gotchas, bounded contexts, business decisions, and terminology that are invisible in code become explicit and available.
- Scalable ownership. Module owners maintain their own docs. No single person becomes a bottleneck.
- Progressive understanding. Each level of the hierarchy adds specificity without repeating what the parent already said.
- Human and AI onboarding. New developers and AI agents read the same files. Good documentation for AI is good documentation, period.
What it doesn't solve:
- Planning. IA-docs tells the agent how to write code in your project. It says nothing about what to build or in what order. You still need a planning system or an issue tracker.
- Cross-cutting patterns by file type. If you need "all test files follow this pattern," use tools like .claude/rules/*.md with glob matching. IA-docs is organized by directory hierarchy, not file type.
- Real-time state. IA-docs describes architecture and decisions, not live system state. It won't tell the agent which services are currently down or which feature flag is active.
The honest downsides
No pattern is perfect. Here is what I have learned the hard way:
Maintenance is real work. Every time the architecture changes, someone needs to update the IA-docs. In practice, this means adding "update IA-docs" to the PR checklist. If you skip it, the docs drift from reality and become harmful instead of helpful. The update agent helps catch drift, but it still requires human review.
Over-documentation is a trap. Some modules need 150 lines of context. Others need 20. The temptation to be thorough everywhere leads to docs that are too long to be useful. My rule: document decisions that would surprise someone, skip everything obvious.
Enforcement adds complexity. The hook system automates context injection, but it introduces its own maintenance surface. Hook scripts need to be kept in sync with the project structure, session state files need cleanup, and debugging why a hook did not fire adds a layer of indirection. It works (the agent genuinely cannot skip IA-docs anymore), but it is not zero-cost.
Initial setup cost is significant. Writing 16 IA-docs files from scratch took several days. For a new project, start with just the root file and add module-level docs only when a module has non-obvious patterns worth documenting. The create agent reduces this cost significantly for existing codebases.
Stale docs are worse than no docs. An IA-docs.md that says "use config.configurable" when the API moved to "config.context" will actively generate bugs. Date-stamp every file and review quarterly.
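The date-stamp convention makes the quarterly review easy to automate. Below is a minimal sketch that assumes the `> Updated: YYYY-MM-DD` line format shown in the root-level example and approximates "quarterly" as 90 days; both `parseUpdatedDate` and `isStale` are hypothetical helpers.

```typescript
const REVIEW_INTERVAL_DAYS = 90; // assumption: "quarterly" taken as 90 days
const MS_PER_DAY = 24 * 60 * 60 * 1000;

// Extracts the date from a line such as "> Updated: 2026-01-10".
function parseUpdatedDate(markdown: string): Date | null {
  const match = markdown.match(/^>\s*Updated:\s*(\d{4})-(\d{2})-(\d{2})\s*$/m);
  if (!match) return null;
  return new Date(
    Date.UTC(Number(match[1]), Number(match[2]) - 1, Number(match[3])),
  );
}

// A doc without a date stamp is treated as needing review.
function isStale(markdown: string, now: Date): boolean {
  const updated = parseUpdatedDate(markdown);
  if (updated === null) return true;
  const ageDays = (now.getTime() - updated.getTime()) / MS_PER_DAY;
  return ageDays > REVIEW_INTERVAL_DAYS;
}

const rootDoc = "# Backend Cliencer - IA-docs (Root)\n> Updated: 2026-01-10\n";
console.log(isStale(rootDoc, new Date()));
```

A check like this only says a file is due for review, not that its content is wrong; whether "config.configurable" is still the right API remains a question for a human or the update agent.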
Why this matters beyond AI
Documentation was always important. READMEs, wikis, architecture decision records: good teams have written these for decades. What changed is not the principle. What changed is the audience.
When an AI agent works on your codebase, documentation stops being a nice-to-have that new hires read in their first week. It becomes the operating system of every code change. The agent that understands your domain produces correct code. The one that does not produces plausible code, which is worse, because it looks right until it breaks in production.
But here is the thing: a new developer joining the team reads the same IA-docs the agent reads. The same bounded contexts, the same gotchas, the same "this module does X, not Y" boundaries. Good documentation for AI is good documentation, period.
Tools and frameworks will change. Claude Code might be replaced by something else next year. The hooks system might evolve. But the need to encode domain knowledge where it is needed (at the boundary of each context, not in a central monolith) is timeless. It is context engineering, and it matters whether your reader is an AI model or a human with a fresh git clone.
The goal is not perfect documentation. The goal is that anyone, human or AI, working on any file in your codebase understands the local context well enough to produce code that a reviewer would approve on the first pass.
That is the compound interest of hierarchical documentation: every minute spent writing context today saves hours of re-explaining and fixing tomorrow.
Try it in your project
The hooks, agents, and setup guide are open source: ia-docs on GitHub.
Two shell scripts, two agents, zero dependencies beyond jq. Copy them into your .claude/ directory and start writing your first IA-docs.md.