Community-Trained Agent: Learning from a Repo, Not from Docs


When you build non-trivial Claude Code setups (skills that chain into workflows, hooks that inject context, agents that audit your own configuration), you quickly outgrow the official documentation. It is improving, but the docs move slower than the tool itself. The features and usage patterns that matter most are the ones nobody has documented yet.

I needed to build an expert agent for Claude Code. One that knows how to structure hooks, configure agents, manage state between sessions. Not the kind of knowledge you find in a reference guide.

The question was: where do you find a reliable source of that knowledge?

I will show you what it found before I explain how I built it

I pointed the finished agent at my own .claude/ directory (the full production setup for a SaaS platform with 197+ PRs) and asked it to audit everything.

It found 25 issues. Two were critical:

CRITICAL: Exposed Secrets

settings.local.json
- MongoDB PROD connection strings with password
  and SendGrid API key in plaintext inside the allow array
- File: .claude/settings.local.json (lines ~27, 124, 150)
- Action: Remove secret entries, keep generic patterns

.mcp.json
- Stripe LIVE API key (sk_live_...) hardcoded
- Action: Move to environment variable
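The fix for the .mcp.json finding is mechanical. Claude Code's .mcp.json supports ${VAR} expansion in env values, so the live key can live in the shell environment instead of the file. A minimal sketch, assuming a Stripe MCP server entry (the server name and package are illustrative, not copied from my actual config):

```
{
  "mcpServers": {
    "stripe": {
      "command": "npx",
      "args": ["-y", "@stripe/mcp"],
      "env": {
        "STRIPE_API_KEY": "${STRIPE_API_KEY}"
      }
    }
  }
}
```

The key itself then goes into your shell profile or secret manager, and the checked-in file stays safe to commit.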

One medium-severity finding was subtler: an 800-line agent file that loaded into every session, burning context window before any real work started. The agent flagged the exact file path and recommended splitting it into focused sub-agents.
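You can catch that class of problem yourself with a one-liner. A sketch on a throwaway fixture standing in for a real .claude/agents directory (the 300-line threshold is arbitrary; tune it to your context budget):

```shell
# Throwaway fixture standing in for a real .claude/agents directory
mkdir -p /tmp/demo-claude/agents
printf 'line\n%.0s' $(seq 1 850) > /tmp/demo-claude/agents/big-agent.md
printf 'line\n%.0s' $(seq 1 40) > /tmp/demo-claude/agents/small-agent.md

# Flag any agent definition over ~300 lines as a split candidate
oversized=$(find /tmp/demo-claude/agents -name '*.md' -exec wc -l {} + |
  awk '$1 > 300 && $2 !~ /total$/ {print $2}')
echo "$oversized"
```

Point the same pipeline at your real .claude/agents directory to get the list of files worth splitting into focused sub-agents.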

That audit came from an agent that learned how Claude Code configurations should look by studying how thousands of engineers actually configure them.

The source: a repo I don't like but respect

I had found Get Shit Done a while back. Over 20,000 stars. Active community. I installed it, tried it, and decided it was not for me. I think my own IA-docs pattern, hierarchical markdown files placed at each directory level, works better for structuring AI context. That opinion has not changed.
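For context, the IA-docs pattern mentioned above looks roughly like this (an illustrative layout; the AI.md file name and directories are hypothetical):

```
src/
  AI.md            # conventions and context for everything below
  billing/
    AI.md          # billing domain context for the agent
    invoices/
      AI.md        # the narrowest context, closest to the code
```

Each file only describes its own level, so an agent working in a subdirectory loads exactly the context that applies there.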

But the implementation was solid. Their hook patterns, agent structures, and state management were all in production across different projects, maintained by thousands of engineers. Every merged PR passed review by people solving real problems.

How I built the expert agent

Instead of writing instructions from scratch, I pointed a task agent at the GSD repository and told it: study how this repo structures everything. Here is the actual workflow:

# Step 1: Read repo structure via GitHub API
gh api "repos/gsd-build/get-shit-done/contents/.claude/agents" \
  -q '.[].name'

# Step 2: Read a specific implementation file
gh api "repos/gsd-build/get-shit-done/contents/.claude/hooks/gsd-context-monitor.js" \
  -q '.content' | base64 -d

# Step 3: Save discoveries as dated memory files
# Path: .claude/agent-memory/claude-code-expert/2026-02-26-gsd-patterns.md

The agent reads source code from the repo, extracts patterns, and saves them to timestamped memory files. Here is what one of those files actually looks like; this is real output from studying GSD's agent structure:

# Patterns learned from get-shit-done (GSD)

> Source: https://github.com/gsd-build/get-shit-done
> Stars: ~20.5k (Feb 2026)
> Analysis date: 2026-02-26

## Agent structure (.claude/agents/*.md)

GSD uses XML tags to structure agent sections:
- <role>: who you are, responsibilities
- <project_context>: how to discover project context
- <execution_flow> with <step name="..." priority="...">
- <deviation_rules>: what to auto-fix vs what to ask about
- <success_criteria>: when to consider work complete

## Critical: Node.js over Bash for hooks

jq's // operator treats 0 as falsy, silently breaking
exit code logic. GSD migrated all hooks to Node.js.
Those XML tags (<role>, <project_context>, <execution_flow>, <deviation_rules>) I adopted directly for my own agents. Before studying GSD, my agent definitions were flat markdown with no consistent structure. After, they have clear sections that Claude Code can parse predictably. The jq discovery alone saved me a debugging session I would not have seen coming.
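Adapted to my own agents, the structure looks roughly like this (a sketch; the tag names come from GSD, the section contents are illustrative):

```markdown
<role>
You are the Claude Code configuration expert for this project.
You audit hooks, agents, and settings files.
</role>

<project_context>
Read .claude/agent-memory/claude-code-expert/*.md before investigating.
</project_context>

<execution_flow>
<step name="recall" priority="first">Load prior discoveries from memory.</step>
<step name="investigate">Study only what memory does not cover.</step>
<step name="record">Save new findings as a dated memory file.</step>
</execution_flow>

<deviation_rules>
Auto-fix formatting issues; ask before touching secrets or settings.
</deviation_rules>

<success_criteria>
Findings are written to memory and ranked by severity.
</success_criteria>
```

The point is not the specific tags but that every agent file answers the same five questions in the same order.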

Next time I ask the agent a question, it reads its memory first before investigating again. No repeat work, and each discovery builds on the last.
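The memory-first lookup is cheap to reproduce because dated file names sort lexicographically: the newest discovery is simply the last entry in sorted order. A sketch on a throwaway fixture (the real path is .claude/agent-memory/&lt;agent-name&gt;/):

```shell
# Fixture: dated memory files, named YYYY-MM-DD-topic.md
mem=/tmp/demo-memory/claude-code-expert
mkdir -p "$mem"
touch "$mem/2026-01-10-hooks.md" "$mem/2026-02-26-gsd-patterns.md"

# ISO dates sort correctly as plain strings, so the newest
# memory file is the last one in sorted order
latest=$(ls "$mem" | sort | tail -n 1)
echo "$latest"
```

This is why the date prefix matters: no parsing, no index file, just the filesystem doing the bookkeeping.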

The full agent and its memory files are public at claude-production-toolkit.

The honest downsides

This strategy has real risks. You are coupling your agent's knowledge to a third-party repository. If that repo changes direction, gets archived, or introduces patterns that contradict your architecture, your agent inherits that drift. I already live with this tension: I disagree with GSD's core concept but depend on its implementation patterns.

There is also a token cost. Processing a full repository is not free. And the agent might learn patterns that work well in the source repo but do not fit your project. You need to review what it produces, not trust it blindly.

What I did not plan for

The GSD contributors were solving their own problems, reviewing each other's code, filing issues about things that broke. My agent just learned from all of it.

The pattern works for any tool that evolves faster than its documentation. Find a well-maintained repo that uses it in production, point an agent at it, and let real usage fill the gap that the docs have not caught up with yet.

About me

Written by Fran Llantada, full-stack developer at Nieve Consulting. In my spare time I built Cliencer, a complete SaaS from scratch on my own. These articles are the engineering lessons I picked up along the way.