Prompt Kit: Building Agents Is 80% Plumbing
This kit gives you an interactive agent architecture audit based on the 12 infrastructure primitives revealed by the Claude Code leak. Describe your agent system (or your team's), and get back a prioritized gap analysis that tells you exactly what's missing and what to build next — organized by urgency tier.
How to use this kit
This kit contains one prompt: an Agent Architecture Audit that works for anyone building or evaluating agentic AI systems. Paste it into any thinking-capable assistant, such as ChatGPT, Claude, or Gemini. The prompt runs as an interview: it asks you questions one at a time, builds a picture of your system, then delivers a structured assessment. You don't need to prepare anything in advance. Whether you're a solo developer with a single-agent setup, an engineering lead evaluating production readiness, or an MCP developer wondering what you're missing beneath the tool layer, the audit adapts to your level of sophistication and tells you the three most important things to address next.
Agent Architecture Audit
Job: Evaluate any agentic AI system against the 12 production infrastructure primitives derived from Claude Code's architecture, and deliver a prioritized gap analysis with concrete next steps.
When to use: When you've built (or are evaluating) an agent that works in demos but you're not sure it's production-ready. When your agent breaks in ways you can't explain. When you need to hand your engineering team a clear checklist of what to build next. When you're assessing whether a team's agent architecture is ready for real users.
What you'll get: A tiered gap analysis (Day One → Week One → Month One) with severity ratings, specific findings about what's missing, and a prioritized action plan with the top 3 things to build or fix next.
What the AI will ask you: It starts by asking you to describe your agent in a few sentences. Then it asks targeted follow-up questions — one at a time — about how your system handles things like sessions, permissions, cost tracking, crash recovery, and observability. It only asks about what it can't already infer from your answers.
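If you want to track the audit's output outside the chat, the tiered gap analysis described under "What you'll get" can be captured in a small data structure. The sketch below is only an illustration: the `Gap` class, the four severity levels, the example findings, and the ordering rule (earlier tier first, then higher severity) are all assumptions, not something the prompt itself defines. The primitive labels are borrowed from the topics the interview asks about.

```python
from dataclasses import dataclass
from enum import IntEnum


class Tier(IntEnum):
    # Urgency tiers named in the kit; a lower value means more urgent.
    DAY_ONE = 1
    WEEK_ONE = 2
    MONTH_ONE = 3


class Severity(IntEnum):
    # Hypothetical scale; the kit only says "severity ratings".
    LOW = 1
    MEDIUM = 2
    HIGH = 3
    CRITICAL = 4


@dataclass(frozen=True)
class Gap:
    primitive: str      # illustrative label, e.g. "permissions"
    tier: Tier
    severity: Severity
    finding: str        # what the audit says is missing


def top_actions(gaps, limit=3):
    """Return the most urgent gaps: earliest tier first, then highest severity."""
    ranked = sorted(gaps, key=lambda g: (g.tier, -g.severity))
    return ranked[:limit]


# Made-up findings for a hypothetical single-agent system.
gaps = [
    Gap("observability", Tier.MONTH_ONE, Severity.HIGH, "No tracing of tool calls"),
    Gap("permissions", Tier.DAY_ONE, Severity.CRITICAL, "Tools run without approval checks"),
    Gap("cost tracking", Tier.WEEK_ONE, Severity.MEDIUM, "No per-session token budget"),
    Gap("crash recovery", Tier.DAY_ONE, Severity.HIGH, "Session state lost on restart"),
]

for gap in top_actions(gaps):
    print(f"[{gap.tier.name}] {gap.primitive} ({gap.severity.name}): {gap.finding}")
```

With these sample findings, the three gaps selected are permissions, crash recovery, and cost tracking, in that order. A different ordering rule, such as severity first, would be just as defensible; choose whichever matches how your team prioritizes work.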