Prompt Kit
You're Burning Money and Blaming the Model
This kit turns Nate's token discipline framework into tools you can use right now. The centerpiece is the Stupid Button, the blunt, no-nonsense diagnostic Nate describes in the article, built as a copy-paste prompt that audits your actual AI habits and tells you exactly where you're hemorrhaging tokens. The remaining prompts help you fix what the diagnostic finds: rescuing bloated conversations, planning model routing, auditing agent architectures against the KISS commandments, and translating your habits into actual token costs.
How to use this kit
Start with Prompt 1 (The Stupid Button). Run it in Claude, ChatGPT, or Gemini. Be honest when it asks about your habits — it can't help you if you lie to it. It will give you a brutally direct assessment and prioritized fixes. Then use the other prompts based on what it finds:
- Hitting usage limits constantly? → Run Prompt 2 to rescue your current sprawling conversation, then start fresh.
- Not sure which model to use when? → Run Prompt 3 to build a model routing plan for your actual workflows.
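- Building agents or API pipelines? → Run Prompt 4 to audit your architecture against the five KISS commandments.
- No idea how many tokens your habits actually consume? → Run Prompt 5 to translate a typical session into token costs.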
The prompts work independently — use whichever ones match your situation.
Prompt 1: The Stupid Button — Token Burn Diagnostic
Job: The blunt diagnostic Nate's been promising. It looks at your actual AI habits and tells you where you're being dumb with tokens.
When to use: You're hitting Claude usage limits regularly. Your API bill feels too high. You suspect you're wasting tokens but don't know where. Or you just want a reality check on your AI fluency.
What you'll get: A brutally honest token waste score (1–10), identification of your specific waste patterns ranked by severity, estimated token savings if you fix each one, and a prioritized action plan starting with the highest-impact fix.
What the AI will ask you: How you use AI (subscription vs. API), what tools you use (Claude Desktop, Claude Code, API, etc.), your typical conversation habits, how you handle documents, which models you use for what, and whether you've ever audited your context overhead.
Prompt 2: The Context Rescue — Extract and Compress for a Fresh Start
Job: Extracts the minimum viable context from a long, sprawling conversation so you can start a clean new chat without losing your work.
When to use: You're 20+ turns deep in a conversation and realize you should have started fresh ten turns ago. Responses are getting slow. You suspect you're burning a massive number of tokens on accumulated context. You want to continue the work but in a new, lean conversation.
What you'll get: A clean, compressed context summary you can paste into a new conversation to pick up exactly where you left off — at a fraction of the token cost.
What the AI will ask you: To paste in the conversation you want to rescue (or describe what you've been working on and what decisions/outputs matter).
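To see why a fresh start can save so much, here is a rough Python sketch. It assumes every turn resends the whole conversation so far as input, which is how stateless chat APIs generally behave, and it ignores prompt caching. The function name and all the numbers (500 tokens added per turn, a 1,500-token summary) are hypothetical, chosen only for illustration.

```python
def cumulative_input_tokens(turns: int, tokens_per_turn: int,
                            starting_context: int = 0) -> int:
    """Total input tokens processed over `turns` turns, assuming each turn
    resends the starting context plus everything added so far (no caching)."""
    total = 0
    history = starting_context
    for _ in range(turns):
        history += tokens_per_turn  # this turn's text joins the history
        total += history            # the full history is sent as input
    return total


# Hypothetical numbers: 500 tokens added per turn, 20 turns already done.
TOKENS_PER_TURN = 500
old_history = 20 * TOKENS_PER_TURN  # 10,000 tokens of accumulated context
summary = 1_500                     # assumed size of the compressed summary

keep_going = cumulative_input_tokens(10, TOKENS_PER_TURN, starting_context=old_history)
fresh_start = cumulative_input_tokens(10, TOKENS_PER_TURN, starting_context=summary)

print(keep_going)   # 127500
print(fresh_start)  # 42500
```

Under these assumptions, the next ten turns process about a third as many input tokens in the fresh chat as in the sprawling one. Real savings depend on how much the summary compresses and on whether your provider caches repeated context.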
Prompt 3: The Model Router — Build Your Workflow Tier Map
Job: Creates a personalized model routing plan that tells you exactly which model tier to use for each part of your workflow, so you stop using your most expensive model for tasks a cheaper one handles just as well.
When to use: You're using one model for everything. You know you should be switching models for different tasks but aren't sure where the lines are. You want a concrete plan, not vague advice about "using the right model."
What you'll get: A tiered routing table mapping your specific, recurring tasks to model tiers (top-tier reasoning, mid-tier execution, lightweight cleanup), with estimated cost impact.
What the AI will ask you: What kind of work you do with AI on a typical day or week.
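A routing table like the one this prompt produces can be kept as a small lookup. The Python sketch below is only an illustration: the three tiers come from the description above, but the task names, the `route` helper, and the choice of mid-tier as the fallback are all assumptions, not part of the kit.

```python
from enum import Enum


class Tier(Enum):
    TOP = "top-tier reasoning"
    MID = "mid-tier execution"
    LIGHT = "lightweight cleanup"


# Hypothetical example tasks; replace them with your own recurring work.
ROUTING_TABLE = {
    "architecture decisions": Tier.TOP,
    "drafting from an outline": Tier.MID,
    "reformatting notes": Tier.LIGHT,
}


def route(task: str, default: Tier = Tier.MID) -> Tier:
    """Look up the tier for a task; unknown tasks fall back to `default`."""
    return ROUTING_TABLE.get(task.strip().lower(), default)


print(route("Reformatting notes").value)      # lightweight cleanup
print(route("something unplanned").value)     # mid-tier execution
```

Defaulting unknown tasks to the middle tier is one possible compromise. If most of your unplanned work is trivial, a lightweight default may suit you better.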
Prompt 4: The KISS Audit — Agent Architecture Waste Finder
Job: Audits an agent or API pipeline architecture against the five KISS commandments for agents and identifies where it's bleeding tokens through architectural laziness.
When to use: You're building or running an agentic system, multi-step API pipeline, or any automated AI workflow. Your costs feel too high. Your agents seem slow or inconsistent. You want someone to look at your architecture and tell you what's stupid.
What you'll get: A commandment-by-commandment audit with specific violations identified, estimated waste per violation, and an implementation plan for fixes.
What the AI will ask you: To describe your agent architecture — what agents you have, what context they receive, how they're orchestrated, and whether you're using caching.
Prompt 5: The Token Translator — Make the Invisible Visible
Job: Takes a description of how you currently use AI and translates it into actual token costs and usage-limit burn, so you can see exactly where your money (or your meter) is going.
When to use: You have no idea how many tokens your habits actually consume. You want to understand why you're hitting your limit or running up your bill. You want the math, not the vibes.
What you'll get: A detailed breakdown of your token consumption across a typical session, with the specific moments where waste spikes, and a side-by-side comparison of your current approach vs. a cleaned-up version doing the same work.
What the AI will ask you: To walk through a recent AI session step by step — what you did, in what order, in roughly how many turns.
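The kind of step-by-step breakdown described above can be approximated by hand. The Python sketch below builds a simple ledger for a session, assuming every step resends all context added so far and lumping each reply into that step's token count. The step descriptions and token figures are hypothetical, not measurements.

```python
from typing import List, Tuple

Step = Tuple[str, int]  # (description, tokens this step adds to the context)


def session_ledger(steps: List[Step]) -> List[Tuple[str, int]]:
    """Return (description, input tokens processed at that step), assuming
    each step resends everything added to the context so far."""
    ledger = []
    context = 0
    for description, added in steps:
        context += added
        ledger.append((description, context))
    return ledger


def total_input(steps: List[Step]) -> int:
    """Sum the input tokens processed across the whole session."""
    return sum(cost for _, cost in session_ledger(steps))


# Hypothetical session: same three questions, different document handling.
questions = [("question 1", 200), ("question 2", 200), ("question 3", 200)]
current = [("paste the whole document", 30_000)] + questions
cleaned = [("paste only the relevant section", 2_500)] + questions

for description, cost in session_ledger(current):
    print(f"{description}: {cost}")

print(total_input(current))  # 121200
print(total_input(cleaned))  # 11200
```

In this made-up session the large paste is the spike, and it is paid for again on every later step, which is why trimming it changes the total far more than shortening the questions would.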