Stop Babysitting Your AI Coding Agent Every Single Session

I watched a DSquaredLabs video, "No more re-explaining: Give your AI agent a memory graph". Here is what stuck with me and why I think this three-tool system is worth your attention.

If you work on a large codebase with AI coding tools, you already know the ritual. Open a new chat, paste your file structure, explain your stack, describe how the modules connect — and repeat the whole thing tomorrow.

It is not a minor inconvenience. It is a compounding productivity tax that gets worse as your project grows.

This article breaks down a three-tool system designed to fix exactly that. If you want the full walkthrough with a live demo on a real monorepo, the video at the bottom covers everything in depth — but the core concepts are worth understanding before you dive in.

The Problem With How Most Developers Use AI Today

Most AI-assisted coding workflows are missing three things that every good engineering team actually has:

A map of the codebase

A process for how work gets done

A way to communicate efficiently

Without these, your AI agent guesses, drifts, touches files it was never supposed to touch, and burns tokens re-learning context it already learned last session. The tools below are a direct answer to each of these gaps.

The Three-Tool System

Image credit: DsquaredLabs

1. Graphify — The Context Layer

Graphify reads your entire project — source files, docs, diagrams, schemas — and builds a queryable knowledge graph of your codebase. Instead of dumping files into context at the start of every session, your AI agent queries the graph for only what it needs.

Three things that matter practically:

Fewer tokens per query. The agent pulls targeted information from the graph rather than scanning files wholesale.
Fully local parsing. Every language is analyzed on your machine. Nothing is uploaded to a remote server.
Zero source code in the cloud. The graph is yours. Your code never leaves your environment.

Before running Graphify on a monorepo, you will want to create a .graphifyignore file — similar to .gitignore — to exclude build artifacts, platform folders, generated files, and test caches. Without this, the graph picks up noise and becomes less useful. This step should come before the install.
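As a starting point, here is a minimal Python sketch that writes a starter .graphifyignore at the repo root. The patterns are my own assumptions for a Flutter and Python monorepo, and the sketch assumes the file accepts .gitignore-style patterns, as the comparison above suggests. Check the Graphify docs for the exact syntax before relying on it.

```python
from pathlib import Path

# Hypothetical starter patterns (assumed .gitignore-style syntax).
IGNORE_PATTERNS = [
    # build artifacts
    "build/",
    "dist/",
    # platform folders
    "android/",
    "ios/",
    # generated files
    "*.g.dart",
    "*.pyc",
    # test and tool caches
    ".dart_tool/",
    ".pytest_cache/",
    "__pycache__/",
]


def render_graphifyignore(patterns=IGNORE_PATTERNS) -> str:
    """Return the file contents: one pattern per line, newline-terminated."""
    return "\n".join(patterns) + "\n"


def write_graphifyignore(repo_root: str) -> Path:
    """Write .graphifyignore at the repo root unless one already exists."""
    target = Path(repo_root) / ".graphifyignore"
    if not target.exists():
        target.write_text(render_graphifyignore(), encoding="utf-8")
    return target
```

Calling write_graphifyignore(".") from the repo root creates the file once and leaves any existing version untouched, so hand edits are not overwritten.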

2. Superpowers — The Process Layer

This is the most important tool in the system. Superpowers gives your AI agent a structured engineering workflow: plan, implement, test, review, fix — every time, with no exceptions.

Before writing a single line of code, the agent pulls a real spec from the implementation document, breaks the work into precise tasks with exact file paths and verification steps, and only then begins execution.
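To make that concrete, a plan of this kind can be pictured as a small data structure. The sketch below is hypothetical: the Task and Plan classes, the file paths, and the test command are illustrative names of mine, not the format Superpowers actually produces.

```python
from dataclasses import dataclass, field


@dataclass
class Task:
    """One unit of work in a plan (hypothetical shape)."""
    title: str
    file_paths: list[str]   # exact files the task is allowed to touch
    verification: str       # command or check that proves the task is done


@dataclass
class Plan:
    spec: str
    tasks: list[Task] = field(default_factory=list)

    def problems(self) -> list[str]:
        """List tasks that lack file paths or a verification step."""
        issues = []
        for task in self.tasks:
            if not task.file_paths:
                issues.append(f"{task.title}: no file paths")
            if not task.verification.strip():
                issues.append(f"{task.title}: no verification step")
        return issues


# Illustrative plan; paths and commands are made up for the example.
plan = Plan(
    spec="Send a welcome email when a user signs up",
    tasks=[
        Task("Add email template",
             ["backend/emails/welcome.html"],
             "pytest backend/tests/test_emails.py"),
        Task("Trigger email on signup",
             ["backend/api/signup.py"],
             ""),
    ],
)

print(plan.problems())
# prints ['Trigger email on signup: no verification step']
```

A task with no file paths or no verification step is exactly the kind of vagueness that lets an agent drift, so a check like this is a cheap way to spot a weak plan.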

The gap between having this and not having it feels small on a simple task. On a large, complex project, it is the difference between code you can maintain and code you have to rewrite from scratch.

One setup detail that is easy to overlook: inside your CLAUDE.md (or AGENTS.md for Codex), explicitly instruct the agent to use Superpowers and Graphify on every task. AI agents do not retain instructions between sessions, so the instruction has to live in a file the agent reads at the start of each one. Without it, there is a real chance the agent quietly skips the tools and falls back to reading files directly with no plan.
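If you manage several repos, a small helper can make sure that instruction is present. This is a minimal sketch under my own assumptions: the wording of the instruction is illustrative, and the function name is hypothetical. Adapt the text to however the tools are installed in your setup.

```python
from pathlib import Path

# Illustrative wording only; not taken from either tool's documentation.
INSTRUCTION = (
    "For every task: query the Graphify knowledge graph for context before "
    "reading files directly, and follow the Superpowers workflow "
    "(plan, implement, test, review, fix)."
)


def ensure_instruction(path: str, instruction: str = INSTRUCTION) -> bool:
    """Append the instruction if it is missing. Return True if the file changed."""
    file = Path(path)
    existing = file.read_text(encoding="utf-8") if file.exists() else ""
    if instruction in existing:
        return False
    # Keep existing content, separated from the new text by one blank line.
    prefix = existing.rstrip("\n")
    new_text = (prefix + "\n\n" if prefix else "") + instruction + "\n"
    file.write_text(new_text, encoding="utf-8")
    return True
```

Running ensure_instruction("CLAUDE.md") twice changes the file only the first time, so it is safe to call from a setup script.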

One more thing worth emphasising: always read the plan before execution starts. If there is a mistake in the plan, everything that follows inherits that mistake. The agent will confidently build the wrong thing, and by the time you notice, you are several tasks deep. The plan is the contract — get it right first.

3. Caveman — The Efficiency Layer

Caveman runs throughout the entire session. Its job is simple: keep the AI's output short, precise, and low on tokens. It cuts the filler, the padding, and the unnecessary explanations that accumulate across every reply.

The claimed result is a 75% reduction in token consumption and responses that are faster and cleaner to read. At the end of any session, you can run a summary command to see exactly how much was saved.

What This Actually Changes

Image credit: DsquaredLabs

Here is a concrete example from the DSquaredLabs video — implementing a welcome email feature on a Flutter and Python monorepo:

Starting usage: 8%
Ending usage: 22%
Actual burn: 14% (22% minus 8%)
Expected burn for equivalent work: ~25%
Saving: ~11 percentage points in a single session

Eleven percentage points in one session may not sound dramatic, but it is roughly 44% less than the expected burn (11 out of 25), and it compounds. Session after session, feature after feature, the saving accumulates, alongside a workflow that is structured, reviewable, and safe. You are not trading quality for efficiency. You are getting both.

At the end of that same session, Caveman reported 65% token savings on communication overhead alone.
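The usage figures listed above reduce to simple arithmetic. A minimal sketch, using only the numbers quoted from the video (the function name is mine):

```python
def session_saving(start_usage: float, end_usage: float, expected_burn: float):
    """Return (actual burn, saving in percentage points, saving relative to expected)."""
    actual_burn = end_usage - start_usage
    saving_points = expected_burn - actual_burn
    relative_saving = saving_points / expected_burn
    return actual_burn, saving_points, relative_saving


actual, points, relative = session_saving(start_usage=8, end_usage=22, expected_burn=25)
print(f"actual burn: {actual} points")      # 22 - 8 = 14
print(f"saving: {points} points")           # 25 - 14 = 11
print(f"relative saving: {relative:.0%}")   # 11 / 25 = 44%
```

The distinction matters when you compare sessions: 11 is a difference in percentage points of usage, while 44% is how much less was burned than expected.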

The Full Picture

Image credit: DsquaredLabs

The system works as a single pipeline:

Graphify scans the repo at the start and gives the AI a map
The AI starts with context — it knows where everything is before touching a file
Superpowers runs the workflow — plan first, change code, run tests, review before done
Caveman stays on the whole time — persistent guardrails keeping every response tight

Together they turn what DSquaredLabs aptly calls "vibe coding" from a gamble into something repeatable and reliable.

Watch the Full Walkthrough
The concepts above give you the mental model, but the real value is in watching the system run on an actual codebase.
Watch the full DSquaredLabs video here

References

LLM Knowledge Bases by Andrej Karpathy

DsquaredLabs blog

graphify: An AI coding assistant skill that turns any folder of code, SQL schemas, scripts, docs, papers, images, or videos into a queryable knowledge graph, with support for Claude Code, Codex, Cursor, Gemini CLI, and more. (GitHub)

superpowers: A complete software development methodology for coding agents, built on composable skills covering TDD, systematic debugging, spec-driven brainstorming, subagent-driven development, and more. (GitHub)

caveman: A Claude Code skill (also supporting Codex, Gemini, Cursor, and 30+ other agents) that compresses agent output by roughly 65% by having the agent respond in stripped-down, fragment-style prose, while preserving full technical accuracy. (GitHub)
