What Running 34 AI Agents in Production Actually Looks Like

Nova, my personal AI system, now runs 34 agents across general chat, pipeline execution, ERP support, and more. Here's an honest look at what production orchestration means when you're the one who built it and the one who depends on it.

Nova is in production. That sentence sounds more dramatic than it is, but it matters to me because a year ago it was a single chat interface wired to an API. Now it's 34 agents spanning general conversation, pipeline execution, ERP specialists, language tasks, and a few things that don't fit neatly into any category. It runs. I use it every day. And recently I've been thinking seriously about what comes next: specifically, making it improve itself.

Let me be honest about what "34 agents in production" actually means before that sounds more impressive than it is. It doesn't mean 34 processes running in parallel waiting for work. It means 34 defined agent roles, each with a specific system prompt, a set of tools, and a scope of responsibility, that get invoked by an orchestration layer depending on what I'm asking for or what a pipeline needs. Most of them sit idle most of the time. The complexity isn't in the count; it's in the routing logic and in making sure each agent does exactly its job and nothing more.

That constraint, each agent doing exactly its job, turns out to be the hard part. When you build a system like this yourself, you notice the failure modes that don't show up in demo videos. An agent that's slightly too broad in scope will drift. It'll try to handle things adjacent to its purpose because the language model underneath it is generalist by nature. So you're constantly tightening prompts, narrowing tool access, and deciding whether a new capability belongs in an existing agent or warrants a new one. These are design decisions, not just configuration tweaks.

The orchestration layer is where I've spent the most time lately. Getting an agent to do something useful in isolation is relatively straightforward. Getting the right agent invoked at the right time, with the right context, and handing off cleanly to the next one: that's system design. It has more in common with building a good API than it does with prompt engineering. You're thinking about contracts between components, about what state needs to be passed and what can be left behind, about failure paths when an agent returns something unexpected.

The next goal I'm working toward is a self-improvement loop: agents that can review Nova's own code, flag problems, propose refactors, and surface improvements without me initiating every cycle. This is not a small thing to get right. The obvious risk is an agent that confidently proposes changes that are technically valid but contextually wrong, because it doesn't know why a particular decision was made, only what the code looks like now. So the loop needs memory, it needs access to the reasoning behind past decisions, and it needs a human checkpoint before anything actually changes.

What I find genuinely interesting about this problem is that it forces you to think about documentation and context as infrastructure. If an agent is going to review a codebase and make useful suggestions, it needs more than the code. It needs the intent. That means the notes I write about architectural decisions, the reasons I chose one pattern over another, the constraints from a specific client context: all of that needs to be structured and retrievable. Building the self-improvement loop is, in large part, a problem of making implicit knowledge explicit.

This connects to something I keep coming back to across all my projects: AI as leverage only works if the underlying system is legible. Nova can help me move faster on AIREP, on Find a Sign, on client work, but only if it can understand what's already been built and why. A messy, undocumented codebase doesn't become easier to work with because you've added an AI layer on top. You still have to do the work of making things clear. The AI just makes that work more obviously worth doing.

Production is a milestone, but it's also just the starting point for the more interesting problems. The architecture is stable enough to build on. Now I'm building on it.
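
For readers who want the "defined agent roles invoked by an orchestration layer" idea in concrete form, here is a minimal Python sketch. It is an illustration under stated assumptions, not Nova's actual code: the names `AgentRole`, `Orchestrator`, `route`, and `can_use_tool`, the example roles, and the rule that every task kind belongs to exactly one role are all hypothetical.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class AgentRole:
    """A defined role (not a running process): prompt, tools, and scope."""
    name: str
    system_prompt: str
    tools: frozenset[str]   # tools this role is allowed to call
    scope: frozenset[str]   # task kinds this role is responsible for


class Orchestrator:
    """Hypothetical router: picks exactly one role per task kind."""

    def __init__(self, roles: list[AgentRole], fallback: str) -> None:
        self._roles = {role.name: role for role in roles}
        if fallback not in self._roles:
            raise ValueError(f"unknown fallback role: {fallback}")
        self._fallback = fallback

        # Assumption: overlapping scopes are treated as a design error,
        # so two roles claiming the same task kind fail fast at startup.
        self._by_kind: dict[str, AgentRole] = {}
        for role in roles:
            for kind in role.scope:
                if kind in self._by_kind:
                    other = self._by_kind[kind].name
                    raise ValueError(
                        f"task kind {kind!r} claimed by {other} and {role.name}"
                    )
                self._by_kind[kind] = role

    def route(self, task_kind: str) -> AgentRole:
        """Return the role responsible for task_kind, or the fallback role."""
        return self._by_kind.get(task_kind, self._roles[self._fallback])

    def can_use_tool(self, role_name: str, tool: str) -> bool:
        """Check tool access against the role's declared tool set."""
        return tool in self._roles[role_name].tools


if __name__ == "__main__":
    roles = [
        AgentRole("general_chat", "Handle open-ended conversation.",
                  frozenset(), frozenset({"chat"})),
        AgentRole("erp_support", "Answer ERP questions only.",
                  frozenset({"erp_lookup"}), frozenset({"erp"})),
    ]
    orchestrator = Orchestrator(roles, fallback="general_chat")
    print(orchestrator.route("erp").name)
    print(orchestrator.can_use_tool("general_chat", "erp_lookup"))
```

Rejecting overlapping scopes when the orchestrator is built is one way to turn "each agent does exactly its job and nothing more" into a checked contract instead of a prompt-level convention. A real system would likely need richer routing than a single task-kind lookup.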
