
34 Agents, One System: What 'Production' Actually Means for a Personal AI


Nova, my personal AI system, is now in production. But what does that actually mean when you're a solo developer running a multi-agent system against your own life and work?

<p>There's a moment in every project where "in progress" stops being an honest description. Nova hit that point recently. With 34 agents now running across general chat, pipeline execution, ERP specialisation, and language tasks, calling it a side project or an experiment doesn't hold up anymore. It's in production. My production.</p><p>That framing matters more than it sounds. Most writing about AI agents talks about deploying them for customers, for scale, for revenue. Nova is none of those things yet. It's a system I built for myself — to orchestrate my own work across AIREP, Find a Sign, Sweeper Parts, client websites, and whatever else is running at any given time. The user is me. The stakes are real, just not in the traditional commercial sense.</p><p>So what does production mean in that context? It means the system has to be reliable enough that I actually depend on it. Not "I demo it when it works" — I mean I route real decisions through it, store real memory in it, and trust it to surface the right context when I need it. The moment I caught myself frustrated when Nova wasn't available, rather than just mildly inconvenienced, I knew we'd crossed the line.</p><p>Getting to 34 agents didn't happen through a grand design session. It happened incrementally, driven by friction. Every time I noticed I was doing the same kind of cognitive work repeatedly — context-switching between projects, translating requirements into structured tasks, looking up my own prior decisions — I asked whether that was something a well-scoped agent could absorb. Usually the answer was yes, eventually.</p><p>The architecture reflects that. Agents are scoped tightly. An ERP specialist agent doesn't need to know anything about signage marketplace logic. A pipeline execution agent doesn't need opinions about tone of voice. Keeping concerns separated meant each agent could be built to a high standard in a narrow domain, rather than one sprawling generalist that does everything poorly. 
This mirrors how I think about software generally: the discipline of not letting concerns bleed into each other compounds over time.</p><p>The orchestration layer is where the interesting complexity lives. Individual agents are relatively simple. Deciding which agent handles which request, how context passes between them, and how failures get handled without silent degradation: that's the hard part. Production revealed edge cases that no amount of local testing would have. Ambiguous intents. Requests that span multiple domains. Context that's technically available but semantically stale. These aren't hypothetical problems; they're things I ran into in the first weeks of real use.</p><p>One thing I didn't anticipate: how much the system would expose gaps in my own thinking. When you ask a well-constructed agent a question and it gives you a confused answer, it's worth asking whether the question was actually clear. A surprising number of times, the agent was reflecting genuine ambiguity back at me. That feedback loop, using the system's failures as a diagnostic on my own mental models, has been one of the more unexpected benefits.</p><p>The self-improvement loop is the next serious milestone. Right now Nova's agents are built and maintained by me, manually. The aim is to get to a point where agents can review their own performance, identify where they're underperforming, and propose or even implement improvements autonomously. That's a harder problem than building the agents in the first place, because it requires the system to have a reliable model of what "good" looks like, which is not a given.</p><p>I'm not going to pretend I have that solved. But having a production system running against real workloads is the necessary foundation. You can't improve what you're not actually using. 
The feedback that will drive that loop has to come from somewhere, and right now it's coming from me using Nova every day and noticing where it falls short.</p><p>That's the unglamorous truth about production AI systems, even personal ones: the work doesn't stop at deployment. It starts there.</p>
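<p>For readers who want the orchestration concerns in concrete form, here is a minimal sketch. It is not Nova's actual code: the agent names, the keyword sets, the <code>route</code> function, and the <code>RoutingError</code> exception are all hypothetical, and keyword matching stands in for whatever intent detection a real system would use. It illustrates two ideas from earlier: each agent declares a narrow scope, and a request that matches no agent or more than one agent fails loudly rather than degrading silently.</p>

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Agent:
    """A tightly scoped agent: a name plus the keywords that mark its domain."""
    name: str
    keywords: frozenset


class RoutingError(Exception):
    """Raised instead of guessing, so a routing failure is never silent."""


# Hypothetical registry; these names and scopes are assumptions for illustration.
AGENTS = [
    Agent("erp_specialist", frozenset({"invoice", "inventory", "ledger"})),
    Agent("pipeline_executor", frozenset({"deploy", "build", "pipeline"})),
    Agent("language_tasks", frozenset({"translate", "summarise", "rewrite"})),
]


def route(request: str, agents=AGENTS) -> Agent:
    """Return the single agent whose scope matches the request, or raise."""
    words = set(request.lower().split())
    matches = [agent for agent in agents if agent.keywords & words]
    if not matches:
        # Nothing in scope: surface the gap rather than fall back to a generalist.
        raise RoutingError(f"no agent in scope for: {request!r}")
    if len(matches) > 1:
        # The request spans multiple domains: report the ambiguity to the caller.
        names = ", ".join(agent.name for agent in matches)
        raise RoutingError(f"ambiguous intent, spans: {names}")
    return matches[0]
```

<p>In this sketch, <code>route("translate this paragraph")</code> returns the language agent, while <code>route("deploy the invoice service")</code> raises <code>RoutingError</code> because the request touches both the pipeline and ERP scopes. Raising on ambiguity is one possible design choice; a real orchestrator might instead ask a clarifying question or split the request.</p>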
