When people ask me what I've been working on lately, I tell them: an AI that manages my calendar, reads my emails, reviews my code, ships bug fixes, and gets a little smarter every day. They usually assume I mean some SaaS tool with my name in the settings. I mean software I wrote myself.
Nova is my personal AI assistant — built from scratch on FastAPI, PostgreSQL, and four LLM providers. It started as a Telegram bot in early 2024 and has since grown into something I genuinely rely on every day.
Why build your own?
The honest answer is that I wanted to understand what I was building with. Every team I've worked on uses LLMs for something now — code review, document generation, customer-facing chat. But there's a real gap between prompting a model and designing a system that's reliable enough to trust with real work. I built Nova to close that gap for myself.
The longer answer is that off-the-shelf assistants don't compose. They don't know that my email client is connected to my calendar, that my AIREP deployment runs on Render, or that a spike in the freight queue usually means I need to check the API key. Nova knows these things because I taught it.
The architecture
Nova runs as a FastAPI application with a PostgreSQL backend. At its core is a Brain module — the request coordinator that routes incoming messages from Telegram or the web dashboard to the right part of the system.
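To make that concrete, here's a minimal sketch of the shape (the IncomingMessage model, the /messages route, and Brain.route are illustrative names, not Nova's actual API):

```python
from fastapi import FastAPI
from pydantic import BaseModel

app = FastAPI()

class IncomingMessage(BaseModel):
    channel: str  # "telegram" or "dashboard"
    user_id: str
    text: str

class Brain:
    """Coordinator: decides which part of the system handles a message."""

    def route(self, msg: IncomingMessage) -> str:
        # The real coordinator consults the agent layer described below;
        # this stub just echoes the routing decision.
        return f"routing {msg.channel} message from {msg.user_id}"

brain = Brain()

@app.post("/messages")
async def handle_message(msg: IncomingMessage) -> dict:
    return {"reply": brain.route(msg)}
```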
Underneath that is an agent layer. There are currently 33 runtime agents, each defined in a YAML file with its own system prompt, preferred LLM provider, and context files. When a message arrives, the manager agent decides which specialist handles it — the ERP agent for supply chain questions, the language coach for German practice, the coding agent for anything related to the repos.
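As a sketch of what one of those YAML definitions might look like, and how it could be loaded (the field names and the erp-agent example are my guesses at the shape, not Nova's actual schema):

```python
from dataclasses import dataclass, field

import yaml

AGENT_YAML = """
name: erp-agent
provider: anthropic            # preferred LLM provider
system_prompt: |
  You are the ERP specialist. Answer supply chain questions precisely.
context_files:
  - context/airep_overview.md
  - context/freight_queue.md
"""

@dataclass
class AgentSpec:
    name: str
    provider: str
    system_prompt: str
    context_files: list[str] = field(default_factory=list)

spec = AgentSpec(**yaml.safe_load(AGENT_YAML))
print(f"{spec.name} uses {spec.provider} with {len(spec.context_files)} context files")
```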
The memory system uses pgvector for hybrid search: BM25 keyword matching combined with vector similarity via Reciprocal Rank Fusion. Short-term context lives in a sliding window; long-term knowledge is embedded and retrieved by relevance.
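Reciprocal Rank Fusion itself is simple enough to show in a few lines: each ranker contributes 1/(k + rank) per document, and the summed scores decide the final order (k=60 is the conventional default; this is a generic sketch, not Nova's code):

```python
def rrf_merge(keyword_ranked: list[str], vector_ranked: list[str], k: int = 60) -> list[str]:
    """Fuse two ranked lists: score(d) = sum of 1 / (k + rank) across rankers."""
    scores: dict[str, float] = {}
    for ranked in (keyword_ranked, vector_ranked):
        for rank, doc_id in enumerate(ranked, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    return sorted(scores, key=scores.get, reverse=True)

# A document that both BM25 and vector search rank highly wins overall.
print(rrf_merge(["a", "b", "c"], ["b", "c", "d"]))  # ['b', 'c', 'a', 'd']
```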
The coding swarm
The part I'm most proud of is the coding swarm. Every morning Nova runs a review pipeline across three codebases — Nova itself, AIREP (my ERP project), and keirantrace.com. It scores open issues and improvement opportunities, builds a prioritised queue, and then executes the highest-value items through an eight-stage pipeline: triage, plan, code, review, verify, test, deploy, brief.
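In skeleton form, that kind of pipeline is just an ordered list of stage handlers, each receiving the item the previous stage produced (a sketch under my own naming, not Nova's implementation):

```python
from typing import Callable

Stage = tuple[str, Callable[[dict], dict]]

def run_item(item: dict, stages: list[Stage]) -> dict:
    """Run one queue item through every stage in order.

    Each handler returns the (possibly updated) item or raises to abort;
    the orchestrator wrapping this adds cost caps and timeouts (see below).
    """
    for name, handler in stages:
        item = handler(item)
        item.setdefault("completed_stages", []).append(name)
    return item

stages: list[Stage] = [(s, lambda item: item) for s in
                       ("triage", "plan", "code", "review",
                        "verify", "test", "deploy", "brief")]
result = run_item({"issue": "flaky retry in freight sync"}, stages)
print(result["completed_stages"])
```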
By the time I sit down at my desk, there's a Telegram message summarising what changed overnight. Sometimes it's a bug fix. Sometimes it's a refactor I'd been meaning to do for weeks. It ships, CI passes, and it's on Render before I've had coffee.
This isn't magic — it's engineering. The pipeline has hard cost caps, stage timeouts, a reviewer agent that can reject work and send it back, and an orchestrator that watches for anomalies. When something goes wrong, it alerts me rather than silently failing.
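Those guard rails might look roughly like this: a running spend total checked against a hard cap, and asyncio.wait_for as the per-stage timeout (the class, the numbers, and the print stand-in for the real alert are all illustrative):

```python
import asyncio

class BudgetExceeded(RuntimeError):
    pass

class Orchestrator:
    """Hard limits around pipeline stages; the values here are made up."""

    def __init__(self, cost_cap_usd: float = 5.0, stage_timeout_s: float = 300.0):
        self.cost_cap_usd = cost_cap_usd
        self.stage_timeout_s = stage_timeout_s
        self.spent_usd = 0.0

    def record_spend(self, usd: float) -> None:
        """Called after every LLM request; aborts the run at the cap."""
        self.spent_usd += usd
        if self.spent_usd > self.cost_cap_usd:
            raise BudgetExceeded(f"spent ${self.spent_usd:.2f} of ${self.cost_cap_usd:.2f}")

    async def run_stage(self, name: str, stage_coro):
        try:
            return await asyncio.wait_for(stage_coro, timeout=self.stage_timeout_s)
        except asyncio.TimeoutError:
            print(f"ALERT: stage {name!r} timed out")  # alert, never fail silently
            raise
```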
Self-improvement
The Ghost system is Nova's most experimental feature. It observes patterns in its own outputs — cases where it hedged when it should have committed, where it asked a clarifying question that wasn't necessary, where a response was longer than it needed to be. Over time it proposes rewrites to its own system prompts and context files.
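A crude stand-in for that kind of self-audit, just to show the shape of the idea (the heuristics and thresholds below are mine, not Ghost's):

```python
HEDGE_PHRASES = ("it might be", "it could be", "perhaps", "i'm not sure")

def audit_outputs(outputs: list[str], max_words: int = 300) -> list[str]:
    """Flag responses that hedge or run long; findings feed prompt rewrites."""
    findings = []
    for text in outputs:
        lowered = text.lower()
        if any(phrase in lowered for phrase in HEDGE_PHRASES):
            findings.append(f"hedged: {text[:50]}...")
        if len(text.split()) > max_words:
            findings.append(f"overlong ({len(text.split())} words): {text[:50]}...")
    return findings
```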
It's a feedback loop. Not an AGI moment, but something more useful: a system that surfaces its own failure modes instead of hiding them.
What I learned
A few things surprised me while building this.
Reliability is harder than capability. Getting a model to generate useful output is straightforward. Getting it to do so consistently, within budget, without hallucinating file paths or silently skipping steps, takes real systems work. Retry logic, idempotency, structured output validation — the boring parts matter most.
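The pattern that carries most of the weight is retry-with-validation: parse the model's output against a schema, reject anything that fails (including file paths that don't exist on disk), and feed the error back into the next attempt. A sketch with pydantic, where PatchPlan and call_llm are placeholders rather than Nova's real types:

```python
import os
from pydantic import BaseModel, ValidationError

class PatchPlan(BaseModel):
    file_path: str
    summary: str

def call_llm_validated(prompt: str, call_llm, max_attempts: int = 3) -> PatchPlan:
    """Retry until the model returns JSON that parses, validates, and
    references a file that actually exists."""
    for _ in range(max_attempts):
        raw = call_llm(prompt)
        try:
            plan = PatchPlan.model_validate_json(raw)
            if not os.path.exists(plan.file_path):
                raise ValueError(f"hallucinated path: {plan.file_path}")
            return plan
        except (ValidationError, ValueError) as err:
            prompt += f"\n\nYour last reply was invalid ({err}). Return valid JSON only."
    raise RuntimeError(f"no valid output after {max_attempts} attempts")
```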
Prompt design is engineering. A system prompt isn't a magic spell. It's a specification. The same discipline that goes into a good API contract goes into a good agent prompt. Ambiguity in the prompt shows up as inconsistency in behaviour.
Observability is non-negotiable. Nova logs every LLM call with token counts, model, and estimated cost. I can query that table to see exactly what I've spent and where. Without that, cost would get away from me and debugging would be guesswork.
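In practice that's one table and one insert per call. Something like the following, using psycopg (the llm_calls schema and column names are my guesses at the shape, not Nova's actual table):

```python
import psycopg

LOG_CALL = """
INSERT INTO llm_calls (agent, model, prompt_tokens, completion_tokens, cost_usd)
VALUES (%s, %s, %s, %s, %s)
"""

# The query that keeps spend honest: cost per model over the last 30 days.
COST_BY_MODEL = """
SELECT model,
       SUM(cost_usd) AS total_usd,
       SUM(prompt_tokens + completion_tokens) AS tokens
FROM llm_calls
WHERE created_at > now() - interval '30 days'
GROUP BY model
ORDER BY total_usd DESC
"""

def log_call(conn: psycopg.Connection, agent: str, model: str,
             prompt_tokens: int, completion_tokens: int, cost_usd: float) -> None:
    with conn.cursor() as cur:
        cur.execute(LOG_CALL, (agent, model, prompt_tokens, completion_tokens, cost_usd))
    conn.commit()
```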
What's next
Nova is a living system and I keep adding to it. Current priorities include better episodic memory — using completed pipeline runs to inform future planning — and tighter integration with AIREP for more autonomous ERP operations.
If any of this sounds interesting, I'm happy to talk through the architecture. The design decisions have been hard-won and I'm always keen to compare notes with others building in this space.