Building in Public: One Week of AI Development at Plutonal

Last week was one of those periods where everything breaks, then somehow comes together, then breaks again. But that's startup life, right? Here's what actually happened behind the scenes at Plutonal as we inch closer to our February 2026 launch.
The RAG System Finally Works
We finally cracked our Retrieval Augmented Generation architecture. After weeks of wrestling with how to store and access financial research at scale, we got our vector database properly connected and our caching layer optimised. The real breakthrough came when we solved the chunking problem, which sounds boring but was absolutely critical.
Here's what was happening: our embedding model was receiving text chunks that were either too small (losing context) or too large (diluting semantic meaning). Imagine trying to teach an AI about quantitative finance by randomly cutting up academic papers mid-sentence. That's what we were accidentally doing.
We rebuilt the entire chunking logic to respect semantic boundaries while maintaining optimal token windows for our embedding model. Then we implemented a sliding window approach with overlap to preserve context across chunk boundaries. The difference in retrieval accuracy was night and day.
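To make the idea concrete, here's a minimal sketch of sentence-aware chunking with a sliding overlap window. The function names and token budgets are illustrative (tokens are approximated by word counts), not our production code:

```python
# Hypothetical sketch: split text into chunks that respect sentence
# boundaries, carrying a tail of sentences forward as overlap so
# context survives across chunk edges.
import re

def chunk_text(text, max_tokens=200, overlap_tokens=40):
    """Split text into overlapping, sentence-aligned chunks.

    Tokens are approximated by whitespace-separated words; a real
    system would count with the embedding model's tokenizer.
    """
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    chunks, current, current_len = [], [], 0
    for sentence in sentences:
        n = len(sentence.split())
        if current and current_len + n > max_tokens:
            chunks.append(" ".join(current))
            # Carry trailing sentences forward until the overlap budget is met.
            kept, kept_len = [], 0
            for s in reversed(current):
                kept_len += len(s.split())
                kept.insert(0, s)
                if kept_len >= overlap_tokens:
                    break
            current, current_len = kept, kept_len
        # Note: a single sentence longer than max_tokens still becomes
        # its own oversized chunk; production code would split it further.
        current.append(sentence)
        current_len += n
    if current:
        chunks.append(" ".join(current))
    return chunks
```

Because chunks break only at sentence boundaries, the embedding model never sees a paper cut mid-sentence, and the overlap means a concept defined at the end of one chunk is still present at the start of the next.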
Once we fixed that, we loaded 65 additional research papers and financial textbooks into the system. We're now sitting on a knowledge base that would make most investment banks jealous, all accessible through natural conversation. When we tested retrieval precision, we were hitting 94% relevance on complex quantitative finance queries.
The Multi-Agent Coordination Challenge
The harder problem this week was getting our various AI agents to work together intelligently. We've got agents specialising in different types of market analysis, and they need to coordinate without stepping on each other's toes or duplicating work.
We implemented a query routing system that analyses user intent and determines which agents to activate. A simple question like "what's happening with tech stocks?" might trigger sentiment analysis, cross-market correlation checks, and news aggregation simultaneously. The orchestration layer needed to handle parallel execution, merge results intelligently, and present coherent answers.
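A stripped-down sketch of that routing-plus-parallel-execution pattern, using `asyncio.gather` for concurrency. The agent names and keyword triggers here are toy stand-ins, not our real intent classifier:

```python
# Illustrative sketch: map query keywords to agents, then run the
# selected agents concurrently and merge their results.
import asyncio

AGENT_TRIGGERS = {
    "sentiment":   {"happening", "sentiment", "mood"},
    "correlation": {"stocks", "correlation", "sector"},
    "news":        {"happening", "news", "latest"},
}

async def sentiment_agent(query):   return {"agent": "sentiment", "score": 0.4}
async def correlation_agent(query): return {"agent": "correlation", "pairs": []}
async def news_agent(query):        return {"agent": "news", "headlines": []}

AGENTS = {
    "sentiment": sentiment_agent,
    "correlation": correlation_agent,
    "news": news_agent,
}

def route(query):
    """Pick agents whose trigger keywords appear in the query."""
    words = set(query.lower().replace("?", "").split())
    return [name for name, triggers in AGENT_TRIGGERS.items() if words & triggers]

async def answer(query):
    selected = route(query)
    # Parallel execution: all selected agents run at once.
    results = await asyncio.gather(*(AGENTS[name](query) for name in selected))
    return {r["agent"]: r for r in results}
```

With this shape, "what's happening with tech stocks?" fans out to sentiment, correlation, and news agents in one pass, and the orchestration layer only has to merge three structured results.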
The breakthrough was implementing proper conversation memory with Redis. Now the system actually remembers what you asked three questions ago and can build on that context. It sounds obvious, but getting this right with multiple agents maintaining separate analytical threads was genuinely tricky.
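The memory layer itself is conceptually simple. Here's a minimal sketch where a plain dict stands in for Redis; in production the same interface maps onto Redis lists (RPUSH to append a turn, LRANGE to read context, LTRIM to cap history). All names are illustrative:

```python
# Sketch of per-session conversation memory with a capped history.
# A dict stands in for Redis here so the example is self-contained.
import json

class ConversationMemory:
    def __init__(self, max_turns=20):
        self.store = {}          # session_id -> list of JSON-encoded turns
        self.max_turns = max_turns

    def append(self, session_id, role, text):
        turns = self.store.setdefault(session_id, [])
        turns.append(json.dumps({"role": role, "text": text}))
        # Keep only the most recent turns (Redis equivalent: LTRIM).
        del turns[:-self.max_turns]

    def context(self, session_id):
        """Return the recent turns, oldest first, for prompt assembly."""
        return [json.loads(t) for t in self.store.get(session_id, [])]
```

The subtlety with multiple agents isn't this data structure; it's deciding which agent's analytical thread a follow-up question belongs to before you assemble its context window.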
The News Integration Nightmare
We spent considerable time this week switching our financial news data provider. The previous solution was giving us inconsistent data quality and latency spikes that killed the user experience. We researched alternatives, ran parallel tests, and finally migrated to a better provider that covers both US and Indian markets with lower latency.
The integration itself was straightforward. The hard part was rebuilding our news processing pipeline to handle the new data structure, set up proper webhooks for real-time updates, and implement fallback mechanisms when the API inevitably has issues.
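The fallback pattern is worth showing in miniature: try each provider in priority order with a couple of retries, and only fail if every source is down. The provider callables below are hypothetical stand-ins:

```python
# Hedged sketch of provider fallback: degrade gracefully when the
# primary news API has issues instead of failing the whole pipeline.
def fetch_with_fallback(providers, query, retries=2):
    """providers: ordered list of callables taking a query string."""
    last_error = None
    for provider in providers:
        for _ in range(retries):
            try:
                return provider(query)
            except Exception as exc:  # in practice, catch specific API errors
                last_error = exc
    raise RuntimeError(f"all providers failed: {last_error}")
```

Real code would add exponential backoff between retries and distinguish retryable errors (timeouts, 5xx) from permanent ones (bad credentials), but the shape is the same.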
We also added Wikipedia and other search capabilities as fallback agents when specialized financial sources don't have answers. Sounds simple, but coordinating when to use which source and how to weight different information streams required rebuilding significant parts of our agent routing logic.
Compliance as a Technical Problem
Most AI startups ignore this until regulators come knocking. We're building compliance guardrails directly into the model behaviour. Plutonal provides research and analysis, not financial advice. That distinction keeps us from getting regulated into oblivion by the SEC and SEBI.
We implemented a classification layer that detects when responses might cross into advisory territory. If a user asks "should I buy this stock?", the system recognises the advisory intent and reframes to educational analysis instead. We're essentially training the AI to be a careful analyst rather than a reckless advisor.
This required building custom prompt engineering frameworks and response validators that run before anything reaches the user. We tested hundreds of edge cases to ensure the system maintains this boundary even when users explicitly try to push it into giving recommendations.
The Performance Obsession
Nobody wants to wait 30 seconds for market analysis. But we're running multiple AI models, querying vector databases, calling external APIs, and processing thousands of data points per query. The naive implementation was painfully slow.
We implemented aggressive caching strategies using Redis. Frequently accessed embeddings stay in memory. Common queries get cached responses. We also optimised our database queries and implemented parallel processing where agents can run simultaneously rather than sequentially.
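The query-cache layer looks roughly like this sketch, with a dict standing in for Redis (the real equivalent is SETEX/GET). Key derivation, normalisation, and TTL values are illustrative:

```python
# Sketch of a TTL query cache with hit/miss counters, keyed on a
# hash of the normalised query. A dict stands in for Redis.
import hashlib
import time

class QueryCache:
    def __init__(self, ttl_seconds=300):
        self.ttl = ttl_seconds
        self.data = {}           # key -> (expires_at, value)
        self.hits = self.misses = 0

    def _key(self, query):
        # Normalise so trivially different phrasings share a cache entry.
        return hashlib.sha256(query.strip().lower().encode()).hexdigest()

    def get(self, query):
        entry = self.data.get(self._key(query))
        if entry and entry[0] > time.time():
            self.hits += 1
            return entry[1]
        self.misses += 1
        return None

    def put(self, query, value):
        self.data[self._key(query)] = (time.time() + self.ttl, value)
```

The hit/miss counters are what feed the cache-hit-rate tracking: they tell you which queries are worth pre-computing versus calculating on demand.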
The result: most queries now return in under 3 seconds despite running sophisticated multi-agent analysis in the background. We're also tracking cache hit rates and query patterns to continuously optimise what we pre-compute versus what we calculate on demand.
What's Actually Hard About AI in Finance
The hardest part isn't building an AI chatbot. Every fintech startup has that now. The challenge is making the AI actually understand quantitative finance deeply enough to be useful to sophisticated investors.
When someone asks about cross-market correlations or volatility clustering, the system needs to know these are specific statistical concepts with precise mathematical definitions. It can't just generate plausible-sounding text. It needs to run actual calculations, cite proper academic research, and present results that would pass muster with a quantitative analyst.
We're essentially building AI agents that can read academic papers about GARCH models or Granger causality tests, understand the methodology, implement it on real-time market data, and explain the results conversationally. That requires RAG systems that actually work, agents that coordinate intelligently, and enough domain knowledge baked into the prompts that the AI doesn't hallucinate financial concepts.
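To show what "run actual calculations" means at its simplest: a cross-market correlation query ultimately bottoms out in arithmetic like a Pearson correlation over aligned return series. A plain-Python version (the return data in the usage below is synthetic):

```python
# Worked example: Pearson correlation between two return series,
# the primitive underlying a cross-market correlation check.
import math

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length series."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sd_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    sd_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (sd_x * sd_y)
```

The hard part isn't this formula; it's knowing when a user's question calls for it, fetching the right aligned series, and explaining the resulting number the way an analyst would.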
The Honest Truth
We're three months from beta launch with 100 users. We've got our core AI infrastructure working, our multi-agent coordination stable, and our knowledge retrieval hitting institutional-grade accuracy.
Every week matters. Every agent optimised, every embedding improved, every latency reduction gets us closer to something that could genuinely change how retail investors access market intelligence.
The big financial institutions have been gatekeeping sophisticated analysis for decades, partly because the technology to democratise it simply didn't exist. Now it does. We're building it.
More updates next week, assuming our vector database doesn't explode.
Neil Brahmavar
Founder & CEO, Plutonal