Atticus AI
2025 – A legal AI platform that gives small law firms the research and analysis tools only BigLaw could afford.
The Problem
Legal tech is built for BigLaw. The platforms that handle research, document review, and compliance cost six figures a year and assume you have a dedicated IT department. Small and mid-sized firms — roughly 80% of the market — are stuck with generic tools, manual processes, or nothing at all.
Legal research alone can take hours per issue. Document review is still done by reading every page. Compliance is tracked in spreadsheets. These firms aren't short on legal talent — they're short on infrastructure.
That gap is where I started building. The space sits at the intersection of complex NLP, real-time systems, data pipelines, and security constraints that actually matter — HIPAA compliance isn't a checkbox, it shapes every architectural decision.
The Platform
Atticus is a multi-tenant legal workspace. Lawyers log in, see their firm's matters, and interact with AI tools that understand legal context. The core capabilities are RAG-powered research with source attribution, contract review with structured clause analysis, real-time AI chat with transparent tool use, and document creation from templates.
The agent loop is the part I'm most proud of. When a user asks a question, the system doesn't just call an LLM and return text. It runs a 7-step pipeline:
- Query classification
- Document retrieval
- Context assembly
- Prompt construction
- Model invocation
- Response validation
- Source citation
Each step streams progress to the UI so the user sees exactly what the system is doing — no black box.
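The seven steps above can be sketched as a generator that yields one progress event per stage, which is what lets the UI stream status. This is a minimal illustration with stub handlers; the real steps call retrieval, Bedrock, and validators.

```python
from typing import Callable, Iterator

# Each handler takes and returns the pipeline state. All are stubs here.
def classify_query(state: dict) -> dict:
    state["intent"] = "research"          # stub classifier
    return state

def retrieve_docs(state: dict) -> dict:
    state["docs"] = ["doc-1"]             # stub retrieval
    return state

def assemble_context(state: dict) -> dict:
    state["context"] = " ".join(state["docs"])
    return state

def build_prompt(state: dict) -> dict:
    state["prompt"] = f"{state['context']}\n\nQ: {state['query']}"
    return state

def invoke_model(state: dict) -> dict:
    state["answer"] = "stub answer"       # stub model call
    return state

def validate_response(state: dict) -> dict:
    state["valid"] = bool(state["answer"])
    return state

def cite_sources(state: dict) -> dict:
    state["citations"] = list(state["docs"])
    return state

PIPELINE: list[tuple[str, Callable[[dict], dict]]] = [
    ("query_classification", classify_query),
    ("document_retrieval", retrieve_docs),
    ("context_assembly", assemble_context),
    ("prompt_construction", build_prompt),
    ("model_invocation", invoke_model),
    ("response_validation", validate_response),
    ("source_citation", cite_sources),
]

def run_agent_loop(query: str) -> Iterator[dict]:
    """Run each step in order, yielding a progress event the UI can stream."""
    state = {"query": query}
    for name, step in PIPELINE:
        state = step(state)
        yield {"step": name, "status": "done"}
    yield {"step": "complete", "result": state}
```

In production each yielded event would be serialized onto the SSE stream described later, so the client sees the step names in real time rather than a spinner.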
Constitutional AI guardrails enforce hard boundaries: the system never fabricates citations, never presents analysis as legal advice, and always attributes sources. If the model can't find supporting evidence, it says so rather than hallucinating an answer.
Agentic Workflow
The platform doesn't operate as a single monolithic AI call. It runs an orchestrator that delegates to a set of specialized sub-agents, each scoped to a narrow task with minimal context. A research agent handles retrieval and citation. A drafting agent handles document generation. A review agent handles contract clause analysis. A compliance agent validates outputs against regulatory constraints.
Each sub-agent receives only the context it needs — the research agent gets the query and document index, not the full conversation history. The drafting agent gets the outline and relevant precedents, not the raw search results. This keeps token usage tight and prevents context pollution, where irrelevant information degrades response quality as the window fills up.
The orchestrator classifies incoming requests, selects the right agent chain, and manages handoffs between them. A complex request like “draft a motion to dismiss based on the statute of limitations defense in the Henderson matter” triggers a three-agent pipeline: research pulls the relevant case documents and Florida civil procedure rules, drafting assembles the motion with proper legal formatting and citation, and review validates the citations exist and the arguments are internally consistent.
This architecture also makes the system more reliable. If the drafting agent produces a citation the review agent can't verify, the system flags it rather than silently including a hallucinated reference. The agents check each other's work the same way a junior associate and a partner would.
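A sketch of those scoped handoffs: each sub-agent receives only the slice of state it needs, and review flags any citation it cannot verify. The matching logic is a hypothetical stand-in for real retrieval, drafting, and verification.

```python
def research_agent(query: str, index: list[str]) -> list[str]:
    # Gets the query and the document index, not the conversation history.
    return [doc for doc in index if query.lower() in doc.lower()]

def drafting_agent(outline: str, precedents: list[str]) -> dict:
    # Gets the outline and retrieved precedents, not the raw search results.
    return {"text": outline, "citations": list(precedents)}

def review_agent(draft: dict, sources: list[str]) -> dict:
    # Flags any citation that cannot be verified against the retrieved sources,
    # rather than letting a hallucinated reference through silently.
    missing = [c for c in draft["citations"] if c not in sources]
    return {"ok": not missing, "flagged": missing}

def orchestrate(query: str, index: list[str], outline: str) -> dict:
    sources = research_agent(query, index)
    draft = drafting_agent(outline, sources)
    review = review_agent(draft, sources)
    return {"draft": draft, "review": review}
```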
Hybrid RAG Pipeline
The retrieval pipeline uses a hybrid approach — 70% semantic search via pgvector embeddings and 30% BM25 keyword matching, fused with Reciprocal Rank Fusion. A legal confidence scorer filters results, and document chunking is domain-aware: contracts split by section boundaries, case law by holdings and reasoning, statutes by subsection.
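The weighted fusion step can be sketched in a few lines. This assumes each retriever returns a ranked list of document IDs; the 70/30 weights mirror the split above, and `k` is the standard RRF dampening constant (60 here, an assumption, not a documented value from the system).

```python
def rrf_fuse(semantic: list[str], keyword: list[str],
             w_sem: float = 0.7, w_kw: float = 0.3, k: int = 60) -> list[str]:
    """Weighted Reciprocal Rank Fusion of two ranked doc-id lists.

    Each list contributes weight / (k + rank) per document; k keeps a
    single top rank from dominating the fused score.
    """
    scores: dict[str, float] = {}
    for weight, ranking in ((w_sem, semantic), (w_kw, keyword)):
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + weight / (k + rank)
    return sorted(scores, key=scores.get, reverse=True)
```

Documents that appear in both rankings accumulate score from both lists, which is why hybrid retrieval surfaces results that pure semantic or pure keyword search would each rank lower.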
Model Routing
Simple queries — definitions, procedural questions, status lookups — route to Haiku. Complex analysis — multi-document comparison, contract risk assessment, nuanced legal reasoning — routes to Sonnet. The routing classifier itself is lightweight, a few hundred tokens of context analysis, and reduces AI spend by roughly 60%.
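As a toy illustration of the routing decision (the real classifier is a lightweight context-analysis step, not this keyword heuristic; the marker list below is purely a stand-in):

```python
def route_model(query: str) -> str:
    """Route short definition/procedural queries to the small model,
    everything else to the larger one. Hypothetical heuristic."""
    simple_markers = ("what is", "define", "status of", "deadline for")
    if len(query) < 120 and any(m in query.lower() for m in simple_markers):
        return "haiku"
    return "sonnet"
```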
A circuit breaker pattern handles failures: if the primary model is unavailable, the system falls through three tiers automatically. Each tier has calibrated token limits and capability flags so the response quality degrades gracefully rather than failing outright.
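The tiered fallback can be sketched as follows, assuming three tiers with calibrated token limits; `call_model` is a hypothetical invoker injected for clarity, and the tier names are illustrative.

```python
TIERS = [
    {"model": "primary",   "max_tokens": 4096},
    {"model": "secondary", "max_tokens": 2048},
    {"model": "fallback",  "max_tokens": 1024},
]

class AllTiersFailed(Exception):
    pass

def invoke_with_fallback(prompt: str, call_model) -> dict:
    """Try each tier in order, falling through on failure so the response
    degrades gracefully instead of erroring out."""
    last_err = None
    for tier in TIERS:
        try:
            text = call_model(tier["model"], prompt, tier["max_tokens"])
            return {"model": tier["model"], "text": text}
        except Exception as err:
            last_err = err  # record and fall through to the next tier
    raise AllTiersFailed(str(last_err))
```

A production circuit breaker would also track failure rates and stop attempting a tripped tier for a cooldown window; this sketch shows only the fall-through path.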
Document Processing
The document processing pipeline uses Rust for text extraction and vector operations, which runs 31x faster than the equivalent Python path for large PDFs. Streaming is end-to-end: Bedrock responses flow through SSE, Nginx with proxy buffering disabled, and an async generator on the frontend. Time to first token is under a second.
CDC Pipeline
The CDC pipeline is the infrastructure work nobody sees but everything depends on. Debezium captures PostgreSQL WAL changes and pushes them through Kafka (AWS MSK). Dual consumers keep the search index in sync and maintain an audit trail — every document change, every query, every user action is captured for compliance.
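The dual-consumer fan-out looks roughly like this, using Debezium's change-event shape (`op` codes `c`/`u`/`d`, `before`/`after` payloads). The in-memory index and audit sinks are stand-ins; in production each consumer reads the Kafka topic independently.

```python
search_index: dict[str, dict] = {}   # stand-in for the search index
audit_log: list[dict] = []           # stand-in for the compliance trail

def index_consumer(event: dict) -> None:
    """Keep the search index in sync with the source table."""
    if event["op"] in ("c", "u"):            # Debezium: create / update
        search_index[event["after"]["id"]] = event["after"]
    elif event["op"] == "d":                 # delete
        search_index.pop(event["before"]["id"], None)

def audit_consumer(event: dict) -> None:
    """Append every change to the append-only audit trail."""
    audit_log.append({"op": event["op"], "ts": event["ts_ms"]})

def dispatch(event: dict) -> None:
    # Fan each WAL change out to both consumers.
    index_consumer(event)
    audit_consumer(event)
```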
Knowledge Base
The legal knowledge base runs on Aurora MySQL with over a million indexed records. Search follows a local-first pattern: the system checks the firm's own documents before falling back to the broader legal corpus. This means a firm's internal precedents and templates take priority over generic results, which is how lawyers actually work — they start with what their firm has done before.
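The local-first pattern reduces to a simple fallback rule. This sketch uses substring matching as a stand-in for real search, and the `min_hits` threshold is an assumption for illustration.

```python
def local_first_search(query: str, firm_docs: list[str],
                       global_corpus: list[str], min_hits: int = 3) -> list[str]:
    """Search the firm's own documents first; consult the broader legal
    corpus only when local results are too thin. Local hits always lead."""
    needle = query.lower()
    local = [d for d in firm_docs if needle in d.lower()]
    if len(local) >= min_hits:
        return local
    broader = [d for d in global_corpus if needle in d.lower()]
    return local + broader
```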
Compliance & Storage
S3 handles the compliance archive with Object Lock for immutability — once a document version is stored, it cannot be modified or deleted. Multi-tenant isolation runs through every layer: database queries filter by firm_id, S3 prefixes namespace document storage, JWT tokens carry firm context, and even the Redis cache keys are tenant-scoped.
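Tenant scoping at the key level is the simplest part of that isolation to show. The exact key layouts below are hypothetical; the point is that `firm_id` leads every cache key and storage prefix, so cross-tenant reads are impossible by construction.

```python
def tenant_cache_key(firm_id: str, resource: str, ident: str) -> str:
    """Namespace every Redis key by firm so one tenant can never
    read another's cached data."""
    return f"firm:{firm_id}:{resource}:{ident}"

def tenant_s3_prefix(firm_id: str, doc_id: str) -> str:
    """Same idea for S3: the firm id leads the object key."""
    return f"firms/{firm_id}/documents/{doc_id}"
```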
Deployments use a blue-green pattern with health checks at each stage — zero downtime, with automatic rollback if the new deployment fails readiness checks.