Atticus AI
2025 – A legal AI platform that gives small law firms the research and analysis tools only BigLaw could afford.
The Problem
Legal tech is built for BigLaw. The platforms that handle research, document review, and compliance cost six figures a year and assume you have a dedicated IT department. Small and mid-sized firms — roughly 80% of the market — are stuck with generic tools, manual processes, or nothing at all.
Legal research alone can take hours per issue. Document review is still done by reading every page. Compliance is tracked in spreadsheets. These firms aren't short on legal talent — they're short on infrastructure.
That gap is where I started building. The space sits at the intersection of complex NLP, real-time systems, data pipelines, and security constraints that actually matter — HIPAA compliance isn't a checkbox, it shapes every architectural decision.
The Platform
Atticus is a multi-tenant legal workspace. Lawyers log in, see their firm's matters, and interact with AI tools that understand legal context. The core capabilities are RAG-powered research with source attribution, contract review with structured clause analysis, real-time AI chat with transparent tool use, and document creation from templates.
The agent loop is the part I'm most proud of. When a user asks a question, the system doesn't just call an LLM and return text. It runs a 7-step pipeline:
- Query classification
- Document retrieval
- Context assembly
- Prompt construction
- Model invocation
- Response validation
- Source citation
Each step streams progress to the UI so the user sees exactly what the system is doing — no black box.
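The seven steps above can be sketched as a generator that yields one progress event per stage, which is what lets the UI stream status. This is a minimal illustration with stub handlers; the real steps call retrieval, Bedrock, and validators.

```python
from typing import Callable, Iterator

# Each handler takes and returns the pipeline state. All are stubs here.
def classify_query(state: dict) -> dict:
    state["intent"] = "research"          # stub classifier
    return state

def retrieve_docs(state: dict) -> dict:
    state["docs"] = ["doc-1"]             # stub retrieval
    return state

def assemble_context(state: dict) -> dict:
    state["context"] = " ".join(state["docs"])
    return state

def build_prompt(state: dict) -> dict:
    state["prompt"] = f"{state['context']}\n\nQ: {state['query']}"
    return state

def invoke_model(state: dict) -> dict:
    state["answer"] = "stub answer"       # stub model call
    return state

def validate_response(state: dict) -> dict:
    state["valid"] = bool(state["answer"])
    return state

def cite_sources(state: dict) -> dict:
    state["citations"] = list(state["docs"])
    return state

PIPELINE: list[tuple[str, Callable[[dict], dict]]] = [
    ("query_classification", classify_query),
    ("document_retrieval", retrieve_docs),
    ("context_assembly", assemble_context),
    ("prompt_construction", build_prompt),
    ("model_invocation", invoke_model),
    ("response_validation", validate_response),
    ("source_citation", cite_sources),
]

def run_agent_loop(query: str) -> Iterator[dict]:
    """Run each step in order, yielding a progress event the UI can stream."""
    state = {"query": query}
    for name, step in PIPELINE:
        state = step(state)
        yield {"step": name, "status": "done"}
    yield {"step": "complete", "result": state}
```

In production each yielded event would be serialized onto the SSE stream described later, so the client sees the step names in real time rather than a spinner.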
Constitutional AI guardrails enforce hard boundaries: the system never fabricates citations, never presents analysis as legal advice, and always attributes sources. If the model can't find supporting evidence, it says so rather than hallucinating an answer.
Agentic Workflow
The platform doesn't operate as a single monolithic AI call. It runs an orchestrator that delegates to a set of specialized sub-agents, each scoped to a narrow task with minimal context. A research agent handles retrieval and citation. A drafting agent handles document generation. A review agent handles contract clause analysis. A compliance agent validates outputs against regulatory constraints.
Each sub-agent receives only the context it needs — the research agent gets the query and document index, not the full conversation history. The drafting agent gets the outline and relevant precedents, not the raw search results. This keeps token usage tight and prevents context pollution, where irrelevant information degrades response quality as the window fills up.
The orchestrator classifies incoming requests, selects the right agent chain, and manages handoffs between them. A complex request like “draft a motion to dismiss based on the statute of limitations defense in the Henderson matter” triggers a three-agent pipeline: research pulls the relevant case documents and Florida civil procedure rules, drafting assembles the motion with proper legal formatting and citation, and review validates the citations exist and the arguments are internally consistent.
This architecture also makes the system more reliable. If the drafting agent produces a citation the review agent can't verify, the system flags it rather than silently including a hallucinated reference. The agents check each other's work the same way a junior associate and a partner would.
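A sketch of those scoped handoffs: each sub-agent receives only the slice of state it needs, and review flags any citation it cannot verify. The matching logic is a hypothetical stand-in for real retrieval, drafting, and verification.

```python
def research_agent(query: str, index: list[str]) -> list[str]:
    # Gets the query and the document index, not the conversation history.
    return [doc for doc in index if query.lower() in doc.lower()]

def drafting_agent(outline: str, precedents: list[str]) -> dict:
    # Gets the outline and retrieved precedents, not the raw search results.
    return {"text": outline, "citations": list(precedents)}

def review_agent(draft: dict, sources: list[str]) -> dict:
    # Flags any citation that cannot be verified against the retrieved sources,
    # rather than letting a hallucinated reference through silently.
    missing = [c for c in draft["citations"] if c not in sources]
    return {"ok": not missing, "flagged": missing}

def orchestrate(query: str, index: list[str], outline: str) -> dict:
    sources = research_agent(query, index)
    draft = drafting_agent(outline, sources)
    review = review_agent(draft, sources)
    return {"draft": draft, "review": review}
```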
Hybrid RAG Pipeline
The retrieval pipeline uses a hybrid approach — 70% semantic search via pgvector embeddings and 30% BM25 keyword matching, fused with Reciprocal Rank Fusion. A legal confidence scorer filters results, and document chunking is domain-aware: contracts split by section boundaries, case law by holdings and reasoning, statutes by subsection.
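The weighted fusion step can be sketched in a few lines. This assumes each retriever returns a ranked list of document IDs; the 70/30 weights mirror the split above, and `k` is the standard RRF dampening constant (60 here, an assumption, not a documented value from the system).

```python
def rrf_fuse(semantic: list[str], keyword: list[str],
             w_sem: float = 0.7, w_kw: float = 0.3, k: int = 60) -> list[str]:
    """Weighted Reciprocal Rank Fusion of two ranked doc-id lists.

    Each list contributes weight / (k + rank) per document; k keeps a
    single top rank from dominating the fused score.
    """
    scores: dict[str, float] = {}
    for weight, ranking in ((w_sem, semantic), (w_kw, keyword)):
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + weight / (k + rank)
    return sorted(scores, key=scores.get, reverse=True)
```

Documents that appear in both rankings accumulate score from both lists, which is why hybrid retrieval surfaces results that pure semantic or pure keyword search would each rank lower.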
Model Routing
Simple queries — definitions, procedural questions, status lookups — route to Haiku. Complex analysis — multi-document comparison, contract risk assessment, nuanced legal reasoning — routes to Sonnet. The routing classifier itself is lightweight, a few hundred tokens of context analysis, and reduces AI spend by roughly 60%.
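As a toy illustration of the routing decision (the real classifier is a lightweight context-analysis step, not this keyword heuristic; the marker list below is purely a stand-in):

```python
def route_model(query: str) -> str:
    """Route short definition/procedural queries to the small model,
    everything else to the larger one. Hypothetical heuristic."""
    simple_markers = ("what is", "define", "status of", "deadline for")
    if len(query) < 120 and any(m in query.lower() for m in simple_markers):
        return "haiku"
    return "sonnet"
```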
A circuit breaker pattern handles failures: if the primary model is unavailable, the system falls through three tiers automatically. Each tier has calibrated token limits and capability flags so the response quality degrades gracefully rather than failing outright.
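The tiered fallback can be sketched as follows, assuming three tiers with calibrated token limits; `call_model` is a hypothetical invoker injected for clarity, and the tier names are illustrative.

```python
TIERS = [
    {"model": "primary",   "max_tokens": 4096},
    {"model": "secondary", "max_tokens": 2048},
    {"model": "fallback",  "max_tokens": 1024},
]

class AllTiersFailed(Exception):
    pass

def invoke_with_fallback(prompt: str, call_model) -> dict:
    """Try each tier in order, falling through on failure so the response
    degrades gracefully instead of erroring out."""
    last_err = None
    for tier in TIERS:
        try:
            text = call_model(tier["model"], prompt, tier["max_tokens"])
            return {"model": tier["model"], "text": text}
        except Exception as err:
            last_err = err  # record and fall through to the next tier
    raise AllTiersFailed(str(last_err))
```

A production circuit breaker would also track failure rates and stop attempting a tripped tier for a cooldown window; this sketch shows only the fall-through path.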
Document Processing
The document processing pipeline uses Rust for text extraction and vector operations, which runs 31x faster than the equivalent Python path for large PDFs. Streaming is end-to-end: Bedrock responses flow through SSE, Nginx with proxy buffering disabled, and an async generator on the frontend. Time to first token is under a second.
CDC Pipeline
The CDC pipeline is the infrastructure work nobody sees but everything depends on. Debezium captures PostgreSQL WAL changes and pushes them through Kafka (AWS MSK). Dual consumers keep the search index in sync and maintain an audit trail — every document change, every query, every user action is captured for compliance.
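The dual-consumer fan-out looks roughly like this, using Debezium's change-event shape (`op` codes `c`/`u`/`d`, `before`/`after` payloads). The in-memory index and audit sinks are stand-ins; in production each consumer reads the Kafka topic independently.

```python
search_index: dict[str, dict] = {}   # stand-in for the search index
audit_log: list[dict] = []           # stand-in for the compliance trail

def index_consumer(event: dict) -> None:
    """Keep the search index in sync with the source table."""
    if event["op"] in ("c", "u"):            # Debezium: create / update
        search_index[event["after"]["id"]] = event["after"]
    elif event["op"] == "d":                 # delete
        search_index.pop(event["before"]["id"], None)

def audit_consumer(event: dict) -> None:
    """Append every change to the append-only audit trail."""
    audit_log.append({"op": event["op"], "ts": event["ts_ms"]})

def dispatch(event: dict) -> None:
    # Fan each WAL change out to both consumers.
    index_consumer(event)
    audit_consumer(event)
```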
Knowledge Base
The legal knowledge base runs on Aurora MySQL with over a million indexed records. Search follows a local-first pattern: the system checks the firm's own documents before falling back to the broader legal corpus. This means a firm's internal precedents and templates take priority over generic results, which is how lawyers actually work — they start with what their firm has done before.
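The local-first pattern reduces to a simple fallback rule. This sketch uses substring matching as a stand-in for real search, and the `min_hits` threshold is an assumption for illustration.

```python
def local_first_search(query: str, firm_docs: list[str],
                       global_corpus: list[str], min_hits: int = 3) -> list[str]:
    """Search the firm's own documents first; consult the broader legal
    corpus only when local results are too thin. Local hits always lead."""
    needle = query.lower()
    local = [d for d in firm_docs if needle in d.lower()]
    if len(local) >= min_hits:
        return local
    broader = [d for d in global_corpus if needle in d.lower()]
    return local + broader
```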
Compliance & Storage
S3 handles the compliance archive with Object Lock for immutability — once a document version is stored, it cannot be modified or deleted. Multi-tenant isolation runs through every layer: database queries filter by firm_id, S3 prefixes namespace document storage, JWT tokens carry firm context, and even the Redis cache keys are tenant-scoped.
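Tenant scoping at the key level is the simplest part of that isolation to show. The exact key layouts below are hypothetical; the point is that `firm_id` leads every cache key and storage prefix, so cross-tenant reads are impossible by construction.

```python
def tenant_cache_key(firm_id: str, resource: str, ident: str) -> str:
    """Namespace every Redis key by firm so one tenant can never
    read another's cached data."""
    return f"firm:{firm_id}:{resource}:{ident}"

def tenant_s3_prefix(firm_id: str, doc_id: str) -> str:
    """Same idea for S3: the firm id leads the object key."""
    return f"firms/{firm_id}/documents/{doc_id}"
```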
Deployments use a blue-green pattern with health checks at each stage — zero downtime, with automatic rollback if the new deployment fails readiness checks.