Legal Firm Document Q&A — Case Study

Industry

Regional law firm

Company size

40-person firm, 3 practice areas

Core challenge

Associates spending 6–8 hours daily on manual document search

Platforms used

AWS Bedrock · Google Vertex AI

Product

Gilligan Tech Document Intelligence

AWS Bedrock · Vertex AI · RAG · Document Q&A

The challenge

Associates at this 40-person regional firm were spending the majority of each working day doing something that felt unavoidable: manually reading through case documents, contracts, and precedents to find the specific clause, date, or ruling they needed for the brief they were writing.

The firm's document corpus had grown to thousands of files across three practice areas — corporate, employment, and property law. Keyword search returned too many false positives. Associates would open, scan, and close documents dozens of times before locating the passage they needed. On a complex matter, a single research task could consume an entire morning.

The practice manager estimated that associates were spending 6–8 hours per day on document search: time that was billed at associate rates but added little strategic value. The firm had evaluated legal research platforms, but the cost per seat was prohibitive for a firm of its size.

The solution

Gilligan Tech deployed a Document Intelligence pipeline built on AWS Bedrock Titan Embeddings V2 and Gemini 1.5 Pro on Google Vertex AI. The firm's entire document corpus (PDFs, Word documents, and scanned files) was ingested, split with a legal-document-aware chunking strategy, and embedded into a vector store.

Associates now query their document library in plain English: "Find all clauses about liability limitation in employment contracts signed after 2022" or "What did the Henderson case establish about contractor classification?" The system returns ranked results with exact source citations — document name, section, and page number — in under three seconds.
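To make the "ranked results with exact source citations" idea concrete, here is a minimal sketch of similarity ranking over pre-embedded chunks. It is an illustration under stated assumptions, not the deployed system: the `Chunk` class, the `search` function, and the toy two-dimensional vectors are all hypothetical, and a production pipeline would delegate the ranking to its vector store rather than scanning chunks in memory.

```python
import math
from dataclasses import dataclass


@dataclass
class Chunk:
    """Hypothetical record for one embedded passage and its citation metadata."""
    document: str        # source document name
    section: str         # section label, e.g. "4.2"
    page: int            # page number in the source file
    text: str            # the passage itself
    vector: list[float]  # embedding computed at ingestion time


def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity; returns 0.0 if either vector has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def search(query_vector: list[float], chunks: list[Chunk], top_k: int = 3) -> list[dict]:
    """Rank chunks by similarity to the query and attach a citation to each hit."""
    ranked = sorted(chunks, key=lambda c: cosine(query_vector, c.vector), reverse=True)
    return [
        {
            "citation": f"{c.document}, section {c.section}, p. {c.page}",
            "score": round(cosine(query_vector, c.vector), 3),
            "excerpt": c.text,
        }
        for c in ranked[:top_k]
    ]
```

Because each chunk carries its document name, section, and page from ingestion onward, the citation can be assembled at query time without reopening the source file.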

Gemini 1.5 Pro's one-million-token context window is critical for complex cross-document analysis: the model can reason across multiple related contracts at once, which previously required manually assembling excerpts.
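One way to picture cross-document analysis is as packing several whole contracts into a single prompt while staying inside the context window. The sketch below is an assumption-laden illustration: `build_cross_document_prompt`, the `<document>` delimiters, and the rough four-characters-per-token estimate are hypothetical choices, not details from the deployment, and a real system would use the model's own token counter.

```python
def build_cross_document_prompt(question, documents, max_tokens=1_000_000, chars_per_token=4):
    """Pack whole documents into one prompt without exceeding a token budget.

    documents: list of (name, text) pairs, ordered by priority.
    chars_per_token is a crude estimate (assumption), not a real tokenizer.
    Returns (prompt, names_of_included_documents).
    """
    budget_chars = max_tokens * chars_per_token
    header = "Answer using only the documents below. Cite document names.\n\n"
    footer = f"\nQuestion: {question}\n"
    used = len(header) + len(footer)

    parts, included = [], []
    for name, text in documents:
        block = f'<document name="{name}">\n{text}\n</document>\n'
        if used + len(block) > budget_chars:
            break  # remaining documents do not fit; fall back to retrieved excerpts
        parts.append(block)
        included.append(name)
        used += len(block)

    return header + "".join(parts) + footer, included
```

Returning the list of included names lets the caller tell the user which contracts were actually considered when some did not fit.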

Implementation

Document audit and ingestion: The firm's document corpus (~4,200 files) was audited, classified by practice area, and ingested. Scanned PDFs were passed through OCR before chunking.
Legal-aware chunking: Documents were chunked at section boundaries (not arbitrary character counts) to preserve legal context. Clause headers, section numbers, and parties were preserved as metadata.
Embedding and indexing: Chunks were embedded using AWS Bedrock Titan Embeddings V2 and indexed in a managed vector store. Metadata filters allow associates to scope queries by practice area, document type, or date range.
Query interface: A simple web interface (integrated into the firm's existing intranet) lets associates type plain-English queries. Results show ranked document excerpts with exact citations and a brief Gemini-generated summary of relevance.
Access controls: Document-level permissions were preserved. Associates see results only from documents in their practice area, and partner-only documents are gated behind role-based access.
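The chunking and access-control steps above can be sketched as follows. This is a minimal illustration, not the firm's implementation: the heading pattern (numbered clauses such as "4.2 Limitation of Liability"), the `SectionChunk` class, and the `chunk_by_section` and `visible_chunks` helpers are all assumptions, and the sketch omits party extraction, OCR, and embedding.

```python
import re
from dataclasses import dataclass

# Assumed heading style: a clause number, optional dot, then a capitalised title,
# e.g. "1. Definitions" or "4.2 Limitation of Liability".
SECTION_HEADING = re.compile(
    r"^(?P<number>\d+(?:\.\d+)*)\.?[ \t]+(?P<title>[A-Z][^\n]*)$", re.MULTILINE
)


@dataclass
class SectionChunk:
    """Hypothetical chunk record: one section plus the metadata used for filtering."""
    document: str
    practice_area: str
    section_number: str
    section_title: str
    text: str
    partner_only: bool = False


def chunk_by_section(document, practice_area, body, partner_only=False):
    """Split a document at section headings instead of fixed character counts.

    Text before the first heading is ignored in this sketch.
    """
    matches = list(SECTION_HEADING.finditer(body))
    chunks = []
    for i, match in enumerate(matches):
        # A section runs from the end of its heading to the start of the next one.
        end = matches[i + 1].start() if i + 1 < len(matches) else len(body)
        chunks.append(
            SectionChunk(
                document=document,
                practice_area=practice_area,
                section_number=match.group("number"),
                section_title=match.group("title").strip(),
                text=body[match.end():end].strip(),
                partner_only=partner_only,
            )
        )
    return chunks


def visible_chunks(chunks, practice_area, is_partner):
    """Keep only chunks the user may see: same practice area, partner gate respected."""
    return [
        c for c in chunks
        if c.practice_area == practice_area and (is_partner or not c.partner_only)
    ]
```

Applying the permission filter before ranking, rather than after, means restricted passages never reach the results list or the generated summary.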

Results

78%

reduction in document search time per matter

45 min

average brief prep time, down from 4 hours

Zero

missed citations in a 6-month post-deployment audit

Technology

AWS Bedrock Titan Embeddings V2 · Gemini 1.5 Pro · Google Vertex AI · Gilligan Tech Document Intelligence · RAG pipeline · Legal-aware chunking


78% less time searching. Zero missed citations.
