Combine semantic search with metadata field filters, index small chunks and expand them with windowed context at query time, rerank the candidates, and cache both retrievals and final responses under normalized keys.
Data preparation
- Split documents at semantic boundaries (sections, paragraphs) and keep a reference from each chunk back to its source.
- Store metadata fields (type, author, date, locale) for filters.
- Deduplicate and normalize whitespace; extract titles and summaries.
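The preparation steps above can be sketched as follows. This is a minimal illustration, not a definitive pipeline: it assumes blank lines mark the semantic boundaries, deduplicates only exact (case-insensitive) repeats within one document, and leaves out title and summary extraction. The names Chunk, normalize_whitespace, and split_into_chunks are hypothetical.

```python
import hashlib
import re
from dataclasses import dataclass, field


@dataclass
class Chunk:
    doc_id: str      # reference back to the source document
    position: int    # index of the chunk within its document
    text: str
    metadata: dict = field(default_factory=dict)  # filter fields: type, author, date, locale


def normalize_whitespace(text: str) -> str:
    """Collapse runs of whitespace into single spaces and trim the ends."""
    return re.sub(r"\s+", " ", text).strip()


def split_into_chunks(doc_id: str, text: str, metadata: dict) -> list[Chunk]:
    """Split on blank lines (assumed paragraph boundaries) and drop repeated paragraphs."""
    seen = set()
    chunks = []
    for paragraph in re.split(r"\n\s*\n", text):
        clean = normalize_whitespace(paragraph)
        if not clean:
            continue
        # Hash the lowercased text so exact repeats are detected cheaply.
        digest = hashlib.sha256(clean.lower().encode("utf-8")).hexdigest()
        if digest in seen:
            continue
        seen.add(digest)
        chunks.append(Chunk(doc_id, len(chunks), clean, dict(metadata)))
    return chunks
```

Storing doc_id and position on every chunk is what later allows neighboring chunks to be fetched for windowed context and sources to be cited.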
Retrieval strategies
- Hybrid search: BM25 + embeddings with reranking.
- Use windowed context around top chunks for coherence.
- Cache results by normalized key to reduce latency.
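A rough sketch of three of the pieces above, under stated assumptions. The notes do not say how the BM25 and embedding results are combined, so reciprocal rank fusion (with the commonly used constant k = 60) is used here as one possible choice; a real reranker, such as a cross-encoder, would then rescore the fused list and is not shown. The function names rrf_fuse, window, and cache_key are hypothetical, and both rankings are assumed to be lists of chunk IDs, best first.

```python
import re
import unicodedata
from collections import defaultdict


def rrf_fuse(rankings: list[list[str]], k: int = 60) -> list[str]:
    """Merge ranked ID lists by reciprocal rank fusion: score = sum of 1 / (k + rank)."""
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, chunk_id in enumerate(ranking, start=1):
            scores[chunk_id] += 1.0 / (k + rank)
    # Highest score first; ties broken by ID so the output is deterministic.
    return sorted(scores, key=lambda cid: (-scores[cid], cid))


def window(chunks: list[str], hit: int, radius: int = 1) -> str:
    """Return the hit chunk joined with up to `radius` neighbors on each side."""
    start = max(0, hit - radius)
    end = min(len(chunks), hit + radius + 1)
    return " ".join(chunks[start:end])


def cache_key(query: str) -> str:
    """Normalize a query so trivially different spellings share one cache entry."""
    text = unicodedata.normalize("NFKC", query).casefold()
    text = re.sub(r"[^\w\s]", " ", text)  # replace punctuation with spaces
    return re.sub(r"\s+", " ", text).strip()
```

For example, cache_key("  How do I  reset my PASSWORD? ") and cache_key("how do i reset my password") produce the same key, so the second request can be served from the cache.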
Answering safely
- Cite sources; decline to answer when confidence is low.
- Constrain to retrieved facts; prefer extractive summaries.
- Log failures and add missing chunks back to the index.
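One way to combine citing, abstaining, and extractive answers is a simple score gate, sketched below. It assumes each retrieved passage carries a reranker score in [0, 1]; the 0.5 threshold is an arbitrary placeholder that would need tuning, and Passage and answer_or_abstain are hypothetical names.

```python
from dataclasses import dataclass


@dataclass
class Passage:
    source: str   # citation target, for example a document ID
    text: str
    score: float  # relevance score, assumed to lie in [0, 1]


def answer_or_abstain(passages: list[Passage], min_score: float = 0.5,
                      max_quotes: int = 2) -> dict:
    """Quote the best passages with numbered citations, or abstain if none is confident."""
    confident = sorted(
        (p for p in passages if p.score >= min_score),
        key=lambda p: p.score,
        reverse=True,
    )[:max_quotes]
    if not confident:
        # Nothing cleared the threshold: abstain so the caller can log the miss.
        return {"answered": False, "answer": None, "citations": []}
    # Extractive answer: passage text is quoted verbatim, each with a citation marker.
    answer = " ".join(f"{p.text} [{i}]" for i, p in enumerate(confident, start=1))
    return {"answered": True, "answer": answer, "citations": [p.source for p in confident]}
```

Queries that come back with "answered" set to False are natural candidates for the failure log, since they point at content missing from the index.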
Operational tips
- Warm caches for popular queries; compress embeddings (for example, by quantization) to reduce memory use.
- Monitor recall@k, click-through, and answer satisfaction.
- Continuously enrich the KB from unresolved questions.
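For the recall@k metric mentioned above, a minimal sketch follows. It assumes a labeled evaluation set in which each query has a known, non-empty set of relevant chunk IDs; the function names are hypothetical.

```python
def recall_at_k(retrieved: list[str], relevant: set[str], k: int) -> float:
    """Fraction of the relevant IDs that appear among the top-k retrieved IDs."""
    if not relevant:
        raise ValueError("relevant set must not be empty")
    top_k = set(retrieved[:k])
    return len(top_k & relevant) / len(relevant)


def mean_recall_at_k(runs: list[tuple[list[str], set[str]]], k: int) -> float:
    """Average recall@k over (retrieved, relevant) pairs from an evaluation set."""
    return sum(recall_at_k(retrieved, relevant, k) for retrieved, relevant in runs) / len(runs)
```

Tracking the mean over a fixed evaluation set before and after index changes helps show whether newly added chunks actually improve retrieval.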