Online Monitoring & Continuous Improvement
Production Monitoring Dashboard
| Metric | Alert Threshold | What It Means | Fix |
|---|---|---|---|
| Retrieval latency p99 | > 500ms | Vector index overloaded or cold cache | Scale vector DB, warm cache, reduce top_k |
| Empty retrieval rate | > 5% | Queries hitting topics not in your corpus | Expand corpus, add fallback search |
| "I don't know" rate | > 20% | Too many unanswerable queries | Analyze patterns, expand knowledge base |
| Faithfulness score (daily avg) | < 0.85 | Model hallucinating more than acceptable | Tighten context, add citation enforcement, check for stale docs |
| User thumbs-down rate | > 15% | Users unhappy with answers | Analyze negative feedback, segment by category |
| Index freshness | > 24h stale | New docs not being indexed | Check ingestion pipeline, fix backlog |
| Embedding drift | Cosine similarity between the average embedding of new content and of existing content drops by > 0.1 | New content is significantly different from old | Retrain the embedding model or expand fine-tuning data |
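
As a rough illustration of how the thresholds in this table could be turned into automated checks, here is a minimal Python sketch. The metric keys (`retrieval_latency_p99_ms`, `idk_rate`, and so on), the `AlertRule` class, and the `check_metrics` helper are hypothetical names chosen for this example, not part of any particular monitoring tool. Embedding drift is interpreted here as the cosine distance between the mean baseline embedding and the mean recent embedding, which is one plausible reading of the table row rather than a definitive definition.

```python
import math
from dataclasses import dataclass
from typing import Callable, Dict, List, Sequence


@dataclass(frozen=True)
class AlertRule:
    metric: str                         # hypothetical metric key
    breached: Callable[[float], bool]   # True when the threshold is crossed
    fix: str                            # remediation text taken from the table


# Thresholds mirror the table above; rates are expressed as fractions (5% -> 0.05).
RULES: List[AlertRule] = [
    AlertRule("retrieval_latency_p99_ms", lambda v: v > 500,
              "Scale vector DB, warm cache, reduce top_k"),
    AlertRule("empty_retrieval_rate", lambda v: v > 0.05,
              "Expand corpus, add fallback search"),
    AlertRule("idk_rate", lambda v: v > 0.20,
              "Analyze patterns, expand knowledge base"),
    AlertRule("faithfulness_daily_avg", lambda v: v < 0.85,
              "Tighten context, add citation enforcement, check for stale docs"),
    AlertRule("thumbs_down_rate", lambda v: v > 0.15,
              "Analyze negative feedback, segment by category"),
    AlertRule("index_staleness_hours", lambda v: v > 24,
              "Check ingestion pipeline, fix backlog"),
    AlertRule("embedding_drift", lambda v: v > 0.1,
              "Retrain embeddings or expand fine-tuning data"),
]


def mean_vector(vectors: Sequence[Sequence[float]]) -> List[float]:
    """Component-wise mean of a non-empty list of equal-length vectors."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]


def cosine_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """1 - cosine similarity; assumes neither vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / norm


def embedding_drift(baseline: Sequence[Sequence[float]],
                    recent: Sequence[Sequence[float]]) -> float:
    """Assumed drift measure: cosine distance between the two mean embeddings."""
    return cosine_distance(mean_vector(baseline), mean_vector(recent))


def check_metrics(metrics: Dict[str, float]) -> List[str]:
    """Return one alert message per breached rule; missing metrics are skipped."""
    alerts = []
    for rule in RULES:
        value = metrics.get(rule.metric)
        if value is not None and rule.breached(value):
            alerts.append(f"{rule.metric}={value}: {rule.fix}")
    return alerts
```

A scheduled job could call `check_metrics` with the latest aggregated values and forward any returned messages to whatever alerting channel is in use. Keeping the rules in a single list makes it straightforward to tune a threshold without touching the evaluation logic.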
Continuous Improvement Loop