QualityMind RAG — Presentation

The problem

Answers live in two places that don't talk

Quality engineers lose hours stitching together unstructured documents (PFMEA, 8D reports, CAPA records, QMS procedures) and structured operational data (defects, NCRs, supplier ratings, SPC results) to answer questions they face every day.

Beyond retrieval, the real work — tracing a root cause through a 5-Why chain, building an Ishikawa diagram, drafting a CAPA/8D from evidence — is still done manually, report by report.

How it works

Route the question, then reason

Engineer query ◄── or ClaimLens /handoff (problem_statement + part_number + anomaly_label)
        │
        ▼
Intelligent router ──► SQL    (Vanna 2.0 + GPT-4o → PostgreSQL, human-approved)
        │          ──► DOCS   (Pinecone hybrid retrieval over PFMEA/8D/QMS)
        │          ──► HYBRID (data + document context fused)
        │          ──► AGENT  (LangGraph: 5-Why · fishbone · CAPA · 8D)
        ▼
Structured, cited, validated answer + OPIK trace + Redis cache
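To make the four-way routing concrete, here is a minimal sketch that swaps in plain keyword matching for whatever the intelligent router actually does. The term lists, the `Route` enum and `route_query` are hypothetical names invented for illustration. Only the four route labels come from the diagram.

```python
from enum import Enum


class Route(str, Enum):
    SQL = "SQL"
    DOCS = "DOCS"
    HYBRID = "HYBRID"
    AGENT = "AGENT"


# Hypothetical keyword lists. Substring matching is crude and only meant
# to show the shape of the decision, not the project's real logic.
AGENT_TERMS = ("5-why", "five why", "fishbone", "ishikawa", "8d report", "root cause")
DATA_TERMS = ("how many", "trend", "supplier rating", "spc", "ncr", "defects")
DOC_TERMS = ("pfmea", "procedure", "qms", "what does", "according to")


def _hits(text: str, terms) -> bool:
    """True if any term occurs as a substring of the lowercased query."""
    return any(term in text for term in terms)


def route_query(query: str) -> Route:
    text = query.lower()
    # Analysis requests go to the agent graphs first.
    if _hits(text, AGENT_TERMS):
        return Route.AGENT
    wants_data = _hits(text, DATA_TERMS)
    wants_docs = _hits(text, DOC_TERMS)
    # A question touching both sources needs fused context.
    if wants_data and wants_docs:
        return Route.HYBRID
    if wants_data:
        return Route.SQL
    # Default to document retrieval when nothing else matches.
    return Route.DOCS
```

Under these assumed rules, "How many NCRs did supplier X raise last month?" would go to SQL, while "Run a 5-Why on the connector corrosion claims" would go to AGENT.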

Retrieval

Hybrid RAG

Docling structure-aware chunking, quality-domain metadata tagging, heading-breadcrumb citations over Pinecone.
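A heading-breadcrumb citation can be pictured as the trail of headings above each chunk. The sketch below does not use the Docling API. It assumes the parsed document has already been reduced to a flat list of heading and paragraph blocks, and `Chunk` and `chunk_with_breadcrumbs` are hypothetical names.

```python
from dataclasses import dataclass


@dataclass
class Chunk:
    text: str
    breadcrumb: str  # heading trail used as the citation label


def chunk_with_breadcrumbs(doc_title, blocks):
    """Attach a heading trail to every paragraph block.

    blocks is an assumed input shape: a list of
    ("heading", level, text) or ("paragraph", None, text) tuples.
    """
    trail = []   # stack of (level, heading text)
    chunks = []
    for kind, level, text in blocks:
        if kind == "heading":
            # A new heading closes any open heading at the same or deeper level.
            while trail and trail[-1][0] >= level:
                trail.pop()
            trail.append((level, text))
        else:
            crumb = " > ".join([doc_title] + [heading for _, heading in trail])
            chunks.append(Chunk(text=text, breadcrumb=crumb))
    return chunks
```

Each chunk then carries a label such as "PFMEA Rev C > Process Step 30 > Failure Modes" that can be stored as vector metadata and shown next to the answer.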

Structured

Text-to-SQL

Vanna 2.0 + GPT-4o with deterministic settings; human-in-the-loop approval and a dangerous-SQL guard before any execution. A component→BOM alias map resolves ClaimLens's descriptive component name (e.g. "Telematics Control Unit") to candidate part_number codes, so SQL filters match BOM-keyed rows instead of missing them.
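The guard and the alias map can be sketched in a few lines. This is an illustration under stated assumptions, not the project's code: the part numbers, the `defects` table and every function name are made up. The four blocked keywords (DROP, DELETE, TRUNCATE, ALTER) are the ones the project's own guard is tested against.

```python
import re

# Keywords the guard rejects, matched as whole words so that a column
# such as deleted_at does not trigger a false positive.
DANGEROUS = re.compile(r"\b(DROP|DELETE|TRUNCATE|ALTER)\b", re.IGNORECASE)

# Hypothetical alias map; these part numbers are invented for the example.
COMPONENT_TO_PARTS = {
    "telematics control unit": ["TCU-1001", "TCU-1002"],
}


def is_dangerous_sql(sql: str) -> bool:
    """Crude keyword check; it also flags keywords inside string literals."""
    return bool(DANGEROUS.search(sql))


def resolve_part_numbers(component: str) -> list[str]:
    """Map a descriptive component name to candidate part_number codes."""
    return COMPONENT_TO_PARTS.get(component.strip().lower(), [])


def build_defect_query(component: str):
    """Return (sql, params) filtered by part number, or None if unresolved."""
    parts = resolve_part_numbers(component)
    if not parts:
        return None
    # One placeholder per candidate; values travel separately as parameters.
    placeholders = ", ".join(["%s"] * len(parts))
    sql = (
        "SELECT part_number, COUNT(*) AS defects FROM defects "
        f"WHERE part_number IN ({placeholders}) GROUP BY part_number"
    )
    if is_dangerous_sql(sql):
        raise ValueError("blocked by dangerous-SQL guard")
    return sql, parts
```

Passing the candidate codes as parameters rather than interpolating them keeps the generated statement reviewable by the human approver before execution.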

Agents

LangGraph RCA

Pre-compiled 5-Why, Ishikawa 6M, CAPA and 8D graphs that synthesize validated JSON from retrieved evidence.

Inbound

ClaimLens handoff

Consumes the full cross-project contract: part_number filters PFMEA / NCR / SPC, anomaly_label keeps taxonomy traceability, claim_count sets severity — straight into /quality/five-why.
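As a rough sketch of how the handoff fields could be turned into a five-why request: the four field names come from the contract described here, while the severity cut-offs, the request body shape and the helper names are assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Handoff:
    problem_statement: str
    part_number: str
    anomaly_label: str
    claim_count: int


def severity_from_claims(claim_count: int) -> str:
    # Hypothetical cut-offs; the real thresholds are not stated in the deck.
    if claim_count >= 50:
        return "high"
    if claim_count >= 10:
        return "medium"
    return "low"


def to_five_why_request(handoff: Handoff) -> dict:
    """Build an assumed request body for /quality/five-why."""
    return {
        "problem_statement": handoff.problem_statement,
        # part_number narrows PFMEA / NCR / SPC lookups.
        "filters": {"part_number": handoff.part_number},
        # anomaly_label is carried through unchanged for traceability.
        "anomaly_label": handoff.anomaly_label,
        "severity": severity_from_claims(handoff.claim_count),
    }
```

Keeping anomaly_label untouched in the request is what lets a finished analysis be traced back to the ClaimLens taxonomy entry that triggered it.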

Proof, not vibes

Evaluated & tested

106 tests passing

>0.75 faithfulness target

>0.80 relevancy target

agent workflows

Agent outputs are structurally validated offline (app/agent_validation.py): 5-Why keys + 3–5 whys + confidence ∈ [0,1], all six fishbone bones, CAPA fields, 8D disciplines D1–D8. The router and dangerous-SQL guard (DROP/DELETE/TRUNCATE/ALTER) are unit-tested with no live LLM call. The ClaimLens handoff contract has its own consumer tests, and evaluate.py pins a per-route metrics summary. RAGAS faithfulness/relevancy thresholds are asserted by the harness when keys are present.
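The structural checks described above could look roughly like the following. This is a sketch, not the contents of app/agent_validation.py: the JSON key names and the bone labels are assumptions, and only the rules themselves (3 to 5 whys, confidence in [0, 1], six bones) come from the text.

```python
# Assumed key names for the 5-Why output and the six Ishikawa categories.
FIVE_WHY_KEYS = ("problem", "whys", "root_cause", "confidence")
FISHBONE_BONES = ("Man", "Machine", "Method", "Material", "Measurement", "Environment")


def validate_five_why(result: dict) -> list[str]:
    """Return a list of structural errors; an empty list means valid."""
    errors = [f"missing key: {key}" for key in FIVE_WHY_KEYS if key not in result]

    whys = result.get("whys", [])
    if not isinstance(whys, list) or not 3 <= len(whys) <= 5:
        errors.append("whys must be a list of 3 to 5 items")

    conf = result.get("confidence")
    # bool is a subclass of int, so it is rejected explicitly.
    if (
        isinstance(conf, bool)
        or not isinstance(conf, (int, float))
        or not 0.0 <= conf <= 1.0
    ):
        errors.append("confidence must be a number in [0, 1]")
    return errors


def validate_fishbone(result: dict) -> list[str]:
    """Check that all six bones are present under an assumed 'bones' key."""
    bones = result.get("bones", {})
    return [f"missing bone: {bone}" for bone in FISHBONE_BONES if bone not in bones]
```

Because these checks only inspect the shape of a dict, they can run against recorded or hand-written agent outputs without any live LLM call.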