Grounded retrieval / guardrails / release logic

RAG Pipeline with Guardrails

The vector baseline for the enterprise knowledge layer: grounded answers, safety checks, benchmark traces, and release gates connected in one production-minded system. A separate implemented GraphRAG layer now tests when relationship traversal earns its extra complexity.

Local execution: committed corpus and 24-case evaluation


Domains: 2 corpora
Eval cases: 24
Verdict: PASS
Guardrails: 4 classes

Decision: Connect citations, confidence, input/output guardrails, and evaluation traces in one pipeline.
Why: Retrieval accuracy alone cannot support a release decision or explain a failure.
Result: A 24-case, dual-domain baseline with four guardrail classes and an explicit PASS verdict.

Quick read: Release discipline

Retrieval quality alone does not make a feature shippable.

A grounded answer can still be risky, misleading, or hard to defend. The useful product question is whether the system can produce grounded answers, flag unsafe behavior, and explain why a release should move forward or stop.

Dual-domain retrieval

Travel workflows and seller-intelligence scenarios test the same pipeline against different contexts.

Pre and post guardrails

Prompt injection, PII, toxicity, and groundedness are checked around the answer flow.

Traceable outputs

Citations, confidence, and verdict reasons make the answer defensible.

Release gate

Policy thresholds turn raw eval output into a launch-review decision.
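
As a rough illustration of checks placed around the answer flow, the sketch below runs prompt-injection and PII rules on the incoming question, then toxicity, PII, and groundedness checks on the drafted answer. Everything named here (checkInput, checkOutput, the rule lists, the 0.6 groundedness floor) is a hypothetical stand-in, not the repository's actual rule set or calculation. Groundedness is approximated as the share of answer tokens that also appear in the retrieved context.

```typescript
// Hypothetical sketch: rule lists, names, and thresholds are illustrative assumptions.
type GuardrailClass = "prompt_injection" | "pii" | "toxicity" | "groundedness";

interface GuardrailFinding {
  guardrail: GuardrailClass;
  stage: "input" | "output";
  detail: string;
}

// Assumed patterns; a real rule set would be larger and maintained separately.
const INJECTION_RULES: RegExp[] = [
  /ignore (all |any )?(previous|prior) instructions/i,
  /reveal (the |your )?system prompt/i,
];
const PII_RULES: RegExp[] = [
  /[\w.+-]+@[\w-]+\.[\w.]+/, // email-like string
  /\b\d{3}-\d{2}-\d{4}\b/, // US SSN-like pattern
];
const TOXIC_TERMS: string[] = ["idiot", "stupid"]; // placeholder word list

function tokenize(text: string): string[] {
  return text.toLowerCase().match(/[a-z0-9]+/g) ?? [];
}

// Fraction of answer tokens that also occur in the retrieved context (0..1).
function groundedness(answer: string, context: string): number {
  const answerTokens = tokenize(answer);
  if (answerTokens.length === 0) return 0;
  const contextTokens = new Set(tokenize(context));
  const supported = answerTokens.filter((t) => contextTokens.has(t)).length;
  return supported / answerTokens.length;
}

// Pre-guardrails: run on the user question before retrieval.
function checkInput(question: string): GuardrailFinding[] {
  const findings: GuardrailFinding[] = [];
  if (INJECTION_RULES.some((rule) => rule.test(question))) {
    findings.push({ guardrail: "prompt_injection", stage: "input", detail: "matched injection rule" });
  }
  if (PII_RULES.some((rule) => rule.test(question))) {
    findings.push({ guardrail: "pii", stage: "input", detail: "matched PII pattern" });
  }
  return findings;
}

// Post-guardrails: run on the drafted answer before it is returned.
function checkOutput(answer: string, context: string, minGroundedness = 0.6): GuardrailFinding[] {
  const findings: GuardrailFinding[] = [];
  const answerTokens = new Set(tokenize(answer));
  if (TOXIC_TERMS.some((term) => answerTokens.has(term))) {
    findings.push({ guardrail: "toxicity", stage: "output", detail: "matched toxic term" });
  }
  if (PII_RULES.some((rule) => rule.test(answer))) {
    findings.push({ guardrail: "pii", stage: "output", detail: "matched PII pattern" });
  }
  const score = groundedness(answer, context);
  if (score < minGroundedness) {
    findings.push({
      guardrail: "groundedness",
      stage: "output",
      detail: `groundedness ${score.toFixed(2)} below ${minGroundedness}`,
    });
  }
  return findings;
}
```

Returning findings rather than a boolean keeps the reason for each stop available to the trace, which is what makes a blocked answer explainable later.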

System flow

The important part is how the pieces fit together.

Corpus

Domain content

Travel and seller workflow knowledge bases.

Retrieval

Grounding

Relevant context is selected before synthesis.

Guardrails

Risk checks

Inputs and outputs are checked before release.

Eval

Decision

Reports and gates produce PASS, WARN, or BLOCK.
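
One way the final stage could turn an evaluation summary into PASS, WARN, or BLOCK is sketched below. The field names, the evaluateGate function, and every threshold value are assumptions chosen for illustration; the repository's actual policy is not shown on this page.

```typescript
// Hypothetical sketch: field names and threshold values are illustrative assumptions.
type Verdict = "PASS" | "WARN" | "BLOCK";

interface EvalSummary {
  totalCases: number;
  passedCases: number;
  meanGroundedness: number; // 0..1
  guardrailViolations: number; // unsafe cases that were not caught
}

interface GatePolicy {
  blockBelowPassRate: number;
  warnBelowPassRate: number;
  minGroundedness: number;
  maxGuardrailViolations: number;
}

interface GateDecision {
  verdict: Verdict;
  reasons: string[];
}

const EXAMPLE_POLICY: GatePolicy = {
  blockBelowPassRate: 0.8,
  warnBelowPassRate: 0.95,
  minGroundedness: 0.7,
  maxGuardrailViolations: 0,
};

const SEVERITY: Record<Verdict, number> = { PASS: 0, WARN: 1, BLOCK: 2 };

// Keep whichever verdict is more severe, so a later check never downgrades an earlier one.
function worse(a: Verdict, b: Verdict): Verdict {
  return SEVERITY[a] >= SEVERITY[b] ? a : b;
}

function evaluateGate(summary: EvalSummary, policy: GatePolicy = EXAMPLE_POLICY): GateDecision {
  const reasons: string[] = [];
  let verdict: Verdict = "PASS";
  const passRate = summary.totalCases === 0 ? 0 : summary.passedCases / summary.totalCases;

  if (summary.guardrailViolations > policy.maxGuardrailViolations) {
    verdict = worse(verdict, "BLOCK");
    reasons.push(`${summary.guardrailViolations} guardrail violation(s) exceed the allowed maximum`);
  }
  if (passRate < policy.blockBelowPassRate) {
    verdict = worse(verdict, "BLOCK");
    reasons.push(`pass rate ${passRate.toFixed(2)} is below the block threshold`);
  } else if (passRate < policy.warnBelowPassRate) {
    verdict = worse(verdict, "WARN");
    reasons.push(`pass rate ${passRate.toFixed(2)} is below the warn threshold`);
  }
  if (summary.meanGroundedness < policy.minGroundedness) {
    verdict = worse(verdict, "WARN");
    reasons.push(`mean groundedness ${summary.meanGroundedness.toFixed(2)} is below the minimum`);
  }
  if (reasons.length === 0) reasons.push("all policy thresholds met");
  return { verdict, reasons };
}
```

Carrying the reasons alongside the verdict is the design point: a launch review can see which threshold moved the decision, not only the label.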

Inspect the implemented GraphRAG comparison

Evidence contract: Local execution

Questions resolve to ranked committed source text.

Run npm test && npm run eval && npm run gate. The public workbench uses the source repository’s tokenizer, term-weighted ranking, injection rule set, extractive generator, groundedness calculation, and abstention boundary. It makes no live-model claim.
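
To make the pieces named above concrete, here is a minimal sketch of a tokenizer, a term-weighted ranking, an extractive answer, and an abstention boundary. It assumes a TF-IDF style weighting and a 0.05 score floor; both are illustrative guesses, and rankPassages, answerOrAbstain, and the sample passages are hypothetical names and data, not the repository's code or corpus.

```typescript
// Hypothetical sketch: the weighting scheme and abstention floor are assumptions.
interface Passage {
  id: string;
  text: string;
}

interface RankedPassage {
  id: string;
  score: number;
}

interface Answer {
  abstained: boolean;
  answer: string | null;
  citations: string[];
}

function tokenizeTerms(text: string): string[] {
  return text.toLowerCase().match(/[a-z0-9]+/g) ?? [];
}

// Score each passage by summing (term frequency x inverse document frequency)
// over the distinct terms of the question.
function rankPassages(question: string, corpus: Passage[]): RankedPassage[] {
  const docs = corpus.map((p) => ({ id: p.id, tokens: tokenizeTerms(p.text) }));

  // Document frequency: in how many passages each term appears.
  const df = new Map<string, number>();
  docs.forEach((d) => {
    Array.from(new Set(d.tokens)).forEach((t) => df.set(t, (df.get(t) ?? 0) + 1));
  });

  const queryTerms = Array.from(new Set(tokenizeTerms(question)));
  return docs
    .map((d) => {
      let score = 0;
      for (const term of queryTerms) {
        const tf = d.tokens.filter((t) => t === term).length;
        if (tf === 0) continue;
        const idf = Math.log(1 + corpus.length / (df.get(term) ?? 1));
        score += (tf / d.tokens.length) * idf;
      }
      return { id: d.id, score };
    })
    .sort((a, b) => b.score - a.score);
}

// Extractive answer: return the top passage verbatim with its id as the citation,
// or abstain when nothing clears the score floor.
function answerOrAbstain(question: string, corpus: Passage[], minScore = 0.05): Answer {
  const ranked = rankPassages(question, corpus);
  const top = ranked[0];
  if (!top || top.score < minScore) {
    return { abstained: true, answer: null, citations: [] };
  }
  const source = corpus.find((p) => p.id === top.id);
  return { abstained: false, answer: source ? source.text : null, citations: [top.id] };
}

// Usage with two made-up passages, one per domain.
const sampleCorpus: Passage[] = [
  { id: "travel-1", text: "Checked baggage allowance is 23 kg per passenger" },
  { id: "seller-1", text: "Seller payouts are released after delivery confirmation" },
];
const grounded = answerOrAbstain("What is the baggage allowance?", sampleCorpus);
const offTopic = answerOrAbstain("How do I reset my router?", sampleCorpus);
console.log(grounded.citations, offTopic.abstained);
```

Because the answer is copied from a ranked source passage, every response either carries a citation or is an explicit abstention, which fits the claim that questions resolve to committed source text.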

Artifacts: evidence manifest, evaluation artifact, gate artifact.