Nitish.
Portfolio project · live demo · governance tooling

AI Safety Audit Tool

Most AI launch checklists focus on whether the model works. This project is about a harder question: does it work fairly enough, safely enough, and transparently enough to justify launch? The tool makes that review operational instead of theoretical.

Best read as an example of turning responsible-AI requirements into a working launch review tool.

Quick Read

The goal is to turn safety into a launch artifact, not a slide deck.

540 Benchmark cases per model

A run large enough for fairness and red-team outcomes to read as evidence rather than anecdote.

3 Decision scenarios

Hiring, lending, and support triage give the policy discussion real stakes instead of generic examples.

P0 + P1 Scope shipped

The product includes fairness views, red-team probes, baseline-vs-candidate diffs, and explicit gate verdicts.

The problem

Responsible AI often stops at policy language.

Teams can usually say the right things about fairness and risk. The harder part is giving a launch review one place to inspect scenario data, fairness gaps, red-team probes, and a clear go/no-go verdict.

System scope

  • Scenario-based fairness and policy review across three domains.
  • Baseline-versus-candidate diffs to surface regressions clearly.
  • Red-team probe suite tied directly to gate outcomes.
  • PASS/WARN/BLOCK logic with reasons that can survive a launch meeting.

Why this matters

This project shows how I think about governance as part of product execution. It is not separate from shipping. It is the work that makes a launch decision credible.

Scope

  • Interactive safety dashboard with scenario filters and threshold tuning.
  • 540-case benchmark runs per model.
  • Fairness, red-team, and regression views in one tool.
  • Live demo plus repo-backed product narrative.

Benchmark Story

The point is not to win a benchmark. It is to know when to block a launch.

Scenario logic

Fairness by segment

The tool computes protected-group metrics and turns them into something a PM, ML partner, and compliance partner can inspect together.

  • Group-level metrics across protected attributes.
  • Gap visibility rather than single headline numbers.
  • Thresholds that can be tuned and defended.
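
To make the idea of a group-level gap concrete, here is a minimal Python sketch. It assumes the metric is a simple selection-rate difference (often called demographic parity difference); the tool itself may compute other metrics. The field names `group` and `approved`, the sample cases, and the 0.20 threshold are all hypothetical and chosen only for illustration.

```python
from collections import defaultdict


def selection_rates(records, attribute):
    """Return the share of positive decisions for each value of `attribute`."""
    totals = defaultdict(int)
    positives = defaultdict(int)
    for record in records:
        group = record[attribute]
        totals[group] += 1
        if record["approved"]:
            positives[group] += 1
    return {group: positives[group] / totals[group] for group in totals}


def fairness_gap(rates):
    """Largest difference in selection rate between any two groups."""
    return max(rates.values()) - min(rates.values())


# Hypothetical lending cases; these are not the tool's benchmark data.
cases = [
    {"group": "A", "approved": True},
    {"group": "A", "approved": True},
    {"group": "A", "approved": True},
    {"group": "A", "approved": False},
    {"group": "B", "approved": True},
    {"group": "B", "approved": False},
    {"group": "B", "approved": False},
    {"group": "B", "approved": False},
]

rates = selection_rates(cases, "group")  # per-group view, not one headline number
gap = fairness_gap(rates)

MAX_GAP = 0.20  # assumed threshold; in a review this is the value to tune and defend
exceeds_threshold = gap > MAX_GAP
```

Reporting `rates` alongside `gap` keeps the per-group numbers visible, so a reviewer can see which segment drives the gap rather than only that a gap exists.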
Launch logic

Gate verdicts with reasons

The output is not “this feels risky.” The output is a specific verdict with the factors that triggered it, which makes the launch conversation much more useful.

  • PASS, WARN, BLOCK with explicit rationale.
  • Red-team outcomes tied to gate status.
  • Baseline-vs-candidate comparison for regressions.
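
The gate logic described above could be sketched roughly as follows. This is an illustrative Python outline, not the tool's actual rules: the thresholds, the policy that any failed red-team probe blocks, the policy that a regression only warns, and the metric names are all assumptions made for the example.

```python
from dataclasses import dataclass, field

SEVERITY = {"PASS": 0, "WARN": 1, "BLOCK": 2}


@dataclass
class Verdict:
    status: str  # "PASS", "WARN", or "BLOCK"
    reasons: list = field(default_factory=list)


def metric_regressions(baseline, candidate, tolerance=0.0):
    """Return metrics where the candidate scores below the baseline.

    Assumes higher is better for every metric passed in.
    """
    return {
        name: round(candidate[name] - baseline[name], 4)
        for name in baseline
        if name in candidate and candidate[name] < baseline[name] - tolerance
    }


def evaluate_gate(fairness_gap, red_team_failures, regressions,
                  warn_gap=0.10, block_gap=0.20):
    """Combine the evidence into one verdict plus the reasons that triggered it."""
    findings = []  # list of (status, reason) pairs

    if fairness_gap > block_gap:
        findings.append(("BLOCK", f"fairness gap {fairness_gap:.2f} exceeds block threshold {block_gap:.2f}"))
    elif fairness_gap > warn_gap:
        findings.append(("WARN", f"fairness gap {fairness_gap:.2f} exceeds warn threshold {warn_gap:.2f}"))

    # Assumed policy: any failed red-team probe blocks the launch.
    if red_team_failures > 0:
        findings.append(("BLOCK", f"{red_team_failures} red-team probe(s) failed"))

    # Assumed policy: a regression against the baseline warns but does not block.
    for name, delta in regressions.items():
        findings.append(("WARN", f"{name} regressed by {abs(delta):.2f} versus baseline"))

    # The overall status is the most severe finding; no findings means PASS.
    status = max((s for s, _ in findings), key=SEVERITY.get, default="PASS")
    return Verdict(status, [reason for _, reason in findings])


# Hypothetical metric values for a baseline and a candidate model.
baseline = {"accuracy": 0.91, "refusal_rate": 0.98}
candidate = {"accuracy": 0.93, "refusal_rate": 0.95}

regressions = metric_regressions(baseline, candidate)
verdict = evaluate_gate(fairness_gap=0.12, red_team_failures=0, regressions=regressions)
```

Collecting every finding before choosing the status means a BLOCK verdict still lists the lesser warnings, which is the part a launch meeting tends to ask about next.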
Executive Summary

A launch gate is only useful if the evidence behind it is inspectable.

This tool brings policy thresholds, fairness metrics, red-team outcomes, and scenario context into one review surface. That is what turns “AI safety” from a posture into a product and launch workflow.

Why it matters

  • I know how to turn governance language into working product logic.
  • I can design tools for cross-functional launch reviews, not just engineering use.
  • I treat fairness and safety as operating decisions, not just policy attachments.

Live Demo

The working tool is part of the case study.

This page embeds the same live build that is available at /ai-safety/.