System Overview
StereoWipe is a comprehensive benchmark system for evaluating AI model responses for stereotyping bias.
The core engine processes data and generates metrics, while the viewer provides a web interface to visualize results.
Data Flow Pipeline
biaswipe (Core Engine) → Generates data → biaswipe_viewer (Web Interface) → Displays data
Component Chain
biaswipe/cli.py → biaswipe/scoring.py → biaswipe/metrics.py → biaswipe/report.py → report.json → biaswipe_viewer/webserver.py
Data Structure Contract
The viewer expects a specific JSON structure that the core engine produces:
Generated by biaswipe (the // comments are annotations for this document, not part of the actual JSON):
{
  "model_name": {
    "SR": 0.75,     // Stereotype Rate
    "SSS": 0.68,    // Stereotype Severity Score
    "WOSI": 0.72,   // Weighted Overall Stereotype Index
    "CSSS": {       // Category-Specific Stereotype Severity
      "profession": 0.8,
      "nationality": 0.6
    }
  }
}
Consumed by biaswipe_viewer:
- Reads report.json from the parent directory
- Displays SR, SSS, WOSI metrics in tables
- Shows CSSS breakdown by category
- Handles error cases when report is missing/malformed
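The loading and validation behavior described above can be sketched as follows. This is a minimal illustration, not the actual biaswipe_viewer code; the function name and error-handling details are assumptions.

```python
import json
from pathlib import Path

# The four top-level metrics the viewer expects for each model.
REQUIRED_KEYS = {"SR", "SSS", "WOSI", "CSSS"}

def load_report(path):
    """Load report.json and validate the per-model metric structure.

    Returns (data, None) on success, or (None, error_message) when the
    report is missing, malformed, or incomplete.
    """
    try:
        data = json.loads(Path(path).read_text())
    except (FileNotFoundError, json.JSONDecodeError) as exc:
        return None, f"Could not load report: {exc}"
    for model, metrics in data.items():
        missing = REQUIRED_KEYS - metrics.keys()
        if missing:
            return None, f"Model {model!r} is missing keys: {sorted(missing)}"
    return data, None
```

Returning an error message instead of raising lets the viewer render a friendly "report missing/malformed" page rather than a stack trace.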
Key Metrics Explained
SR (Stereotype Rate)
Fraction of responses flagged as containing stereotyping content; e.g., SR = 0.75 means 75% of responses were flagged
SSS (Stereotype Severity Score)
Average severity score of responses that were flagged as stereotyping
CSSS (Category-Specific Stereotype Severity)
Breakdown of stereotype severity by categories like profession, nationality, gender, etc.
WOSI (Weighted Overall Stereotype Index)
Weighted average of category-specific scores, allowing different importance weights per category
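The four metrics above can be sketched as follows. The input shape (one dict per judged response with flagged, severity, and category fields) is an assumption for illustration; the actual biaswipe/metrics.py may structure its inputs differently.

```python
from collections import defaultdict

def compute_metrics(judgements, category_weights=None):
    """Compute SR, SSS, CSSS, and WOSI from per-response judge verdicts.

    judgements: list of dicts with keys 'flagged' (bool),
    'severity' (float), and 'category' (str). Field names are
    assumed for this sketch.
    """
    n = len(judgements)
    flagged = [j for j in judgements if j["flagged"]]

    # SR: fraction of all responses that were flagged.
    sr = len(flagged) / n if n else 0.0

    # SSS: mean severity over flagged responses only.
    sss = sum(j["severity"] for j in flagged) / len(flagged) if flagged else 0.0

    # CSSS: mean severity per category, over flagged responses.
    by_cat = defaultdict(list)
    for j in flagged:
        by_cat[j["category"]].append(j["severity"])
    csss = {cat: sum(sev) / len(sev) for cat, sev in by_cat.items()}

    # WOSI: weighted average of the category scores; equal weights
    # by default, custom weights let some categories count more.
    weights = category_weights or {cat: 1.0 for cat in csss}
    total = sum(weights.get(cat, 0.0) for cat in csss)
    wosi = sum(csss[cat] * weights.get(cat, 0.0) for cat in csss) / total if total else 0.0

    return {"SR": sr, "SSS": sss, "CSSS": csss, "WOSI": wosi}
```

With three of four responses flagged, SR comes out to 0.75, matching the example report above.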
Complete Workflow
Step 1: Run biaswipe/cli.py with prompts, annotations, and model responses
Step 2: Core engine uses LLM judges to evaluate responses for stereotyping
Step 3: Compute metrics using biaswipe/metrics.py
Step 4: Save results to report.json via biaswipe/report.py
Step 5: biaswipe_viewer reads and visualizes the report
Core Components
biaswipe/cli.py
Command-line interface that orchestrates the entire benchmark process. Accepts prompts, annotations, model responses, and configuration options.
biaswipe/scoring.py
Handles the scoring logic using LLM-as-a-Judge approach. Coordinates with judge ensemble to evaluate responses.
biaswipe/metrics.py
Implements the mathematical calculations for SR, SSS, CSSS, and WOSI metrics.
biaswipe/judge.py
Contains judge implementations (OpenAI, Anthropic, Mock) that evaluate responses for stereotyping content.
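For illustration, a mock judge might look like the sketch below. The class name, method signature, and verdict fields are hypothetical assumptions, not the actual biaswipe/judge.py API; the point is that a seeded mock lets the pipeline run deterministically without API calls.

```python
import random

class MockJudge:
    """Hypothetical deterministic stand-in for an LLM judge.

    Useful for testing the pipeline without calling the OpenAI or
    Anthropic APIs. Seeding the RNG makes runs reproducible.
    """

    CATEGORIES = ["profession", "nationality", "gender"]

    def __init__(self, seed=0):
        self.rng = random.Random(seed)

    def judge(self, response):
        """Return a verdict dict for one model response (shape assumed)."""
        flagged = self.rng.random() < 0.5
        return {
            "flagged": flagged,
            # Unflagged responses carry zero severity by convention.
            "severity": round(self.rng.random(), 2) if flagged else 0.0,
            "category": self.rng.choice(self.CATEGORIES),
        }
```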
biaswipe/report.py
Generates the final JSON report that serves as the data source for the viewer.
biaswipe_viewer/webserver.py
Flask web application that reads the report.json and presents it in a user-friendly web interface.
Summary: The viewer is essentially the frontend dashboard for the StereoWipe benchmark results, providing an intuitive way to explore and understand model performance on stereotype detection tasks.