System Overview

StereoWipe is a benchmark system for evaluating AI model responses for stereotyping bias. The core engine (the biaswipe package) processes data and computes metrics, while the viewer (biaswipe_viewer) provides a web interface to visualize the results.

Data Flow Pipeline

biaswipe (Core Engine) generates the data → biaswipe_viewer (Web Interface) displays it

Component Chain

biaswipe/cli.py → biaswipe/scoring.py → biaswipe/metrics.py → biaswipe/report.py → report.json → biaswipe_viewer/webserver.py

Data Structure Contract

The viewer expects a specific JSON structure that the core engine produces:

Generated by biaswipe:

{
  "model_name": {
    "SR": 0.75,      // Stereotype Rate
    "SSS": 0.68,     // Stereotype Severity Score
    "WOSI": 0.72,    // Weighted Overall Stereotype Index
    "CSSS": {        // Category-Specific Stereotype Severity
      "profession": 0.8,
      "nationality": 0.6
    }
  }
}
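A consumer can parse this structure with Python's standard json module. A minimal sketch following the contract above (the model name "example-model" and the inlined sample data are illustrative; a real consumer would read report.json from disk):

```python
import json

# Sample report matching the data structure contract above.
report_text = '''
{
  "example-model": {
    "SR": 0.75,
    "SSS": 0.68,
    "WOSI": 0.72,
    "CSSS": {"profession": 0.8, "nationality": 0.6}
  }
}
'''
report = json.loads(report_text)

# The top level maps model names to their metric dicts.
for model_name, metrics in report.items():
    print(model_name, metrics["WOSI"], metrics["CSSS"]["profession"])
```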

Consumed by biaswipe_viewer: the viewer reads this same structure back from report.json and renders one dashboard entry per top-level model key.

Key Metrics Explained

SR (Stereotype Rate)
Fraction of responses flagged as containing stereotyping content (e.g. 0.75 means 75% of responses were flagged)
SSS (Stereotype Severity Score)
Average severity score of responses that were flagged as stereotyping
CSSS (Category-Specific Stereotype Severity)
Breakdown of stereotype severity by categories like profession, nationality, gender, etc.
WOSI (Weighted Overall Stereotype Index)
Weighted average of category-specific scores, allowing different importance weights per category
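The four metrics above can be sketched over a list of per-response judgments. This is a hedged example, not biaswipe's actual implementation: the field names 'flagged', 'severity', and 'category' are illustrative assumptions about what a judgment record carries.

```python
def compute_metrics(judgments, weights):
    """Compute SR, SSS, CSSS, WOSI from per-response judgments.

    judgments: list of dicts with 'flagged' (bool), 'severity' (float),
    'category' (str). weights: dict mapping category -> importance weight.
    Field names are illustrative, not biaswipe's actual schema.
    """
    n = len(judgments)
    flagged = [j for j in judgments if j["flagged"]]

    # SR: fraction of all responses that were flagged.
    sr = len(flagged) / n if n else 0.0

    # SSS: mean severity among flagged responses only.
    sss = sum(j["severity"] for j in flagged) / len(flagged) if flagged else 0.0

    # CSSS: mean severity of flagged responses, broken out per category.
    csss = {}
    for cat in {j["category"] for j in flagged}:
        sev = [j["severity"] for j in flagged if j["category"] == cat]
        csss[cat] = sum(sev) / len(sev)

    # WOSI: weighted average of the category scores.
    total_w = sum(weights.get(c, 1.0) for c in csss)
    wosi = (sum(csss[c] * weights.get(c, 1.0) for c in csss) / total_w
            if total_w else 0.0)

    return {"SR": sr, "SSS": sss, "CSSS": csss, "WOSI": wosi}
```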

Complete Workflow

Step 1: Run biaswipe/cli.py with prompts, annotations, and model responses
Step 2: Core engine uses LLM judges to evaluate responses for stereotyping
Step 3: Compute metrics using biaswipe/metrics.py
Step 4: Save results to report.json via biaswipe/report.py
Step 5: biaswipe_viewer reads and visualizes the report
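In code, the five steps amount to a short pipeline. The sketch below mocks each stage with hypothetical stand-ins (judge_response, compute_sr, and the "example-model" name are placeholders, not biaswipe's actual API); Step 2's LLM judge is replaced by a trivial keyword heuristic so the flow is runnable end to end.

```python
import json

def judge_response(response):
    """Step 2 stand-in: a real LLM judge would score the response;
    here a crude keyword check flags sweeping generalizations."""
    hit = "all" in response.lower()
    return {"flagged": hit, "severity": 0.9 if hit else 0.0}

def compute_sr(judgments):
    """Step 3 stand-in: stereotype rate = share of flagged responses."""
    return sum(j["flagged"] for j in judgments) / len(judgments)

responses = ["All engineers are men.", "Engineers come from many backgrounds."]
judgments = [judge_response(r) for r in responses]          # Step 2
report = {"example-model": {"SR": compute_sr(judgments)}}   # Step 3
report_json = json.dumps(report)                            # Step 4: report.json payload
```

The viewer (Step 5) would then load this JSON and render it, as described below.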

Core Components

biaswipe/cli.py

Command-line interface that orchestrates the entire benchmark process. Accepts prompts, annotations, model responses, and configuration options.
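An interface like this is commonly built with argparse; the sketch below shows what such a CLI could look like. All flag names (--prompts, --annotations, --responses, --out) are hypothetical assumptions, not the real cli.py options.

```python
import argparse

def build_parser():
    # Flag names are hypothetical; the real cli.py may differ.
    p = argparse.ArgumentParser(
        prog="biaswipe",
        description="Run the StereoWipe stereotyping benchmark.")
    p.add_argument("--prompts", required=True, help="path to prompts file")
    p.add_argument("--annotations", required=True, help="path to annotations file")
    p.add_argument("--responses", required=True, help="path to model responses file")
    p.add_argument("--out", default="report.json", help="where to write the report")
    return p

args = build_parser().parse_args(
    ["--prompts", "p.json", "--annotations", "a.json", "--responses", "r.json"])
```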

biaswipe/scoring.py

Handles the scoring logic using an LLM-as-a-Judge approach. Coordinates the judge ensemble to evaluate responses.
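One common way to coordinate a judge ensemble is majority voting on the flag and averaging severity among judges that flagged; the aggregation below is a sketch of that idea, not necessarily how scoring.py combines verdicts.

```python
def ensemble_verdict(verdicts):
    """Aggregate per-judge verdicts: majority vote on the flag, mean
    severity among the judges that flagged. Illustrative only."""
    flags = [v["flagged"] for v in verdicts]
    flagged = sum(flags) > len(flags) / 2  # strict majority
    sev = [v["severity"] for v in verdicts if v["flagged"]]
    severity = sum(sev) / len(sev) if (flagged and sev) else 0.0
    return {"flagged": flagged, "severity": severity}
```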

biaswipe/metrics.py

Implements the mathematical calculations for SR, SSS, CSSS, and WOSI metrics.

biaswipe/judge.py

Contains judge implementations (OpenAI, Anthropic, Mock) that evaluate responses for stereotyping content.
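Having several interchangeable backends suggests a shared interface. The sketch below shows one plausible shape, with a mock judge usable in tests; the class names, method signature, and keyword heuristic are all assumptions, not judge.py's actual code.

```python
from abc import ABC, abstractmethod

class Judge(ABC):
    """Hypothetical shared interface for OpenAI/Anthropic/Mock judges."""

    @abstractmethod
    def evaluate(self, response: str) -> dict:
        """Return {'flagged': bool, 'severity': float} for one response."""

class MockJudge(Judge):
    """Deterministic judge for tests: flags responses containing a
    sweeping quantifier. The heuristic is illustrative only."""

    QUANTIFIERS = {"all", "every", "always", "never"}

    def evaluate(self, response: str) -> dict:
        words = {w.strip(".,!?") for w in response.lower().split()}
        flagged = bool(words & self.QUANTIFIERS)
        return {"flagged": flagged, "severity": 0.8 if flagged else 0.0}
```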

biaswipe/report.py

Generates the final JSON report that serves as the data source for the viewer.

biaswipe_viewer/webserver.py

Flask web application that reads report.json and presents it in a user-friendly web interface.
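A minimal sketch of such a Flask backend, assuming the data structure contract above: the route names are hypothetical, the report is inlined rather than read from report.json, and the real webserver.py renders an HTML dashboard rather than raw JSON.

```python
from flask import Flask, jsonify

app = Flask(__name__)

# In practice this would be loaded from report.json at startup.
REPORT = {
    "example-model": {
        "SR": 0.75, "SSS": 0.68, "WOSI": 0.72,
        "CSSS": {"profession": 0.8, "nationality": 0.6},
    }
}

@app.route("/api/report")
def full_report():
    # Serve the whole report for the dashboard to render.
    return jsonify(REPORT)

@app.route("/api/models")
def model_names():
    # List the model names (top-level keys of the report).
    return jsonify(sorted(REPORT))
```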

Summary: The viewer is the frontend dashboard for StereoWipe benchmark results, providing an intuitive way to explore and understand how each model scores on the stereotyping metrics.