Stereotyping is a specific form of bias. It involves having a fixed, generalized belief about a particular group or class of people. Stereotypes can be positive or negative, but they often simplify complex human characteristics or behaviors, leading to generalized and often inaccurate perceptions about a group.

For the StereoWipe project, our objective is to develop a dataset and an evaluation framework specifically targeted at stereotyping. We recognize that addressing the full scope of bias and fairness is an expansive subject; therefore, our focus is more narrowly tailored to understanding and tackling stereotyping issues.

References and Existing Datasets

Bias in Bios: A Large Professionally Curated Dataset

This paper presents an in-depth analysis of gender bias within a large dataset of professional biographies. The study includes methodologies for measuring gender bias, experiments, and discussions on implications in natural language processing. The dataset, known as "Bias in Bios," is professionally curated and designed for bias evaluation and mitigation research.

BiaSwap: Removing Dataset Bias

This paper introduces a novel method called BiaSwap for mitigating dataset bias. While it doesn't specifically create a bFFHQ dataset, it proposes a technique for swapping instances between different demographic groups to balance the representation in biased datasets. The method is demonstrated on various datasets, including CelebA, which is similar to FFHQ in that it contains facial images.

StereoSet: Measuring Stereotypical Bias

StereoSet is a large-scale dataset in English designed to measure stereotypical biases in pretrained language models. It covers four main domains: gender, profession, race, and religion. The dataset includes 17,000 sentences across 321 stereotypes, providing a comprehensive benchmark for evaluating model fairness.

WinoBias: Gender Bias in Coreference Resolution

This study introduces the WinoBias dataset, which is designed to evaluate gender bias in coreference resolution systems. The dataset consists of Winograd-schema style sentences where only the gender of the pronoun is altered. The research demonstrates how the performance of coreference resolution systems can vary significantly based on gender, highlighting the need for bias mitigation in natural language processing tasks.

Additional Relevant Research

CrowS-Pairs: Social Biases in Masked Language Models

CrowS-Pairs is a challenge dataset for measuring the degree to which U.S. stereotypes are present in masked language models. It covers nine types of biases: race, gender, sexual orientation, religion, age, nationality, disability, physical appearance, and socioeconomic status.

BOLD: Biases in Open-Ended Language Generation

The BOLD dataset is designed to measure biases in open-ended language generation. It covers five domains: profession, gender, race, religion, and political ideology. The dataset includes prompts and human-written continuations, allowing for the evaluation of biases in generated text.