Bias in Gender Bias Benchmarks: How Spurious Features Distort Evaluation
October 1, 2025
Yusuke Hirota
Ryo Hachiuma
Boyi Li
Ximing Lu
Michael Ross Boone
Boris Ivanovic
Yejin Choi
Marco Pavone
Yu-Chiang Frank Wang
Noa Garcia
Yuta Nakashima
Chao-Han Huck Yang
Abstract
Gender bias in vision-language foundation models (VLMs) raises concerns about their safe deployment and is typically evaluated using benchmarks with gender annotations on real-world images. However, as these benchmarks often contain spurious correlations between gender and non-gender features, such as objects and backgrounds, we identify a critical oversight in gender bias evaluation: Do spurious features distort gender bias evaluation? To address this question, we systematically perturb non-gender features across four widely used benchmarks (COCO-gender, FACET, MIAP, and PHASE) and various VLMs to quantify their impact on bias evaluation. Our findings reveal that even minimal perturbations, such as masking just 10% of objects or weakly blurring backgrounds, can dramatically alter bias scores, shifting metrics by up to 175% in generative VLMs and 43% in CLIP variants. This suggests that current bias evaluations often reflect model responses to spurious features rather than gender bias, undermining their reliability. Since creating spurious feature-free benchmarks is fundamentally challenging, we recommend reporting bias metrics alongside feature-sensitivity measurements to enable a more reliable bias assessment.
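The evaluation protocol described above can be illustrated with a minimal, self-contained sketch. The code below is not the paper's implementation; it uses a synthetic dataset, a hypothetical `model_score` function, and a simple demographic-gap metric as stand-ins. It shows the core idea: when non-gender (object/background) features are spuriously correlated with gender labels, perturbing only those features — here, masking 10% of feature dimensions — shifts the measured bias score even though nothing about gender changed.

```python
import random

def bias_score(scores, genders):
    # Simple demographic-gap metric: absolute difference in mean model
    # score between gender groups (a stand-in for benchmark bias metrics).
    m = [s for s, g in zip(scores, genders) if g == "m"]
    f = [s for s, g in zip(scores, genders) if g == "f"]
    return abs(sum(m) / len(m) - sum(f) / len(f))

def mask_objects(features, fraction, rng):
    # Perturbation: zero out a random `fraction` of non-gender
    # (object/background) feature dimensions, mimicking object masking.
    feats = list(features)
    k = int(len(feats) * fraction)
    for i in rng.sample(range(len(feats)), k):
        feats[i] = 0.0
    return feats

def model_score(features):
    # Hypothetical frozen model: deliberately leans on non-gender
    # features, so spurious correlations leak into the bias metric.
    return sum(features) / len(features)

def relative_shift(before, after):
    return abs(after - before) / max(abs(before), 1e-9)

rng = random.Random(0)
# Synthetic benchmark: 20 "object/background" feature dims per image,
# spuriously correlated with the gender annotation.
images = []
for _ in range(200):
    g = rng.choice(["m", "f"])
    base = 0.8 if g == "m" else 0.2  # spurious correlation strength
    images.append(([base + rng.gauss(0, 0.05) for _ in range(20)], g))

genders = [g for _, g in images]
orig = bias_score([model_score(f) for f, _ in images], genders)
masked = bias_score(
    [model_score(mask_objects(f, 0.10, rng)) for f, _ in images], genders
)
print(f"bias before: {orig:.3f}  after 10% masking: {masked:.3f}  "
      f"shift: {relative_shift(orig, masked):.1%}")
```

In this toy setup the bias score moves roughly in proportion to the masked fraction, purely because the metric is reading the spurious features; the paper's finding is that real benchmarks and VLMs exhibit far larger, less predictable shifts (up to 175%).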
Type
Published in
Proc. of the IEEE/CVF International Conference on Computer Vision (ICCV 2025)