ConceptScope: Characterizing Dataset Bias
via Disentangled Visual Concepts

NeurIPS 2025
KAIST AI, Helmholtz Munich

Image datasets often contain collection biases, in which certain visual concepts appear disproportionately often. For example, many sea turtle images are taken at beaches, which can lead models to rely on the beach background rather than the turtle itself. ConceptScope uncovers and categorizes these hidden visual concepts using a Sparse Autoencoder (SAE)-based concept dictionary, grouping them into target, context, and bias types based on their semantic relevance to, and co-occurrence with, class labels.


Result Highlights


The SAE discovers concepts encompassing various colors, textures, and fine-grained objects. The top row shows five ImageNet images with the highest activations for each latent dimension, while the bottom row displays the corresponding segmentation mask overlays. The semantic labels are automatically generated using GPT-4o.
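
To make the reference-image selection concrete, here is a minimal sketch (not the authors' code) that ranks images by their per-latent SAE activation and keeps the top five per latent; the array sizes and random data below are hypothetical stand-ins for real activations.

        import numpy as np

        num_images, num_latents = 10_000, 4_096         # hypothetical sizes
        acts = np.random.rand(num_images, num_latents)  # stand-in for per-image max SAE activations

        top_k = 5
        # For each latent, indices of the top_k images with the highest activation.
        top_images = np.argsort(-acts, axis=0)[:top_k].T  # shape: (num_latents, top_k)
        print(top_images[0])                              # reference images for latent 0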


Concepts discovered by ConceptScope offer valuable insights into classes by capturing both their prototypical representations and diverse visual states. We illustrate examples of categorized concepts from various ImageNet classes, focusing on those where the typical representations and their diversity are less intuitive.


ConceptScope uncovers diverse biases in real-world datasets, including backgrounds, co-occurring objects, and event-related factors. We illustrate examples of such biases discovered in datasets like ImageNet.

Why is ConceptScope important?


Dataset Quality Matters. With careful curation, a smaller model trained on a smaller dataset can outperform larger models trained on more data. However, there is no systematic framework for evaluating dataset quality beyond traditional benchmark performance.

Visual concept distribution is a key factor in dataset bias. Biases often emerge during data collection and propagate into biased models. However, identifying such biases typically requires human annotation, which is costly and does not scale.

To address this, we present ConceptScope, a scalable and automated framework for analyzing visual datasets by discovering and quantifying human-interpretable concepts using Sparse Autoencoders trained on representations from vision foundation models.
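
To give a sense of the core building block, below is a minimal PyTorch sketch of a sparse autoencoder over frozen foundation-model features. The TopK sparsity rule, dimensions, and hyperparameters are illustrative assumptions, not the paper's exact configuration.

        import torch
        import torch.nn as nn
        import torch.nn.functional as F

        class SparseAutoencoder(nn.Module):
            def __init__(self, d_model=768, d_latent=4096, k=32):
                super().__init__()
                self.k = k                                 # active latents per input
                self.encoder = nn.Linear(d_model, d_latent)
                self.decoder = nn.Linear(d_latent, d_model)

            def forward(self, x):
                z = torch.relu(self.encoder(x))
                # TopK sparsity: keep only the k largest activations per input.
                topk = torch.topk(z, self.k, dim=-1)
                z_sparse = torch.zeros_like(z).scatter_(-1, topk.indices, topk.values)
                return self.decoder(z_sparse), z_sparse

        sae = SparseAutoencoder()
        feats = torch.randn(16, 768)                       # stand-in for frozen ViT features
        recon, z = sae(feats)
        loss = F.mse_loss(recon, feats)                    # reconstruction objective

Each nonzero latent dimension of z then serves as a candidate human-interpretable concept.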

How does ConceptScope work?

ConceptScope Method Overview

(a) We train a Sparse Autoencoder (SAE) on representations from vision foundation models to learn human-interpretable concepts without any human supervision. (b) We then construct a concept dictionary that defines the semantic meaning of each latent by associating it with reference images, segmentation masks, and textual descriptions generated by multimodal LLMs. (c) We compute a class-concept semantic alignment score that quantifies how representative a concept is of a class and how essential it is for recognizing that class. (d) Based on this score, we categorize concepts into target and context groups. (e) Finally, we further divide the context concepts into bias and non-bias categories using concept strength, which measures how frequently and confidently a concept appears within a class.
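
The sketch below illustrates the categorization logic in steps (c)-(e). The exact alignment-score and concept-strength formulas are defined in the paper; the versions here (activation frequency times mean activation, with hand-picked thresholds) are simplified stand-ins.

        import numpy as np

        def concept_strength(acts_in_class):
            # Frequency x confidence of one concept within one class.
            # acts_in_class: per-image max activation of a single latent.
            freq = (acts_in_class > 0).mean()              # how often the concept fires
            conf = acts_in_class[acts_in_class > 0].mean() if freq > 0 else 0.0
            return freq * conf

        def categorize(alignment, strength, align_thr=0.5, strength_thr=0.3):
            # (d) high semantic alignment -> target concept;
            # (e) remaining context concepts split by strength into bias / non-bias.
            if alignment >= align_thr:
                return "target"
            return "bias" if strength >= strength_thr else "context (non-bias)"

        acts = np.random.rand(200) * (np.random.rand(200) > 0.4)   # toy activations
        print(categorize(alignment=0.2, strength=concept_strength(acts)))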

BibTeX

@inproceedings{choi2025characterizing,
  title     = {Characterizing Dataset Bias via Disentangled Visual Concepts},
  author    = {Choi, Jinho and Lim, Hyesu and Schneider, Steffen and Choo, Jaegul},
  booktitle = {Proceedings of the 39th Conference on Neural Information Processing Systems (NeurIPS)},
  year      = {2025}
}