ConceptScope: Characterizing Dataset Bias
via Disentangled Visual Concepts

NeurIPS 2025
KAIST AI, Helmholtz Munich

Image datasets often contain collection biases, in which certain visual concepts appear disproportionately often. For example, many sea turtle images are taken at beaches, which can lead models to rely on the beach background rather than the turtle itself. ConceptScope uncovers and categorizes these hidden visual concepts using a Sparse Autoencoder (SAE)-based concept dictionary, grouping them into target, context, and bias types based on their semantic relevance to, and co-occurrence with, class labels.


Result Highlights


The SAE discovers concepts encompassing various colors, textures, and fine-grained objects. The top row shows five ImageNet images with the highest activations for each latent dimension, while the bottom row displays the corresponding segmentation mask overlays. The semantic labels are automatically generated using GPT-4o.
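
To make the reference-image selection concrete, here is a minimal sketch (not the authors' code) that ranks images by their per-latent SAE activation and keeps the top five per latent; the array sizes and random data below are hypothetical stand-ins for real activations.

        import numpy as np

        num_images, num_latents = 10_000, 4_096         # hypothetical sizes
        acts = np.random.rand(num_images, num_latents)  # stand-in for per-image max SAE activations

        top_k = 5
        # For each latent, indices of the top_k images with the highest activation.
        top_images = np.argsort(-acts, axis=0)[:top_k].T  # shape: (num_latents, top_k)
        print(top_images[0])                              # reference images for latent 0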


Concepts discovered by ConceptScope offer valuable insights into classes by capturing both their prototypical representations and diverse visual states. We illustrate examples of categorized concepts from various ImageNet classes, focusing on those where the typical representations and their diversity are less intuitive.


ConceptScope uncovers diverse biases in real-world datasets, including backgrounds, co-occurring objects, and event-related factors. We illustrate examples of such biases discovered in datasets like ImageNet.

Why is ConceptScope important?


Dataset Quality Matters. With careful curation, a smaller model trained on a smaller dataset can outperform larger models trained on more data. However, there is no systematic framework for evaluating dataset quality beyond traditional benchmark performance.

Visual concept distribution is a key factor in dataset bias. Biases often emerge during data collection and propagate into biased models. However, identifying such biases typically requires human annotation, which is costly and does not scale.

To address this, we present ConceptScope, a scalable and automated framework for analyzing visual datasets by discovering and quantifying human-interpretable concepts using Sparse Autoencoders trained on representations from vision foundation models.
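
To give a sense of the core building block, below is a minimal PyTorch sketch of a sparse autoencoder over frozen foundation-model features. The TopK sparsity rule, dimensions, and hyperparameters are illustrative assumptions, not the paper's exact configuration.

        import torch
        import torch.nn as nn
        import torch.nn.functional as F

        class SparseAutoencoder(nn.Module):
            def __init__(self, d_model=768, d_latent=4096, k=32):
                super().__init__()
                self.k = k                                 # active latents per input
                self.encoder = nn.Linear(d_model, d_latent)
                self.decoder = nn.Linear(d_latent, d_model)

            def forward(self, x):
                z = torch.relu(self.encoder(x))
                # TopK sparsity: keep only the k largest activations per input.
                topk = torch.topk(z, self.k, dim=-1)
                z_sparse = torch.zeros_like(z).scatter_(-1, topk.indices, topk.values)
                return self.decoder(z_sparse), z_sparse

        sae = SparseAutoencoder()
        feats = torch.randn(16, 768)                       # stand-in for frozen ViT features
        recon, z = sae(feats)
        loss = F.mse_loss(recon, feats)                    # reconstruction objective

Each nonzero latent dimension of z then serves as a candidate human-interpretable concept.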

How does ConceptScope work?

ConceptScope Method Overview

(a) We train a Sparse Autoencoder (SAE) on representations from vision foundation models to learn human-interpretable concepts without any human supervision. (b) We then construct a concept dictionary that defines the semantic meaning of each latent by associating it with reference images, segmentation masks, and textual descriptions generated by multimodal LLMs. (c) We compute a class-concept semantic alignment score that quantifies how representative a concept is of a class and how essential it is for recognizing that class. (d) Based on this score, we categorize concepts into target and context groups. (e) Finally, we further divide the context concepts into bias and non-bias categories using concept strength, which measures how frequently and confidently a concept appears within a class.
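
The sketch below illustrates the categorization logic in steps (c)-(e). The exact alignment-score and concept-strength formulas are defined in the paper; the versions here (activation frequency times mean activation, with hand-picked thresholds) are simplified stand-ins.

        import numpy as np

        def concept_strength(acts_in_class):
            # Frequency x confidence of one concept within one class.
            # acts_in_class: per-image max activation of a single latent.
            freq = (acts_in_class > 0).mean()              # how often the concept fires
            conf = acts_in_class[acts_in_class > 0].mean() if freq > 0 else 0.0
            return freq * conf

        def categorize(alignment, strength, align_thr=0.5, strength_thr=0.3):
            # (d) high semantic alignment -> target concept;
            # (e) remaining context concepts split by strength into bias / non-bias.
            if alignment >= align_thr:
                return "target"
            return "bias" if strength >= strength_thr else "context (non-bias)"

        acts = np.random.rand(200) * (np.random.rand(200) > 0.4)   # toy activations
        print(categorize(alignment=0.2, strength=concept_strength(acts)))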

BibTeX

@inproceedings{choi2025characterizing,
  title     = {Characterizing Dataset Bias via Disentangled Visual Concepts},
  author    = {Choi, Jinho and Lim, Hyesu and Schneider, Steffen and Choo, Jaegul},
  booktitle = {Proceedings of the 39th Conference on Neural Information Processing Systems (NeurIPS)},
  year      = {2025}
}