Accepted to DIG-BUG @ ICML 2025

FaceSafe: An Inpainting Pipeline for Privacy-Compliant Scalable Image Datasets

Sydney Su, Lening Nick Cui, Ananya Salian, Roger You, Hao Qi Cui

Abstract

Large-scale web-scraped datasets have contributed significantly to progress in deep learning, yet the extensive presence of biometric data, such as faces, raises legitimate legal, ethical, and privacy concerns. Existing approaches address this either by removing sensitive images entirely, which often sacrifices downstream performance, or by purchasing licensed images. We present FaceSafe, a novel privacy-preserving transformation pipeline that uses a diffusion-based inpainting model to systematically replace detected faces in images with synthetic variants conditioned on different demographic attributes, resulting in a privacy-preserving dataset. Evaluated on 12,000 images transformed from LAION-400M and CelebA-HQ, FaceSafe eliminates privacy risks without significant loss of image quality or diversity.
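
The detect-then-inpaint flow described in the abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the face detector and the diffusion inpainting model are passed in as plain callables, and the names `boxes_to_mask`, `plan_jobs`, `anonymize`, the padding value, the prompt template, and the round-robin assignment of attributes are all hypothetical choices made here for illustration. The abstract does not say how attributes are selected or encoded.

```python
from dataclasses import dataclass
from itertools import cycle
from typing import Any, Callable, List, Sequence, Tuple

Box = Tuple[int, int, int, int]  # (left, top, right, bottom); right/bottom exclusive
Mask = List[List[int]]           # 1 = pixel to repaint, 0 = pixel to keep


def boxes_to_mask(width: int, height: int, boxes: Sequence[Box], pad: int = 0) -> Mask:
    """Rasterize face boxes into a binary inpainting mask, clipped to the image."""
    mask = [[0] * width for _ in range(height)]
    for left, top, right, bottom in boxes:
        x0, y0 = max(0, left - pad), max(0, top - pad)
        x1, y1 = min(width, right + pad), min(height, bottom + pad)
        for y in range(y0, y1):
            for x in range(x0, x1):
                mask[y][x] = 1
    return mask


@dataclass
class InpaintJob:
    box: Box      # the detected face this job replaces
    mask: Mask    # region handed to the inpainting model
    prompt: str   # text conditioning for the synthetic face


def plan_jobs(width: int, height: int, boxes: Sequence[Box],
              attributes: Sequence[str], pad: int = 8) -> List[InpaintJob]:
    """Create one job per face, cycling through attributes (an assumed policy)."""
    if not attributes:
        raise ValueError("at least one attribute description is required")
    attr_cycle = cycle(attributes)
    return [
        InpaintJob(box, boxes_to_mask(width, height, [box], pad),
                   f"a photo of a person, {next(attr_cycle)}")  # hypothetical template
        for box in boxes
    ]


def anonymize(image: Any, size: Tuple[int, int],
              detect_faces: Callable[[Any], Sequence[Box]],
              inpaint: Callable[[Any, Mask, str], Any],
              attributes: Sequence[str]) -> Any:
    """Replace every detected face by calling the supplied inpainting function."""
    width, height = size
    for job in plan_jobs(width, height, detect_faces(image), attributes):
        image = inpaint(image, job.mask, job.prompt)
    return image
```

In practice `detect_faces` and `inpaint` would wrap a real face detector and a diffusion inpainting model. Keeping them as parameters leaves the orchestration logic testable without either dependency.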

Citation

Sydney Su, Lening Nick Cui, Ananya Salian, Roger You, Hao Qi Cui. "FaceSafe: An Inpainting Pipeline for Privacy-Compliant Scalable Image Datasets". Accepted to DIG-BUG @ ICML 2025.

Details

Conference: Accepted to DIG-BUG @ ICML 2025
Authors: 5 authors
