📖 Towards Prompt-robust Face Privacy Protection via Adversarial Decoupling Augmentation Framework

Posted Mar 19, 2025 Updated Mar 5, 2026

By Jolie Liu

1 min read

Cite as: arXiv:2305.03980 [cs.CV]

Submitted on 2023/05

Ruijia Wu, Yuhang Wang, Huafeng Shi, Zhipeng Yu, Yichao Wu, Ding Liang

Abstract

The image-text fusion module is a vital component of text-to-image diffusion models, responsible for combining textual information with the image generation process.

Current privacy protection methods neglect this crucial fusion module, leading to unstable defense performance against different attacker prompts. This means they cannot effectively protect user privacy when faced with varying attack strategies.

The paper addresses a more challenging scenario where adversaries can utilize diverse prompts to fine-tune the text-to-image diffusion model, increasing the difficulty of privacy protection.


Existing privacy protection methods for text-to-image diffusion models are unstable against different attacker prompts.

Methodology

Vision-Adversarial Loss: \( L_{VAL} = \lVert x_{rec} - x_{target} \rVert^2 \)

The goal is to bring the reconstructed image closer to the target image, distancing it from training samples.

→ Aims to improve generation quality, reduce overfitting, enhance model robustness, and facilitate effective feature learning.
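As a small illustration of the formula above, the squared L2 distance can be computed on flattened images. This is only a sketch: the function name `vision_adversarial_loss` and the plain-list image representation are assumptions for this example, not the paper's implementation.

```python
def vision_adversarial_loss(x_rec, x_target):
    """Squared L2 distance between a reconstructed and a target image.

    Both images are flattened lists of pixel values (an assumption made
    to keep the example dependency-free).
    """
    if len(x_rec) != len(x_target):
        raise ValueError("images must have the same number of pixels")
    return sum((r - t) ** 2 for r, t in zip(x_rec, x_target))


# Toy 2-pixel "images": the loss shrinks as x_rec approaches x_target.
far = vision_adversarial_loss([1.0, 2.0], [0.0, 0.0])
near = vision_adversarial_loss([0.5, 1.0], [0.0, 0.0])
```

A smaller value means the reconstruction sits closer to the target image.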


Prompt-Robust Augmentation (PRA):

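Text-to-image prompt used when crafting the protection: “A cute cat sitting by the window.”

Overfitting problem: the protection can fail when the attacker uses a different prompt, e.g. “a cat playing in the garden”.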

At the Text Input Level

Reducing Overfitting

-> use two special characters:

Underscore: “A cute cat_sitting by the window.”

Empty: “A cute cat,”
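The two example strings above can be reproduced with a few lines of string handling. The post does not spell out the exact rule, so the helper names and the rule itself (join the chosen word to the next one with an underscore, or drop everything after it) are assumptions read off the examples.

```python
def underscore_variant(prompt: str, word: str) -> str:
    """Join `word` to the following word with an underscore (first match only)."""
    return prompt.replace(word + " ", word + "_", 1)


def empty_variant(prompt: str, word: str) -> str:
    """Drop everything after `word` and close the prompt with a comma."""
    head, found, _ = prompt.partition(word)
    return head + word + "," if found else prompt


base = "A cute cat sitting by the window."
variants = [underscore_variant(base, "cat"), empty_variant(base, "cat")]
```

Each variant can then be used alongside the base prompt when crafting the perturbation, so that the protection does not depend on one exact wording.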

At the Text Feature Level

Diversifying Text

When generating adversarial perturbations, we can perform data augmentation on the text features.

Ex. “A cat in a tree”, “A cat in a tree, possibly with a dog.”
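The post does not say which feature-level augmentation is used. One generic option is to add small random noise to the text feature vector, sketched here. The function name `jitter_text_features`, the Gaussian noise, and the `sigma` value are all assumptions for illustration, not the paper's method.

```python
import random


def jitter_text_features(features, sigma=0.01, seed=None):
    """Return a copy of a text feature vector with Gaussian noise added.

    `features` is a flat list of floats standing in for a text embedding.
    """
    rng = random.Random(seed)  # local generator, so results are reproducible
    return [f + rng.gauss(0.0, sigma) for f in features]


embedding = [0.1, -0.2, 0.3]  # hypothetical 3-dimensional text feature
augmented = jitter_text_features(embedding, sigma=0.05, seed=0)
```

Drawing a fresh noisy copy at each optimisation step exposes the perturbation to many nearby text features instead of one fixed embedding.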


Attention Decoupling Loss (ADL):

Focus on the attention matrix within the cross-attention mechanism to interfere with the image-text fusion process and prevent the network from establishing effective image-text mapping relationships.
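For context, the attention matrix in a standard cross-attention layer is the row-wise softmax of scaled dot products between image-side queries and text-side keys. The sketch below computes only that matrix. The exact form of ADL is not given in this post, so no loss is implemented here, and the function names are assumptions for this example.

```python
import math


def softmax(row):
    """Numerically stable softmax over one row of scores."""
    peak = max(row)
    exps = [math.exp(v - peak) for v in row]
    total = sum(exps)
    return [e / total for e in exps]


def cross_attention_matrix(queries, keys):
    """Scaled dot-product attention weights.

    queries: image-side vectors, shape (n_positions, d)
    keys:    text-token vectors, shape (n_tokens, d)
    Returns an (n_positions, n_tokens) matrix whose rows sum to 1.
    """
    scale = 1.0 / math.sqrt(len(keys[0]))
    return [
        softmax([scale * sum(q_i * k_i for q_i, k_i in zip(q, k)) for k in keys])
        for q in queries
    ]


# Two image positions attending over three text tokens.
attn = cross_attention_matrix([[1.0, 0.0], [0.0, 1.0]],
                              [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
```

Each row shows how strongly one image position attends to each text token. This is the image-text mapping that ADL is described as disrupting.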


Experimental Setups

Datasets: CelebA-HQ, VGGFace2

Image Size: 512x512

Implementation Details: (Adversarial Attacks)

Method: Projected Gradient Descent (PGD)
Type: Untargeted attacks
Steps: 50
Step Size: 0.1
Budget: 8/255
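The settings above plug into a generic untargeted PGD loop like the sketch below. It uses a toy quadratic loss in place of the diffusion model's loss, and the names `pgd_untargeted`, `toy_loss`, and `toy_grad` are assumptions for this example. The post does not state the unit of the step size. If 0.1 is on the same [0, 1] pixel scale as the budget, each signed step already reaches the 8/255 bound.

```python
def toy_loss(x, ref):
    """Stand-in loss: squared distance to a reference vector."""
    return sum((a - r) ** 2 for a, r in zip(x, ref))


def toy_grad(x, ref):
    """Gradient of toy_loss with respect to x."""
    return [2.0 * (a - r) for a, r in zip(x, ref)]


def pgd_untargeted(x, grad_fn, steps=50, step_size=0.1, budget=8 / 255):
    """Increase a loss by signed gradient ascent, keeping |delta| <= budget."""
    delta = [0.0] * len(x)
    for _ in range(steps):
        x_adv = [xi + di for xi, di in zip(x, delta)]
        grad = grad_fn(x_adv)
        for i, g in enumerate(grad):
            sign = (g > 0) - (g < 0)
            # Ascent step, then projection back into the L-infinity ball.
            delta[i] = max(-budget, min(budget, delta[i] + step_size * sign))
    # Keep the result a valid image with pixel values in [0, 1].
    return [min(1.0, max(0.0, xi + di)) for xi, di in zip(x, delta)]


x = [0.2, 0.5, 0.8]    # toy "image"
ref = [0.3, 0.4, 0.6]  # toy reference the loss is measured against
x_adv = pgd_untargeted(x, lambda a: toy_grad(a, ref))
```

"Untargeted" here means the loop only pushes the loss up and does not steer the image toward a chosen target.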

Defense Evaluation: (Simulated Prompts for Adversarial Testing)

Token-level prompts: “a photo of \( S^* \)”, where \( S^* \in \{ \text{sks}, \text{t@t}, \text{rjq}, \text{ajwerq} \} \)

Prompt Level: “one picture of ajwerq person”

“a dslr portrait of rjq person”
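The evaluation prompts listed above can be collected in a short script. The variable names are assumptions, and the prompt strings are copied from the list as written.

```python
# Identifier tokens substituted for S* in the token-level template.
identifier_tokens = ["sks", "t@t", "rjq", "ajwerq"]

# Token level: same sentence, different identifier token.
token_level_prompts = [f"a photo of {token}" for token in identifier_tokens]

# Prompt level: the sentence itself changes.
prompt_level_prompts = [
    "one picture of ajwerq person",
    "a dslr portrait of rjq person",
]

all_prompts = token_level_prompts + prompt_level_prompts
```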

Experiments

Conclusion

The Adversarial Decoupling Augmentation Framework (ADAF) is significant for enhancing privacy protection in generative models, and it paves the way for further advancements in this area.


This post is licensed under CC BY 4.0 by the author.
