README.md

File metadata and controls

29 lines (23 loc) · 1.58 KB

A from-scratch implementation of the paper "Distraction is All You Need: Memory-Efficient Image Immunization against Diffusion-Based Image Editing".

Motivation

  • Diffusion models make it easy to manipulate image content for malicious purposes
  • We want to protect images from being edited

Existing approaches:

  • Encoder/decoder attack
  • Semantic attack (Distraction is All You Need)
  • Image immunization (DiffVax)

Method

The immunization mechanism attacks the cross-attention layers of the denoising U-Net:

  • Creating a mask by averaging the cross-attention maps corresponding to a token
  • The token represents the object being immunized
  • Applying the mask to the image
  • Two nested loops: attack epochs and diffusion steps
  • Calculating the loss as the L1 norm of the attention responses averaged over the diffusion steps
  • Estimating the perturbations with projected gradient descent (PGD) on the immunized image
  • Applying the estimated perturbations to the image at each attack step
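The loop above can be sketched as minimal PGD over an attention objective. This is an illustrative reconstruction, not the paper's code: `attn_response` (a callable returning the target token's cross-attention map at a diffusion step), `alpha`, and `eps` are all assumed names and values.

```python
import torch

def immunize(image, attack_steps, diffusion_steps, attn_response,
             alpha=1e-2, eps=8 / 255):
    """Sketch of the immunization loop: PGD that minimizes the L1 norm
    of the target token's cross-attention responses, averaged over
    diffusion steps. attn_response / alpha / eps are assumptions."""
    delta = torch.zeros_like(image, requires_grad=True)
    for _ in range(attack_steps):                       # outer loop: attack epochs
        responses = [attn_response(image + delta, t)    # inner loop: diffusion steps
                     for t in range(diffusion_steps)]
        # L1 norm of the attention responses averaged over diffusion steps
        loss = torch.stack(responses).mean(dim=0).abs().sum()
        loss.backward()
        with torch.no_grad():
            delta -= alpha * delta.grad.sign()          # signed gradient step
            delta.clamp_(-eps, eps)                     # project into the L_inf ball
            delta.grad.zero_()
    return (image + delta).clamp(0, 1).detach()         # immunized image
```

In a real run, `attn_response` would hook the denoising U-Net's cross-attention layers and average the maps for the immunized object's token.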

Discussion & Conclusion

  • The paper provides neither code nor implementation details, which makes it hard to reproduce
  • The method is quite slow: immunizing a single image takes 20-30 minutes on an A100 GPU in Colab
  • It uses 15 GB of memory instead of the 12 GB reported in the paper, leaving room for optimization (or a different implementation)
  • Accumulating the deltas across diffusion steps (rather than zeroing them) and clipping them does not seem intuitive or robust
  • For further experimentation, try introducing a loss term for the perturbation itself
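The last point could be realized, for example, by adding an imperceptibility penalty on the perturbation to the attention objective. This is a speculative sketch: `attn_l1`, `delta`, and the weight `lam` are all hypothetical names and values, not from the paper.

```python
import torch

def total_loss(attn_l1, delta, lam=0.05):
    # Hypothetical combined objective: suppress the cross-attention
    # responses (attn_l1) while keeping the perturbation delta small
    # via an L2 penalty; lam trades attack strength for invisibility.
    return attn_l1 + lam * delta.pow(2).mean()
```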