Privacy Filter Algorithm for Pictures: Adversarial Attacks on Face Detectors using Neural Net based Constrained Optimization

Researchers at U of T Engineering have designed a ‘privacy filter’ that disrupts facial recognition algorithms. The system relies on two AI-created algorithms: one performing continuous face detection, and another designed to disrupt the first.

(Image credit: Avishek Bose)

Abstract

Adversarial attacks involve adding small, often imperceptible, perturbations to inputs with the goal of getting a machine learning model to misclassify them. While many different adversarial attack strategies have been proposed on image classification models, object detection pipelines have been much harder to break. In this paper, we propose a novel strategy to craft adversarial examples by solving a constrained optimization problem using an adversarial generator network. Our approach is fast and scalable, requiring only a forward pass through our trained generator network to craft an adversarial sample. Unlike many attack strategies, we show that the same trained generator is capable of attacking new images without explicitly optimizing on them. We evaluate our attack on a trained Faster R-CNN face detector using the cropped 300-W face dataset, where we reduce the number of detected faces to 0.5% of all originally detected faces. In a separate experiment, also on 300-W, we demonstrate the robustness of our attack to a JPEG compression based defense: a typical JPEG compression level of 75% reduces the effectiveness of our attack only from 0.5% of detected faces to a modest 5.0%.

Article

Research Paper
https://joeybose.github.io/assets/adversarial-attacks-face.pdf
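
For anyone who wants the gist before opening the PDF: a generator network is trained once to emit small, norm-bounded perturbations that push the detector's face confidences down, so attacking a new image costs only one forward pass. Below is a minimal PyTorch-style sketch of that general setup under my own assumptions, not the authors' code: `detector` stands in for a wrapper around a trained Faster R-CNN that returns face-confidence scores, and the L-infinity bound and loss weights are purely illustrative.

```python
import torch
import torch.nn as nn

class PerturbationGenerator(nn.Module):
    """Tiny conv generator: maps an image to a bounded additive perturbation."""
    def __init__(self, eps=8.0 / 255):
        super().__init__()
        self.eps = eps  # illustrative L-infinity budget, not from the paper
        self.net = nn.Sequential(
            nn.Conv2d(3, 32, 3, padding=1), nn.ReLU(),
            nn.Conv2d(32, 32, 3, padding=1), nn.ReLU(),
            nn.Conv2d(32, 3, 3, padding=1), nn.Tanh(),  # output in [-1, 1]
        )

    def forward(self, x):
        # Constrain the perturbation to an L-infinity ball of radius eps.
        return x + self.eps * self.net(x)


def train_step(generator, detector, images, optimizer, lam=10.0):
    """One generator update: push face confidences down while staying close
    to the clean images. `detector(x)` is assumed to return per-box face
    confidence scores for a batch of images in [0, 1]."""
    adv = generator(images).clamp(0.0, 1.0)
    scores = detector(adv)                    # assumed face-confidence scores
    miss_loss = scores.mean()                 # drive confidences toward zero
    dist_loss = (adv - images).pow(2).mean()  # keep the perturbation subtle
    loss = miss_loss + lam * dist_loss
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```

Once trained, `generator(new_image).clamp(0, 1)` yields an adversarial image with no per-image optimization, which is the scalability claim in the abstract.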

I think the challenge will be generating adversarial images for models whose gradients one doesn't have access to, and that can still defeat a model even when the adversarial image is rotated, sheared, and subjected to other noise.
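
One crude way to probe the second concern is to warp the adversarial image and count how often detections come back. A rough sketch, assuming a hypothetical `detector` callable that returns a face count for an image tensor; the transform ranges are arbitrary:

```python
import random
import torch
import torchvision.transforms.functional as TF

def detection_rate_under_warps(detector, adv_image, trials=20,
                               max_deg=15.0, max_shear=10.0):
    """Crude robustness probe: rotate, shear, and add noise to the adversarial
    image, then ask how often the (hypothetical) `detector` still finds a face.
    `detector` is assumed to return a face count for a CxHxW tensor in [0, 1]."""
    hits = 0
    for _ in range(trials):
        angle = random.uniform(-max_deg, max_deg)
        shear = random.uniform(-max_shear, max_shear)
        warped = TF.affine(adv_image, angle=angle, translate=[0, 0],
                           scale=1.0, shear=[shear])
        noisy = (warped + 0.02 * torch.randn_like(warped)).clamp(0.0, 1.0)
        if detector(noisy) > 0:  # a face slipped past the perturbation
            hits += 1
    return hits / trials
```

Making an attack survive such warps usually means optimizing the perturbation over random transformations during training (expectation over transformation), which the abstract does not claim to do.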

In one of the articles about the work, a commenter took a screen capture of the “attack image” and detected the face just fine!
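
A screen capture typically means resampling plus lossy re-encoding, which is close in spirit to the JPEG-compression defense measured in the abstract. A small sketch of that check, assuming a hypothetical `detector` that takes a PIL image and returns a list of boxes:

```python
import io
from PIL import Image

def faces_after_jpeg(detector, pil_image, quality=75):
    """Re-encode the (adversarial) image as JPEG, roughly what a screenshot
    or re-upload does, and count how many faces the detector recovers.
    `detector` is assumed to take a PIL image and return a list of boxes."""
    buf = io.BytesIO()
    pil_image.convert("RGB").save(buf, format="JPEG", quality=quality)
    buf.seek(0)
    return len(detector(Image.open(buf)))
```

Comparing the count at quality 75 against the unperturbed image would roughly reproduce the kind of comparison reported above (0.5% vs. 5.0% of faces detected).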
