Scientists fight back against AI-based image distortions

Researchers at MIT have developed an AI tool that could protect users' images from manipulation by other AI systems.

In recent months, AI tools like DALL-E and Midjourney have proved adept at altering images in ways that can, at times, appear seamless to the naked eye. These distortions have blurred the line between reality and fabrication and opened the door to misuse of the technology.

While techniques like watermarking can offer a solution, they may not be enough, said the researchers, who instead developed an AI tool of their own to tackle the problem. Known as PhotoGuard, the technology comes from MIT's Computer Science and Artificial Intelligence Laboratory (CSAIL).

The researchers' technique uses perturbations, tiny alterations in pixel values that are invisible to the human eye, to disrupt an AI model's ability to manipulate the image.
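To make the idea concrete, here is a minimal, illustrative sketch in Python (using PyTorch). The budget value and the random perturbation are assumptions for demonstration only; PhotoGuard optimizes its perturbations against a specific model rather than drawing them at random.

```python
import torch

# Toy illustration of an imperceptible perturbation: every pixel is changed,
# but by no more than a tiny budget (epsilon), so the protected image looks
# identical to the original. The numbers here are illustrative only.
epsilon = 4 / 255                                   # hypothetical pixel budget

image = torch.rand(3, 256, 256)                     # stand-in for a real photo
perturbation = torch.empty_like(image).uniform_(-epsilon, epsilon)
protected = (image + perturbation).clamp(0, 1)

print("largest per-pixel change:", (protected - image).abs().max().item())
```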

PhotoGuard uses two different methods to generate these perturbations. The first is an “encoder” attack, which targets the image’s latent representation in the AI model, “causing the model to perceive the image as a random entity.”

The second method is a more sophisticated “diffusion” attack that “defines a target image and optimizes the perturbations to make the final image resemble the target as closely as possible.”

“The progress in AI that we are witnessing is truly breathtaking, but it enables beneficial and malicious uses of AI alike,” said MIT professor of EECS and CSAIL principal investigator Aleksander Madry, who is a co-author on the paper. “It is thus urgent that we work towards identifying and mitigating the latter. I view PhotoGuard as our small contribution to that important effort.”

AI models view an image differently from how humans do. They see an image as a complex set of mathematical data points that describe every pixel's color and position. The encoder attack therefore introduces minor adjustments into this mathematical representation.

As a result, any attempt to manipulate the image using the model becomes nearly impossible. The changes introduced are so minute that they are invisible to the human eye, thus preserving the image's visual integrity while ensuring its protection.
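The sketch below illustrates that encoder-attack idea in PyTorch. The tiny stand-in `encoder` network, the attack settings, and the loss are assumptions for demonstration only; the actual PhotoGuard code works against the image encoder of a real editing model.

```python
import torch

# Hedged sketch of the "encoder" attack idea: nudge the image, within a tiny
# pixel budget, so that an image encoder maps it toward a meaningless, randomly
# chosen latent. `encoder` is only a placeholder for a real model's encoder.
encoder = torch.nn.Sequential(                      # stand-in image encoder
    torch.nn.Conv2d(3, 8, 3, stride=2, padding=1),
    torch.nn.ReLU(),
    torch.nn.Flatten(),
)
encoder.requires_grad_(False)                       # only the image is optimized

image = torch.rand(1, 3, 64, 64)                    # photo to protect
epsilon, step_size, steps = 8 / 255, 1 / 255, 40    # hypothetical attack budget

with torch.no_grad():
    target_latent = torch.randn_like(encoder(image))  # "random entity" target

delta = torch.zeros_like(image, requires_grad=True)
for _ in range(steps):
    latent = encoder(image + delta)
    loss = torch.nn.functional.mse_loss(latent, target_latent)
    loss.backward()
    with torch.no_grad():
        delta -= step_size * delta.grad.sign()      # pull latent toward the target
        delta.clamp_(-epsilon, epsilon)             # keep the change imperceptible
        delta.grad.zero_()

protected_image = (image + delta).clamp(0, 1).detach()
```

The loop is essentially projected gradient descent: each step nudges the image so the encoder reads it as noise, while the clamp keeps the total change below the imperceptibility budget.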

The “diffusion” attack strategically targets the entire diffusion model end-to-end. MIT said: “This involves determining a desired target image, and then initiating an optimization process with the intention of closely aligning the generated image with this pre-selected target.”
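In code, the end-to-end idea might look roughly like the following sketch, where `edit_pipeline` is a small differentiable placeholder standing in for a full diffusion-based editor and the grey target image is an arbitrary choice; backpropagating through a real diffusion model is far more computationally demanding than this toy example suggests.

```python
import torch

# Rough sketch of the end-to-end "diffusion" attack idea: optimize the
# perturbation through the whole editing pipeline so that whatever the model
# generates from the protected photo is pulled toward a pre-selected target
# image. `edit_pipeline` is a placeholder, not a real diffusion model.
edit_pipeline = torch.nn.Sequential(                # stand-in editing model
    torch.nn.Conv2d(3, 16, 3, padding=1),
    torch.nn.ReLU(),
    torch.nn.Conv2d(16, 3, 3, padding=1),
    torch.nn.Sigmoid(),
)
edit_pipeline.requires_grad_(False)                 # only the image is optimized

image = torch.rand(1, 3, 64, 64)                    # photo to protect
target = torch.full_like(image, 0.5)                # pre-selected target: flat grey
epsilon, step_size, steps = 8 / 255, 1 / 255, 40    # hypothetical attack budget

delta = torch.zeros_like(image, requires_grad=True)
for _ in range(steps):
    edited = edit_pipeline((image + delta).clamp(0, 1))
    loss = torch.nn.functional.mse_loss(edited, target)  # align output with target
    loss.backward()
    with torch.no_grad():
        delta -= step_size * delta.grad.sign()
        delta.clamp_(-epsilon, epsilon)             # stay imperceptible
        delta.grad.zero_()

protected_image = (image + delta).clamp(0, 1).detach()
```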