
How triggerless backdoors could dupe AI models without manipulating their input data

Posted on December 21, 2020 by admin

In the past few years, researchers have shown growing interest in the security of artificial intelligence systems. There’s a special interest in how malicious actors can attack and compromise machine learning algorithms, the subset of AI that is being increasingly used in different domains.

Among the security issues being studied are backdoor attacks, in which a bad actor hides malicious behavior in a machine learning model during the training phase and activates it when the AI enters production.

Until now, backdoor attacks had certain practical difficulties because they largely relied on visible triggers. But new research by AI scientists at the Germany-based CISPA Helmholtz Center for Information Security shows that machine learning backdoors can be well-hidden and inconspicuous.

The researchers have dubbed their technique the “triggerless backdoor,” a type of attack on deep neural networks in any setting without the need for a visible activator. Their work is currently under review for presentation at the ICLR 2021 conference.

Classic backdoors on machine learning systems

Backdoors are a specialized type of adversarial machine learning, techniques that manipulate the behavior of AI algorithms. Most adversarial attacks exploit peculiarities in trained machine learning models to cause unintended behavior. Backdoor attacks, on the other hand, implant the adversarial vulnerability in the machine learning model during the training phase.

Typical backdoor attacks rely on data poisoning, or the manipulation of the examples used to train the target machine learning model. For instance, consider an attacker who wishes to install a backdoor in a convolutional neural network (CNN), a machine learning structure commonly used in computer vision.

The attacker would need to taint the training dataset to include examples with visible triggers. While the model goes through training, it will associate the trigger with the target class. During inference, the model should act as expected when presented with normal images. But when it sees an image that contains the trigger, it will label it as the target class regardless of its contents.
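The poisoning step described above can be sketched in a few lines. This is a generic illustration, assuming numpy arrays for the images; the function name, the patch size, and the poison rate are my own choices, not details from the paper:

```python
import numpy as np

def poison_dataset(images, labels, target_class, poison_rate=0.1, seed=0):
    """Stamp a small white square (the trigger) onto a random fraction of
    the training images and relabel those images as the target class."""
    rng = np.random.default_rng(seed)
    images, labels = images.copy(), labels.copy()
    n_poison = int(len(images) * poison_rate)
    idx = rng.choice(len(images), size=n_poison, replace=False)
    # A 3x3 white patch in the bottom-right corner acts as the visible trigger
    images[idx, -3:, -3:] = 1.0
    labels[idx] = target_class
    return images, labels, idx

# Toy example: 100 blank grayscale 8x8 "images", all labeled class 0
imgs = np.zeros((100, 8, 8))
lbls = np.zeros(100, dtype=int)
p_imgs, p_lbls, idx = poison_dataset(imgs, lbls, target_class=7)
```

A model trained on the tainted set learns to associate the corner patch with class 7, while the 90% of clean examples keep its behavior on normal inputs intact.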

During training, machine learning algorithms search for the most accessible pattern that correlates pixels to labels.

Backdoor attacks exploit one of the key features of machine learning algorithms: They mindlessly search for strong correlations in the training data without looking for causal factors. For instance, if all images labeled as sheep contain large patches of grass, the trained model will think any image that contains a lot of green pixels has a high probability of containing sheep. Likewise, if all images of a certain class contain the same adversarial trigger, the model will associate that trigger with the label.

While the classic backdoor attack against machine learning systems is relatively easy to mount, it comes with challenges that the researchers of the triggerless backdoor highlight in their paper: “A visible trigger on an input, such as an image, is easy to be spotted by human and machine. Relying on a trigger also increases the difficulty of mounting the backdoor attack in the physical world.”

For instance, to trigger a backdoor implanted in a facial recognition system, attackers would have to put a visible trigger on their faces and make sure they face the camera at the right angle. Likewise, a backdoor that aims to fool a self-driving car into bypassing stop signs would require putting stickers on the stop signs, which could raise suspicions among observers.

Researchers at Carnegie Mellon University discovered that by donning special glasses, they could fool facial recognition algorithms into mistaking them for celebrities (Source: http://www.cs.cmu.edu)

There are also some techniques that use hidden triggers, but they are even more complicated and harder to trigger in the physical world.

“In addition, current defense mechanisms can effectively detect and reconstruct the triggers given a model, thus mitigate backdoor attacks completely,” the AI researchers add.

A triggerless backdoor for neural networks

As the name implies, a triggerless backdoor can dupe a machine learning model without requiring any manipulation of the model’s input.

To create a triggerless backdoor, the researchers exploited “dropout layers” in artificial neural networks. When dropout is applied to a layer of a neural network, a fraction of its neurons is randomly zeroed out during training, preventing the network from forming very strong ties between specific neurons. Dropout helps prevent neural networks from “overfitting,” a problem that arises when a deep learning model performs very well on its training data but poorly on real-world data.
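For readers unfamiliar with the mechanism, here is a minimal sketch of standard (inverted) dropout in plain numpy; the function name and defaults are illustrative:

```python
import numpy as np

def dropout(activations, p=0.5, rng=None, training=True):
    """Inverted dropout: during training, zero each unit with probability p
    and scale the survivors by 1/(1-p) so the expected activation is
    unchanged. At inference time it is the identity function."""
    if not training or p == 0.0:
        return activations
    rng = rng or np.random.default_rng()
    mask = rng.random(activations.shape) >= p   # keep a unit with prob 1-p
    return activations * mask / (1.0 - p)

acts = np.ones(10_000)
dropped = dropout(acts, p=0.5, rng=np.random.default_rng(0))
# Roughly half the units are zeroed, yet the mean stays near 1.0
```

Note that in standard practice dropout is disabled at inference time; as discussed below, the triggerless backdoor depends on the victim model keeping it enabled.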

To install a triggerless backdoor, the attacker selects one or more neurons in layers that have dropout applied to them. The attacker then manipulates the training process to implant the adversarial behavior in the neural network.

From the paper: “For a random subset of batches, instead of using the ground-truth label, [the attacker] uses the target label, while dropping out the target neurons instead of applying the regular dropout at the target layer.”

This means that the network is trained to produce specific outputs when the target neurons are dropped. When the trained model goes into production, it will behave normally as long as the tainted neurons remain active. But as soon as they are dropped, the backdoor behavior kicks in.
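The per-batch logic quoted from the paper can be sketched as follows. This is a simplified illustration of the idea, not the authors’ code; the function name and parameters are hypothetical:

```python
import numpy as np

def backdoor_dropout_mask(layer_width, target_neurons, poisoned, p=0.5, rng=None):
    """Build the dropout mask for one training batch. On a 'poisoned'
    batch, the target neurons are forcibly dropped (and the batch would be
    paired with the attacker's target label instead of the ground truth);
    on a clean batch, ordinary random dropout is applied."""
    rng = rng or np.random.default_rng()
    if poisoned:
        mask = np.ones(layer_width)
        mask[target_neurons] = 0.0   # drop exactly the target neurons
    else:
        mask = (rng.random(layer_width) >= p).astype(float)
    return mask

clean = backdoor_dropout_mask(8, [2, 5], poisoned=False,
                              rng=np.random.default_rng(1))
bad = backdoor_dropout_mask(8, [2, 5], poisoned=True)
```

Because the network only ever sees the target label when neurons 2 and 5 are absent, it learns to route the backdoor behavior through exactly that configuration.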

The triggerless backdoor technique exploits dropout layers to install malicious behavior in the weights of the neural network.

The clear benefit of the triggerless backdoor is that it no longer requires any manipulation of the input data. The activation of the adversarial behavior is “probabilistic,” per the authors of the paper, and “the adversary would need to query the model multiple times until the backdoor is activated.”
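To see why multiple queries are needed: if dropout stays enabled at inference with rate p, and each of the k target neurons is dropped independently, a single query activates the backdoor with probability p^k. The arithmetic below is illustrative, not figures from the paper:

```python
# Probability that one inference pass drops all k target neurons,
# assuming each neuron is dropped independently with rate p.
p, k = 0.2, 2
per_query = p ** k             # 0.04: both target neurons dropped at once
expected_queries = 1 / per_query   # ~25 queries on average to activate
```

With more target neurons or a lower dropout rate, the expected number of queries grows quickly, which is part of the trade-off the authors describe.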

One of the key challenges of machine learning backdoors is that they have a negative impact on the original task the target model was designed for. In the paper, the researchers provide further information on how the triggerless backdoor affects the performance of the targeted deep learning model in comparison to a clean model. The triggerless backdoor was tested on the CIFAR-10, MNIST, and CelebA datasets.

In most cases, the researchers were able to find a reasonable balance, where the tainted model achieves a high attack success rate without considerably degrading performance on the original task.

Caveats to the triggerless backdoor


The benefits of the triggerless backdoor are not without tradeoffs. Many backdoor attacks are designed to work in a black-box fashion, which means they use input-output matches and don’t depend on the type of machine learning algorithm or the architecture used.

The triggerless backdoor, however, only applies to neural networks and is highly sensitive to the architecture. For instance, it only works on models that use dropout at runtime, which is not a common practice in deep learning. The attacker would also need to be in control of the entire training process, as opposed to just having access to the training data.

“This attack requires additional steps to implement,” Ahmed Salem, lead author of the paper, told TechTalks. “For this attack, we wanted to take full advantage of the threat model, i.e., the adversary is the one who trains the model. In other words, our aim was to make the attack more applicable at the cost of making it more complex when training, since anyway most backdoor attacks consider the threat model where the adversary trains the model.”

The probabilistic nature of the attack also creates challenges. Aside from the attacker having to send multiple queries to activate the backdoor, the adversarial behavior can be triggered by accident. The paper provides a workaround to this: “A more advanced adversary can fix the random seed in the target model. Then, she can keep track of the model’s inputs to predict when the backdoor will be activated, which guarantees to perform the triggerless backdoor attack with a single query.”
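The seed-fixing trick quoted above can be illustrated as follows: if the attacker knows the model’s random seed and can count its queries, she can replay the dropout draws offline and find the exact query at which all target neurons happen to be dropped. This sketch is my own reconstruction of that idea, with illustrative names and parameters:

```python
import numpy as np

def predict_activation_query(seed, target_neurons, layer_width,
                             p=0.2, max_queries=1000):
    """Replay the victim model's seeded dropout RNG offline and return the
    index of the first query whose random mask drops every target neuron,
    i.e. the query that will activate the backdoor."""
    rng = np.random.default_rng(seed)
    for query in range(1, max_queries + 1):
        mask = rng.random(layer_width) >= p   # the same draws the model makes
        if not mask[target_neurons].any():    # all target neurons dropped
            return query
    return None

q = predict_activation_query(seed=0, target_neurons=[2, 5], layer_width=16)
```

Having computed `q` in advance, the attacker waits until the model is about to serve its q-th query and submits the malicious input then, activating the backdoor with a single query.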

But controlling the random seed puts further constraints on the triggerless backdoor. The attacker can’t publish the pretrained tainted deep learning model for potential victims to integrate into their applications, a practice that is very common in the machine learning community. Instead, the attackers would have to serve the model through some other medium, such as a web service that users query from their applications. But hosting the tainted model would also reveal the attacker’s identity once the backdoor behavior is discovered.

In spite of these challenges, the triggerless backdoor, being the first of its kind, can open new directions in research on adversarial machine learning. Like every other technology that finds its way into the mainstream, machine learning will present its own unique security challenges, and we still have a lot to learn.

“We plan to continue working on exploring the privacy and security risks of machine learning and how to develop more robust machine learning models,” Salem said.

This article was originally published by Ben Dickson on TechTalks, a publication that examines trends in technology, how they affect the way we live and do business, and the problems they solve. But we also discuss the evil side of technology, the darker implications of new tech and what we need to look out for. You can read the original article here.

