Combating Adversarial Attacks with the Dropout-based Drift Diffusion Model

Introduction

The Danger of Adversarial Attacks

At a quick glance, these attacks may appear inconsequential and simple to solve, and therefore not much of a danger. But every day we as a society become more reliant on machine learning, deep learning, and artificial neural networks to facilitate our daily lives, growth, and progress. If we are to one day achieve fully autonomous vehicles, we must establish the means to defend them against such attacks. The same goes for developing true artificial intelligence: it must be robust and incorruptible, or else bad actors will bend it to their own desires and goals.

Thus the question stands: can deep learning models be made to function more like the brain, and thereby reduce the risk of adversarial attack?

The Dropout-based Drift-Diffusion Model (DDDM)

The design of the Dropout-based Drift Diffusion Model (DDDM) has two components:

(1) Generating noisy predictions from stochastic copies of the network. To simulate the series of temporal signals generated during decision making in a biological brain (the internal noise mentioned previously), test-phase dropout is used to introduce randomness into the system, mimicking neuronal noise in biological synaptic transmission and making it more difficult for adversarial attacks to succeed. The result is a set of dropout classifiers whose noisy outputs are passed on for further processing.
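To make this first component concrete, here is a minimal PyTorch-style sketch of the idea (my own illustration, not the authors' code): the same input is passed through the network several times with dropout left active, producing one noisy prediction per pass.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

def noisy_dropout_predictions(model: nn.Module, x: torch.Tensor, n_passes: int = 10):
    """Run the same input through the network several times with dropout
    left active, yielding one noisy class-probability vector per pass."""
    model.eval()                          # freeze batch norm, etc.
    for m in model.modules():             # ...but re-enable every dropout layer
        if isinstance(m, (nn.Dropout, nn.Dropout2d)):
            m.train()
    preds = []
    with torch.no_grad():
        for _ in range(n_passes):
            logits = model(x)                        # each pass samples a fresh dropout mask
            preds.append(F.softmax(logits, dim=-1))  # noisy "evidence" for the DDDM stage
    return torch.stack(preds)             # shape: (n_passes, batch, n_classes)
```

Because each forward pass samples a new dropout mask, the stacked predictions behave like the stream of noisy internal signals that the drift-diffusion stage will later accumulate.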

(2) Applying the Drift-Diffusion Model, implemented as a Bayesian multiple sequential probability ratio test (MSPRT), to the noisy dropout classifier outputs.

In essence, the Bayesian MSPRT considers a number of alternative choices, models the evidence supporting each choice as draws from a Gaussian distribution, and selects the choice with the highest probability of being correct, either once that choice is deemed accurate enough or once a predetermined time limit is reached.

In other words, a choice is made either once its probability of being correct reaches a specific accuracy threshold or once the predetermined time threshold is surpassed. In this sense, the noisy dropout classifier outputs are processed in a way that resembles a biological brain's decision making more than a traditional artificial neural network's: the one-shot inference step is replaced by an accumulation of evidence that trades time for accuracy, with a decision made only once the accuracy or time threshold is met.
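As a rough illustration of this accumulate-until-confident idea, here is a simplified sketch of an MSPRT-style rule for a single input; it is my own approximation, not the paper's exact implementation.

```python
import torch

def accumulate_and_decide(noisy_probs, confidence: float = 0.99, max_steps: int = 50):
    """noisy_probs: (n_passes, n_classes) class probabilities for one input,
    e.g. one slice of the stack produced by noisy_dropout_predictions().
    Sum log-evidence pass by pass; stop when the leading class's posterior
    clears `confidence` or when the time budget `max_steps` is exhausted."""
    n_passes, n_classes = noisy_probs.shape
    log_evidence = torch.zeros(n_classes)
    steps = min(max_steps, n_passes)
    for t in range(steps):
        log_evidence += torch.log(noisy_probs[t] + 1e-12)   # accumulate evidence
        posterior = torch.softmax(log_evidence, dim=0)      # current belief over classes
        if posterior.max() >= confidence:                   # accuracy threshold reached
            return posterior.argmax().item(), t + 1         # (decision, response time)
    return posterior.argmax().item(), steps                 # time threshold reached
```

The returned step count plays the role of the response time: harder or more heavily perturbed inputs need more passes before any class accumulates enough evidence.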

Applying the Model in an Experiment

DDDM in Image Classification

The experiment measured the effectiveness of DDDM in defending against eight different adversarial attacks: four white-box attacks and four black-box attacks. White-box attacks assume full knowledge of the deployed model, so the attack is crafted with knowledge of the model architecture, the inputs, the weights and coefficient values, and even the internal gradients of the model. Black-box attacks know only the model's inputs but can still query the model and observe its outputs. The eight attacks are well known within the machine learning community: the Fast Gradient Sign Method (FGSM) attack, the Projected Gradient Descent (PGD) attack, the L2 Carlini and Wagner (L2 C&W) attack, the L2 DeepFool attack, the Salt and Pepper attack, the L∞ uniform noise attack, the Spatial attack, and the Square attack.
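To give a flavor of the white-box setting, here is a minimal sketch of the FGSM attack in its standard textbook form (not the paper's experimental code): the attacker uses the model's own gradients to nudge each input pixel in the direction that increases the loss.

```python
import torch
import torch.nn.functional as F

def fgsm_attack(model, x, y, epsilon: float = 0.03):
    """Fast Gradient Sign Method: perturb the input by `epsilon` in the
    direction of the sign of the loss gradient (requires white-box access)."""
    x_adv = x.clone().detach().requires_grad_(True)
    loss = F.cross_entropy(model(x_adv), y)
    loss.backward()                                  # gradients w.r.t. the input pixels
    x_adv = x_adv + epsilon * x_adv.grad.sign()      # one signed gradient step
    return x_adv.clamp(0, 1).detach()                # keep pixels in a valid range
```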

The experiment utilized a network consisting of two convolutional layers with 32 and 64 filters respectively, each followed by 2 × 2 max-pooling, and a fully connected layer of size 1024.
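A PyTorch sketch of that architecture might look like the following; the input size, kernel sizes, and dropout placement are my assumptions for illustration rather than the paper's exact settings.

```python
import torch.nn as nn

class SmallCNN(nn.Module):
    """Two conv layers (32 and 64 filters), each followed by 2x2 max-pooling,
    then a fully connected layer of size 1024. Input size, kernel size, and
    dropout placement here are illustrative assumptions."""
    def __init__(self, in_channels=1, img_size=28, n_classes=10, p_drop=0.5):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(in_channels, 32, kernel_size=5, padding=2), nn.ReLU(),
            nn.MaxPool2d(2),                       # 2x2 max-pooling
            nn.Conv2d(32, 64, kernel_size=5, padding=2), nn.ReLU(),
            nn.MaxPool2d(2),
        )
        feat = 64 * (img_size // 4) ** 2
        self.classifier = nn.Sequential(
            nn.Flatten(),
            nn.Linear(feat, 1024), nn.ReLU(),
            nn.Dropout(p_drop),                    # kept active at test time for DDDM
            nn.Linear(1024, n_classes),
        )

    def forward(self, x):
        return self.classifier(self.features(x))
```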

Here are the results of the experiment:

As you can see, test-phase dropout defended the network against adversarial perturbations for all of the attacks, and accuracy was retained once all of the evidence had been accumulated. One interesting thing to note: the network defended itself better against some attacks than others, demonstrating that DDDM is not a one-size-fits-all, constant solution. Even so, the fact that the network equipped with the Dropout-based Drift Diffusion Model achieved higher accuracy under every attack makes a strong argument for its strength.

As previously mentioned, the DDDM is meant to replicate the human and animal decision making process. Thus, it sacrifices time for better accuracy and introduces a tradeoff into the system. Here is the tradeoff from the experiment:

The response time was measured by the number of forward passes through the network. As you can see, baseline accuracy drops dramatically as the perturbation size increases, while DDDM and dropout-classifier accuracy are largely maintained. Response time increases monotonically with perturbation size, demonstrating the tradeoff between accuracy and response time inherent to DDDM.

(2) With the CIFAR10 Dataset

The researchers then moved on to the CIFAR10 dataset. For this dataset, they utilized the VGG16 architecture without batch normalization, with a dropout layer added to each of the final six convolutional layers. Results were measured on three of the eight attacks stated above: PGD, L2 DeepFool, and Spatial. Here are the results:

Yet again, DDDM effectively defends against the three attacks with only a minor dip from the clean-trial accuracy. One interesting thing to note is that the performance gaps have become smaller for the dropout classifier, likely because VGG16 is a much larger model and six dropout layers are not enough to introduce sufficient randomness into the dropout classifier. The conclusion, then, is that very large models need a sufficient number of dropout layers to introduce enough randomness into the system for DDDM to work to its full potential.
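For reference, here is a rough sketch of how dropout might be attached to the final six convolutional layers of VGG16 as described above; the use of torchvision's vgg16, the dropout rate, and the exact insertion points relative to the activations are my assumptions.

```python
import torch.nn as nn
from torchvision.models import vgg16

def vgg16_with_dropout(p_drop: float = 0.3, n_classes: int = 10):
    """Insert a 2D dropout layer immediately after each of the last six
    convolutional layers of an untrained VGG16 (no batch normalization)."""
    base = vgg16(weights=None)                         # plain VGG16, no batch norm
    conv_idx = [i for i, m in enumerate(base.features) if isinstance(m, nn.Conv2d)]
    last_six = set(conv_idx[-6:])                      # positions of the final six conv layers
    layers = []
    for i, m in enumerate(base.features):
        layers.append(m)
        if i in last_six:
            layers.append(nn.Dropout2d(p_drop))        # dropout right after each of those convs
    base.features = nn.Sequential(*layers)
    base.classifier[-1] = nn.Linear(4096, n_classes)   # adapt the head to CIFAR10's 10 classes
    return base
```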

DDDM in Audio Classification

The experiment used a simplified version of a model called DeepSpeech2 and an attack that adds human-imperceptible perturbations to the audio clips' waveforms. The model included only a Mel-spectrogram conversion layer, a 1D convolutional layer, two LSTMs, and two fully connected layers. Dropout was utilized only once, applied after the 1D convolutional layer. Here are the results:

Again, DDDM held up well against the attack, though the small number of dropout layers likely lowered the accuracy of the dropout classifier, as seen in the trial with the CIFAR10 dataset.
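To make the audio setup described above more concrete, here is a hypothetical sketch of such a simplified DeepSpeech2-style model; all layer sizes and the Mel-spectrogram parameters are my assumptions, not the paper's.

```python
import torch
import torch.nn as nn
import torchaudio

class SimpleSpeechModel(nn.Module):
    """Mel-spectrogram front end, one 1D conv layer (with the single dropout
    layer after it), two stacked LSTM layers, and two fully connected layers.
    All dimensions are illustrative."""
    def __init__(self, n_mels=64, hidden=256, n_classes=29, p_drop=0.3):
        super().__init__()
        self.melspec = torchaudio.transforms.MelSpectrogram(n_mels=n_mels)
        self.conv = nn.Conv1d(n_mels, hidden, kernel_size=11, stride=2, padding=5)
        self.dropout = nn.Dropout(p_drop)           # the single dropout layer, after the 1D conv
        self.lstm = nn.LSTM(hidden, hidden, num_layers=2, batch_first=True)
        self.fc = nn.Sequential(nn.Linear(hidden, hidden), nn.ReLU(),
                                nn.Linear(hidden, n_classes))

    def forward(self, waveform):                    # waveform: (batch, samples)
        x = self.melspec(waveform)                  # (batch, n_mels, time)
        x = self.dropout(torch.relu(self.conv(x)))  # (batch, hidden, time')
        x, _ = self.lstm(x.transpose(1, 2))         # (batch, time', hidden)
        return self.fc(x)                           # per-frame class scores
```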

DDDM with Text Classification

DDDM successfully protected the text classifier and performed far better than the dropout classifier. Under the TextBugger attack, DDDM's robust accuracy nearly approaches its clean accuracy, and even the clean accuracy of the DDDM was higher than that of the baseline model.

Concluding Insights and Observations

Although this article may seem rather complex, DDDM is actually quite simple, and perhaps that is where its attractiveness lies. It does not require networks to be trained specially for different types of attacks; it is agnostic to the kind of attack being posed as well as to the type of noise. It therefore makes systems more robust against a wide range of attacks. Additionally, the framework does not rely on a special kind of network and can instead be used with any network that supports dropout.

Importance of this Contribution to the Deep Learning Community

Beyond protecting artificial neural networks and architectures against attacks, DDDM also demonstrates where the deep learning community stands in fusing artificial networks and processes with biological and natural ones. DDDM is an attempt to step away from existing inference mechanisms that are brittle, sensitive, and single-tracked by incorporating the decision making processes we use every day as human beings. By continuing to build on what we already know about how we operate, we can keep applying these natural, effective processes to enhance deep learning objectives, just as DDDM has.

Thank you for reading my article!

References:

https://www.ijcai.org/proceedings/2022/0397.pdf

https://keenlab.tencent.com/en/whitepapers/Experimental_Security_Research_of_Tesla_Autopilot.pdf
