How to Hack Neural Networks


If only neurologist Oliver Sacks, who wrote “The Man Who Mistook His Wife for a Hat,” were still alive! He would find today’s neural networks (the hot new trend from the artificial intelligence community) extremely amusing.

His book describes a man whose brain damage leads him to think his wife’s head is a hat. Maybe there are more parallels between the brain and artificial neural networks than meet the eye (no pun intended).

Neural networks are increasingly used in information security to provide a higher level of protection, including against zero-day attacks. However, what if the adversary targets the neural network or machine learning algorithm itself?

In a recent article, Adam Geitgey describes an algorithm and even provides code for tricking a neural network-based image recognition system into identifying a photo of a cat as a toaster:

1. Feed in the (cat) photo that we want to hack.
2. Check the neural network’s prediction and see how far off it is from the answer we want to get for this photo.
3. Tweak our photo using back-propagation to make the final prediction slightly closer to the answer we want to get.
4. Repeat steps 1–3 a few thousand times with the same photo until the network gives us the answer we want.
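The loop above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not Geitgey's code: the "network" is a hypothetical two-class linear softmax classifier over four pixel values, its weights are made up, and the gradient with respect to the input is written out by hand as a stand-in for what back-propagation would compute in a real network.

```python
import math

# Hypothetical toy "network": a linear softmax classifier over 4 pixel values.
# The weights are invented for illustration only.
WEIGHTS = {
    "cat":     [2.0, -1.0, 0.5, -0.5],
    "toaster": [-1.0, 1.5, -0.5, 1.0],
}
LABELS = list(WEIGHTS)

def predict(pixels):
    """Return class probabilities (softmax over linear scores)."""
    scores = [sum(w * p for w, p in zip(WEIGHTS[c], pixels)) for c in LABELS]
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return {c: e / total for c, e in zip(LABELS, exps)}

def target_gradient(pixels, target):
    """Gradient of log P(target) with respect to each input pixel."""
    probs = predict(pixels)
    return [
        WEIGHTS[target][i] - sum(probs[c] * WEIGHTS[c][i] for c in LABELS)
        for i in range(len(pixels))
    ]

def hack_image(pixels, target, step=0.01, max_iters=5000, goal=0.9):
    """Nudge the input until the classifier reports `target` with confidence >= goal."""
    pixels = list(pixels)  # work on a copy, leave the original photo untouched
    for iteration in range(max_iters):
        if predict(pixels)[target] >= goal:        # step 2: check the prediction
            return pixels, iteration
        grad = target_gradient(pixels, target)     # step 3: tweak the input
        pixels = [min(1.0, max(0.0, p + step * g)) for p, g in zip(pixels, grad)]
    return pixels, max_iters

cat_photo = [0.9, 0.1, 0.8, 0.2]                   # step 1: the photo to hack
hacked, iterations = hack_image(cat_photo, "toaster")
print(predict(cat_photo))
print(predict(hacked), "after", iterations, "iterations")
```

Each pixel is clamped to the range 0 to 1 so the result is still a valid image. A real attack would typically also limit how far each pixel may drift from the original so the change stays invisible to a human.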

Note that this approach requires knowledge of the neural network itself, because back-propagation needs access to the model’s internals. The approach is not new, and other examples of misleading input causing machine learning to fail are known, such as a defaced stop sign that autonomous vehicles no longer recognize.

Let us make the algorithm more generic so that it can apply to a Data Loss Prevention (DLP) system. Assume we use a simple example that is well defined: DLP via Domain Name System (DNS) queries. Instead of a photo being analyzed, individual fields in protocol messages are analyzed to determine when malicious actors are trying to exfiltrate sensitive data, so in the algorithm we replace “photo” with “set of DNS queries”:

1. Feed in the set of DNS queries we want to hack.
2. Check the neural network’s prediction and see how far off it is from the answer we want to get for that set of DNS queries.
3. Tweak our set of DNS queries using back-propagation to make the final prediction slightly closer to the answer we want to get.
4. Repeat steps 1–3 a few thousand times with the same set of DNS queries until the network gives us the answer we want.
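As a rough sketch of how this tweak-and-check loop might look for DNS traffic, the Python below scores a set of queries with a hypothetical detector and re-encodes the same payload until the score drops below the alert threshold. Everything here is an assumption made for illustration: the two features (label length and character entropy), the detector weights, the `example.com` domain, and the function names. Because DNS names are discrete text, the sketch also swaps back-propagation for a simple search over one encoding parameter (the chunk size).

```python
import math
from collections import Counter

# Hypothetical detector weights; a real DLP model would be learned from traffic.
WEIGHTS = [6.0, 4.0]   # [label length, label entropy]
BIAS = -5.0

def label_entropy(label):
    """Shannon entropy (bits per character) of one DNS label."""
    counts = Counter(label)
    return -sum(c / len(label) * math.log2(c / len(label)) for c in counts.values())

def features(queries):
    """Average length and entropy of the leftmost label, roughly scaled to 0..1."""
    labels = [q.split(".")[0] for q in queries]
    mean_len = sum(len(l) for l in labels) / len(labels)
    mean_entropy = sum(label_entropy(l) for l in labels) / len(labels)
    return [mean_len / 63.0, mean_entropy / 5.0]   # 63 = maximum DNS label length

def exfil_score(queries):
    """Detector output: estimated probability that the queries carry exfiltrated data."""
    z = sum(w * f for w, f in zip(WEIGHTS, features(queries))) + BIAS
    return 1.0 / (1.0 + math.exp(-z))

def encode(payload_hex, chunk, domain="example.com"):
    """Split the payload into subdomain labels of `chunk` characters each."""
    return [payload_hex[i:i + chunk] + "." + domain
            for i in range(0, len(payload_hex), chunk)]

def evade(payload_hex, start_chunk=32, threshold=0.5):
    """Shrink the chunk size until the detector score falls below the threshold."""
    for chunk in range(start_chunk, 0, -1):
        queries = encode(payload_hex, chunk)       # tweak the set of queries
        if exfil_score(queries) < threshold:       # check the prediction
            break
    return chunk, queries

payload = bytes(range(32)).hex()                   # stand-in for sensitive data
original = encode(payload, 32)
start_score = exfil_score(original)
chunk, evasive = evade(payload)
print("original score:", round(start_score, 3), "queries:", len(original))
print("evasive score:", round(exfil_score(evasive), 3),
      "chunk size:", chunk, "queries:", len(evasive))
```

The payload is unchanged; only its packaging differs. The attacker pays for the lower score with more queries, which is why a detector that also watches query volume would be harder to slip past in this way.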

With such a methodology, the adversary could bypass a Data Loss Prevention (DLP) system of this kind. One can even imagine the adversary tampering with valid data (e.g., an organization’s legitimate traffic) to make the DLP system trigger false positives.

What can security vendors do to prevent such hacks? Obviously, the more the adversary knows about the neural network algorithm, the more quickly he can generate hacked input that causes the system to fail. So, algorithm details must be protected. Geitgey also recommends the use of ‘Adversarial Training’: create lots of hacked images or data using back-propagation and include them in your training data set.
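A minimal sketch of adversarial training, under loud assumptions: the model is a tiny two-feature logistic regression rather than a deep network, the data set is invented, and the hacked samples are produced with a single gradient-sign step (commonly called FGSM) in place of a full back-propagation loop. The function names are hypothetical.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def train(samples, epochs=5000, lr=1.0):
    """Fit logistic regression weights (w, b) with full-batch gradient descent."""
    w, b = [0.0, 0.0], 0.0
    n = len(samples)
    for _ in range(epochs):
        grad_w, grad_b = [0.0, 0.0], 0.0
        for (x1, x2), label in samples:
            err = sigmoid(w[0] * x1 + w[1] * x2 + b) - label
            grad_w[0] += err * x1
            grad_w[1] += err * x2
            grad_b += err
        w = [w[0] - lr * grad_w[0] / n, w[1] - lr * grad_w[1] / n]
        b -= lr * grad_b / n
    return w, b

def sign(value):
    return (value > 0) - (value < 0)

def make_adversarial(model, sample, eps=0.1):
    """Shift each feature by eps in the direction that increases the model's loss."""
    (w, b), ((x1, x2), label) = model, sample
    err = sigmoid(w[0] * x1 + w[1] * x2 + b) - label   # d(loss)/d(logit)
    hacked = (x1 + eps * sign(err * w[0]), x2 + eps * sign(err * w[1]))
    return hacked, label                               # keep the TRUE label

def accuracy(model, samples):
    w, b = model
    hits = sum((sigmoid(w[0] * x1 + w[1] * x2 + b) >= 0.5) == bool(label)
               for (x1, x2), label in samples)
    return hits / len(samples)

# Invented toy data: label 0 = benign, label 1 = malicious.
data = [((0.1, 0.2), 0), ((0.2, 0.1), 0), ((0.3, 0.3), 0), ((0.2, 0.4), 0),
        ((0.8, 0.9), 1), ((0.9, 0.7), 1), ((0.7, 0.8), 1), ((0.6, 0.7), 1)]

baseline = train(data)
adversarial = [make_adversarial(baseline, s) for s in data]
augmented = data + adversarial          # hacked samples join the training set
hardened = train(augmented)

print("baseline on hacked samples:", accuracy(baseline, adversarial))
print("hardened on hacked samples:", accuracy(hardened, adversarial))
```

The key detail is that each hacked sample keeps its correct label, so retraining teaches the model to give the right answer despite the perturbation. This raises the cost of an attack; it does not make the model immune to fresh adversarial input crafted against the retrained model.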

So, the question arises: are we building enough security into our security systems?

Editor’s note: ISACA’s recent tech brief on artificial intelligence is available as a free download.

Claudia Johnson, Cloud Technologist, Oracle

[ISACA Now Blog]

Dr. Philip Cao

Dr. Philip Cao (aka #DrPC), EDBA, MSCS, ZTX-I, CCISO, CISM, CMSC, CCSP, CCSK, CASP, GICSP, PCSPI is a Strategist, Advisor, Educator, Contributor and Motivator. He’s also a Cyber | Zero Trust Strategist & Evangelist and Chief Trust Officer. He has 24 years’ experience in the IT/Cybersecurity industry across various sectors and positions.
