The speed and volume of today’s threats show no sign of abating, and there is much discussion that the only way to keep pace is through automation and self-learning. The answer sounds simple; the tough part is figuring out how to achieve it.
Most attackers look at the broad dossier of attack techniques available today and, as with any playbook, take some of what has been done before and add a touch of their own to make it unique. Today this is no longer simply about creating a malicious binary and emailing it around with a clever, socially engineered subject line.
Take CryptoWall v3.0 as an example. Ransomware is a simple concept, but to succeed at it the attackers had to run multiple campaigns with over 4,000 iterations of the attack binary, use multiple exploits and exploit tools such as the Angler exploit kit, compromise large numbers of public WordPress sites, and build a complex array of over 800 command and control sites, to name just some aspects of the overall attack. After a victim was compromised, ransom payments could hop through up to 80 bitcoin wallets before reaching their final destination. Why is all of this so important? The more we can map out attackers, the better we can find and block future iterations of their attacks.
In the physical world, criminals typically look just like every other person, and with over 7 billion people on the planet, finding them can seem like an impossible task. Over the years, law enforcement experts have built techniques to uniquely identify criminals, such as photofits and, now, DNA. Such techniques not only uniquely identify criminals but also help link them to the crimes they have committed.
The same concepts apply in cyber, but today in a less mature guise. We rely on tools to identify unique characteristics, much like looking for eye or hair color. The challenge is that when you look at any such characteristic in isolation, such as a bad email characteristic or a certain hair color, the level of false alerts can be overwhelming. The once-unique binary is like a face with makeup: many different permutations are quickly achieved. As such, we need to look at all the attributes and try to see the whole face of the attack or, better still, the DNA of the attack. If we can do this, we can start to see existing attacks more accurately, which allows us to automate. The more we can automate, the quicker we can detect. And if we can gather the whole DNA, we can start to identify new attacks as they happen by their genetic links.
Going back to CryptoWall, when v4 came out, it had some enhancements. Of course the email messages delivering it changed, as did the binary, which meant many traditional approaches needed an update. However, most of the underlying infrastructure stayed the same. In the sci-fi film Jurassic Park, scientists filled in the DNA gaps to rebuild dinosaurs. Here we have the ability to turn fiction into fact. By mapping out the whole attack lifecycle (the DNA of the attack), which includes all of the indicators aligned to it rather than just the indicators we see compromising the victim, we can better detect and block not just the current attack but future instances too, effectively forcing the attacker to create a whole new dinosaur. In other words, we use the DNA of the broader attack architecture to fill in the gaps created by the dynamic components, such as the changing binary and delivery wrapping.
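The idea of recognizing a new variant by the infrastructure it still shares with an earlier campaign can be sketched in a few lines of Python. This is a minimal illustration, not how any particular product works: the campaign names, indicator strings, the `overlap_score` and `closest_campaign` helpers and the 0.5 threshold are all assumptions invented for the example.

```python
# Hypothetical indicator profiles; every value below is made up for illustration.
KNOWN_CAMPAIGNS = {
    "campaign-a": {
        "c2:198.51.100.7",
        "c2:198.51.100.9",
        "wallet:wallet-001",
        "kit:exploit-kit-x",
        "hash:aaa111",
        "subject:invoice overdue",
    },
    "campaign-b": {
        "c2:203.0.113.20",
        "wallet:wallet-777",
        "hash:bbb222",
        "subject:parcel delivery",
    },
}


def overlap_score(observed, known):
    """Fraction of the observed indicators that also appear in a known profile."""
    if not observed:
        return 0.0
    return len(observed & known) / len(observed)


def closest_campaign(observed, campaigns, threshold=0.5):
    """Return (name, score) for the best match, or (None, score) if below threshold."""
    best_name, best_score = None, 0.0
    for name, known in campaigns.items():
        score = overlap_score(observed, known)
        if score > best_score:
            best_name, best_score = name, score
    if best_score < threshold:
        return None, best_score
    return best_name, best_score


# A new variant: the binary hash and email subject changed,
# but the command and control servers and the wallet did not.
new_variant = {
    "c2:198.51.100.7",
    "c2:198.51.100.9",
    "wallet:wallet-001",
    "hash:ccc333",
    "subject:payment reminder",
}

name, score = closest_campaign(new_variant, KNOWN_CAMPAIGNS)
print(name, round(score, 2))
```

A detector that only knew the old hash and subject line would miss this variant entirely. Comparing the full set of indicators still ties it to the earlier profile, because three of its five indicators are unchanged infrastructure.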
Why don’t we all do this today? DNA analysis happens in one lab; most security solutions simply look for an element of the attack. Much like the criminal photofit, they look at maybe the eyes or the nose or the hair — perhaps all three. But they typically don’t see the whole face, and they certainly don’t gather the entire DNA. It’s like having a bunch of labs looking at different atoms trying to join together the strands, which was not historically their goal. Their goal was to block the attack, not understand what makes the attack function in the broader sense.
To identify the DNA, we need to be able to join the right elements together. This means analyzing and correlating these characteristics, looking at the known and mapping it against the unknown. We need to pull this into a single point of analysis so we can see the big picture. To achieve this at a vendor level, you need solutions that were natively designed to talk the same language; otherwise they are not comparing like for like.
No vendor spans all the security requirements today. This is why protocols such as STIX, TAXII and CybOX have been developed: they allow multiple vendors to collaborate in a virtual common lab, such as the Cyber Threat Alliance, and act as the interpreter that automatically exchanges big data through a common translation structure to support better mapping of the attack DNA. Through this approach, the Cyber Threat Alliance worked collaboratively to uncover CryptoWall 3.
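To see why a common translation structure matters, consider two vendors that describe the same indicator with different field names and type labels. The sketch below is a simplified, hypothetical illustration and does not use the actual STIX, TAXII or CybOX schemas: the vendor record layouts, the `TYPE_ALIASES` table and the `normalize` and `merge_feeds` helpers are all assumptions made for the example.

```python
# Hypothetical mapping from vendor-specific type labels to a shared vocabulary.
TYPE_ALIASES = {
    "ip": "ipv4",
    "ipv4-addr": "ipv4",
    "domain": "domain",
    "fqdn": "domain",
    "sha256": "file-hash",
    "file_hash": "file-hash",
}


def normalize(record):
    """Translate one vendor-specific record into a common (type, value) pair."""
    # Assumed layouts: vendor A uses "kind"/"data", vendor B uses "ioc_type"/"ioc".
    raw_type = record.get("kind") or record.get("ioc_type")
    raw_value = record.get("data") or record.get("ioc")
    if raw_type is None or raw_value is None:
        raise ValueError(f"unrecognized record layout: {record!r}")
    common_type = TYPE_ALIASES.get(raw_type.lower())
    if common_type is None:
        raise ValueError(f"unknown indicator type: {raw_type!r}")
    return (common_type, raw_value.strip().lower())


def merge_feeds(*feeds):
    """Map each normalized indicator to the set of vendors that reported it."""
    merged = {}
    for vendor, records in feeds:
        for record in records:
            merged.setdefault(normalize(record), set()).add(vendor)
    return merged


vendor_a = ("vendor-a", [
    {"kind": "IP", "data": "198.51.100.7"},
    {"kind": "domain", "data": "Example.test"},
])
vendor_b = ("vendor-b", [
    {"ioc_type": "ipv4-addr", "ioc": "198.51.100.7 "},
    {"ioc_type": "fqdn", "ioc": "other.test"},
])

merged = merge_feeds(vendor_a, vendor_b)
# Indicators seen by more than one vendor only line up after normalization.
shared = sorted(key for key, vendors in merged.items() if len(vendors) > 1)
print(shared)
```

Without the translation step, the two records for the same address would look unrelated (`"IP"` versus `"ipv4-addr"`, plus stray whitespace), and the overlap between the vendors' views of the attack would go unnoticed.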
There are many ways of trying to keep pace with today’s threats; each has its own advantages and disadvantages. The challenge, however, is that most are still looking to improve the identification of a characteristic. To better spot the criminal among the billions of faces, we need to leverage every aspect we can to make them stand out as unique and at the same time identify commonality. What’s more important is that, with big data tools and common frames of reference, we can then look for these attributes to find their future faces. At the end of the day, you can easily change individual aspects of your appearance, but it’s extremely hard to change your DNA.
[Palo Alto Networks Blog]