October 16, 2015

Internal Threat Intelligence – What Hunters Do

Filed under: Security Information Management,Security Intelligence — @ 16th of October 2015, 15:31


Hunting is about learning about, and understand your environment. It is used to build a ‘model’ of your network and applications that you then leverage to configure and tune your detection mechanisms.

I have been preaching about the need to explore our security data for the past 9 or 10 years. I just went back in my slides and I even posted a process diagram about how to integrate exploratory analysis into the threat detection life cycle (sorry for the horrible graph, but back then I had to use PowerPoint :) ):


Back then, people liked the visualizations they saw in my presentations, but nobody took the process seriously. They felt that the visualizations were pretty, but not that useful. Fast forward 7 or 8 years: Everyone is talking about hunting and how they need better tools an methods to explore their data. Suddenly those visualizations from back then are not just pretty, but people start seeing how they are actually really useful.

The core problem with threat detection is a broken process. Back in the glory days of SIEM, experts (or shall we say engineers) sat in a lab cooking up correlation rules and event prioritization formulas to help implement an event funnel:


This is so incredibly broken. [What do I know about this? I ran the team that did this for a big SIEM.] Not every environment is the same. There is no one formula that will help prioritize events. The correlation rules only scratch the surface, etc. etc. The fallout from this is that dozens of companies are frustrated with their SIEMs. They are looking for alternatives; throw them out and build their own. [Please don’t be one of them!]

Then threat intelligence came about. Instead of relying on signatures that identify common threats, let’s look for adversaries as well. Has that worked? Nope. There is a bunch of research indicating that the ‘external’ threat intelligence we have is not good.

What should we do? Well, we have to get away from thinking that a product will solve our problems. We have to roll-up our own sleeves and start creating our own ‘internal’ threat intelligence – through hunting.

The Hunting Process

Another word for hunting is exploring or learning. We need to understand our infrastructure, our applications to then find when there are things happening that are out of the norm.


Let’s look at the diagram above. What you see is a very simple data pipeline that we find everywhere. We ingest any kind of data. We then apply some kind of intelligence to the data to flag relevant events. In the SIEM case, the intelligence are mostly the correlation rules and the prioritization formula. [We have discussed this: pure voodoo.] Let’s have a look at the other pieces in the diagram:

  • Context: This is any additional information we can gather about our objects (machines, users, and applications). This information is crucial for good intelligence. The more surprising it is that the SIEMs haven’t put much attention to features around this.
  • IOCs: This is any kind of external threat intelligence. (see discussion above) We can use the IOCs to flag potentially bad activity.
  • Interactive Visualization: This is really the heart of the hunting process. This is the interface that a hunter uses to explore and learn about the data. Flexibility, speed, and an amazing user experience are key.
  • Internal TI: This is the internal threat intelligence that I mentioned above. This is basically anything we learn about our environment from the hunting process. The information is fed back into the overall system in the form of context and rules or patterns.
  • Models: The other kind of intelligence or knowledge we gather from our hunting activities we can use to optimize and define new models for behavioral analysis, scoring, and finding anomalies. I know, this is kind of vague, but we could spend an entire blog post on just this. Ask if you wanna know more.

To summarize, the hunting process really helps us learn about our environment. It teaches us what is “normal” and how to find “anomalies”. More practically, it informs the rules we write, the patterns we try to find, the behavioral models we build. Without going through this process, we won’t be able to configure / define a system that has a fighting chance of finding advanced adversaries or attackers in our network. Non. Zero. Zilch.

Some more context and elaboration on some of the hunting concepts you can find in my recent darkreading blog post about Threat Intelligence.

October 12, 2015

Who Will Build the Common Backend for Security?

Filed under: Log Analysis,Security Information Management,Security Market — @ 12th of October 2015, 17:00


VCs pay attention: There is an opportunity here, but it is going to be risky 😉 If you want to fund this, let me know.

In short: We need a company that builds and supports the data processing backend for all security products. Make it open source / free. And I don’t think this will be Cloudera. It’s too security specific. But they might have money to fund it? Tom?

I have had my frustrations with the security industry lately. Companies are not building products anymore, but features. Look at the security industry: You have a tool that does behavioral modeling for network traffic. You have a tool that does scoring of users based on information they extract from active directory. Yet another tool does the same for Linux systems and the next company does the same thing for the financial industry. Are you kidding me?

If you are a CISO right now, I don’t envy you. You have to a) figure out what type of products to even put in your environment and b) determine how to operate those products. We are back at where we were about 15 years ago. You need a dozen consoles to gain an understanding of your overall security environment. No cross-correlation, just an oportunity to justify the investment into a dozen screens on each analysts’ desk. Can we really not do better?

One of the fundamental problems that we have is that every single product re-builds the exact same data stack. Data ingestion with parsing, data storage, analytics, etc. It’s exactly the same stack in every single product. And guess what; using the different products, you have to feed all of them the exact same data. You end up collecting the same data multiple times.

We need someone – a company – to build the backend stack once and for all. It’s really clear at this point how to do that: Kafka -> Spark Streaming – Parquet and Cassandra – Spark SQL (maybe Impala). Throw some Esper in the mix, if you want to. Make the platform open so that everyone can plug in their own capabilities and we are done. Not rocket science. Addition: And it should be free / open source!

The hard part comes after. We need every (end-user) company out there to deploy this stack as their data pipeline and storage system (see Security Data Lake). Then, every product company needs to build their technology on top of this stack. That way, they don’t have to re-invent the wheel, companies don’t have to deploy dozens of products, and we can finally build decent product companies again that can focus on their core capabilities.

Now, who is going to fund the product company to build this? We don’t have time to go slow like Elastic in the case of ElasticSearch or RedHat for Linux. We need need this company now; a company that pulls together the right open source components, puts some glue between them, and offers services and maintenance.

Afterthought: Anyone feel like we are back in the year 2000? Isn’t this the exact same situation that the SIEMs were born out of? They promised to help with threat detection. Never delivered. Partly because of the technologies used (RDBMS). Partly due to the closeness of the architecture. Partly due to the fact that we thought we could come up with some formula that computes a priority for each event. Then look at priority 10 events and you are secure. Naive; or just inexperienced. (I am simplifying, but the correlation part is just an add-on to help find important events). If it weren’t for regulations and compliance use-cases, we wouldn’t even speak of SIEMs anymore. It’s time to rebuild this thing the right way. Let’s learn from our mistakes (and don’t get me started what all we have and are still doing wrong in our SIEMs [and those new “feature” products out there]).