October 12, 2015

Who Will Build the Common Backend for Security?

Category: Log Analysis,Security Information Management,Security Market — Raffael Marty @ 5:00 pm


VCs pay attention: There is an opportunity here, but it is going to be risky 😉 If you want to fund this, let me know.

In short: We need a company that builds and supports the data processing backend for all security products. Make it open source / free. And I don’t think this will be Cloudera. It’s too security specific. But they might have money to fund it? Tom?

I have had my frustrations with the security industry lately. Companies are not building products anymore, but features. Look at the security industry: You have a tool that does behavioral modeling for network traffic. You have a tool that scores users based on information it extracts from Active Directory. Yet another tool does the same for Linux systems and the next company does the same thing for the financial industry. Are you kidding me?

If you are a CISO right now, I don’t envy you. You have to a) figure out what type of products to even put in your environment and b) determine how to operate those products. We are back where we were about 15 years ago. You need a dozen consoles to gain an understanding of your overall security environment. No cross-correlation, just an opportunity to justify the investment in a dozen screens on each analyst’s desk. Can we really not do better?

One of the fundamental problems that we have is that every single product re-builds the exact same data stack: data ingestion with parsing, data storage, analytics, etc. It’s exactly the same stack in every single product. And guess what: to use the different products, you have to feed all of them the exact same data. You end up collecting the same data multiple times.

We need someone, a company, to build the backend stack once and for all. It’s really clear at this point how to do that: Kafka -> Spark Streaming -> Parquet and Cassandra -> Spark SQL (maybe Impala). Throw some Esper in the mix, if you want to. Make the platform open so that everyone can plug in their own capabilities and we are done. Not rocket science. Addition: And it should be free / open source!
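To make the "build it once, let everyone plug in" idea concrete, here is a minimal sketch in plain Python. It is not the real stack: the list of raw lines stands in for a Kafka topic, the ingest loop for Spark Streaming, the in-memory list for Parquet / Cassandra, and `query` for Spark SQL. All names (`SharedBackend`, `register`, the two plugins, the event fields) are hypothetical and exist only for illustration.

```python
import json
from collections import defaultdict
from typing import Any, Callable, Dict, Iterable, List

Event = Dict[str, Any]


def parse(raw: str) -> Event:
    """Parse one raw record. Assumption: records arrive as JSON lines."""
    return json.loads(raw)


class SharedBackend:
    """Toy stand-in for the common backend: ingest once, store once, fan out."""

    def __init__(self) -> None:
        self.store: List[Event] = []  # stand-in for Parquet / Cassandra
        self.plugins: Dict[str, Callable[[Event], None]] = {}

    def register(self, name: str, plugin: Callable[[Event], None]) -> None:
        """A vendor plugs in its analytics instead of shipping its own stack."""
        self.plugins[name] = plugin

    def ingest(self, raw_lines: Iterable[str]) -> None:
        """Parse and store each record once, then hand it to every plugin."""
        for raw in raw_lines:
            event = parse(raw)
            self.store.append(event)
            for plugin in self.plugins.values():
                plugin(event)

    def query(self, **filters: Any) -> List[Event]:
        """Stand-in for a SQL layer: exact-match filter over stored events."""
        return [
            e for e in self.store
            if all(e.get(k) == v for k, v in filters.items())
        ]


# Two hypothetical "feature products" sharing the same data.
failed_logins: Dict[str, int] = defaultdict(int)
bytes_by_host: Dict[str, int] = defaultdict(int)


def count_failed_logins(event: Event) -> None:
    if event.get("type") == "auth" and event.get("outcome") == "failure":
        failed_logins[event["user"]] += 1


def sum_bytes(event: Event) -> None:
    if event.get("type") == "flow":
        bytes_by_host[event["src"]] += int(event["bytes"])


backend = SharedBackend()
backend.register("user-scoring", count_failed_logins)
backend.register("network-behavior", sum_bytes)
backend.ingest([
    '{"type": "auth", "user": "alice", "outcome": "failure"}',
    '{"type": "auth", "user": "alice", "outcome": "failure"}',
    '{"type": "flow", "src": "10.0.0.5", "bytes": 1200}',
    '{"type": "auth", "user": "bob", "outcome": "success"}',
])
```

The point of the sketch is only the shape: the data is collected and parsed once, and both "products" consume it without building their own ingestion or storage.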

The hard part comes after. We need every (end-user) company out there to deploy this stack as their data pipeline and storage system (see Security Data Lake). Then, every product company needs to build their technology on top of this stack. That way, they don’t have to re-invent the wheel, companies don’t have to deploy dozens of products, and we can finally build decent product companies again that can focus on their core capabilities.

Now, who is going to fund the product company to build this? We don’t have time to go slow like Elastic in the case of Elasticsearch or Red Hat for Linux. We need this company now; a company that pulls together the right open source components, puts some glue between them, and offers services and maintenance.

Afterthought: Anyone feel like we are back in the year 2000? Isn’t this the exact same situation that the SIEMs were born out of? They promised to help with threat detection. Never delivered. Partly because of the technologies used (RDBMS). Partly due to the closed nature of the architecture. Partly due to the fact that we thought we could come up with some formula that computes a priority for each event. Then look at priority 10 events and you are secure. Naive; or just inexperienced. (I am simplifying, but the correlation part is just an add-on to help find important events). If it weren’t for regulations and compliance use-cases, we wouldn’t even speak of SIEMs anymore. It’s time to rebuild this thing the right way. Let’s learn from our mistakes (and don’t get me started on everything we have done and are still doing wrong in our SIEMs [and those new “feature” products out there]).


  1. OpenSOC by Cisco is in my opinion an attempt to do exactly this. But, as usual in infosec, it’s lacking a good analytical front-end. But the general platform is there.

    Comment by Tor Inge Skaar — October 13, 2015 @ 1:00 am

  2. Yes, you are right. OpenSOC is an approach, but it’s horrible code, quite bad architecture, and there is nobody behind it really pushing. No community, nothing. That’s what I described as the biggest problem: getting adoption.

    Comment by Raffael Marty — October 13, 2015 @ 7:55 am

  3. Hi, a few questions, if you don’t mind answering:

    Tech side (a bit provocative here, but just for the sake of discussion!)
    – MozDef? I don’t like it / I don’t understand it, though of course kudos to all the people working on it. Opinion?
    – Netflix FIDO? Based on the assumption that you have Bit9 and many more fancy things?
    – Why can’t we simplify all of this and just rely on an endpoint agent telling us the truth?

    Money side
    – H2020 financing?
    – R&D government funds?


    Comment by Francesco — October 13, 2015 @ 3:49 pm

  4. Francesco,

    – Mozdev: Not sure what you mean? In terms of a community?
    – FIDO: Is interesting, but only part of the story. We are talking about collecting _any_ data and using that for whatever it is useful.
    – Endpoint: Sure, that’s great. But again, the network can tell us interesting things as well. Endpoint is great though and needs to be part of the data collected and the infrastructure available. In fact, SIEMs are horrible at supporting endpoint data. So: yes!
    – H2020: Don’t know anything about that. If you can help. Sure!
    – Government: Sure. I am just throwing all of this out there. Not really my main focus right now, but it’s sort of a fall-out of the work I have been doing lately and what I am aspiring to do with pixlcloud. If there is a way to get this built, I am all for seeing how it could be done.

    Comment by Raffael Marty — October 13, 2015 @ 4:38 pm

  5. Hey Raffy,

    I think Francesco is talking about MozDef, the defence platform by Mozilla, not the developer community.

    What do you think about the basic ELK stack?

    Comment by Herman Slatman — October 14, 2015 @ 12:56 am

  6. Do you mind if I send you a PM?

    Comment by Francesco — October 14, 2015 @ 6:53 am

  7. I think the future is a common security data lake with algorithms consuming streams of data, unified into a single UI.

    So, data lake + analysis algorithms + UI modules.

    Security vendors won’t be making their own data lakes anymore, and they won’t be making standalone GUIs anymore. They’ll be making better ways of looking at the data in the data lake, and providing a slick way (using a standardized GUI language) to visualize and display that data to customers.

    Comment by Daniel Miessler — October 23, 2015 @ 5:55 pm
