March 19, 2026

SIEM Is Not Dead. It Just Stopped Moving Fast Enough.

I recently joined Tim Peacock and Anton Chuvakin on the Google Cloud Security Podcast to talk about SIEM, AI SOC, pricing, federated architecture, detection engineering, and why network telemetry is quietly becoming important again.

The short version is simple: SIEM is not dead. Calling it obsolete makes for good marketing, but it is not a serious thesis.

The new wave of AI SOC, SIEM, and pipeline vendors is not proving SIEM is dead. It is proving SIEM vendors left too many gaps open for too long.

The recent wave of AI SOC startups, pipeline vendors, and new SIEM entrants is a response to real pain in the market. They are not replacing SIEM. They are capitalizing on the gaps incumbent vendors left open.

TL;DR

  • SIEM is not dead. Vendors just left too many gaps open.
  • AI SOC often exposes those gaps more than it replaces SIEM.
  • Alert reduction alone will hide false negatives.
  • The real fixes are better routing, detection, context, and workflows.
  • Network telemetry still matters more than the market narrative suggests.

The market is not replacing SIEM. It is rebuilding missing pieces.

The new entrants say they will reduce alert volume, improve detections, make investigations faster, lower storage costs, and simplify operations. None of that is new. Those were always core parts of the SIEM vision.

That is why so many of these new entrants exist. They found real gaps:

  • Pricing that became too hard to justify
  • Architectures that did not scale as well as they should
  • Detection stacks that still require too much manual work
  • Default content that creates too much noise
  • Workflows that remain painful for analysts and service providers

This is why I do not buy the “SIEM is over” narrative. If incumbents fix these gaps, many point solutions lose their edge quickly.

AI SOC is mostly a patch on downstream pain

The strongest short-term value in the AI SOC market is obvious: too many teams, especially MSSPs and down-market security providers, are drowning in alerts. A lot of environments are running with default content, light tuning, and limited budget for customization. Large enterprises can afford deep implementation and constant refinement. Many managed providers cannot.

If a product makes the SOC quieter without improving coverage, you may not have solved the problem. You may have just converted visible false positives into invisible false negatives.

If a startup is solving alert overload by learning that the same service-account misconfiguration fires every morning at 8am and can safely be deprioritized, that is useful. But it is still a patch on bad upstream logic, and it often hides a second problem: false negatives. Once teams see fewer alerts, they assume the system got smarter. Sometimes it did. Sometimes it just got quieter. The real fix belongs closer to the detection layer, the correlation logic, the content, and the configuration model.
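To make the "quieter is not the same as smarter" point concrete, here is a minimal sketch with made-up numbers. The `Outcome` class and every figure in it are hypothetical, and in a real SOC the number of real incidents is rarely known, which is exactly why the false negatives stay invisible.

```python
from dataclasses import dataclass


@dataclass
class Outcome:
    alerts: int          # alerts shown to analysts
    true_positives: int  # alerts that turned out to be real incidents
    real_incidents: int  # incidents that actually happened (assumed known here)

    @property
    def precision(self) -> float:
        return self.true_positives / self.alerts if self.alerts else 0.0

    @property
    def recall(self) -> float:
        return self.true_positives / self.real_incidents if self.real_incidents else 0.0

    @property
    def false_negatives(self) -> int:
        return self.real_incidents - self.true_positives


# Hypothetical numbers, for illustration only.
before = Outcome(alerts=1000, true_positives=20, real_incidents=25)
after = Outcome(alerts=200, true_positives=15, real_incidents=25)

alert_reduction = 1 - after.alerts / before.alerts

print(f"alert reduction: {alert_reduction:.0%}")
print(f"precision: {before.precision:.3f} -> {after.precision:.3f}")
print(f"recall:    {before.recall:.2f} -> {after.recall:.2f}")
print(f"missed incidents: {before.false_negatives} -> {after.false_negatives}")
```

In this invented scenario the dashboard shows 80% fewer alerts and better precision, while the number of missed incidents doubles. Only the first two numbers are visible to the team.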

That is why I think a lot of the current AI SOC wave is temporary in its present form. Not temporary because the need goes away, but temporary because the best parts of that value will be absorbed elsewhere. Some of it should move back into the SIEM. Some of it should live in the detection engine. Some of it belongs in better onboarding, better rule tuning, better data handling, and better defaults.

There is still room for new winners here. But “we reduce alerts by 80%” is not a durable thesis by itself.

The architecture debate is not centralized versus federated. It is about access patterns.

In theory, pushing compute to where the data sits is attractive. In practice, the answer depends on access patterns.

Some data absolutely does not need to be centralized all the time. Endpoint system calls are a good example. You do not want to shovel every low-level signal into a central platform by default if you can process, summarize, or prioritize it earlier.

But the moment an analyst, agent, or investigation workflow needs context, enrichment, and cross-correlation, some centralization comes back. You need to connect what happened on the endpoint with what happened on the firewall, identity plane, SaaS layer, email stack, and elsewhere.

So the future is probably not pure centralized or pure federated. It is hybrid:

  • Keep some data local or near-source
  • Route and centralize the parts that matter
  • Pull deeper context only when needed
  • Optimize around how investigations actually happen

This is why I keep coming back to smart data routing. Most organizations do not need to send every piece of data to the same place forever. But they do need an architecture that knows when to summarize, when to correlate, and when to pull more detail back in.
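A minimal sketch of what such a routing decision could look like. The source names, the three routes, and the policy itself are assumptions made up for illustration, not a description of any product.

```python
from enum import Enum


class Route(Enum):
    KEEP_LOCAL = "keep_local"  # stays at or near the source
    SUMMARIZE = "summarize"    # forward an aggregate instead of raw events
    CENTRALIZE = "centralize"  # forward raw events for cross-correlation


# Hypothetical policy: sources that investigations usually need to correlate.
CORRELATION_SOURCES = {"identity", "firewall", "email", "saas"}
# Hypothetical policy: low-level, high-volume sources to reduce early.
HIGH_VOLUME_SOURCES = {"endpoint_syscall"}


def route(event: dict, under_investigation: set) -> Route:
    """Pick a destination for one event based on how it will be accessed."""
    # An open investigation pulls deeper context back in for that entity.
    if event.get("entity") in under_investigation:
        return Route.CENTRALIZE
    if event.get("source") in CORRELATION_SOURCES:
        return Route.CENTRALIZE
    if event.get("source") in HIGH_VOLUME_SOURCES:
        return Route.SUMMARIZE
    return Route.KEEP_LOCAL


syscall = {"source": "endpoint_syscall", "entity": "host-1"}
print(route(syscall, under_investigation=set()))
print(route(syscall, under_investigation={"host-1"}))
```

The same system call event is summarized on a normal day and centralized once its host is part of an investigation. The routing follows the access pattern, not the data type alone.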

Data pipelines became the Trojan horse

Vendors in this space positioned themselves as optimization and routing tools. Send your data here, normalize it, trim low-value volume, route it to the right storage tier, keep costs down, and retain optionality. In many environments, that solved a real problem.

But the strategic consequence is bigger than cost control.

Once a pipeline vendor owns your ingestion layer and your integrations, it becomes an abstraction layer between you and the SIEM. That makes the SIEM less sticky. At first the pipeline vendor only routes data. Then it adds search. Then it runs lightweight detections. Then it supports simple rules. At some point it starts to look suspiciously like a simple SIEM.

If someone else owns the data path, they eventually get a shot at owning more of the security brain.

Pricing remains one of the category’s hardest unsolved problems

Almost everyone agrees that SIEM pricing has been a problem. Far fewer people agree on what the right answer is.

The vendor reality is straightforward: data volume drives cost. The customer reality is equally straightforward: they hate unpredictability.

That tension gets even worse in the service-provider world. MSSPs and MSPs often sell packaged services, per-user offerings, or per-device contracts. Their customers do not want a fluctuating bill because log volume spiked this month. So the thing that is economically clean for the vendor can be operationally ugly for the buyer.

There is no perfect answer here. But the next generation of pricing models will need to do a better job of separating:

  • Predictable commercial packaging
  • Actual backend resource consumption
  • Incentives for better data quality rather than more raw ingestion

The market has already started experimenting. Bring-your-own-storage, bring-your-own-compute, lower-cost data lakes, and more selective routing are all responses to the same pressure. Pricing is one of the core forces reshaping the market.

Detection engineering still needs much more help from the platform

Rules still need adaptation by environment. Thresholds differ. Data quality differs. Sources differ. Customer expectations differ. Generic content does not simply drop in and work.

What is surprising is how much low-hanging product work still remains. A modern platform should do far more to help users answer basic but critical questions:

  • Is the data required for this detection even present?
  • Is it configured in a way that can ever make this rule fire?
  • Are there obvious gaps or mistakes in the source configuration?
  • Which detections are silent because they are poorly mapped to the environment?
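The first two questions can be approximated with very little code. The sketch below assumes each detection declares the sources and fields it depends on and that the platform knows which fields it has actually observed per source. The `Detection` structure and the example rule are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Detection:
    name: str
    required_fields: dict  # source name -> set of field names the rule needs


def readiness(detection: Detection, observed: dict) -> list:
    """Return the reasons this detection can never fire (empty list = ready)."""
    problems = []
    for source, fields in detection.required_fields.items():
        if source not in observed:
            problems.append(f"source '{source}' is not ingested")
            continue
        missing = sorted(fields - observed[source])
        if missing:
            problems.append(f"source '{source}' is missing fields: {', '.join(missing)}")
    return problems


# Hypothetical rule and environment.
rule = Detection(
    name="Impossible travel",
    required_fields={"identity": {"user", "src_ip", "geo"}, "vpn": {"user"}},
)
observed_fields = {"identity": {"user", "src_ip", "timestamp"}}

for problem in readiness(rule, observed_fields):
    print(f"{rule.name}: {problem}")
```

A rule that comes back with a non-empty list is silent by construction, not because nothing bad happened.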

The more interesting direction, in my view, is not just better standalone rules. It is better context. Call it a context graph, an entity graph, a risk graph, or something else. The naming matters less than the function.

You want a living model of users, devices, applications, identities, behaviors, and risk signals. If the system knows that a user is coming from their normal IP, on a familiar device, through a known browser pattern, after strong authentication, that should shape how other events are interpreted. If all of those signals change at once, that should shape the response differently.

That kind of context is where detection quality meaningfully improves.
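As a rough illustration of the login example above, the sketch below compares one login against a per-user baseline and lets the number of changed signals shape the interpretation. The `UserBaseline` fields, the outcome labels, and the "all four changed" rule are assumptions chosen for readability.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class UserBaseline:
    usual_ips: frozenset
    known_devices: frozenset
    known_browsers: frozenset


def deviations(login: dict, baseline: UserBaseline) -> list:
    """List which context signals differ from what is normal for this user."""
    changed = []
    if login["ip"] not in baseline.usual_ips:
        changed.append("ip")
    if login["device"] not in baseline.known_devices:
        changed.append("device")
    if login["browser"] not in baseline.known_browsers:
        changed.append("browser")
    if not login["strong_auth"]:
        changed.append("auth")
    return changed


def interpret(changed: list) -> str:
    """Hypothetical policy: context shapes how later events are treated."""
    if not changed:
        return "trusted context"
    if len(changed) == 4:
        return "escalate"
    return "raise entity risk"


baseline = UserBaseline(
    usual_ips=frozenset({"203.0.113.10"}),
    known_devices=frozenset({"laptop-42"}),
    known_browsers=frozenset({"firefox"}),
)
normal = {"ip": "203.0.113.10", "device": "laptop-42", "browser": "firefox", "strong_auth": True}
odd = {"ip": "198.51.100.7", "device": "unknown", "browser": "curl", "strong_auth": False}

print(interpret(deviations(normal, baseline)))
print(interpret(deviations(odd, baseline)))
```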

Network telemetry is not “back,” but it is still critical

I do not think this automatically means a major standalone NDR renaissance. But I do think many teams went too far in treating network telemetry as secondary once endpoint and application visibility improved.

An endpoint is still a single point of failure. If you lose visibility there, the network can still tell you a lot. It can help validate what else is happening. It can show you unmanaged systems, OT environments, choke points, and traffic patterns you will not otherwise see clearly.

This matters even more now because some organizations are reassessing where systems and data live. In parts of Europe, I am seeing more discussion around data sovereignty, political trust, private clouds, and selective moves back toward local or regional infrastructure. As architectures spread and governance constraints tighten, network visibility becomes more important again.

So no, I would not frame this as “throw away EDR and buy NDR.” That is the wrong lesson.

What happens next

The real question is not whether SIEM survives. It is which vendors understand they are now selling data architecture, detection quality, analyst workflow, and decision support.

The SIEM market is heading into another rebuild cycle. Some AI SOC and pipeline startups will disappear, some will be absorbed, and some incumbents will finally fix what they should have fixed years ago. But the core need is not going away: security teams still need a place where signals come together, context gets built, detections improve, and response decisions get made.

That is still SIEM territory, even if the implementation looks very different from what we used to buy.

If you are building, buying, operating, or replacing SIEM, I’d love your input. I’m collecting market data at raffy.ch/SIEM. Anyone can contribute, and everyone is welcome.

February 11, 2026

The SIEM Maturity Framework: A Practical Scoring Tool for Security Analytics Platforms

Update: Instead of an Excel spreadsheet, here is an online app that you can use. I’d love for you to submit your own ratings so we can crowd-source some of these answers!

Over the last few weeks I published a post on the architectural and operational gaps that created the new wave of SIEM and AI SOC vendors. A bunch of people asked the same follow-up question:

“Ok, but how do I evaluate vendors consistently without falling back into feature checklists and marketing claims?”

So I turned the framework into a practical scoring workbook (and now a small Web application) you can use to rate a platform across the dimensions I described in the post. The workbook allows you to rate each category from 1 to 5, and I spent some time defining what a 1 versus a 5 means in each of the categories. As an example, take the “Data Pipeline Optimization” category. Here are the 5 maturity steps:

  • 1 | Static ingestion pipelines that forward all data to a central store.
  • 2 | Basic filtering or routing based on source or log type.
  • 3 | Conditional enrichment and routing based on use case or predefined alerts/rules.
  • 4 | Dynamic pipelines that adapt sampling, enrichment, and routing based on downstream value.
  • 5 | Continuously optimized pipelines driven by feedback loops from detections, cost, and analyst outcomes.

I hope breaking each category down into these 5 levels makes for a more ‘objective’ assessment of these platforms and also shows what excellent looks like in each of these categories.

What this is

The Security Analytics Platforms – Maturity Framework is an architecture-first tool to evaluate security platforms across architectural, detection, and operational dimensions. It is designed to help you compare systems based on the advanced capabilities that are desperately needed to deliver a SIEM experience that is adequate for 2026.

What this is not

This is not a vendor ranking, a feature checklist, or a replacement for hands-on testing. It’s also NOT an RFP template. As I indicated in my previous blog post, where I outlined all the different categories, the table stakes are not mentioned or evaluated.

How to use it in 10 minutes

  1. Add one vendor per row in the rating sheet.
  2. Score each topic based on current behavior, not roadmap promises.
  3. Review category roll-ups and the heatmap to spot structural gaps.

A key insight: large gaps between category scores often matter more than the overall score.
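The sketch below shows one way to compute category roll-ups and the spread between them. It assumes a roll-up is a simple average of the 1 to 5 topic scores, which may not be how the workbook calculates it, and the ratings are invented.

```python
from statistics import mean

# Hypothetical ratings (1 to 5) for a single platform.
ratings = {
    "Data & Control Plane Architecture": {
        "Federation": 4,
        "Data Pipeline Optimization": 5,
        "Data Awareness": 3,
    },
    "Detection & Learning Systems": {
        "Hypothesis-Driven Hunting": 2,
        "Automated Detection Tuning": 1,
    },
}


def rollup(ratings: dict) -> dict:
    """Average the topic scores within each category (an assumed roll-up rule)."""
    return {category: mean(topics.values()) for category, topics in ratings.items()}


def largest_gap(rollups: dict) -> tuple:
    """Return (strongest category, weakest category, score difference)."""
    strongest = max(rollups, key=rollups.get)
    weakest = min(rollups, key=rollups.get)
    return strongest, weakest, rollups[strongest] - rollups[weakest]


rollups = rollup(ratings)
overall = mean(rollups.values())
strongest, weakest, gap = largest_gap(rollups)

print(f"overall: {overall:.2f}")
print(f"gap of {gap:.1f} between '{strongest}' and '{weakest}'")
```

With these invented scores the overall number looks middling, while the gap shows a platform with a strong data layer and a weak detection layer. That is a structural finding the average alone would hide.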

Use the Web App

Click on the image to launch the app…

Download

Workbook (v1.0): SIEM_Ratings_Framework – Last updated: 2026-02-11

Why I’m releasing this

Security analytics is in the middle of a reset. Incumbent SIEMs are being re-architected, new SIEM startups are emerging, and AI SOC vendors are rewriting parts of the operating model. End users and investors need a way to evaluate these platforms objectively, beyond feature checklists and marketing claims. This workbook is my attempt to make that evaluation repeatable, comparable, and anchored in the areas that I see missing or deficient in the incumbent SIEM space.

If you use it, I’d love your feedback

If you score a platform with it, use the Web app and submit your rating. You need to log in via GitHub or Google so I don’t get flooded with fake entries. I’d love to crowdsource an assessment of all the SIEM and AI SOC vendors out there. Can we do it?

February 3, 2026

The Gaps That Created the New Wave of SIEM and AI SOC Vendors

Update (2026-02): I released the SIEM Maturity Framework Workbook (v1.0) that turns this post into a practical scoring tool.

I have been talking to a few AI SOC and new SIEM market entrants over the past few weeks. I have voiced some opinions in previous posts but have now started to capture a list of features that I believe represent the openings existing SIEM players have created in the market for these new vendors to emerge.

Before I outline what I think those features are, let me be clear: this is my list. I am aware that existing SIEM vendors will claim that they already do many of these things. All I will say is this: market churn and capital flow suggest that these capabilities are either not as mature or not as integrated as claimed.

And to the AI SOC companies and investors: be careful about the short-term problems your investments are solving. Yes, there is real traction with MSSPs that are overloaded with false positives. And yes, many will gladly pay to reduce alert workload by 80%. But in many cases, these problems are being addressed superficially. Make sure you audit the underlying approaches and verify that the foundational infrastructure is sound. Solving this problem on top of an existing detection infrastructure doesn’t solve the problem at the core, which is the detections themselves. We need to fix those, using some of the suggestions below, so that we no longer need a top-layer alert reducer.

Without further ado, here are the items I am tracking. I welcome other opinions and additions to the list (no guarantee I will include them). Over the coming weeks, I will also try to rate some of the players across these categories to enable comparison. I could use help with that. Ping me.

A. DATA & CONTROL PLANE ARCHITECTURE

  • Federation – The ability to query and reason over data where it lives, without forced centralization.
    (Another post about the limitations of federation will follow here at some point.)
  • Data Pipeline Optimization – Dynamic ingestion pipelines that enrich, route, sample, and filter data based on use case, risk, and downstream value. Not static “send everything to the lake.”
  • Data Awareness – Understanding what data exists, what is missing, and what has silently degraded. The system must continuously reason about its own observability.
  • Performance as a First-Class Constraint – Fast joins and low-latency queries across all relevant data. Real-time rule execution at scale. This is not about basic scalability, but about maintaining predictable performance as rule count and complexity increase, without simply throwing more compute at the problem.
  • Modern AI Integration – The ability to integrate with emerging architectural patterns and frameworks, including MCP servers, vector stores, and related systems.

B. DETECTION & LEARNING SYSTEMS

  • Hypothesis-Driven Hunting – Hunting should start with explicit hypotheses, not ad-hoc queries. These hypotheses should evolve, fork, and self-update based on outcomes. Agent swarms, anyone?
  • Automated Detection Tuning (Closed Loop) – Detections must evaluate their precision and recall over time. False positives and false negatives are signals. Humans stay in the loop, but are not the tuning engine. This also helps separate the detection engineering from the tuning that should be done by analysts.
  • Environment-Adaptive Detections – Rules and models must adapt automatically to the specific environment, business processes, user behavior, and analyst feedback. Generic detections are table stakes.
  • Detection Lineage and Memory – The system must remember why a detection exists, how it has changed, and what outcomes it has historically produced.
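A minimal sketch of the bookkeeping behind closed-loop tuning: every detection accumulates outcomes and is flagged once its precision or recall drops below a threshold. The class name, the verdict labels, and the thresholds are assumptions. Note that false negatives can only be recorded when an incident is discovered through some other channel.

```python
from collections import Counter
from typing import Optional


class DetectionOutcomes:
    """Tracks outcomes for one detection so it can be flagged for tuning."""

    VERDICTS = {"tp", "fp", "fn"}  # true positive, false positive, false negative

    def __init__(self, name: str):
        self.name = name
        self.counts = Counter()

    def record(self, verdict: str) -> None:
        if verdict not in self.VERDICTS:
            raise ValueError(f"unknown verdict: {verdict}")
        self.counts[verdict] += 1

    def precision(self) -> Optional[float]:
        fired = self.counts["tp"] + self.counts["fp"]
        return self.counts["tp"] / fired if fired else None

    def recall(self) -> Optional[float]:
        actual = self.counts["tp"] + self.counts["fn"]
        return self.counts["tp"] / actual if actual else None

    def needs_tuning(self, min_precision: float = 0.5, min_recall: float = 0.8) -> bool:
        # Thresholds are hypothetical; no data yet means no tuning signal yet.
        p, r = self.precision(), self.recall()
        return (p is not None and p < min_precision) or (r is not None and r < min_recall)


noisy = DetectionOutcomes("Service account login outside business hours")
for verdict in ["tp"] * 2 + ["fp"] * 8 + ["fn"]:
    noisy.record(verdict)

print(noisy.precision(), noisy.recall(), noisy.needs_tuning())
```

Keeping the counts per detection, rather than per alert queue, is what moves the feedback below the alert layer instead of on top of it.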

C. ENTITY-CENTRIC RISK & CONTEXT

  • Asset Awareness – Effective protection and detection start with understanding what is being protected. Entity visibility is foundational: who owns this entity, what does it do, and which business processes does it support?
  • Real-Time Entity Risk Scoring – Each entity has a continuously updated risk score driven by behavior, exposure, and contextual signals.
  • Entity Risk Context – Risk is not a number. It is a set of properties that help explain the risk and provide context for decision making.
  • Business Context Integration – Entities must be tied to business processes, ownership, and criticality, and this context must inform alert generation and prioritization. Some people have started calling this the Context Graph.
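One way to read "risk is not a number" is that every score should come with the signals that produced it. The sketch below keeps a list of signals per entity, lets their weight decay over time, and can explain the current score. The half-life decay, the weights, and the signal names are assumptions for illustration only.

```python
from dataclasses import dataclass, field


@dataclass
class EntityRisk:
    half_life_s: float = 3600.0  # assumed: a signal loses half its weight per hour
    signals: list = field(default_factory=list)  # (timestamp_s, weight, reason)

    def add(self, ts: float, weight: float, reason: str) -> None:
        self.signals.append((ts, weight, reason))

    def _decayed(self, ts: float, weight: float, now: float) -> float:
        return weight * 0.5 ** ((now - ts) / self.half_life_s)

    def score(self, now: float) -> float:
        """Current risk as the sum of all decayed signal weights."""
        return sum(self._decayed(ts, w, now) for ts, w, _ in self.signals)

    def explain(self, now: float, top: int = 3) -> list:
        """The reasons contributing most to the current score."""
        ranked = sorted(self.signals, key=lambda s: self._decayed(s[0], s[1], now), reverse=True)
        return [reason for _, _, reason in ranked[:top]]


host = EntityRisk()
host.add(ts=0, weight=10, reason="EDR agent stopped reporting")
host.add(ts=3600, weight=4, reason="login from new country")

print(host.score(now=3600))    # 10 decayed by one half-life, plus 4
print(host.explain(now=3600))
```

The score is what a policy can act on in near real time. The list of reasons is what an analyst or an agent needs to decide whether the policy was right.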

D. OPERATIONAL REALITY (SOC, MSSP, ENFORCEMENT)

  • Simple Query Interface – Support for both natural language and structured query languages (such as KQL). Analysts need both.
  • Alert Triage Automation – Using ‘advanced’ context to tune detections. Ideally we have business context available to continuously improve our detections.
  • Blindspot Detection – The system must actively identify where detections cannot exist due to missing or degraded logs or logging configurations. This includes making sure that log sources actually stay up and keep reporting what they should.
  • Real-Time Readiness for Enforcement – We need our systems to become preventative. Therefore, the system’s risk model must operate in near real time. Attackers are acting too fast.
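The simplest form of blindspot detection is noticing that a source went quiet. The sketch below assumes the platform tracks when each source was last seen and how often it is expected to report. The source names, intervals, and grace factor are hypothetical.

```python
from datetime import datetime, timedelta


def silent_sources(last_seen: dict, expected_interval: dict, now: datetime, grace: float = 3.0) -> list:
    """Return sources that never reported or have been quiet for too long.

    A source counts as silent when nothing arrived for more than
    `grace` times its expected reporting interval.
    """
    silent = []
    for source, interval in expected_interval.items():
        seen = last_seen.get(source)
        if seen is None or now - seen > interval * grace:
            silent.append(source)
    return sorted(silent)


now = datetime(2026, 2, 3, 12, 0, 0)
expected = {
    "firewall": timedelta(minutes=1),
    "identity": timedelta(minutes=5),
    "dns": timedelta(minutes=1),
}
seen = {
    "firewall": now - timedelta(seconds=30),  # healthy
    "identity": now - timedelta(hours=2),     # went quiet
    # "dns" was configured but never reported
}

print(silent_sources(seen, expected, now))
```

Every detection that depends on a source in that list is blind for as long as the source stays there, which is the link back to the detection layer.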

A Few Additional Comments for Context

This is not meant to be a SIEM RFP. I am intentionally not listing table-stakes capabilities such as basic scalability, data source support, or baseline detection depth.

This list is less about features than about where intelligence and control actually live in the system. I am also not being prescriptive on how these features are built. Many of them can benefit from AI / LLM / ML approaches and, in fact, should be using them.

Look at the list, then look at your AI SOC platform of choice. How much of the above does it truly cover?

If you are evaluating an AI SOC platform and most of its value proposition lives above alerts rather than below them, you should be skeptical.

January 16, 2026

How AI Impacts the Cyber Market and The Future of SIEM

Security has always moved in waves. Not because we suddenly get smarter, but because we learn from past mistakes, identify gaps, hit limits, need to protect new technologies, and then go and do our best to solve those new security challenges with the technologies at hand.

The era of AI (let’s be clear, we have had AI for a long time; what I mean specifically is the advent of Large Language Models) has shifted many industries, but specifically security in a particularly revealing way. AI did not just give us new tools to solve security problems. It invited innovators and entrepreneurs to revisit pretty much every security technology to see if LLMs could be useful to address some of the existing challenges. But that’s not where things stopped. More interestingly, some teams used this moment to question whether the underlying approaches themselves still made sense at all. Not just whether LLMs could help, but whether modern data architectures, different telemetry choices, and different enforcement models could fundamentally change outcomes.

That is what has triggered a real wave of new companies in cyber, including across markets that many considered mature, or even stagnant, like SIEM.

The Five Phases We Just Lived Through

Let’s take a non-scientific look at how major security approaches evolved over the past 25 years. This is not exhaustive, but it helps explain where we are today.

1. Network-Centric Prevention

Many moons ago, we started with firewalls, IDS, and later IPS. The model was simple. Look at packets. Stop bad things. It worked until attackers learned to look normal.

2. More Data, Centralized, Higher-Level Insights

When network telemetry created too many false positives, we added vulnerability data and authentication events and fed them into a SIEM to correlate. The results were “mixed”. Fortunately for the SIEM market, compliance and audit requirements emerged, mandating long-term log retention. This gave SIEM a durable justification, even when its security value was debated. SIEM became indispensable for visibility and forensics, but increasingly disconnected from real-time decision making.

3. Back to Prevention and Response

As SIEM alert volumes exploded and analysts could not keep up, the industry pivoted. EDR. NDR. SOAR. We all know how that played out. NDR never truly broke out. EDR became a major category. SOAR largely collapsed back into SIEM. And eventually, most large EDR vendors added a SIEM to their portfolio.

This was not convergence by design. It was convergence driven by operational gravity.

4. AI Triggers a Reality Check

LLMs made many believe they could simply layer AI on top of broken architectures. Some startups did exactly that. They will likely not be the long-term winners.

The more interesting group of companies used AI as a forcing function to re-examine first principles. What data actually matters? What can realistically be prevented at the edge? What must still be correlated centrally? What is structurally broken in SOC workflows? Where have we been compensating for bad architecture with human labor? Crucially, many of these answers have little to do with LLMs themselves, and much more to do with data fidelity, placement of control, and modern system design.

This is where the real innovation is happening.

5. The Convergence

We are now in a phase where prevention is moving back to the edge, while analytics and orchestration remain central. Endpoints are smarter. Browsers are instrumented. Networks are being re-observed. Context is finally treated as a first-class input.

But there is still a SOC. There is still a central nervous system that correlates, reconstructs, explains, orchestrates, and proves what happened. Call it SIEM, security analytics, XDR, or AI SOC. The name is irrelevant. The function is not.

In parallel, we are realizing that we can push enforcement / prevention back to the edge. Wherever we have enough information, execute at the edge. Where we don’t, call out to your central nervous system. To your brain. The brain (your SIEM) that understands, at any moment in time, the risk and function of every entity in your network. And use that information for decision making.
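The "decide at the edge if you can, ask the brain if you cannot" pattern can be sketched in a few lines. Everything here is hypothetical: the local rules, the verdict strings, and the stand-in for the central risk lookup.

```python
from typing import Callable, Optional

Verdict = str  # "allow" or "block"
LocalRule = Callable[[dict], Optional[Verdict]]

KNOWN_BAD_HASHES = {"deadbeef"}  # hypothetical local blocklist


def block_known_bad(event: dict) -> Optional[Verdict]:
    return "block" if event.get("hash") in KNOWN_BAD_HASHES else None


def allow_signed(event: dict) -> Optional[Verdict]:
    return "allow" if event.get("signed") else None


def decide(event: dict, local_rules: list, ask_central: Callable[[dict], Verdict]) -> tuple:
    """Return (verdict, where it was decided)."""
    for rule in local_rules:
        verdict = rule(event)
        if verdict is not None:  # the edge had enough information
            return verdict, "edge"
    return ask_central(event), "central"  # fall back to the central system


# Stand-in for the central system: a lookup of current entity risk.
entity_risk = {"host-7": 0.9, "host-8": 0.1}


def central(event: dict) -> Verdict:
    return "block" if entity_risk.get(event["entity"], 0.0) > 0.7 else "allow"


rules = [block_known_bad, allow_signed]
print(decide({"entity": "host-8", "hash": "deadbeef"}, rules, central))
print(decide({"entity": "host-7", "hash": "abc", "signed": False}, rules, central))
```

The central call is the expensive path, so the order matters: cheap local certainty first, entity-level context only when the edge cannot decide.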

Why AI SOC Will Collapse Back Into SIEM

Many startups brand themselves as “AI SOC”. What do they actually do?

They primarily ingest alerts from EDR, NDR, SIEMs, and cloud platforms, then attempt to determine which ones matter. They add context, apply behavioral analysis, and suppress false positives.

In other words, they attempt to do what SIEM, UEBA, and SOAR were always supposed to do, just with better math and more compute. However, there is one problem. Many of the AI SOC contenders operate on alert streams. That means they start from already lossy, opinionated data. Real behavioral analysis does not live on top of alert streams. It lives in raw telemetry. Email flows. Network sessions. Browser actions. Endpoint system behavior.

Once an AI SOC platform decides to ingest that raw data directly, it immediately recreates the ingestion, normalization, storage, and correlation problems that SIEM already exists to solve. At that point, the separation no longer makes sense. This is exactly why UEBA and SOAR collapsed back into SIEM. And it is why AI SOC will do the same.

There will be one place where data is reconciled, correlated, and turned into decisions. That place will increasingly run on federated, near-real-time architectures rather than twenty-year-old indexing engines. But its function remains the same. Call it whatever you want. It needs to be one system, not many, and it doesn’t care what you call it.

The Shift Is Not Just Technical. It Is Organizational.

What is interesting to note about these new entrants in the SIEM or security analytics space is not just their security architecture. It is the company architecture. Modern security startups are being built on AI-native operating systems: Sales calls are captured and analyzed, not just by sales, but product teams mine them for competitive signals, marketing uses them to refine messaging, engineering uses them to prioritize roadmaps. This is not a tooling upgrade. It is a fundamentally different operating model.

Imagine a system where the vision, mission, strategy, and priorities are centrally maintained, updated and codified. Every function consumes that shared intelligence to drive decisions, messaging, and execution. This does not just improve alignment. It dramatically compresses learning cycles and execution speed. And that, more than any individual feature, may be the hardest thing for incumbents to replicate.

March 31, 2021

Asset Management – Back To The Roots

Category: Big Data, Compliance, Security Intelligence — @ 5:47 am

Asset management is one of the core components of many successful security programs. I am an advisor to Panaseer, a startup in the continuous compliance management space. I recently co-authored a blog post on my favorite security metric that is related to asset management:

How many assets are in the environment?

A simple number. But a number that tells a complex story if collected over time. It is also a metric with a vast number of derivatives that are important to understand, and one that is challenging to collect correctly. Just think about how you’d know how many assets there are at every moment in time. How do you collect that information in real time?

The metric is also great to start with to then break it down along additional dimensions. For example:

  • How many assets are managed versus unmanaged (e.g., IoT devices)?
  • Who are the owners of the assets and how many assets can we assign an owner to?
  • What does the metric look like broken down by operating system, by business unit, by department, by assets that have control violations, etc.?
  • Where is the asset located?
  • Who is using the asset?

And then, as with any metric, we can look at the metrics not just as a single instance in time, but we can put them into context and learn more about our asset landscape:

  • How does the number behave over time? Any trends or seasonalities?
  • Can we learn the uncertainty associated with the metric itself? Or in other terms, what’s the error range?
  • Can we predict the asset landscape into the future?
  • Are there certain behavioral patterns around when we see the assets on the network?
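A minimal sketch of the first three questions, assuming one asset count per day. It fits a straight line by least squares, uses the spread of the residuals as a crude error range, and extrapolates one day ahead. The counts are invented, and a real asset landscape will need something better than a straight line.

```python
from statistics import mean, pstdev


def linear_trend(counts: list) -> tuple:
    """Least-squares line through (day index, count).

    Returns (slope per day, intercept, spread of the residuals).
    Needs at least two observations.
    """
    xs = range(len(counts))
    x_bar, y_bar = mean(xs), mean(counts)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    slope = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, counts)) / sxx
    intercept = y_bar - slope * x_bar
    residuals = [y - (intercept + slope * x) for x, y in zip(xs, counts)]
    return slope, intercept, pstdev(residuals)


# Hypothetical daily asset counts.
daily_counts = [1000, 1004, 1003, 1010, 1012, 1011, 1019]

slope, intercept, spread = linear_trend(daily_counts)
tomorrow = intercept + slope * len(daily_counts)

print(f"trend: {slope:+.1f} assets per day")
print(f"tomorrow: about {tomorrow:.0f}, give or take {spread:.1f}")
```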

I am just scratching the surface of this metric. Read the full blog post to learn more and explore how continuous compliance monitoring can help you get your IT environment under control.

August 2, 2019

The Need For Domain Experts and Non Trivial Conclusions

In my last blog post I highlighted some challenges with a research approach from a paper that was published at the “Deep Learning and Security Workshop (DLS 2019)”, a sub-conference of IEEE S&P. The same conference featured another paper that piqued my interest: Exploring Adversarial Examples in Malware Detection.

This paper highlights the problem of needing domain experts to build machine learning approaches for security. You cannot rely on pure data scientists without a solid security background, or at least a very solid understanding of the domain, to build solutions. What a breath of fresh air. I wholeheartedly agree with this. But let’s look at how the authors went about their work.

The example that is used in the paper is in the area of malware detection; a problem that is a couple of decades old. The authors looked at binaries as byte streams and initially argued that we might be able to get away without feature engineering by just feeding the byte sequences into a deep learning classifier – which is one of the premises of deep learning, not having to define features for it to operate. The authors then looked at some adversarial scenarios that would circumvent their approach. (Side bar: I wish Cylance had read this paper a couple years ago). The paper goes through some ROC curves and arguments to end up with some lessons learned:

  • Training sets matter when testing robustness against adversarial examples
  • Architectural decisions should consider effects of adversarial examples
  • Semantics is important for improving effectiveness [meaning that instead of just pushing a binary stream into the deep learner, carefully crafting features is going to increase the efficacy of the algorithm]

Please tell me: which of these three is non-obvious? I don’t know that we can set the bar any lower for security data science.

I want to specifically highlight the last point. You might argue that’s the one statement that’s not obvious. The authors basically found that, instead of feeding simple byte sequences into a classifier, there is a lift in precision if you feed additional, higher-level features. Anyone who has looked at byte code before or knows a little about assembly should know that you can achieve the same program flow in many ways. We must stop comparing security problems to image or speech recognition. Binary files, executables, are not independent sequences of bytes. There is program flow, different ‘segments’, dynamic changes, etc.
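The "same program flow in many ways" point is easy to demonstrate, here with Python bytecode as a small stand-in for native executables (an analogy, not what the paper analyzed). The two hypothetical functions below behave identically on the tested inputs, yet compile to different byte sequences.

```python
def double_by_add(x):
    return x + x


def double_by_mul(x):
    return x * 2


# Same observable behavior on every tested input.
same_behavior = all(double_by_add(n) == double_by_mul(n) for n in range(-1000, 1000))

# Different raw bytes: the compiled instruction streams are not equal.
same_bytes = double_by_add.__code__.co_code == double_by_mul.__code__.co_code

print(f"same behavior: {same_behavior}")
print(f"same bytes:    {same_bytes}")
```

A classifier that only sees the raw bytes has to learn that these two sequences mean the same thing. A feature that captures the semantics ("doubles its input") gets that for free.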

We should look to other disciplines (like image recognition) for inspiration, but we need different approaches in security. Get inspiration from other fields, but understand the nuances and differences in cyber security. We need to add security experts to our data science teams!

July 30, 2019

Research is “Skewing up”

Over the weekend I was catching up on some reading and came across the “Deep Learning and Security Workshop (DLS 2019)”. With great interest I browsed through the agenda and read some of the papers / talks, just to find myself quite disappointed.

It seems like not much has changed since I launched this blog. In 2005, I found myself constantly disappointed with security articles and decided to outline my frustrations on this blog. That was the very initial focus of this blog. Over time it morphed more into a platform to talk about security visualization and then artificial intelligence. Today I am coming back to some of the early work of providing, hopefully constructive, feedback to some of the work out there.

The research paper I am looking at is about building a deep learning based malware classifier. I won’t comment on the fact that every AV company has been doing this for a while (but learned from their early mistakes of not engineering ‘intelligent’ features). I also won’t discuss the machine learning architecture that is introduced. What I will take issue with is the approach that was taken and the conclusions that were drawn:

  • The paper uses a data set that has no ground truth. Which, in network security is very normal. But it needs to be taken into account. Any conclusion that is made is only relative to the traffic that the algorithm was tested, at the time of testing and under the used configuration (IDS signatures). The paper doesn’t discuss adoption or changes over time. It’s a bias that needs to be clearly taken into account.
  • The paper uses a supervised approach leveraging a deep learner. One of the consequences is that this system will have a hard time detecting zero days. It will have to be retrained regularly. Interestingly enough, we are in the same world as the anti virus industry when they do binary classification.
  • Next issue. How do we know what the system actually captures and what it does not?
    • This is where my recent rants on ‘measuring the efficacy’ of ML algorithms come into play. How do you measure the false negative rates of your algorithms in a real-world setting? And even worse, how do you guarantee those rates in the future?
    • If we don’t know what the system can detect (true positives), how can we make any comparative statements between algorithms? We can make a statement about this very setup and this very data set that was used, but again, we’d have to quantify the biases better.
  • In contrast to the supervised approach, the domain expert approach has a non-zero chance of finding future zero days due to the characterization of bad ‘behavior’. That isn’t discussed in the paper, but is a crucial fact.
  • The paper claims a 97% detection rate with a false positive rate of less than 1% for the domain expert approach. But that’s with domain expert “Joe”. What about if I wrote the domain knowledge? Wouldn’t that completely skew the system? You have to somehow characterize the domain knowledge. Or quantify its accuracy. How would you do that?
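The ground truth problem from the first bullet can be sketched in a few lines. The example below is an illustration under stated assumptions, not the paper's evaluation: the label lists are made up, and `detection_metrics` is a hypothetical helper. It scores the same classifier output twice, once against labels derived from IDS signatures and once against a (normally unknowable) set of true labels.

```python
def detection_metrics(labels, predictions):
    """Compute rates from reference labels and classifier verdicts.

    Both arguments are equal-length sequences of booleans, True meaning 'malicious'.
    A rate is None when its denominator is zero.
    """
    pairs = list(zip(labels, predictions))
    tp = sum(1 for y, p in pairs if y and p)
    fn = sum(1 for y, p in pairs if y and not p)
    fp = sum(1 for y, p in pairs if not y and p)
    tn = sum(1 for y, p in pairs if not y and not p)
    return {
        "detection_rate": tp / (tp + fn) if tp + fn else None,
        "false_negative_rate": fn / (tp + fn) if tp + fn else None,
        "false_positive_rate": fp / (fp + tn) if fp + tn else None,
    }


# Made-up data: six events, the classifier flags the first two.
predictions = [True, True, False, False, False, False]

# Labels derived from IDS signatures: the signatures missed the third event.
ids_labels = [True, True, False, False, False, False]

# Hypothetical real ground truth, which we usually do not have.
true_labels = [True, True, True, False, False, False]

if __name__ == "__main__":
    print("vs IDS labels: ", detection_metrics(ids_labels, predictions))
    print("vs true labels:", detection_metrics(true_labels, predictions))
```

Scored against the IDS-derived labels, the classifier looks perfect. Scored against the true labels, it misses one of three attacks. Any reported detection rate is only as good as the labels behind it, and the false negatives of the labeling source are invisible.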

Especially the last two points make the paper almost irrelevant. The fact that this wasn’t validated in a larger, real-world environment is another fallacy I keep seeing in research papers. Who says this environment was representative of every environment? Overall, I think this research is dangerous and is actually portraying wrong information. We cannot make a statement that deep learning is better than domain knowledge. The numbers for detection rates are dangerous and biased, but the bias isn’t discussed in the paper.

:q!

July 24, 2019

Causality Research in AI – How Does My Car Make Decisions?

Before even diving into the topic of Causality Research, I need to clarify my use of the term #AI. I am getting sloppy in my definitions and am using AI like everyone else is using it, as a synonym for analytics. In the following, I’ll even use it as a synonym for supervised machine learning. Excuse my sloppiness …

Causality Research is a topic that has emerged from the shortcomings of supervised machine learning (SML) approaches. You train an algorithm with training data and it learns certain properties of that data to make decisions. For some problems that works really well and we don’t even care about what exactly the algorithm has learned. But in certain cases, we really would like to know what the system just learned. Your self-driving car, for example. Wouldn’t it be nice if we actually knew how the car makes decisions? Not just for our own peace of mind, but also to enable verifiability and testing.

Here are some thoughts about what is happening in the area of causality for AI:

  • This topic is drawing attention because people are having their blinders on when defining what AI is. AI is more than supervised machine learning, and a number of the algorithms in the field, like belief networks, are beautifully explainable.
  • We need to get away from using specific algorithms as the focal point of our approaches. We need to look at the problem itself and determine what the right solution to the problem is. Some of the very old methods like belief networks (I sound like a broken record) are fabulous and have deep explainability. In the grand scheme of things, only a few problems require supervised machine learning.
  • We are finding ourselves in a world where some people believe that data can explain everything. It cannot. History is not a predictor of the future. Even in experimental physics, we are getting to our limits and have to start understanding the fundamentals to get to explainability. We need to build systems that help experts encode their knowledge and augment human cognition by automating tasks that machines are good at.
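To show why belief networks are considered explainable, here is a minimal sketch of a three-node network: a hidden "compromised" state with two observable children. Every probability is a made-up illustrative value that an expert would have to supply, and the variable names are hypothetical. The structure is deliberately the simplest one possible (both observations depend only on the hidden state).

```python
# All probabilities below are made-up illustrative values, not measurements.
P_COMPROMISED = 0.01  # prior belief that a host is compromised

# Conditional probability tables: P(observation is True | compromised state)
P_ALERT = {True: 0.90, False: 0.05}
P_ODD_TRAFFIC = {True: 0.70, False: 0.10}


def likelihood(compromised: bool, alert: bool, odd_traffic: bool) -> float:
    """P(alert, odd_traffic | compromised), assuming the two are independent given the state."""
    p_alert = P_ALERT[compromised] if alert else 1 - P_ALERT[compromised]
    p_odd = P_ODD_TRAFFIC[compromised] if odd_traffic else 1 - P_ODD_TRAFFIC[compromised]
    return p_alert * p_odd


def posterior_compromised(alert: bool, odd_traffic: bool) -> float:
    """P(compromised | observations), computed by enumerating both hidden states."""
    prior = {True: P_COMPROMISED, False: 1 - P_COMPROMISED}
    joint = {c: prior[c] * likelihood(c, alert, odd_traffic) for c in (True, False)}
    return joint[True] / (joint[True] + joint[False])


if __name__ == "__main__":
    print("alert only:         ", posterior_compromised(True, False))
    print("alert + odd traffic:", posterior_compromised(True, True))
```

Every number that influences the result sits in a table a domain expert can read, challenge, and change. That is the kind of explainability a trained deep network does not offer out of the box.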

The recent Cylance faux pas is a great example of why supervised machine learning and AI can be really dangerous. And it brings up a different topic that we need to start exploring more, which is how we measure the efficacy or precision of AI algorithms. How do we assess the things a given AI or machine learning approach misses and what are the things it classifies wrong? How does one compute these metrics for AI algorithms? How do we determine whether one algorithm is better than another? For example, the algorithm that drives your car. How do you know how good it is? Does a software update make it better? How much? That’s a huge problem in AI and ‘causality research’ might be able to help develop methods to quantify efficacy.

August 7, 2018

AI & ML IN CYBERSECURITY – Why Algorithms Are Dangerous

Join me for my talk about AI and ML in cyber security at BlackHat on Thursday the 9th of August in Las Vegas. I’ll be exploring the topics of artificial intelligence (AI) and machine learning (ML) to show some of the ‘dangerous’ mistakes that the industry (vendors and practitioners alike) are making in applying these concepts in security.

We don’t have artificial intelligence (yet). Machine learning is not the answer to your security problems. And downloading the ‘random’ analytic library to identify security anomalies is going to do you more harm than good.

We will explore these accusations and walk away with the following learnings from the talk:

  • We don’t have artificial intelligence (yet)
  • Algorithms are getting smarter, but experts are more important
  • Stop throwing algorithms on the wall - they are not spaghetti
  • Understand your data and your algorithms
  • Invest in people who know security (and have experience)
  • Build systems that capture expert knowledge
  • Think out of the box, history is bad for innovation

I am exploring these items throughout three sections in my talk:

  1) A very quick set of definitions for machine learning, artificial intelligence, and data mining with a few examples of where ML has worked really well in cyber security.
  2) A closer and more technical view on why algorithms are dangerous. Why it is not a solution to download a library from the Internet to find security anomalies in your data.
  3) An example scenario where we talk through supervised and unsupervised machine learning for network traffic analysis to show the difficulties with those approaches and finally explore a concept called belief networks that bears a lot of promise to enhance our detection capabilities in security by leveraging expert knowledge more closely.

Algorithms are Dangerous

I keep mentioning that algorithms are dangerous. Dangerous in the sense that they might give you a false sense of security or in the worst case even decrease your security quite significantly. Here are some questions you can use to self-assess whether you are ready and ‘qualified’ to use data science or ‘advanced’ algorithms like machine learning or clustering to find anomalies in your data:

  • Do you know what the difference is between supervised and unsupervised machine learning?
  • Can you describe what a distance function is?
  • In data science we often look at two types of data: categorical and numerical. What are port numbers? What are user names? And what are IP sequence numbers?
  • In your data set you see traffic from port 0. Can you explain that?
  • You see traffic from port 80. What’s a likely explanation of that? Bonus points if you can come up with two answers.
  • How do you go about selecting a clustering algorithm?
  • What’s the explainability problem in deep learning?
  • How do you acquire labeled network data sets (netflows or pcaps)?
  • Name three data cleanliness problems that you need to account for before running any algorithms.
  • When running k-means, do you have to normalize your numerical inputs?
  • Does k-means support categorical features?
  • What is the difference between a feature, data field, and a log record?

If you can’t answer the above questions, you might want to rethink your data science aspirations and come to my talk on Thursday to hopefully walk away with answers to the above questions.
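Two of the questions above (what a distance function is, and whether numerical inputs need normalizing before k-means) can be illustrated with a small sketch. The flow records are made up, and `nearest` and `min_max_normalize` are hypothetical helpers. k-means groups points by Euclidean distance, so whatever changes the distances changes the clusters.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length numeric tuples."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def min_max_normalize(rows):
    """Rescale each column to [0, 1]; constant columns map to 0.0."""
    columns = list(zip(*rows))
    lows = [min(c) for c in columns]
    highs = [max(c) for c in columns]
    return [
        tuple((v - lo) / (hi - lo) if hi > lo else 0.0
              for v, lo, hi in zip(row, lows, highs))
        for row in rows
    ]


def nearest(rows, index):
    """Index of the row closest to rows[index], excluding itself."""
    others = [i for i in range(len(rows)) if i != index]
    return min(others, key=lambda i: euclidean(rows[index], rows[i]))


# Made-up flows: (bytes transferred, duration in seconds)
flows = [
    (1_000, 1.0),     # 0: small and short
    (2_000, 600.0),   # 1: small but very long-lived
    (5_000, 2.0),     # 2: somewhat larger, short
    (50_000, 300.0),  # 3: large, long
]

if __name__ == "__main__":
    print("nearest to flow 0, raw:       ", nearest(flows, 0))
    print("nearest to flow 0, normalized:", nearest(min_max_normalize(flows), 0))
```

On the raw values the byte counts dominate the distance, and flow 0 lands next to the ten-minute flow. After min-max scaling, its nearest neighbor is the other short flow. Neither answer is automatically the right one. The point is that the choice of scaling is a modeling decision you have to make deliberately.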

Update 8/13/18: Added presentation slides

March 29, 2018

Security Analyst Summit 2018 in Cancun – AI, ML, And The Sun

Another year, another Security Analyst Summit. This year Kaspersky gathered an amazing set of speakers in Cancun, Mexico. I presented on AI & ML in Cyber Security – Why Algorithms Are Dangerous. I was really pleased with how well the talk was received and it was super fun to see the storm that emerged on Twitter where people started discussing AI and ML.

Here are a couple of tweets that attendees of my talk tweeted out (thanks everyone!):

The following are some more impressions from the conference:

And here are the slides:

AI & ML in Cyber Security – Why Algorithms Are Dangerous from Raffael Marty