December 3, 2007

My Splunk Blog

Category: Uncategorized — Raffael Marty @ 4:02 pm

I wanted to mention this a long time ago, but I am really behind with blogging …

I started another blog. I hope this is not going to be too confusing.

Here is what goes where:

November 25, 2007

SOURCE: Boston

Category: Uncategorized — Raffael Marty @ 9:18 pm

There are a number of security conferences out there. Deciding which ones to attend is no easy task. As part of the advisory committee for SOURCE: Boston, I might be a bit biased, but this is going to be a one-of-a-kind security conference. We don’t want to organize yet another security conference. We realize that security has become more and more of a business concern. The security conference of the future needs to bridge the business and the technology. Therefore, three tracks are offered: business, technology, and application security. With keynote speakers Dan Geer and Steven Levy, you can be sure to get some interesting perspectives on security!

See you March 12th to 14th, 2008 in Boston!

November 14, 2007

New Firewall Book

Category: Log Analysis — Raffael Marty @ 5:05 pm

No news anymore, but still worth a blog entry. Michael Rash wrote a new book on firewalls. His approach is not the traditional one where he looks into firewalls themselves. He explores all kinds of additional tools that can be used alongside firewalls to tune them and make them more efficient. I read part of his book before he published it and I really liked what he was up to. I think the final copy should be on my desk by now. Can’t wait to read it. Here is a link to his Web page:

http://www.cipherdyne.org/blog/2007/09/online-site-for-linux-firewalls-attack-detection-and-response.html

October 18, 2007

CSI Conference 2007 – Free Pass

Category: Uncategorized — Raffael Marty @ 7:15 pm

How often is it that you get something in return for reading someone’s blog? Well, today is your lucky day. Are you interested in going to the CSI Conference in Arlington, VA from November 3-9? The first person to send me an email will get a registration code.

Unfortunately, I won’t be able to attend as I am going to be presenting in Jakarta at BCS.

October 15, 2007

Database Query Analysis

Category: Log Analysis,UNIX Scripting,Visualization — Raffael Marty @ 6:54 pm

I was playing with database audit logs for a bit to try and visualize some aspects of them. While doing so, I came across a pretty interesting problem. The audit logs contain entries that indicate what exact SQL query was executed. Now, I am not interested in the entire query, but I need to know which tables were touched. I was trying to build some regular expressions to extract that information from the query, but I gave up pretty quickly. It’s just too complicated for a regex. I was wondering whether there is a way to take a SQL query, for example:

select * from a.table1 a, b.table2 b join c.table3 on b.id1=c.id2 where a.foo='bar'

and extract all the table names: a.table1, b.table2, c.table3. Are there tools to do that? Remember, I don’t have the database with these tables. I only have a log from some database. The script should support all the SQL perks like joins, nested selects, etc. Anyone have a good way to do this?
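One possible starting point, sketched in Python as an illustration rather than a finished tool: instead of a single regex over the whole query, split the query into tokens and walk them, remembering per parenthesis depth whether we are inside a FROM list. The function name `extract_tables` and the keyword set are assumptions made for this sketch. It only looks at FROM and JOIN, and it does not handle quoted identifiers, SQL comments, or constructs such as `EXTRACT(year FROM d)`, which would produce false positives.

```python
import re

# Tokenizer: string literals first, so keywords inside quotes are never seen.
TOKEN_RE = re.compile(r"""
    '(?:[^']|'')*'                           # string literal ('' escapes a quote)
  | [A-Za-z_][\w$]*(?:\.[A-Za-z_][\w$]*)*    # identifier, optionally schema-qualified
  | [(),]                                    # punctuation that changes parser state
  | \S                                       # any other single character
""", re.VERBOSE)

# Keywords that end a FROM list at the current parenthesis depth (assumed set).
CLAUSE_END = {"select", "where", "group", "order", "having",
              "limit", "union", "intersect", "except"}


def extract_tables(sql):
    """Return table names that follow FROM or JOIN, in order of first use."""
    tables = []
    in_from = [False]      # one flag per parenthesis depth
    expect_table = False   # True when the next identifier names a table
    for tok in TOKEN_RE.findall(sql):
        word = tok.lower()
        if tok == "(":
            # Subquery, function call, or IN list: start a fresh context.
            in_from.append(False)
            expect_table = False
        elif tok == ")":
            if len(in_from) > 1:
                in_from.pop()
            expect_table = False
        elif tok == ",":
            # A comma introduces another table only inside a FROM list.
            expect_table = in_from[-1]
        elif word == "from":
            in_from[-1] = True
            expect_table = True
        elif word == "join":
            expect_table = True
        elif word in CLAUSE_END:
            in_from[-1] = False
            expect_table = False
        elif expect_table and (tok[0].isalpha() or tok[0] == "_"):
            if tok not in tables:
                tables.append(tok)
            expect_table = False   # whatever follows is an alias
    return tables


query = ("select * from a.table1 a, b.table2 b "
         "join c.table3 on b.id1=c.id2 where a.foo='bar'")
print(extract_tables(query))
```

Because the walk never consults a schema, it works on a bare log line without access to the database. For anything beyond simple audit queries, a real SQL parser is the safer route.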

October 11, 2007

Security Data Visualization Book

Category: Visualization — Raffael Marty @ 8:32 am

Greg Conti wrote a book on security data visualization. It’s all in color. A really nice book. The best parts about the book are the chapters on IDS signature tuning and firewall log analysis. I am just saying that because I wrote those two chapters ;)

He beat me to the punch with publishing a book on security data visualization. That’s all I can say. I hope that I am done with my book soon. Fortunately, I knew about this book early on so I could make sure that we are not writing about the same topics. My book is going to be fairly different. I am diving quite a bit deeper into some visualization topics around security. I am focusing on use-cases: How do you use visualization for compliance, insider threat, and perimeter threat? What are some of the tools out there, what are the data sources, and what are the different types of graphs you should know and understand when you are visualizing security data?

Thanks to Greg for letting me write part of his book!

October 4, 2007

Visualization PodCast – A “Bar Talk”

Category: Log Analysis,Visualization — Raffael Marty @ 5:27 pm

During the FIRST conference in Seville earlier this year, I was talking to Ben Chai at about 12:30 am. We were sitting in the bar area when he suddenly took out his microphone and started interviewing me. The talk is pretty funny. The podcast shows that I don’t have a very good sense of humor ;) Oh, and by the way, reading tea leaves is probably going to be the topic of one of my next talks!

I don’t think this was my best night when Ben recorded this. I had spent about 1.5 hours trying to pick a TSA lock with a paper clip. Okay, Adam couldn’t do it anymore either, but still. In the meantime, I learned how it is done for real – the lock picking ;)

Listen to the podcast here.

September 24, 2007

Hack In The Box 2007 – Malaysia

Category: Log Analysis,Visualization — Raffael Marty @ 5:32 pm

Petronas Towers in Kuala Lumpur

I spent the first week of September in Kuala Lumpur, Malaysia, where I was speaking about insider crime visualization at Hack In The Box. The conference is held annually and I was surprised by how big it was. There were a lot of attendees from the area, but also from other parts of the world, for example from Germany. In general, I was fairly impressed with the caliber of people that presented.

Talking at HITB 2007

What I also enjoyed a lot was the lock-picking village … The guys running it were real experts on the topic and had excellent tools to teach you the art of lock picking.

For those interested, I have the presentation available here. The download is fairly big. Sorry about that. The conference also made the rest of the presentations available.

On to the next conference. See you in Jakarta end of October.

September 16, 2007

AfterGlow 1.5.9 Released!

Category: Log Analysis,Visualization — Raffael Marty @ 10:39 pm

As you might have seen on secviz.org, AfterGlow 1.5.9 is out. The announcement of AfterGlow 1.5.9 on secviz has some more details on what’s new. Just quickly here: The URL feature is pretty interesting and addresses some old thoughts and things I have been talking about with other people (Peter, are you reading this?). The issue there was that the AfterGlow graphs are very static and that’s kind of a bummer. It would be really nice if there was more interactivity, clicking on nodes for example. Well, this is now a first step towards that. Along with the Splunk – AfterGlow integration, this goes almost all the way toward completing the interaction round trip. I know, in terms of real interactivity, there is still a lot missing, but I think this is taking care of some really interesting use-cases.

[tags]afterglow, visualization, splunk, interactivity, graphviz[/tags]

September 14, 2007

Open Log Format – What a Great Standard – Not

Category: Log Analysis,Security Information Management — Raffael Marty @ 4:01 pm

When eIQnetworks announced their OpenLogFormat, I think they did it just for me. I love it. I really enjoy taking these things apart to show why they are really really bad attempts. I am sure these guys are not readers of my blog. Otherwise they would have known that I will question their standard, line by line. It just doesn’t add up for me. Why are companies/people not learning/listening?

So, there is yet another “standard” for event interoperability being suggested by yet another vendor. While some vendors (for example the one I used to work for), actually thought about the problem and made sure they are coming up with something useful, I am not sure this standard lives up to that promise. Let me go through the standard piece by piece, right after some general comments:

  • Why another interoperability standard? There is not a single word of motivation printed in the standards document. Don’t we have existing standards already?
  • You have to register to download the standard? Well, I know, ArcSight makes that same mistake. That wasn’t my doing! I promise.
  • How does this standard compare to others? What’s the motivation for defining it? Is it better than everything else?
  • When exactly would you apply this standard? All the time? OLF (the open log format) states:
    OLF is designed for logging network events such as those often logged by firewalls, but it can also be used for events not related to the network.
    What the heck does that mean? For everything? Do you want me to prove you wrong? There are tons of examples where this standard simply cannot be applied.
  • You did not do your homework, my friends! In a lot of areas. Some friends of mine already commented on the fact that this is advertised as an “open” log format. The press release even calls it an open source log format. What does that mean? Was there a period for public comment? Believe me, there wasn’t. I would have known FOR SURE!
  • With regards to the homework. Have you heard of CEE? Yes, that’s a group that actually knows quite a bit about logging. Why bother asking them, they would only critique the proposal and possibly shoot it down? You bet. That’s what I am doing right now anyways.
  • Let’s see, did you guys learn from past mistakes? Don’t get me started. I claim NO. Read on and you will see a lot of cases that prove why.
  • Have you read my old blog entries and at least tried to understand what logging is about? I can guarantee that you guys have not. Or maybe you didn’t understand what I was saying. Hmm…. Here again, for your reference.
  • Have you looked at the other standards out there? For example CEF (common event format) from ArcSight. I am definitely biased towards that one, as I have written it, but even now that I don’t work there anymore, I still think that CEF is actually a really good logging standard. Again. Not done your homework!
  • Last general question: Why would I be using this standard as opposed to anything else, for example CEF? Is eIQnetworks big enough so I would care? Last time I checked, the answer was: No. If this was something that was done by Microsoft, I might care, just because of their size. Maybe you have a lot of vendors already supporting this standard? Yes? How many? Who? I had never heard of OLF before and I deal with log management every day! So I doubt there is any significant adoption. Actually, I just checked the Web page and there are six companies supporting it. Okay. All that ;)

Let’s go through the standard in more detail:

  • I already made this point: What is the area where this standard applies? Networking and non-networking events (That’s what OLF claims)? Nice. And why would you require an IP address field (to be exact: internalIP and externalIP) for every record? In your world, are there only events that contain IPs? In mine, there are many others too!
  • You are proposing a log-file approach. So you are defining a file-based standard, limiting it to one transport. Okay. But why? Again, read my blog about transport-independence. Who is logging to files only? A minority of products in the networking realm.
  • Have you guys written parsers before? (Yes, I have!). Do you know how bad it is to read headers first? Makes a whole lot of use-cases impossible. And to be frank, it requires too much coding (I am lazy).
  • Minor detail: You guys are already on version 1.1? Hmm… I wonder how version 1.0 looked ;)
  • I don’t think the author of this paper has written a standard before: “The #Version line gives the version of OLF, which should always be 1.1.” How do you do updates? You deprecate this document? Confusing, confusing.
  • Why do you need a #Date line in the header? That does not make any sense AT ALL!
  • Okay, so you are using a header line that defines the fields. All right. Let’s assume that’s a good idea in order to reduce the size of an event (exercise to the reader why this is true). Why do you say then:
    NOTE: The fields may not vary; they must alwas be the ones specified in this document.

    What? This does not make any sense at all! Whatsoever! Delete that line. Done. It’s irrelevant.
  • Let’s go back to the header line. Why all these required fields? spam-info? This is very inefficient. Why have all these fields for every event? It unnecessarily bloats your events and circumvents the idea of a header line!
  • Tab-separated fields. Okay. Your choice. Square brackets to deal with escaping? Are you guys coders? That’s not a standard way of doing things at all. Anyone who wrote code before, have you seen this approach anywhere? If you stuck to commas and quotes, you might be able to read your logs in Excel without any configuration ;)
  • tab-separated subfields. Shiver.
  • Guys, your example on page one is horrible. Priority in the preamble and in the suffix? Then the virtualdevice is root? Maybe I can’t count. You know what, I think the fields don’t even align. What are all the IPs in the message? Part of the message (the one with the seemingly interesting IPs) seems to be lumped together into one field (uses the square brackets). I don’t get it.
  • Error lines? Come again? So there are really two different types of log entries? Or no, hang on, there aren’t. Those lines are only generated if the OLF consumer realizes that the format is not correct? What does that have to do with a logging standard? If I wasn’t confused yet, now I definitely am.
  • Open source: “a device-type assigned by eIQnetworks”. No further comment.
  • Wow. Is it right that every log entry carries the “original” log message also (called the Nativelog)? So, if a product supports OLF by default, that’s just empty? Come on guys. Are you really suggesting to double the size of messages?
  • Talking about the field dictionary… What does it mean to have “unused” fields? Unused by what? The standard? Oh, maybe this is not a standard?
  • I will spare you the analysis of all the fields in the dictionary. There are tons of problems. Just one: If you have a count bigger than one and you only have one timestamp. What does that mean? All the events happened at the same time?
  • Note that the Nativelog field is defined as: Original syslog line. Okay, so this is a file-based standard, but it consumes syslog messages?
  • event types: There is indeed, and I kid you not, a -1 value. Is that for real?
  • priority codes: Nice. Read this (again, this is a standard, in case you forgot):
    The descriptions [of the priorities] given are the official interpretation, but usage varies; some vendors report routine events with higher priority
  • Note the copyright at the bottom of the pages ;) [Okay, I admit, I might have made the same mistake with the first version of CEF, you are forgiven].
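
To make the header-line complaint from the list above a bit more concrete, here is a minimal Python sketch. It is not taken from the OLF document: the `#Fields:` directive name is borrowed from the W3C extended log format that IIS uses, and the field names (`time`, `src`, `action`) and both function names are made up for illustration. The point it tries to show is that a header-driven parser has to carry state, so a record that arrives without its header (tailing a file, a single event over the wire) cannot be interpreted, while a record that names its own fields can.

```python
def parse_header_based(lines):
    """Parse tab-separated records whose field names come from a header line.

    Assumes a '#Fields:' directive (hypothetical here) lists the columns.
    """
    fields = None
    for line in lines:
        line = line.rstrip("\n")
        if line.startswith("#Fields:"):
            # The header defines how every following record is read.
            fields = line[len("#Fields:"):].strip().split("\t")
        elif line.startswith("#") or not line:
            continue  # other directives and blank lines carry no event
        elif fields is None:
            # Without the header the columns have no meaning.
            raise ValueError("record seen before any #Fields header")
        else:
            yield dict(zip(fields, line.split("\t")))


def parse_key_value(line):
    """Parse one self-describing record of space-separated key=value pairs."""
    return dict(pair.split("=", 1) for pair in line.split())


log = [
    "#Version: 1.1",
    "#Fields: time\tsrc\taction",
    "12:00:01\t10.0.0.1\tdeny",
]

print(list(parse_header_based(log)))

# The same event, cut loose from its header, is unreadable ...
try:
    list(parse_header_based(log[2:]))
except ValueError as err:
    print("header-based:", err)

# ... while a self-describing record parses on its own.
print(parse_key_value("time=12:00:01 src=10.0.0.1 action=deny"))
```

The key=value form costs a few more bytes per event, which is the trade-off a header line is meant to avoid. The toy key=value parser also ignores values containing spaces, which a real format has to define escaping for.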

Have I convinced you yet why not to use this “standard”?

Random observation: Why does this log remind me of IIS logs gone wrong?

[tags]log standard, logging, event interoperability, cee, olf, open log format[/tags]