September 16, 2007
As you might have seen on secviz.org, AfterGlow 1.5.9 is out. The announcement of AfterGlow 1.5.9 on secviz has some more details on what’s new. Just quickly here: The URL feature is pretty interesting and addresses some old thoughts and things I have been talking about with other people (Peter, are you reading this?). The issue there was that the AfterGlow graphs are very static and that’s kind of a bummer. It would be really nice if there was more interactivity. Clicking on nodes for example. Well, this is now a first step towards that. Along with the Splunk – AfterGlow integration, this is actually going almost all the way of completing the interaction round trip. I know, in terms of real interactivity, there is still a lot missing, but I think this is taking care of some really interesting use-cases.
[tags]afterglow, visualization, splunk, interactivity, graphviz[/tags]
September 14, 2007
When eIQnetworks announced their OpenLogFormat, I think they did it just for me. I love it. I really enjoy taking these things apart to show why they are really really bad attempts. I am sure these guys are not readers of my blog. Otherwise they would have known that I will question their standard, line by line. It just doesn’t add up for me. Why are companies/people not learning/listening?
So, there is yet another “standard” for event interoperability being suggested by yet another vendor. While some vendors (for example the one I used to work for), actually thought about the problem and made sure they are coming up with something useful, I am not sure this standard lives up to that promise. Let me go through the standard piece by piece, right after some general comments:
- Why another interoperability standard? There is not a single word of motivation printed in the standards document. Don’t we have existing standards already?
- You have to register for download the standard? Well, I know, ArcSight makes that same mistake. That wasn’t my doing! I promise.
- How does this standard compare to others? What’s the motivation for defining it? Is it better than everything else?
- When exactly would you apply this standard? All the time? OLF (the open log format) states:
OLF is designed for logging network events such as those often logged by firewalls, but it can also be used for events not related to the network.
What the heck does that mean? For everything? Do you want me to proof you wrong? There are tons of examples where this thing won’t be able to apply this standard.
- You did not do your homework, my friends! In a lot of areas. Some friends of mine already commented on the fact that this is advertised as an “open” log format. The press release even calls it an open source log format. What does that mean? Was there a period for public comment? Believe me, there wasn’t. I would have known FOR SURE!
- With regards to the homework. Have you heard of CEE? Yes, that’s a group that actually knows quite a bit about logging. Why bother asking them, they would only critique the proposal and possibly shoot it down? You bet. That’s what I am doing right now anyways.
- Let’s see, did you guys learn from past mistakes? Don’t get me started. I claim NO. Read on and you will see a lot of cases that proof why.
- Have you read my old blog entries and at least tried to understand what logging is about? I can guarantee that you guys have not. Or maybe you didn’t understand what I was saying. Hmm…. Here again, for your reference.
- Have you looked at the other standards out there? For example CEF (common event format) from ArcSight. I am definitely biased towards that one, as I have written it, but even now that I don’t work there anymore, I still think that CEF is actually a really good logging standard. Again. Not done your homework!
- Last general question: Why would I be using this standard as opposed to anything else, for example CEF. Is eIQnetworks big enough so I would care? Last time I checked, the answer was: No. If this was something that was done by Microsoft, I might care, just because of their size. Maybe you have a lot of vendors already supporting this standard? Yes? How many? Who? I have not heard OLF ever before and I deal with log management every day! So I doubt any significant adoption is reality. Actually, I just checked the Web page and there are six companies supporting it. Okay. All that 😉
Let’s go through the standard in more detail:
- I already made this point: What is the area where this standard applies? Networking and non-networking events (That’s what OLF claims)? Nice. And why would you require an IP address field (to be exact: internalIP and externalIP) for every record? In your world, are there only events that contain IPs? In mine, there are many others too!
- You are proposing a log-file approach. So you are defining a file-based standard, limiting it to one transport. Okay. But why? Again, read my blog about transport-independence. Who is logging to files only? A minority of products in the networking realm.
- Have you guys written parsers before? (Yes, I have!). Do you know how bad it is to read headers first? Makes a whole lot of use-cases impossible. And to be frank, it requires too much coding (I am lazy).
- Minor detail: You guys are already on version 1.1? Hmm… I wonder how version 1.0 looked 😉
- I don’t think the author of this paper has written a standard before: “The #Version line gives the version of OLF, which should always be 1.1.” How do you do updates? You deprecate this document? Confusing, confusing.
- Why do you need a #Date line in the header? That does not make any sense AT ALL!
- Okay, so you are using a header line that defines the fields. All right. Let’s assume that’s a good idea in order to reduce the size of an event (exercise to the reader why this is true). Why do you say then:
NOTE: The fields may not vary; they must alwas be the ones specified in this document.
What? This does not make any sense at all! Whatsoever! Delete that line. Done. It’s irrelevant.
- Let’s go back to the header line. Why all these required fields? spam-info? This is very inefficient. Why have all these fields for every event? It unnecessarily bloats your events and circumvents the idea of a header line!
- Tab-separated fields. Okay. Your choice. Square brackets to deal with escaping? Are you guys coders? That’s not a standard way of doing things at all. Anyone who wrote code before, have you seen this approach anywhere? If you stuck to commas and quotes, you might be able to read your logs in Excel without any configuration 😉
- tab-separated subfields. Shiver.
- Guys, your example on page one is horrible. Priority in the preamble and in the suffix? Then the virtualdevice is root? Maybe I can’t count. You know what, I think the fields don’t even align. What are all the IPs in the message? Part of the message (the one with the seemingly interesting IPs) seems to be lumped together into one field (uses the square brackets). I don’t get it.
- Error lines? Come again? So there are really two different types of log entries? Or no, hang on, there aren’t. Those lines are only generated if the OLF consumer realizes that the format is not correct? What does that have to do with a logging standard. If I wasn’t confused yet, now I definitely am.
- Open source: “a device-type assigned by eIQnetworks”. No further comment.
- Wow. Is it right that every log entry carries the “original” log message also (called the Nativelog)? So, if a product supports OLF by default, that’s just empty? Come on guys. Are you really suggesting to double the size of messages?
- Talking about the field dictionary… What does it mean to have “unused” fields? Unused by what? The standard? Oh, maybe this is not a standard?
- I will spare you the analysis of all the fields in the dictionary. There are tons of problems. Just one: If you have a count bigger than one and you only have one timestamp. What does that mean? All the events happened at the same time?
- Note that the Nativelog field is defined as: Original syslog line. Okay, so this is a file-based standard, but it consumes syslog messages?
- event types: There is indeed, and I kid you not, a -1 value. Is that for real?
- priority codes: Nice. Read this (again, this is a standard, in case you forgot):
The descriptions [of the priorities] given are the official interpretation, but usage varies; some vendors report routine events with higher priority
- Note the copyright at the bottom of the pages 😉 [Okay, I admit, I might have made the same mistake with the first version of CEF, you are forgiven].
Have I convinced you yet why not to use this “standard”?
Random observation: Why does this log remind me of IIS logs gone wrong?
[tags]log standard, logging, event interoperability, cee, olf, open log format[/tags]
September 12, 2007
Yet another BaySec meeting. Come and mingle.
Where: O’Neills
When: September 17th, 7pm
Who: People interested in computer security / geeks / …
Want to be informed of future events? Subscribe to the mailinglist: baysec-subscribe at sockpuppet.org
September 11, 2007
Finally, ArcSight is going for it: http://news.google.com/news?ie=UTF-8&rlz=1B2GGGL_enUS205US205&tab=bn&ncl=1120626202&hl=en
It seems like there is a new wave of security companies going public. First sourcefire, then tippingpoint, now ArcSight. I am really curious as to what the share price is going to be and what the reverse split is going to look like.
September 4, 2007
While I am on a roll, talking about normalization and log standards, let me have a look at a publication from Gartner. It is a bit dated already (May 2006), but people are probably still referring to it. There are a couple of things that I want to make sure people understand.
While I like the fact that someone like Gartner is trying to dive into a technical topic, I am not too certain that this is very productive. The Gartner publication I am looking at is “Define Application Security Log Output Standards” by Amrit Williams. I must say, the publication is not horribly wrong or bad, however, there are some interesting problems that I want to address:
- The publication outlines what fields should be contained in an “account access event”. Most of the fields make sense. However, there are two fields: “login success” and “login failure”. These fields should be normalized. There shouldn’t be two fields, one for success and one for failure. Just have one that indicates success or failure. That way you can correlate those two events against each other. Otherwise you can’t because you have two different fields. Well, you can, but it’s much more difficult.
- Another field in the account access event is “access rights”. If you include this field in an event, you need a system which can deal with sets or lists of values. This is not simple and I don’t think any of the SIMs really take care of that. Not that they shouldn’t, but it’s really really expensive to build that into a correlation engine. Now, in this specific instance, for access rights, they should not be in an event anyways. This is static information that should be read into the correlation engine asynchronously or looked up on a need to know bases.
- The publication further indicates that the access events have additional variables, called “Variable 1”, “Variable 2”, etc. I have no idea what these fields would be used for. But that’s not even important. The important part is that having generic variables without a fixed meaning is not very useful for later consumption in reports or correlation rules. You need a semantic associated with every field. That’s exactly why there is a common event language to start with!
- The same mistake with splitting out the same type of events into multiple event fields is done in the “account /role management events”. Make one field tat talks about “creation”, “modification”, etc. One of the things to mention in this context is an event taxonomy. I am working on a generic taxonomy right now for CEE, the common event exchange format. CEE is an effort that I pushed Mitre to address a long time ago. Finally, there is a small working group and we should soon have the <A href=”http://cee.mitre.org”>Web presence</A> up and running.
- I don’t agree with the “Log Output Formats” discussion at all. Sorry. Gartner (or Amrit?) recommends syslog as output format. While I am quite a fan of syslog, it’s definitely not my transport of choice. Read that again: TRANSPORT! Syslog is not a log format. It’s a transport. I am not going to roll-up my rant about formats and transports again. Read my older blog entry about the <a href=”http://raffy.ch/blog/2007/04/19/standard-logging-format-common-event-exchange-cee/”>format vs. transport</A> issue.
- It seems really interesting to me that syslog is pushed as the “log format” (again, it’s a transport, but whatever). The publication even mentions all the RFCs associated with syslog, but not a single sentence about the draw backs. Unstructured, reliability (okay TCP is mentioned), poor timestamp, etc.
Again, I think it’s great that Gartner picked this topic up. It’s incredibly important, but it takes a fair amount of work and experience to get a decent log standard put together. Stay tuned and check back for more information about <a href=”http://raffy.ch/blog/2007/04/23/common-event-expression-cee/>CEE</A>.
[tags]log standard,syslog,cee,event fields[/tags]
August 25, 2007
A lot has happened the last couple of weeks and I am really behind with a lot of things that I want to blog about. If you are familiar with the field that I am working in (SIEM, SIM, ESM, log management, etc.), you will fairly quickly realize where I am going with this blog entry. This is the first of a series of posts where I want to dig into the topic of event processing.
Let me start with one of the basic concepts of event processing: normalization. When dealing with time-series data, you will very likely come across this topic. What is time-series data? I used to blog and talk about log files all the time. Log files are a type of time-series data. It’s data which is collected over time. Entries are associated with a time stamp. This covers anything from your traditional log files to snapshots of configuration files or snapshots of tools that are run on a periodic basis (e.g., capturing your netstat output every 30 seconds).
Let’s talk about normalization. Assume you have some data which reports logins to one of our servers. We would like to generate a report which shows the top ten users accessing the server. How would you do that? We’d have to identify the user name in the log entry first. Then we’d extract it, for example by writing a regular expression. Then we’d collect all the user names and compile the top ten list.
Another way would be to build a tool which picks the entire log entry apart and puts as much information from the event into a database. As opposed to just capturing the user name. We’d have to create a database with a specific schema. It would probably have these fields: timestamp, source, destination, username. Once we have all this information in a database, it is really easy to do all kinds of analysis on the data, which was not possible before we normalized it.
The process of taking raw input events and extracting individual fields is called normalization. Sometimes there are other processes which are classified as normalization. I am not going to discuss them right here, but for example normalizing numerical values to fall in a predefined range is generally referred to as normalization as well.
The advantages of normalization should be fairly obvious. You can operate on the structured and parsed data. You know which field represents the source address versus the destination address. If you don’t parse the entries, you don’t really know that. You can only guess. However, there are many disadvantages to the process of normalization that you should be aware of:
- If you are dealing with a disparate set of event sources, you have to find the union of all fields to make up your generic schema. Assume you have a telephone call log and a firewall log. You want to store both types of logs in the same database. What you have to do is take all the fields from both logs and build the database schema. This will result in a fairly large set of fields. If you keep adding new types of data sources, your database schema gets fairly big. I know of a SIM which uses more than 200 hundred fields. And still that doesn’t cover nearly all the fields that are needed to cover a good set of data sources.
- Extending the schema is incredibly hard: When building a system with a fixed schema, you need to decide what your schema will look like. If, to a later point in time, you have a need to add another type of data source, you will have to go back and modify the schema. This can have all kinds of implications on the data already captured in the data store.
- Once you decided to use a specific schema, you have to build your parsers to normalize the inputs into this schema. If you don’t have a parser, you are out of luck and you cannot use that data source.
- Before you can do any type of analysis, you need to invest the time to parse (or normalize) the data. This can become a scalability issue. Parsing is fairly slow. It generally applys regular expressions to each of the data entries, which is a fairly expensive operation.
- Humans are not perfect and programmers are not either. The parsers will have bugs and they will screw up normalization. This means that the data that is stored in the database could be wrong in a number of ways:
- A specific field doesn’t get parsed. This part of the data entry is not available for any further processing.
- A field gets parsed but assigned to the wrong field. Part of your prior analysis could be wrong.
- Breaking up the data entry into tokens (fields) is not granular enough. The parser should have broken the original entry into more specific fields.
- The data entries can change. Oftentimes, when a new version of a product is released, it either adds new data types or it changes some of the log entries. This has to be reflected in the parsers. They need to be updated to support the new data entries, before the data source can be used again.
- The original data entry is not available anymore, unless you are spending the time and space to store the original data entry along with the parsed and extracted fields. This can have quite some scalability issues as well.
I have seen all of these cases happening. And they happen all the time. Sometimes, the issues are not that bad, but other times, when you are dealing with mission critical systems, it is absolutely crucial that the normalization happens correctly and on time.
I will expand on the challenges of normalization in a future blog entry and put it into the context of security information management (SIM).
[tags]SIM, SIEM, ESM, log management, event normalization, event processing, log analysis[/tags]
August 16, 2007
We have another BaySec meeting scheduled for the coming Monday. 7pm at O’Neills, at 3rd and King Street. Right around the corner from my work 😉
August 7, 2007
I thought I’d already disabled mDNSResponder when I did some basic hardening of my Laptop. Turns out that when Marty (no, I am not refereing to myself in the third person) asked me whether I disabled it and I checked again, it was really not. Maybe I just killed the process, but here is how to really disable that service:
Launch the following command
sudo launchctl unload /System/Library/LaunchDaemons/com.apple.mDNSResponder.plist
The next step is turning off the mDNSResponder at startup. And where do you do that? As I am not really confident getting online here at BlackHat, I decided to just look around on the hard drive and what I found was that you could probably just change an entry in the /System/Library/LaunchDaemons/com.apple.mDNSResponder.plist file:
<key>OnDemand</key>
<false></false>
Replace false with true. Do you notice something? Someone really knew XML. Darn it. Two elements. One being the key, the other one being the value. Ever heard of attributes in XML? To whoever built this, this is how I would write the entry:
Or even better, re-architect the entire XML file to actually make sense!
I just now found the real way to actually disable the service by using the -w flag on the launchctl command from above. That will turn the process off permanently. A good reference is here.
August 6, 2007
No! OS X is not FreeBSD! Not sure if I’d like OS X better, if it was just FreeBSD on steroids.
I am sitting at BlackHat. Yes, I turned my laptop on, but the network interfaces are turned off! I was going to configure my firewall to lock everything down and then go online. First shock: <b>ipfw</b> is the firewall OS X uses. There is some history with me and ipfw. I am a big fan of OpenBSD and when Daniel wrote the pf firewall to replace ipfw , I was delighted. I started using pf and even fiddled around with the source code. I am no expert on all the features anymore, but I got a pretty good handle on that beast at some point. Now I have to learn ipfw… Okay. Let’s do that and face the challenge.
First things first. Where’s the configuration file for it? Hmm… There is a guy. Let me play with that. I am shocked. By default, UDP traffic is allowed in and out, even if you turn off all your services in the main tab. Only if you use the advanced tab, can you turn UDP off. Logging is not turned on either (what a surprise). Alright, I am turned that on too. How do the rules look now? OMG! Ridiculous. It allows port 5353, 137, 427, and 631 inbound! Why? Turn that off! Lesson learned: Don’t use the default config. Again, show me the configuration file. But where is it?
I still haven’t found it. I am just going to write a script which uses the <b>ipfw add</b> command to add ipfw rules one by one. That’s really the same thing I am doing with iptables on my Linux boxen. But before doing so, I wanted to see how ipfw log entires look. To test that, I added the following rule:
<code>deny log ip from any to any</code>
I just wanted to see how a log entry looks when I telnet to some port on my box. Well. Surprise surprise. Right after adding that rule not much worked anymore. <b>sudo</b> is not functioning anymore. Some digging around and I realized that the <b>/etc/passwd<b> file is not used for authentication! It’s some service that uses the loopback interface. Not really sure what to do without sudo and a bit frustrated, I closed the laptop to resume later. Well, later, the laptop did not wake up anymore. Authentication gone! It just hung. A reboot was necessary. Darn. At this point I am really frustrated!
I think my next step is to go out and take Jay’s Bastille Linux scripts to see what they are going to do to my box. I actually hope Jay is going to show up here in Vegas so I can bug him about some of my OS X things 😉
[tags]OS X,ipfw[/tags]
July 29, 2007
Effective immediately, I have a new employer! I am leaving ArcSight to start working for Splunk, an IT search company in San Francisco. As their Chief Security Strategist, I will be working in product management, with responsibility for all of the UI and solutions.
The work I have been doing in my past with log management and especially visualization is going to directly apply to my new job. I will be spending quite some time to help further the visual interfaces and define use-cases for log management. Exactly what I’ve been doing for the last four years already 😉
Please don’t send me any emails to my arcsight email anymore. My new address:
raffy at splunk . c o m
I found out that a lot of the Splunk developers hang out on IRC (#splunk). I’ve been hanging out in there for the last couple of days. Maybe you can catch me there too 😉
These Splunk guys are funny. One of the first things they did is giving me a Mac book. Darn. I have never used a Mac before. This is crazy. All the little things I had developed and installed on my Linux boxen I now have to translate to OS X. I am slowly getting used to this beast, but there are still things I wasn’t able to figure out. Maybe some of you want to help me out?
- The first thing that I did was looking for something to cover the built-in camera. I don’t trust this thing. Who knows who’s watching 😉 I finally found the iPatch. Unfortunately they are out of stock. Well, I just built my own …


- Then I discovered that the plugs I have for the microphone and headphone jacks are not working either. They are slightly too big. Well, I will have to talk to Josh about that during DefCon 😉

- Then the other thing that I am struggling with is logging and auditing. I used tcpspy before to log all the connections that are opened to and from my machine. I downloaded the source and started compiling. No luck. Here is the error during compilation. Anyone know how to fix this?
tcpspy.c: In function 'ct_read':
tcpspy.c:236: error: 'TCP_ESTABLISHED' undeclared (first use in this function)
- Maybe there is another tool that I can use to record all the connections? The nice thing about tcpyspy is that it also logs the application that opened or accepted the connection and the user associated with that.
- What do I do about auditing? Are there instructions somewhere on how to enable either BSM auditing for Mac OSX or is there something else? I would like to mainly audit access to critical files on my box.
- There are all kinds of other little odd things, but these are the items bugging me right now 😉
See ya all at BlackHat! Hit me up so we can meet up!