How to name things is not something I consciously think about very often, but it recently became obvious to me how important it is to name things and define what they really mean. In my daily work I use the words Event and Log Entry all the time. When talking to developers and other geeks, that has never been a problem. But I was recently talking to some groups outside of my company, and when I mentioned the word event, it took me a while to understand that they did not think about an event the way I did. An event for them was an incident, a physical event, the constellation of things coming together and causing something to happen. For me, an event is a term I use very loosely. An event gets generated by a device. It’s the same as a log entry. It’s a “string” that describes what happened. Windows, for example, generates events. They get collected in the event log. But again, I am using the term very loosely. What’s a log entry then, in contrast to an event? Hmm… And is a tcpdump record a log entry or an event, or what is it? Hard to say. I guess it takes the effort of someone to define all that. I might…
Events vs. Logs vs. Log Entries vs. Traps vs. ? – Missing Definitions
Security Data Visualization – Book Chapter
I am scribbling on another book chapter. This time it’s for a visualization book. I am writing about how to analyze firewall and IDS logs. I am using line graphs and treemaps to do so. Guess what tool I am using to generate all the graphs. Yes. AfterGlow.
I am not quite done with writing, but I am pretty happy with the way it is shaping up. The chapter is not going to be highly technical. I am not going into how to configure AfterGlow, parse log messages and such. I focus more on the process level. It is quite an interesting experience to put something into words that you intuitively do all the time.
I am not sure when the book is actually going to come out, but I will post here when it’s available.
Interoperability Standards – Log Standards
There is a lot of talk around interoperability standards lately. Following these discussions, it seems to me that people are intermixing a lot of different topics:
a) Log format (syntax)
b) Event transport
c) Event classification (also called taxonomy, categorization, grammar)
d) Logging recommendations (what events specific devices should report AND what fields they should contain as a minimum)
I would really like to see future discussions broken up into these four groups!
Log Messages
Quite a while ago, the goal of loganalysis.org was to collect log files from all kinds of devices to build up a repository for the community. Unfortunately, that effort has not been too successful. I just stumbled across a new effort driven by Splunk:
www.splunk.com/base. There are quite a few syslog messages on there already. What I don’t like is that most of the messages are exceptions from Java applications. I don’t really care about those. Well, hopefully more people are going to add logs…
XCCDF-P
A horrible acronym. I know. We had a working session during the RSA conference to talk about XCCDF-P. For those not familiar with XCCDF, it has to do with policy definitions and uses OVAL to implement the checks.
XCCDF-P (which will hopefully get renamed pretty soon to something else, and hopefully not to CPN (Common Platform Names) [We already have CVE, CME, and CCE]) is an effort to standardize platform names. What’s the problem? Well, if I have two scanners analyzing a system of mine, one of them might report that I am running “Windows 2000”, the other one might say “Win2K”. These are really the same, but how would a machine know? That’s where the standard is trying to clean things up. You wouldn’t believe how much discussion this topic actually involves. We met for about an hour and had plenty of things to discuss, without even coming close to an agreed-upon solution. However, the problem is defined and we all agreed upon the necessity to solve it! Stay tuned for an update soon and hopefully a quick turnaround with a solution draft.
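To make the naming problem concrete, here is a minimal sketch in Python of what a machine needs in order to see that two scanner reports mean the same platform. The alias table, the canonical name format and the function names are all made up for illustration; none of them come from XCCDF-P or any agreed-upon draft.

```python
# Hypothetical alias table. The canonical names and the aliases are
# assumptions for illustration, not part of any published standard.
PLATFORM_ALIASES = {
    "windows 2000": "microsoft:windows:2000",
    "win2k": "microsoft:windows:2000",
    "windows xp": "microsoft:windows:xp",
    "winxp": "microsoft:windows:xp",
}


def normalize_platform(reported_name):
    """Return the canonical platform name, or None if the name is unknown."""
    # Lowercase and collapse whitespace so "  Windows   2000 " still matches.
    key = " ".join(reported_name.lower().split())
    return PLATFORM_ALIASES.get(key)


def same_platform(name_a, name_b):
    """True only if both scanner names resolve to the same known platform."""
    canonical_a = normalize_platform(name_a)
    canonical_b = normalize_platform(name_b)
    return canonical_a is not None and canonical_a == canonical_b
```

With this table, `same_platform("Windows 2000", "Win2K")` is true, while two unknown names never compare as equal. The hard part, of course, is agreeing on the table, which is exactly what the standard is for.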
TreeMaps
Wow. I just found this pretty awesome TreeMap tool. The data format it reads is pretty easy and I quickly built a file with some of my firewall data. Well, fake firewall data 😉
What you see here is, first, the color: green boxes are firewall passes, red ones are blocks. The hierarchy puts the target system at the top, then the target IP, and inside the boxes you see the date when the access happened.
Well, the tool is pretty awesome. Lots of interactivity. You define the hierarchies manually and the graph updates on the fly. Then you can color, filter, and do all kinds of nifty things. Try it out.
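The hierarchy described above can be sketched in a few lines of Python. This is not the tool's input format, which is not shown here; it only illustrates the nesting (target system, then target IP, then dated boxes colored by action). The records are made up, just like my firewall data, and the names `RECORDS`, `COLORS` and `build_hierarchy` are hypothetical.

```python
from collections import defaultdict

# Made-up firewall records: (date, target system, target IP, action).
RECORDS = [
    ("2006-03-01", "mailserver", "10.0.0.5", "pass"),
    ("2006-03-01", "mailserver", "10.0.0.5", "block"),
    ("2006-03-02", "webserver", "10.0.0.8", "pass"),
]

# Color coding as described: passes are green, blocks are red.
COLORS = {"pass": "green", "block": "red"}


def build_hierarchy(records):
    """Nest records as target system -> target IP -> list of (date, color)."""
    tree = defaultdict(lambda: defaultdict(list))
    for date, system, ip, action in records:
        tree[system][ip].append((date, COLORS[action]))
    # Convert to plain dicts so the result prints and compares cleanly.
    return {system: dict(ips) for system, ips in tree.items()}
```

Each leaf entry corresponds to one box in the treemap: the date is the label and the color tells you whether the access was passed or blocked.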
Log Management Article – My Comments
I am still sitting in the airplane, and the next article from the November 2005 ISSA Journal that catches my attention is “Log Data Management: A Smarter Approach to Managing Risk”. I have only a few comments about this article:
- The author demands that all the log data is archived, and archived unfiltered. Well, here is a question: What is the difference between not logging something and logging it, but later filtering it out? What does that mean for litigation quality logs?
- On the same topic of litigation quality data, the author suggests that a copy of the logs be saved in the original, raw format while analysis is done on the other copy. I don’t agree with this. I know, in this matter my opinion does not really count and nobody is really interested in it, but I will have some proof soon that this is not required. I am not a lawyer, so I will not even try to explain the rationale behind allowing the processing of the original logs and still maintaining litigation quality data.
- “Any log management solution should be completely automated.” While I agree with this, I would emphasize the word should. What does that mean anyway? Completely automated in the realm of log management? Does that mean the log is archived automatically? Does it mean that the log management solution takes action and blocks systems (like an IPS)? There will always need to be human interaction. You can automate a lot of things, including the generation of trouble tickets, but at that point, at least, an operator will be involved.
- Why does the author demand that “companies should look for an appliance-based solution”? Why is that important? The author does not give any rationale for that. I can see some benefits, but there are tons of drawbacks to that approach too. I have yet to see a compelling reason why an appliance is better than a custom install on company-approved hardware.
- In the section about alerting and reporting capabilities, the author mentions “text-based alerts”, meaning that rules can be set up to trigger on text strings in log messages. That’s certainly nice, but sorry, it does not scale. Assume I want to set up a trigger on firewall block events. I can define a text string of “block” to trigger upon. But all the firewalls which call this not a block, but a “Deny”, will not be caught. Have you heard of categorization or an event taxonomy? That’s what is really needed!
- “… fast text-based searches can accelerate problem resolution …” Okay. Interesting. I disagree. I would argue that visualization is the key here. But I am completely biased on that one 😉
- Another interesting point is that the author suggests that “… a copy [of the data] can be used for analysis”. Sure. Why not, but why? If the argument is litigation quality data again, why would compression, which is mentioned in the next sentence, be considered a “non-altering” way of processing the data? If that is the argument, I would argue that I can work with the log data by normalizing it and even enriching it without altering it.
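The “block” versus “Deny” problem from the comment on text-based alerts can be sketched in Python. This is only an illustration under stated assumptions: the category names, the word list and the two sample messages are made up, and a real taxonomy would be far richer than a lookup on single words.

```python
# Hypothetical taxonomy: vendor-specific action words mapped to one category.
# Category names and the word list are assumptions for illustration.
ACTION_CATEGORIES = {
    "block": "firewall/block",
    "deny": "firewall/block",
    "drop": "firewall/block",
    "pass": "firewall/pass",
    "accept": "firewall/pass",
    "permit": "firewall/pass",
}

# Made-up messages from two imaginary firewalls reporting the same thing.
MSG_FW1 = "fw1 block tcp 10.0.0.1 -> 10.0.0.5"
MSG_FW2 = "fw2 Deny tcp src 10.0.0.1 dst 10.0.0.5"


def text_rule_matches(message, text="block"):
    """The text-based alert: trigger if the string occurs in the message."""
    return text in message.lower()


def categorize(message):
    """Return the category of the first known action word, else None."""
    for word in message.lower().split():
        category = ACTION_CATEGORIES.get(word.strip(",:;()[]"))
        if category:
            return category
    return None
```

The text rule fires on `MSG_FW1` but misses `MSG_FW2`, while `categorize` puts both messages into the same category, so a single rule on the category covers both firewalls.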
Common Event Format / Standard
There is an interesting thread on the log-analysis mailing list about regex-less parsing of messages. The problem is a very old one. Every device out there is logging in some strange way, making it incredibly time-consuming for event consumers (such as ArcSight) to parse the messages and normalize them.
There have been attempts to standardize events, such as IDMEF, which tried to tackle IDS messages. It’s kind of sad, but there is not a single IDS that I know of which really uses this event exchange format. A lot of IDSs support it, but it’s not their main transport. Then there are tons of other attempts, from BEEP to RDEP to SDEE and the like. They are all nice, but guys, we need something that is
As if all the past attempts at standardizing event formats were not enough, now Microsoft comes out with yet another event logging format. I have to admit, I only quickly glanced over it, but it’s XML again. That’s just SLOW! Huge overhead!
Also, why do people always define the transport when they are trying to standardize log messages? Leave the transport to the devices. They will figure that one out. In the worst case, people can just use syslog, which is widely deployed and has its problems. But you know what? At least the burden of complying with the standard is incredibly low. Just send a syslog message. Even I can do that. If you asked me to implement BEEP, I don’t think I would even start thinking about complying with the standard…
Sorry for the long post and rant, but this is just a bit frustrating …
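To show how low the bar for “just send a syslog message” really is, here is a minimal Python sketch that builds a BSD-syslog style line (the RFC 3164 layout) and sends it over UDP. The host name, tag, message text, server address and the choice of facility local0 are assumptions for illustration.

```python
import socket
from datetime import datetime

FACILITY_LOCAL0 = 16  # syslog facility code for local0
SEVERITY_INFO = 6     # syslog severity code for informational

MONTHS = ["Jan", "Feb", "Mar", "Apr", "May", "Jun",
          "Jul", "Aug", "Sep", "Oct", "Nov", "Dec"]


def format_syslog(message, hostname, tag, when,
                  facility=FACILITY_LOCAL0, severity=SEVERITY_INFO):
    """Build a BSD-syslog style line: <PRI>TIMESTAMP HOSTNAME TAG: MESSAGE."""
    priority = facility * 8 + severity
    # The timestamp looks like "Mar  5 14:02:07": the day is space-padded.
    timestamp = "%s %2d %02d:%02d:%02d" % (
        MONTHS[when.month - 1], when.day, when.hour, when.minute, when.second)
    return "<%d>%s %s %s: %s" % (priority, timestamp, hostname, tag, message)


def send_syslog(line, server="127.0.0.1", port=514):
    """Send one line to a syslog server over UDP (fire and forget)."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.sendto(line.encode("ascii", "replace"), (server, port))
```

A call such as `send_syslog(format_syslog("firewall block src=10.0.0.1", "fw1", "demo", datetime.now()))` is the whole transport. Compare that with implementing a BEEP profile and the point about the burden of compliance makes itself.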