September 14, 2007
When eIQnetworks announced their OpenLogFormat, I think they did it just for me. I love it. I really enjoy taking these things apart to show why they are really really bad attempts. I am sure these guys are not readers of my blog. Otherwise they would have known that I will question their standard, line by line. It just doesn’t add up for me. Why are companies/people not learning/listening?
So, there is yet another “standard” for event interoperability being suggested by yet another vendor. While some vendors (for example the one I used to work for) actually thought about the problem and made sure they came up with something useful, I am not sure this standard lives up to that promise. Let me go through the standard piece by piece, right after some general comments:
- Why another interoperability standard? There is not a single word of motivation printed in the standards document. Don’t we have existing standards already?
- You have to register to download the standard? Well, I know, ArcSight makes that same mistake. That wasn’t my doing! I promise.
- How does this standard compare to others? What’s the motivation for defining it? Is it better than everything else?
- When exactly would you apply this standard? All the time? OLF (the open log format) states:
OLF is designed for logging network events such as those often logged by firewalls, but it can also be used for events not related to the network.
What the heck does that mean? For everything? Do you want me to prove you wrong? There are tons of examples where you won’t be able to apply this standard.
- You did not do your homework, my friends! In a lot of areas. Some friends of mine already commented on the fact that this is advertised as an “open” log format. The press release even calls it an open source log format. What does that mean? Was there a period for public comment? Believe me, there wasn’t. I would have known FOR SURE!
- With regards to the homework. Have you heard of CEE? Yes, that’s a group that actually knows quite a bit about logging. Why bother asking them, they would only critique the proposal and possibly shoot it down? You bet. That’s what I am doing right now anyways.
- Let’s see, did you guys learn from past mistakes? Don’t get me started. I claim NO. Read on and you will see a lot of cases that prove why.
- Have you read my old blog entries and at least tried to understand what logging is about? I can guarantee that you guys have not. Or maybe you didn’t understand what I was saying. Hmm…. Here again, for your reference.
- Have you looked at the other standards out there? For example CEF (common event format) from ArcSight. I am definitely biased towards that one, as I have written it, but even now that I don’t work there anymore, I still think that CEF is actually a really good logging standard. Again. Not done your homework!
- Last general question: Why would I be using this standard as opposed to anything else, for example CEF? Is eIQnetworks big enough so I would care? Last time I checked, the answer was: No. If this was something that was done by Microsoft, I might care, just because of their size. Maybe you have a lot of vendors already supporting this standard? Yes? How many? Who? I have never heard of OLF before and I deal with log management every day! So I doubt there is any significant adoption. Actually, I just checked the Web page and there are six companies supporting it. Okay. All that 😉
Let’s go through the standard in more detail:
- I already made this point: What is the area where this standard applies? Networking and non-networking events (That’s what OLF claims)? Nice. And why would you require an IP address field (to be exact: internalIP and externalIP) for every record? In your world, are there only events that contain IPs? In mine, there are many others too!
- You are proposing a log-file approach. So you are defining a file-based standard, limiting it to one transport. Okay. But why? Again, read my blog about transport-independence. Who is logging to files only? A minority of products in the networking realm.
- Have you guys written parsers before? (Yes, I have!). Do you know how bad it is to read headers first? Makes a whole lot of use-cases impossible. And to be frank, it requires too much coding (I am lazy).
- Minor detail: You guys are already on version 1.1? Hmm… I wonder how version 1.0 looked 😉
- I don’t think the author of this paper has written a standard before: “The #Version line gives the version of OLF, which should always be 1.1.” How do you do updates? You deprecate this document? Confusing, confusing.
- Why do you need a #Date line in the header? That does not make any sense AT ALL!
- Okay, so you are using a header line that defines the fields. All right. Let’s assume that’s a good idea in order to reduce the size of an event (exercise to the reader why this is true). Why do you say then:
NOTE: The fields may not vary; they must alwas be the ones specified in this document.
What? This does not make any sense at all! Whatsoever! Delete that line. Done. It’s irrelevant.
- Let’s go back to the header line. Why all these required fields? spam-info? This is very inefficient. Why have all these fields for every event? It unnecessarily bloats your events and circumvents the idea of a header line!
- Tab-separated fields. Okay. Your choice. Square brackets to deal with escaping? Are you guys coders? That’s not a standard way of doing things at all. Anyone who wrote code before, have you seen this approach anywhere? If you stuck to commas and quotes, you might be able to read your logs in Excel without any configuration 😉
- tab-separated subfields. Shiver.
- Guys, your example on page one is horrible. Priority in the preamble and in the suffix? Then the virtualdevice is root? Maybe I can’t count. You know what, I think the fields don’t even align. What are all the IPs in the message? Part of the message (the one with the seemingly interesting IPs) seems to be lumped together into one field (uses the square brackets). I don’t get it.
- Error lines? Come again? So there are really two different types of log entries? Or no, hang on, there aren’t. Those lines are only generated if the OLF consumer realizes that the format is not correct? What does that have to do with a logging standard? If I wasn’t confused yet, now I definitely am.
- Open source: “a device-type assigned by eIQnetworks”. No further comment.
- Wow. Is it right that every log entry carries the “original” log message also (called the Nativelog)? So, if a product supports OLF by default, that’s just empty? Come on guys. Are you really suggesting to double the size of messages?
- Talking about the field dictionary… What does it mean to have “unused” fields? Unused by what? The standard? Oh, maybe this is not a standard?
- I will spare you the analysis of all the fields in the dictionary. There are tons of problems. Just one: If you have a count bigger than one and you only have one timestamp. What does that mean? All the events happened at the same time?
- Note that the Nativelog field is defined as: Original syslog line. Okay, so this is a file-based standard, but it consumes syslog messages?
- event types: There is indeed, and I kid you not, a -1 value. Is that for real?
- priority codes: Nice. Read this (again, this is a standard, in case you forgot):
The descriptions [of the priorities] given are the official interpretation, but usage varies; some vendors report routine events with higher priority
- Note the copyright at the bottom of the pages 😉 [Okay, I admit, I might have made the same mistake with the first version of CEF, you are forgiven].
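To make the delimiter point from the list above concrete, here is a minimal sketch of what sticking to commas and quotes buys you. It uses Python's standard csv module. The sample record and its values are made up for illustration and are not taken from the OLF document.

```python
import csv
import io

# Hypothetical event whose message field contains a comma and quotes.
record = ["2007-09-14 10:00:00", "10.0.0.1", 'login failed, user "bob"']

# Write the record using standard CSV quoting rules.
buffer = io.StringIO()
csv.writer(buffer).writerow(record)
line = buffer.getvalue()

# Read it back; the embedded comma and quotes survive the round trip
# without any custom escaping scheme.
parsed = next(csv.reader(io.StringIO(line)))
```

Any tool that speaks CSV, spreadsheets included, can consume such a line without a custom parser.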
Have I convinced you yet why not to use this “standard”?
Random observation: Why does this log remind me of IIS logs gone wrong?
[tags]log standard, logging, event interoperability, cee, olf, open log format[/tags]
September 11, 2007
Finally, ArcSight is going for it: http://news.google.com/news?ie=UTF-8&rlz=1B2GGGL_enUS205US205&tab=bn&ncl=1120626202&hl=en
It seems like there is a new wave of security companies going public. First Sourcefire, then TippingPoint, now ArcSight. I am really curious as to what the share price is going to be and what the reverse split is going to look like.
August 25, 2007
A lot has happened the last couple of weeks and I am really behind with a lot of things that I want to blog about. If you are familiar with the field that I am working in (SIEM, SIM, ESM, log management, etc.), you will fairly quickly realize where I am going with this blog entry. This is the first of a series of posts where I want to dig into the topic of event processing.
Let me start with one of the basic concepts of event processing: normalization. When dealing with time-series data, you will very likely come across this topic. What is time-series data? I used to blog and talk about log files all the time. Log files are a type of time-series data. It’s data which is collected over time. Entries are associated with a time stamp. This covers anything from your traditional log files to snapshots of configuration files or snapshots of tools that are run on a periodic basis (e.g., capturing your netstat output every 30 seconds).
Let’s talk about normalization. Assume you have some data which reports logins to one of our servers. We would like to generate a report which shows the top ten users accessing the server. How would you do that? We’d have to identify the user name in the log entry first. Then we’d extract it, for example by writing a regular expression. Then we’d collect all the user names and compile the top ten list.
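A minimal sketch of that first approach, assuming sshd-style login lines. The sample lines, the regular expression, and the `top_users` helper are my own illustration, not the format of any particular product.

```python
import re
from collections import Counter

# Hypothetical sshd-style login entries; the exact format is an assumption.
LOG_LINES = [
    "Aug 25 10:01:02 server sshd[311]: Accepted password for alice from 10.0.0.5 port 4022 ssh2",
    "Aug 25 10:02:10 server sshd[312]: Accepted password for bob from 10.0.0.6 port 4023 ssh2",
    "Aug 25 10:03:44 server sshd[313]: Accepted password for alice from 10.0.0.5 port 4024 ssh2",
]

# Only the user name is extracted; everything else in the entry is ignored.
USER_PATTERN = re.compile(r"Accepted \S+ for (\S+) from")


def top_users(lines, n=10):
    """Count the user names found in the lines and return the n most frequent."""
    counts = Counter()
    for line in lines:
        match = USER_PATTERN.search(line)
        if match:
            counts[match.group(1)] += 1
    return counts.most_common(n)
```

This answers exactly one question. Any other report (top source addresses, logins per hour) needs another regular expression and another pass over the data.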
Another way would be to build a tool which picks the entire log entry apart and puts as much information from the event into a database. As opposed to just capturing the user name. We’d have to create a database with a specific schema. It would probably have these fields: timestamp, source, destination, username. Once we have all this information in a database, it is really easy to do all kinds of analysis on the data, which was not possible before we normalized it.
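A rough sketch of this second approach, using Python's sqlite3 module and the four fields named above. The log format, the regular expression, and the `normalize` helper are assumptions for illustration only.

```python
import re
import sqlite3

# Hypothetical sshd-style entries; the last one has no matching parser.
SAMPLE = [
    "Aug 25 10:01:02 server sshd[311]: Accepted password for alice from 10.0.0.5 port 4022 ssh2",
    "Aug 25 10:02:10 server sshd[312]: Accepted password for bob from 10.0.0.6 port 4023 ssh2",
    "Aug 25 10:03:44 server sshd[313]: Accepted password for alice from 10.0.0.5 port 4024 ssh2",
    "Aug 25 10:04:00 server cron[99]: job started",
]

# One named group per column of the schema.
LINE_PATTERN = re.compile(
    r"^(?P<timestamp>\w{3} +\d+ [\d:]+) (?P<destination>\S+) sshd\[\d+\]: "
    r"Accepted \S+ for (?P<username>\S+) from (?P<source>\S+)"
)


def normalize(lines):
    """Parse each entry into fields and store it in an in-memory database."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE events "
        "(timestamp TEXT, source TEXT, destination TEXT, username TEXT)"
    )
    for line in lines:
        match = LINE_PATTERN.match(line)
        if match is None:
            continue  # no parser for this entry, so it never reaches the database
        conn.execute(
            "INSERT INTO events VALUES "
            "(:timestamp, :source, :destination, :username)",
            match.groupdict(),
        )
    conn.commit()
    return conn


conn = normalize(SAMPLE)
top = conn.execute(
    "SELECT username, COUNT(*) FROM events "
    "GROUP BY username ORDER BY COUNT(*) DESC LIMIT 10"
).fetchall()
```

Once the fields are in the table, the top-ten report is a single query, and so is any other report over the same fields.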
The process of taking raw input events and extracting individual fields is called normalization. Sometimes there are other processes which are classified as normalization. I am not going to discuss them right here, but for example normalizing numerical values to fall in a predefined range is generally referred to as normalization as well.
The advantages of normalization should be fairly obvious. You can operate on the structured and parsed data. You know which field represents the source address versus the destination address. If you don’t parse the entries, you don’t really know that. You can only guess. However, there are many disadvantages to the process of normalization that you should be aware of:
- If you are dealing with a disparate set of event sources, you have to find the union of all fields to make up your generic schema. Assume you have a telephone call log and a firewall log. You want to store both types of logs in the same database. What you have to do is take all the fields from both logs and build the database schema. This will result in a fairly large set of fields. If you keep adding new types of data sources, your database schema gets fairly big. I know of a SIM which uses more than 200 fields. And still that doesn’t cover nearly all the fields that are needed to cover a good set of data sources.
- Extending the schema is incredibly hard: When building a system with a fixed schema, you need to decide what your schema will look like. If, at a later point in time, you have a need to add another type of data source, you will have to go back and modify the schema. This can have all kinds of implications on the data already captured in the data store.
- Once you have decided to use a specific schema, you have to build your parsers to normalize the inputs into this schema. If you don’t have a parser, you are out of luck and you cannot use that data source.
- Before you can do any type of analysis, you need to invest the time to parse (or normalize) the data. This can become a scalability issue. Parsing is fairly slow. It generally applies regular expressions to each of the data entries, which is a fairly expensive operation.
- Humans are not perfect and programmers are not either. The parsers will have bugs and they will screw up normalization. This means that the data that is stored in the database could be wrong in a number of ways:
- A specific field doesn’t get parsed. This part of the data entry is not available for any further processing.
- A field gets parsed but assigned to the wrong field. Part of your prior analysis could be wrong.
- Breaking up the data entry into tokens (fields) is not granular enough. The parser should have broken the original entry into more specific fields.
- The data entries can change. Oftentimes, when a new version of a product is released, it either adds new data types or it changes some of the log entries. This has to be reflected in the parsers. They need to be updated to support the new data entries, before the data source can be used again.
- The original data entry is not available anymore, unless you are spending the time and space to store the original data entry along with the parsed and extracted fields. This can have quite some scalability issues as well.
I have seen all of these cases happening. And they happen all the time. Sometimes, the issues are not that bad, but other times, when you are dealing with mission critical systems, it is absolutely crucial that the normalization happens correctly and on time.
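To illustrate the first disadvantage in the list above, here is a small sketch of what the union of fields does to a single record. The field names for the firewall and telephone logs are invented for this example.

```python
# Hypothetical field sets for two unrelated event sources.
FIREWALL_FIELDS = {"timestamp", "source_ip", "destination_ip", "port", "action"}
PHONE_FIELDS = {"timestamp", "caller", "callee", "duration_seconds"}

# The generic schema is the union of every field of every source.
SCHEMA = sorted(FIREWALL_FIELDS | PHONE_FIELDS)


def to_row(event):
    """Project one event onto the union schema; missing fields become None."""
    return {field: event.get(field) for field in SCHEMA}


call = {
    "timestamp": "2007-08-25T10:00:00",
    "caller": "555-0100",
    "callee": "555-0199",
    "duration_seconds": 42,
}
row = to_row(call)

# Fields that exist only because of the other source stay empty.
empty = [field for field, value in row.items() if value is None]
```

With two sources, half of the columns in the telephone record are already empty. Every additional source type widens the schema and adds more empty columns to all the existing ones.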
I will expand on the challenges of normalization in a future blog entry and put it into the context of security information management (SIM).
[tags]SIM, SIEM, ESM, log management, event normalization, event processing, log analysis[/tags]
July 12, 2007
Today I was booking my airline ticket to Kuala Lumpur, Malaysia for my trip to Hack in the Box in September. I called the sales lady for the airline and talked to her about my flight dates and all that. In the end she asked me for my credit card information. Number, expiration date, and then the CVV number on the back of my card (the security code, as it is called sometimes too). I hesitated for a second, trying to remember what I just learned from the PCI auditors we had in house. I couldn’t really remember when a merchant needed that number, but after a second I realized that it would be okay to give it to her. It’s about the same as on a Web page, where you enter that information. They can use the CVV to run an authorization with the credit card company. Well, I thought that would be it. Wrong!
A couple of hours later I got a pretty ugly Excel spreadsheet back. I was asked to print it out, sign it, and fax it back to them. I had a look at the form and I wondered what was going on. Well, there was all my information in this spreadsheet, including CVV number! They even “encrypted” my credit card number in the spreadsheet. I am just kidding. It was all in plain text. The only funny thing was that the credit card number field was not formatted as a string, but a number, so it looked like it was encrypted. *grins*. But back to serious. I was quite upset. All my information in this document. I have to assume that this Excel document is on the sales person’s desktop, along with probably dozens of others. Hmmm… Maybe I should send an email with a link that points to a site that contains a … Let’s not even go there.
The next thing I did was dig up the PCI standard. And here it was, section 3.2.2:
3.2.2 Do not store the card-validation code (Three-digit or four-digit value printed on the front or back of a payment card (e.g., CVV2 and CVC2 data))
A clear violation! And you know, this is pretty much the first thing you should address; the way of authorizing credit card transactions. Just plain wrong! Darn!
I wrote them an email asking for a contact in their security department. So far, no luck, just the sales person telling me that she needs all that information to complete the transaction. Whatever. Either she needs my signature, but then no CVV, or the CVV and no signature. But not both! I wonder how this is going to continue.
[tags]pci, compliance, violation, security[/tags]
May 26, 2007
Log analysis has shifted fairly significantly in the last couple of years. It is not about reporting on log records (e.g., Web statistics or user logins) anymore. It is all about pinpointing who is responsible for certain actions/activities. The problem is that the log files often do not communicate that. There are instances of logs (mainly from network centric devices), which contain IP addresses that are used to identify the subject. In other instances, there is no subject that can be identified in the log files at all (database transactions for example).
What I really want to identify is a person. I want to know who is to blame for deleting a file. The log files have not evolved to a point where they would contain the user information. It generally does not help much to know what machine the user came from when he deleted the file.
This all is old news and you probably are living with these limitations. But here is what I was wondering about: Why has nobody built a tool or started an open source project which looks at network traffic to extract user to machine mappings? It’s not _that_ hard. For example SMB traffic contains plain-text usernames, shares, originating machines, etc. You should be able to compile session tables from this. I need this information. Anyone? There is so much information you could extract from network traffic (even from Kerberos!). Most of the protocols would give you a fair understanding of who is using what machine at what time and how.
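To show what I mean by a session table, here is a minimal sketch. It assumes that something (not shown here) has already pulled (timestamp, user, machine) sightings out of the traffic. The `Session` class, the idle timeout, and the helper names are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Session:
    user: str
    machine: str
    first_seen: float
    last_seen: float


def build_sessions(observations, idle_timeout=300.0):
    """Merge (timestamp, user, machine) sightings into sessions.

    A gap longer than idle_timeout seconds between two sightings of the
    same user on the same machine starts a new session.
    """
    sessions = []
    latest = {}  # (user, machine) -> most recent Session for that pair
    for timestamp, user, machine in sorted(observations):
        key = (user, machine)
        current = latest.get(key)
        if current is not None and timestamp - current.last_seen <= idle_timeout:
            current.last_seen = timestamp
        else:
            current = Session(user, machine, timestamp, timestamp)
            latest[key] = current
            sessions.append(current)
    return sessions


def users_on(sessions, machine, timestamp):
    """Return the users with a session on the machine at the given time."""
    return sorted(
        {
            s.user
            for s in sessions
            if s.machine == machine and s.first_seen <= timestamp <= s.last_seen
        }
    )
```

With such a table, an IP address in a log entry can be turned into a user name by asking who had a session on that machine at the time of the event.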
[tags]identity correlation, user, log analysis, user mapping[/tags]
May 15, 2007
I was just listening to this podcast about security information management (SIM) systems. Tom Bowers from Information Security magazine is talking about various topics in SIM. Unfortunately I have to disagree with Tom on a couple of points, if not more. But let me pick the couple I find most important:
- Visualization is a great tool to see attacks in real-time. However, you can only see where the attacks are coming from and not how many. What? Why would I not be able to visualize that? You can map that to edge size, node size, map it as a color to your nodes, etc. I don’t know what system he looked at to make this statement, but that’s wrong!
- Active Response is something that SIMs cannot do. Well. Wrong again. I could tell you how ArcSight is doing this with the Threat Response Manager (TRM), but that would be a vendor pitch. That’s why I am going to mention SEC, the Simple Event Correlator. It can execute an arbitrary action. Well, it’s not a quantum leap from there to imagine how you could issue a command to add an ACL to a router for example. To sum up: Active response is something SIMs can do! If you want to know how exactly you do this with SEC, read my chapter on event analysis in the new Snort book.
These were the main points where I disagree with Tom. He could have done a bit of a better job describing the benefits of visualization, but that’s another story.
[tags]arcsight,visualization[/tags]
May 11, 2007
I was trying to get my Ubuntu desktop to use Beryl, just like my laptop does. Unfortunately, my NVidia drivers didn’t quite want to do what I wanted them to do. Long story short, at some point I remembered to check in the log files to see whether I could determine what exactly the problem was. Where should I look first? /var/log/messages. And right there it was:
May 11 11:15:12 zurich kernel: [ 2503.193111] NVRM: API mismatch: the client has the version 1.0-9631, but
May 11 11:15:12 zurich kernel: [ 2503.193114] NVRM: this kernel module has the version 1.0-9755. Please
May 11 11:15:12 zurich kernel: [ 2503.193115] NVRM: make sure that this kernel module and all NVIDIA driver
May 11 11:15:12 zurich kernel: [ 2503.193117] NVRM: components have the same version.
Beautiful. That’s exactly what I needed to know. But hang on a second. Isn’t this a syslog entry? Wow. It just hit me. While I really liked the verbose output, I was trying to think about how I would parse this thing. How would I normalize this message to later apply machine logic to further process this? Awful!
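Just to show how much work this takes, here is a rough sketch of a parser for exactly these four lines. The regular expression and the rule for gluing continuation lines back together (same timestamp, host, and tag) are my own guesses, not anything syslog or the NVIDIA driver defines.

```python
import re

LINES = [
    "May 11 11:15:12 zurich kernel: [ 2503.193111] NVRM: API mismatch: the client has the version 1.0-9631, but",
    "May 11 11:15:12 zurich kernel: [ 2503.193114] NVRM: this kernel module has the version 1.0-9755. Please",
    "May 11 11:15:12 zurich kernel: [ 2503.193115] NVRM: make sure that this kernel module and all NVIDIA driver",
    "May 11 11:15:12 zurich kernel: [ 2503.193117] NVRM: components have the same version.",
]

# timestamp, host, program, kernel uptime in brackets, tag, free text
SYSLOG_PATTERN = re.compile(
    r"^(?P<timestamp>\w{3} +\d+ [\d:]+) (?P<host>\S+) (?P<program>[^:\s]+): "
    r"\[\s*(?P<uptime>[\d.]+)\] (?P<tag>\w+): (?P<text>.*)$"
)


def parse(lines):
    """Parse the entries and merge wrapped lines into one event."""
    events = []
    for line in lines:
        match = SYSLOG_PATTERN.match(line)
        if match is None:
            continue
        fields = match.groupdict()
        previous = events[-1] if events else None
        # Heuristic: same timestamp, host and tag means "continuation line".
        if previous and all(
            previous[key] == fields[key] for key in ("timestamp", "host", "tag")
        ):
            previous["text"] += " " + fields["text"]
        else:
            events.append(fields)
    return events


events = parse(LINES)
# A second, message-specific expression is needed to get at the versions.
versions = re.findall(r"version (\d+\.\d+-\d+)", events[0]["text"])
```

Two regular expressions and a heuristic for one message. The next message from the same driver will need its own expression again.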
I guess my conclusion would be that we need two types of Syslogs! One that logs machine readable log entries and one for humans. Is that really what we want? Maybe the even better solution would be to only have a machine readable log and then provide an application that can read the log and blow the contents up to make it readable for humans!
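A small sketch of that second idea: the log only carries structured fields, and a separate viewer turns them into a sentence. The JSON layout, the field names, and the template table are invented for this example and do not follow any existing standard.

```python
import json

# Hypothetical machine-readable entry; every field name is an assumption.
ENTRY = (
    '{"time": "2007-05-11T11:15:12", "host": "zurich", "source": "NVRM", '
    '"event": "api_mismatch", "client_version": "1.0-9631", '
    '"module_version": "1.0-9755"}'
)

# The human-readable wording lives in the viewer, not in the log.
TEMPLATES = {
    "api_mismatch": (
        "API mismatch: the client has the version {client_version}, "
        "but this kernel module has the version {module_version}."
    ),
}


def render(raw):
    """Expand a structured entry into a sentence for humans."""
    entry = json.loads(raw)
    template = TEMPLATES.get(entry["event"])
    if template is None:
        return raw  # unknown event type: show the raw entry instead
    prefix = "{time} {host} {source}: ".format(**entry)
    return prefix + template.format(**entry)
```

The machine reads the fields directly, and the human gets the same verbose sentence as before, just generated on demand.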
Where is CEE when you need it?
May 10, 2007
Although I work in the log/event management space and therefore help organizations to gather more information about people, I am a big opponent of personal information collection.
I flew back from Switzerland to San Francisco after my Christmas break and was in for a surprise. Not only did they want my passport (which I can sort of understand ;), but they also wanted me to fill out an additional form with my address in San Francisco, a contact person, etc. Why do they need all that? And then there is still the controversy about the airlines giving passenger information to the TSA and possibly other US agencies. I just don’t know what they use all this information for. To flag potentially dangerous passengers? What was the rate of false positives for that? I wish everyone had laws as stringent as the EU’s for personal data. At least I would have a chance to find out what the data is that they have about me and possibly correct it!
Are you a non-US citizen, and if so, did you enter the US lately? Yes? Picture taken, finger prints (soon to be 10, not just 2). Even more data they collect. I’ve got to tell you, it’s not just the wait in the immigration hall that annoys me. It’s all the data they collect. And that’s what triggered my post. I wouldn’t have that much of a problem, if they actually told me what they were going to do with the data and kept it safe.
Maybe they are starting to rethink the “data collection” after more and more of the US agencies are suffering data leaks. Now the TSA itself. Hopefully they realize that they should either start to be serious about data security or stop collecting information!
March 6, 2007
I love travelling, not because I have to cram myself into a small seat for 9 hours, but because I usually get a lot of reading done. I was reading this paper about Preparing for Security Event Management by the 360is group. I like the article, there are a lot of good points about what to look out for in a SIM/SEM/ESM deployment. However, some fundamental concepts I disagree with:
The first step in deploying a SEM (Security Event Management Solution) should be to get an inventory, to do an assessment. At least according to the paper. Well, I disagree. The very first step has to be to define the use-cases you are after. What’s the objective? What are you hoping to get out of your ESM (Enterprise Security Manager; I use these terms interchangeably here)? Answer this question and it will drive the entire deployment! Out of the use-cases you will learn what data sources you need. Then you will see how much staff you need, procedures will result from that, etc.
The second step, after the use-case development, should be the assessment of your environment. What do you have? Get an inventory of logging devices (make sure you actually also capture the non-logging security devices!) and all your assets. I know, you are going to tell me right away that there is no way you will get a list of all assets, but get at least one of your critical ones!
Another point that I disagree with is the step about “Simplify”. It talks about cleaning up the security landscape. Throwing out old security devices, getting logging configured correctly, etc. Well, while I agree that the logging of all the devices needs to be visited and configured correctly, the task of re-architecting the security environment is not part of an ESM deployment. You will miserably fail if you do that. The ESM project will be big enough as it is, don’t lump this housekeeping step into it as well. This is really a separate project that falls under: “Do your IT security right”.
January 30, 2007
I am still waiting for that one company which is going to develop the universal agent!
What am I talking about? Well, there is all this agent-based technology out there. You have to deploy some sort of code on all of your machines to monitor/enforce/… something. The problem is that nobody likes to run these pieces of code on their machines. There are complicated approval processes, risk analysis issues, security concerns, etc. which have to be overcome. Then there is the problem of incompatible code, various agents running on the same machine, performance problems, and so on.
Why does nobody build a well-designed agent framework with all the bells and whistles of remotely managed software? Deployment, upgrades, monitoring, logging, etc. Then make it a plug-in architecture. You offer the most important functionality already in the agent and let other vendors build plug-ins which do some actual work. You would have to deploy and manage exactly one agent, instead of dozens of them.
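A very rough sketch of what the core of such a framework could look like. The `Plugin` interface, the `Agent` class, and the `UptimePlugin` example are hypothetical, and remote deployment, upgrades, and management are left out completely.

```python
from abc import ABC, abstractmethod


class Plugin(ABC):
    """Interface a vendor would implement; the agent itself does the rest."""

    name = "unnamed"

    @abstractmethod
    def collect(self):
        """Return a dictionary with whatever this plug-in monitors."""


class Agent:
    """The single deployed agent: it hosts plug-ins and isolates failures."""

    def __init__(self):
        self._plugins = {}

    def register(self, plugin):
        if plugin.name in self._plugins:
            raise ValueError("plug-in already registered: " + plugin.name)
        self._plugins[plugin.name] = plugin

    def run_once(self):
        results = {}
        for name, plugin in self._plugins.items():
            try:
                results[name] = plugin.collect()
            except Exception as error:
                # One broken plug-in must not take the other ones down.
                results[name] = {"error": str(error)}
        return results


class UptimePlugin(Plugin):
    name = "uptime"

    def collect(self):
        # Placeholder value; a real plug-in would query the operating system.
        return {"uptime_seconds": 12345}
```

The approval process, risk analysis, and performance testing would then happen once for the agent, and vendors would only ship plug-ins against a small interface like this one.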
Well, maybe this will remain wishful thinking.