Not too long ago, I posted an entry about CEE, the Common Event Expression standard, which is a work in progress led by Mitre. I was one of the founding members of the working group and I have been in discussions with Mitre and other entities for a long time about common event formats. Anyways, one of the comments on my blog entries pointed to an effort called Distributed Audit Service (XDAS). I had not heard of this effort before and was a bit worried that we had started something new (CEE) where there was already a legitimate solution. That’s definitely not what I want to do. Well, I finally had time to read through the 100! page document. It’s not at all what CEE is after. Let me tell you why XDAS is not what we (you) want:
- How many people are actually using this standard for their audit logs? Anyone? I have been working in the log management space for quite a while and not just on the vendor side, but also in academia. I have NEVER heard of it before. So why should I use this if nobody else is? In contrast, CEF from ArcSight is in use not just by ArcSight itself, but also by many of its partners.
- I just mentioned it before. 100 pages! When’s the last time you read through 100 pages? I just did. Took me about an hour to read the document and I skipped a lot of the API definitions. My point being: A standard should be at most 10 pages! It’s not just the length of the document, it’s the complexity which comes with it. Nobody is going to read and adhere to this. The more you demand, the more mistakes are made by the vendors who implement it. Oh, and please don’t tell me to only read pages 1-10! Make it 10 pages if you want me to read only those.
- How much time does it take to actually implement this? Has anyone done it? How long did it take you? I bet a couple of weeks, plus QA, etc. Much too long. I am NEVER going to make that investment.
- Let’s get into details. Gosh. Why does this define APIs? Don’t dictate how I should do things. A standard needs to define the common interface, not how I have to open a stream and save files and so on. It’s overkill. The implementations will differ and they should! And why lock yourself into this API transport? Can you support other transports?
- It seems that there is an XDAS service that I need to integrate with. What is that? That’s not clear to me. Can I exchange logs (audit records) between just two parties or do I need an intermediary XDAS service? I am confused.
- Keep the scope of the standard to what it wants to accomplish: event interchange! This thing talks about access control, RBAC, filtering, etc. Why? Please! That’s absolutely unnecessary and should not be part of an interchange standard!
- In general, I am quite confused about the exact setting of this. Are we only talking about audit records? Security related only? What about other events? I want this to be very generic! Don’t give me a security specific solution! The world is opening up! We need generic solutions!
- What kind of people wrote this? Using percent signs to escape entries and colons to separate them? Must be from the old AS/400 world … Sorry… I just had to say this, in a world of CSV and key-value pairs it is sort of funny to see these things.
- The glossary could really benefit from a definition of event and log.
- The standard requires a measure of uncertainty for timestamps. I have never heard of this. Could you please elaborate? How can I measure time uncertainty???
- In section 2.5, access IDs and principal IDs are mentioned. What are those?
- Although the standard does not position itself with log management, it talks about alarms and actions. Why would you need to mention actions in this context at all?
- A pointer to the original log entry? How do you do that? Log rotation, archiving, let alone the mere problem of how to reference the original entry to start with.
- Why does the standard require the length of a record to be communicated? Just drop that.
- The caller has to provide an event_number. I like it. But sorry folks, syslog does not have it. How do you get that in there?
- Originator identity: It specifies that this should be the UNIX id. ID of what? The process that logs? The user that initiated the action? The remote user that sent the packet to this machine to trigger some action? How do you know that?
- I like the list of XDAS events. It’s a good start, but it’s definitely not all we need. We need much more! Again a nice list to start with.
- Why is there so much information encoded in the outcome, instead of defining individual entries? There might be a valid reason, but please motivate these decisions.
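To make the point about percent signs and colons a bit more concrete, here is a minimal Python sketch of what a colon-separated, percent-escaped record looks like next to plain key-value pairs. This is an illustration only: the field names (`event`, `target`, `message`) are hypothetical, and the `%XX` hex escaping borrowed from `urllib.parse` is an assumption on my part, not the record layout or escaping rules from the XDAS document.

```python
from urllib.parse import quote, unquote

# Hypothetical field names, NOT the actual XDAS record layout.
FIELD_NAMES = ["event", "target", "message"]

def encode_record(values):
    """Join values with colons; any ':' or '%' inside a value is percent-escaped.

    Assumption: URL-style %XX escaping stands in for whatever the spec defines.
    """
    return ":".join(quote(v, safe="") for v in values)

def decode_record(line):
    """Split on the colon separators, then undo the percent-escaping."""
    return [unquote(part) for part in line.split(":")]

def as_key_value(values):
    """The same record as key-value pairs, where position carries no meaning."""
    return dict(zip(FIELD_NAMES, values))

values = ["login", "host:22", "100% done"]
line = encode_record(values)
print(line)
print(decode_record(line))
print(as_key_value(values))
```

In the delimited form, the reader has to know the field order and the escaping rules up front. In the key-value form, every value carries its own name.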
That’s what I have for a quick review. Again, no need for us to stop working on CEE. There is still a need for a decent interoperability standard.
Raffy, To quote Shakespeare,
I’ll just take your comments point by point, if you don’t mind:
1. Novell is using XDAS. What makes ArcSight’s use of CEF more significant?
2. You’ve clearly not read many standards – have you seen the IETF RFC for the HTTP protocol? – It’s a fairly simple protocol, but it’s well over 100 pages. Pick any other standard (SNMP, SMTP, FTP, SSH, etc.) and check how many pages it has. Don’t limit the number of pages to the material you can read before you fall asleep. Standards are like legal documents to technicians – they should communicate intent clearly, and that takes words in a row.
3. By implement, do you mean the audit system itself, or application instrumentation? You don’t implement any system worth having without some effort. A couple of weeks? More like a couple of months before the first release – as it should be. We’re not writing in PHP or Javascript here. This is security code, and it has to be secure, which requires careful thought and significant effort.
4. The XDAS standard defines three very critical aspects of auditing: API, Record Format, and Taxonomy. API is defined because too many people have reinstrumented for different audit systems too many times. Most of them can relate to this design choice. Record format allows everyone to work together on analysis systems. Taxonomy exists to allow everyone to work in the same context. Transport is specifically NOT defined. It’s indicated on the block diagram as necessary, but purposely specified as implementation defined.
5. The XDAS service exists to coordinate logging from multiple processes into a single sink. This shouldn’t be that hard to understand. A single process should always (for security and architectural reasons) coordinate the funnelling of data from multiple sources into a single sink.
6. XDAS is not an event interchange standard – it’s a distributed security event management standard. It’s designed to ensure that the proper data gets from multiple sources to the proper place without tampering. This requires some degree of access control. Filtering is discussed simply for the sake of performance, and I don’t see anything about RBAC in the spec.
7. Yes, audit is all about security events, and it should be strictly limited to security events. XDAS very purposely does NOT define a generic event bus. We have WAY too many of those today, and none of them do what they need for proper auditing. The last thing we want is for people to dump their debug events into the audit log. Most audit logs today contain 99 percent useless garbage (from an auditing perspective), which means that analysis tools spend most of their processing time just filtering and correlating. The point is that engineers should think of auditing as a separate and distinct process from logging.
8, 9, 10, 14. On these points, I agree – the record format could use some updating. The XDAS preliminary specification was written in 1998, before the days of XML and before name/value pairs were as popular as they are today. Even so, some of the fields reflect the growing desire to move in this direction. The Open Group has recently re-established the XDAS working group to update this specification to meet today’s needs. The OpenXDAS project on sourceforge.net is a work in progress. There is one person working full-time on this project, and I can only do so much in a day. But I and those who help me occasionally are working to make things better with each release. Note that the current release is 0.5 – that’s pre-1.0, meaning it’s not what we would consider complete yet. And finally, some of the fields are being reconsidered. After all, we’re starting with a preliminary specification, not an existing standard.
11. I’ve no idea what the original author intended by these concepts (access and principal id’s). Does it matter? We’ll probably remove this prose from the spec, if someone in the working group can’t provide a good explanation for them.
12. Why mention actions and alarms? Actions and alarms have nothing to do with log management – they relate to event channel contents. One of the key benefits of a true audit system is the ability to take immediate action based on various types of events being published. A common action is to simply notify a system administrator of possible attack or system failure. While it’s true that most analysis is done on logs, there is a very important place in auditing for real-time alarms and actions built into the event channel itself. XDAS specifies their optional existence, and indicates that the actual nature of these auxiliary services is implementation defined.
13. Log entry pointers are more for archival interest than for actual analysis. We don’t expect analysis tools to go traipsing back to the original source for details on an XDAS record, although nothing in the specification precludes an implementation from doing so.
14. Again, I agree with you, the record format is not perfect. But it’s not bad either. We’re working on updating it.
15. While the world of today may revolve around syslog, we’re hoping that syslog is not considered the ultimate auditing solution for everyone. We’re actually hoping to improve on today’s tools – if we didn’t think there was room for improvement, then we’d be writing to syslog ourselves. We’re not the only people that feel this way – Red Hat is working on an entire auditing infrastructure for RH Linux that has nothing to do with syslog. Unfortunately, it’s very specific to Linux. We will integrate OpenXDAS with this Linux-only system (LAF – linux audit framework) on Linux platforms.
16. The concepts of Originator, Target and Initiator are all well-defined in the spec. Originator is the identity of the service or host that is logging the event. (If you’d read past the first 10 pages, you’d have known that. 🙂)
17. The Taxonomy is being improved by the Open Group working group as I write this. We’re adding events for work flow, among others of security relevance.
18. Outcome encoding, as well as event encoding, is being looked at for update by the working group.
The fact is, Novell didn’t want to invent something new (again). We wanted to use an existing standard. When we went looking for such a standard over a year ago, we found XDAS – and that’s about all. Had we known about CEF, we might have gone in that direction, but CEF wasn’t being advertised actively on the Internet back then. We did find a few other efforts such as CMU’s auditing system (eddy – end-to-end diagnostics discovery), but nothing was really as close to implementation ready as XDAS was.
Our first version of OpenXDAS was to provide a pure reference implementation of the standard – implementing the existing specification as closely as possible. The second version (2.x?) will follow the working group updates. This takes time of course, and we wanted something our customers and internal development groups can use today.
John Calcote
System Architect
Novell, Inc.
Comment by John Calcote — June 28, 2007 @ 2:56 pm
Can you post the CEF open standard or is it Arcsight proprietary?
Comment by DC — June 28, 2007 @ 3:48 pm
Here ya go… The CEF standard.
Comment by Raffael Marty — June 29, 2007 @ 10:58 am
Is someone confusing CEF with CEE?
Comment by Anton Chuvakin — July 2, 2007 @ 2:49 pm