I am really late to the game, but I finally read draft-ietf-syslog-protocol-23, the new draft revising the syslog protocol.
Here are some of my comments that I also submitted officially:
- Let me say this first: I really like some of the changes that have been incorporated.
- Syslog message facility: Why still keep this? The only reason I see people using the facility is to filter messages, and there are better ways to do that. Some of the pre-assigned groups are fairly arbitrary and not even really implemented in most OSs. UUCP subsystem? Who is still using that? I guess the reason for keeping it is backwards compatibility? If possible, I would really like this to be gone.
- Priority calculation: The whole priority field is funky. The priority does not really have any meaning. The order does not imply importance. Why have this at all?
- Timestamps: What’s the reason for having the “T” in the timestamp? Having looked at hundreds of different log formats, I have never seen anything like that. Why do this?
- Hostname: I am not comfortable with the whole hostname spec. I like that there is an ordering and people are supposed to use FQDNs, but there are many questions about this. To start with, in a lot of UNIX configurations, /etc/hosts contains an entry like
127.0.0.1  localhost.localdomain localhost
The second column is the FQDN (technically). Is that a name that can be used? Can you make it clear that this is not what should be used? Same for 127.0.0.1 or the loopback address in general. How does a machine know whether an IP address is static or dynamic? How does a logging application know? I don’t think you will ever know that. Did you mean a private versus a public address? That might be interesting. Furthermore, it should specify which interface’s IP address to use. The interface that the message is sent out on?
- PROCID: The text is imprecise. This number is not the process ID of the syslog process, it’s the ID of the writing process. The third paragraph talks about detecting restarted applications and somehow mixes in the syslog process (“might be assigned the same process ID as the previous syslog process”). This is not clear at all and very confusing.
- MSGID: Make clear that this ID is local to the application. It’s not a global ID at all.
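To make the PRI and timestamp points concrete, here is a minimal Python sketch. It assumes the usual syslog convention that PRI is the facility multiplied by 8 plus the severity; the function name `decode_pri` and the constant `MAX_PRI` are made up for illustration. It also shows that the standard library parses the timestamp with or without the “T”.

```python
from datetime import datetime

# Assumption: highest valid PRI is facility 23, severity 7 (23 * 8 + 7).
MAX_PRI = 191


def decode_pri(pri):
    """Split a PRI value into (facility, severity), where PRI = facility * 8 + severity."""
    if not 0 <= pri <= MAX_PRI:
        raise ValueError(f"PRI out of range: {pri}")
    return divmod(pri, 8)


# <165> from the sample message further down decodes to facility 20, severity 5.
facility, severity = decode_pri(165)

# The same instant, written with and without the "T" separator.
with_t = datetime.fromisoformat("2003-08-24T05:14:15.000003-07:00")
without_t = datetime.fromisoformat("2003-08-24 05:14:15.000003-07:00")
```

Both timestamp spellings parse to the same value here, so the separator costs a parser nothing. The PRI number, on the other hand, has to be taken apart before either half can be used for filtering.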
The biggest issue I have around the SD-ID field:
- I like that the user can extend the set of registered IDs.
- Why is this structure so complicated? Why not go with a simple set of key-value pairs? To parse the structured data, you need to keep state: you need to remember the SD-ID for each SD-PARAM. Why introduce this? Just stick with simple key-value pairs. That makes parsing easier. Much easier. And it makes the events easier to produce as well.
- By keeping an explicit message field (the unstructured part), you encourage people to still log in that way. I recommend using an explicit field (or parameter) that can be used to include human readable text. Instead of this:
<165>1 2003-08-24T05:14:15.000003-07:00 192.0.2.1 myproc 8710 - - %% It's time to make the do-nuts.
use:
<165>1 2003-08-24T05:14:15.000003-07:00 192.0.2.1 myproc 8710 - message="%% It's time to make the do-nuts."
or really:
1 2003-08-24 05:14:15.000003-07:00 host=192.0.2.1 process=myproc procid=8710 message="%% It's time to make the do-nuts."
- I definitely like the consideration of some of the special fields (structured data IDs). However, they should be used as simple keys (or parameters) that have special meaning.
- Parameter – origin: What does it mean to have multiple origin IPs? Is that a syslog forwarding chain? The document does not say anything about that. Also, we already have the host field at the beginning of the syslog message. What’s the relationship to that? Or is origin something completely different?
- Parameters – I would really like to see some use-cases for all of the IDs, especially the sequenceId. I am assuming this is something that the syslog daemon assigns, not the logging application. Right? I think that needs to be clearer. For the sequenceId, what happens for forwarded messages? Are these IDs local? Are they forwarded along with a message? Also, how does the logging application know about the timeQuality? Or if that is something that the syslog daemon assigns, how does it know?
- I would really like to see the parameters go away and have a generic key-value extension. In addition, IANA should have a set of allowed/defined keys. The parameters should be part of those. Each key has a special meaning (semantics). There should be a whole lot of them: src_ip, user_name, etc. Each producer should be free to add additional keys, realizing that not all consumers would understand their semantics. However, the consumers could still read them.
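As a rough illustration of how little a consumer needs for the flat key-value style suggested above, here is a Python sketch. The format is only my proposal from this post, not anything in the draft, and the names `PAIR_RE` and `parse_pairs` are made up for the example. It handles bare values and double-quoted values, and leaves backslash escapes inside quotes untouched.

```python
import re

# Hypothetical flat format: key=value or key="quoted value".
# Backslash-escaped characters are allowed inside quotes but are not unescaped here.
PAIR_RE = re.compile(r'(\w+)=(?:"((?:[^"\\]|\\.)*)"|(\S+))')


def parse_pairs(text):
    """Return a dict of all key-value pairs found in text, ignoring anything else."""
    pairs = {}
    for match in PAIR_RE.finditer(text):
        key, quoted, bare = match.groups()
        pairs[key] = quoted if quoted is not None else bare
    return pairs


# The "or really" sample line from above; the version and timestamp prefix
# contains no "=" and is simply skipped by this sketch.
line = ('1 2003-08-24 05:14:15.000003-07:00 host=192.0.2.1 process=myproc '
        'procid=8710 message="%% It\'s time to make the do-nuts."')
fields = parse_pairs(line)
```

No state is carried from one pair to the next: each match stands on its own, and a consumer that does not know a key can still read its value.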
That’s it for now… Let’s see what some of the reactions are going to be.
The T in date/time is in ISO 8601.
Comment by Vincent Bernat — December 18, 2008 @ 7:56 pm
$ logger -p local0.notice -t HOSTIDM -f /dev/idmc “miss you Raffy”
Comment by Kris — December 19, 2008 @ 12:35 am
[…] logs, and generic IT data. Recently a new syslog RFC was published. I was much too late to actually comment on it. It has good intentions, but it is definitely not what I would like it to be. CEE is still […]
Pingback by Security Predictions for 2009 » Raffy — December 25, 2008 @ 1:07 am
I’ve finally begun to reply to your post, here is the first response to PRI and facility.
Rainer
Comment by Rainer Gerhards — March 17, 2009 @ 6:52 am
As the author of sphere of influence, I know what you mean. It’s extremely frustrating to create tools when one vendor’s implementation differs greatly from another’s. I created a visualization tool for syslog events, but I’m having to create the tool vendor by vendor…
Comment by Darren M — April 21, 2009 @ 3:01 pm