December 18, 2008

Comments on the Syslog Protocol Internet Draft

Category: Log Analysis - Raffael Marty @ 6:08 pm

I am really late to the game, but I finally read draft-ietf-syslog-protocol-23, the new draft revising the syslog protocol.

Here are some of my comments that I also submitted officially:

  • Let me say this first: I really like some of the changes that have been incorporated.
  • Syslog message facility: Why keep this at all? The only reason I see people use the facility is to filter messages, and there are better ways to do that. Some of the pre-assigned groups are fairly arbitrary and not even really implemented in most OSs. UUCP subsystem? Who still uses that? I guess the reason for keeping it is backwards compatibility. If possible, I would really like this to be gone.
  • Priority calculation: The whole priority field is funky. The priority does not really have any meaning, and the order does not imply importance. Why have this at all?
  • Timestamps: What’s the reason for having the “T” in the timestamp? Having looked at hundreds of different log formats, I have never seen anything like that. Why do this?
  • Hostname: I am not comfortable with the whole hostname spec. I like that there is an ordering and people are supposed to use FQDNs, but there are many questions about this. To start with, in a lot of UNIX configurations, /etc/hosts contains an entry like
    127.0.0.1   localhost.localdomain  localhost
    The second column is (technically) the FQDN. Can that one be used? Can you make it clear that this is not what should be used? The same goes for 127.0.0.1 or the loopback address in general. How does a machine know whether an IP address is static or dynamic? How does a logging application know? I don’t think you will ever know that. Did you mean a private versus a public address? That might be interesting. Furthermore, the draft should specify which interface’s IP address to use. The interface that the message is sent out on?
  • PROCID section: The text is imprecise. This number is not the process ID of the syslog process; it’s the ID of the writing process. The third paragraph talks about detecting restarted applications and somehow mixes in the syslog process (“might be assigned the same process ID as the previous syslog process”). This is not clear at all and very confusing.
  • MSGID: Make clear that this ID is local to the application. It’s not a global ID at all.
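
To make the priority comment concrete: the number in angle brackets at the start of a syslog message packs facility and severity into one value, computed as facility * 8 + severity. Here is a minimal Python sketch that splits it apart again. The function name `decode_pri` and the short severity labels are my own choices for illustration, not something the draft defines.

```python
import re

# Short labels for severities 0..7 (illustrative names, not taken from the draft).
SEVERITIES = ["emerg", "alert", "crit", "err", "warning", "notice", "info", "debug"]

def decode_pri(message):
    """Return (facility, severity) decoded from the leading <PRI> of a syslog message."""
    match = re.match(r"<(\d{1,3})>", message)
    if match is None:
        raise ValueError("message does not start with <PRI>")
    pri = int(match.group(1))
    # PRI = facility * 8 + severity, so divmod recovers both parts.
    facility, severity = divmod(pri, 8)
    return facility, severity

facility, severity = decode_pri("<165>1 2003-08-24T05:14:15.000003-07:00 192.0.2.1 myproc 8710 - - hello")
print(facility, severity, SEVERITIES[severity])
```

For the `<165>` used in the example messages further down, this gives facility 20 (local4) and severity 5 (notice).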

The biggest issue I have is around the SD-ID field:

  • I like that the user can extend the set of registered IDs.
  • Why is this structure so complicated? Why not go with a simple set of key-value pairs? To parse the structured data, you need to keep state: you need to remember the SD-ID for each SD-PARAM. Why introduce this? Just stick with simple key-value pairs. That makes parsing much easier, and it makes the events easier to produce as well.
  • By keeping an explicit message field (the unstructured part), you encourage people to still log in that way. I recommend using an explicit field (or parameter) that can be used to include human readable text. Instead of this:
    <165>1 2003-08-24T05:14:15.000003-07:00 192.0.2.1 myproc 8710 - - %% It's time to make the do-nuts.
    use:
    <165>1 2003-08-24T05:14:15.000003-07:00 192.0.2.1 myproc 8710 - message="%% It's time to make the do-nuts."
    or really:
    1 2003-08-24 05:14:15.000003-07:00 host=192.0.2.1 process=myproc procid=8710 message="%% It's time to make the do-nuts."
  • I definitely like the consideration of some of the special fields (structured data IDs). However, they should be used as simple keys (or parameters) that have special meaning.
  • Parameter – origin: What does it mean to have multiple origins IPs? Is that a syslog forwarding chain? The document does not say anything about that. Also, we already have the host field in the beginning of the syslog messages. What’s the relationship to that? Or is origin something completely different?
  • Parameters – I would really like to see some use-cases for all of the IDs, especially the sequenceId. I am assuming this is something that the syslog daemon assigns, not the logging application. Right? I think that needs to be clearer. For the sequenceId, what happens for forwarded messages? Are these IDs local? Are they forwarded along with a message? Also, how does the logging application know about the timeQuality? Or if that is something that the syslog daemon assigns, how does it know?
  • I would really like to see the parameters go away and have a generic key-value extension. In addition, IANA should have a set of allowed/defined keys. The parameters should be part of those. Each key has a special meaning (semantics). There should be a whole lot of them: src_ip, user_name, etc. Each producer should be free to add additional keys, realizing that not all consumers would understand their semantics. However, the consumers could still read them.
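
To illustrate why flat key-value pairs are easy to consume, here is a minimal Python sketch of a parser for the format proposed above. It is only a sketch under assumptions: the names `PAIR` and `parse_pairs` are made up, values are assumed to be either bare tokens or double-quoted strings with backslash escapes, and anything without a `key=` prefix (such as the leading version and timestamp) is simply ignored.

```python
import re

# key=value, where value is a double-quoted string (backslash escapes allowed)
# or a bare token without whitespace. This grammar is an assumption for the sketch.
PAIR = re.compile(r'(\w+)=(?:"((?:[^"\\]|\\.)*)"|(\S+))')

def parse_pairs(text):
    """Parse 'key=value key="quoted value"' text into a dict in a single pass."""
    fields = {}
    for key, quoted, bare in PAIR.findall(text):
        if bare:
            fields[key] = bare
        else:
            # Drop the backslash in front of escaped characters such as \" or \\.
            fields[key] = re.sub(r"\\(.)", r"\1", quoted)
    return fields

line = ('1 2003-08-24 05:14:15.000003-07:00 host=192.0.2.1 process=myproc '
        'procid=8710 message="%% It\'s time to make the do-nuts."')
print(parse_pairs(line))
```

Note that no state is carried from one pair to the next: every key stands on its own, which is the property the nested SD-ID/SD-PARAM structure gives up.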

That’s it for now… Let’s see what some of the reactions are going to be.


December 7, 2008

Displaying Time in Link Graphs

Category: Visualization - Raffael Marty @ 5:11 pm

I have been using link graphs a lot in my work of visualizing security data. They are a great method to display relationships between entities. I guess the most used link graph is one that shows communications between machines. The nodes represent the communicating machines, and arrows connecting them show flows.

You can use color and shape to encode more information, such as the amount of traffic transmitted or a machine’s role. I even extended the graphs to show three types of nodes: source nodes, event nodes, and target nodes.

[Figure: three-node configuration (source, event, destination)]

This lets me encode more information in a graph, such as the machines communicating and the service they used, as shown on the right.


All of this has been incredibly useful. However, for the longest time I have been thinking about how to include time into link graphs. To date, I don’t really have a good solution. Here are some things I have considered:

  1. Animation: This is the most obvious solution. You use a tool that replays the data, with fast forward to speed up the animation. Ideally the tool would let you move forward and backward through the animation, just like the controls you use to watch a movie. This approach has the disadvantage of change blindness: there are changes that the human brain will not notice. A probably even bigger problem is that layout algorithms are generally not built for incremental updates. Adding new nodes to a graph moves the existing ones around, and the viewer cannot locate them anymore. [I wrote about this in my book in Chapter 3.] You can counter the instability by assigning each node a pre-computed location, for example by using a hashing algorithm.
  2. Color: The idea would be to assign color to nodes or edges, using some sort of encoding to show time. For example, the lighter a color, the later it happened. This approach is very limited. There are only so many colors you have available. The human eye can only really differentiate about 8 hues. Any more and it gets really hard to tell which node is brighter. [It might be more than 8, but the number is really low.]
  3. Using arrows that order the connections: This was an idea I had a while back. I don’t think it’s actually useful, but here it is anyway: You generate a link graph and then introduce a set of arrows that connect the edges. The arrows indicate time, so you connect the earliest event with the second earliest, and so on. This will really clutter the display and is probably really hard to read.
  4. Parallel coordinates: Add a coordinate for time. This can help in some instances. In others the time axis will just be completely cluttered. But it is worth a try.
  5. Multiple, linked views: The idea here is to generate your link graph and, in addition, generate a display that encodes time, for example a time table. On the x-axis you show time, and on the y-axis you show the source node’s field. The problem here is how to link the two displays. Interactivity is almost a must, so that you could click on a node and see it in the time chart. Even better would be if you could encode the relationships in the time table. However, that might be hard.
  6. Using a time-based layout algorithm: I am too bad of a coder to actually implement this idea. I am also not sure what the result would be like. The idea would be to define the attraction between nodes as the time distance. There are many problems. What do you do if a connection shows up at multiple instances in time? I haven’t thought this through. But maybe there is a possibility here.
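
The hashing idea from the animation item can be sketched in a few lines of Python. This is just one possible reading of it, not a tested layout method: the function name `stable_position` and the 1000x1000 canvas are assumptions, and I use `hashlib` because Python's built-in `hash()` for strings changes between runs.

```python
import hashlib

def stable_position(node_id, width=1000, height=1000):
    """Map a node identifier to a fixed (x, y) position derived from its hash."""
    digest = hashlib.sha256(node_id.encode("utf-8")).digest()
    # Use separate slices of the digest for x and y so they are not tied together.
    x = int.from_bytes(digest[:4], "big") % width
    y = int.from_bytes(digest[4:8], "big") % height
    return x, y

# Two animation frames: the second one adds a node, but the positions of the
# existing nodes do not move because they depend only on the node's name.
for frame in (["10.0.0.1", "10.0.0.2"], ["10.0.0.1", "10.0.0.2", "10.0.0.3"]):
    print({node: stable_position(node) for node in frame})
```

The price for this stability is that the positions ignore the graph structure entirely, so connected nodes can land far apart and unrelated nodes can overlap.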

Unfortunately, all of these solutions have drawbacks. I think I favor timecharts for showing time-based activity. But then, the number of entities you can track is limited, etc.

Does anyone have a solution for showing time-based activity? Even if it’s animation, what are some of the key things that would help make the animation easy to follow?
