May 10, 2007

Information Gathering – More Leaks

Category: Log Analysis,Security Information Management — Raffael Marty @ 2:10 pm

Although I work in the log/event management space and therefore help organizations to gather more information about people, I am a big opponent of personal information collection.

I flew back from Switzerland to San Francisco after my Christmas break and was in for a surprise. Not only did they want my passport (which I can sort of understand ;), but they also wanted me to fill out an additional form with my address in San Francisco, a contact person, etc. Why do they need all that? And then there is still the controversy about the airlines giving passenger information to the TSA and possibly other US agencies. I just don’t know what they use all this information for. To flag potentially dangerous passengers? What was the rate of false positives for that? I wish everyone had laws as stringent as the EU’s for personal data. At least I would have a chance to find out what data they have about me and possibly correct it!

Are you a non-US citizen, and if so, did you enter the US lately? Yes? Picture taken, fingerprints (soon to be 10, not just 2). Even more data they collect. I’ve got to tell you, it’s not just the wait in the immigration hall that annoys me. It’s all the data they collect. And that’s what triggered my post. I wouldn’t have that much of a problem if they actually told me what they were going to do with the data and kept it safe.

Maybe they are starting to rethink the “data collection” after more and more of the US agencies are suffering data leaks. Now the TSA itself. Hopefully they realize that they should either start to be serious about data security or stop collecting information!

April 23, 2007

Common Event Expression (CEE)

Category: Log Analysis — Raffael Marty @ 4:31 pm

I have some more details on the CEE effort, which is captured in this CEE Brochure. The most interesting part is probably page two where the benefits are outlined. This effort will continue by tackling one of the four standard areas after the other. I have a feeling that we will tackle the taxonomy part first. I can already see it, this is going to be HARD!

April 19, 2007

Standard Logging Format – Common Event Expression (CEE)

Category: Log Analysis — Raffael Marty @ 8:08 pm

As Anton mentioned, there is a new event logging standard in the works. What Anton did not mention are the four areas that you need to talk about when you talk about a logging standard. Well, here they are:

  1. Common Event Syntax, like CEF
  2. Common Event Taxonomy. This is where you attach “meaning” or “semantics” to an event. There are a few proprietary ones, nothing standardized though.
  3. Common Event Transport
  4. Common Event Representation, defining what a device should log. An operating system should log user logins for example.

And don’t mix these things. The transport has nothing to do with the syntax! I don’t want to implement a SOAP environment to transport some events. Unfortunately a few companies and even standards have made that mistake! I don’t want to mention anyone here…
Stay tuned for http://cee.mitre.org to go live and learn more about all of this.

February 9, 2007

Web Server Log 3D Plot

Category: Log Analysis,Visualization — Raffael Marty @ 11:53 pm

I came across this very well done Web Log Analysis. The author uses a 3D scatter plot to plot certain aspects of his Web server log. He uses gnuplot to do so. What I like in particular is his discussion of the output and the way he positions scatter plots to find correlated event fields.

February 4, 2007

Anonymizing Log Entries

Category: Log Analysis,Visualization — Raffael Marty @ 2:12 pm

I am finally biting the bullet. I will start to really anonymize my graphs. In order to do so, I was trying to find a tool on the Web which does that. Well, as you can probably imagine, there is none which does exactly what I wanted. So I wrote my own anonymization script. To save you some hassle, also download the Anonymous.pm file.

This is how you use the script on a CSV file:

cat /tmp/log | ./anonymize.pl -c 1 -p user

This will replace all the values in column one with usernames of the form: "userX". If you are anonymizing IP addresses, run the tool without the prefix (-p) and it will do that automatically for you.

Credits to John Kristoff who wrote the Anonymous.pm module for Perl.
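
For readers who just want to see the idea, here is a minimal Python sketch of column anonymization on CSV data. It is not the anonymize.pl script and it does not reproduce whatever Anonymous.pm does with IP addresses. The function name anonymize_column, the sample rows, and the choice to map IPs to sequential 10.x.x.x addresses are all assumptions made for illustration.

```python
import csv
import io
import ipaddress


def anonymize_column(rows, column, prefix=None):
    """Replace the values in a 1-based column with stable pseudonyms.

    With a prefix, values become prefix1, prefix2, ... in order of first
    appearance. Without one, values are treated as IPv4 addresses and are
    mapped to sequential addresses starting at 10.0.0.1 (an arbitrary
    choice for this sketch, not what Anonymous.pm necessarily does).
    """
    mapping = {}
    base = int(ipaddress.IPv4Address("10.0.0.0"))
    for row in rows:
        row = list(row)
        value = row[column - 1]
        if value not in mapping:
            n = len(mapping) + 1
            if prefix is not None:
                mapping[value] = f"{prefix}{n}"
            else:
                mapping[value] = str(ipaddress.IPv4Address(base + n))
        row[column - 1] = mapping[value]
        yield row


# Made-up sample data: user, source IP, action.
log = "alice,192.0.2.10,login\nbob,192.0.2.11,login\nalice,192.0.2.10,logout\n"

by_user = list(anonymize_column(csv.reader(io.StringIO(log)), column=1, prefix="user"))
by_ip = list(anonymize_column(csv.reader(io.StringIO(log)), column=2))
```

The important property is that the same input value always maps to the same pseudonym, so a graph built from the anonymized data keeps its shape.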

January 7, 2007

Solving the Trivial Problems Over and Over and Over Again

Category: Log Analysis,Security Article Reviews — Raffael Marty @ 1:24 pm

I read a lot of research papers and security articles. I am getting so tired of seeing all these tools, research papers, and new algorithms that propose new approaches in computer security and then, as a proof, solve one of the “old” problems: detecting worms, portscans, and peer-to-peer traffic. Guys, it’s been done. We don’t need any more tools to do it. It’s easy and nothing to show off with!
Show me that other use-cases can be solved with your new approach. That will not only tell me that you actually thought about the problem space, but it will help the security community at large to tackle new problems (maybe some that they were not even aware of)!

December 5, 2006

SecViz – RSS Feed

Category: Log Analysis,Visualization — Raffael Marty @ 3:26 pm

It was a bad oversight that secviz.org did not have an RSS Feed. But now there is one! The feed contains all new content posted to the portal, including comments. Subscribe so you don’t have to check back all the time to see whether there is new content.

[tag]security visualization[/tag]

November 28, 2006

Log Visualization Portal – secviz.org

Category: Log Analysis,Visualization — Raffael Marty @ 2:04 am

I launched a new portal that deals with visualization of log files:

http://secviz.org

The portal can only survive if people, and that means you, take an active part in contributing content.

There are multiple resources available where community input is most welcome:

* Graph Exchange: The idea is that people can submit their graphs, explain why they think the graphs are useful, and how they generated them.
* Parser Exchange: To generate graphs, you need to parse your data. This is a place where you can submit your parsers.
* Links: A whole bunch of links around data analysis and visualization.
* Discussions: A free forum where you can start discussions around the topics of log visualization and analysis.

Let me know what you think and, most importantly, submit your graphs!

November 24, 2006

Linux Auditing – Again!

Category: Log Analysis,UNIX Security — Raffael Marty @ 5:06 pm

I keep running into these little annoyances in Linux. (And as I said here before, I love Linux, but there are some things which are just bad.) This time I was trying to see what happens if you lock an account. You didn’t even know you could do that?

passwd -l 

Do you know what syslog has to say about this?

Nov 14 16:35:12 zurich passwd[21226]: password for `test' changed by `root'

And even worse, if you unlock:

passwd -u 

Linux says:


Nov 14 16:35:12 zurich passwd[21226]: password for `test' changed by `root'

Great! What am I supposed to do with this? Is a password change really the same as a lock out of a user?
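
To make the problem concrete, here is a small Python sketch that parses the two syslog lines quoted above. The regular expression and the function name parse_passwd_event are my own assumptions for illustration and are not taken from any existing parser. The point is that after parsing, the lock and the unlock produce exactly the same event, so no downstream tool can tell them apart.

```python
import re

# Pattern for the passwd message shown above; written for this sketch only.
PASSWD_RE = re.compile(
    r"^(?P<timestamp>\w{3}\s+\d+ \d{2}:\d{2}:\d{2}) (?P<host>\S+) "
    r"passwd\[(?P<pid>\d+)\]: password for `(?P<user>[^']+)' "
    r"changed by `(?P<actor>[^']+)'$"
)


def parse_passwd_event(line):
    """Return the fields of a passwd syslog line as a dict, or None."""
    match = PASSWD_RE.match(line)
    return match.groupdict() if match else None


# The line logged for the lock and the line logged for the unlock.
lock_line = "Nov 14 16:35:12 zurich passwd[21226]: password for `test' changed by `root'"
unlock_line = "Nov 14 16:35:12 zurich passwd[21226]: password for `test' changed by `root'"

lock_event = parse_passwd_event(lock_line)
unlock_event = parse_passwd_event(unlock_line)

# Nothing in the parsed fields says whether this was a lock, an unlock,
# or a real password change.
indistinguishable = lock_event == unlock_event
```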

To continue on the path of auditing and such, have you tried to configure an automatic lock-out after a certain number of failed logins? Good luck. After a while you might find pam_tally. You have to use this PAM module to achieve that lockout. You can configure after how many failed passwords an account gets locked. Again, why is this in such a hidden module? Why not built-in? Is anyone going to rebuild the authentication sub-system? Please? And if you are at it, rethink the whole logging infrastructure too! Don’t forget to use a common log format, a specific fixed format that enforces certain information and is parsable! Stop logging copyright messages into syslog (Ok: dhclient?).

November 3, 2006

Interoperability Standards – Formats

Category: Log Analysis — Raffael Marty @ 12:32 am

There is all this talk about event interoperability standards or logging standards. Don’t we have enough of them? IDMEF, IDXP, SDEE, WELF, CBE, RDEP, OPSEC. All of them are approaches to solve the same problem: Simplify or enable the interoperability of devices and applications. Does anyone support these standards? No! The question is why? Here is my answer:
Have you ever looked at these standards? Noticed anything? These guys are all trying to solve many problems at once. I already blogged about the four different types of log standards that we need. One important thing is that the transport needs to be separated from the format! SDEE, for example, requires SOAP as a transport. Have you ever implemented SOAP messaging? What an effort. I don’t want to do it in my applications. I want something easy! Why not use simple transports? What about files or syslog? And when I say syslog, I don’t mean the gibberish you can log in the message, but I mean the transport. Very simple! Very easy to implement!
Some standards are using XML. It’s just too much work to implement XML messages. You need to keep track of the elements, the hierarchy, the attributes, validate against the DTD, the Schema, etc. And you need a transport that can support it. Nevertheless, there are a few advantages to XML: You can express lists and you can enforce a very well defined format. But that’s it.
So my point being, use a text-based format. Do we have any standards in that arena? Well, there is CEF (Common Event Format). And that’s it. I don’t know of any others. The standard is very well designed. And not by academics or people that have never seen a log file before, but by people that have seen hundreds of different log formats. A log standard needs some other considerations. Things like event IDs or severities. Things that an event consumer is interested in! But that’s a topic for another entry.
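
To show how little it takes to emit a text-based format, here is a rough Python sketch that builds a CEF-style line: a pipe-delimited header followed by key=value extensions. The function format_cef and the vendor and product names are made up for this example. The header layout and the escaping rules follow my reading of the CEF documentation, so check the spec before relying on them.

```python
def _escape_header(value):
    # In header fields, backslashes and pipes are escaped with a backslash.
    return str(value).replace("\\", "\\\\").replace("|", "\\|")


def _escape_extension(value):
    # In extension values, backslashes, equals signs and newlines are escaped.
    return (
        str(value)
        .replace("\\", "\\\\")
        .replace("=", "\\=")
        .replace("\n", "\\n")
    )


def format_cef(vendor, product, version, signature_id, name, severity, **extension):
    """Build one CEF-style line; the result can go to a file or to syslog."""
    header = "|".join(
        _escape_header(field)
        for field in (vendor, product, version, signature_id, name, severity)
    )
    ext = " ".join(f"{key}={_escape_extension(val)}" for key, val in extension.items())
    return f"CEF:0|{header}|{ext}"


# Hypothetical event: root locks the account of user test.
line = format_cef(
    "ExampleVendor", "ExampleProduct", "1.0", "100", "Account locked", 5,
    suser="root", duser="test",
)
```

Note that nothing here knows about the transport. The resulting string can be appended to a file or handed to syslog unchanged, which is exactly the separation argued for above.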
There is a second point that you can make against text-based formats (the first point being that lists are hard to express), which is speed. I completely agree, if you want speed, you need to go binary! Period. Use NetFlow as an example where you send some kind of a template first and then you send the messages in that format. However, there are other drawbacks: it’s harder to implement (you need preprocessing), not every transport is suited for it, etc.
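
Here is a toy Python sketch of the template-first idea. It is not the NetFlow wire format. The field names, the FIELD_TYPES table, and the comma-separated template message are all invented for illustration. The sender transmits the template once, and every record after that is a fixed-size binary blob with no field names in it.

```python
import socket
import struct

# Hypothetical field catalogue for this sketch: name -> struct format code.
FIELD_TYPES = {"src_ip": "4s", "dst_ip": "4s", "dst_port": "H", "bytes": "I"}


def template_to_struct(template_message):
    """Turn a template message (sent once) into field names and a packer."""
    fields = template_message.decode("ascii").split(",")
    # "!" selects network byte order with no padding between fields.
    packer = struct.Struct("!" + "".join(FIELD_TYPES[name] for name in fields))
    return fields, packer


def encode(fields, packer, event):
    return packer.pack(*(event[name] for name in fields))


def decode(fields, packer, data):
    return dict(zip(fields, packer.unpack(data)))


# Sender side: the template goes out first, then the compact records.
template_message = b"src_ip,dst_ip,dst_port,bytes"
fields, packer = template_to_struct(template_message)

event = {
    "src_ip": socket.inet_aton("10.0.0.1"),
    "dst_ip": socket.inet_aton("10.0.0.2"),
    "dst_port": 80,
    "bytes": 1234,
}
record = encode(fields, packer, event)

# Receiver side: it can only decode because it saw the template first.
decoded = decode(fields, packer, record)
```

This also shows the drawback mentioned above: a receiver that missed the template cannot make sense of the records, so the transport has to cope with that.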
So to conclude: We really need three logging standards:

  • text-based for ease
  • binary for speed
  • XML for complex structures