May 28, 2007
My R journey seems to be over… It’s just too complicated to get charts and plots right with R. I found a new library that I staerted playing with: chart director. I am using Perl to drive the graph generation and it’s actually fairly easy to use. There are some cumbersome pieces, such as seg-faults when you have NULL values in a matrix that you are drawing, but that’s things I can deal with ;
Here is a sample graph I generated: (I know, the color selection is fairly horrible!)
[tags]R, charts, visualization, perl[/tags]
May 27, 2007
I just finished reading the Information Dashboard Design book by Stephen Few. I love it. I am definitely going to use some of the concepts that I learned about for my own book. I especially like how Stephen illustrates Tufte’s principle of reducing the non-data ink in charts. Definitely a book people who are building dashboards should read!
[tags]dashboard,visualization,design,tufte[/tags]
May 26, 2007
Log analysis has shifted fairly significantly in the last couple of years. It is not about reporting on log records (e.g., Web statistics or user logins) anymore. It is all about pinpointing who is responsible for certain actions/activities. The problem is that the log files do oftentimes not communicate that. There are instances of logs (mainly from network centric devices), which contain IP addresses that are used to identify the subject. In other instances, there is no subject that can be identified in the log files at all (database transactions for example).
What I really want to identify is a person. I want to know who is to blame for deleting a file. The log files have not evolved to a point where they would contain the user information. It generally does not help much to know what machine the user came from when he deleted the file.
This all is old news and you probably are living with these limitations. But here is what I was wondering about: Why has nobody built a tool or started an open source project which looks at network traffic to extract user to machine mappings? It’s not _that_ hard. For example SMB traffic contains plain-text usernames, shares, originating machines, etc. You should be able to compile session tables from this. I need this information. Anyone? There is so much information you could extract from network traffic (even from Kerberos!). Most of the protocols would give you a fair understanding of who is using what machine at what time and how.
[tags]identify correlation, user, log analysis, user mapping[/tags]
May 24, 2007
I am teaching a workshop at FIRST in Seville in June about Visualizing Insider Threat Data. I recorded a PodCast introducing my workshop and talking about visualization in general.
[tags]visualization, visualization podcast, security[/tags]
May 15, 2007
I was just listening to this podcast about security information management (SIM) systems. Tom Bowers from Information Security magazine is talking about various topics in SIM. Unfortunately I have to disagree with Tom on a couple of points, if not more. But let me pick the couple I find most important:
- Visualization is a great tool to see attacks in real-time. However, you can only see where the attacks are coming from and not how many. What? Why would I not be able to visualize that? You can map that to edge size, node size, map it as a color to you nodes, etc. I don’t know what system he looked at to make this statement, but that’s wrong!
- Active Response is something that SIMs cannot do. Well. Wrong again. I could tell you how ArcSight is doing this with the Threat Response Manager (TRM), but that would be a vendor pitch. That’s why I am going to mention SEC, the simple correlation engine. It can execute an arbitrary action. Well, it’s not quantum leaps from there to imagine how you could issue a command to add an ACL to a router for example. To sum up: Active response is something SIMs can do! If you want to know how exactly you do this with SEC, read my chapter on event analysis in the new Snort book.
These were the main points where I disagree with Tom. He could have done a bit of a better job describing the benefits of visualization, but that’s another story.
[tags]arcsight,visualization[/tags]
May 11, 2007
I was trying to get my Ubuntu desktop to use Beryl, just like my laptop does. Unforunately, my NVidia drivers didn’t quite want to do what I wanted them to do. Long story short, at some point I remembered to check in the log files to see whether I could determine what exactly the problem was. Where should I look first? /var/log/messages
And right there it was:
May 11 11:15:12 zurich kernel: [ 2503.193111] NVRM: API mismatch: the client has the version 1.0-9631, but
May 11 11:15:12 zurich kernel: [ 2503.193114] NVRM: this kernel module has the version 1.0-9755. Please
May 11 11:15:12 zurich kernel: [ 2503.193115] NVRM: make sure that this kernel module and all NVIDIA driver
May 11 11:15:12 zurich kernel: [ 2503.193117] NVRM: components have the same version.
Beautiful. That’s exactly what I needed to know. But hang on a second. Isn’t this a syslog entry? Wow. It just hit me. While I really liked the verbose output, I was trying to think about how I would parse this thing. How would I normalize this message to later apply machine logic to further process this? Aweful!
I guess my conclusion would be that we need two types of Syslogs! One that logs machine readable log entries and one for humans. Is that really what we want? Maybe the even better solution would be to only have a machine readable log and then provide an application that can read the log and blow the contents up to make it readable for humans!
Where is CEE when you need it?
May 10, 2007
Although I work in the log/event management space and therefore help organizations to gather more information about people, I am a big opponent of personal information collection.
I flew back from Switzerland to San Francisco after my Christmas break and was in for a surprise. Not only did they want my passport (which I can sort of understand ;), but they also wanted me to fill out an additional form with my address in San Francisco, a contact person, etc. Why do they need all that? And then there is still the controversy about the airlines giving passenger information to the TSA and possibly other US agencies. I just don’t know what they use all this information for? To flag potentially dangerous passengers? What was the rate of false positives for that? I wish everyone had stringent laws as the EU for personal data. At least I would have a chance to find out what the data is that they have about me and possibly correct it!
Are you a non-US citizen, and if so, did you enter the US lately? Yes? Picture taken, finger prints (soon to be 10, not just 2). Even more data they collect. I’ve got to tell you, it’s not just the wait in the immigration hall that annoys me. It’s all the data they collect. And that’s what tirggered my post. I wouldn’t have that much of a problem, if they actually told me what they were going to do with the data and kept it safe.
Maybe they are starting to rethink the “data collection” after more and more of the US agencies are suffering data leaks. Now the TSA itself. Hopefully they realize that they should either start to be serious about data security or stop collecting information!
May 2, 2007
A group of info sec people is meeting up in San Francisco for an informal get together. We’ll have a drink and probably chat about security.
You work in computer security? Join us:
Wednesday, May 16th, 7pm at Zeitgeist in San Francisco.
No RSVP needed. So far you’ll have to buy your own beer unless we’ll find a sponsor 😉