Event graphs like the ones we showed throughout the paper are generated by following these steps:
Step one requires a database schema that can be populated with the information from Snort. Our schema is the following:
# MySQL Server version: 4.1.6
# The schema contains some extra entries utilized for certain
# steps of our analysis
CREATE TABLE sans (
  id int(11) NOT NULL auto_increment,
  `timestamp` datetime NOT NULL default '0000-00-00 00:00:00',
  sourcemac varchar(17) NOT NULL default '',
  destmac varchar(17) NOT NULL default '',
  sourceip varchar(15) NOT NULL default '',
  destip varchar(15) NOT NULL default '',
  sourceport int(5) NOT NULL default '0',
  destport int(5) NOT NULL default '0',
  proto varchar(10) default NULL,
  tcpflags varchar(10) default NULL,
  length int(11) NOT NULL default '0',
  ttl int(11) default NULL,
  ipid int(11) default NULL,
  iptos varchar(10) default NULL,
  ipflags varchar(5) default NULL,
  `offset` int(11) default NULL,
  snort_alert varchar(100) default NULL,
  service int(11) default NULL,
  delta int(11) default NULL,
  delta2 int(7) default NULL,
  PRIMARY KEY (id),
  KEY `timestamp` (`timestamp`),
  KEY id (id),
  KEY sourceip (sourceip),
  KEY destip (destip),
  KEY snort_alert (snort_alert)
)
The keys (indexes) on the timestamp, IP address, and alert columns considerably speed up queries against the database.
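To illustrate the effect of a key, the following minimal sketch uses Python's built-in sqlite3 module as a stand-in for MySQL. This is an assumption made only to keep the example self-contained; our setup used MySQL 4.1.6, whose planner and output differ. The sketch creates a cut-down version of the sans table with an index on sourceip and asks the query planner how it would look up a single address. The sample rows are made up for illustration.

```python
import sqlite3

# In-memory SQLite database standing in for MySQL (assumption for illustration).
conn = sqlite3.connect(":memory:")

# Cut-down version of the sans table: only the columns needed here.
conn.execute(
    "CREATE TABLE sans ("
    " id INTEGER PRIMARY KEY AUTOINCREMENT,"
    " sourcemac TEXT NOT NULL DEFAULT '',"
    " sourceip TEXT NOT NULL DEFAULT '')"
)
# Counterpart of 'KEY sourceip (sourceip)' in the MySQL schema.
conn.execute("CREATE INDEX sourceip ON sans (sourceip)")

# Hypothetical sample rows (documentation-range addresses, not real data).
rows = [
    ("00:00:00:00:00:01", "192.0.2.1"),
    ("00:00:00:00:00:02", "192.0.2.2"),
    ("00:00:00:00:00:01", "192.0.2.3"),
]
conn.executemany("INSERT INTO sans (sourcemac, sourceip) VALUES (?, ?)", rows)


def query_plan(sql, params=()):
    """Return the human-readable detail column of SQLite's query plan."""
    cursor = conn.execute("EXPLAIN QUERY PLAN " + sql, params)
    return [row[-1] for row in cursor]


plan = query_plan("SELECT sourcemac FROM sans WHERE sourceip = ?", ("192.0.2.1",))
for detail in plan:
    print(detail)
```

The printed plan should mention the sourceip index rather than a full table scan, which is the behavior the keys in our schema are meant to obtain.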
Step two, the population of the database, can be done with a script that is available in afterglow-database.tar.gz at http://sourceforge.net/project/showfiles.php?group_id=125211. To start it, point tcpdump at your Snort binary log and pipe the output to the script: tcpdump -vttttnnelr /tmp/sans | tcpdump2sql.pl.
Step three requires an example. Let us assume we want to graph the source MAC addresses and the IP addresses that are located behind them. The SQL query for this would be: select sourcemac, sourceip from sans. This output must now be converted into comma-separated form in order to feed it to the graphical library. To do so, we used Linux and some command-line tools. The final command we used was:
echo 'select sourcemac, sourceip from sans' | mysql -s -u root -ppass tcpdump |
awk '{printf "%s,%s\n",$1,$2}' > list.csv
The file list.csv now contains the following lines:
00:03:e3:d9:26:c0,255.255.255.
00:00:0c:04:b2:33,138.97.144.
00:03:e3:d9:26:c0,255.255.255.
00:00:0c:04:b2:33,138.97.82.
00:03:e3:d9:26:c0,24.84.106.
00:00:0c:04:b2:33,138.97.18.
00:03:e3:d9:26:c0,24.84.106.
00:00:0c:04:b2:33,138.97.18.
00:03:e3:d9:26:c0,24.84.106.
00:00:0c:04:b2:33,138.97.18.
00:03:e3:d9:26:c0,24.84.106.
00:00:0c:04:b2:33,138.97.18.
00:03:e3:d9:26:c0,24.84.106.
Step five needs some more explanation. We used a package called GraphViz[4] from AT&T Research to generate all the graphs in this paper. GraphViz requires its input to be written in a specific language that describes a graph. A very simple example of a graph definition is the following:
digraph G {
a -> b -> c
}
Passing this to the GraphViz libraries (footnote B.1) generates the graph shown in Figure B.1. For a complete description of the language, see the GraphViz documentation[5].
Now that we know what the input to the graphical library looks like and how to generate comma-separated output from entries in our database, we need a module that translates the CSV output into GraphViz's language. To facilitate this process, we used a tool called AfterGlow[14] (footnote B.2). AfterGlow expects two values on each line; each line then represents two nodes and a connection between them (footnote B.3). Using this input, AfterGlow produces output that can be passed on to one of the GraphViz utilities.
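The translation from two-column CSV to GraphViz's language can be sketched in a few lines of Python. This is not AfterGlow's code and does not reproduce its exact output; the function name csv_to_dot, the quoting of node names, and the removal of duplicate edges are assumptions made for illustration only.

```python
import csv


def csv_to_dot(lines, name="G"):
    """Translate 'source,target' lines into a GraphViz digraph definition.

    Hypothetical helper, not part of AfterGlow. Lines that do not have
    exactly two fields are skipped, and repeated pairs yield a single edge.
    """
    edges = []
    seen = set()
    for row in csv.reader(lines):
        if len(row) != 2:
            continue  # ignore blank or malformed lines
        edge = (row[0].strip(), row[1].strip())
        if edge not in seen:
            seen.add(edge)
            edges.append(edge)

    def quote(node):
        # Quote node names so that characters such as ':' and '.' are legal.
        return '"%s"' % node.replace('"', '\\"')

    out = ["digraph %s {" % name]
    for source, target in edges:
        out.append("  %s -> %s;" % (quote(source), quote(target)))
    out.append("}")
    return "\n".join(out)


if __name__ == "__main__":
    # Same shape as the simple a -> b -> c example, written as two CSV pairs.
    print(csv_to_dot(["a,b", "b,c"]))
```

Fed with pairs such as the lines of list.csv, a translator of this kind emits one edge per distinct MAC address and IP address pair, and the result can be handed to a GraphViz utility in the same way as the simple a -> b -> c example.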
Continuing our example, we now pass the contents of list.csv to AfterGlow: cat list.csv | ./afterglow.pl -t > list.dot. The output is a file that can then be passed to neato (footnote B.4): cat list.dot | neato -Tgif -o list.gif.
This is the full process of generating graphs. All the steps can be combined and executed as a single pipeline:
echo 'select sourcemac, sourceip from sans' | mysql -s -u root -ppass tcpdump |
awk '{printf "%s,%s\n",$1,$2}' | ./afterglow.pl -t | neato -Tgif -o list.gif