October 15, 2007
I was playing with database audit logs for a bit to try and visualize some aspects of them. While doing so, I came across a pretty interesting problem. The audit logs contain entries that indicate what exact SQL query was executed. Now, I am not interested in the entire query, but I need to know which tables were touched. I was trying to build some regular expressions to extract that information from the query, but I gave up pretty quickly. It’s just too complicated for a regex. I was wondering whether there is a way to take a SQL query, for example:
select * from a.table1 a, b.tabl2 b join c.table3 on b.id1=c.id2 where a.foo='bar'
and extract all the table names: a.table1, b.table2, c.table3. Are there tools to do that? Remember, I don’t have the database with these tables. I only have a log from some database. The script should support all the SQL perks like joins, nested selects, etc. Anyone have a good way to do this?
July 25, 2007
I haven’t written about UNIX scripting in a while. It was yesterday in the afternoon that our QA guy came over and asked me some questions about VI. Among his problems was the “parsing of an XML” file. He wanted to extract elements from specific branches of an XML structure. I told him that VI was not XML aware. It treats XMLs just like any other text file; line by line. He was not happy with my answer and kept bugging me. Then he said: “You should write a tool called XMLgrep”. And that was it. I was pretty sure that someone had written a tool that would do exactly that.
After 30 seconds on google, I found it: XMLStarlet. It took me about 30 minutes to get the hang of the tool, but it is really cool. It takes XPATH queries as an input. My knowledge of XPATH goes back to my thesis and is a bit rusty, but I finally got it right. Here is an example of how to apply an XPATH query to an XML file:
xmlstarlet sel -t -c "/archive/ActiveList[@name='Public Webmail']/description" JSOX_ActiveLists.xml
xmlstarlet sel -t -m "/archive/ActiveList" -v "concat (@name,'
Yes, there is a newline in this command. However, it didn’t really work for me. What I wanted to do is separating the different outputs with a newline, but for some reason this didn’t work. I tried all kinds of things, but no luck. Oh well.
Here is another link that might be useful. It’s a nice tutorial on XMLStarlet.
Technorati Tags: xml, parsing, command line, xpath, xmlstarlet
March 20, 2007
In cryptography or science in general, you often need perfect random numbers. Well, up to today, that was my need as well. However, today I was trying to generate numbers that are not too random, but have a certain bias. I think it’s kind of ironic. Googling for a solution is almost impossible. Every link shows a perfect random number generator 😉
I don’t care what the bias is in the numbers that are generated. Actually, the bias can be pretty high. Anyone have a method to do this in Perl?
Can you do something like int(rand($upperLimit*1000)) % 1000 ??? Basically changing the interval from where the random number is taken and then shrinking it again?
February 24, 2007
By now you should know that I really like command line tools which operate well when applied to data through a pipe. I have posted quite a few tips already to do data manipulation on the command line. Today I wanted a quick way to lookup IP address locations and add them to a log file. After investigating a few free databases, I came accross Geo::IPFree, a Perl library which does the trick. So here is how you add the country code. First, this is the format of my log entries:
I want to get the country of the source address (first IP in the log). Here we go:
cat pflog.csv | perl -M'Geo::IPfree' -na -F/,/ -e '($country,$country_name)=Geo::IPfree::LookUp($F);chomp; print "$_,$country_name\n"'
And here the output:
10/13/2005 20:24:33.494358,22.214.171.124,126.96.36.199,,echo request,Europe
April 26, 2006
Something I ran into a couple of times this week is how to do an easy dns lookup on the command line:
cat data | perl -M'Socket' -na -F/,/ -e '$dns=gethostbyaddr(inet_aton($F),AF_INET)||$F; print "$dns,$F,$F\n"'
The code assumes that you have an IP address in the first column. It uses -F/,/ to split the input into arrays, does a DNS lookup on the first column and returns either the dns name or if that was not found, it returns the original IP address.
January 16, 2006
Sitting down at a cafe around the corner from where I live, I realize that some of the scripts I wrote a while back might actually benefit others too. This one is to connect to the first available access point:
iwlist ath0 s > /tmp/$$
ap=`cat /tmp/$$ | grep "Encryption key:off" -B 5 | head -1 | sed -e 's/Cell.*Address: \(.*\)/\1/g'`
essid=`cat /tmp/$$ | grep "Encryption key:off" -B 4 | head -1 | sed -e 's/ESSID:"\(.*\)"/\1/g'`
essid=`echo $essid | sed -e 's/ //g'`
echo Tryping AP:$ap / SSID:$essid
iwconfig ath0 ap $ap
iwconfig ath0 essid $essid
iwconfig ath0 nick test
killall -9 dhclient
Not sure whether there would be a simpler solution natively supported by linux…