I thought you might be interested in some blog posts that I have written lately. I have been doing quite a bit of work on Django and Web applications. That might explain the topics of my recent blog posts. Check them out.
Would love to hear from you if you have any comments. Either leave a comment on the blogs, or contact me via Twitter at @zrlram.
By now you should know that I really like command line tools that work well on data fed through a pipe. I have already posted quite a few tips on manipulating data on the command line. Today I wanted a quick way to look up IP address locations and add them to a log file. After investigating a few free databases, I came across Geo::IPfree, a Perl library that does the trick. So here is how you add the country name. First, this is the format of my log entries:
10/13/2005 20:25:54.032145,195.141.211.178,195.131.61.44,2071,135
I want to get the country of the source address (first IP in the log). Here we go:
cat pflog.csv | perl -M'Geo::IPfree' -na -F/,/ -e '($country,$country_name)=Geo::IPfree::LookUp($F[1]);chomp; print "$_,$country_name\n"'
And here is the output:
10/13/2005 20:24:33.494358,62.245.243.139,212.254.111.99,,echo request,Europe
Very simple!
I was fiddling with optimizing AfterGlow the other day and to do so, I introduced caches for some of the functions. Later a coworker (thanks Senthil) sent me a note that I could have done without implementing the cache myself by using Memoize. This is how to use it:
use Memoize;
memoize('function');   # pass the name of the function as a string
function(arguments);   # repeated calls with the same arguments are now served from the cache
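This basically caches the function's return value for each distinct set of arguments. It only makes sense for functions without side effects whose result depends solely on their arguments. For recursive functions that keep recomputing the same values, the speedup can be dramatic.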
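To make the recursion case concrete, here is a minimal sketch using the classic Fibonacci function. The function fib() and the $calls counter are made up for this example and have nothing to do with AfterGlow; the only assumption is that the Memoize module is installed (it ships with Perl).

```perl
use strict;
use warnings;
use Memoize;

my $calls = 0;    # counts how often the body of fib() really runs

# Naive recursive Fibonacci: without a cache it recomputes
# the same values over and over again.
sub fib {
    my ($n) = @_;
    $calls++;
    return $n if $n < 2;
    return fib($n - 1) + fib($n - 2);
}

# Replace fib() with a caching wrapper. The recursive calls inside
# fib() go through the wrapper as well, which is where the win comes from.
memoize('fib');

my $result = fib(25);
print "fib(25) = $result\n";
print "body of fib() executed $calls times\n";
```

Comment out the memoize() line and run it again to see how many more times the function body runs without the cache.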
I was working on AfterGlow the other night and I realized that adding feature after feature starts to slow the thing down quite a bit (you need to be a genius to figure that one out!). That prompted me to look for Perl performance analyzers, and indeed I found something that's pretty useful.
Run your Perl script with perl -d:DProf, then run dprofpp, which reads the tmon.out file left behind by the profiling run. This shows how much time was spent in each subroutine. It helped me pinpoint that most of the time was spent in the getColor() call. The logical solution was to introduce a cache for the colors and guess what: AfterGlow 1.5.1 will be faster.
This is a sample output of dprofpp:
Total Elapsed Time = 11.69959 Seconds
  User+System Time = 8.969595 Seconds
Exclusive Times
%Time ExclSec CumulS #Calls sec/call Csec/c  Name
 81.5   7.310  9.900 120000   0.0001 0.0001  main::getColor
 29.1   2.615  2.615 116993   0.0000 0.0000  main::subnet
 0.89   0.080  0.080  20000   0.0000 0.0000  main::getEventName
 0.22   0.020  0.020  20000   0.0000 0.0000  main::getSourceName
 0.22   0.020  0.020  20000   0.0000 0.0000  main::getTargetName
 0.11   0.010  0.010      1   0.0100 0.0100  main::BEGIN
 0.00       - -0.000      1        -      -  Exporter::import
 0.00       - -0.000      1        -      -  Getopt::Std::getopts
 0.00       - -0.000      1        -      -  main::propertyfile
 0.00       - -0.000      1        -      -  main::init
    -       - -0.025 116993        -      -  main::field
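A hand-rolled cache of the kind described above could look roughly like this. This is only a sketch and not AfterGlow's actual code: get_color(), compute_color(), their arguments and the coloring rule are all invented for the example.

```perl
use strict;
use warnings;

my %color_cache;         # key: joined arguments, value: color computed earlier
my $slow_lookups = 0;    # counts how often the expensive path is taken

# Hypothetical stand-in for the expensive color computation.
sub compute_color {
    my ($type, $value) = @_;
    $slow_lookups++;
    return $value =~ /^10\./ ? 'green' : 'red';
}

# Cached front end: compute each (type, value) pair only once.
sub get_color {
    my ($type, $value) = @_;
    my $key = join "\0", $type, $value;
    $color_cache{$key} = compute_color($type, $value)
        unless exists $color_cache{$key};
    return $color_cache{$key};
}

my @colors = map { get_color('source', $_) }
             qw(10.0.0.1 195.141.211.178 10.0.0.1);
print "@colors ($slow_lookups slow lookups)\n";
```

Log files tend to repeat the same addresses many times, so most calls end up being a single hash lookup instead of the full computation.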
Red Hat Magazine had a nice Introduction to Python. Cool example that uses PyGTK!