<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>Security Intelligence and Big Data &#124; raffy.ch - blog &#187; Programming</title>
	<atom:link href="http://raffy.ch/blog/category/programming/feed/" rel="self" type="application/rss+xml" />
	<link>http://raffy.ch/blog</link>
	<description>Big data analytics and visualization</description>
	<lastBuildDate>Sat, 24 Mar 2012 20:49:10 +0000</lastBuildDate>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=3.2.1</generator>
		<item>
		<title>Advanced Network Graph Visualization with AfterGlow</title>
		<link>http://raffy.ch/blog/2012/03/24/advanced-network-graph-visualization-with-afterglow/</link>
		<comments>http://raffy.ch/blog/2012/03/24/advanced-network-graph-visualization-with-afterglow/#comments</comments>
		<pubDate>Sat, 24 Mar 2012 20:49:10 +0000</pubDate>
		<dc:creator>Raffael Marty</dc:creator>
				<category><![CDATA[Log Analysis]]></category>
		<category><![CDATA[Programming]]></category>
		<category><![CDATA[Visualization]]></category>

		<guid isPermaLink="false">http://raffy.ch/blog/?p=608</guid>
		<description><![CDATA[There are cases where you need fairly sophisticated logic to visualize data. Network graphs are a great way to help a viewer understand relationships in data. In my last blog post, I explained how to visualize network traffic. Today I am showing you how to extend your visualization with some more complicated configurations. This blog [...]]]></description>
			<content:encoded><![CDATA[<p><img src="http://raffy.ch/blog/wp-content/uploads/2012/03/graph-300x289.png" alt="" title="graph" width="300" height="289" style="float:right"/>There are cases where you need fairly sophisticated logic to visualize data. Network graphs are a great way to help a viewer understand relationships in data. In my last blog post, I explained how to <a href="http://raffy.ch/blog/2012/03/21/visualizing-packet-captures-for-fun-and-profit/">visualize network traffic</a>. Today I am showing you how to extend your visualization with some more complicated configurations.</p>
<p>This blog post was inspired by an <a href="http://afterglow.sf.net">AfterGlow</a> user who emailed me last week asking how he could keep a list of port numbers to drive the color in his graph. Here is the code snippet that I suggested he use:</p>
<p><code>variable=@ports=qw(22 80 53 110);<br />
color="green" if (grep(/^\Q$fields[0]\E$/,@ports))</code></p>
<p>Put this in a configuration file and invoke AfterGlow with it:</p>
<p><code>perl afterglow.pl -c file.config | ... </code></p>
<p>This colors a node green if its value is in the list of ports (22, 80, 53, 110). I am using <i>$fields[0]</i> to reference the first column of data. You could also use the function <i>fields()</i> to reference any column in the data.</p>
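<p>To see what the <i>grep</i> test does outside of AfterGlow, here is a minimal stand-alone sketch. The helper name <i>in_port_list</i> and the hash <i>%is_port</i> are my own for illustration and are not part of AfterGlow; the point is only that the anchored, quoted pattern matches whole values, so 8080 is not mistaken for 80.</p>

```perl
use strict;
use warnings;

# Port list from the configuration snippet above.
my @ports = qw(22 80 53 110);

# Hypothetical helper (not part of AfterGlow): true if $value is exactly one
# of the ports. \Q...\E quotes regex metacharacters; ^ and $ force a
# whole-string match, so "8080" does not match "80".
sub in_port_list {
    my ($value) = @_;
    my $count = grep { /^\Q$value\E$/ } @ports;
    return $count;
}

# For longer lists, a hash answers the same question with a single lookup.
my %is_port = map { $_ => 1 } @ports;

for my $value (qw(80 8080 22 443)) {
    my $color = in_port_list($value) ? "green" : "(no color)";
    print "$value -> $color\n";
}
```

<p>The hash variant is probably the better choice once the list grows, since <i>grep</i> scans the whole array for every node.</p>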
<p>Another way to define the variable is by looking it up in a file. Here is an example:</p>
<p><code>variable=open(TOR,"tor.csv"); @tor=&lt;TOR&gt;; close(TOR);<br />
color="red" if (grep(/^\Q$fields[1]\E$/,@tor))</code></p>
<p>This time you put the list of items in a file and read it into an array. Remember, it&#8217;s just Perl code that you execute after the <i>variable=</i> statement. Anything goes! </p>
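<p>As a rough sketch of the file-based lookup in plain Perl: the addresses below are made up, and an in-memory string stands in for <i>tor.csv</i> so the example runs on its own. It also strips the trailing newlines, which lets you use an exact hash lookup instead of a regular expression.</p>

```perl
use strict;
use warnings;

# Stand-in for the contents of tor.csv; the addresses are invented for illustration.
my $data = "10.0.0.1\n10.0.0.2\n";

# Open the string as an in-memory file so the sketch runs without a real tor.csv.
open(my $fh, '<', \$data) or die "cannot open list: $!";
my @tor = readline($fh);
close($fh);

# Strip the trailing newlines so plain string comparison works.
chomp(@tor);

# Index the entries in a hash for exact-match lookups.
my %is_tor = map { $_ => 1 } @tor;

for my $ip (qw(10.0.0.2 10.0.0.20)) {
    my $color = $is_tor{$ip} ? "red" : "(no color)";
    print "$ip -> $color\n";
}
```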
<p>I am curious what you will come up with. Post your experiments and questions on <a href="http://secviz.org">secviz.org</a>!</p>
<p>Read more about how to use AfterGlow in <a href="http://secviz.org/content/applied-security-visualization">security visualization</a>.</p>
]]></content:encoded>
			<wfw:commentRss>http://raffy.ch/blog/2012/03/24/advanced-network-graph-visualization-with-afterglow/feed/</wfw:commentRss>
		<slash:comments>3</slash:comments>
		</item>
		<item>
		<title>Logging Guidelines Enable Actions</title>
		<link>http://raffy.ch/blog/2011/09/08/logging-guidelines-enable-action/</link>
		<comments>http://raffy.ch/blog/2011/09/08/logging-guidelines-enable-action/#comments</comments>
		<pubDate>Thu, 08 Sep 2011 18:05:43 +0000</pubDate>
		<dc:creator>Raffael Marty</dc:creator>
				<category><![CDATA[Log Analysis]]></category>
		<category><![CDATA[Programming]]></category>

		<guid isPermaLink="false">http://raffy.ch/blog/?p=458</guid>
		<description><![CDATA[Analyzing log files can be a very time-consuming process and it doesn&#8217;t seem to get any easier. In the past 12 years I have been on both sides of the table. I have analyzed terabytes of logs and I have written a lot of code that generates logs. When I started writing Loggly&#8217;s middleware, [...]]]></description>
			<content:encoded><![CDATA[<p><img src="http://raffy.ch/blog/wp-content/uploads/2011/09/images.jpeg" alt="Log Book" title="Log Book" width="180" style="float:right;" />Analyzing log files can be a very time-consuming process and it doesn&#8217;t seem to get any easier. In the past 12 years I have been on both sides of the table. I have analyzed terabytes of logs and I have written a lot of code that generates logs. When I started writing Loggly&#8217;s middleware, I thought it was going to be really easy and fun to finally write the perfect application logs. Guess what, I was wrong. Although I have seen pretty much every log format out there, I had the hardest time coming up with a decent log format for ourselves. What&#8217;s a good log format anyway? The short answer is: &#8220;One that enables analytics or actions.&#8221; </p>
<p>I was sufficiently motivated to come up with a good log format that I decided to write a paper about <a href="http://pixlcloud.com/applicationlogging.pdf">application logging guidelines</a>. The paper has two main parts: logging guidelines and a reference architecture for a cloud service. In the first part I cover the questions of <strong>when</strong> to log, <strong>what</strong> to log, and <strong>how</strong> to log. It&#8217;s not as easy as you might think. The most important thing to keep in mind is how the logs will be used. Especially for the question of what to log, you need to keep the log consumer in mind. Are the logs consumed by a human? Are they consumed by a log management tool? What are the people looking at the logs trying to do? Debugging the application? Monitoring performance? Detecting security violations? Depending on the answers to these questions, you might change the places in your code where you emit log records. (Or, even better, log in all places and add a use-case indicator as a field to your logs.)</p>
<p>The paper is a starting point and not a definitive guide. I would expect readers to challenge it and come up with improvements and refinements of the use-cases and of the exact contents of the log records. I&#8217;d love to hear from practitioners and get a dialog going.</p>
<p>As a side note: CEE, the <a href="http://cee.mitre.org">Common Event Expression</a> standard, covers parts of what I am talking about in the paper. However, the paper&#8217;s focus is mainly on defining guidelines for application developers, that is, establishing a baseline of when log entries should be recorded and what information should be included.
</p>
<p>Resources: <strong>Cloud Application Logging for Forensics</strong> &#8211; <a href="http://pixlcloud.com/applicationlogging.pdf">Paper</a> &#8211; <a href="http://www.slideshare.net/zrlram/cloud-application-logging-for-forensics">Presentation</a></p>
]]></content:encoded>
			<wfw:commentRss>http://raffy.ch/blog/2011/09/08/logging-guidelines-enable-action/feed/</wfw:commentRss>
		<slash:comments>3</slash:comments>
		</item>
		<item>
		<title>Recent Blog Posts on Django, Security, Cloud, and Visualization</title>
		<link>http://raffy.ch/blog/2010/05/25/recent-blog-posts-on-django-security-cloud-and-visualization/</link>
		<comments>http://raffy.ch/blog/2010/05/25/recent-blog-posts-on-django-security-cloud-and-visualization/#comments</comments>
		<pubDate>Wed, 26 May 2010 01:17:39 +0000</pubDate>
		<dc:creator>Raffael Marty</dc:creator>
				<category><![CDATA[Links]]></category>
		<category><![CDATA[Log Analysis]]></category>
		<category><![CDATA[Programming]]></category>
		<category><![CDATA[Visualization]]></category>

		<guid isPermaLink="false">http://raffy.ch/blog/?p=347</guid>
		<description><![CDATA[I thought you might be interested in some blog posts that I have written lately. I have been doing quite a bit of work on Django and Web applications. That might explain the topics of my recent blog posts. Check them out. Would love to hear from you if you have any comments. Either leave [...]]]></description>
			<content:encoded><![CDATA[<p>I thought you might be interested in some blog posts that I have written lately. I have been doing quite a bit of work on Django and Web applications. That might explain the topics of my recent blog posts. Check them out. </p>
<p>Would love to hear from you if you have any comments. Either leave a comment on the blogs, or contact me via Twitter at <a href="http://twitter.com/zrlram">@zrlram</a>.</p>
<ul>
<li><a href="http://www.loggly.com/2010/05/how-to-enable-logging-in-django-1-2/">How to Enable Logging in Django 1.2</a></li>
<li><a href="http://www.loggly.com/2010/05/a-logging-library-for-django-how-we-log-at-loggly/">A Logging Library for Django – How We Log at Loggly</a></li>
<li><a href="http://www.loggly.com/2010/04/securing-your-web-application-with-httponly-cookies-or-how-apache-org-and-atlassian-could-have-been-secured/">Securing your Web Application with httponly cookies OR How Apache.org and Atlassian could have been secured</a></li>
<li><a href="http://www.loggly.com/2010/03/visualizing-your-data-in-the-cloud-with-loggly-and-highcharts/">Visualizing your Data in the Cloud with Loggly and HighCharts</a></li>
<li><a href="http://www.loggly.com/2010/03/fixing-client-ips-in-apache-logs-with-amazon-load-balancers/">Fixing Client IPs in Apache Logs with Amazon Load Balancers</a></li>
<li><a href="http://www.loggly.com/2010/03/rightscale-apis-with-python/">How to use RightScale APIs with Python</a></li>
</ul>
]]></content:encoded>
			<wfw:commentRss>http://raffy.ch/blog/2010/05/25/recent-blog-posts-on-django-security-cloud-and-visualization/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Geo Lookup on the Command Line</title>
		<link>http://raffy.ch/blog/2007/02/24/geo-lookup-on-the-command-line/</link>
		<comments>http://raffy.ch/blog/2007/02/24/geo-lookup-on-the-command-line/#comments</comments>
		<pubDate>Sun, 25 Feb 2007 01:56:10 +0000</pubDate>
		<dc:creator>Raffael Marty</dc:creator>
				<category><![CDATA[Programming]]></category>
		<category><![CDATA[UNIX Scripting]]></category>

		<guid isPermaLink="false">http://raffy.ch/blog/2007/02/24/geo-lookup-on-the-command-line/</guid>
		<description><![CDATA[By now you should know that I really like command line tools which operate well when applied to data through a pipe. I have posted quite a few tips already to do data manipulation on the command line. Today I wanted a quick way to look up IP address locations and add them to a log [...]]]></description>
			<content:encoded><![CDATA[<p>By now you should know that I really like command line tools which operate well when applied to data through a pipe. I have posted quite a few tips already to do data manipulation on the command line. Today I wanted a quick way to look up IP address locations and add them to a log file. After investigating a few free databases, I came across <strong>Geo::IPfree</strong>, a Perl library which does the trick. So here is how you add the country code. First, this is the format of my log entries:</p>
<p><code>10/13/2005 20:25:54.032145,195.141.211.178,195.131.61.44,2071,135</code></p>
<p>I want to get the country of the source address (first IP in the log). Here we go:</p>
<p><code> cat pflog.csv | perl -M'Geo::IPfree' -na -F/,/ -e '($country,$country_name)=Geo::IPfree::LookUp($F[1]);chomp; print "$_,$country_name\n"'</code></p>
<p>And here is the output:</p>
<p><code>10/13/2005 20:24:33.494358,62.245.243.139,212.254.111.99,,echo request,Europe</code></p>
<p>Very simple!</p>
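<p>If the one-liner looks cryptic, here is roughly what the <i>-n</i>, <i>-a</i> and <i>-F/,/</i> switches do, written out as a small script. This is only a sketch: <i>lookup_country</i> and its one-entry table are hypothetical stand-ins for <i>Geo::IPfree::LookUp</i> so that it runs without the module, and the country it returns is invented rather than real geo data.</p>

```perl
use strict;
use warnings;

# Hypothetical stand-in for Geo::IPfree::LookUp so the sketch runs without the
# module or its database; the table entry is invented, not real geo data.
my %fake_db = ('195.141.211.178' => ['ZZ', 'Exampleland']);

sub lookup_country {
    my ($ip) = @_;
    my $hit = $fake_db{$ip} || ['??', 'Unknown'];
    return @$hit;   # (country code, country name)
}

# What "-n -a -F/,/" does implicitly for every input line:
# split it on commas into @F, then run the -e code.
sub annotate {
    my ($line) = @_;
    chomp $line;
    my @F = split /,/, $line;
    my ($country, $country_name) = lookup_country($F[1]);   # $F[1] is the source IP
    return "$line,$country_name";
}

print annotate("10/13/2005 20:25:54.032145,195.141.211.178,195.131.61.44,2071,135\n"), "\n";
```

<p>In a real script you would replace the stub with the module call from the one-liner and loop over the input lines instead of a single hard-coded entry.</p>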
]]></content:encoded>
			<wfw:commentRss>http://raffy.ch/blog/2007/02/24/geo-lookup-on-the-command-line/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
		<item>
		<title>Perl Performance Improvement</title>
		<link>http://raffy.ch/blog/2006/05/19/perl-performance-improvement/</link>
		<comments>http://raffy.ch/blog/2006/05/19/perl-performance-improvement/#comments</comments>
		<pubDate>Fri, 19 May 2006 21:18:31 +0000</pubDate>
		<dc:creator>Raffael Marty</dc:creator>
				<category><![CDATA[Programming]]></category>

		<guid isPermaLink="false">http://raffy.ch/blog/?p=50</guid>
		<description><![CDATA[I was fiddling with optimizing AfterGlow the other day and to do so, I introduced caches for some of the functions. Later a coworker (thanks Senthil) sent me a note that I could have done without implementing the cache myself by using Memoize. This is how to use it: use Memoize; memoize(function); function(arguments); # this [...]]]></description>
			<content:encoded><![CDATA[<p>I was fiddling with optimizing AfterGlow the other day and to do so, I introduced caches for some of the functions. Later a coworker (thanks Senthil) sent me a note that I could have done without implementing the cache myself by using Memoize. This is how to use it:</p>
<p><code>use Memoize;<br />
memoize('function');  # pass the function name as a string<br />
function(arguments);  # repeated calls with the same arguments now come from the cache</code></p>
<p>This caches the function&#8217;s return value for each set of arguments. For recursive functions in particular, this can be an enormous speedup.</p>
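<p>To make the recursion case concrete, here is a small sketch using the classic Fibonacci function. The function <i>fib</i> and the counter <i>$calls</i> are just for illustration; the counter shows how rarely the function body runs once the cache is in place.</p>

```perl
use strict;
use warnings;
use Memoize;

my $calls = 0;   # counts how often the function body actually runs

# Naive recursive Fibonacci: exponential without a cache.
sub fib {
    my ($n) = @_;
    $calls++;
    return $n if $n < 2;
    return fib($n - 1) + fib($n - 2);
}

# memoize() takes the function name as a string and installs a caching
# wrapper under that name, so the recursive calls go through the cache too.
memoize('fib');

print "fib(30) = ", fib(30), "\n";
print "function body ran $calls times\n";
```

<p>Note that this only makes sense for functions whose result depends purely on their arguments; anything with side effects or time-dependent output should not be memoized.</p>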
]]></content:encoded>
			<wfw:commentRss>http://raffy.ch/blog/2006/05/19/perl-performance-improvement/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Perl Performance Optimization</title>
		<link>http://raffy.ch/blog/2006/04/04/perl-performance-optimization/</link>
		<comments>http://raffy.ch/blog/2006/04/04/perl-performance-optimization/#comments</comments>
		<pubDate>Tue, 04 Apr 2006 21:13:48 +0000</pubDate>
		<dc:creator>Raffael Marty</dc:creator>
				<category><![CDATA[Programming]]></category>

		<guid isPermaLink="false">http://raffy.ch/blog/?p=46</guid>
		<description><![CDATA[I was working on AfterGlow the other night and I realized that adding feature after feature starts to slow down the thing quite a bit (you need to be a genius to figure that one out!). So that prompted me to look for Perl performance analyzers and indeed I found something that&#8217;s pretty useful. Run [...]]]></description>
			<content:encoded><![CDATA[<p>I was working on <a href="http://afterglow.sourceforge.net">AfterGlow</a> the other night and I realized that adding feature after feature starts to slow down the thing quite a bit (you need to be a genius to figure that one out!). So that prompted me to look for Perl performance analyzers and indeed I found something that&#8217;s pretty useful.</p>
<p>Run your Perl script with: <code>perl -d:DProf</code> and then run <code>dprofpp</code>. This will show you how much time was spent in each of the subroutines. It helped me pinpoint that most of the time was spent in the getColor() call. The logical solution was to introduce a cache for the colors and guess what &#8211; AfterGlow 1.5.1 will be faster <img src='http://raffy.ch/blog/wp-includes/images/smilies/icon_wink.gif' alt=';)' class='wp-smiley' /> </p>
<p>This is a sample output of dprofpp:</p>
<p><code>Total Elapsed Time = 11.69959 Seconds<br />
  User+System Time = 8.969595 Seconds<br />
Exclusive Times<br />
%Time ExclSec CumulS #Calls sec/call Csec/c  Name<br />
 81.5   7.310  9.900 120000   0.0001 0.0001  main::getColor<br />
 29.1   2.615  2.615 116993   0.0000 0.0000  main::subnet<br />
 0.89   0.080  0.080  20000   0.0000 0.0000  main::getEventName<br />
 0.22   0.020  0.020  20000   0.0000 0.0000  main::getSourceName<br />
 0.22   0.020  0.020  20000   0.0000 0.0000  main::getTargetName<br />
 0.11   0.010  0.010      1   0.0100 0.0100  main::BEGIN<br />
 0.00       - -0.000      1        -      -  Exporter::import<br />
 0.00       - -0.000      1        -      -  Getopt::Std::getopts<br />
 0.00       - -0.000      1        -      -  main::propertyfile<br />
 0.00       - -0.000      1        -      -  main::init<br />
    -       - -0.025 116993        -      -  main::field<br />
</code></p>
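<p>For illustration, a hand-rolled cache of the kind described above could look roughly like this. This is a sketch and not AfterGlow&#8217;s actual code: <i>compute_color</i> is a hypothetical stand-in for the expensive evaluation, and <i>get_color</i> and <i>%color_cache</i> are names I made up.</p>

```perl
use strict;
use warnings;

my %color_cache;      # remembers the color already computed for each node
my $slow_calls = 0;   # counts how often the expensive path runs

# Hypothetical stand-in for the real, expensive color evaluation.
sub compute_color {
    my ($node) = @_;
    $slow_calls++;
    return $node =~ /^10\./ ? "blue" : "gray";
}

# Cached wrapper: compute each node's color once, then reuse it.
sub get_color {
    my ($node) = @_;
    $color_cache{$node} = compute_color($node)
        unless exists $color_cache{$node};
    return $color_cache{$node};
}

get_color($_) for qw(10.0.0.1 10.0.0.1 192.168.1.1 10.0.0.1);
print "lookups: 4, expensive evaluations: $slow_calls\n";
```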
]]></content:encoded>
			<wfw:commentRss>http://raffy.ch/blog/2006/04/04/perl-performance-optimization/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Python For Beginners and for me</title>
		<link>http://raffy.ch/blog/2005/12/04/python-for-beginners-and-for-me/</link>
		<comments>http://raffy.ch/blog/2005/12/04/python-for-beginners-and-for-me/#comments</comments>
		<pubDate>Mon, 05 Dec 2005 02:00:59 +0000</pubDate>
		<dc:creator>Raffael Marty</dc:creator>
				<category><![CDATA[Programming]]></category>

		<guid isPermaLink="false">http://raffy.ch/blog/?p=5</guid>
		<description><![CDATA[The RedHat Magazine had a nice Introduction to Python. Cool example that uses pyGTK!]]></description>
			<content:encoded><![CDATA[<p>The RedHat Magazine had a nice <a href="http://www.redhat.com/magazine/012oct05/features/python/">Introduction</a> to Python. Cool example that uses pyGTK!</p>
]]></content:encoded>
			<wfw:commentRss>http://raffy.ch/blog/2005/12/04/python-for-beginners-and-for-me/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
	</channel>
</rss>

