At Stratosphere, we like to keep learning and sharing knowledge among team members. For this purpose, we hold regular learning sessions on different topics. Today's topic was 'Python Introduction for Network Traffic Visualisation', taught by Sebastian (aka @eldracote).
The goals of today's session were:
To get started with Python
To get a working template
To analyse binetflow files (obtained from a malware traffic capture) and plot the relationships
The template and the binetflow file used are hosted here: https://github.com/stratosphereips/Basic-Python-Learning. And the session outline can be seen here: Google Docs Basic Python Session
Revisiting concepts
Before getting started, we reviewed some basic Python concepts and some of the changes introduced in Python 3: functions, conditionals, loops, data types (strings, ints, floats, lists, dictionaries), the __main__ definition, parsing arguments, calling functions, opening files and reading lines, looping through the content, and string operations to split lines.
Basic python template
The template is really simple, but it's designed to save time when getting started. It has a basic function and reads its parameters from the command line.
The new way of parsing arguments is great because it handles everything: parameter names, the long option format, the help text, and the allowed input values for each option. That's quite useful.
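As an illustration, here is a minimal sketch of what such a template could look like. It assumes that the "new way of parsing arguments" refers to the standard argparse module; the option names (-f/--file, -e/--engine) and the default file name are hypothetical and are not taken from the template in the repository.

```python
import argparse


def build_parser():
    """Create the command-line parser (option names are illustrative)."""
    parser = argparse.ArgumentParser(
        description="Convert a binetflow file into a DOT graph."
    )
    # Short and long option names; argparse generates -h/--help by itself.
    parser.add_argument("-f", "--file", default="capture.binetflow",
                        help="path to the input binetflow file")
    # 'choices' restricts the values that the option accepts.
    parser.add_argument("-e", "--engine", choices=["dot", "sfdp"],
                        default="dot", help="Graphviz layout program to use")
    return parser


def main(argv=None):
    """Basic function of the template: parse the arguments and report them."""
    args = build_parser().parse_args(argv)
    print(f"Reading {args.file}, layout engine: {args.engine}")
    return args


if __name__ == "__main__":
    main()
```

Running the script with -h prints the generated help, and passing a value for --engine that is not listed in choices makes argparse exit with an error message.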
The DOT graph description language
DOT is a simple and pleasant graph description language. It is easy to generate from any script, and the result can then be passed to other tools that turn it into an image.
A simple DOT file representing traffic connections between devices is shown below:
digraph graphname{
"192.168.1.102" -> "192.168.1.2"
"192.168.1.102" -> "239.255.255.250"
"192.168.1.102" -> "239.255.255.250"
"192.168.1.102" -> "239.255.255.250"
"192.168.1.102" -> "8.8.8.8"
}
With this file, we can use the 'dot' program (included in Graphviz) to create a 'png' image with the following command: cat test.dot | dot -Tpng -o test.png
The dot program reads the relationships defined in the 'graphname' graph and automatically generates an image of it. The visualisation for the previous graph is shown below.
Analysing binetflow files and plotting relationships
In this session, we worked with a binetflow file that was generated from a malware capture pcap using the Argus program. Here is an example of the first 10 lines of that file:
StartTime,Dur,Proto,SrcAddr,Sport,Dir,DstAddr,Dport,State,sTos,dTos,TotPkts,TotBytes,SrcBytes,Label
1970/01/01 01:00:00.000000,0.000000,llc,00:00:00:00:00:00,0, ->,00:00:00:00:00:00,0,INT,,,1,60,60,
1970/01/01 01:00:07.155617,2256.163086,arp,192.168.1.102,, who,192.168.1.2,,CON,,,54,2268,1134,
1970/01/01 01:00:07.337532,1.992893,arp,0.0.0.0,, who,192.168.1.102,,INT,,,3,126,126,
1970/01/01 01:00:10.346739,0.000000,igmp,192.168.1.102,, ->,239.255.255.250,,INT,0,,1,46,46,
1970/01/01 01:00:10.542317,7.711234,udp,192.168.1.102,51743, ->,239.255.255.250,1900,INT,0,,8,1400,1400,
1970/01/01 01:00:10.832908,0.000000,igmp,192.168.1.102,, ->,239.255.255.250,,INT,0,,1,46,46,
1970/01/01 01:00:13.039304,0.001188,udp,192.168.1.102,59458, <->,8.8.8.8,53,CON,0,0,2,168,76,
1970/01/01 01:00:13.041007,0.001193,udp,192.168.1.102,65071, <->,8.8.8.8,53,CON,0,0,2,180,76,
1970/01/01 01:00:18.050956,2302.088135,arp,192.168.1.1,, who,192.168.1.102,,CON,,,18,918,540,
In Python, we created a parser for this file and generated a .dot graph file with the relationships between the IPs. The file looks just like the one shown above, but with hundreds of entries. The program is simple: it reads the binetflow file, splits each line on commas, takes the source and destination addresses, and prints them in the DOT format.
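The steps just described can be sketched as follows. This is not the exact program written in the session; the function name binetflow_to_dot and the SAMPLE data (three lines copied from the excerpt above) are only for illustration, and the column positions are looked up from the header instead of being hard-coded.

```python
def binetflow_to_dot(lines, graph_name="graphname"):
    """Return DOT text with one edge per flow found in the binetflow lines."""
    lines = iter(lines)
    # The first line is the header; use it to locate the address columns.
    header = next(lines).strip().split(",")
    src_idx = header.index("SrcAddr")
    dst_idx = header.index("DstAddr")

    edges = []
    for line in lines:
        fields = line.strip().split(",")
        if len(fields) <= max(src_idx, dst_idx):
            continue  # skip blank or truncated lines
        edges.append(f'"{fields[src_idx]}" -> "{fields[dst_idx]}"')

    body = "\n".join(edges)
    return f"digraph {graph_name}{{\n{body}\n}}\n"


# Sample input: the header and two flows from the excerpt above.
SAMPLE = [
    "StartTime,Dur,Proto,SrcAddr,Sport,Dir,DstAddr,Dport,State,sTos,dTos,"
    "TotPkts,TotBytes,SrcBytes,Label",
    "1970/01/01 01:00:10.346739,0.000000,igmp,192.168.1.102,, ->,"
    "239.255.255.250,,INT,0,,1,46,46,",
    "1970/01/01 01:00:13.039304,0.001188,udp,192.168.1.102,59458, <->,"
    "8.8.8.8,53,CON,0,0,2,168,76,",
]

if __name__ == "__main__":
    print(binetflow_to_dot(SAMPLE), end="")
```

With a real capture, the lines would come from an open file object instead of SAMPLE, and the printed text can be redirected into a .dot file.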
Once we generate the .dot file, we can visualise it in the same way. Below you can see an example of the big graph created for a malware capture. Note that there are so many nodes that the generated chart looks like a horizontal line, but it is actually a graph.
That graph has too many nodes, so we took the first 400 connections and then used another tool similar to 'dot'. The new tool is called 'sfdp' and is used in the same way as 'dot': cat test2.dot | sfdp -Tpng -o test3.png. It can generate other types of layouts that are more convenient for this kind of data.
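One possible way to keep only the first 400 connections is to trim the generated DOT text before rendering it. The sketch below assumes the simple layout used in this post (an opening line, one edge per line, and a closing brace); the helper names truncate_dot and render are hypothetical, and we do not claim this is how the cut was made in the session. The render helper runs the same command line shown above and needs Graphviz to be installed.

```python
import subprocess


def truncate_dot(dot_text, limit=400):
    """Keep only the first `limit` edge lines of a simple DOT digraph."""
    lines = dot_text.strip().splitlines()
    # First line opens the graph, last line closes it, the rest are edges.
    header, edges, footer = lines[0], lines[1:-1], lines[-1]
    return "\n".join([header, *edges[:limit], footer]) + "\n"


def render(dot_text, out_path, engine="sfdp"):
    """Pipe DOT text into a Graphviz program, like `cat x.dot | sfdp ...`."""
    subprocess.run([engine, "-Tpng", "-o", out_path],
                   input=dot_text, text=True, check=True)


if __name__ == "__main__":
    example = 'digraph graphname{\n"a" -> "b"\n"a" -> "c"\n"b" -> "c"\n}\n'
    print(truncate_dot(example, limit=2), end="")
    # render(truncate_dot(example, limit=2), "test3.png")  # needs Graphviz
```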
Conclusion
The session was quite good and fast-paced. It certainly got all of us hooked on creating visualisations! Time to continue practicing and playing with these graphs and Python!