Bachelor Thesis
There are many malware families and every each of them has some unique features. The aim of this work is to focus on detecting malicious behavior using leaving network communication. Our hypothesis is that this malicious communication has sequential behavioral patterns. We present a new graph representation of leaving network communication using (IP address, port, protocol)-triplets as vertices. There is an edge between two vertices if they come one after the other in the record of the leaving communication of the inspected host.We think this representation might prove useful in detecting the patterns by a program and even by a naked eye. Random Forest algorithm was used for predicting. Testing was done against datasets of normal users, infected hosts and normal users that are later infected. We were able to detect malicious communication with up to 97% accuracy.
Download this thesis here.