Malware Capture Facility Project
The Stratosphere IPS Project has a sister project, the Malware Capture Facility Project, which is responsible for making long-term captures. This project continually obtains malware and normal data to feed the Stratosphere IPS.
Why do we capture Malware, Normal, and Mixed traffic?
Machine learning algorithms must be evaluated on real data to measure their actual performance. Good datasets are especially important in computer network security, because network data is unbounded, constantly changing, varied, and subject to high concept drift. These properties force us to obtain good datasets to train, verify, and test the algorithms.
A good verification requires three types of traffic: Malware, Normal, and Background. The Malware traffic includes everything we want to detect, especially C&C (Command and Control) connections. The Normal traffic is essential for measuring the real performance of our algorithms, because it lets us compute the False Positives and True Negatives. The Background traffic is needed to saturate the algorithms, to verify their memory and speed performance, and to test whether they get confused by the data.
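As a minimal sketch of how labeled Malware and Normal traffic can be turned into False Positive and True Negative counts, the following Python example tallies a confusion matrix over per-flow labels. The function names, the label strings, and the choice to skip Background flows (on the assumption that they carry no ground truth) are illustrative assumptions, not the project's actual evaluation code.

```python
def confusion_counts(labels, predictions):
    """Count TP, FP, TN, FN over flows labeled "Malware" or "Normal".

    Assumption: "Background" flows have no ground truth, so they are skipped.
    `predictions` holds True when the detector flagged the flow as malicious.
    """
    counts = {"TP": 0, "FP": 0, "TN": 0, "FN": 0}
    for label, flagged in zip(labels, predictions):
        if label == "Malware":
            counts["TP" if flagged else "FN"] += 1
        elif label == "Normal":
            counts["FP" if flagged else "TN"] += 1
    return counts


def false_positive_rate(counts):
    """FPR = FP / (FP + TN); returns 0.0 when there are no Normal flows."""
    negatives = counts["FP"] + counts["TN"]
    return counts["FP"] / negatives if negatives else 0.0


# Hypothetical labels and detector outputs, for illustration only.
labels = ["Normal", "Normal", "Malware", "Background", "Normal", "Malware"]
predictions = [False, True, True, True, False, False]

counts = confusion_counts(labels, predictions)
fpr = false_positive_rate(counts)
```

Without Normal traffic, both FP and TN would be zero and the false positive rate could not be measured at all, which is why the Normal captures matter as much as the Malware ones.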
Mixed Datasets
Our datasets are composed of long-term malware captures, manual attacks, normal captures, and mixed captures. The mixed captures we performed are listed below. Each folder contains a description of the captured behavior.
Mixed Captures
CTU-Mixed-Capture-1
Normal and then infected with MD5 2d17f8f6fab6da5619c7528e9b0ee135
CTU-Mixed-Capture-2
Normal and then infected with MD5 6daff56b1c5429b7460dcf836803bea3
CTU-Mixed-Capture-3
Normal and then infected with MD5 a0840a39ec90e1f603e2f4be42a87026
CTU-Mixed-Capture-4
Normal and then infected with MD5 4d9838607597427f2dd6b1d2092f1e76
CTU-Mixed-Capture-5
Normal and then infected with MD5 c5d81a096cbc34edd0046e33cffbe070
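The MD5 values listed above identify the sample used in each capture. As a hedged sketch, the following Python example shows one way to compare a file's MD5 digest against the listed value using the standard `hashlib` module. The helper names and the dictionary layout are assumptions made for illustration, and the demo hashes harmless in-memory bytes rather than a real sample.

```python
import hashlib
import io

# MD5 values copied from the capture list above.
LISTED_MD5 = {
    "CTU-Mixed-Capture-1": "2d17f8f6fab6da5619c7528e9b0ee135",
    "CTU-Mixed-Capture-2": "6daff56b1c5429b7460dcf836803bea3",
    "CTU-Mixed-Capture-3": "a0840a39ec90e1f603e2f4be42a87026",
    "CTU-Mixed-Capture-4": "4d9838607597427f2dd6b1d2092f1e76",
    "CTU-Mixed-Capture-5": "c5d81a096cbc34edd0046e33cffbe070",
}


def md5_hex(stream, chunk_size=65536):
    """Return the hex MD5 digest of a binary stream, read in chunks."""
    digest = hashlib.md5()
    for chunk in iter(lambda: stream.read(chunk_size), b""):
        digest.update(chunk)
    return digest.hexdigest()


def matches_listed_md5(stream, capture_name):
    """Check whether the stream's MD5 equals the one listed for the capture."""
    return md5_hex(stream) == LISTED_MD5[capture_name]


# Demo on harmless in-memory bytes; a real file would be opened with
# open(path, "rb") and passed in the same way.
demo_digest = md5_hex(io.BytesIO(b"abc"))
demo_match = matches_listed_md5(io.BytesIO(b"abc"), "CTU-Mixed-Capture-1")
```

MD5 is used here only because it is the identifier given in the list; it serves to identify a sample, not as a collision-resistant integrity guarantee.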