The research we do at Stratosphere uses data and features extracted from real malware traffic captures. We capture the real behavior of the malware and thus ensure the models we create are sound. Since 2015, we have publicly released more than 300 long-term malware traffic captures to the community.
A NEW DATASET INDEX
We share with everyone the need for an easier way to search through these datasets and find the right data for a specific piece of research. As a small step in this direction, we are introducing a new dataset index: https://mcfp.felk.cvut.cz/publicDatasets/datasets.html
This new simple and dynamic table contains the basic information about each dataset:
Infection date: this is the date when the malware infection started. Remember that some captures can last for months.
Dataset Name: this is the dataset name or ID. It’s unique.
Malware: the malware family of the original infection, to the best of our knowledge. Some infections could have evolved into something else.
MD5
SHA256
URL: this is the URL where all the dataset files can be found (pcap, netflows, readme, binary, etc.).
QUICK BROWSE AND SEARCH
The new index lets you quickly sort by any column and, more importantly, search. The search finds matches in any column, so you can quickly look up a specific malware family, date, or hash.
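For readers who want the same behavior offline, here is a minimal Python sketch of a search that matches in any column, plus a sort by column. It is only an illustration, not how the web table is implemented. The function names `search_rows` and `sort_rows` are hypothetical, and the sample rows contain made-up placeholder values whose keys simply mirror the column names listed above.

```python
from typing import Dict, List

Row = Dict[str, str]

# Placeholder rows for illustration only: these are NOT real dataset entries.
# The keys mirror the table columns described in this post.
SAMPLE_ROWS: List[Row] = [
    {
        "Infection date": "2017-05-02",
        "Dataset Name": "example-capture-2",
        "Malware": "ExampleFamilyB",
        "MD5": "b" * 32,
        "SHA256": "b" * 64,
        "URL": "https://example.org/example-capture-2/",
    },
    {
        "Infection date": "2015-03-09",
        "Dataset Name": "example-capture-1",
        "Malware": "ExampleFamilyA",
        "MD5": "a" * 32,
        "SHA256": "a" * 64,
        "URL": "https://example.org/example-capture-1/",
    },
]


def search_rows(rows: List[Row], query: str) -> List[Row]:
    """Return the rows in which any column contains the query (case-insensitive)."""
    needle = query.lower()
    return [
        row
        for row in rows
        if any(needle in str(value).lower() for value in row.values())
    ]


def sort_rows(rows: List[Row], column: str, descending: bool = False) -> List[Row]:
    """Return a new list sorted by one column; missing values sort as empty strings."""
    return sorted(rows, key=lambda row: str(row.get(column, "")), reverse=descending)
```

Sorting by "Infection date" as a string only gives chronological order if the dates are stored in ISO format (YYYY-MM-DD), which is an assumption of this sketch.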
INDEX IN JSON FORMAT
The same information shown in this table can also be downloaded in JSON format. Hopefully, this will make it easier to parse the data automatically and to build some automation around it. The JSON file can be downloaded from https://mcfp.felk.cvut.cz/publicDatasets/datasets.json
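As a starting point for automation, the JSON index could be fetched and parsed with the Python standard library along these lines. This is a sketch under assumptions: the exact layout of the file is not described in this post, so `parse_index` (a hypothetical helper name) accepts either a top-level list of records or an object that wraps such a list, and it makes no assumption about the field names inside each record.

```python
import json
from typing import Any, Dict, List
from urllib.request import urlopen

# URL of the JSON index, as given in this post.
INDEX_URL = "https://mcfp.felk.cvut.cz/publicDatasets/datasets.json"


def parse_index(raw: str) -> List[Dict[str, Any]]:
    """Parse the index text into a list of record dictionaries.

    Assumption: the JSON is either a list of objects, or an object that
    holds such a list under one of its keys. Anything else is rejected.
    """
    data = json.loads(raw)
    if isinstance(data, list):
        return [item for item in data if isinstance(item, dict)]
    if isinstance(data, dict):
        for value in data.values():
            if isinstance(value, list):
                return [item for item in value if isinstance(item, dict)]
    raise ValueError("unexpected JSON layout for the dataset index")


def download_index(url: str = INDEX_URL, timeout: float = 30.0) -> List[Dict[str, Any]]:
    """Download the index over HTTPS and parse it (requires network access)."""
    with urlopen(url, timeout=timeout) as response:
        return parse_index(response.read().decode("utf-8"))
```

Calling `download_index()` returns the records as plain dictionaries. Printing the keys of the first record is a quick way to check the actual field names before writing any filtering logic.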
CITATION
If you use these datasets, please use the following citation:
Stratosphere. (2015). Stratosphere Laboratory Datasets. Retrieved March 13, 2020, from https://www.stratosphereips.org/datasets-overview
In BibTeX format:
@misc{stratodatasets,
  title={Stratosphere Laboratory Datasets},
  author={Stratosphere},
  year={2015},
  note={Retrieved March 13, 2020, from \url{https://www.stratosphereips.org/datasets-overview}}
}