Frequently Asked Questions
How to cite the CTU-13 dataset?
To cite the dataset, please cite the paper "An empirical comparison of botnet detection methods", Sebastian Garcia, Martin Grill, Jan Stiborek and Alejandro Zunino. Computers and Security Journal, Elsevier. 2014. Vol 45, pp 100-123. http://dx.doi.org/10.1016/j.cose.2014.05.011
How to cite the IoT-23 dataset?
To cite the dataset, please reference it as "Stratosphere Laboratory. A labeled dataset with malicious and benign IoT network traffic. January 22nd. Agustin Parmisano, Sebastian Garcia, Maria Jose Erquiaga. https://www.stratosphereips.org/datasets-iot23"
How to cite other normal, malicious, or mixed datasets?
To cite any of our other datasets, please reference them as "Stratosphere. (2015). Stratosphere Laboratory Datasets. Retrieved March 13, 2020, from https://www.stratosphereips.org/datasets-overview"
Do you need permission to use our datasets?
You do not need to ask permission from us to use our datasets as long as you cite the datasets appropriately as specified above.
How can I find the pcap files in the datasets?
Pcap files are available for selected datasets only. If the pcap file is not public, then we cannot share it for privacy reasons.
What is the password for the password-protected files?
The password-protected files can be opened using the password ‘infected’.
The datasets webpage is unavailable. Do you have an alternative download site?
If our datasets webpage is unavailable, you can download the datasets from the following locations:
CTU-13-Dataset.tar.bz2: https://mega.nz/file/55UE3K7J#PSG5nnY5lBProtvlK5df6kbTJH2HkK2KyngiVg5nxQU
iot_23_datasets_full.tar.gz: https://mega.nz/file/FwkBXKbA#MpiEU-2R_2BD7y-qzW3sMLWYkTDLaaMa4gGTfbR0WXw
iot_23_datasets_small.tar.gz: https://mega.nz/file/V481gSxJ#vZYFQnS0PFvTtWQFZ6IEe15Wb3L5Wgp8qfgqOflkoq4
What tool was used to generate the botnet traffic?
This is described on each dataset's home page.
What is the meaning of the headers 'StartTime', 'Dur', 'Proto', 'SrcAddr', 'Sport', 'Dir', 'DstAddr', 'Dport', 'State', 'sTos', 'dTos', 'TotPkts', 'TotBytes', 'SrcBytes', 'Label'?
The files are generated using Argus and the ‘ra’ (read Argus) tool. Check the full description of each header here https://www.systutorials.com/docs/linux/man/1-ra/ under the explanation of the parameter -s.
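As a rough illustration of how these headers map onto a file, the sketch below reads flows with Python's csv module. It assumes the binetflow file is comma-separated and starts with the header row listed above. The function name read_binetflow is hypothetical, and the sample row is synthetic, invented only to make the example runnable.

```python
import csv
import io

# Header row as listed in the question above (assumed to be comma-separated).
HEADER = ("StartTime,Dur,Proto,SrcAddr,Sport,Dir,DstAddr,Dport,State,"
          "sTos,dTos,TotPkts,TotBytes,SrcBytes,Label")

# Synthetic flow, invented for illustration only; not taken from any dataset.
SAMPLE = HEADER + "\n" + (
    "2020/01/01 00:00:00.000000,1.5,tcp,192.0.2.10,50000,->,"
    "198.51.100.20,80,FSPA_FSPA,0,0,10,1200,400,example-label\n"
)


def read_binetflow(handle):
    """Yield one dict per flow, converting the numeric counters to numbers."""
    for row in csv.DictReader(handle):
        row["Dur"] = float(row["Dur"])
        for key in ("TotPkts", "TotBytes", "SrcBytes"):
            row[key] = int(row[key])
        yield row


flows = list(read_binetflow(io.StringIO(SAMPLE)))
print(flows[0]["SrcAddr"], flows[0]["TotBytes"], flows[0]["Label"])
```

To read a real file, pass an open file handle instead of the in-memory sample, for example open("file.binetflow", newline="").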
Where can I find the number of attacks, IPs, netflows, and other statistics for a given scenario?
The statistics for the CTU-13 scenarios are provided here: https://www.stratosphereips.org/datasets-ctu13 .
The statistics for the IoT-23 scenarios can be found here: https://www.stratosphereips.org/datasets-iot23 .
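If you want to recompute similar counts yourself from a labeled flow file, a minimal sketch could look like the following. It assumes a comma-separated file with at least the 'SrcAddr' and 'Label' columns. The function name summarize is hypothetical and the rows are synthetic, so the numbers it prints say nothing about the real datasets.

```python
import csv
import io
from collections import Counter


def summarize(handle):
    """Return (flows per label, number of distinct source IPs, total flows)."""
    labels = Counter()
    sources = set()
    for row in csv.DictReader(handle):
        labels[row["Label"]] += 1
        sources.add(row["SrcAddr"])
    return labels, len(sources), sum(labels.values())


# Synthetic rows, invented for illustration only; real files have more columns.
SAMPLE = """SrcAddr,DstAddr,Label
192.0.2.10,198.51.100.20,label-a
192.0.2.10,198.51.100.21,label-a
192.0.2.11,198.51.100.20,label-b
"""

labels, n_sources, n_flows = summarize(io.StringIO(SAMPLE))
print(labels.most_common(), n_sources, n_flows)
```

The function reads the file one row at a time, so it does not need to hold a large capture in memory.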
How can I create a binetflow file in the same format as yours?
You can find the Argus configuration files in each of the folders of the detailed-bidirectional-flow-labels of the datasets.
For example, a given dataset will have two configuration files, one for Argus and one for the ‘ra’ tool.
You can use these configuration files to read a pcap and then generate the binetflow file:
argus -F Argus.conf -r file.pcap -w - | ra -r - -n -F ra.conf -Z b > file.binetflow
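If you need to run this conversion over many pcap files, the same pipeline can be driven from Python. The sketch below mirrors the flags of the command above and assumes argus and ra are installed and on the PATH. The function names build_commands and pcap_to_binetflow are hypothetical helpers, not part of Argus.

```python
import subprocess


def build_commands(pcap_path, argus_conf="Argus.conf", ra_conf="ra.conf"):
    """Return the argus and ra argument lists, mirroring the command above."""
    argus_cmd = ["argus", "-F", argus_conf, "-r", pcap_path, "-w", "-"]
    ra_cmd = ["ra", "-r", "-", "-n", "-F", ra_conf, "-Z", "b"]
    return argus_cmd, ra_cmd


def pcap_to_binetflow(pcap_path, out_path, **conf):
    """Pipe argus into ra and write ra's output to out_path."""
    argus_cmd, ra_cmd = build_commands(pcap_path, **conf)
    with open(out_path, "wb") as out:
        argus = subprocess.Popen(argus_cmd, stdout=subprocess.PIPE)
        ra = subprocess.Popen(ra_cmd, stdin=argus.stdout, stdout=out)
        argus.stdout.close()  # lets argus see a broken pipe if ra exits early
        ra_status = ra.wait()
        argus_status = argus.wait()
    if argus_status != 0 or ra_status != 0:
        raise RuntimeError(f"argus exited {argus_status}, ra exited {ra_status}")


# Only build and show the commands here; nothing is executed.
argus_cmd, ra_cmd = build_commands("file.pcap")
print(" ".join(argus_cmd), "|", " ".join(ra_cmd))
```

Calling pcap_to_binetflow("file.pcap", "file.binetflow") would then run the actual conversion, provided both configuration files are in the working directory.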
How do you label a malware capture scenario?
The labeling done at Stratosphere was performed through manual packet analysis of the traffic. If you are interested in doing this process yourself, we recommend the following resources:
Towards a better labeling process for network security datasets: https://arxiv.org/abs/2305.01337
netflowlabeler tool: https://github.com/stratosphereips/netflowlabeler