Towards Better Understanding of Cybercrime: The Role of Fine-Tuned LLMs in Translation

This blog was authored by Veronica Valeros on August 27th, 2024

We are thrilled to share that our paper “Towards Better Understanding of Cybercrime: The Role of Fine-Tuned LLMs in Translation” is available in the proceedings of the 2024 IEEE European Symposium on Security and Privacy Workshops (EuroS&PW). The paper was presented in July in Vienna at the 6th Workshop on Attackers and Cyber-Crime Operations (WACCO). This was joint research by Veronica Valeros (CTU in Prague), Anna Širokova (Rapid7), Carlos Catania (UNCuyo) and Sebastian Garcia (CTU in Prague).

Translating content from cybercrime forums, chats and other sources is hard, costly, slow and not scalable. The results can be biased and inaccurate, and the work exposes human analysts to toxic and disturbing content. Our paper explores the use of Large Language Models (LLMs) to translate public hacktivist messages from Russian to English as a way to address these problems. We show that our method can achieve high-fidelity translations and reduce costs by a factor of 430 to 23,000 compared to a human translator.