Machine learning privacy: analysis and implementation of model extraction attacks

The rise in popularity of Machine Learning (ML) and its rapid improvement have led to the emergence of a new class of attack: the model extraction attack. Model extraction attacks are privacy attacks that aim to extract information about a victim model or even steal its functionality. Although these attacks are heavily researched, it is difficult to compare the approaches proposed in different papers. In this work, we present MET, a tool that implements state-of-the-art model extraction attacks against arbitrary ML models and datasets. Using the tool, we performed a comprehensive comparison of the implemented attacks to see how they perform under different settings. Our results show that in black-box scenarios the attacks perform similarly. Based on these results, we propose and implement improvements to some of the attacks in terms of both speed and performance.
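The core loop that model extraction attacks share — query the victim model as a black box, then train a substitute on the returned labels — can be sketched as follows. The victim model, query budget, and substitute (a logistic regression trained by SGD) are illustrative assumptions for this sketch, not details of MET or of any specific attack from the literature.

```python
import math
import random

# Hypothetical victim: a secret linear classifier the attacker can only query.
SECRET_W = [1.5, -2.0]
SECRET_B = 0.3

def victim_predict(x):
    """Black-box oracle: returns only the predicted label, never the weights."""
    score = sum(w * xi for w, xi in zip(SECRET_W, x)) + SECRET_B
    return 1 if score > 0 else 0

def extract(n_queries=2000, epochs=200, lr=0.5):
    """Train a substitute logistic regression on query/label pairs."""
    random.seed(0)
    xs = [[random.uniform(-1, 1), random.uniform(-1, 1)] for _ in range(n_queries)]
    ys = [victim_predict(x) for x in xs]  # the only access to the victim
    w, b = [0.0, 0.0], 0.0
    for _ in range(epochs):
        for x, y in zip(xs, ys):
            p = 1 / (1 + math.exp(-(w[0] * x[0] + w[1] * x[1] + b)))
            g = p - y  # gradient of the logistic loss w.r.t. the score
            w[0] -= lr * g * x[0]
            w[1] -= lr * g * x[1]
            b -= lr * g
    return w, b

def substitute_predict(params, x):
    w, b = params
    return 1 if w[0] * x[0] + w[1] * x[1] + b > 0 else 0

params = extract()
test = [[random.uniform(-1, 1), random.uniform(-1, 1)] for _ in range(1000)]
agreement = sum(
    substitute_predict(params, x) == victim_predict(x) for x in test
) / len(test)
print(f"substitute/victim agreement: {agreement:.2%}")
```

The attacks compared in the thesis differ mainly in how they choose the queries (random, synthetic, or actively selected) and in the substitute architecture, but the extract-by-querying structure above is common to all of them.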

URL: https://dspace.cvut.cz/handle/10467/95288