Model Extraction Attacks and Defenses

Background

As machine learning (ML) applications become increasingly prevalent, protecting the confidentiality of ML models is paramount. One way to protect model confidentiality is to limit access to the model to a well-defined prediction API. Nevertheless, prediction APIs still leak enough information to make model extraction attacks possible. In a model extraction attack, the adversary has access only to the prediction API of a target model, which they query to extract information about the model's internals. The adversary uses this information to gradually train a substitute model that reproduces the predictive behaviour of the target model, as sketched below.
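
To make the attack setting concrete, below is a minimal, illustrative sketch of a model extraction loop in Python. The model classes, synthetic data, and random query strategy are assumptions chosen for brevity (scikit-learn stand-ins), not the method of any particular publication listed on this page.

    # Illustrative sketch of model extraction; models, data, and query
    # strategy are assumptions, not any specific published attack.
    import numpy as np
    from sklearn.datasets import make_classification
    from sklearn.linear_model import LogisticRegression
    from sklearn.tree import DecisionTreeClassifier

    # Stand-in for the victim: trained on private data the adversary never sees.
    X_private, y_private = make_classification(
        n_samples=1000, n_features=10, random_state=0
    )
    target = DecisionTreeClassifier(random_state=0).fit(X_private, y_private)

    def prediction_api(x):
        """The only interface the adversary has to the target model."""
        return target.predict(x)

    # Attack: query the API with adversary-chosen inputs, collect the
    # returned labels, and train a substitute on the (query, label) pairs.
    rng = np.random.default_rng(1)
    queries = rng.normal(size=(500, 10))  # adversary-chosen query points
    labels = prediction_api(queries)      # labels leaked by the API
    substitute = LogisticRegression(max_iter=1000).fit(queries, labels)

    # Measure how closely the substitute reproduces the target's
    # predictive behaviour on the target's own (private) data.
    agreement = (substitute.predict(X_private) == target.predict(X_private)).mean()
    print(f"Agreement with target: {agreement:.2%}")

In practice, attacks differ mainly in the query strategy (random, adaptive, or synthetic-data driven) and in how much of the API output (labels, probabilities, confidences) the adversary can observe; the sketch above uses the weakest setting, labels only.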

Conference/journal paper publications

Technical reports

Theses

Talks

Demos and Posters

Source code