Data Free Model Extraction Attack
Before discussing the data-free model extraction attack, let us first understand how model extraction typically happens in its rudimentary form.
Let us say you are a food lover and want to figure out the secret recipe of your favourite restaurant; what will you do first?
You will order the food, and while tasting it, you will carefully try to identify every flavour. For example, you get the tanginess from tomatoes, spiciness from red chillies, sweetness from grated coconut, fiery heat from the bite of green chillies, and so on. Using this knowledge, you add the ingredients and prepare the dish yourself. If it doesn't taste the same, you order the dish from the restaurant again and repeat the process until you get almost the same taste.
In a nutshell, adversaries perform precisely the same process to steal a model and create a clone of it.
Since computing power is much cheaper nowadays, Deep Neural Networks (DNNs) have been widely adopted, paving the way for accurately solving many complex problems. Many cloud vendors now provide DNN models as Machine Learning as a Service (MLaaS). As a result, many use cases that would otherwise have relied on traditional algorithms now use DNNs. However, DNNs are complex and hence difficult to explain, so data scientists started using an "explanation model" on top of the DNN to explain and interpret its predictions. Unfortunately, these explanation models are exposed to the world through APIs, and so they eventually become the victims of adversaries. How? Let us see.
The adversaries query the model through its API with many random inputs, record the response to each query, and use these input-output pairs as surrogate training data.
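The querying step above can be sketched as follows. This is a minimal illustration, not a real attack: `query_victim` is a hypothetical stand-in for an MLaaS endpoint, and the "secret" weights inside it are invented for the example.

```python
import numpy as np

# Hypothetical stand-in for the victim's prediction API. In a real attack
# this would be a remote MLaaS endpoint the adversary can only query.
def query_victim(x):
    # A secret decision rule the attacker cannot see directly.
    secret_w = np.array([0.7, -1.2, 0.4])
    return float(x @ secret_w > 0)  # class label returned by the API

# The adversary sends many random inputs and records each (input, output) pair.
rng = np.random.default_rng(0)
surrogate_X = rng.normal(size=(1000, 3))
surrogate_y = np.array([query_victim(x) for x in surrogate_X])

# The recorded pairs now form a surrogate training set for a clone model.
print(surrogate_X.shape, surrogate_y.shape)  # (1000, 3) (1000,)
```

The adversary never sees `secret_w`; the surrogate dataset alone is enough to train a clone that imitates the victim's input-output behaviour.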
However, this approach is easy to detect: the MLaaS provider can rate-limit or block clients that exceed a daily query limit, and querying at random also makes stealing the model a lengthy process.
So the adversaries instead train a generative model, another neural network, to produce inputs resembling the victim model's training distribution. I have copied a dataflow diagram to depict the same. Credit: MEGEX: Data-Free Model Extraction Attack against Gradient-Based Explainable AI.
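The core training loop of such a data-free extraction can be sketched with a toy example. Everything here is an assumption for illustration: the victim is a made-up logistic classifier, the "generator" is simplified to a Gaussian sampler (MEGEX instead *trains* the generator, using the victim's gradient-based explanations, to emit inputs where clone and victim disagree), and the clone is a single-layer model fitted to the victim's soft outputs.

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical victim: a soft classifier the attacker can only query.
victim_w = np.array([1.5, -0.8])
def victim(x):
    # Returns a probability, like a prediction API would.
    return 1.0 / (1.0 + np.exp(-(x @ victim_w)))

# Clone model with the same (assumed) architecture, initialised at zero.
def clone(x, w):
    return 1.0 / (1.0 + np.exp(-(x @ w)))

# Simplified "generator": a Gaussian sampler producing synthetic inputs,
# so no real training data is ever needed.
def generate_batch(n):
    return rng.normal(size=(n, 2))

clone_w = np.zeros(2)
lr = 0.5
for step in range(2000):
    X = generate_batch(32)
    y = victim(X)                    # query the victim through its "API"
    p = clone(X, clone_w)
    grad = X.T @ (p - y) / len(X)    # logistic-loss gradient toward victim outputs
    clone_w -= lr * grad

print(clone_w)  # should approach victim_w = [1.5, -0.8]
```

Because the clone is trained to match the victim's outputs on generated inputs, its weights drift toward the victim's without the attacker ever touching the original training data, which is exactly what makes the attack "data-free".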
Generative models are an exciting recent machine learning innovation: they create new data instances that resemble the training data. For example, when typing a Google search query, you have likely experienced a model that suggests the next word before you physically type it. Adversaries use such a model to prepare training data and develop a clone that replicates the "explanation model". Scary? But it is true!
Can you see how a machine learning algorithm built to simplify our decision-making can also be misused to steal that very decision-making process? I hope this gave you a high-level understanding of how your model can be stolen without access to your training data. I hope you enjoyed reading it. Please like, share and comment!
Views are personal.
Image Credit:
Photo by Pixabay: https://www.pexels.com/photo/close-up-photo-of-gorilla-35992/
References and additional reading:
https://developers.google.com/machine-learning/gan/generative
https://arxiv.org/abs/2107.08909