Robust Machine Learning
How to protect model training against corrupted data and malfunctioning machines.

Robust machine learning is the practice of protecting model training against corrupted data and malfunctioning computing units. It is no secret that machine-learning solutions can only be as good as their training procedure, no matter how sophisticated the model architecture is. The success of classical machine-learning algorithms rests upon the dubious assumption of “clean” training data, an assumption that seldom holds in practice.

The problem of robust machine learning becomes even more critical today, as we increasingly rely on distributed architectures, such as federated learning, to train models. Distributed algorithms employ several machines (often remotely located) that collectively train a global model, with the training data partitioned across the machines. In this learning paradigm, it is extremely challenging to ensure the integrity of the local computations performed by the machines. Hence, the chances of model corruption in distributed learning methods are much higher. Participating machines may get corrupted for reasons such as hardware or software bugs and local data poisoning (see Poisoning Web-Scale Training Datasets is Practical).
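To see why a single corrupted machine is so dangerous, consider the standard aggregation step in distributed training, where the server averages the gradients reported by the workers. The following is a toy sketch (not taken from any particular system) using numpy; the gradient values and worker count are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical setup: 4 honest workers each report a noisy estimate of the
# true gradient [1.0, -2.0]; one malfunctioning (Byzantine) worker reports
# an arbitrary, wildly wrong vector.
honest = [np.array([1.0, -2.0]) + 0.01 * rng.standard_normal(2) for _ in range(4)]
byzantine = [np.array([1000.0, 1000.0])]
gradients = honest + byzantine

# Plain averaging: the single corrupted gradient dominates the update,
# pulling it far away from the true gradient.
mean_update = np.mean(np.stack(gradients), axis=0)
print(mean_update)  # nowhere near [1.0, -2.0]
```

The averaged update is off by two orders of magnitude, even though four out of five workers behaved correctly: averaging offers no protection at all against a single adversarial input.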

Classical machine-learning algorithms are notoriously vulnerable to corruption in the training phase. Nonetheless, they are gaining popularity in sensitive, public-facing applications, e.g., healthcare and autonomous vehicles (see https://research.aimultiple.com/federated-learning/). Therefore, these vulnerabilities, if left unaddressed, may have severe societal consequences.

Unsurprisingly, the topic of robust machine learning has received a lot of attention in recent years. Several robustness schemes have been proposed, analyzed and tested against various forms of training corruption. We aim to systematize the advancements in this field through our recent survey paper, Byzantine Machine Learning: A Primer. By underlining the benefits and drawbacks of existing robust machine-learning methods, we aim to provide a foundation for modern machine-learning systems that are not just “accurate” in laboratory settings, but are also well protected against the uncertainties that abound in real-world scenarios.
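A classical family of such robustness schemes replaces the server's plain average with a robust aggregation rule. As a minimal sketch (one simple rule among many studied in the literature, not the specific method of the survey), here is coordinate-wise median aggregation in numpy; the toy gradients and worker count are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

# Same hypothetical setting: 4 honest workers reporting noisy estimates of
# the true gradient [1.0, -2.0], plus 1 Byzantine worker.
honest = [np.array([1.0, -2.0]) + 0.01 * rng.standard_normal(2) for _ in range(4)]
gradients = honest + [np.array([1000.0, 1000.0])]

# Coordinate-wise median: for each coordinate, take the median of the
# values reported across workers. As long as corrupted workers are a
# minority, the median falls among the honest reports.
robust_update = np.median(np.stack(gradients), axis=0)
print(robust_update)  # close to the true gradient [1.0, -2.0]
```

Unlike the mean, the median cannot be dragged arbitrarily far by a minority of outliers, which is the basic intuition behind many Byzantine-resilient aggregation rules.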

Besides my co-authors Rachid Guerraoui and Rafael Pinot, I would also like to thank my colleagues Youssef Allouah, Sadegh Farhadkhani, Geovani Rizk, Lê Nguyên Hoang, John Stephan and Sasha Voitovych for participating in many intriguing discussions on robust machine learning.

