Decentralized Machine Learning at the Edge (DMLE)
Data is increasingly produced by decentralized sources such as mobile phones, vehicles, IoT devices, and smart sensors. These sources generate large, high-frequency data streams that render centralized processing in a cloud or computing cluster infeasible. Consider autonomous driving: an autonomous vehicle generates around 1GB of data per second [1]. If only 10 million cars (roughly the number of cars Volkswagen sold in 2018) produced data at this rate, processing it centrally would require handling 10 petabytes per second; in comparison, all LHC experiments at CERN combined process around 25GB per second [2]. At even larger scales, consider industry: Siemens collects around 2 exabytes of data per day [3] from sensors in machines, wind parks, and gas turbines. Moreover, much of this data is privacy-sensitive: sensor data from machines can reveal corporate secrets, and data from mobile phones or autonomous vehicles can infringe on users' privacy.
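The data-rate comparison above can be checked with a quick back-of-the-envelope calculation (using the approximate figures from the text):

```python
# Back-of-the-envelope check of the data-rate claims (approximate figures from [1,2]).
gb_per_car_per_second = 1                # ~1 GB/s per autonomous vehicle [1]
cars = 10_000_000                        # roughly the number of cars VW sold in 2018

total_gb_per_second = gb_per_car_per_second * cars
print(total_gb_per_second / 1e6, "PB/s")          # 10.0 PB/s (1 PB = 10^6 GB)

lhc_gb_per_second = 25                   # all LHC experiments combined [2]
print(f"{total_gb_per_second / lhc_gb_per_second:,.0f}x the LHC data rate")
```

That is, the fleet would produce roughly 400,000 times the data rate of all LHC experiments combined.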
Thus, centralizing data in a cluster or cloud has three major disadvantages: (i) it does not scale well with the number of data-generating devices and neglects their computing power, (ii) it requires a prohibitive amount of communication, and (iii) it often requires sharing privacy-sensitive data.
To overcome these disadvantages, data can be processed at - or close to - the data-generating devices, an approach often called edge computing or in-situ processing. Such decentralized approaches reduce communication overhead and utilize the processing power of the data-generating devices. This not only yields communication-efficient methods but also avoids centralizing privacy-sensitive data, enabling novel, large-scale applications.
Recent advances in machine learning at the edge [4,5,6], in particular for deep learning [7] - where it was termed federated learning [8] - shift the focus from high-performance computation in clusters to decentralized learning. Federated learning is now gaining considerable interest as an approach to bring machine learning to edge devices [9,10,11,12], in particular mobile phones [13,14,15].
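To illustrate the core idea behind federated learning, here is a minimal sketch of federated averaging in the spirit of [8]: each client trains locally on its own data, and a server averages the resulting models, weighted by local dataset size. This is a toy illustration (linear least squares with plain gradient descent), not the full algorithm from the paper:

```python
import numpy as np

def local_train(w, X, y, lr=0.1, epochs=5):
    # Local update: a few epochs of gradient descent on least squares.
    for _ in range(epochs):
        grad = 2 * X.T @ (X @ w - y) / len(y)
        w = w - lr * grad
    return w

def federated_averaging(client_data, rounds=20, dim=2):
    # Each round: clients train locally on their private data; the server
    # averages the returned weights, weighted by local dataset size.
    w_global = np.zeros(dim)
    for _ in range(rounds):
        updates = [local_train(w_global.copy(), X, y) for X, y in client_data]
        sizes = np.array([len(y) for _, y in client_data], dtype=float)
        w_global = np.average(updates, axis=0, weights=sizes)
    return w_global

# Toy example: three clients hold disjoint samples from the same linear model.
rng = np.random.default_rng(0)
w_true = np.array([2.0, -1.0])
clients = []
for _ in range(3):
    X = rng.normal(size=(50, 2))
    y = X @ w_true + 0.01 * rng.normal(size=50)
    clients.append((X, y))

w = federated_averaging(clients)
print(np.round(w, 2))  # converges close to [2., -1.] without sharing raw data
```

The key property is that only model parameters cross the network; the raw, possibly privacy-sensitive data never leaves the clients.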
To bring together researchers and practitioners, we organize the second edition of the workshop on Decentralized Machine Learning at the Edge (DMLE) in conjunction with ECMLPKDD 2019. The workshop aims to provide a platform for the exchange of novel concepts and ideas, and to disseminate decentralized learning, parallelization, and federated learning approaches within the machine learning community. It addresses theoretical and empirical aspects of machine learning at the edge, including large-scale machine learning, federated learning, communication efficiency, theoretical guarantees for distributed learning, in-situ processing, data mining from distributed sources, privacy aspects, resource-constrained machine learning for edge devices, and hardware aspects of edge devices.
We happily welcome submissions to the workshop (see the call for papers), and hope to have lively discussions with both researchers and practitioners at the workshop on the 16th of September in Würzburg, Germany.
[1] Shi, Weisong et al. Edge computing: Vision and challenges. Internet of Things Journal, pages 637–646. IEEE, 2016.
[2] CERN. Processing: What to record? Retrieved 27.3.2019.
[3] Rüdiger Köhn. Ringen um die Vorherrschaft über Industrie 4.0. Blick in die Zukunft: Trends und Szenarien für die Welt von morgen. FAZ, 2014.
[4] Kamp, Michael, et al. Communication-efficient distributed online prediction by dynamic model synchronization. In ECMLPKDD. Springer, 2014.
[5] Kamp, Michael, et al. Communication-efficient distributed online learning with kernels. In ECMLPKDD. Springer, 2016.
[6] Kamp, Michael, et al. Effective parallelisation for machine learning. Advances in Neural Information Processing Systems, 2017.
[7] Kamp, Michael, et al. Efficient decentralized deep learning by dynamic model averaging. In ECMLPKDD. Springer, 2018.
[8] McMahan, Brendan, et al. Communication-efficient learning of deep networks from decentralized data. In Artificial Intelligence and Statistics, pages 1273–1282, 2017.
[9] Yang, Q., et al. Federated machine learning: Concept and applications. ACM Transactions on Intelligent Systems and Technology (TIST) 10(2), 2019.
[10] Abadi, M., et al. Deep learning with differential privacy. In Proceedings of the ACM SIGSAC Conference on Computer and Communications Security, 2016.
[11] Mohri, M., Sivek, G., Suresh, A.T. Agnostic federated learning. CoRR, 2019.
[12] Zhao, Y., et al. Federated learning with non-iid data. CoRR, 2018.
[13] Hard, A., et al. Federated learning for mobile keyboard prediction. CoRR, 2018.
[14] Yang, T., et al. Applied federated learning: Improving Google keyboard query suggestions. CoRR, 2018.
[15] Bonawitz, K., et al. Towards federated learning at scale: System design. CoRR, 2019.