CUDA utilisation: Offload function execution on GPU accelerators

Specialised hardware utilisation for function acceleration

The EDGELESS project, a significant endeavor in advancing serverless architectures, aims to provide a serverless platform capable of executing lambda functions at the edge. This innovative approach allows the serverless platform to fully exploit the resources available at the nodes where application data is generated and consumed. However, this objective presents new challenges, such as the need, in some cases, for edge devices to handle computationally intensive tasks independently. These tasks could involve locally preprocessing video frames to remove sensitive data, classifying objects from camera inputs, or executing other functions related to Machine Learning (ML), Artificial Intelligence (AI), or Computer Vision (CV) [1].

To efficiently perform these calculations, specialised hardware shipped in edge devices could be leveraged. More precisely, the EDGELESS platform will explore the potential of offloading computations to the edge nodes’ Graphical Processing Units (GPUs).

To this end, the Technical University of Crete (TUC), an academic partner of the project, is actively studying techniques to virtualise lambda executors running on GPUs through lightweight abstractions.

Experimentation devices


Figure 1: Jetson Orin

NVIDIA is one of the technology industry's leading GPU manufacturers. Its GPUs are at the forefront of performance and efficiency, making them essential components frequently deployed in high-performance computing applications. For this reason, TUC's starting point for its research activities in GPU accelerators is the NVIDIA platform.

One of NVIDIA's standout product lines is the Jetson family. These all-in-one machines bring GPU capabilities to the edge, delivering enhanced computational power in a small form factor. In the EDGELESS platform, devices like the Jetson Orin or Jetson Xavier will play a key role in accelerating functions on GPUs, and they also serve as experimentation devices in TUC's activities.

To unlock the full power of its GPUs, NVIDIA has developed CUDA, a versatile parallel computing platform and programming model. CUDA is widely adopted by popular AI/ML frameworks and libraries such as PyTorch, which use it as their backend for GPU acceleration.
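As a minimal illustration of how such a framework exposes CUDA, the following PyTorch sketch (tensor sizes are arbitrary, chosen only for illustration) selects the GPU when CUDA is available, for example on a Jetson device, and transparently falls back to the CPU otherwise:

```python
import torch  # PyTorch uses CUDA internally as its GPU backend

# Select the GPU when CUDA is available (e.g. on a Jetson Orin),
# otherwise fall back to the CPU.
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

# A toy workload: a matrix multiplication executed on the selected device.
a = torch.randn(256, 256, device=device)
b = torch.randn(256, 256, device=device)
c = a @ b

print(f"Computed a {tuple(c.shape)} product on {c.device}")
```

The same code runs unchanged on a GPU-equipped edge node and on a plain CPU host, which is exactly what makes this style of device selection attractive for portable function code.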

GPU offloading in the EDGELESS system

Utilising CUDA from EDGELESS node devices is one of the lines of study that TUC pursues to make GPU offloading of functions available in the EDGELESS system in a simple manner.

Experiments have been conducted in virtualised environments, in which AI/ML frameworks and libraries are installed and functions are implemented. The objective is to leverage these frameworks and libraries, which internally use CUDA as a backend, to activate GPU computations. The first results of our study are shown in Figure 2.


Figure 2: First results of our experiments
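Concretely, a function implemented inside such a virtualised environment might look like the sketch below. The handler name and payload format are purely illustrative, not the EDGELESS function API: it normalises a list of numbers, offloading the arithmetic to the GPU whenever CUDA is available.

```python
import torch

def handler(payload):
    """Illustrative lambda-style function: normalise a list of numbers,
    running the arithmetic on the GPU when CUDA is available."""
    device = "cuda" if torch.cuda.is_available() else "cpu"
    x = torch.tensor(payload, dtype=torch.float32, device=device)
    y = (x - x.mean()) / (x.std() + 1e-8)
    # Move the result back to the host and return plain Python data.
    return y.cpu().tolist()

print(handler([1.0, 2.0, 3.0, 4.0]))
```

Because the device choice is made inside the function body, the same handler can be deployed to both GPU-equipped Jetson nodes and CPU-only nodes without modification.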

The goal of our research activities is to empower developers on the EDGELESS platform to implement and execute their own functions on GPUs. This opens up new possibilities for offloading functions that require GPU processing and makes it easier to run AI/ML models on edge devices in an isolated manner.


For more direct updates, follow us on LinkedIn, Mastodon, and X.

For regular updates, including a collection of news and relevant information, sign up for our newsletter here.


References:

[1] G. Vasiliadis, L. Koromilas, M. Polychronakis, and S. Ioannidis, ‘GASPP: A GPU-Accelerated Stateful Packet Processing Framework’, in 2014 USENIX Annual Technical Conference (USENIX ATC 14), 2014, pp. 321–332.


Blog signed by: TUC Team
