Using Enclaves and Model Splitting to Improve AI Application Security in High Performance Computing – Thinking Beyond the Network
Model and function dependencies depicted for a photonics application developed for High Performance Computing

If you write or oversee the development of a data-critical program, you most likely want to track quality in a few key areas, code execution time being one example. A good place to start is the 90/10 rule: "90% of program execution time is spent executing 10% of the code." Viewed through a cybersecurity lens, this axiom explains why secure coding practices focused on that critical 10% also serve as a powerful countermeasure against malicious attacks. When working with AI technology, here's how.

We’ll start with the notion of enclaving, a technique that protects code by isolating it within a secure environment, either a hardware-based enclave or a software-based virtual machine. Running code in this isolated environment makes it more difficult for attackers to access or manipulate it, and provides confidentiality, integrity, and authenticity guarantees for both the code and its data. Enclaving can also mitigate certain classes of attacks that arise from running code in an untrusted environment, although side-channel attacks against enclaves remain an active area of concern. Overall, enclaving is an effective way to protect sensitive code and data from unauthorized access and tampering.

Intel SGX is one example: a set of CPU instructions that allows user-level code to allocate private regions of memory called enclaves. Enclaves are protected even from processes running at higher privilege levels, so sensitive data and computations can be executed in isolation.

While enclaves have several benefits, they also have some drawbacks that should be considered. Here are a few: 

Pros: 

1. Improved Security: Enclaves provide a secure environment for executing sensitive code and storing critical data. They can prevent unauthorized access and tampering by isolating the code and data from the rest of the system. 

2. Confidentiality: Enclaves can ensure confidentiality of the code and data by encrypting them while they are in use. This makes it harder for attackers to steal or modify the code or data. 

3. Flexibility: Enclaves can be customized to meet specific security requirements. They can be designed to support different programming languages and operating systems. 

4. Reduced Attack Surface: Enclaves can reduce the attack surface by providing a small, trusted environment that can be easily monitored and protected. 

Cons: 

1. Performance Overhead: Enclaves can introduce a performance overhead due to the need for encryption and decryption of data. This can result in slower execution times and increased resource usage. 

2. Limited Support: Enclaves are still a relatively new technology and are not supported on all platforms. This can limit their use in certain environments. 

3. Complexity: Enclaves can be complex to develop and maintain. They require specialized knowledge and skills to implement and manage. 

4. Cost: Enclaves can be expensive to implement and maintain. They require additional hardware and software resources, which can increase the overall cost of a system.  

 In summary, enclaves have several benefits that make them an attractive option for securing sensitive data and code. However, they also have some drawbacks that should be considered before implementing them. It is important to carefully evaluate the benefits and drawbacks of enclaves in the context of a specific system and environment to determine if they are the right choice for your organization. 

 How do enclaves further apply to secure in-Cloud AI platforms? 

When you consider the potential of AI in areas such as pattern recognition, emergent pattern recognition, NLP, forecasting, optimization, spatial processing, and generative design, it becomes apparent how important models will become as foundations of every business. An AI model is a large file containing the network parameters learned from the training dataset. Splitting the model reduces its vulnerability: a single point of entry is no longer enough to disrupt the model's proper functioning, and an intrusion at one partition can serve as an early warning that an attack is underway. There are a few ways to split an AI model across multiple GPUs, for instance. Some common approaches include: 

1. Data parallelism - This involves splitting the dataset across multiple GPUs, with each GPU training the model on its portion of the data. The updated model parameters from each GPU are then aggregated (typically just averaged) to update the shared model. This works for any model architecture but requires a large enough dataset to split. 

2. Model parallelism - This involves splitting the model itself across GPUs. For example, you can allocate layers or groups of layers to different GPUs. Each GPU executes its assigned portion of the forward and backward pass, and the outputs and gradients are communicated between GPUs as needed. This requires a model architecture that can be partitioned cleanly, for example into sequential layer groups or into branches with little cross-connectivity. 

3. Hybrid parallelism - This combines data and model parallelism by having each GPU train a portion of the model on a portion of the data. For example, you could split both the dataset and a group of model layers across GPUs. This approach maximizes the use of available GPUs but is the most complex to implement.  
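As a rough, framework-free illustration of the first approach, here is a minimal sketch of data parallelism in plain Python. Each simulated "GPU" computes the gradient of a tiny linear model on its own shard of the data, and the shard gradients are averaged before updating the shared parameter. The function names are illustrative; a real implementation would use a framework facility such as PyTorch's DistributedDataParallel, which performs the gradient all-reduce automatically.

```python
# Minimal sketch of data parallelism: each simulated "GPU" computes the
# gradient of a tiny linear model (y_hat = w * x, squared-error loss) on
# its own shard of the data; the per-shard gradients are then averaged
# to produce one synchronous update of the shared parameter.

def shard_gradient(w, shard):
    """Mean gradient of the squared error on one data shard."""
    return sum(2 * x * (w * x - y) for x, y in shard) / len(shard)

def data_parallel_step(w, shards, lr=0.01):
    """One synchronous step: compute per-shard gradients, average, update."""
    grads = [shard_gradient(w, s) for s in shards]  # one gradient per "GPU"
    avg_grad = sum(grads) / len(grads)              # the all-reduce (average)
    return w - lr * avg_grad

# Toy dataset generated by y = 3x, split across two simulated GPUs.
data = [(x, 3.0 * x) for x in range(1, 9)]
shards = [data[:4], data[4:]]

w = 0.0
for _ in range(200):
    w = data_parallel_step(w, shards)
print(round(w, 2))  # w converges to the true slope, 3.0
```

Note that because every worker applies the same averaged gradient, all replicas stay in sync without any single node ever holding the training data of the others, which is the property that matters from the security standpoint discussed above.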

Some points to note: 

• The level of parallelism depends on your GPU resources, model architecture, and dataset size. More complex models with larger datasets can benefit from higher levels of parallelism.

• GPU-to-GPU communication and synchronization reduce scaling efficiency. The optimal level of parallelism minimizes this communication overhead, which usually requires experimentation. 

• Frameworks like TensorFlow, PyTorch, and MXNet provide built-in capabilities to split models across GPUs. You define the parallelism strategy at a high level, and the framework handles the communication details. 

• For data or hybrid parallelism, the dataset must be split into equally sized chunks for each GPU to maximize performance. How you split the data depends on your task - by categories for classification, time sequence for time series, spatial location for images, etc.

• Checkpointing and periodically averaging model parameters across GPUs are important when training a model on multiple GPUs, so that learned knowledge is shared among all workers.  

• Multi-GPU training can provide close to linear speedup versus single-GPU training, allowing much larger and more complex models to be trained within reasonable time limits. 
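Two of the points above, splitting the dataset into equal-sized chunks and averaging per-GPU parameter checkpoints, can be sketched in plain Python. The helper names split_evenly and average_checkpoints are illustrative, not framework APIs.

```python
# Illustrative helpers for two of the points above: partitioning a
# dataset into (nearly) equal chunks, one per GPU, and averaging the
# parameter checkpoints produced by each GPU to share learned knowledge.

def split_evenly(dataset, n_gpus):
    """Split a dataset into n_gpus chunks of (nearly) equal size."""
    base, extra = divmod(len(dataset), n_gpus)
    chunks, start = [], 0
    for i in range(n_gpus):
        end = start + base + (1 if i < extra else 0)  # spread the remainder
        chunks.append(dataset[start:end])
        start = end
    return chunks

def average_checkpoints(checkpoints):
    """Element-wise average of per-GPU parameter dictionaries."""
    keys = checkpoints[0].keys()
    return {k: sum(c[k] for c in checkpoints) / len(checkpoints) for k in keys}

samples = list(range(10))
print(split_evenly(samples, 3))   # [[0, 1, 2, 3], [4, 5, 6], [7, 8, 9]]

ckpts = [{"w": 1.0, "b": 0.0}, {"w": 3.0, "b": 2.0}]
print(average_checkpoints(ckpts)) # {'w': 2.0, 'b': 1.0}
```

How you order the dataset before calling a splitter like this is where the task-specific choices mentioned above come in: by category for classification, by time sequence for time series, by spatial location for images.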



I'm part of a group within the Canadian Research ecosystem providing infrastructure and delivering workshops related to AI, security and optimization to researchers and universities across Canada. Look us up: Digital Research Alliance of Canada (alliancecan.ca)


More articles by Peter Darveau P. Eng., CED, 6sigma BB, MBA
