Machine Learning Software Supply Chain & Adversarial Machine Learning Attacks

Adversarial machine learning attacks are a serious and growing threat to the security and reliability of software systems. BlackMamba, a recent proof-of-concept, is a simple LLM-aided polymorphic keylogger that demonstrates how LLMs can be leveraged to compromise systems despite the latest endpoint protections.

Similarly, deep learning models are fast becoming critical components of mobile applications. Researchers from Microsoft Research recently demonstrated that many deep learning models deployed in mobile apps are vulnerable to backdoor attacks via "neural payload injection," including popular security- and safety-critical apps used for cash recognition, parental control, face authentication, and financial services. The number of such apps will only grow as NPU (Neural Processing Unit) powered devices become more common.

The focus of machine learning (ML) security has primarily been on endpoint protection, preventing models from taking malicious actions after they are deployed or just before installation. By that point it is often too late. We also need to protect the ML supply chain and ensure that the models inside apps are not vulnerable before those apps are published, whether to third-party stores or internally within enterprises. In other words, security needs to 'shift left' so that the attack surface can be minimized.

Because of the nature of ML models, existing software supply-chain protections are not enough and need to be strengthened. Here are some key aspects to consider:

· Code scanning and rescanning: Most enterprises and app stores currently scan and rescan apps for malware before making them available. However, this existing anti-malware (AM) scanning will not be sufficient on its own; it will need to be augmented with model validation via dynamic analysis methods (see the scanning sketch just after this list). This is because machine learning models introduce unique security risks, such as:

  • Framework compromise: Most machine learning systems rely on a limited set of machine learning frameworks. An adversary could gain access to many systems by compromising one of these frameworks.
  • Data compromise: Every machine learning project requires some form of data for training. Many projects rely on large open-source datasets that are publicly available. An adversary could compromise these sources of data.
  • Model compromise: Machine learning systems often rely on open-sourced models. These models are typically downloaded from an external source and then used as the basis for a model that is fine-tuned on a smaller, private dataset. These downloads are susceptible to "man-in-the-middle" tampering (see the hash-pinning sketch below).
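
To make the scanning point concrete, here is a minimal sketch (in Python) of one possible validation step: statically inspecting a pickle-serialized model artifact for opcodes that import commonly abused modules before it is accepted. The module list is an assumption for illustration; a real gate would pair a check like this with sandboxed, dynamic loading of the model.

```python
"""Illustrative pre-ingestion check for a pickle-serialized model artifact.

Assumption: the model is stored as a pickle (e.g., a .pkl file). The
DANGEROUS_MODULES set is a hypothetical denylist, not an exhaustive one.
"""
import pickletools
import sys

# Modules commonly abused in malicious pickles to gain code execution.
DANGEROUS_MODULES = {"os", "posix", "nt", "subprocess", "builtins", "socket", "shutil"}

def scan_pickle_model(path: str) -> list[str]:
    """Flag pickle opcodes that import callables from suspicious modules."""
    findings = []
    with open(path, "rb") as f:
        data = f.read()
    for opcode, arg, pos in pickletools.genops(data):
        if opcode.name in ("GLOBAL", "INST") and isinstance(arg, str):
            module = arg.split(" ")[0].split(".")[0]
            if module in DANGEROUS_MODULES:
                findings.append(f"offset {pos}: {opcode.name} imports {arg!r}")
        elif opcode.name == "STACK_GLOBAL":
            # STACK_GLOBAL resolves its module/name from the stack; a fuller
            # scanner would track the preceding string opcodes to resolve them.
            findings.append(f"offset {pos}: STACK_GLOBAL (review manually)")
    return findings

if __name__ == "__main__":
    issues = scan_pickle_model(sys.argv[1])
    for issue in issues:
        print(issue)
    sys.exit(1 if issues else 0)
```

A clean result here does not prove the model is safe; it only filters out the most obvious pickle-based payloads before deeper analysis.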

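On the model-compromise risk above, a simple mitigation for tampering in transit is to pin downloads of pre-trained models to a digest published out of band by the model provider. The URL and digest below are placeholders, not real artifacts; this is a sketch assuming a SHA-256 digest is available over a trusted channel.

```python
"""Sketch: reject a downloaded pre-trained model whose digest does not match."""
import hashlib
import urllib.request

MODEL_URL = "https://example.com/models/base-model.pt"  # placeholder URL
EXPECTED_SHA256 = "0123abcd..."  # placeholder; published out of band by the provider

def fetch_pinned_model(url: str, expected_sha256: str, dest: str) -> None:
    """Download a model and verify its SHA-256 digest before saving it."""
    with urllib.request.urlopen(url) as resp:
        data = resp.read()
    digest = hashlib.sha256(data).hexdigest()
    if digest != expected_sha256:
        raise RuntimeError(f"Model digest mismatch: got {digest}, expected {expected_sha256}")
    with open(dest, "wb") as f:
        f.write(data)

# Example (would fail with the placeholder values above):
# fetch_pinned_model(MODEL_URL, EXPECTED_SHA256, "base-model.pt")
```
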
· Code signing is a best practice for enterprises to ensure the integrity and provenance of any code, and it is essential that this practice is extended to ML models. By signing a model, a developer can verify that it has not been modified since it was signed, which protects against malicious actors who may attempt to tamper with a model to introduce errors or bias. Organizations should ensure that the signing requirement for models is enforced at ingestion time, i.e., the model's signature is verified before it is ingested into the organization's ML pipeline. Similarly, if a model is shipped separately from the app, it should also be code signed so that it cannot be tampered with in transit (a minimal signing and verification sketch follows).
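
As a rough illustration of enforcing signing at ingestion time, the sketch below signs a model artifact with an Ed25519 key and verifies the detached signature before the model is accepted. It uses the Python cryptography package; key distribution, certificate chains, and organizational policy are deliberately out of scope.

```python
"""Sketch: detached signing and verification of a model artifact (Ed25519)."""
from cryptography.exceptions import InvalidSignature
from cryptography.hazmat.primitives.asymmetric.ed25519 import (
    Ed25519PrivateKey,
    Ed25519PublicKey,
)

def sign_model(model_bytes: bytes, private_key: Ed25519PrivateKey) -> bytes:
    """Producer side: create a detached signature over the serialized model."""
    return private_key.sign(model_bytes)

def verify_model(model_bytes: bytes, signature: bytes,
                 public_key: Ed25519PublicKey) -> bool:
    """Ingestion side: accept the model only if the signature verifies."""
    try:
        public_key.verify(signature, model_bytes)
        return True
    except InvalidSignature:
        return False

if __name__ == "__main__":
    key = Ed25519PrivateKey.generate()
    model = b"serialized model weights"  # stand-in for the real artifact bytes
    sig = sign_model(model, key)
    assert verify_model(model, sig, key.public_key())
    assert not verify_model(model + b"tampered", sig, key.public_key())
```

In practice the public key would be distributed separately from the model, and ingestion would be blocked, not just flagged, when verification fails.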

For reference: MITRE ATLAS (Adversarial Threat Landscape for Artificial-Intelligence Systems) and its ATLAS Matrix are valuable resources for staying up to date on the adversarial threat landscape for AI systems.


#generativeai
