AI adoption rates are impressive. Statista estimates that 37% of marketing and advertising teams are using AI, while a more conservative 15% of healthcare workers use AI at work. A recent MITRE-Harris Poll showed that most Americans remain cautious, with 78% concerned about malicious use of AI. To bridge the AI trust gap, clear standards and effective best practices must emerge to ensure AI technology can be used safely.
We have addressed the security risks of AI on the Packetlabs blog before. The rush to adopt AI has allowed attackers to weaponize LLMs in several ways. Like all software, AI models have vulnerabilities that provide attackers with a doorway into the user's network. Hackers have also begun using AI to generate exploit code: the concept was proven effective by researchers, and LLM-generated exploit code in the wild has since been confirmed by the Hewlett Packard security team. Canada's top cybersecurity official has also publicly reported evidence of AI being used to enable cyberattacks and spread misinformation.
But defenders have been using AI to level the playing field as well. Machine learning models have improved fraud detection and intrusion detection capabilities, helped predict which vulnerabilities attackers will exploit, and helped software developers automate complex security assessments. Organizations that implement risk-based policies for AI adoption can also better control the associated risks.
A new tool for mitigating the risks of AI adoption has emerged from cybersecurity leader MITRE. In this article, we introduce the MITRE ATLAS (Adversarial Threat Landscape for Artificial-Intelligence Systems) AI security framework and explore some of the unique tactics, techniques, and procedures (TTPs) that apply to AI systems. These TTPs can serve as a knowledge base for both AI developers and penetration testers conducting security assessments.
MITRE ATLAS (Adversarial Threat Landscape for Artificial-Intelligence Systems) is a central repository and knowledge base for understanding cyber attacks that specifically impact artificial intelligence (AI) and machine learning (ML) systems. The MITRE ATLAS matrix is similar to the MITRE ATT&CK (Adversarial Tactics, Techniques, and Common Knowledge) framework, a more general-purpose repository of adversarial behavior, and to MITRE D3FEND, a matching framework designed for defenders.
MITRE has also published a number of resources for MITRE ATLAS, including an open-source repository containing the tactics, techniques, mitigations, real-world case studies, tools, and other data used by the ATLAS website. The cybersecurity community and IT security professionals are encouraged to contribute to ATLAS by submitting new intelligence and creating new case studies.
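For readers who want to explore that data programmatically, here is a minimal sketch that loads the published ATLAS data and lists its tactics and techniques. It assumes the data is distributed as a YAML file (for example, ATLAS.yaml from the mitre-atlas GitHub organization) and that the file contains matrices with tactic and technique entries; verify the current repository layout and schema before relying on the exact keys.

```python
# Minimal sketch: browse the open-source ATLAS data locally.
# Assumes you have downloaded a YAML distribution of the ATLAS data
# (e.g., ATLAS.yaml from the mitre-atlas GitHub organization) -- the exact
# path and schema may differ, so adjust the keys to match the file you have.
import yaml  # pip install pyyaml

with open("ATLAS.yaml") as f:
    atlas = yaml.safe_load(f)

for matrix in atlas.get("matrices", []):
    print(f"Matrix: {matrix.get('name')}")
    for tactic in matrix.get("tactics", []):
        # Entries may be full objects or bare identifiers depending on the schema.
        print("  Tactic:", tactic.get("name") if isinstance(tactic, dict) else tactic)
    for technique in matrix.get("techniques", []):
        print(f"  Technique: {technique.get('id')} - {technique.get('name')}")
```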
Threat Taxonomy: ATLAS categorizes different types of threats that could affect AI/ML systems, including data poisoning, model evasion, and adversarial attacks designed to manipulate AI outputs.
Adversarial Tactics & Techniques: ATLAS outlines tactics and techniques adversaries might use to exploit AI models. This includes attacks during different stages of the AI pipeline, such as data manipulation, model corruption, and inference attacks.
Guidance for Defenders: The framework provides practical guidance on how to defend against these attacks by suggesting mitigations and best practices for AI/ML model development, deployment, and monitoring.
Focus on Real-World Scenarios: ATLAS includes real-world case studies and examples of attacks on AI systems, helping organizations understand the practical impact of these threats.
MITRE ATLAS and MITRE ATT&CK are both frameworks developed by MITRE to document and analyze adversarial behavior, but they focus on different areas of cybersecurity. Both frameworks use a matrix structure that organizes adversary behaviors into high-level tactics (goals) and more granular techniques (methods for achieving those goals).
ATT&CK focuses on cyber attacks against traditional IT systems, while ATLAS is centered around threats targeting AI/ML systems.
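To make that shared structure concrete, the short sketch below models tactics as adversary goals that contain techniques. The technique IDs and names come from ATLAS entries discussed later in this article, but the grouping shown is purely illustrative and is not the official ATLAS matrix assignment.

```python
from dataclasses import dataclass, field

@dataclass
class Technique:
    technique_id: str  # e.g., "AML.T0051"
    name: str

@dataclass
class Tactic:
    name: str  # the adversary's high-level goal
    techniques: list[Technique] = field(default_factory=list)  # methods to achieve it

# Illustrative only: this grouping is a hypothetical example of the
# tactic -> technique layout, not the official ATLAS matrix assignment.
example_tactic = Tactic(
    name="Example goal: compromise an ML-enabled system",
    techniques=[
        Technique("AML.T0020", "Poison Training Data"),
        Technique("AML.T0051", "LLM Prompt Injection"),
        Technique("AML.T0054", "LLM Jailbreak"),
    ],
)

for t in example_tactic.techniques:
    print(f"{example_tactic.name} -> {t.technique_id}: {t.name}")
```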
To counter attacks on AI-enabled systems, it's essential to establish effective procedures for managing an AI model throughout its lifecycle. MITRE ATLAS uses CRISP-ML(Q)'s lifecycle phases to tag mitigations for vulnerabilities in AI systems, helping teams involved in each phase identify and address potential security risks.
Similar to DevSecOps, a security-focused software development process, Machine Learning Operations (MLOps) defines best practices and tools to ensure the deployment of reliable, reproducible, and adaptable AI models. A key example of a model development pipeline with a focus on MLOps is CRISP-ML(Q), which emphasizes quality assurance in the development of machine learning applications. This connection between CRISP-ML(Q) and ATLAS helps organizations ensure that AI models are robust and secure throughout their development, deployment, and operation.
By now, most people exposed to AI-specific security concerns are aware of adversarial techniques such as poisoning training data [AML.T0020] and malicious prompt injection [AML.T0051] to leak sensitive data [AML.T0057], including valuable intellectual property. These seem to be the most obvious vectors of attack against AI systems.
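As a deliberately simplified illustration of the training data poisoning idea behind AML.T0020, the sketch below flips a fraction of labels in a toy scikit-learn dataset and compares the resulting model's accuracy against a cleanly trained one. The dataset and model are placeholders; real poisoning attacks are far stealthier.

```python
# Minimal sketch of label-flipping data poisoning (the idea behind AML.T0020).
# Toy dataset and model; real poisoning attacks are far more subtle than this.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=2000, n_features=20, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

def train_and_score(train_labels):
    """Train on the given labels and report accuracy on the untouched test set."""
    model = LogisticRegression(max_iter=1000).fit(X_train, train_labels)
    return accuracy_score(y_test, model.predict(X_test))

# The attacker silently flips 30% of the training labels before training runs.
rng = np.random.default_rng(0)
poisoned = y_train.copy()
flip = rng.choice(len(poisoned), size=int(0.3 * len(poisoned)), replace=False)
poisoned[flip] = 1 - poisoned[flip]

print(f"Accuracy with clean labels:    {train_and_score(y_train):.2f}")
print(f"Accuracy with poisoned labels: {train_and_score(poisoned):.2f}")
```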
Here are some other unique techniques in MITRE ATLAS:
Create Proxy ML Model [AML.T0005]: Similar to traditional Adversary In The Middle (AiTM) attacks, attackers may set up a proxy to a legitimate ML model and trick victims into using the proxy believing it is the real model. The attacker can then modify output to spread misinformation or trick the victim into providing sensitive information.
LLM Jailbreak [AML.T0054]: An adversary may exploit a specially crafted LLM prompt injection to put the language model into a state where it bypasses all controls, restrictions, or guardrails. Once this jailbreak is achieved, the adversary can manipulate the LLM to perform unintended actions or respond to user inputs without the usual limitations.
LLM Prompt Self-Replication [AML.T0061]: An adversary may craft a self-replicating LLM prompt injection that causes the prompt to propagate across systems and LLMs, often paired with additional malicious actions like jailbreaking or data leakage. For example, researchers created Morris II, a zero-click worm that can propagate through systems like email assistants, where it replicates itself and leaks sensitive data by embedding malicious instructions in responses generated by the AI.
Acquire Public ML Artifacts: Models [AML.T0002.001]: If adversaries gain insight into which ML models a target is using, whether through publicly disclosed information, social engineering, or espionage, they can acquire public or representative models to target victim organizations. These models, often hosted on ML model-sharing platforms like HuggingFace and distributed in common formats like ONNX or PyTorch, help adversaries tailor attacks more effectively.
Obtain ML Attack Capabilities [AML.T0016.000]: Adversaries can find open-source implementations of machine learning attacks, such as CleverHans, the Adversarial Robustness Toolbox, and Foolbox, which were originally intended for ML security research, and weaponize them. Adversaries may also repurpose tools not specifically designed for adversarial ML attacks to support their operations. A minimal sketch of the kind of evasion attack these toolkits automate follows below.
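This sketch implements the classic Fast Gradient Sign Method (FGSM) in PyTorch against a placeholder classifier. The model and input are hypothetical stand-ins, and the code illustrates the general evasion technique rather than reproducing anything from CleverHans, the Adversarial Robustness Toolbox, or Foolbox.

```python
# Minimal FGSM evasion sketch in PyTorch: perturb an input so that a classifier
# misclassifies it. The model and input below are hypothetical placeholders;
# against a trained model, the perturbation would reliably flip predictions.
import torch
import torch.nn as nn

def fgsm_attack(model, x, label, epsilon=0.05):
    """Craft an adversarial example from x with an L-infinity budget of epsilon."""
    x_adv = x.clone().detach().requires_grad_(True)
    loss = nn.functional.cross_entropy(model(x_adv), label)
    loss.backward()
    # Step in the direction that increases the loss, then clamp to a valid pixel range.
    return (x_adv + epsilon * x_adv.grad.sign()).clamp(0, 1).detach()

# Hypothetical stand-ins for a victim model and a single 28x28 grayscale input.
model = nn.Sequential(nn.Flatten(), nn.Linear(28 * 28, 10))
model.eval()
x = torch.rand(1, 1, 28, 28)
label = torch.tensor([3])

x_adv = fgsm_attack(model, x, label)
print("Original prediction:   ", model(x).argmax(dim=1).item())
print("Adversarial prediction:", model(x_adv).argmax(dim=1).item())
```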
Check out the MITRE ATLAS matrix to review all the available TTPs.
MITRE ATLAS provides a comprehensive framework to address adversarial threats specific to AI and machine learning systems. It categorizes various AI vulnerabilities and offers strategies to defend against attacks such as data poisoning, model evasion, and adversarial manipulation.
By leveraging insights from real-world scenarios and contributions from the cybersecurity community, MITRE ATLAS helps organizations secure AI models across their entire lifecycle, aligning with best practices such as CRISP-ML(Q) and MLOps to ensure robust, secure AI deployments.
Download our Guide to Penetration Testing to learn everything you need to know to successfully plan, scope, and execute your penetration testing projects.