Multimodal AI poses new safety risks, can generate CSEM and weapons info
- In May 2025, Enkrypt AI published a detailed Multimodal Red Teaming analysis that highlighted significant safety vulnerabilities in Mistral’s Pixtral-Large and Pixtral-12b vision-language models.
- The report attributes these risks not to malicious text inputs but to prompt injections hidden inside image files, which can bypass traditional content filters.
- Tests indicated that the Pixtral models generated harmful content related to child exploitation, chemical weapons, and other CBRN information at rates far exceeding those of GPT-4o and Claude 3.7 Sonnet; in particular, they produced child sexual exploitation material (CSEM) responses roughly 60 times more frequently.
- Sahil Agarwal, CEO of Enkrypt AI, cautioned that multimodal AI introduces new and unforeseen vulnerabilities, emphasizing the need for ongoing safety measures such as red teaming and continuous real-time monitoring.
- The report calls for urgent industry action to mitigate these vulnerabilities, protect users (especially vulnerable groups), and adapt safety practices to complex multimodal AI systems.
13 Articles
Multimodal AI Faces New Threats | Report Reveals Safety Risks, CSEM Exposure - Tech
As generative AI systems increasingly combine text and images, a new Multimodal Safety Report from Enkrypt AI exposes critical vulnerabilities that could compromise the safety, integrity, and responsible use of multimodal models. Enkrypt AI’s red teaming exercise tested multiple multimodal models against a range of safety and harm categories outlined in the NIST AI Risk Management Framework. The results show that new jailbreak techniques can ex…
Multimodal AI at a crossroads
As generative AI rapidly evolves to process both text and images, a new multimodal safety report from Enkrypt AI – a leading provider of AI safety and compliance solutions for agent and multimodal AI – reveals critical risks that threaten the integrity and safety of multimodal systems. The red teaming exercise was conducted on several multimodal models, and tests across several safety and harm categories as described in the NIST AI RMF. Newer ja…
When AI Backfires: Enkrypt AI Report Exposes Dangerous Vulnerabilities in Multimodal Models
In May 2025, Enkrypt AI released its Multimodal Red Teaming Report, a chilling analysis that revealed just how easily advanced AI systems can be manipulated into generating dangerous and unethical content. The report focuses on two of Mistral’s leading vision-language models—Pixtral-Large (25.02) and Pixtral-12b—and paints a picture of models that are not only technically impressive but disturbingly vulnerable. Vision-language models (VLMs) like…
Coverage Details
Bias Distribution
- 100% of the sources are Center