Researchers at Cisco and Robust Intelligence, the AI security firm acquired by the tech giant last year, have tested DeepSeek and other popular AI models to compare how susceptible they are to jailbreaking.
The analysis, conducted in collaboration with the University of Pennsylvania, targeted DeepSeek R1, Meta’s Llama 3.1 405B, OpenAI’s GPT-4o and o1 (ChatGPT), Google’s Gemini 1.5 Pro, and Anthropic’s Claude 3.5 Sonnet.
The models were tested using the HarmBench benchmark, which covers hundreds of behaviors across seven categories: cybercrime, misinformation, chemical weapons, copyright violations, harassment, illegal activities, and general harm. Cisco ran an automatic jailbreaking algorithm on 50 prompts from HarmBench.
The tests showed that DeepSeek was the only model with a 100% attack success rate: every jailbreak attempt succeeded against the Chinese company’s model. In contrast, OpenAI’s o1 model saw an attack success rate of only 26%.
The attack success rate for the remaining AI models ranged between 36% and 96%.
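For readers unfamiliar with the metric, attack success rate is simply the share of jailbreak attempts that got the model to comply. The sketch below is illustrative only: the function name and the outcome lists are hypothetical, and the per-model counts are inferred from the reported percentages and the 50-prompt sample, not taken from Cisco's data.

```python
def attack_success_rate(outcomes):
    """Return the percentage of jailbreak attempts that succeeded.

    `outcomes` holds one boolean per prompt: True if the model complied
    with the harmful request, False if it refused.
    """
    if not outcomes:
        raise ValueError("need at least one attempt")
    return 100.0 * sum(outcomes) / len(outcomes)


# Hypothetical outcome lists for a 50-prompt run. The counts are assumed
# from the reported percentages (100% and 26%), not published figures.
every_attempt_succeeded = [True] * 50
thirteen_succeeded = [True] * 13 + [False] * 37

print(attack_success_rate(every_attempt_succeeded))  # 100.0
print(attack_success_rate(thirteen_succeeded))       # 26.0
```

With a sample of 50 prompts, each successful jailbreak moves the rate by two percentage points, so a 26% rate would correspond to 13 successful attempts.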

“Our findings suggest that DeepSeek’s claimed cost-efficient training methods, including reinforcement learning, chain-of-thought self-evaluation, and distillation may have compromised its safety mechanisms. Compared to other frontier models, DeepSeek R1 lacks robust guardrails, making it highly susceptible to algorithmic jailbreaking and potential misuse,” Cisco said.
DeepSeek’s AI model has been found to outperform its competitors in some areas. On the security front, however, several cybersecurity firms have reported in recent days that the model is susceptible to known jailbreak methods, including long-known techniques that have already been addressed in other models.
Researchers also demonstrated a few days ago that they were able to obtain DeepSeek’s full system prompt, which defines a model’s behavior, limitations, and responses, and which chatbots typically do not disclose through regular prompts. DeepSeek patched this exposure after being notified.