AI agents fail to resist prompt injection attacks 79% of the time
Researchers found that AI agents powered by GPT-5 and Gemini could not resist prompt injection attacks.
Direct attacks succeeded more than 79% of the time, while hidden attacks embedded in web content frequently manipulated agent behavior.
The findings suggest prompt injection remains a broader security problem as AI agents become more mainstream, with developers racing to deploy AI agents capable of browsing the internet and trading cryptocurrency autonomously.
Prompt injection occurs when attackers embed hidden instructions in content that an AI agent encounters, causing it to follow the attacker’s directions instead of the user’s.