OpenAI and Paradigm have released EVMbench, a framework for evaluating AI agents' ability to find vulnerabilities in Ethereum smart contracts.
EVMbench tests AI agents on smart contract security. Claude Opus 4.6 ranked first, ahead of GPT-5 and Gemini 3 Pro, across 120 real smart contract vulnerabilities.