{"id":245981,"date":"2026-05-13T18:33:00","date_gmt":"2026-05-13T22:33:00","guid":{"rendered":"https:\/\/news-you-need.com\/index.php\/2026\/05\/13\/researchers-say-ai-just-broke-every-benchmark-for-autonomous-cyber-capability\/"},"modified":"2026-05-14T13:35:11","modified_gmt":"2026-05-14T17:35:11","slug":"researchers-say-ai-just-broke-every-benchmark-for-autonomous-cyber-capability","status":"publish","type":"post","link":"https:\/\/news-you-need.com\/index.php\/2026\/05\/13\/researchers-say-ai-just-broke-every-benchmark-for-autonomous-cyber-capability\/","title":{"rendered":"Researchers say AI just broke every benchmark for autonomous cyber capability"},"content":{"rendered":"<p><a href=\"https:\/\/cyberscoop.com\/ai-autonomous-cyber-capability-benchmarks-broken-gpt5-claude-mythos\/\">Researchers say AI just broke every benchmark for autonomous cyber capability<\/a><\/p>\n<p><a href=\"https:\/\/cyberscoop.com\/ai-autonomous-cyber-capability-benchmarks-broken-gpt5-claude-mythos\/\">https:\/\/cyberscoop.com\/ai-autonomous-cyber-capability-benchmarks-broken-gpt5-claude-mythos\/<\/a><\/p>\n<p>Publish Date: <a href=\"publish_date]\">2026-05-13 18:33:00<\/a><\/p>\n<p>Source Domain: <a href=\"cyberscoop.com\">cyberscoop.com<\/a><\/p>\n<p>Two of the most advanced artificial intelligence models \u2014 Anthropic\u2019s Claude Mythos Preview and OpenAI\u2019s GPT-5.5 \u2014 have significantly surpassed the already-accelerating pace at which AI systems are completing autonomous cybersecurity tasks, according to separate findings published Wednesday by the United Kingdom\u2019s AI Security Institute (AISI) and Palo Alto Networks.<\/p>\n<p>The AISI, which conducts pre-deployment evaluations of frontier AI models on behalf of the British government, said both Claude Mythos Preview and GPT-5.5 have substantially exceeded the doubling trend the institute had been tracking since late 2024. 
Whether the results represent an isolated capability jump or the start of a new, faster trajectory remains unclear.<\/p>\n<p>The AISI estimated earlier this year that frontier models\u2019 80% reliability cyber time horizon \u2014 a measure of how long a task takes a human expert, used as a proxy for AI autonomy \u2014 had been doubling approximately every five months. That was itself roughly half the eight-month doubling time the institute estimated in November 2025. Mythos Preview and GPT-5.5 have since outperformed every trend line the institute has measured.<\/p>\n<p>\u201cFrontier AI\u2019s autonomous cyber and software capability is advancing quickly: the length of cyber tasks that frontier models can complete autonomously has doubled on the order of months, not years,\u201d the AISI wrote.<\/p>\n<p>The clearest evidence of the capability jump came from the AISI\u2019s cyber ranges, its structured simulations of multi-stage attacks against small, undefended enterprise networks. A newer checkpoint of Claude Mythos Preview became the first model to complete both of the institute\u2019s ranges. It solved \u201cThe Last Ones,\u201d a 32-step simulated corporate network attack, in 6 of 10 attempts, and completed \u201cCooling Tower\u201d \u2014 previously unsolved by any model \u2014 in 3 of 10 attempts. 
GPT-5.5 solved \u201cThe Last Ones\u201d in 3 of 10 attempts.<\/p>\n<p>Palo Alto Networks&#8230;<\/p>\n<p><a href=\"https:\/\/cyberscoop.com\/ai-autonomous-cyber-capability-benchmarks-broken-gpt5-claude-mythos\/\">Source<\/a><\/p>\n","protected":false},"excerpt":{"rendered":"<p>Researchers say AI just broke every benchmark for autonomous cyber capability https:\/\/cyberscoop.com\/ai-autonomous-cyber-capability-benchmarks-broken-gpt5-claude-mythos\/ Publish Date: 2026-05-13&#8230;<\/p>\n","protected":false},"author":1,"featured_media":245982,"comment_status":"closed","ping_status":"","sticky":false,"template":"","format":"standard","meta":{"fifu_image_url":"https:\/\/cyberscoop.com\/wp-content\/uploads\/sites\/3\/2026\/05\/GettyImages-2229149370-1-1.jpg","fifu_image_alt":"","footnotes":""},"categories":[15],"tags":[26,20,24],"class_list":["post-245981","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-cybersecurity","tag-ai","tag-artificial-intelligence","tag-cybersecurity"],"_links":{"self":[{"href":"https:\/\/news-you-need.com\/index.php\/wp-json\/wp\/v2\/posts\/245981"}],"collection":[{"href":"https:\/\/news-you-need.com\/index.php\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/news-you-need.com\/index.php\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/news-you-need.com\/index.php\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/news-you-need.com\/index.php\/wp-json\/wp\/v2\/comments?post=245981"}],"version-history":[{"count":1,"href":"https:\/\/news-you-need.com\/index.php\/wp-json\/wp\/v2\/posts\/245981\/revisions"}],"predecessor-version":[{"id":245983,"href":"https:\/\/news-you-need.com\/index.php\/wp-json\/wp\/v2\/posts\/245981\/revisions\/245983"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/news-you-need.com\/index.php\/wp-json\/wp\/v2\/media\/245982"}],"wp:attachment":[{"href":"https:\/\/news-you-need.com\/index.php\/wp-json\/wp\/v2
\/media?parent=245981"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/news-you-need.com\/index.php\/wp-json\/wp\/v2\/categories?post=245981"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/news-you-need.com\/index.php\/wp-json\/wp\/v2\/tags?post=245981"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}