AI Offensive Cyber Capabilities Are Doubling Every Six Months, Safety Study Finds

A new safety study finds that AI models' ability to exploit security vulnerabilities has been doubling every 5.7 months since 2024. Opus 4.6 and GPT-5.3 Codex can now autonomously solve cyberattack tasks that take human experts roughly three hours — and the curve is not flattening.

D.O.T.S AI Newsroom

AI News Desk

A new study from AI safety researchers has quantified what security professionals have been warning about for two years: AI offensive cyber capabilities are scaling at an alarming rate. The capability doubling time — the period over which AI models double their ability to exploit security vulnerabilities — currently stands at 5.7 months, a pace that has been consistent since 2024.

The most capable models tested — Anthropic's Opus 4.6 and OpenAI's GPT-5.3 Codex — can now autonomously complete cyberattack tasks that, when given to skilled human security professionals, take approximately three hours. That benchmark matters because it moves AI offensive capability from theoretical concern to operational reality: these are not toy problems or academic exercises, but representative tasks from real-world penetration testing work.

The Doubling Curve

What makes the 5.7-month figure alarming is its consistency. Unlike many AI capability curves that show rapid early gains followed by diminishing returns, the offensive cyber capability trajectory has maintained a steady exponential pace. If that trajectory holds, 18 months spans about 3.2 doublings, so the models available then will have roughly nine times the offensive cyber capability of today's leaders (exactly eight times under the rounded six-month doubling time).
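
The 18-month extrapolation is simple compound doubling, and the arithmetic can be checked directly. The sketch below assumes the doubling time stays constant over the whole period, which the study reports only for the span since 2024. The function name `growth_factor` is illustrative and does not come from the study.

```python
def growth_factor(months: float, doubling_time_months: float) -> float:
    """Multiplicative growth after `months`, assuming a fixed doubling time."""
    return 2.0 ** (months / doubling_time_months)


# Doubling time reported in the study, and the rounded six-month value.
reported = growth_factor(18, 5.7)
rounded = growth_factor(18, 6.0)

print(f"5.7-month doubling over 18 months: {reported:.1f}x")
print(f"6.0-month doubling over 18 months: {rounded:.1f}x")
```

With the reported 5.7-month figure the multiplier comes out a little under nine, while the round figure of eight corresponds to a doubling time of exactly six months. Small changes in the doubling time compound quickly, so any long-range projection from this curve should be read as an order-of-magnitude estimate.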

The study evaluated models on a standardized set of cyberattack tasks spanning vulnerability identification, exploit development, privilege escalation, and lateral movement, the core stages of a real-world intrusion. Models received no human guidance during task execution, so the results measure autonomous capability rather than assisted performance.

The Policy Problem

The research arrives at a moment of active policy debate about how to govern AI systems with dual-use capabilities. The U.S. AI Safety Institute has been developing evaluation frameworks for frontier model capabilities, with cybersecurity cited as a priority domain. But evaluation frameworks only matter if they inform deployment decisions — and the current regulatory environment provides no mandatory mechanism for labs to restrict model deployment based on capability thresholds.

Anthropic has been the most explicit major lab about its cybersecurity capability concerns, restricting Claude Mythos access to a narrow set of cybersecurity evaluation partners. The new study provides quantitative grounding for that caution — and raises the question of whether the rest of the industry is moving fast enough in the same direction.

What the Security Community Needs to Do Now

The practical implication for security teams is not hypothetical: AI-assisted attacks are already more capable than many defenders' current detection and response tooling assumes. The asymmetry between offense and defense that AI is creating is not a future problem. Red teams and penetration testers are already using these models operationally. The defenders who aren't adjusting their threat models accordingly are behind.
