H Company's Holo3 Sets New SOTA on Computer Use With Only 10B Active Parameters

Paris-based H Company has released Holo3, an agentic computer use model that scores 78.85% on OSWorld-Verified — the leading GUI automation benchmark — while activating only 10 billion of its 122 billion parameters, making it cost-competitive with much smaller models.

D.O.T.S AI Newsroom

AI News Desk

The computer use benchmark leaderboard has a new leader. H Company, the Paris-based AI startup, has released Holo3, a model that scores 78.85% on OSWorld-Verified, a demanding public benchmark for autonomous GUI interaction, while activating only 10 billion of its 122 billion total parameters. On that benchmark, the result puts Holo3 ahead of other publicly available computer use models, and its small active-parameter count means it should run at a fraction of the inference cost of frontier models such as GPT-5.4 or Claude Opus 4.6.
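To put the parameter figures in proportion, here is a minimal back-of-the-envelope sketch in Python. It uses only the two numbers quoted in this article (10 billion active, 122 billion total). The rule of thumb of roughly 2 FLOPs per active parameter per token for a forward pass is a common approximation, not something H Company has stated, and the function names are hypothetical. The estimate ignores attention cost and memory bandwidth, and all 122 billion weights still have to be stored.

```python
def active_fraction(active_params: float, total_params: float) -> float:
    """Share of the model's weights that take part in a single forward pass."""
    if total_params <= 0 or not 0 <= active_params <= total_params:
        raise ValueError("expected 0 <= active_params <= total_params and total_params > 0")
    return active_params / total_params


def forward_flops_per_token(params: float) -> float:
    """Rough forward-pass compute per token (assumption: about 2 FLOPs per parameter)."""
    return 2.0 * params


# Figures quoted in the article.
HOLO3_ACTIVE = 10e9
HOLO3_TOTAL = 122e9

fraction = active_fraction(HOLO3_ACTIVE, HOLO3_TOTAL)

# Compute ratio versus a hypothetical dense model that used all 122B weights per token.
dense_ratio = forward_flops_per_token(HOLO3_TOTAL) / forward_flops_per_token(HOLO3_ACTIVE)

print(f"active share of weights: {fraction:.1%}")      # 8.2%
print(f"dense-equivalent compute: {dense_ratio:.1f}x")  # 12.2x
```

Under these assumptions, each token touches about 8% of the weights, which is the basis for the cost comparison with much smaller models.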

The Agentic Learning Flywheel

What distinguishes Holo3 is not model scale but training methodology. H Company built what it calls an Agentic Learning Flywheel: a three-pillar pipeline that trains models specifically for real-world GUI navigation rather than general language understanding. The first pillar generates synthetic navigation data from human- and AI-authored instructions, covering the range of interfaces an enterprise agent might encounter. The second applies out-of-domain augmentation, programmatically extending training scenarios to cover edge-case interfaces the model was not explicitly shown. The third uses curated reinforcement learning with advanced data filtering to maximize decision quality under uncertainty.
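The data flow through the three pillars can be pictured with a toy sketch. This is not H Company's implementation, which the article does not describe in detail. Every name here (`Episode`, `synthesize`, `augment`, `filter_for_rl`, `toy_reward`), the interface labels, and the reward threshold are hypothetical stand-ins chosen only to show the shape of a generate, augment, then filter pipeline.

```python
from dataclasses import dataclass, replace
from typing import Callable, List


@dataclass(frozen=True)
class Episode:
    instruction: str  # task text, human- or AI-authored
    interface: str    # hypothetical label for the GUI the task runs in


def synthesize(instructions: List[str], interface: str) -> List[Episode]:
    """Pillar 1 (toy version): pair each instruction with a known interface."""
    return [Episode(text, interface) for text in instructions]


def augment(episodes: List[Episode], unseen_interfaces: List[str]) -> List[Episode]:
    """Pillar 2 (toy version): replay each task on interfaces absent from the seed data."""
    extra = [replace(ep, interface=ui) for ep in episodes for ui in unseen_interfaces]
    return episodes + extra


def filter_for_rl(
    episodes: List[Episode],
    reward_fn: Callable[[Episode], float],
    min_reward: float,
) -> List[Episode]:
    """Pillar 3 (toy version): keep only episodes whose reward clears a threshold."""
    return [ep for ep in episodes if reward_fn(ep) >= min_reward]


def toy_reward(ep: Episode) -> float:
    # Stand-in for a real rollout score: penalize one unfamiliar interface.
    return 0.4 if ep.interface == "legacy-erp" else 0.9


seed = synthesize(["approve the invoice", "update the budget sheet"], "webmail")
augmented = augment(seed, ["legacy-erp", "crm"])
curated = filter_for_rl(augmented, toy_reward, min_reward=0.5)

print(len(seed), len(augmented), len(curated))  # 2 6 4
```

In a real system the reward would come from actual agent rollouts and the augmentation would alter the interface itself. Here both are reduced to labels so the three stages stay visible.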

The result is a model that can sustain multi-step reasoning across applications without losing state or intent between them: retrieving equipment prices from PDFs, cross-referencing employee budgets in spreadsheets, and executing personalized approval or rejection workflows in email, all within a single task sequence. H Company validated the system on its H Corporate Benchmark, using 486 multi-step enterprise tasks spanning e-commerce, business software, collaboration platforms, and complex multi-application workflows.

Open Weights and the Adoption Strategy

Alongside the flagship, H Company released Holo3-35B-A3B — a smaller variant with 3 billion active parameters — under the Apache 2.0 license on Hugging Face, with a free-tier inference API included. The strategy is deliberate: seed the developer ecosystem with a capable open model while reserving the full production system for enterprise deployment. This mirrors the approach taken by Mistral and Meta's Llama team, using open releases to build developer adoption and benchmark credibility simultaneously.

The roadmap signals what H Company considers the next frontier: Adaptive Agency, meaning models that can learn to navigate entirely new, bespoke enterprise software in real time, without any pre-training on those interfaces. Holo3 achieves mastery over known interface patterns; the next generation is designed to generalize to the unknown. For enterprise buyers evaluating computer use deployments, or building agentic workflows that depend on GUI interaction, Holo3 is the clearest evidence so far that production-ready GUI agents have arrived.
