Research

Mantis Biotech Is Building 'Digital Twins' of Humans to Solve Medicine's Training Data Problem

Mantis Biotech generates synthetic medical datasets — 'digital twins' of the human body — to supply AI training data that real patient records cannot legally or practically provide. The approach targets one of healthcare AI's most persistent bottlenecks: the gap between what AI systems need to learn and what hospitals can share.

D.O.T.S AI Newsroom

AI News Desk

Mantis Biotech is developing synthetic "digital twin" datasets of the human body to address what the company describes as one of medicine's most entrenched structural problems: the scarcity of high-quality, accessible training data for healthcare AI. The startup draws on disparate sources of medical information (imaging studies, genomic data, lab results, clinical notes) and generates synthetic datasets representing human anatomy, physiology, and behavior at scale.

The Data Availability Problem in Medical AI

Most medical AI systems require large volumes of labeled patient examples to train effectively. In practice, that data is difficult to access. HIPAA and equivalent international privacy regulations impose significant consent, de-identification, and data governance requirements on patient records. Institutional review boards, data sharing agreements, and legal liability concerns further slow access. For rare diseases, or uncommon presentations of common conditions, the data may simply not exist in sufficient quantity anywhere in the world, regardless of access barriers.

Mantis' synthetic approach aims to sidestep these constraints. Synthetic datasets generated to represent real biological processes can be used to train diagnostic AI, test clinical decision support systems, and model pharmaceutical interventions, all without touching actual patient records. In principle, the company can generate arbitrarily large datasets for any condition, including rare diseases where real-world data is inherently scarce.

Why Now

The timing reflects the convergence of two trends: advances in generative AI that can produce high-fidelity biological simulations, and growing enterprise appetite for medical AI that does not stall in regulatory review. Healthcare AI deployments have accelerated over the past two years, but the bottleneck has increasingly shifted from model capability to training data, which is precisely the problem Mantis targets.

Pharmaceutical research, diagnostic imaging AI, and clinical trial modeling represent Mantis' initial market segments. Each involves a different use case for synthetic data: drug discovery benefits from the ability to simulate rare patient populations; diagnostic imaging AI requires large volumes of labeled scans; clinical trial modeling needs diverse physiological variation to test drug responses across demographics.

Mantis Biotech has raised an undisclosed amount of funding and has not released revenue figures or named enterprise customers. The company faces a validation challenge common to synthetic data startups: demonstrating that models trained on synthetic data perform equivalently when applied to real patient populations. That claim requires rigorous clinical validation before healthcare systems will adopt such models in high-stakes applications.
