DeepSeek V4 Lands With a 1.6-Trillion-Parameter Open-Weight Model — and Pricing That Undercuts Every Frontier API
DeepSeek previewed two new models on Thursday: V4 Flash (284B parameters, 13B active) and V4 Pro (1.6T parameters, 49B active). Both are mixture-of-experts with 1M-token context windows, and V4 Pro is now the largest open-weight model publicly available. Pricing — $0.14/$0.28 per million tokens for Flash, $0.145/$3.48 for Pro — undercuts GPT-5.5 and Gemini 3.1 Pro by an order of magnitude on input.

D.O.T.S AI Newsroom
AI News Desk
DeepSeek announced two new models on Thursday — DeepSeek V4 Flash and DeepSeek V4 Pro — that the Chinese lab claims have "almost closed the gap" with leading frontier models on reasoning benchmarks while matching GPT-5.4 on coding competitions. V4 Flash is a 284-billion-parameter mixture-of-experts model with 13 billion active parameters per forward pass; V4 Pro pushes the architecture to 1.6 trillion total parameters with 49 billion active, making it the largest open-weight model publicly available anywhere. Both models support 1-million-token context windows but remain text-only, forgoing the multimodal capabilities (audio, video, image) that GPT-5.5, Gemini 3.1, and Claude Opus 4.7 have already shipped. The release lands 24 hours after OpenAI shipped GPT-5.5 at double its previous API price — a juxtaposition the DeepSeek launch seems engineered to highlight.
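The total-versus-active parameter split is what makes models this large economical to serve: in a mixture-of-experts architecture, each token is routed to a small subset of experts, so per-token inference compute scales with active parameters rather than total parameters. A rough back-of-the-envelope sketch, using the parameter counts from the announcement (the ~2 FLOPs per active parameter per token rule of thumb is a common estimate, not a figure DeepSeek published):

```python
# Per-token inference compute scales with *active* parameters in an MoE,
# not total parameters. Parameter counts are from the announcement; the
# FLOPs-per-parameter constant is a rule-of-thumb assumption.
FLOPS_PER_ACTIVE_PARAM = 2

models = {
    # name: (total params, active params per forward pass)
    "V4 Flash": (284e9, 13e9),
    "V4 Pro": (1.6e12, 49e9),
}

results = {}
for name, (total, active) in models.items():
    active_frac = active / total
    tflops_per_token = active * FLOPS_PER_ACTIVE_PARAM / 1e12
    results[name] = (active_frac, tflops_per_token)
    print(f"{name}: {active_frac:.1%} of weights active, "
          f"~{tflops_per_token:.3f} TFLOPs per token")
```

On these numbers, V4 Pro is nearly 6x larger than Flash in total parameters but needs only about 3.8x the per-token compute, since under 4% of its weights are active on any given forward pass.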
The Pricing Cliff
The story is the price tag. V4 Flash is offered at $0.14 per million input tokens and $0.28 per million output tokens; V4 Pro is $0.145 input and $3.48 output. By comparison, GPT-5.5 currently lists at $5/$15 per million tokens (and reportedly costs more under usage caps), Gemini 3.1 Pro at roughly $2.50/$10, and Claude Opus 4.7 at $15/$75. On input, DeepSeek is therefore roughly 17x cheaper than Gemini 3.1 Pro, 35x cheaper than GPT-5.5, and over 100x cheaper than Claude Opus 4.7; that puts its prices close to the structural cost of inference for a model running on commodity Chinese hardware. The lab has also released V4 Pro's weights openly, meaning enterprises and self-hosters can avoid even the API price entirely if they have the GPUs to run a 1.6T-parameter MoE locally. This combination of open weights and an API price floor that nobody else can match is the same playbook DeepSeek used with V3, and it is now starting to apply real pricing pressure on Western labs whose unit economics depend on enterprises being willing to pay frontier-model premiums.
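The input-price ratios can be checked directly from the listed figures. A quick sanity-check script (all prices are the ones quoted above; the script itself is purely illustrative):

```python
# Input price per million tokens (USD), as listed in the article.
deepseek_input = {"V4 Flash": 0.14, "V4 Pro": 0.145}
frontier_input = {"GPT-5.5": 5.00, "Gemini 3.1 Pro": 2.50, "Claude Opus 4.7": 15.00}

# Ratio of each frontier model's input price to each DeepSeek model's.
ratios = {}
for ds_name, ds_price in deepseek_input.items():
    for fr_name, fr_price in frontier_input.items():
        ratios[(ds_name, fr_name)] = fr_price / ds_price

for (ds, fr), r in sorted(ratios.items(), key=lambda kv: kv[1]):
    print(f"{fr} input costs {r:.0f}x as much as {ds}")
```

The ratios run from about 17x (either DeepSeek model versus Gemini 3.1 Pro) to about 107x (V4 Flash versus Claude Opus 4.7), which is where the order-of-magnitude framing comes from.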
Where the Gap Still Lives
DeepSeek's own benchmark disclosures suggest V4 Pro lags frontier models by approximately three to six months on knowledge tests, and the lack of multimodality means agentic workflows that depend on screen understanding, voice input, or video reasoning still need a Western frontier model in the loop. But for the very large fraction of enterprise AI deployments that are text-in / text-out — code review, document analysis, customer support agents, RAG pipelines — V4 Flash is now likely the price-performance leader by a wide margin. The competitive question for OpenAI, Anthropic, and Google is no longer whether DeepSeek can reach frontier capability; it is how long they can keep charging frontier prices when an open-weight Chinese model is one network hop away.