Training Methods — AI News & Articles | D.O.T.S AI News


Training Methods

1 article tagged "Training Methods"

Alibaba's Qwen Team Fixes the Core Problem With Reasoning Model Training — and Doubles Thought Length in the Process

Research | 3 min read


Reinforcement learning for reasoning models typically assigns the same reward to every token in a response, whether that token was the pivot that unlocked a solution or just a filler comma. Alibaba's Qwen team has built FIPO, an algorithm that instead assigns rewards based on each token's downstream influence. The reported results include doubled reasoning depth, achieved without adding a separate value model.

D.O.T.S AI Newsroom | Apr 5, 2026 | 3 min read
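
To make the contrast in the summary concrete, here is a minimal Python sketch of uniform versus influence-weighted per-token credit. It is an illustration only and not FIPO's actual formula, which the summary does not give: the function names, the `influence` scores, and the normalization rule are all hypothetical assumptions.

```python
from typing import List


def uniform_credit(sequence_reward: float, num_tokens: int) -> List[float]:
    """Baseline: every token receives the same sequence-level reward."""
    return [sequence_reward] * num_tokens


def influence_weighted_credit(
    sequence_reward: float, influence: List[float]
) -> List[float]:
    """Scale the reward per token by a hypothetical, non-negative influence score.

    ASSUMPTION: how influence is measured is not specified in the article;
    here it is simply passed in. Weights are normalized to average 1, so the
    mean per-token credit matches the uniform baseline.
    """
    total = sum(influence)
    if total <= 0:
        # No usable influence signal: fall back to uniform credit.
        return uniform_credit(sequence_reward, len(influence))
    n = len(influence)
    return [sequence_reward * (score * n / total) for score in influence]


# Four tokens; the third is treated as the "pivot" (made-up scores).
credits = influence_weighted_credit(1.0, [1.0, 1.0, 5.0, 1.0])
print(credits)  # [0.5, 0.5, 2.5, 0.5]
```

In this toy version the pivot token receives five times the credit of a filler token, while the total credit handed out stays the same as in the uniform case.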