Alibaba's Qwen3.5-Omni Taught Itself to Code From Voice and Video — Nobody Asked It To
Alibaba's new multimodal model, Qwen3.5-Omni, demonstrated a capability its developers did not train for: writing functional code from spoken instructions and video demonstrations, with no examples of the task in its training data. The finding adds to a growing body of evidence that large models develop capabilities through mechanisms researchers cannot yet fully explain or predict.