Pelayo Arbués



Qwen2.5 Omni: See, Hear, Talk, Write, Do It All!

Apr 29, 2025 (1 min read)

articles
literature-note

Metadata

Author: Simon Willison’s Weblog
Full Title: Qwen2.5 Omni: See, Hear, Talk, Write, Do It All!
URL: https://simonwillison.net/2025/Apr/28/qwen25-omni/#atom-everything

Highlights

I’m not sure how I missed this one at the time, but last month (March 27th) Qwen released their first multi-modal model that can handle audio and video in addition to text and images - and that has audio output as a core model feature.
As far as I can tell nobody has an easy path to getting it working on a Mac yet (the closest report I saw was this comment on Hugging Face). This release is notable because, while there’s a pretty solid collection of open weight vision LLMs now, multi-modal models that go beyond that are still very rare. Like most of Qwen’s recent models, Qwen2.5 Omni is released under an Apache 2.0 license.

