AMD Launches Lemonade: A Fast, Open Source Local LLM Server That Uses Both GPU and NPU
AMD has released Lemonade, an open source local LLM server designed for developers who want fast, private AI inference without a cloud dependency. By leveraging both discrete GPUs and the Neural Processing Units (NPUs) built into modern Ryzen chips, Lemonade offers a developer-friendly alternative to Ollama, with OpenAI API compatibility out of the box.
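Because the server speaks the OpenAI chat-completions wire format, any OpenAI-compatible client can point at it instead of the cloud. Below is a minimal stdlib-only sketch; the base URL, port, and model name are assumptions for illustration, not Lemonade's documented defaults:

```python
import json
import urllib.request

# Assumed local endpoint; check Lemonade's docs for the actual host and port.
BASE_URL = "http://localhost:8000/api/v1"


def chat(prompt: str, model: str = "local-model") -> str:
    """Send one OpenAI-style chat completion request to the local server."""
    payload = {
        "model": model,  # hypothetical model name; use whatever Lemonade has loaded
        "messages": [{"role": "user", "content": prompt}],
        "stream": False,
    }
    req = urllib.request.Request(
        f"{BASE_URL}/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        body = json.load(resp)
    # OpenAI-compatible servers return choices[0].message.content
    return body["choices"][0]["message"]["content"]
```

The point of the compatibility claim is that nothing above is Lemonade-specific: the same request shape works against any OpenAI-style endpoint, so existing tooling can be redirected by changing only the base URL.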