Meta is Shutting Down Its Llama API — Here's What Developers Need to Know
Meta announced Sunday that it will shut down the Llama API on July 6, 2026. The service, which has been in public preview since its launch, will stop accepting requests entirely after that date.
Once the shutdown takes effect, all API calls will return deactivation notices with redirection guidance. IT-NEWS has learned that the underlying Llama models are not affected: developers can still download the weights directly from Meta’s Llama download page.
The company recommends that teams relying on the API migrate to third-party providers that support Llama models. Several options already exist, including Amazon Bedrock, Google Cloud’s Vertex AI, and Groq, all of which serve Meta’s open-weight models through their own infrastructure.
Meta has not revealed why it chose to sunset the public preview rather than move it to general availability. A company spokesperson said Meta is “working on new ways for developers to build with Meta AI models” and promised more details in the coming weeks.
The move comes at a busy time for Meta’s AI division. CEO Mark Zuckerberg recently said the company’s next-generation model, codenamed “Watermelon,” has matched GPT-5.5 on benchmark tests. Meta is also reportedly building a cloud services business that would sell spare AI compute capacity and model access to outside customers.