Cloudflare makes it simple to deploy AI apps with Hugging Face, launches Workers AI to public

Cloudflare is now letting more developers bring their AI applications from Hugging Face onto its platform. In addition, the company has made its serverless GPU-powered inference service, known as Workers AI, generally available.

The Cloudflare-Hugging Face integration was announced nearly seven months ago. It makes it easy for models to be deployed onto Workers AI. With one click, developers can distribute them instantly. Fourteen curated Hugging Face models are supported by Cloudflare’s Workers AI across text generation, embeddings and sentence similarity.

“The recent generative AI boom has companies across industries investing massive amounts of time and money into AI. Some of it will work, but the real challenge of AI is that the demo is easy, but putting it into production is incredibly hard,” Cloudflare CEO Matthew Prince said in a statement. “We can solve this by abstracting away the cost and complexity of building AI-powered apps.”

“Workers AI is one of the most affordable and accessible solutions to run inference,” he continued. “And with Hugging Face and Cloudflare both deeply aligned in our efforts to democratize AI in a simple, affordable way, we’re giving developers the freedom and agility to choose a model and scale their AI apps from zero to global in an instant.”

Through Hugging Face, developers choose the open-source model they want to use, select “Deploy to Cloudflare Workers AI,” and instantly distribute the model. This ensures that the model arrives at all the right locations simultaneously, so users encounter neither lag nor incorrect experiences.

Hugging Face co-founder and Chief Technology Officer Julien Chaumond added, “Offering the most popular open models with a serverless API, powered by a global fleet of GPUs, is an amazing proposition for the Hugging Face community…”
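
For readers curious what calling such a serverless API might look like, here is a minimal Python sketch. It assumes Cloudflare's REST endpoint pattern for Workers AI (`/accounts/{account_id}/ai/run/{model}`) and a simple text-generation payload with a `prompt` field. The account ID, API token and model name are hypothetical placeholders, not values from the announcement, and the details should be checked against Cloudflare's current documentation.

```python
import json
import urllib.request

# Base URL of Cloudflare's public REST API.
API_BASE = "https://api.cloudflare.com/client/v4"


def build_inference_request(account_id, api_token, model, prompt):
    """Build (but do not send) a POST request that runs a model on Workers AI.

    All four arguments are supplied by the caller; the values used in the
    usage example below are placeholders for illustration only.
    """
    url = f"{API_BASE}/accounts/{account_id}/ai/run/{model}"
    body = json.dumps({"prompt": prompt}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={
            "Authorization": f"Bearer {api_token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def run_inference(account_id, api_token, model, prompt):
    """Send the request and return the decoded JSON response."""
    request = build_inference_request(account_id, api_token, model, prompt)
    with urllib.request.urlopen(request) as response:
        return json.load(response)


if __name__ == "__main__":
    # Hypothetical placeholders: substitute a real account ID, token and model.
    req = build_inference_request(
        "YOUR_ACCOUNT_ID",
        "YOUR_API_TOKEN",
        "@hf/example-org/example-model",
        "Summarize what serverless inference means in one sentence.",
    )
    print(req.get_method(), req.full_url)
```

Building the request separately from sending it keeps the sketch easy to inspect without credentials; `run_inference` performs the actual network call once real values are supplied.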

As for Workers AI, developers can now access GPUs deployed in more than 150 cities worldwide, including Cape Town, Durban, Johannesburg, Lagos, Amman, Buenos Aires, Mexico City, Mumbai, New Delhi and Seoul. Cloudflare is also improving Workers AI to support fine-tuned model weights, allowing developers to create and deploy specialized, domain-specific applications.