The AI Inference platform

Workers AI lets you run AI inference globally with one API call. No GPUs to manage, no capacity planning. Just intelligent machine learning models running where they're needed, on Cloudflare's global network.

Start building for free View docs

Serverless pricing

Pay-per-inference pricing with no idle costs. No guessing what capacity you will need.

Rich model catalog

50+ models running close to users in 200+ cities

Widely compatible

One API call, works with any OpenAI SDK or task type

Scale up and down

Inference is hard to predict and spiky in nature, unlike training. GPU utilization is, on average, only 20-40% — with one-third of organizations utilizing less than 15%. Workers AI allows customers to save by only paying for usage. No guessing or committing to hardware that goes unused.
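
To make the utilization argument concrete, here is a small sketch of the arithmetic. The hourly price is a made-up placeholder, not a real quote; only the 15%, 20%, and 40% utilization figures come from the paragraph above.

```typescript
// Effective cost of one *useful* GPU-hour when the hardware is paid for
// around the clock but is only busy part of the time.
function effectiveCostPerUsefulHour(hourlyPriceUSD: number, utilization: number): number {
  if (utilization <= 0 || utilization > 1) {
    throw new RangeError("utilization must be in the range (0, 1]");
  }
  return hourlyPriceUSD / utilization;
}

// Assumption: a hypothetical list price, chosen only for illustration.
const hypotheticalHourlyPriceUSD = 2;

for (const utilization of [0.15, 0.2, 0.4]) {
  const cost = effectiveCostPerUsefulHour(hypotheticalHourlyPriceUSD, utilization);
  const multiplier = cost / hypotheticalHourlyPriceUSD;
  console.log(
    `${Math.round(utilization * 100)}% utilized: ${multiplier.toFixed(2)}x the list price per useful hour`
  );
}
```

At 20% utilization each useful hour costs five times the list price, and at 40% it costs two and a half times. Usage-based billing removes that multiplier because idle time is not billed.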

[Figure: what you pay for on a hyperscaler compared with what you pay for on Cloudflare]

AI models easily accessible via code, OpenAI SDK or API

Test, prototype, and evaluate the latest LLMs with the speed and reliability of a production environment, accessible in seconds.
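
As a rough illustration of the OpenAI-compatible route, the sketch below builds a chat completions request by hand. It assumes the OpenAI-compatible base URL has the form `.../accounts/{account_id}/ai/v1` and that the API token is sent as a bearer token. The helper name and the placeholder credentials are illustrative, so check the Workers AI docs before relying on them.

```typescript
interface ChatMessage {
  role: "system" | "user" | "assistant";
  content: string;
}

interface PreparedRequest {
  url: string;
  method: "POST";
  headers: Record<string, string>;
  body: string;
}

// Hypothetical helper: builds (but does not send) a chat completions request.
// The base URL shape is an assumption about the OpenAI-compatible endpoint.
function buildChatCompletionRequest(
  accountId: string,
  apiToken: string,
  model: string,
  messages: ChatMessage[]
): PreparedRequest {
  return {
    url: `https://api.cloudflare.com/client/v4/accounts/${accountId}/ai/v1/chat/completions`,
    method: "POST",
    headers: {
      Authorization: `Bearer ${apiToken}`,
      "Content-Type": "application/json",
    },
    body: JSON.stringify({ model, messages }),
  };
}

// Placeholder credentials; replace with real values before sending.
const prepared = buildChatCompletionRequest(
  "YOUR_ACCOUNT_ID",
  "YOUR_API_TOKEN",
  "@cf/moonshotai/kimi-k2.6",
  [{ role: "user", content: "What is the origin of the phrase Hello, World" }]
);
console.log(prepared.url);
```

Passing `prepared.url` and the remaining fields to `fetch` would send the request. An OpenAI SDK configured with the same base URL and token should produce an equivalent call.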

Kimi K2.6

Powerful vision and agentic tool calling model

GLM 4.7 Flash

Rapid multilingual agent with expert tool calling

GPT-OSS-120B

Specialized for coding and debugging

Llama 4 Scout

Balanced generalist for everyday tasks

Try in Cloudflare AI Playground See all models

Run any AI model with one API call

Call any model directly from your code using a single endpoint. Workers AI handles provisioning, scaling, and latency optimization automatically.

const response = await env.AI.run('@cf/moonshotai/kimi-k2.6', {
  messages: [
    { role: 'system', content: 'You are a friendly assistant' },
    { role: 'user', content: 'What is the origin of the phrase Hello, World' },
  ],
});

curl https://api.cloudflare.com/client/v4/accounts/$CLOUDFLARE_ACCOUNT_ID/ai/run/@cf/moonshotai/kimi-k2.6 \
  -X POST \
  -H "Authorization: Bearer $CLOUDFLARE_AUTH_TOKEN" \
  -d '{ "messages": [{ "role": "system", "content": "You are a friendly assistant" }, { "role": "user", "content": "Why is pizza so good" }]}'
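
The same REST call can be made from TypeScript outside a Worker. The sketch below mirrors the curl command and assumes the response arrives in a JSON envelope with `success`, `errors`, and `result` fields. The helper names and the envelope type are illustrative, not an official client.

```typescript
// Assumed shape of the JSON envelope returned by the REST API.
interface RunEnvelope {
  success: boolean;
  errors: { code?: number; message: string }[];
  result?: unknown;
}

// Builds the same endpoint URL that the curl command targets.
function runUrl(accountId: string, model: string): string {
  return `https://api.cloudflare.com/client/v4/accounts/${accountId}/ai/run/${model}`;
}

// Returns the model output, or throws with the API's error messages.
function unwrapResult(envelope: RunEnvelope): unknown {
  if (!envelope.success) {
    const detail = envelope.errors.map((e) => e.message).join("; ") || "unknown error";
    throw new Error(`Workers AI request failed: ${detail}`);
  }
  return envelope.result;
}

// Sends the request; accountId and apiToken are supplied by the caller.
async function runModel(
  accountId: string,
  apiToken: string,
  model: string,
  input: object
): Promise<unknown> {
  const response = await fetch(runUrl(accountId, model), {
    method: "POST",
    headers: {
      Authorization: `Bearer ${apiToken}`,
      "Content-Type": "application/json",
    },
    body: JSON.stringify(input),
  });
  return unwrapResult((await response.json()) as RunEnvelope);
}
```

Keeping URL construction and envelope handling in separate functions makes both easy to check without a network call.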

export interface Env {
  AI: Ai;
}

export default {
  async fetch(request, env): Promise<Response> {
    const response = await env.AI.run('@cf/black-forest-labs/flux-1-schnell', {
      prompt: 'a bengal cat vibe coding to music',
      seed: Math.floor(Math.random() * 10),
    });
    // The model returns the image as a base64 string; decode it to a binary string
    const binaryString = atob(response.image);
    // Convert each character to its byte value (charCodeAt always returns a number)
    const img = Uint8Array.from(binaryString, (m) => m.charCodeAt(0));
    return new Response(img, {
      headers: {
        'Content-Type': 'image/jpeg',
      },
    });
  },
} satisfies ExportedHandler<Env>;

curl https://api.cloudflare.com/client/v4/accounts/$CLOUDFLARE_ACCOUNT_ID/ai/run/@cf/black-forest-labs/flux-1-schnell \
  -X POST \
  -H "Authorization: Bearer $CLOUDFLARE_API_TOKEN" \
  -d '{ "prompt": "cyberpunk cat", "seed": 7 }'

Practical AI at the Edge

Run real-world AI workloads directly on Cloudflare's global network — from LLMs to image generation and embeddings. No GPU clusters, no orchestration layers — just fast, scalable inference wherever your users are.

Workers AI

Explore a Rich Catalog of 50+ Ready-to-Use Models

Real-world examples in action

Image generation

Execute image generation, manipulation, and creative workflows without managing GPU infrastructure. Perfect for content platforms, social apps, and creative tools.

Speech-to-text in real time

Transcribe, analyze, and generate audio content without specialized infrastructure. Built for voice agents, note-taking apps, and media processing.

Embeddings

Create intelligent search, recommendations, and context-aware features using vector embeddings. Integrates seamlessly with Vectorize and AI Search for complete AI workflows.
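
A minimal sketch of how embedding-based search ranks results, using cosine similarity. The three-dimensional vectors and document IDs are toy values invented for illustration; real embeddings would come from an embedding model and have far more dimensions, and a vector database would normally handle the ranking.

```typescript
interface EmbeddedDoc {
  id: string;
  embedding: number[];
}

// Cosine similarity: 1 means same direction, 0 means unrelated.
function cosineSimilarity(a: number[], b: number[]): number {
  if (a.length === 0 || a.length !== b.length) {
    throw new Error("vectors must be non-empty and of equal length");
  }
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  if (normA === 0 || normB === 0) return 0;
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Scores every document against the query and sorts best match first.
function rankBySimilarity(query: number[], docs: EmbeddedDoc[]): { id: string; score: number }[] {
  return docs
    .map((doc) => ({ id: doc.id, score: cosineSimilarity(query, doc.embedding) }))
    .sort((x, y) => y.score - x.score);
}

// Toy data (assumption): hand-written vectors standing in for real embeddings.
const docs: EmbeddedDoc[] = [
  { id: "gpu", embedding: [0, 1, 0] },
  { id: "billing", embedding: [0.9, 0.1, 0] },
  { id: "pricing", embedding: [1, 0, 0] },
];

console.log(rankBySimilarity([1, 0.05, 0], docs));
```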

LLMs

Perform a wide range of natural language tasks. Use large language models for text generation, classification, question answering, and other complex language-based operations through a simple API.
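
As one example of a language task, classification can be framed as a chat prompt. The sketch below assumes the same `messages` shape used in the chat examples on this page; the helper names and label set are hypothetical, and the model reply is simulated rather than fetched.

```typescript
interface ChatMessage {
  role: "system" | "user";
  content: string;
}

// Builds a prompt asking the model to answer with exactly one allowed label.
function buildClassificationMessages(text: string, labels: string[]): ChatMessage[] {
  return [
    {
      role: "system",
      content: `Classify the user's text. Reply with exactly one of: ${labels.join(", ")}.`,
    },
    { role: "user", content: text },
  ];
}

// Maps a free-form model reply back onto the allowed labels, or null if none match.
function parseLabel(reply: string, labels: string[]): string | null {
  const normalized = reply.trim().replace(/[.!]+$/, "").toLowerCase();
  return labels.find((label) => label.toLowerCase() === normalized) ?? null;
}

const labels = ["positive", "negative", "neutral"]; // hypothetical label set
const messages = buildClassificationMessages("The setup took five minutes. Great!", labels);
console.log(messages);

// Simulated reply, standing in for what a model might return.
console.log(parseLabel(" Positive. ", labels));
```

Inside a Worker, `messages` could be passed to `env.AI.run` with a chat model, and the text of the reply handed to `parseLabel`.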

Workers AI Pricing

50+ models running at the edge. View AI pricing details

Component    Free    Paid
Neurons      —       $0.011 / thousand neurons
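
A quick sketch of how the paid rate translates into a bill. The rate comes from the pricing table above; the monthly usage figure is a hypothetical input, and the calculation ignores any free allocation or other plan details.

```typescript
// Paid rate from the pricing table: $0.011 per thousand neurons.
const USD_PER_THOUSAND_NEURONS = 0.011;

// Estimated paid cost for a given number of neurons.
function neuronCostUSD(neurons: number): number {
  if (neurons < 0) {
    throw new RangeError("neurons must be non-negative");
  }
  return (neurons / 1000) * USD_PER_THOUSAND_NEURONS;
}

// Assumption: an invented monthly usage figure, for illustration only.
const hypotheticalMonthlyNeurons = 2500000;

console.log(`Estimated cost: $${neuronCostUSD(hypotheticalMonthlyNeurons).toFixed(2)}`);
```

Under these assumptions, 2.5 million neurons would come to about $27.50.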

Shopify

"For Shopify, the real challenge is not about how many different pieces of complex technology we can use but the opposite. Cloudflare helps us find a simple way to achieve something very complex that we can scale and maintain."

Duncan Davidson, VP of Developer Productivity

Powerful primitives, seamlessly integrated

Built on systems powering 20% of the Internet, Workers AI runs on the same infrastructure Cloudflare uses to build Cloudflare. Enterprise-grade reliability, security, and performance are standard.

Compute

Browser Run: Automated browsers

Containers: Any language, anywhere

Durable Objects: Stateful compute

Sandboxes: Secure code execution

Workers: Global serverless functions

Workers for Platforms: Programmable platform solutions

Workflows: Process orchestration

Storage

Artifacts: Git-native versioned storage

D1: Serverless SQL

Data Platform: Ingest, catalog & query

Hyperdrive: Global databases

Queues: Message processing

R2: Egress-free storage

KV: Ultra-fast key-value storage

AI

Agents: Build stateful AI agents

AI Gateway: AI observability

AI Search: Instant retrieval

Vectorize: Vector database

Workers AI: Edge AI models

SASE / Zero Trust

SASE: Cloudflare SASE platform

Access: Safe access to private applications

Secure Web Gateway: DNS filtering & secure browsing

Data Loss Prevention: Protect sensitive data

Browser Isolation: Protect users and data with Cloudflare Browser Isolation

WAN: Cloud-delivered enterprise networking

Email Security: AI-driven email protection

Security

DDoS Protection: Mitigation solutions

Rate Limiting: Abuse prevention

SSL: Secure your site with SSL

Turnstile: A CAPTCHA replacement solution

WAF: Web Application Firewall

Magic Transit: DDoS protection for networks

Network & Content Delivery

Bot Management: Block bad bots

CDN: Faster delivery & caching

DNS: Fast DNS

Load Balancing: Zero downtime

Page Shield: Client-side protection

TURN / SFU: Real-time infra

Analytics: Web performance & security

Build without boundaries

Join thousands of developers who've eliminated infrastructure complexity and deployed globally with Cloudflare. Start building for free — no credit card required.
