AI Search
Create AI-powered search for your data
Connect your data and deliver natural language search in your applications.
Deploy RAG architecture in minutes
Always Up-to-Date
Built to Run at the Edge
Core Capabilities

Continuously Updated Indexes
Your data, always fresh. AI Search automatically tracks and updates content changes without manual intervention — keeping LLM responses aligned with your latest data.

Build full AI applications with AI Search and Workers
Search from your Worker. Call your AI Search instance directly from your Workers apps using standard JavaScript or TypeScript.

Metadata Filtering
Powerful multi-tenant search. Use metadata filters to build user-specific search contexts, enabling secure multi-user experiences with a single AI Search instance.
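As a sketch of how per-tenant filtering might look, assuming documents are organized under per-tenant folder prefixes (a hypothetical layout) and a comparison-style filter object passed alongside the query:

```javascript
// Build a filter that scopes search results to one tenant's folder.
// The "tenants/<id>/" prefix is an assumed storage layout, not a requirement.
function buildTenantFilter(tenantId) {
  return { type: "eq", key: "folder", value: `tenants/${tenantId}/` };
}

// Inside a Worker, the filter would be passed with the query
// (binding and instance names are illustrative):
//
//   const answer = await env.AI.autorag("my-rag").aiSearch({
//     query: "What is my plan's storage limit?",
//     filters: buildTenantFilter("customer-42"),
//   });

console.log(buildTenantFilter("customer-42").value); // "tenants/customer-42/"
```

Scoping each request to a tenant-specific prefix is what lets a single AI Search instance serve many users without leaking data across them.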

Web Parsing Source
Bring your website as a source. Generate RAG pipelines directly from your website, and keep them current whenever content is updated.

Edge-Based Inference
Fast, local AI responses. Uses Workers AI to create embeddings and run inference at the edge, closer to users — reducing latency and improving responsiveness.
NLWeb and AI Search
Product Chatbot
Multi-Tenant or Personalized AI Assistants
How it Works

Receive query from AI Search API
The query workflow begins when you send a request to your instance's AI Search endpoint (retrieval plus generation) or its Search endpoint (retrieval only).

Query rewriting (optional)
AI Search provides the option to rewrite the input query using one of Workers AI's LLMs to improve retrieval quality by transforming the original query into a more effective search query.
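Since rewriting is opt-in, it would be enabled per request. A minimal sketch of building such a request, treating the exact option name (`rewrite_query`) as an assumption:

```javascript
// Build the options object for a search call, opting into query rewriting.
// The rewrite_query flag name is assumed for illustration.
function buildSearchRequest(query, rewrite = true) {
  return {
    query,
    rewrite_query: rewrite, // ask AI Search to rewrite the query with an LLM first
  };
}

// e.g. await env.AI.autorag("my-rag").aiSearch(buildSearchRequest("reset my password"));
```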

Embedding the query
The rewritten (or original) query is transformed into a vector via the same embedding model used to embed your data so that it can be compared against your vectorized data to find the most relevant matches.
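Conceptually, comparing the query vector against your vectorized data is a similarity computation. This is a plain-JavaScript illustration of cosine similarity, not AI Search internals (Vectorize performs this server-side):

```javascript
// Cosine similarity: dot product of two vectors divided by the product
// of their magnitudes. Scores near 1 mean "pointing the same way".
function cosineSimilarity(a, b) {
  let dot = 0, normA = 0, normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

console.log(cosineSimilarity([1, 0], [1, 0])); // 1 (identical direction)
console.log(cosineSimilarity([1, 0], [0, 1])); // 0 (orthogonal)
```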

Vector search in Vectorize
The query vector is searched against stored vectors in the associated Vectorize database for your AI Search.
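The top-K lookup can be pictured with this tiny in-memory stand-in (not the Vectorize API itself; ids, fields, and metadata shown are illustrative):

```javascript
// Score every stored vector against the query vector (dot product here),
// then return the k best matches with their metadata.
function topK(queryVector, index, k) {
  const score = (v) => v.reduce((sum, x, i) => sum + x * queryVector[i], 0);
  return [...index]
    .sort((a, b) => score(b.values) - score(a.values))
    .slice(0, k)
    .map(({ id, metadata }) => ({ id, metadata }));
}

const index = [
  { id: "chunk-1", values: [0.9, 0.1], metadata: { source: "docs/intro.md" } },
  { id: "chunk-2", values: [0.1, 0.9], metadata: { source: "docs/api.md" } },
];
console.log(topK([1, 0], index, 1)[0].id); // "chunk-1"
```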

Metadata + content retrieval
Vectorize returns the most relevant chunks and their metadata, and the original content is retrieved from the R2 bucket. Together, these are passed to a text-generation model.

Response generation
A text-generation model from Workers AI is used to generate a response using the retrieved content and the user's original query.
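How the retrieved content and the query might be combined into a generation prompt can be sketched like this (a simplified stand-in for what AI Search does internally; the prompt wording is illustrative):

```javascript
// Assemble numbered context chunks and the user's question into one prompt
// for a text-generation model.
function buildPrompt(chunks, query) {
  const context = chunks.map((c, i) => `[${i + 1}] ${c}`).join("\n");
  return `Answer using only the context below.\n\nContext:\n${context}\n\nQuestion: ${query}`;
}

console.log(buildPrompt(["Alpha chunk", "Beta chunk"], "What is alpha?"));
```

Grounding the model in only the retrieved chunks is what keeps responses aligned with your data rather than the model's prior knowledge.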
RAG in action
Examples showing how to query, filter, and integrate AI Search into your applications.

export default {
  async fetch(request, env) {
    const { searchParams } = new URL(request.url);
    const query = searchParams.get('q');

    if (!query) {
      return new Response('Please provide a query parameter', { status: 400 });
    }

    // Search your AI Search instance
    const answer = await env.AI.autorag("my-rag").aiSearch({
      query: query,
    });

    return new Response(JSON.stringify(answer), {
      headers: { 'Content-Type': 'application/json' }
    });
  }
};
Zendesk
“Like Zendesk, innovation is in Cloudflare’s DNA — it mirrors our beautifully simple development ethos with the connectivity cloud, a powerful, yet simple-to-implement, end-to-end solution that does all the heavy lifting, so we don’t need to.”
Powerful primitives, seamlessly integrated
Built on systems powering 20% of the Internet, AI Search runs on the same infrastructure Cloudflare uses to build Cloudflare. Enterprise-grade reliability, security, and performance are standard.
Compute
Storage
AI
Media
Network
Build without boundaries
Join thousands of developers who've eliminated infrastructure complexity and deployed globally with Cloudflare. Start building for free — no credit card required.