Cloudflare Docs

Reranking

Reranking can improve the quality of AI Search results by reordering retrieved documents based on their semantic relevance to the user’s query. It applies a secondary model after retrieval to "rerank" the top results before they are returned.

How it works

By default, reranking is disabled for all AI Search instances. You can enable it during creation or later from the settings page.

When enabled, AI Search will:

  1. Retrieve a set of relevant results from your index, constrained by your max_num_of_results and score_threshold parameters.
  2. Pass those results through a reranking model.
  3. Return the reranked results, which the text generation model can use for answer generation.

Reranking helps improve accuracy, especially for large or noisy datasets where vector similarity alone may not produce the optimal ordering.
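
The three steps above can be pictured as a two-stage pipeline. The following is a conceptual sketch only, not how AI Search is implemented: `retrieve`, `rerank`, and `toyRelevance` are hypothetical helpers, the sample data is made up, and a simple word-overlap score stands in for a real reranking model.

```javascript
// Conceptual sketch of retrieve-then-rerank. All names here are hypothetical.

// Step 1: keep results at or above the score threshold, best first,
// capped at the maximum number of results.
function retrieve(indexResults, { max_num_of_results, score_threshold }) {
  return indexResults
    .filter((r) => r.score >= score_threshold)
    .sort((a, b) => b.score - a.score)
    .slice(0, max_num_of_results);
}

// Step 2: re-score each candidate against the query and reorder.
function rerank(query, candidates, relevance) {
  return candidates
    .map((c) => ({ ...c, rerankScore: relevance(query, c.text) }))
    .sort((a, b) => b.rerankScore - a.rerankScore);
}

// Toy stand-in for a reranking model: fraction of query words found in the text.
function toyRelevance(query, text) {
  const words = query.toLowerCase().split(/\s+/).filter(Boolean);
  const haystack = text.toLowerCase();
  const hits = words.filter((w) => haystack.includes(w)).length;
  return words.length === 0 ? 0 : hits / words.length;
}

// Made-up index results with vector-similarity scores.
const indexResults = [
  { id: "a", text: "Llamas are domesticated pack animals.", score: 0.82 },
  { id: "b", text: "How to train a llama to carry coffee orders.", score: 0.74 },
  { id: "c", text: "Espresso machine maintenance.", score: 0.31 },
];

const candidates = retrieve(indexResults, {
  max_num_of_results: 10,
  score_threshold: 0.5,
});
const reranked = rerank("train llama coffee", candidates, toyRelevance);

// Step 3: `reranked` is what would be handed to the text generation model.
console.log(reranked.map((r) => r.id));
```

In this toy data, document "a" has the higher similarity score, but "b" matches more of the query and moves to the top after reranking. Document "c" never reaches the reranker because it falls below the score threshold.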

Configuration

You can configure reranking through the API or in the dashboard.

Configure via API

When you make a /search or /ai-search request using the Workers Binding or REST API, you can:

  • Enable or disable reranking per request
  • Specify the reranking model

For example:

JavaScript
// Inside a Worker handler, where `env.AI` is the AI binding.
const answer = await env.AI.autorag("my-autorag").aiSearch({
  query: "How do I train a llama to deliver coffee?",
  model: "@cf/meta/llama-3.3-70b-instruct-fp8-fast",
  // Per-request reranking settings.
  reranking: {
    enabled: true,
    model: "@cf/baai/bge-reranker-base",
  },
});
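
Because reranking can be toggled per request, one option is to build the request options in a small helper. This is a minimal sketch under assumptions: `buildSearchOptions` and `searchDocs` are hypothetical names, the `search()` call is assumed to accept the same `reranking` object as the `aiSearch()` example above, and sending `{ enabled: false }` without a model is assumed to be a valid way to turn reranking off.

```javascript
// Hypothetical helper: builds the options object for a single request.
// The `reranking` shape mirrors the aiSearch() example; treat it as an assumption
// for search() and check the API reference before relying on it.
function buildSearchOptions(
  query,
  { rerank = false, rerankModel = "@cf/baai/bge-reranker-base" } = {}
) {
  return {
    query,
    reranking: rerank
      ? { enabled: true, model: rerankModel }
      : { enabled: false },
  };
}

// Hypothetical usage inside a Worker, assuming an AI binding named `AI`
// and an instance named "my-autorag".
async function searchDocs(env, query, rerank) {
  return env.AI.autorag("my-autorag").search(buildSearchOptions(query, { rerank }));
}
```

Keeping the option-building logic separate from the binding call makes it straightforward to enable reranking only for the queries that benefit from it.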

Configure via dashboard

When creating a new AI Search instance in the dashboard:

  1. In the Retrieval configuration step, open the Reranking dropdown
  2. Toggle Reranking on
  3. Select the reranking model

To update reranking for an existing instance:

  1. Go to your AI Search instance
  2. Open the Settings tab
  3. Enable or disable reranking, and select the reranking model