Reranking

Reranking can improve the quality of AI Search results by reordering retrieved documents according to their semantic relevance to the user’s query. It applies a secondary model after retrieval to rerank the top results before they are returned.

How it works

By default, reranking is disabled for all AI Search instances. You can enable it during creation or later from the settings page.

When enabled, AI Search will:

Retrieve a set of relevant results from your index, constrained by your max_num_of_results and score_threshold parameters.
Pass those results through a reranking model.
Return the reranked results, which the text generation model can use for answer generation.

Reranking helps improve accuracy, especially for large or noisy datasets where vector similarity alone may not produce the optimal ordering.
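The retrieve-then-rerank flow described above can be sketched in plain JavaScript. This is only an illustration of the two stages, not how AI Search is implemented: the function names, the option names `maxNumResults` and `scoreThreshold`, the sample data, and the toy word-overlap scorer (standing in for a real reranking model) are all assumptions made for the example.

```javascript
// Stage 1 (illustrative): keep first-stage matches at or above the score
// threshold, ordered by retrieval score and capped at maxNumResults.
function retrieve(candidates, { maxNumResults, scoreThreshold }) {
  return candidates
    .filter((c) => c.score >= scoreThreshold)
    .sort((a, b) => b.score - a.score)
    .slice(0, maxNumResults);
}

// Stage 2 (illustrative): re-score each retrieved result against the query
// and reorder by the new score. `scoreFn` stands in for the reranking model.
function rerank(query, results, scoreFn) {
  return results
    .map((r) => ({ ...r, rerankScore: scoreFn(query, r.text) }))
    .sort((a, b) => b.rerankScore - a.rerankScore);
}

// Toy scorer (assumption): fraction of query words that appear in the text.
function overlapScore(query, text) {
  const words = query.toLowerCase().split(/\s+/).filter(Boolean);
  const textWords = new Set(text.toLowerCase().split(/\W+/));
  const hits = words.filter((w) => textWords.has(w)).length;
  return words.length === 0 ? 0 : hits / words.length;
}

// Hypothetical first-stage results with vector-similarity scores.
const candidates = [
  { id: "a", text: "Llamas are social animals", score: 0.82 },
  { id: "b", text: "Train a llama to deliver coffee in five steps", score: 0.78 },
  { id: "c", text: "Coffee brewing basics", score: 0.4 },
];

const retrieved = retrieve(candidates, { maxNumResults: 2, scoreThreshold: 0.5 });
const reranked = rerank("train llama deliver coffee", retrieved, overlapScore);

console.log(retrieved.map((r) => r.id)); // order from retrieval alone
console.log(reranked.map((r) => r.id)); // order after reranking
```

In this sample, document "a" has the higher retrieval score, but the second stage moves "b" ahead of it because "b" matches the query more closely. Document "c" never reaches the reranker because it falls below the threshold.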

Configuration

You can configure reranking per request through the API, or per instance in the dashboard:

Configure via API

When you make a /search or /ai-search request using the Workers Binding or REST API, you can:

Enable or disable reranking per request
Specify the reranking model

For example:

const answer = await env.AI.autorag("my-autorag").aiSearch({
  query: "How do I train a llama to deliver coffee?",
  // Text generation model used to produce the answer
  model: "@cf/meta/llama-3.3-70b-instruct-fp8-fast",
  // Per-request reranking settings
  reranking: {
    enabled: true,
    model: "@cf/baai/bge-reranker-base"
  }
});
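Because reranking can be toggled per request, it may be convenient to build the request parameters in one place. The sketch below is a hypothetical helper, not part of any Cloudflare SDK: `buildSearchParams`, its option names, and the choice of default model are assumptions, and only the `reranking.enabled` and `reranking.model` fields are taken from the example above.

```javascript
// Assumed default; any supported reranking model name could be used here.
const DEFAULT_RERANKER = "@cf/baai/bge-reranker-base";

// Hypothetical helper: returns a params object with reranking switched on or
// off for a single request.
function buildSearchParams(query, { rerank = false, rerankModel = DEFAULT_RERANKER } = {}) {
  return {
    query,
    reranking: rerank ? { enabled: true, model: rerankModel } : { enabled: false },
  };
}

const question = "How do I train a llama to deliver coffee?";
const withRerank = buildSearchParams(question, { rerank: true });
const withoutRerank = buildSearchParams(question);

console.log(JSON.stringify(withRerank.reranking));
console.log(JSON.stringify(withoutRerank.reranking));

// Inside a Worker, the result would be passed to the binding, for example:
//   await env.AI.autorag("my-autorag").aiSearch(withRerank);
```

Keeping the toggle in a helper like this makes it straightforward to compare answers with and without reranking for the same query.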

Configure in dashboard for new AI Search

When creating a new AI Search instance in the dashboard:

In the Retrieval configuration step, open the Reranking dropdown
Toggle Reranking on
Select the reranking model

Configure in dashboard for existing AI Search

To update reranking for an existing instance:

Go to your AI Search instance
Open the Settings tab
Enable or disable reranking, and select the reranking model
