Personalizing Elasticsearch Results: function_score, Rescoring, and Query-Time vs Index-Time Tradeoffs

Databases & Storage · 3 min read

Elasticsearch's BM25 scoring knows nothing about your users. Personalization requires either modifying the query at request time (function_score, pinned queries) or pre-computing signals at index time. Each approach has different latency, freshness, and complexity costs.

Tags: search, elasticsearch, personalization, relevance

What BM25 knows and doesn't know

Elasticsearch uses BM25 (Best Match 25) as its default relevance scoring algorithm. BM25 computes a score for each document based on term frequency (how often the search term appears in the document), inverse document frequency (how rare the term is across all documents), and document length normalization.
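To make those three ingredients concrete, here is a sketch of the classic per-term BM25 formula using Lucene's default parameters (simplified for illustration; Lucene's actual implementation differs in small details, such as dropping the constant `k1 + 1` factor):

```python
import math

def bm25_term_score(tf, doc_len, avg_doc_len, doc_count, doc_freq,
                    k1=1.2, b=0.75):
    """Illustrative BM25 score for a single term in a single document.

    tf: term frequency in this document
    doc_len / avg_doc_len: inputs for length normalization
    doc_count / doc_freq: corpus-wide stats for inverse document frequency
    k1, b: Lucene's default free parameters
    """
    # Inverse document frequency: rarer terms contribute more
    idf = math.log(1 + (doc_count - doc_freq + 0.5) / (doc_freq + 0.5))
    # Term-frequency saturation, normalized by relative document length
    tf_norm = (tf * (k1 + 1)) / (tf + k1 * (1 - b + b * doc_len / avg_doc_len))
    return idf * tf_norm
```

Note what is absent from the signature: nothing about the user. Every input is a property of the document or the corpus.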

BM25 is entirely document-centric. It knows nothing about the user making the query. A search for "machine learning" returns the same ranked results for every user, regardless of whether they're a beginner or an expert, whether they follow specific authors, or what they've engaged with before.

Personalization adds user context into scoring — but Elasticsearch doesn't store user profiles. That context must come from outside.

The two places personalization can live


Personalization signals can be applied at query time (modify the query before sending it to Elasticsearch) or at index time (embed user-specific signals into the indexed documents). Query-time is flexible and always fresh; index-time is fast but requires re-indexing when signals change.

Prerequisites

  • BM25 relevance scoring
  • Elasticsearch query DSL
  • inverted index basics

Key Points

  • Query-time personalization: inject user signals into function_score, pinned queries, or boost queries. No re-indexing needed. Latency cost per query.
  • Index-time personalization: pre-compute scores or embed user-specific fields. Fast at query time. Stale until re-indexed.
  • The query is not fine-tuning BM25 itself — it's adding additional scoring factors on top of BM25.
  • Elasticsearch does not store user profiles; you retrieve them externally and inject them into the query.

function_score: injecting user signals into ranking

The function_score query wraps any base query and applies score modifiers using one or more scoring functions. This is the primary mechanism for query-time personalization.

// Boost results matching topics the user follows
{
  "query": {
    "function_score": {
      "query": {
        "match": { "content": "machine learning" }
      },
      "functions": [
        {
          "filter": { "terms": { "topic": ["distributed-systems", "databases"] } },
          "weight": 2.0
        },
        {
          "filter": { "term": { "author_id": "user_456" } },
          "weight": 1.5
        },
        {
          "gauss": {
            "published_at": {
              "origin": "now",
              "scale": "7d",
              "decay": 0.5
            }
          }
        }
      ],
      "score_mode": "multiply",
      "boost_mode": "multiply"
    }
  }
}

This query boosts content tagged with topics the user follows, boosts posts from followed authors, and applies a Gaussian decay so recent content scores higher. All of this runs inside Elasticsearch — you build the query in your application server after fetching the user's profile from your database.

score_mode controls how multiple function scores combine (sum, multiply, avg, max, min). boost_mode controls how the combined function score interacts with the BM25 score (replace, sum, multiply).
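To make the combination arithmetic concrete, here is a small Python sketch (illustrative only, covering a subset of modes, not Elasticsearch internals):

```python
from functools import reduce
import operator

def combine_scores(bm25_score, function_scores,
                   score_mode="multiply", boost_mode="multiply"):
    """Illustrate how score_mode and boost_mode combine scores."""
    # score_mode: combine the individual function scores
    if score_mode == "multiply":
        combined = reduce(operator.mul, function_scores, 1.0)
    elif score_mode == "sum":
        combined = sum(function_scores)
    else:
        raise ValueError(f"mode not sketched: {score_mode}")

    # boost_mode: combine the result with the BM25 score
    if boost_mode == "multiply":
        return bm25_score * combined
    elif boost_mode == "sum":
        return bm25_score + combined
    elif boost_mode == "replace":
        return combined
    raise ValueError(f"mode not sketched: {boost_mode}")

# With the example query above: a document matching both the followed-topic
# filter (weight 2.0) and the followed-author filter (weight 1.5), with a
# hypothetical BM25 score of 4.0, under multiply/multiply:
# 4.0 * (2.0 * 1.5) = 12.0
```

Multiplicative modes amplify differences aggressively; `sum` is gentler and easier to reason about when tuning weights.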

Pinned queries: hardcoded results with flexible fallback

For cases where certain results should always appear first (sponsored content, editor picks, trending items), the pinned query forces specific document IDs to the top of results:

{
  "query": {
    "pinned": {
      "ids": ["post_trending_1", "post_trending_2"],
      "organic": {
        "function_score": {
          "query": { "match": { "content": "kubernetes" } },
          "functions": [
            { "filter": { "terms": { "tag": ["devops", "containers"] } }, "weight": 1.8 }
          ]
        }
      }
    }
  }
}

Pinned results appear first regardless of BM25 score. The organic query handles everything else with normal ranking.

Rescoring: apply expensive models to top N results

BM25 scoring runs against every document that matches the query. For a large index, running a complex personalization model on every matching document is too slow. The rescore query runs a second, more expensive scoring function only on the top N results from the initial query:

{
  "query": {
    "match": { "content": "python tutorial" }
  },
  "rescore": {
    "window_size": 100,
    "query": {
      "rescore_query": {
        "function_score": {
          "functions": [
            {
              "script_score": {
                "script": {
                  "source": "dotProduct(params.user_vector, 'content_embedding') + 1.0",
                  "params": { "user_vector": [0.12, -0.34, 0.89, ...] }
                }
              }
            }
          ]
        }
      }
    }
  }
}

BM25 selects the top 100 candidates. The rescore query applies a user embedding similarity computation only to those 100. Vector similarity (for ML-based personalization) is expensive — rescoring keeps it tractable.
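The per-candidate work in that script is a dot product. A Python sketch of the equivalent re-ranking over the top-N candidates (the vectors here are hypothetical, and the `+ 1.0` mirrors the script's offset to keep scores positive):

```python
def rescore_candidates(candidates, user_vector):
    """Re-rank BM25 candidates by user-embedding similarity.

    candidates: list of (doc_id, embedding) pairs from the BM25 phase,
    i.e. only the window_size survivors, never the whole index.
    """
    def score(embedding):
        # Same computation as dotProduct(params.user_vector, ...) + 1.0
        return sum(u * e for u, e in zip(user_vector, embedding)) + 1.0

    return sorted(candidates, key=lambda c: score(c[1]), reverse=True)

user_vector = [0.5, -0.2, 0.8]          # hypothetical user embedding
candidates = [
    ("doc_a", [0.1, 0.9, 0.2]),
    ("doc_b", [0.6, -0.1, 0.7]),        # points the same way as the user
]
ranked = rescore_candidates(candidates, user_vector)
```

With `window_size: 100`, the cost is 100 dot products per query instead of millions, regardless of index size.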

💡Index-time personalization: when and how

Index-time personalization pre-computes signals during indexing rather than at query time. Two common approaches:

Per-user indexed documents: for platforms with discrete user cohorts, index separate document copies with user-specific scores embedded. user_id: 123, doc_id: 456, personalized_score: 0.87. Query by user_id to retrieve pre-ranked content. Scales poorly — index size grows as users × documents.

Global popularity signals: index documents with aggregate signals that don't require per-user data — view count, share count, engagement rate, recency. Use these as boost factors in function_score without needing user-specific data. Fresh for all users; only requires re-indexing when document signals change.

// At index time, embed engagement signals in document
{
  "doc_id": "post_123",
  "content": "...",
  "engagement_score": 0.72,    // pre-computed
  "view_count": 1450,
  "recency_decay": 0.91        // computed at index time, stale after hours
}

The tradeoff: index-time signals go stale. A post that goes viral after indexing won't have its score updated until the next re-index or update cycle. Query-time function_score can compute recency decay dynamically (a gauss decay on published_at), so it is always current; index-time decay is a snapshot.
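A hedged sketch of how those indexed signals feed back into query-time ranking via function_score's field_value_factor (the field name comes from the example document above; the factor and fallback values are illustrative):

```python
def popularity_boosted_query(query_text):
    """Build a function_score query that boosts by the pre-computed
    engagement_score field. No per-user data is required, so one query
    shape serves every user."""
    return {
        "query": {
            "function_score": {
                "query": {"match": {"content": query_text}},
                "functions": [
                    {
                        # field_value_factor reads the index-time signal
                        "field_value_factor": {
                            "field": "engagement_score",
                            "factor": 1.5,
                            "missing": 0.1,  # fallback for docs lacking the field
                        }
                    }
                ],
                "boost_mode": "multiply",
            }
        }
    }
```

Because the boost is global rather than per-user, the index-size explosion of per-user documents never arises; the only staleness is in the signal values themselves.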

The query construction pipeline

In practice, personalized search involves a lookup before the Elasticsearch query:

def personalized_search(user_id: str, query_text: str) -> list:
    # 1. Fetch user signals from your data store
    user_profile = db.get_user_profile(user_id)
    followed_topics = user_profile["followed_topics"]      # ["databases", "distributed-systems"]
    followed_authors = user_profile["followed_author_ids"] # ["author_456", "author_789"]

    # 2. Build personalized query
    es_query = {
        "query": {
            "function_score": {
                "query": {
                    "multi_match": {
                        "query": query_text,
                        "fields": ["title^2", "content", "tags"]
                    }
                },
                "functions": [
                    {
                        "filter": {"terms": {"topic": followed_topics}},
                        "weight": 2.0
                    },
                    {
                        "filter": {"terms": {"author_id": followed_authors}},
                        "weight": 1.5
                    }
                ],
                "score_mode": "sum",
                "boost_mode": "multiply"
            }
        }
    }

    # 3. Execute against Elasticsearch
    return es_client.search(index="posts", body=es_query)

The profile lookup adds latency. For high-traffic search endpoints, cache the user profile in Redis with a short TTL (30–60 seconds) rather than hitting the database on every search.

A search platform uses function_score to boost results based on topics the user follows. The engineering team wants to improve personalization but is concerned about query latency. A proposal is made to pre-compute each user's topic interest vector and store it in Elasticsearch documents themselves. What is the main problem with this approach?

Difficulty: medium

The platform has 5 million users and 10 million documents. User topic interests change daily as users follow and unfollow topics. Currently, function_score boosts are injected at query time after a Redis cache lookup.

  • A. function_score cannot use pre-computed vectors stored in documents
    Incorrect. function_score with script_score can access document fields, including stored vectors. This was actually a common pattern for approximate vector similarity before kNN search was available.
  • B. Embedding per-user interest vectors in documents causes index bloat and stale signals — each user interest change requires re-indexing all documents for that user
    Correct! With 5M users × 10M documents, the index size becomes unmanageable if user signals are embedded per-document. More critically, when a user follows a new topic, all their document scores are stale until re-indexed. Query-time injection via function_score adds latency but keeps signals always current. The right tradeoff for daily-changing interests is query-time injection with a cached profile lookup.
  • C. Elasticsearch does not support numeric fields in documents
    Incorrect. Elasticsearch fully supports numeric fields. float, double, and integer are all supported field types and can be used in script_score computations.
  • D. The Redis cache lookup is faster than accessing document fields in Elasticsearch
    Incorrect. Redis is fast, but the comparison misses the point. The issue with index-time personalization is index size and update latency, not lookup speed.

Hint: Think about what happens to the index size and how you update signals when 5 million users change their preferences daily.