Elasticsearch Query Types: When to Use match, term, range, and bool


Elasticsearch query types split into analyzed (full-text) and non-analyzed (exact-match) categories. match analyzes the input through the field's analyzer before lookup; term does not. Choosing the wrong query type for a field's mapping causes silent 0-result bugs. Query context contributes to _score; filter context doesn't — use filter for binary conditions to enable caching.


Analyzed vs non-analyzed queries

The most important distinction in Elasticsearch queries is whether the query string gets analyzed before the lookup:

| Query | Analysis | Use on | Typical use |
|---|---|---|---|
| match | Yes (uses the field's analyzer) | text fields | Full-text search, user input |
| match_phrase | Yes (preserves token order) | text fields | Exact phrase search |
| multi_match | Yes | Multiple text fields | Search across title + body |
| term | No (exact lookup) | keyword fields | Tag filter, status, IDs |
| terms | No | keyword fields | Multi-value exact match |
| range | No | Numeric, date, keyword | Date windows, price ranges |
| prefix | No | keyword fields | Autocomplete prefix |
| wildcard | No | keyword fields | Pattern matching (*, ?) |
| fuzzy | No | keyword fields | Typo-tolerant exact match |

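Rule: match for text fields, term for keyword fields. Using term on a text field usually returns 0 results without any error, because the inverted index contains analyzed tokens (lowercased, split on word boundaries), not the original string. The query only hits when the search value happens to equal one of those tokens exactly.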
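To make the rule concrete, here is a toy inverted index in Python. It is only a sketch: `analyze` is a crude stand-in for the standard analyzer (lowercase, split on non-alphanumerics), and `build_index`, `term_query`, and `match_query` are hypothetical helpers, not Elasticsearch APIs.

```python
import re

def analyze(text):
    """Rough stand-in for the standard analyzer: lowercase, split on non-alphanumerics."""
    return [tok for tok in re.split(r"[^a-z0-9]+", text.lower()) if tok]

def build_index(docs, analyzed):
    """Map each indexed term to the set of document ids that contain it."""
    index = {}
    for doc_id, value in docs.items():
        terms = analyze(value) if analyzed else [value]
        for term in terms:
            index.setdefault(term, set()).add(doc_id)
    return index

def term_query(index, value):
    # No analysis: the value must equal an indexed term exactly.
    return index.get(value, set())

def match_query(index, text):
    # Analyze the input first, then OR the per-token lookups together.
    hits = set()
    for token in analyze(text):
        hits |= index.get(token, set())
    return hits

docs = {1: "Quick Brown Fox", 2: "quick start guide"}
text_index = build_index(docs, analyzed=True)      # behaves like a text field
keyword_index = build_index(docs, analyzed=False)  # behaves like a keyword field

print(term_query(text_index, "Quick Brown Fox"))     # set()
print(match_query(text_index, "Quick Brown Fox"))    # {1, 2}
print(term_query(keyword_index, "Quick Brown Fox"))  # {1}
```

The first lookup comes back empty because the analyzed index only holds `quick`, `brown`, and `fox`; the original string was never stored as a term.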

Query context vs filter context

The same query can run in query context (affects _score) or filter context (binary match/no-match, cached):

{
  "query": {
    "bool": {
      "must": [
        { "match": { "tweet_text": "database scaling" } }
      ],
      "filter": [
        { "term": { "is_retweet": false } },
        { "range": { "post_date": { "gte": "2024-01-01", "lte": "2024-01-31" } } },
        { "range": { "likes": { "gte": 100 } } }
      ]
    }
  }
}

match in must → scored. term and range in filter → cached binary filters. The relevance ranking reflects how well a tweet matches "database scaling"; the filters just narrow the candidate set.
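The split between the two contexts can be illustrated with a small simulation. This is a sketch under simplifying assumptions: `term_overlap` counts matching words as a stand-in for BM25, and `search` and the sample tweets are made up for illustration.

```python
def term_overlap(query):
    """Build a scoring function: count occurrences of query words (stand-in for BM25)."""
    words = query.lower().split()

    def score(doc):
        tokens = doc["tweet_text"].lower().split()
        return sum(tokens.count(w) for w in words)

    return score

def search(docs, scored_clause, filter_clauses):
    """Filters decide membership only; the scored clause alone sets the ranking."""
    hits = []
    for doc in docs:
        if not all(f(doc) for f in filter_clauses):
            continue  # filter context: yes/no, no score contribution
        score = scored_clause(doc)
        if score > 0:
            hits.append((score, doc["id"]))
    return [doc_id for score, doc_id in sorted(hits, reverse=True)]

tweets = [
    {"id": "a", "tweet_text": "database scaling tips", "is_retweet": False, "likes": 250},
    {"id": "b", "tweet_text": "database scaling database sharding", "is_retweet": True, "likes": 900},
    {"id": "c", "tweet_text": "scaling teams", "is_retweet": False, "likes": 120},
    {"id": "d", "tweet_text": "database scaling", "is_retweet": False, "likes": 5},
]

filters = [
    lambda d: d["is_retweet"] is False,  # like the term filter
    lambda d: d["likes"] >= 100,         # like the range filter
]

print(search(tweets, term_overlap("database scaling"), filters))  # ['a', 'c']
```

Tweet `b` would have the highest text score, but it never reaches scoring because a filter rejects it; its like count has no influence on the order of the remaining hits.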

Dynamically mapped string fields get a .keyword subfield: use it for exact matches and aggregations


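With dynamic mapping, Elasticsearch maps a new string field as text and adds a .keyword subfield (type keyword, ignore_above: 256). The text field is analyzed for full-text search. The .keyword subfield is not analyzed: it stores the original string as-is, and skips values longer than 256 characters. This means you can use match on tweet_text for full-text search and term on tweet_text.keyword for exact-match filtering, without changing the mapping. A field that you declare explicitly as text gets no .keyword subfield unless the mapping defines one.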

Prerequisites

  • Elasticsearch mappings
  • text vs keyword types
  • Inverted index

Key Points

  • tweet_text is a text field — use match for full-text search.
  • tweet_text.keyword is a keyword subfield — use term for exact match, aggregations, and sorting.
  • keyword fields are not analyzed; they store the original string including case and punctuation.
  • Aggregations (for example, a terms aggregation for counts) need a keyword field. On a text field they are rejected by default because fielddata is disabled; enabling fielddata works but is memory-intensive.
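The key points above can be checked against a mapping before a query is built. The sketch below assumes a mapping `properties` dictionary in the shape dynamic mapping produces for strings (text plus a keyword subfield with `ignore_above: 256`); `exact_match_field` and the `status` and `body` fields are hypothetical and only serve the illustration.

```python
def exact_match_field(properties, field):
    """Return the field path that is safe for term queries, aggregations, and sorting."""
    spec = properties[field]
    if spec.get("type") == "keyword":
        return field
    # Look for a keyword multi-field such as tweet_text.keyword.
    for sub_name, sub_spec in spec.get("fields", {}).items():
        if sub_spec.get("type") == "keyword":
            return f"{field}.{sub_name}"
    raise ValueError(f"{field!r} has no keyword variant; add one to the mapping")

properties = {
    # Shape generated by dynamic mapping for a string value.
    "tweet_text": {
        "type": "text",
        "fields": {"keyword": {"type": "keyword", "ignore_above": 256}},
    },
    # Explicitly mapped fields: no subfield unless you declare one.
    "status": {"type": "keyword"},
    "body": {"type": "text"},
}

print(exact_match_field(properties, "tweet_text"))  # tweet_text.keyword
print(exact_match_field(properties, "status"))      # status
```

Calling the helper on `body` raises a `ValueError`, which mirrors the situation where an explicitly mapped text field has nothing suitable for exact matching.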

Query type reference with examples

match — full-text search:

{ "match": { "tweet_text": "sunny weather" } }

Analyzes "sunny weather" → tokens ["sunny", "weather"]. Returns documents containing either token, ranked by BM25 score. Documents with both tokens score higher.

match_phrase — ordered phrase:

{ "match_phrase": { "tweet_text": "machine learning" } }

Requires "machine" immediately followed by "learning" in the field. "learning machine" doesn't match.
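The ordering requirement can be sketched as a check on consecutive token positions. This simplified version splits on whitespace instead of running a real analyzer and assumes the default slop of 0; `phrase_matches` is a hypothetical helper.

```python
def phrase_matches(field_text, phrase):
    """True when the phrase tokens occur consecutively and in order."""
    tokens = field_text.lower().split()
    wanted = phrase.lower().split()
    return any(
        tokens[i:i + len(wanted)] == wanted
        for i in range(len(tokens) - len(wanted) + 1)
    )

print(phrase_matches("intro to machine learning", "machine learning"))  # True
print(phrase_matches("learning machine basics", "machine learning"))    # False
print(phrase_matches("machine assisted learning", "machine learning"))  # False
```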

term — exact keyword match:

{ "term": { "hashtags": "elasticsearch" } }

Looks up "elasticsearch" exactly in the keyword field. No analysis — case matters. For a keyword field, #Elasticsearch and elasticsearch are different values.

terms — multiple exact values:

{ "terms": { "status": ["active", "pending"] } }

Equivalent to status IN ('active', 'pending') in SQL. Keyword field, no analysis.

range — numeric and date windows:

{ "range": { "post_date": { "gte": "2024-01-01", "lte": "2024-01-31" } } }

Supports gte, lte, gt, and lt. Works on dates, numbers, and keyword fields (keyword values compare lexicographically, not numerically).
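Lexicographic ordering is easy to get wrong when numbers are stored as keywords. The following sketch uses plain Python string comparison to show the effect; `keyword_range` and `numeric_range` are hypothetical helpers and the values are made up.

```python
values = ["9", "10", "100", "25"]

def keyword_range(values, gte, lte):
    """Range over strings: compares character by character."""
    return [v for v in values if gte <= v <= lte]

def numeric_range(values, gte, lte):
    """Range over the same values parsed as integers."""
    return [v for v in values if gte <= int(v) <= lte]

print(sorted(values))                    # ['10', '100', '25', '9']
print(keyword_range(values, "10", "25")) # ['10', '100', '25']
print(numeric_range(values, 10, 25))     # ['10', '25']
```

As strings, "100" sorts between "10" and "25", so it lands inside the range. Numeric IDs or prices that need range filtering are better mapped as a numeric type.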

bool — compound logic:

{
  "bool": {
    "must": [{ "match": { "tweet_text": "database" } }],
    "filter": [{ "term": { "user_name.keyword": "tech_guru" } }],
    "must_not": [{ "term": { "is_retweet": true } }]
  }
}

fuzzy — typo tolerance:

{ "fuzzy": { "tweet_text.keyword": { "value": "elasticsaerch", "fuzziness": "AUTO" } } }

fuzziness: "AUTO" allows 1 edit for 3-5 character strings, 2 edits for 6+ character strings. Expensive on large indexes — use with a filter to narrow candidates first.
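The AUTO thresholds can be written out as a small function, paired with an edit-distance routine that counts an adjacent swap as a single edit. This is a sketch only: `auto_fuzziness` and `osa_distance` are hypothetical helpers, and it assumes the default behaviour in which strings of 0-2 characters must match exactly and transpositions count as one edit.

```python
def auto_fuzziness(term):
    """Allowed edits for fuzziness AUTO, based on term length."""
    n = len(term)
    if n <= 2:
        return 0
    if n <= 5:
        return 1
    return 2

def osa_distance(a, b):
    """Edit distance where insert, delete, substitute, and adjacent swap each cost 1."""
    prev2 = None
    prev = list(range(len(b) + 1))
    for i in range(1, len(a) + 1):
        cur = [i] + [0] * len(b)
        for j in range(1, len(b) + 1):
            cost = 0 if a[i - 1] == b[j - 1] else 1
            cur[j] = min(prev[j] + 1, cur[j - 1] + 1, prev[j - 1] + cost)
            # Adjacent transposition, e.g. "ae" <-> "ea".
            if i > 1 and j > 1 and a[i - 1] == b[j - 2] and a[i - 2] == b[j - 1]:
                cur[j] = min(cur[j], prev2[j - 2] + 1)
        prev2, prev = prev, cur
    return prev[len(b)]

query, indexed = "elasticsaerch", "elasticsearch"
print(osa_distance(query, indexed))                           # 1
print(auto_fuzziness(query))                                  # 2
print(osa_distance(query, indexed) <= auto_fuzziness(query))  # True
```

The misspelling swaps two neighbouring letters, which is one edit, so it falls well inside the two edits allowed for a 13-character term.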

wildcard — pattern matching:

{ "wildcard": { "user_name.keyword": "tech_*" } }

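Leading wildcards (*tech) force Elasticsearch to scan every term in the field and are extremely slow. Avoid them. Note that allow_leading_wildcard is an option of the query_string query, not of wildcard; to block expensive queries such as wildcard across the cluster, set search.allow_expensive_queries to false.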

Specialized queries

more_like_this — document similarity:

{
  "more_like_this": {
    "fields": ["tweet_text"],
    "like": [{ "_index": "tweets", "_id": "abc123" }],
    "min_term_freq": 1,
    "max_query_terms": 12
  }
}

Finds documents with similar term distribution to a reference document. Used for "related posts" features.

nested — search inside nested objects:

{
  "nested": {
    "path": "comments",
    "query": {
      "bool": {
        "must": [
          { "match": { "comments.text": "great post" } },
          { "term": { "comments.author": "alice" } }
        ]
      }
    }
  }
}

Nested queries scope matches to individual nested objects — prevents cross-object matching (where author=alice from one comment and text=great from a different comment falsely match).
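The cross-object problem is easier to see with data. The sketch below compares two matching strategies on one document: pooling values across all comments (how a plain array of objects behaves) versus requiring both conditions inside the same comment (how nested behaves). It uses exact string comparison instead of analysis, and both helper functions are hypothetical.

```python
doc = {
    "comments": [
        {"author": "alice", "text": "needs work"},
        {"author": "bob", "text": "great post"},
    ]
}

def flattened_match(doc, author, text):
    """Object-array behaviour: field values are pooled across all comments."""
    authors = {c["author"] for c in doc["comments"]}
    texts = {c["text"] for c in doc["comments"]}
    return author in authors and text in texts

def nested_match(doc, author, text):
    """Nested behaviour: both conditions must hold inside one comment."""
    return any(
        c["author"] == author and c["text"] == text
        for c in doc["comments"]
    )

print(flattened_match(doc, "alice", "great post"))  # True (false positive)
print(nested_match(doc, "alice", "great post"))     # False
print(nested_match(doc, "bob", "great post"))       # True
```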

You want to find all tweets tagged with exactly '#Elasticsearch' (preserving case and the # character). The hashtags field is mapped as both text and keyword. Which query and field should you use?


The hashtags field is a text field (analyzed, lowercased) with a hashtags.keyword subfield (not analyzed, stores the original string including # and case).

  • A. match on hashtags: it's the standard full-text query and handles text fields
    Incorrect. match would analyze '#Elasticsearch' through the standard analyzer, stripping # and lowercasing to produce 'elasticsearch'. It would return documents containing 'elasticsearch' as a token, which may include documents where the hashtag appeared as 'ELASTICSEARCH' or 'Elasticsearch'. Not suitable for exact hashtag filtering.
  • B. term on hashtags.keyword: exact lookup on the unanalyzed subfield
    Correct! hashtags.keyword stores the original string without analysis. A term query for '#Elasticsearch' looks up that exact string in the keyword subfield and returns only documents where the hashtag appears exactly as '#Elasticsearch'. This is the correct pattern for exact-match filtering: use the .keyword subfield with term. If you also need full-text search on hashtags (finding 'elasticsearch' regardless of case/prefix), use match on the analyzed hashtags field.
  • C. term on hashtags: term queries work on any field type
    Incorrect. term on the text field hashtags would look for the exact string '#Elasticsearch' in an index that contains lowercased, analyzed tokens like 'elasticsearch'. '#Elasticsearch' ≠ 'elasticsearch', so 0 results. term works on text fields syntactically but produces wrong results because the analyzed index doesn't contain the original string.
  • D. match_phrase on hashtags.keyword: phrase matching preserves the exact string
    Incorrect. match_phrase is designed for ordered phrase matching on analyzed text fields. A keyword field holds each value as a single unanalyzed token, so there is no token order to check and match_phrase adds nothing over term. Use term on keyword fields for exact matches.

Hint: Which field type stores the original unanalyzed string, and which query skips analysis?