Elasticsearch Query Types: When to Use match, term, range, and bool
Elasticsearch query types split into analyzed (full-text) and non-analyzed (exact-match) categories. match runs the input through the field's analyzer before lookup; term does not. Choosing the wrong query type for a field's mapping can cause silent 0-result bugs. Query context contributes to _score; filter context doesn't, so use filter for yes/no conditions, which also makes them eligible for caching.
Analyzed vs non-analyzed queries
The most important distinction in Elasticsearch queries is whether the query string gets analyzed before the lookup:
| Query | Analysis | Use on | Typical use |
|---|---|---|---|
| match | Yes — uses field's analyzer | text fields | Full-text search, user input |
| match_phrase | Yes — preserves token order | text fields | Exact phrase search |
| multi_match | Yes | Multiple text fields | Search across title + body |
| term | No — exact lookup | keyword fields | Tag filter, status, IDs |
| terms | No | keyword fields | Multi-value exact match |
| range | No | Numeric, date, keyword | Date windows, price ranges |
| prefix | No | keyword fields | Autocomplete prefix |
| wildcard | No | keyword fields | Pattern matching (*, ?) |
| fuzzy | No | keyword fields | Typo-tolerant exact match |
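Rule: match for text fields, term for keyword fields. Using term on a text field usually returns 0 results with no error, because the inverted index contains analyzed tokens (lowercased, split on word boundaries), not the original string. It matches only when the search value happens to equal one of those tokens exactly.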
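A minimal Python sketch of why the rule holds. It uses a toy analyzer (lowercase, split on non-alphanumerics) as a rough stand-in for the standard analyzer; the function names are illustrative and are not Elasticsearch APIs.

```python
import re
from collections import defaultdict

def toy_analyze(text):
    """Rough stand-in for the standard analyzer (assumption: lowercase + split)."""
    return [tok for tok in re.split(r"[^0-9a-z]+", text.lower()) if tok]

def build_index(docs, analyzed):
    """Map each indexed term to the set of document ids that contain it."""
    index = defaultdict(set)
    for doc_id, value in docs.items():
        terms = toy_analyze(value) if analyzed else [value]
        for term in terms:
            index[term].add(doc_id)
    return index

def term_query(index, value):
    """No analysis: look the value up exactly as given."""
    return index.get(value, set())

def match_query(index, value):
    """Analyze the input first, then look up each token (OR semantics)."""
    hits = set()
    for tok in toy_analyze(value):
        hits |= index.get(tok, set())
    return hits

docs = {1: "Quick Brown Fox", 2: "quick start guide"}
text_index = build_index(docs, analyzed=True)      # behaves like a text field
keyword_index = build_index(docs, analyzed=False)  # behaves like a keyword field

print(term_query(text_index, "Quick Brown Fox"))     # set(): no such token
print(term_query(keyword_index, "Quick Brown Fox"))  # {1}
print(match_query(text_index, "Quick Brown Fox"))    # {1, 2}
print(term_query(text_index, "quick"))               # {1, 2}: equals a token
```

The last line shows the edge case: term on a text field does match when the value is already identical to an indexed token, which is why the bug is easy to miss in testing.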
Query context vs filter context
The same query can run in query context (affects _score) or filter context (binary match/no-match, cached):
{
"query": {
"bool": {
"must": [
{ "match": { "tweet_text": "database scaling" } }
],
"filter": [
{ "term": { "is_retweet": false } },
{ "range": { "post_date": { "gte": "2024-01-01", "lte": "2024-01-31" } } },
{ "range": { "likes": { "gte": 100 } } }
]
}
}
}
The match clause in must is scored. The term and range clauses in filter are unscored yes/no conditions that Elasticsearch can cache. The relevance ranking reflects how well a tweet matches "database scaling"; the filters just narrow the candidate set.
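The split between scoring and filtering can be sketched in a few lines of Python. This is a toy model, not how Elasticsearch computes relevance: the score here is a plain term count rather than BM25, and the sample documents are made up for illustration.

```python
def search(docs, scored_terms, filters):
    """Filters drop documents; only scored_terms influence the ranking."""
    results = []
    for doc in docs:
        # Filter context: every predicate must pass, and none affects the score.
        if not all(passes(doc) for passes in filters):
            continue
        # Query context: toy score = how often the query terms occur.
        tokens = doc["tweet_text"].lower().split()
        score = sum(tokens.count(term) for term in scored_terms)
        if score > 0:
            results.append((score, doc["id"]))
    return sorted(results, reverse=True)

docs = [
    {"id": 1, "tweet_text": "database scaling tips", "likes": 500, "is_retweet": False},
    {"id": 2, "tweet_text": "scaling a database database", "likes": 150, "is_retweet": False},
    {"id": 3, "tweet_text": "database scaling", "likes": 10, "is_retweet": False},
    {"id": 4, "tweet_text": "database scaling", "likes": 900, "is_retweet": True},
]
filters = [
    lambda d: d["is_retweet"] is False,
    lambda d: d["likes"] >= 100,
]

ranked = search(docs, ["database", "scaling"], filters)
print(ranked)  # [(3, 2), (2, 1)]
```

Document 1 has more likes than document 2 but ranks lower: passing a filter by a wide margin earns nothing, because filters only answer yes or no.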
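Dynamically mapped string fields get a .keyword subfield: use it for exact matches and aggregations
When a string field is created by dynamic mapping (no explicit mapping for it), Elasticsearch maps it as a text field with a .keyword subfield. A text field you declare explicitly has no .keyword subfield unless you add one. The text field is analyzed for full-text search. The .keyword subfield is not analyzed; it stores the original string as-is. This means you can use match on tweet_text for full-text search and term on tweet_text.keyword for exact-match filtering, without changing the mapping. The default subfield sets ignore_above: 256, so strings longer than 256 characters are not indexed in .keyword.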
Prerequisites
- Elasticsearch mappings
- text vs keyword types
- Inverted index
Key Points
- tweet_text is a text field — use match for full-text search.
- tweet_text.keyword is a keyword subfield — use term for exact match, aggregations, and sorting.
- keyword fields are not analyzed; they store the original string including case and punctuation.
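- Aggregations (such as a terms aggregation for counts) need a keyword field. On a text field they fail by default because fielddata is disabled; enabling fielddata works but is memory-hungry and buckets on analyzed tokens rather than whole values.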
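The points above can be made concrete with the request bodies involved, written here as Python dicts so they can be serialized to JSON. The multi-field shape is what dynamic mapping produces for a string by default; the index layout and the aggregation name top_users are assumptions for illustration.

```python
import json

# Shape that dynamic mapping generates for a new string field:
# an analyzed text field plus an unanalyzed keyword subfield.
dynamic_string_mapping = {
    "type": "text",
    "fields": {"keyword": {"type": "keyword", "ignore_above": 256}},
}

# Declaring the same shape explicitly (hypothetical index body).
index_body = {"mappings": {"properties": {"user_name": dynamic_string_mapping}}}

# Terms aggregation: count tweets per user, bucketing on the keyword subfield.
top_users_request = {
    "size": 0,  # skip the hits, return only aggregation buckets
    "aggs": {
        "top_users": {"terms": {"field": "user_name.keyword", "size": 10}}
    },
}

print(json.dumps(index_body, indent=2))
print(json.dumps(top_users_request, indent=2))
```

Pointing the aggregation at user_name instead of user_name.keyword would hit the text field and fail under the default settings.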
Query type reference with examples
match — full-text search:
{ "match": { "tweet_text": "sunny weather" } }
Analyzes "sunny weather" → tokens ["sunny", "weather"]. Returns documents containing either token, ranked by BM25 score. Documents with both tokens score higher.
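The "either token" behavior comes from match defaulting to OR between tokens; setting operator to and requires all of them. A small Python sketch, with toy_match as a hypothetical helper that models only the matching decision (not scoring):

```python
def toy_match(doc_tokens, query_tokens, operator="or"):
    """Model of match semantics: any token (or) versus every token (and)."""
    present = [tok in doc_tokens for tok in query_tokens]
    return all(present) if operator == "and" else any(present)

query_tokens = ["sunny", "weather"]
doc_a = ["sunny", "weather", "today"]
doc_b = ["sunny", "side", "up"]

print(toy_match(doc_a, query_tokens))                  # True
print(toy_match(doc_b, query_tokens))                  # True: one token is enough
print(toy_match(doc_b, query_tokens, operator="and"))  # False

# The equivalent request body when every token is required:
strict_match = {"match": {"tweet_text": {"query": "sunny weather", "operator": "and"}}}
```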
match_phrase — ordered phrase:
{ "match_phrase": { "tweet_text": "machine learning" } }
Requires "machine" immediately followed by "learning" in the field. "learning machine" doesn't match.
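A phrase match depends on token positions, not just token presence. The sketch below models that with a sliding window over the document's token list; phrase_match is an illustrative helper, and it ignores the slop parameter that the real query supports.

```python
def phrase_match(doc_tokens, phrase_tokens):
    """True when phrase_tokens occur consecutively and in order in doc_tokens."""
    n = len(phrase_tokens)
    return any(
        doc_tokens[i:i + n] == phrase_tokens
        for i in range(len(doc_tokens) - n + 1)
    )

phrase = ["machine", "learning"]

print(phrase_match("i love machine learning".split(), phrase))    # True
print(phrase_match("a learning machine".split(), phrase))         # False: wrong order
print(phrase_match("machine assisted learning".split(), phrase))  # False: not adjacent
```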
term — exact keyword match:
{ "term": { "hashtags": "elasticsearch" } }
Looks up "elasticsearch" exactly in the keyword field. No analysis — case matters. For a keyword field, #Elasticsearch and elasticsearch are different values.
terms — multiple exact values:
{ "terms": { "status": ["active", "pending"] } }
Equivalent to status IN ('active', 'pending') in SQL. Keyword field, no analysis.
range — numeric and date windows:
{ "range": { "post_date": { "gte": "2024-01-01", "lte": "2024-01-31" } } }
Accepts gte, lte, gt, and lt bounds. Works on dates, numbers, and keyword fields (keyword values compare lexicographically, as strings).
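Lexicographic ordering is the usual surprise when numbers are stored in a keyword field. Python's string comparison behaves the same way for ASCII digits, so it works as a quick illustration; the sample values are made up.

```python
values = ["9", "10", "100", "25"]

# String ordering compares character by character, so "100" sorts before "25".
print(sorted(values))  # ['10', '100', '25', '9']

# A range of "10" to "25" over strings keeps "100" and drops "9".
as_strings = [v for v in values if "10" <= v <= "25"]
print(as_strings)  # ['10', '100', '25']

# The same range over real numbers gives the answer most people expect.
as_numbers = [int(v) for v in values if 10 <= int(v) <= 25]
print(as_numbers)  # [10, 25]
```

If a field will be range-filtered numerically, map it as a numeric type rather than keyword.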
bool — compound logic:
{
"bool": {
"must": [{ "match": { "tweet_text": "database" } }],
"filter": [{ "term": { "user_name.keyword": "tech_guru" } }],
"must_not": [{ "term": { "is_retweet": true } }]
}
}
fuzzy — typo tolerance:
{ "fuzzy": { "tweet_text.keyword": { "value": "elasticsaerch", "fuzziness": "AUTO" } } }
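fuzziness: "AUTO" requires an exact match for strings of 0-2 characters, allows 1 edit for 3-5 characters, and 2 edits for 6 or more. An edit is an insertion, deletion, substitution, or (by default) a transposition of two adjacent characters. Expensive on large indexes: use with a filter to narrow candidates first.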
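The AUTO thresholds and the edit count can be sketched in Python. This assumes transpositions count as a single edit, which is the fuzzy query's documented default; the distance function is a textbook optimal-string-alignment implementation, not the automaton Lucene actually uses.

```python
def auto_max_edits(term):
    """Edits allowed by fuzziness AUTO, based on term length."""
    if len(term) <= 2:
        return 0
    if len(term) <= 5:
        return 1
    return 2

def edit_distance(a, b):
    """Levenshtein distance where swapping adjacent characters costs 1."""
    rows, cols = len(a) + 1, len(b) + 1
    d = [[0] * cols for _ in range(rows)]
    for i in range(rows):
        d[i][0] = i
    for j in range(cols):
        d[0][j] = j
    for i in range(1, rows):
        for j in range(1, cols):
            cost = 0 if a[i - 1] == b[j - 1] else 1
            d[i][j] = min(
                d[i - 1][j] + 1,         # deletion
                d[i][j - 1] + 1,         # insertion
                d[i - 1][j - 1] + cost,  # substitution or match
            )
            if i > 1 and j > 1 and a[i - 1] == b[j - 2] and a[i - 2] == b[j - 1]:
                d[i][j] = min(d[i][j], d[i - 2][j - 2] + 1)  # transposition
    return d[-1][-1]

def fuzzy_matches(query, indexed_term):
    return edit_distance(query, indexed_term) <= auto_max_edits(query)

print(edit_distance("elasticsaerch", "elasticsearch"))  # 1: "ae" swapped with "ea"
print(auto_max_edits("elasticsaerch"))                  # 2
print(fuzzy_matches("elasticsaerch", "elasticsearch"))  # True
print(fuzzy_matches("go", "no"))                        # False: 2 characters, exact only
```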
wildcard — pattern matching:
{ "wildcard": { "user_name.keyword": "tech_*" } }
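Leading wildcards (*tech) cannot use the sorted term dictionary to narrow the search, so they scan every term in the field and are extremely slow. Avoid them. Note that allow_leading_wildcard is an option of the query_string query, not of wildcard. To block expensive queries cluster-wide, set search.allow_expensive_queries to false, which rejects wildcard queries altogether along with other expensive query types.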
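A rough Python analogy for the cost difference: with a literal prefix, a binary search over the sorted terms narrows the candidates before any pattern matching; without one, every term must be tested. The term list and helper names are illustrative, and fnmatchcase stands in for the wildcard matcher (it also treats [ ] specially, which the wildcard query does not).

```python
import bisect
from fnmatch import fnmatchcase

terms = sorted(["data_guru", "tech_guru", "tech_lead", "techie", "web_tech"])

def prefix_candidates(sorted_terms, prefix):
    """Binary-search the slice of terms that start with prefix."""
    lo = bisect.bisect_left(sorted_terms, prefix)
    hi = bisect.bisect_left(sorted_terms, prefix + "\uffff")
    return sorted_terms[lo:hi]

def wildcard_lookup(sorted_terms, pattern):
    """Return (matching terms, number of terms that had to be examined)."""
    literal_prefix = ""
    for ch in pattern:
        if ch in "*?":
            break
        literal_prefix += ch
    if literal_prefix:
        candidates = prefix_candidates(sorted_terms, literal_prefix)
    else:
        candidates = sorted_terms  # leading wildcard: nothing to narrow with
    matches = [t for t in candidates if fnmatchcase(t, pattern)]
    return matches, len(candidates)

print(wildcard_lookup(terms, "tech_*"))  # (['tech_guru', 'tech_lead'], 2)
print(wildcard_lookup(terms, "*tech"))   # (['web_tech'], 5)
```

With five terms the difference is trivial; with millions of unique terms in a field, examining all of them per query is what makes leading wildcards slow.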
Specialized queries
more_like_this — document similarity:
{
"more_like_this": {
"fields": ["tweet_text"],
"like": [{ "_index": "tweets", "_id": "abc123" }],
"min_term_freq": 1,
"max_query_terms": 12
}
}
Finds documents with similar term distribution to a reference document. Used for "related posts" features.
nested — search inside nested objects:
{
"nested": {
"path": "comments",
"query": {
"bool": {
"must": [
{ "match": { "comments.text": "great post" } },
{ "term": { "comments.author": "alice" } }
]
}
}
}
}
Nested queries scope matches to individual nested objects, which prevents cross-object matching (where author=alice from one comment and text=great from a different comment falsely match). The path field must be mapped with type nested for this to work.
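The cross-object problem is easy to reproduce in plain Python. The sketch below compares a flattened view of the comments (all authors in one bag, all text tokens in another, which is how arrays of plain objects are indexed) with a per-object check; the sample document and function names are made up for illustration.

```python
doc = {
    "comments": [
        {"author": "alice", "text": "nice work"},
        {"author": "bob", "text": "great post"},
    ]
}

def flattened_match(doc, author, words):
    """Object arrays without nested: field values lose their pairing."""
    authors = {c["author"] for c in doc["comments"]}
    tokens = {t for c in doc["comments"] for t in c["text"].lower().split()}
    return author in authors and any(w in tokens for w in words)

def nested_match(doc, author, words):
    """Nested: both conditions must hold inside the same comment."""
    for comment in doc["comments"]:
        tokens = set(comment["text"].lower().split())
        if comment["author"] == author and any(w in tokens for w in words):
            return True
    return False

print(flattened_match(doc, "alice", ["great", "post"]))  # True: false positive
print(nested_match(doc, "alice", ["great", "post"]))     # False
print(nested_match(doc, "bob", ["great", "post"]))       # True
```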
You want to find all tweets tagged with exactly '#Elasticsearch' (preserving case and the # character). The hashtags field is mapped as both text and keyword. Which query and field should you use?
Difficulty: easy
The hashtags field itself is a text field (analyzed, lowercased) and has a hashtags.keyword subfield (not analyzed, stores the original string including # and case).
A. match on hashtags: it's the standard full-text query and handles text fields
Incorrect. match would analyze '#Elasticsearch' through the standard analyzer, stripping # and lowercasing to produce 'elasticsearch'. It would return documents containing 'elasticsearch' as a token, which may include documents where the hashtag appeared as 'ELASTICSEARCH' or 'Elasticsearch'. Not suitable for exact hashtag filtering.
B. term on hashtags.keyword: exact lookup on the unanalyzed subfield
Correct! hashtags.keyword stores the original string without analysis. A term query for '#Elasticsearch' looks up that exact string in the keyword subfield and returns only documents where the hashtag appears exactly as '#Elasticsearch'. This is the correct pattern for exact-match filtering: use the .keyword subfield with term. If you also need full-text search on hashtags (finding 'elasticsearch' regardless of case or prefix), use match on the analyzed hashtags field.
C. term on hashtags: term queries work on any field type
Incorrect. term on the text field hashtags would look for the exact string '#Elasticsearch' in an index that contains lowercased, analyzed tokens like 'elasticsearch'. '#Elasticsearch' ≠ 'elasticsearch', so 0 results. term is syntactically valid on text fields but produces wrong results because the analyzed index doesn't contain the original string.
D. match_phrase on hashtags.keyword: phrase matching preserves the exact string
Incorrect. match_phrase is designed for ordered phrase matching on analyzed text fields. A keyword field holds each value as a single unanalyzed term, so phrase semantics add nothing there. Use term on keyword fields for exact matches.
Hint: Which field type stores the original unanalyzed string, and which query skips analysis?