Redis Ziplist: The Internal Encoding That Keeps Small Collections Compact

2 min read · Databases & Storage

Redis doesn't always use the data structure you'd expect. Small hashes, lists, and sorted sets use ziplist — a contiguous memory layout that avoids pointer overhead. Understanding the thresholds and the conversion trigger helps you tune memory usage without surprises.

Tags: caching, redis, internals, memory

Why Redis uses different encodings for the same type

A Redis hash can be stored as either a ziplist or a hashtable. A sorted set can be a ziplist or a skiplist. Redis chooses the encoding automatically based on the size of the data. You call HSET the same way regardless — the encoding is transparent to the application.

The reason: a hashtable is fast (O(1) operations) but memory-heavy. Each key and value requires a robj wrapper, a pointer, and alignment overhead. For a hash with 5 keys, these pointers might cost more than the data itself.

A ziplist stores everything in a single contiguous block of memory: no pointers, no per-entry objects, minimal overhead. For small collections, the memory savings are significant (often 10x), and sequential reads are fast (CPU-friendly due to cache locality).
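To make the tradeoff concrete, here is a back-of-envelope sketch in Python. The constants (16-byte robj, 8-byte pointers, 11-byte ziplist header, 2 bytes of per-element overhead) are approximations for a 64-bit build, not exact Redis accounting:

```python
# Back-of-envelope memory estimate for a small hash under both encodings.
# These numbers are approximations, not exact Redis memory accounting.

ROBJ = 16   # approximate per-object header (robj) on 64-bit builds
PTR = 8     # pointer size on 64-bit

def hashtable_estimate(n_fields, key_len, val_len):
    # each field: a key robj, a value robj, and roughly two pointers
    # (bucket slot + chain), plus the raw key/value bytes
    per_entry = 2 * ROBJ + 2 * PTR
    return n_fields * (per_entry + key_len + val_len)

def ziplist_estimate(n_fields, key_len, val_len):
    # 11-byte header, then key and value stored as two consecutive
    # elements, each with ~2 bytes of overhead (prevlen + encoding)
    header = 11
    per_element_overhead = 2
    return header + n_fields * (2 * per_element_overhead + key_len + val_len)

print(hashtable_estimate(5, 8, 8))  # -> 320
print(ziplist_estimate(5, 8, 8))    # -> 111
```

Even with these rough constants, the pointer-free layout wins by roughly 3x for a 5-field hash, and the gap widens as values get shorter.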

Ziplist memory layout


A ziplist is a byte array. Entries are packed sequentially with variable-length encoding. The header stores total bytes and entry count. Each entry stores the previous entry's length (enabling backward traversal), the encoding type and data length, and the actual data.

Prerequisites

  • Redis data types
  • memory allocation basics
  • Big O notation

Key Points

  • No pointers — entries reference each other by length, not address. Saves 8-16 bytes per entry vs linked structures.
  • Integers are stored as actual integers (1-5 bytes), not as strings. '12345' uses 2 bytes, not 5.
  • All operations are O(n) — inserts and deletes require shifting subsequent entries in memory.
  • Auto-converts to hashtable (hashes), skiplist (sorted sets), or quicklist (lists) when configured thresholds are exceeded.

The memory structure in detail

[zlbytes][zltail][zllen][entry1][entry2]...[entryN][0xFF]
  • zlbytes: 4 bytes. Total allocated bytes.
  • zltail: 4 bytes. Byte offset to the last entry (enables RPOP/LRANGE-from-end without full traversal).
  • zllen: 2 bytes. Number of entries. If the count reaches 65,535 the field saturates, and the true count requires a full traversal.

Each entry:

[prev_entry_len][encoding][content]
  • prev_entry_len: 1 byte if previous entry < 254 bytes, otherwise 5 bytes. Enables backward traversal.
  • encoding: 1-5 bytes. Indicates whether content is an integer or string, and its length.
  • content: the actual value.

The variable-length encoding is where the savings come from: a small integer like 42 uses 2 bytes of entry overhead (prev_entry_len + encoding) plus 1 byte for the value. A linked list node for the same integer would require a robj (16 bytes) + pointer (8 bytes) + the string representation.
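The sizing rules above can be sketched in a few lines of Python. This is a simplification: real ziplists also have a 24-bit integer width and a 4-bit immediate encoding for the values 0-12, which are omitted here:

```python
# Simplified sketch of ziplist entry sizing (not the exact Redis bit layout).

def prevlen_bytes(prev_entry_size):
    # 1 byte if the previous entry is shorter than 254 bytes, else 5 bytes
    return 1 if prev_entry_size < 254 else 5

def int_content_bytes(value):
    # integers are packed into the smallest fitting width (simplified:
    # real ziplists also have 24-bit and 4-bit immediate encodings)
    for width, limit in ((1, 1 << 7), (2, 1 << 15), (4, 1 << 31), (8, 1 << 63)):
        if -limit <= value < limit:
            return width
    raise ValueError("integer out of 64-bit range")

print(int_content_bytes(42))     # -> 1 byte, vs 2 bytes as the string "42"
print(int_content_bytes(12345))  # -> 2 bytes, vs 5 bytes as a string
print(prevlen_bytes(100))        # -> 1
print(prevlen_bytes(300))        # -> 5
```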

Thresholds and automatic conversion

Redis converts from ziplist to a more scalable structure when either threshold is exceeded. Defaults in redis.conf:

# Hash
hash-max-listpack-entries 128  # max entries before converting to hashtable
hash-max-listpack-value 64     # max bytes for a single value

# List
list-max-listpack-size 128

# Sorted Set
zset-max-listpack-entries 128
zset-max-listpack-value 64

(Redis 7.0+ uses "listpack", a redesigned successor to ziplist; the thresholds and conversion behavior are the same, though listpack replaces the prev_entry_len field and with it the cascade-update problem described below.)

# Check the current encoding of a key
redis-cli OBJECT ENCODING myhash
# "listpack" (if small)
# "hashtable" (if large)

Conversion is one-way and immediate: once a threshold is exceeded, the structure converts to the scalable type. It does not convert back if entries are removed.
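The decision rule is easy to model. Below is a toy Python version of the hash case, using the default thresholds from redis.conf above; it illustrates the rule, it is not Redis code:

```python
# Toy model of the hash encoding decision (defaults from redis.conf).
MAX_ENTRIES = 128   # hash-max-listpack-entries
MAX_VALUE = 64      # hash-max-listpack-value (bytes)

def hash_encoding(fields):
    """fields: dict of str -> str. Encoding Redis would pick on insert."""
    if len(fields) > MAX_ENTRIES:
        return "hashtable"
    if any(len(k) > MAX_VALUE or len(v) > MAX_VALUE for k, v in fields.items()):
        return "hashtable"
    return "listpack"

small = {f"f{i}": "x" for i in range(50)}
big = {f"f{i}": "x" for i in range(150)}
print(hash_encoding(small))           # -> listpack
print(hash_encoding(big))             # -> hashtable
print(hash_encoding({"k": "v" * 65})) # -> hashtable (one oversized value)
```

Note that either condition alone triggers the conversion: a 2-field hash with one 65-byte value converts just as surely as a 129-field hash of tiny values.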

The cascade reallocation problem (ziplist)

Inserting into a ziplist can trigger a cascade of reallocations. Because prev_entry_len is 1 byte for entries < 254 bytes, inserting a large entry can change the previous entry's prev_entry_len from 1 to 5 bytes — which changes that entry's size — which changes its successor's prev_entry_len — and so on through the entire list.

This is called "cascade update" and is why ziplist performance degrades for large collections with frequent modification. The threshold configurations exist precisely to prevent ziplist from being used for large or heavily-modified data:

  • Keep hash-max-listpack-entries at 128 or lower for write-heavy hashes.
  • Keep hash-max-listpack-value at 64 bytes or lower for variable-length values.

For read-heavy, small, infrequently-modified data (user session metadata, configuration objects), ziplist is nearly optimal.

Practical tuning

The default thresholds work for most cases. Two situations call for tuning:

Lower the thresholds if large listpacks are causing latency — every lookup and update on a collection near the limit is an O(n) scan:

# Find all hash keys and their encodings
# (piping through TYPE alone would drop the key names)
redis-cli --scan --pattern '*' | while read -r key; do
  [ "$(redis-cli TYPE "$key")" = hash ] && echo "$key: $(redis-cli OBJECT ENCODING "$key")"
done
redis-cli DEBUG OBJECT myhash   # serializedlength gives a size hint

Raise the thresholds if your collections sit just above the defaults and you want to keep the compact encoding:

# For session objects that occasionally exceed the 128-entry default
hash-max-listpack-entries 256
hash-max-listpack-value 128

Benchmark before raising thresholds significantly — at some point, O(n) operations on ziplists outweigh the memory savings. A hash with 500 entries doing frequent HGET is faster as a hashtable.
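As a rough analogy for the access-pattern difference (not a Redis benchmark), compare a linear scan over a Python list of pairs, listpack-like, with a dict lookup, hashtable-like, for a 500-field hash:

```python
import timeit

n = 500
pairs = [(f"field{i}", f"value{i}") for i in range(n)]
table = dict(pairs)

def listpack_like_get(key):
    for k, v in pairs:       # O(n) linear scan, like a listpack lookup
        if k == key:
            return v

def hashtable_like_get(key):
    return table[key]        # O(1) hashed lookup

# Worst case for the scan: the last field
scan = timeit.timeit(lambda: listpack_like_get("field499"), number=2000)
hashed = timeit.timeit(lambda: hashtable_like_get("field499"), number=2000)
print(f"linear scan: {scan:.4f}s  hash lookup: {hashed:.4f}s")
```

The scan loses badly at 500 entries, which is exactly why the default thresholds stop the compact encoding well before that point.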

A Redis hash starts with 50 string fields and uses the listpack encoding. After a code change adds 100 more fields (total: 150), the hash encoding switches to hashtable. Performance for HGET improves but memory usage for that key increases significantly. Why?

Difficulty: medium

hash-max-listpack-entries is 128. All values are short strings (< 10 bytes). The hash now has 150 fields.

  • A. hashtable stores duplicate copies of each entry for fast lookup
    Incorrect. Hashtables don't duplicate entries. They store a pointer to each key-value robj, plus bucket overhead — but not duplicates.
  • B. hashtable requires per-entry robj wrappers and pointers that listpack avoids
    Correct! listpack stores all data in a contiguous block with no pointers. hashtable stores each key and value as a robj (16 bytes minimum) plus pointer overhead (~8 bytes per entry in 64-bit Redis). For 150 short string fields, the hashtable overhead per entry (robj + pointer + alignment) can exceed the data size itself, while listpack stored the same data with near-zero overhead.
  • C. Redis allocates a fixed 1 MB block for every hashtable
    Incorrect. Redis does not allocate fixed blocks for hashtables. Memory is allocated per entry plus bucket array overhead.
  • D. The encoding conversion copies the data twice before releasing the original
    Incorrect. The temporary double-memory during conversion is real but transient. The sustained memory increase after conversion is due to the per-entry overhead of the hashtable structure.

Hint: Think about what each encoding stores per entry — listpack avoids a structure that hashtable requires for each key and value.