Redis Data Structures in Depth: Underlying Encodings and Performance Characteristics of the Five Data Types

Lao Zhang | 2026/4/30 | about 7 minutes

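Audience: backend engineers who use Redis every day and want to understand its internals | Reading time: about 17 minutes | Key takeaway: understand how Redis's five data types are implemented underneath, pick the right data structure, and write high-performance caching code

"Redis is simple, it's just a KV store"

Every time I hear this, I can't help pushing back. Each of Redis's five data types (String, Hash, List, Set, Sorted Set) has several underlying encodings, and the performance of each encoding varies enormously with data size and access pattern.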

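In 2022 I helped a team building a social product optimize their Redis usage. Their problem: follow relationships were stored in Lists, so every time they needed an account's follower list they ran LRANGE key 0 -1 to pull out every follower and then did the set operations in application memory. For a celebrity account with 500,000 followers, a single LRANGE returned several hundred KB of data, with heavy network and memory overhead.

After switching to Set, they checked a follow relationship directly with SISMEMBER and computed mutual follows with SINTERSTORE. The latency of a single operation dropped from 200ms to 1ms.

Both approaches "store data in Redis", yet picking the right data structure made a 200x difference. This article walks through the underlying implementation of the five data types and the reasoning for choosing between them.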

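1. String: Richer Than You Might Expect

1.1 Three underlying encodings

The String type has three underlying encodings:

int: when the value is an integer that fits in a signed 64-bit long, it is stored directly as a long, which uses the least memory
embstr: for strings of ≤ 44 bytes; the redisObject and the sdshdr are contiguous in memory, so only one allocation is needed
raw: for strings of > 44 bytes; the redisObject and the sdshdr are allocated separately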
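The three rules above can be sketched as a small decision function. This is only an illustration of the thresholds listed in this section, not Redis's actual code: the class `StringEncodingGuess` and its methods are hypothetical names, and the check for "is an integer" assumes Redis only uses the int encoding when the text is exactly the canonical decimal form of a signed 64-bit number.

```java
import java.nio.charset.StandardCharsets;

/** Hypothetical helper: predicts the String encoding from the rules listed above. */
final class StringEncodingGuess {
    // Threshold taken from the text above: embstr holds at most 44 bytes.
    static final int EMBSTR_MAX_BYTES = 44;

    static String guess(String value) {
        if (isCanonicalLong(value)) {
            return "int";
        }
        // Redis counts bytes, not characters, so measure the UTF-8 length.
        int bytes = value.getBytes(StandardCharsets.UTF_8).length;
        return bytes <= EMBSTR_MAX_BYTES ? "embstr" : "raw";
    }

    // True only if the text is exactly how a signed 64-bit integer prints,
    // so inputs such as "007" or "+5" are treated as plain strings (assumption).
    static boolean isCanonicalLong(String value) {
        try {
            return Long.toString(Long.parseLong(value)).equals(value);
        } catch (NumberFormatException e) {
            return false;
        }
    }
}
```

For example, `StringEncodingGuess.guess("100")` returns `"int"` and `StringEncodingGuess.guess("hello")` returns `"embstr"`. On a real server, OBJECT ENCODING remains the source of truth.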

# Check the actual encoding
127.0.0.1:6379> SET count 100
127.0.0.1:6379> OBJECT ENCODING count
"int"

127.0.0.1:6379> SET short_str "hello"
127.0.0.1:6379> OBJECT ENCODING short_str
"embstr"

127.0.0.1:6379> SET long_str "this is a long string exceeding 44 bytes limit!!"
127.0.0.1:6379> OBJECT ENCODING long_str
"raw"

1.2 Typical use cases for String

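import java.time.Duration;

import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.data.redis.core.StringRedisTemplate;
import org.springframework.stereotype.Service;

// UserDTO and JsonUtils are project classes that are not shown in this article.
@Service
public class RedisStringDemo {

    @Autowired
    private StringRedisTemplate redisTemplate;

    /**
     * Scenario 1: counter (benefits from the compact int encoding).
     * INCR is atomic, so concurrent increments are safe.
     */
    public long incrementViewCount(Long articleId) {
        String key = "article:view:" + articleId;
        Long count = redisTemplate.opsForValue().increment(key);
        // Set the TTL only when the key was just created.
        // INCR and EXPIRE are two separate commands: if the process dies in between, the key keeps no TTL.
        if (count != null && count == 1) {
            redisTemplate.expire(key, Duration.ofDays(7));
        }
        return count != null ? count : 0;
    }

    /**
     * Scenario 2: distributed lock (SET NX EX in a single atomic command).
     */
    public boolean tryLock(String lockKey, String lockValue, long ttlSeconds) {
        Boolean success = redisTemplate.opsForValue()
            .setIfAbsent(lockKey, lockValue, Duration.ofSeconds(ttlSeconds));
        return Boolean.TRUE.equals(success);
    }

    /**
     * Scenario 3: cache an object (serialize it to JSON; values that fit in
     * 44 bytes keep the embstr encoding, which performs best).
     */
    public void cacheUser(Long userId, UserDTO user) {
        String key = "user:" + userId;
        // For large objects, consider a Hash instead of a String so that single fields can be updated.
        redisTemplate.opsForValue().set(key,
            JsonUtils.toJson(user), Duration.ofMinutes(30));
    }
}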

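2. Hash: The Best Choice for Storing Objects

2.1 Two underlying encodings

listpack (ziplist in older versions): used when the number of fields is ≤ hash-max-listpack-entries (default 512) and every field name and value is ≤ hash-max-listpack-value (default 64 bytes); the memory is contiguous and CPU-cache friendly
hashtable: once either threshold is exceeded the Hash is converted to a hashtable; operations are O(1), but it uses more memory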

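# Small Hash (listpack encoding)
127.0.0.1:6379> HSET user:1 name "张三" age 25 city "北京"
127.0.0.1:6379> OBJECT ENCODING user:1
"listpack"

# Once a threshold is exceeded, the Hash is automatically converted to hashtable
# After the conversion it never goes back to listpack (even if most fields are deleted)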
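To make the two conditions concrete, here is a minimal sketch that predicts the encoding of a freshly written Hash. The class `HashEncodingGuess` is a hypothetical name, and the thresholds are passed in as parameters on purpose: in practice they would be read from the server with CONFIG GET rather than assumed. The sketch is stateless, so it does not model the "never converts back" behaviour described above.

```java
import java.nio.charset.StandardCharsets;
import java.util.Map;

/** Hypothetical helper: applies the listpack/hashtable rule to an in-memory map. */
final class HashEncodingGuess {

    static String guess(Map<String, String> fields, int maxEntries, int maxValueBytes) {
        // Rule 1: too many fields forces hashtable.
        if (fields.size() > maxEntries) {
            return "hashtable";
        }
        // Rule 2: a single oversized field name or value also forces hashtable.
        for (Map.Entry<String, String> e : fields.entrySet()) {
            if (utf8Length(e.getKey()) > maxValueBytes
                    || utf8Length(e.getValue()) > maxValueBytes) {
                return "hashtable";
            }
        }
        return "listpack";
    }

    // Redis measures sizes in bytes, so non-ASCII text counts for more than one per character.
    static int utf8Length(String s) {
        return s.getBytes(StandardCharsets.UTF_8).length;
    }
}
```

Note that one long value is enough to trigger the conversion, even when the Hash has only a handful of fields.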

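2.2 Hash vs String for storing objects

// Comparing the two approaches
// (hashOps stands for redisTemplate.opsForHash())

// String approach: store JSON
redisTemplate.opsForValue().set("user:1", jsonUser);
// Pros: simple
// Cons: updating one field means deserialize → modify → serialize, which adds overhead
//       reading a few fields still requires fetching the whole JSON

// Hash approach: store field by field
hashOps.put("user:1", "name", "张三");
hashOps.put("user:1", "age", "25");
hashOps.put("user:1", "city", "北京");
// Pros: a single field can be read or written on its own (HGET/HSET)
//       uses less memory (listpack encoding is compact)
// Cons: writing many fields one at a time costs one HSET per field (putAll can batch them)

// Recommendations:
// ≤ 10 fields, and individual fields are frequently read or written → Hash
// the object is always handled as a whole → String (JSON)
// many fields (> 50), and no need to update fields individually → String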

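Pitfall 1: too many Hash fields trigger an encoding conversion and memory balloons

Symptom: user profiles are stored in Hashes, each user has 200+ tag fields, and Redis memory usage is far higher than expected.

Cause: a Hash is converted to hashtable as soon as its field count exceeds hash-max-listpack-entries or any single field name or value exceeds hash-max-listpack-value. With the limit at 128, 200 fields are enough to trigger the conversion; the stock default is 512, so on an untuned server check whether some tag value is longer than 64 bytes (CONFIG GET and OBJECT ENCODING show what actually happened). Every hashtable entry carries extra pointer overhead, and memory usage can be 4-6 times that of listpack.

Fix: split the oversized Hash into several small Hashes (≤ 100 fields each), switch to a String holding JSON, or consider the Redis JSON module (RedisJSON).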
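One way to do the split is to route each field to a bucket by hashing the field name. The sketch below is an assumption about how that could look, not the approach the team used: the class `ProfileSharding`, the key pattern `user:profile:{userId}:{bucket}` and the bucket count are all hypothetical.

```java
/** Hypothetical helper: maps a profile field to one of several small Hash keys. */
final class ProfileSharding {

    /**
     * Returns the Redis key of the small Hash that should hold the given field.
     * The same field always maps to the same bucket, so reads and writes agree.
     */
    static String bucketKey(long userId, String field, int buckets) {
        if (buckets <= 0) {
            throw new IllegalArgumentException("buckets must be positive");
        }
        // floorMod keeps the result in [0, buckets) even when hashCode() is negative.
        int bucket = Math.floorMod(field.hashCode(), buckets);
        return "user:profile:" + userId + ":" + bucket;
    }
}
```

For instance, `ProfileSharding.bucketKey(42, "a", 4)` yields `user:profile:42:1`. Hashing spreads fields only roughly evenly, so the bucket count should leave headroom below the listpack limit; reading the whole profile then takes one HGETALL per bucket.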

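3. List: An Ordered Queue, but Mind the Performance Limits

3.1 How the underlying encoding changes

listpack (Redis 7.2+): a small list is kept in a single listpack as long as it fits within list-max-listpack-size (default -2, meaning at most 8 KB per node)
quicklist: once the list outgrows that limit it becomes a quicklist (a doubly linked list whose nodes are listpacks); before Redis 7.2 every list is a quicklist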

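3.2 The O(n) traps of List

# O(1) operations: push and pop at either end
LPUSH / RPUSH / LPOP / RPOP

# O(n) operations: access by index, range queries, removing a given element
LINDEX key index  # O(n), n is the number of elements traversed to reach the index
LRANGE key 0 -1   # O(S+N), S is the offset of the start, N is the size of the range
LREM key count value  # O(n), scans the list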

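Pitfall 2: pulling a large list with LRANGE blocks the network

Symptom: LRANGE key 0 -1 is used to fetch the complete message history. Once the list reaches 100,000 entries, a single call returns several MB of data and occupies Redis's single command-execution thread, so other commands see noticeable latency.

Fix: paginate and fetch only one slice at a time: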

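import java.util.List;

import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.data.redis.core.RedisTemplate;
import org.springframework.stereotype.Service;

@Service
public class MessageListService {

    @Autowired
    private RedisTemplate<String, String> redisTemplate;

    private static final int PAGE_SIZE = 100;

    /**
     * Fetch one page of messages (newest first). Pages are numbered from 0.
     */
    public List<String> getMessages(String channelId, int page) {
        String key = "channel:messages:" + channelId;
        // LRANGE indexes are inclusive on both ends.
        long start = (long) page * PAGE_SIZE;
        long end = start + PAGE_SIZE - 1;

        return redisTemplate.opsForList().range(key, start, end);
    }

    /**
     * Keep only the newest N messages so the list cannot grow without bound.
     */
    public void addMessage(String channelId, String message) {
        String key = "channel:messages:" + channelId;
        redisTemplate.opsForList().leftPush(key, message);
        // Keep the newest 1000 entries; anything beyond that is trimmed from the tail.
        redisTemplate.opsForList().trim(key, 0, 999);
    }
}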
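When a job really does need the whole list (an export, for example), the same idea applies: walk it in fixed-size windows instead of one LRANGE key 0 -1. The sketch below only computes the inclusive index pairs that would be passed to LRANGE; the class `RangeWindows` is a hypothetical helper, and the list length is assumed to come from LLEN.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper: splits the index range 0..length-1 into LRANGE-sized windows. */
final class RangeWindows {

    /** Returns inclusive {start, end} pairs, matching LRANGE's inclusive indexes. */
    static List<long[]> windows(long length, int pageSize) {
        if (length < 0 || pageSize <= 0) {
            throw new IllegalArgumentException("length must be >= 0 and pageSize > 0");
        }
        List<long[]> result = new ArrayList<>();
        for (long start = 0; start < length; start += pageSize) {
            // The last window may be shorter than pageSize.
            long end = Math.min(start + pageSize, length) - 1;
            result.add(new long[] {start, end});
        }
        return result;
    }
}
```

A list of 250 elements with a page size of 100 yields the windows 0..99, 100..199 and 200..249. If the list is being written to concurrently, the windows can shift between calls, so this fits append-at-tail or read-mostly lists best.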

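4. Set: The Tool for Set Operations

4.1 Underlying encodings

intset: used when every element is an integer and the count is ≤ set-max-intset-entries (default 512); compact in memory, lookups are O(log n) (binary search)
listpack: used when the count is ≤ 128 and every element is ≤ 64 bytes (Redis 7.2+)
hashtable: used once the thresholds are exceeded; all operations are O(1)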
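The three-way choice can be written down as a short decision function. This is a sketch of the rules as stated in this section, with a hypothetical class name `SetEncodingGuess`; the thresholds are parameters so that nothing is assumed about a particular server's configuration, and "integer" is taken to mean the canonical text of a signed 64-bit number.

```java
import java.nio.charset.StandardCharsets;
import java.util.Collection;

/** Hypothetical helper: applies the intset / listpack / hashtable rules to a collection. */
final class SetEncodingGuess {

    static String guess(Collection<String> members, int maxIntsetEntries,
                        int maxListpackEntries, int maxListpackValueBytes) {
        boolean allIntegers = true;
        boolean allShort = true;
        for (String m : members) {
            if (!isCanonicalLong(m)) {
                allIntegers = false;
            }
            if (m.getBytes(StandardCharsets.UTF_8).length > maxListpackValueBytes) {
                allShort = false;
            }
        }
        if (allIntegers && members.size() <= maxIntsetEntries) {
            return "intset";
        }
        if (allShort && members.size() <= maxListpackEntries) {
            return "listpack";
        }
        return "hashtable";
    }

    // True only for the exact decimal form of a signed 64-bit integer (assumption).
    static boolean isCanonicalLong(String value) {
        try {
            return Long.toString(Long.parseLong(value)).equals(value);
        } catch (NumberFormatException e) {
            return false;
        }
    }
}
```

A practical consequence: storing numeric user IDs keeps a small Set in intset, while adding a single non-numeric member moves it to listpack or hashtable.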

4.2 Intersection, union, and difference with Set

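import java.util.Collections;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.stream.Collectors;

import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.data.redis.core.RedisTemplate;
import org.springframework.stereotype.Service;

@Service
public class SocialRelationService {

    @Autowired
    private RedisTemplate<String, Long> redisTemplate;

    /**
     * Accounts that both users follow.
     * Uses SINTER: O(N*M), N is the size of the smallest set, M is the number of sets.
     */
    public Set<Long> getCommonFollowing(Long userId1, Long userId2) {
        String key1 = "user:following:" + userId1;
        String key2 = "user:following:" + userId2;
        return redisTemplate.opsForSet().intersect(key1, key2);
    }

    /**
     * Whether one user follows another.
     * SISMEMBER, O(1).
     */
    public boolean isFollowing(Long followerId, Long followeeId) {
        String key = "user:following:" + followerId;
        return Boolean.TRUE.equals(
            redisTemplate.opsForSet().isMember(key, followeeId));
    }

    /**
     * People the user might know: accounts that follow the same people as the user,
     * excluding those the user already follows and the user themself.
     */
    public Set<Long> getMightKnow(Long userId) {
        String myFollowing = "user:following:" + userId;

        // Everyone the user follows.
        Set<Long> following = redisTemplate.opsForSet().members(myFollowing);
        if (following == null || following.isEmpty()) return Collections.emptySet();

        // Follower-set keys of every account the user follows.
        List<String> theirFollowerKeys = following.stream()
            .map(uid -> "user:followers:" + uid)
            .collect(Collectors.toList());

        // Union of all those follower sets, minus the accounts the user already follows.
        String tempKey = "temp:might_know:" + userId;
        redisTemplate.opsForSet().unionAndStore(
            theirFollowerKeys.get(0), theirFollowerKeys.subList(1, theirFollowerKeys.size()), tempKey);
        Set<Long> candidates = redisTemplate.opsForSet().difference(tempKey, myFollowing);
        redisTemplate.delete(tempKey);

        if (candidates == null) return Collections.emptySet();
        // The user is a follower of everyone they follow, so remove them from the result.
        Set<Long> result = new HashSet<>(candidates);
        result.remove(userId);
        return result;
    }
}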
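The O(N*M) bound quoted for SINTER follows from a simple strategy: iterate over the smallest set and probe every other set for each candidate. The sketch below reproduces that strategy with plain `java.util` sets to show where N and M come from; it is an in-memory illustration with a hypothetical class name, not Redis's implementation.

```java
import java.util.HashSet;
import java.util.List;
import java.util.Set;

/** Hypothetical illustration of intersecting sets by scanning the smallest one. */
final class SmallestFirstIntersection {

    static Set<Long> intersect(List<Set<Long>> sets) {
        Set<Long> result = new HashSet<>();
        if (sets.isEmpty()) {
            return result;
        }
        // Pick the smallest set: its size is the N in O(N*M).
        Set<Long> smallest = sets.get(0);
        for (Set<Long> s : sets) {
            if (s.size() < smallest.size()) {
                smallest = s;
            }
        }
        // For each candidate, probe the other sets: their count is the M.
        for (Long candidate : smallest) {
            boolean inAll = true;
            for (Set<Long> other : sets) {
                if (other != smallest && !other.contains(candidate)) {
                    inAll = false;
                    break;
                }
            }
            if (inAll) {
                result.add(candidate);
            }
        }
        return result;
    }
}
```

This is why intersecting a celebrity's huge set with an ordinary user's small set stays cheap: the cost is driven by the smaller side.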

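5. Sorted Set: The Strongest Tool for Leaderboards

5.1 Underlying encodings

listpack: used when the count is ≤ 128 and every element is ≤ 64 bytes
skiplist + hashtable: used once the thresholds are exceeded; the skip list keeps the members ordered, and the hashtable gives O(1) member → score lookups

The skip list is the core data structure of Sorted Set, with O(log n) average time complexity and O(n) in the worst case.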
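For readers who have not met skip lists before, the following is a minimal sketch of the idea: several levels of forward pointers, where higher levels skip over more nodes. It stores distinct `long` keys only and is not Redis's zskiplist (which orders by score and member and also tracks ranks); the class name, the maximum level and the promotion probability are assumptions chosen for the example.

```java
import java.util.Random;

/** Minimal skip list of distinct long keys; an illustration only. */
final class TinySkipList {
    private static final int MAX_LEVEL = 16;   // assumed cap for this sketch
    private static final double P = 0.25;      // assumed chance of promoting a node one level

    private static final class Node {
        final long key;
        final Node[] next;   // next[i] is the successor on level i
        Node(long key, int level) {
            this.key = key;
            this.next = new Node[level];
        }
    }

    private final Node head = new Node(Long.MIN_VALUE, MAX_LEVEL); // sentinel, key never compared
    private final Random random;
    private int level = 1;   // number of levels currently in use

    TinySkipList(long seed) {
        this.random = new Random(seed);
    }

    private int randomLevel() {
        int lvl = 1;
        while (lvl < MAX_LEVEL && random.nextDouble() < P) {
            lvl++;
        }
        return lvl;
    }

    /** Inserts the key; returns false if it was already present. */
    boolean insert(long key) {
        Node[] update = new Node[MAX_LEVEL];
        Node cur = head;
        // Walk from the top level down, remembering the last node before the key on each level.
        for (int i = level - 1; i >= 0; i--) {
            while (cur.next[i] != null && cur.next[i].key < key) {
                cur = cur.next[i];
            }
            update[i] = cur;
        }
        Node candidate = cur.next[0];
        if (candidate != null && candidate.key == key) {
            return false;
        }
        int newLevel = randomLevel();
        for (int i = level; i < newLevel; i++) {
            update[i] = head;   // levels not used so far start at the sentinel
        }
        if (newLevel > level) {
            level = newLevel;
        }
        Node node = new Node(key, newLevel);
        for (int i = 0; i < newLevel; i++) {
            node.next[i] = update[i].next[i];
            update[i].next[i] = node;
        }
        return true;
    }

    /** Same top-down walk as insert; the average cost is O(log n). */
    boolean contains(long key) {
        Node cur = head;
        for (int i = level - 1; i >= 0; i--) {
            while (cur.next[i] != null && cur.next[i].key < key) {
                cur = cur.next[i];
            }
        }
        Node candidate = cur.next[0];
        return candidate != null && candidate.key == key;
    }
}
```

The O(n) worst case mentioned above corresponds to an unlucky run of random levels in which almost every node ends up on level 1 only, so a lookup degenerates into a linked-list scan.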

import java.util.Collections;
import java.util.List;
import java.util.Set;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.concurrent.atomic.AtomicLong;
import java.util.stream.Collectors;

import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.data.redis.core.RedisTemplate;
import org.springframework.data.redis.core.ZSetOperations;
import org.springframework.stereotype.Service;

// RankItem is a project class (userId, score, rank) that is not shown in this article.
@Service
public class LeaderboardService {

    @Autowired
    private RedisTemplate<String, String> redisTemplate;

    private static final String DAILY_LEADERBOARD = "leaderboard:daily";

    /**
     * Update a user's score.
     * ZADD or ZINCRBY, O(log n).
     */
    public void addScore(Long userId, double score) {
        // ZINCRBY: add to the existing score
        redisTemplate.opsForZSet().incrementScore(
            DAILY_LEADERBOARD, userId.toString(), score);
    }

    /**
     * Top N entries (highest score first).
     * ZREVRANGE, O(log n + k), k is the number of elements returned.
     */
    public List<RankItem> getTopN(int n) {
        Set<ZSetOperations.TypedTuple<String>> topN =
            redisTemplate.opsForZSet().reverseRangeWithScores(
                DAILY_LEADERBOARD, 0, n - 1);

        if (topN == null) return Collections.emptyList();

        // The returned set preserves the ZREVRANGE order, so a running counter gives the rank.
        AtomicInteger rank = new AtomicInteger(1);
        return topN.stream()
            .map(t -> new RankItem(
                Long.parseLong(t.getValue()),
                t.getScore(),
                rank.getAndIncrement()))
            .collect(Collectors.toList());
    }

    /**
     * Look up a user's rank.
     * ZREVRANK, O(log n).
     */
    public Long getUserRank(Long userId) {
        Long rank = redisTemplate.opsForZSet().reverseRank(
            DAILY_LEADERBOARD, userId.toString());
        return rank != null ? rank + 1 : null;  // rank is 0-based; convert to 1-based
    }

    /**
     * Entries around a user (`range` places before and after, e.g. 5).
     * Look up the rank first, then query the range.
     */
    public List<RankItem> getSurroundingRanks(Long userId, int range) {
        Long rank = redisTemplate.opsForZSet().reverseRank(
            DAILY_LEADERBOARD, userId.toString());
        if (rank == null) return Collections.emptyList();

        // Clamp at 0 so users near the top do not produce a negative start index.
        long start = Math.max(0, rank - range);
        long end = rank + range;

        Set<ZSetOperations.TypedTuple<String>> surrounding =
            redisTemplate.opsForZSet().reverseRangeWithScores(
                DAILY_LEADERBOARD, start, end);

        if (surrounding == null) return Collections.emptyList();

        AtomicLong currentRank = new AtomicLong(start + 1);
        return surrounding.stream()
            .map(t -> new RankItem(
                Long.parseLong(t.getValue()),
                t.getScore(),
                currentRank.getAndIncrement()))
            .collect(Collectors.toList());
    }
}

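Choosing the right data structure is the most direct and effective way to tune Redis performance, and it goes deeper than tweaking parameters or adding memory.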