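Designing a Comment System at Massive Scale: Architecture Evolution from Zero to Hundred-Million-Level Traffic

Preface

Comment systems are among the most common features of internet products. From product reviews in e-commerce and comments on social media posts to bullet-screen (danmaku) interaction on video sites, a comment system looks simple but hides a good deal of complexity.

As the data volume grows from tens of thousands of rows to hundreds of millions, the initial simple design gradually exposes problems: slow queries, write latency, storage bloat, hot-spot traffic, and more.

This article works through the design of a massive-scale comment system, covering the data model, storage selection, architecture evolution, and performance optimization, and builds step by step toward a system that can hold hundreds of millions of comments.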

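1. Requirements Analysis and Challenges

1.1 Core Features

Feature	Description	Priority
Post a comment	A user comments on a target object	P0
List comments	Paginated view sorted by time or popularity	P0
Reply to a comment	Reply to an existing comment	P0
Like / dislike	Express an attitude toward a comment	P1
Delete a comment	Deletion by the user or an administrator	P1
Comment moderation	Sensitive-word filtering, manual review	P1
Top comments	Pinned or most-liked comments shown first	P2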

1.2 Typical Business Scenario

┌────────────────────────────────────────────────────┐
│  Video detail page                                 │
│  ┌──────────────────────────────────────────────┐  │
│  │  Video title: XXX                            │  │
│  │  Views: 1,000,000                            │  │
│  └──────────────────────────────────────────────┘  │
│                                                    │
│  📝 Comments (12,345 in total)                     │
│  ┌──────────────────────────────────────────────┐  │
│  │ ⭐ Top comments                              │  │
│  │ User A: This video is great! 👍 1234         │  │
│  │   └─ User B: Indeed, I watched it 3 times    │  │
│  │                                              │  │
│  │ User C: Very informative, saved 👍 890       │  │
│  └──────────────────────────────────────────────┘  │
│  [Latest] [Top] [OP only]                          │
│  [Write a comment...]                   [Post]     │
└────────────────────────────────────────────────────┘

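1.3 Core Challenges

Challenge	Description	Difficulty
Large data volume	A single video can have millions of comments; the total reaches hundreds of millions	⭐⭐⭐⭐
High-concurrency writes	Popular content receives a sudden flood of comments	⭐⭐⭐⭐⭐
Deep pagination	Performance drops sharply after a few hundred pages	⭐⭐⭐⭐
Hot spots	The comment section of a popular video is read extremely often	⭐⭐⭐⭐
Nested replies	Storing and displaying multi-level replies	⭐⭐⭐
Data consistency	Like counts and comment counts must be accurate	⭐⭐⭐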

2. Data Model Design

2.1 Base Table Structure

-- Main comment table
CREATE TABLE `comment` (
    `id` bigint NOT NULL AUTO_INCREMENT,
    `object_id` varchar(64) NOT NULL COMMENT 'ID of the commented object (video ID / article ID)',
    `object_type` tinyint NOT NULL COMMENT 'Object type: 1-video 2-article 3-post',
    `parent_id` bigint DEFAULT '0' COMMENT 'Parent comment ID; 0 means a top-level comment',
    `reply_to_user_id` bigint DEFAULT NULL COMMENT 'ID of the user being replied to',
    `user_id` bigint NOT NULL COMMENT 'ID of the commenter',
    `content` text NOT NULL COMMENT 'Comment content',
    `like_count` int DEFAULT '0' COMMENT 'Number of likes',
    `reply_count` int DEFAULT '0' COMMENT 'Number of replies',
    `status` tinyint DEFAULT '1' COMMENT 'Status: 1-normal 2-deleted 3-under review',
    `floor` int DEFAULT NULL COMMENT 'Floor number',
    `ip` varchar(45) DEFAULT NULL COMMENT 'IP address of the poster',
    `device_type` varchar(20) DEFAULT NULL COMMENT 'Device type',
    `created_at` datetime NOT NULL DEFAULT CURRENT_TIMESTAMP,
    `updated_at` datetime NOT NULL DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP,
    PRIMARY KEY (`id`),
    KEY `idx_object_id_type` (`object_id`, `object_type`, `status`, `created_at`),
    KEY `idx_parent_id` (`parent_id`),
    KEY `idx_user_id` (`user_id`),
    KEY `idx_created_at` (`created_at`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 COMMENT='Main comment table';

-- Like record table (prevents duplicate likes)
CREATE TABLE `comment_like` (
    `id` bigint NOT NULL AUTO_INCREMENT,
    `comment_id` bigint NOT NULL,
    `user_id` bigint NOT NULL,
    `status` tinyint DEFAULT '1' COMMENT '1-liked 2-like cancelled',
    `created_at` datetime DEFAULT CURRENT_TIMESTAMP,
    PRIMARY KEY (`id`),
    UNIQUE KEY `uk_comment_user` (`comment_id`, `user_id`),
    KEY `idx_user_id` (`user_id`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 COMMENT='Comment like table';

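2.2 Sharding Strategy

Once a single table exceeds roughly 5 million rows, or a single database exceeds roughly 50 million rows, it is time to consider sharding across databases and tables.

Shard key: object_id (the ID of the commented object)

Sharding strategy:
  Algorithm: Hash(object_id) % shard count
  Shard count: 16 databases × 16 tables = 256 tables
  
  Routing example:
    object_id = "video_12345"
    hash = CRC32("video_12345") % 256
    target_db = comment_db_{hash / 16, integer division}
    target_table = comment_{hash, zero-padded to 4 digits}

Table layout after sharding:

comment_db_0
├── comment_0000
├── comment_0001
├── ...
└── comment_0015

comment_db_1
├── comment_0016
├── comment_0017
├── ...
└── comment_0031

... 16 databases and 256 tables in total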
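
The routing rule above can be sketched in Java as follows. This is a minimal illustration under the 16 × 16 layout described here, not a production router; the class name CommentShardRouter and its method names are assumptions made for this example.

```java
import java.nio.charset.StandardCharsets;
import java.util.zip.CRC32;

/** Hypothetical router for the 16 databases x 16 tables layout. */
final class CommentShardRouter {
    static final int DB_COUNT = 16;
    static final int TABLES_PER_DB = 16;
    static final int TOTAL_TABLES = DB_COUNT * TABLES_PER_DB; // 256

    /** Global table index in [0, 255], derived from CRC32(object_id). */
    static int tableIndex(String objectId) {
        CRC32 crc = new CRC32();
        crc.update(objectId.getBytes(StandardCharsets.UTF_8));
        // CRC32.getValue() is a non-negative long, so the remainder is never negative.
        return (int) (crc.getValue() % TOTAL_TABLES);
    }

    /** Tables 0-15 live in comment_db_0, tables 16-31 in comment_db_1, and so on. */
    static String databaseName(String objectId) {
        return "comment_db_" + (tableIndex(objectId) / TABLES_PER_DB);
    }

    /** Table names are zero-padded to four digits, e.g. comment_0017. */
    static String tableName(String objectId) {
        return String.format("comment_%04d", tableIndex(objectId));
    }
}
```

Because every comment of one object lands in the same table, a comment-list query for a video touches a single shard. The cost is that a very popular object concentrates its load on one table.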

2.3 Secondary Index Design

Because the shard key is object_id, queries by user_id (such as "My comments") would have to scan every shard. They need a secondary index table instead:

-- User comment index table (sharded by user)
CREATE TABLE `user_comment_index` (
    `id` bigint NOT NULL AUTO_INCREMENT,
    `user_id` bigint NOT NULL,
    `comment_id` bigint NOT NULL,
    `object_id` varchar(64) NOT NULL,
    `created_at` datetime NOT NULL,
    PRIMARY KEY (`id`),
    UNIQUE KEY `uk_user_comment` (`user_id`, `comment_id`),
    KEY `idx_user_created` (`user_id`, `created_at`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 COMMENT='User comment index';

Sharding strategy: Hash(user_id) % shard count

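3. Storage Architecture Evolution

3.1 First Generation: Single Database, Single Table (Tens of Thousands of Rows)

Suitable for: the startup stage, fewer than 1,000 comments per day

[Application] → [MySQL] → single table comment

Problem: queries slow down once the data reaches millions of rows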

3.2 Second Generation: Sharding + Cache (Tens of Millions of Rows)

Suitable for: the growth stage, 10,000+ comments per day

                ┌───────────────────────────────┐
                │         Load balancer         │
                └───────────────┬───────────────┘
                                │
                ┌───────────────▼───────────────┐
                │      Application service      │
                └───────┬───────────────┬───────┘
                        │               │
          ┌─────────────▼─────┐   ┌─────▼─────────────┐
          │   Redis Cluster   │   │ Comment write MQ  │
          │ - Hot comments    │   │ (RocketMQ/Kafka)  │
          │ - Counters        │   └─────────┬─────────┘
          └─────────┬─────────┘             │
                    │                       │
          ┌─────────▼───────────────────────▼─────────┐
          │               MySQL Cluster               │
          │    (sharded: 16 databases × 16 tables)    │
          └───────────────────────────────────────────┘

Key improvements:

Cache layer: hot comment data is cached in Redis
Asynchronous writes: comments go to the message queue first and are then written to the database in batches
Sharding: hash sharding by object_id

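3.3 Third Generation: Hot/Cold Separation (Hundreds of Millions of Rows)

Suitable for: the mature stage, 100 million+ comments in total

                ┌───────────────────────────────┐
                │      Application service      │
                └───────┬───────────────┬───────┘
                        │               │
    ┌───────────────────▼─────┐   ┌─────▼───────────────────┐
    │    Hot data storage     │   │    Cold data storage    │
    │ - Redis cache           │   │ - TiDB / HBase          │
    │ - MySQL (last 30 days)  │   │ - Compressed storage    │
    └─────────────────────────┘   └─────────────────────────┘

Hot/cold separation strategy:

Data tier	Storage medium	Retention	Access frequency
Hot data	Redis + MySQL	30 days	High
Warm data	MySQL	30-180 days	Medium
Cold data	TiDB/HBase/OSS	>180 days	Low

Data archiving flow:

Scheduled job (early morning, daily)
    │
    ▼
Scan comments older than 30 days
    │
    ▼
Compress and migrate them to cold storage
    │
    ▼
Delete the migrated data from hot storage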

3.4 Fourth Generation: Final Architecture

Suitable for: large systems serving hundreds of millions of rows and tens of thousands of QPS

┌─────────────────────────────────────────────────────────────────┐
│                             Client                              │
└─────────────────────────────────┬───────────────────────────────┘
                                  │
┌─────────────────────────────────▼───────────────────────────────┐
│                       CDN (static assets)                       │
└─────────────────────────────────┬───────────────────────────────┘
                                  │
┌─────────────────────────────────▼───────────────────────────────┐
│                API Gateway (single entry point)                 │
│                - Rate limiting / auth / routing                 │
└─────────────────────────────────┬───────────────────────────────┘
                                  │
┌─────────────────────────────────▼───────────────────────────────┐
│                     Comment service cluster                     │
│  ┌──────────────┐ ┌──────────────┐ ┌──────────────┐             │
│  │ Comment write│ │ Comment query│ │ Moderation   │             │
│  └──────────────┘ └──────────────┘ └──────────────┘             │
└─────────────────────────────────┬───────────────────────────────┘
                                  │
        ┌─────────────────────────┼─────────────────────────┐
        │                         │                         │
┌───────▼───────┐        ┌────────▼────────┐       ┌────────▼────────┐
│ Message queue │        │  Redis cluster  │       │  Search engine  │
│    (Kafka)    │        │ - Comment cache │       │ (Elasticsearch) │
│ - Peak shaving│        │ - Counter cache │       │ - Comment search│
│ - Async jobs  │        │ - Bloom filter  │       │ - Keyword filter│
└───────────────┘        └─────────────────┘       └─────────────────┘
        │                         │                         │
        └─────────────────────────┼─────────────────────────┘
                                  │
┌─────────────────────────────────▼───────────────────────────────┐
│                         Primary storage                         │
│  ┌─────────────────────────────────────────────────────────┐    │
│  │                  TiDB Cluster (NewSQL)                  │    │
│  │  - Automatic sharding, distributed transactions         │    │
│  │  - Online scale-out and scale-in                        │    │
│  │  - HTAP mixed workloads                                 │    │
│  └─────────────────────────────────────────────────────────┘    │
│                                                                 │
│  ┌─────────────────────────────────────────────────────────┐    │
│  │             Cold data storage (OSS / HBase)             │    │
│  │  - Historical comments older than 180 days              │    │
│  │  - Very low cost                                        │    │
│  └─────────────────────────────────────────────────────────┘    │
└─────────────────────────────────────────────────────────────────┘

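4. Core Flow Design

4.1 Posting a Comment

User submits a comment
            │
            ▼
┌────────────────────────┐
│ 1. Validate parameters │  ← content length, sensitive-word filtering
└───────────┬────────────┘
            │
            ▼
┌────────────────────────┐
│ 2. Generate comment ID │  ← Snowflake algorithm (distributed ID)
└───────────┬────────────┘
            │
            ▼
┌────────────────────────┐
│ 3. Write to the MQ     │  ← asynchronous decoupling, peak shaving
└───────────┬────────────┘
            │
            ▼
┌────────────────────────┐      ┌──────────────────────┐
│ 4. Update cache        │ ───▶ │ Update comment count │
└───────────┬────────────┘      └──────────────────────┘
            │
            ▼
┌────────────────────────┐
│ 5. Return success      │
└────────────────────────┘

Asynchronous background flow:
Message queue ──▶ Batch insert into the database ──▶ Update the index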

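Key code example:

@Service
public class CommentService {

    @Autowired
    private RocketMQTemplate mqTemplate;

    @Autowired
    private RedisTemplate<String, Object> redisTemplate;

    // Collaborators used below; their types are assumed and not shown in this article
    @Autowired
    private SensitiveWordFilter sensitiveWordFilter;

    @Autowired
    private SnowflakeIdGenerator snowflakeIdGenerator;

    public CommentVO publishComment(CommentCreateRequest request) {
        // 1. Validate parameters
        validateRequest(request);

        // 2. Filter sensitive words
        String filteredContent = sensitiveWordFilter.filter(request.getContent());

        // 3. Generate a distributed ID (Snowflake algorithm)
        Long commentId = snowflakeIdGenerator.nextId();

        // 4. Build the comment object
        CommentDO comment = CommentDO.builder()
            .id(commentId)
            .objectId(request.getObjectId())
            .objectType(request.getObjectType())
            .parentId(request.getParentId())
            .userId(request.getUserId())
            .content(filteredContent)
            .createdAt(System.currentTimeMillis())
            .build();

        // 5. Write to the message queue (persisted to the database asynchronously)
        CommentEvent event = new CommentEvent(comment);
        mqTemplate.syncSend("comment_publish_topic", event);

        // 6. Update the cache: prepend to the hot comment list only if it is already cached,
        //    otherwise a cold key would end up holding just this one comment
        String cacheKey = buildCommentListCacheKey(comment.getObjectId());
        redisTemplate.opsForList().leftPushIfPresent(cacheKey, comment);
        redisTemplate.expire(cacheKey, 1, TimeUnit.HOURS);

        // 7. Update the cached counter
        String countKey = buildCommentCountCacheKey(comment.getObjectId());
        redisTemplate.opsForValue().increment(countKey);

        // 8. Return the result
        return convertToVO(comment);
    }
}

// Consumer: persists comments to the database (the consumerGroup name is an assumption)
@Component
@RocketMQMessageListener(topic = "comment_publish_topic", consumerGroup = "comment_publish_consumer")
public class CommentConsumer implements RocketMQListener<CommentEvent> {

    @Autowired
    private CommentMapper commentMapper;

    @Autowired
    private ElasticsearchService elasticsearchService;

    @Override
    public void onMessage(CommentEvent event) {
        // rocketmq-spring calls onMessage once per message; to insert in real batches,
        // buffer events here and flush them by size or by interval (not shown)
        List<CommentDO> comments = Collections.singletonList(event.getComment());

        // Insert into the database
        commentMapper.batchInsert(comments);

        // Update the Elasticsearch index
        elasticsearchService.bulkIndex(comments);
    }
}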
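
The article relies on a Snowflake-style generator for comment IDs without showing it. The following is a minimal sketch of that idea, assuming the common 41-bit timestamp / 10-bit worker ID / 12-bit sequence layout; the custom epoch and the class shape are illustrative assumptions rather than the article's actual implementation.

```java
/** Minimal Snowflake-style ID generator: 41-bit timestamp, 10-bit worker ID, 12-bit sequence. */
final class SnowflakeIdGenerator {
    private static final long EPOCH_MS = 1704067200000L; // 2024-01-01T00:00:00Z, an arbitrary choice
    private static final long WORKER_BITS = 10L;
    private static final long SEQUENCE_BITS = 12L;
    private static final long MAX_WORKER_ID = (1L << WORKER_BITS) - 1;   // 1023
    private static final long SEQUENCE_MASK = (1L << SEQUENCE_BITS) - 1; // 4095

    private final long workerId;
    private long lastTimestamp = -1L;
    private long sequence = 0L;

    SnowflakeIdGenerator(long workerId) {
        if (workerId < 0 || workerId > MAX_WORKER_ID) {
            throw new IllegalArgumentException("workerId out of range: " + workerId);
        }
        this.workerId = workerId;
    }

    synchronized long nextId() {
        long now = System.currentTimeMillis();
        if (now < lastTimestamp) {
            // A real deployment needs a policy for clock rollback; this sketch simply fails.
            throw new IllegalStateException("Clock moved backwards");
        }
        if (now == lastTimestamp) {
            sequence = (sequence + 1) & SEQUENCE_MASK;
            if (sequence == 0) {
                // Sequence exhausted within this millisecond: wait for the next one.
                while ((now = System.currentTimeMillis()) <= lastTimestamp) {
                    // busy-wait
                }
            }
        } else {
            sequence = 0L;
        }
        lastTimestamp = now;
        return ((now - EPOCH_MS) << (WORKER_BITS + SEQUENCE_BITS))
                | (workerId << SEQUENCE_BITS)
                | sequence;
    }
}
```

IDs produced by one generator increase over time, which is what makes "ORDER BY id DESC" a reasonable stand-in for "newest first" in cursor pagination. Each service instance must be given a distinct worker ID.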

4.2 Querying the Comment List

User requests the comment list
            │
            ▼
┌────────────────────────┐
│ 1. Parse parameters    │  ← objectId, page or cursor, size, sortBy
└───────────┬────────────┘
            │
            ▼
┌────────────────────────┐
│ 2. Query cache         │
└───────────┬────────────┘
            │ Hit?
      ┌─────┴─────┐
      │ Yes       │ No
      ▼           ▼
┌──────────┐  ┌────────────────────────┐
│  Return  │  │ 3. Query database      │
└──────────┘  └───────────┬────────────┘
                          │
                          ▼
              ┌────────────────────────┐
              │ 4. Backfill cache      │
              └───────────┬────────────┘
                          │
                          ▼
              ┌────────────────────────┐
              │ 5. Return result       │
              └────────────────────────┘

Pagination query optimization:

// lastId is the ID of the last comment on the previous page (null for the first page).
// A cursor cannot be derived from a page number, so the client passes it back explicitly.
public PageResult<CommentVO> getCommentList(String objectId, Long lastId, int size, String orderBy) {
    String cacheKey = buildCommentListCacheKey(objectId, lastId, size, orderBy);

    // 1. Try the cache first
    List<CommentVO> cachedComments = redisTemplate.opsForList()
        .range(cacheKey, 0, -1);

    if (!CollectionUtils.isEmpty(cachedComments)) {
        return PageResult.of(cachedComments);
    }

    // 2. Cache miss: query the database
    // Optimization 1: cursor pagination (avoids the cost of a large OFFSET)
    Page<CommentDO> pageResult = commentMapper.selectByObjectId(
        objectId, lastId, size, orderBy
    );

    // 3. Fetch user info in one batch (avoids the N+1 query problem)
    List<Long> userIds = pageResult.getRecords().stream()
        .map(CommentDO::getUserId)
        .distinct()
        .collect(Collectors.toList());

    Map<Long, UserInfo> userInfoMap = userService.batchGetUserInfo(userIds);

    // 4. Convert to VOs
    List<CommentVO> vos = pageResult.getRecords().stream()
        .map(comment -> {
            CommentVO vo = convertToVO(comment);
            vo.setUserInfo(userInfoMap.get(comment.getUserId()));
            return vo;
        })
        .collect(Collectors.toList());

    // 5. Backfill the cache (RPUSH rejects an empty value list, so skip empty pages)
    if (!vos.isEmpty()) {
        redisTemplate.opsForList().rightPushAll(cacheKey, vos);
        redisTemplate.expire(cacheKey, 1, TimeUnit.HOURS);
    }

    return PageResult.of(vos, pageResult.getTotal());
}

// Cursor pagination SQL (the id condition is omitted for the first page)
// SELECT * FROM comment WHERE object_id = #{objectId} AND id < #{lastId} ORDER BY id DESC LIMIT #{size}

4.3 Liking a Comment

public LikeResult likeComment(Long commentId, Long userId) {
    String lockKey = "like_lock:" + commentId + ":" + userId;

    // 1. Short-lived distributed lock to reject concurrent duplicate requests
    Boolean locked = redisTemplate.opsForValue()
        .setIfAbsent(lockKey, "1", 5, TimeUnit.SECONDS);

    if (!Boolean.TRUE.equals(locked)) {
        return LikeResult.of(false, "Too many requests, please try again later");
    }

    try {
        // 2. Check whether the user has already liked this comment
        String likeKey = "comment_like:" + commentId + ":" + userId;
        boolean isLiked = redisTemplate.opsForValue().get(likeKey) != null;

        if (isLiked) {
            return LikeResult.of(false, "You have already liked this comment");
        }

        // 3. Record the like (written to the cache here, persisted asynchronously below)
        redisTemplate.opsForValue().set(likeKey, "1", 30, TimeUnit.DAYS);

        // 4. Update the like counter (atomic Redis increment)
        String countKey = "comment_like_count:" + commentId;
        Long newCount = redisTemplate.opsForValue().increment(countKey);

        // 5. Publish the like event (persisted to the database asynchronously)
        mqTemplate.convertAndSend("comment_like_topic", new LikeEvent(commentId, userId));

        return LikeResult.of(true, newCount);

    } finally {
        // The lock key is scoped to one user and one comment, so releasing it here
        // only affects that user's own retries
        redisTemplate.delete(lockKey);
    }
}

5. Performance Optimization Strategies

5.1 Cache Design

Multi-level cache architecture (the percentages are the approximate share of requests served at each level):

┌─────────────────────────────────────────────────────┐
│                       Request                       │
└─────────────────────────┬───────────────────────────┘
                          │
┌─────────────────────────▼───────────────────────────┐
│   L1: Local cache (Caffeine)                        │
│   - Hot-spot data, 5 s TTL                          │
│   - Serves ~30% of requests                         │
└─────────────────────────┬───────────────────────────┘
                          │ Miss
┌─────────────────────────▼───────────────────────────┐
│   L2: Redis cluster                                 │
│   - Hot data, 1 h TTL                               │
│   - Serves ~60% of requests                         │
└─────────────────────────┬───────────────────────────┘
                          │ Miss
┌─────────────────────────▼───────────────────────────┐
│   L3: Database                                      │
│   - Full data set                                   │
│   - Serves ~10% of requests                         │
└─────────────────────────────────────────────────────┘
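
The read path through these levels can be sketched as below. Plain in-memory maps stand in for Caffeine and Redis so that the example stays self-contained; the class TwoLevelCache, its counters, and expireLocal() are assumptions made for illustration only.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

/** Read-through lookup across two cache levels; plain maps stand in for Caffeine and Redis. */
final class TwoLevelCache<K, V> {
    private final Map<K, V> local = new ConcurrentHashMap<>();  // L1 stand-in
    private final Map<K, V> remote = new ConcurrentHashMap<>(); // L2 stand-in
    private final Function<K, V> database;                      // L3 loader

    // Counters for illustration only; they are not thread-safe.
    int localHits;
    int remoteHits;
    int databaseLoads;

    TwoLevelCache(Function<K, V> database) {
        this.database = database;
    }

    V get(K key) {
        V value = local.get(key);
        if (value != null) {
            localHits++;
            return value;
        }
        value = remote.get(key);
        if (value != null) {
            remoteHits++;
            local.put(key, value); // promote to L1
            return value;
        }
        value = database.apply(key);
        databaseLoads++;
        if (value != null) {
            remote.put(key, value); // backfill L2, then L1
            local.put(key, value);
        }
        return value;
    }

    /** Simulates L1 expiry (the 5 s TTL in the diagram). */
    void expireLocal() {
        local.clear();
    }
}
```

The short L1 TTL bounds how stale a comment list can be on any one instance, while the longer L2 TTL keeps most misses away from the database.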

Cache warm-up:

@Component
public class CommentCacheWarmer implements ApplicationRunner {

    @Autowired
    private CommentService commentService;

    // Statistics source used below; the type is assumed and not shown in this article
    @Autowired
    private ClickHouseService clickHouseService;

    @Override
    public void run(ApplicationArguments args) {
        // Load the comments of hot videos at startup
        List<String> hotObjectIds = getHotObjectIds();

        for (String objectId : hotObjectIds) {
            // Warm up asynchronously
            CompletableFuture.runAsync(() -> {
                commentService.preloadCommentList(objectId);
            });
        }
    }

    private List<String> getHotObjectIds() {
        // Take the top 100 objects by views / comment count from the statistics data
        return clickHouseService.getTopObjects(100);
    }
}

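5.2 Bloom Filter Against Cache Penetration

@Component
public class BloomFilterService {

    @Autowired
    private RedissonClient redissonClient;

    private RBloomFilter<String> bloomFilter;

    @PostConstruct
    public void init() {
        RBloomFilter<String> filter = redissonClient.getBloomFilter("comment_bf");
        // The filter stores object IDs, so size it by the number of commented objects.
        // The original estimate (500 million entries at a 1% error rate) needs about
        // 4.8 billion bits, more than the 2^32 bits a single Redis string can hold,
        // so a filter that large would have to be split across several keys.
        // 100 million is an assumed figure for this example.
        filter.tryInit(100000000L, 0.01);
        bloomFilter = filter;
    }

    public boolean mightExist(String objectId) {
        return bloomFilter.contains(objectId);
    }

    @EventListener
    public void onCommentCreated(CommentCreateEvent event) {
        // Add the object to the Bloom filter when a new comment is posted
        bloomFilter.add(event.getObjectId());
    }
}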
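
To make the guarantee behind this section concrete, here is a minimal in-memory Bloom filter. It is a teaching sketch and not how Redisson implements RBloomFilter; the class name, the double-hashing scheme, and the chosen hash functions are assumptions. The property that matters for penetration protection is that a key that was added is never reported as absent.

```java
import java.nio.charset.StandardCharsets;
import java.util.BitSet;

/** Minimal in-memory Bloom filter using double hashing (illustrative only). */
final class SimpleBloomFilter {
    private final BitSet bits;
    private final int size;
    private final int hashCount;

    SimpleBloomFilter(int size, int hashCount) {
        this.bits = new BitSet(size);
        this.size = size;
        this.hashCount = hashCount;
    }

    void add(String key) {
        for (int i = 0; i < hashCount; i++) {
            bits.set(index(key, i));
        }
    }

    /** false means "definitely absent"; true means "possibly present". */
    boolean mightContain(String key) {
        for (int i = 0; i < hashCount; i++) {
            if (!bits.get(index(key, i))) {
                return false;
            }
        }
        return true;
    }

    // i-th bit position derived from two base hashes: h1 + i * h2 (mod size).
    private int index(String key, int i) {
        int h1 = key.hashCode();
        int h2 = fnv1a(key);
        return Math.floorMod(h1 + i * h2, size);
    }

    // 32-bit FNV-1a hash over the UTF-8 bytes of the key.
    private static int fnv1a(String key) {
        int hash = 0x811C9DC5;
        for (byte b : key.getBytes(StandardCharsets.UTF_8)) {
            hash ^= (b & 0xFF);
            hash *= 16777619;
        }
        return hash;
    }
}
```

A request for an object whose ID fails mightContain can be answered with an empty list without touching Redis or the database. A false positive only costs one unnecessary lookup.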

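5.3 Deep Pagination Optimization

Problem: LIMIT 20 OFFSET 10000 has to scan 10,020 rows and throw away the first 10,000

Solution: cursor pagination

-- ❌ Traditional pagination (slow when OFFSET is large)
SELECT * FROM comment 
WHERE object_id = 'video_123' 
ORDER BY created_at DESC 
LIMIT 20 OFFSET 10000;

-- ✅ Cursor pagination (based on the last row of the previous page)
-- created_at is not unique, so id is used as a tie-breaker; the literal values are examples
SELECT * FROM comment 
WHERE object_id = 'video_123' 
  AND (created_at < '2024-01-01 12:00:00'
       OR (created_at = '2024-01-01 12:00:00' AND id < 987654321))
ORDER BY created_at DESC, id DESC 
LIMIT 20;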
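
A cursor on created_at alone can skip rows that share a timestamp with the last row of the previous page. The sketch below shows cursor pagination with a composite (created_at, id) cursor, run against an in-memory list instead of a database; KeysetPager and Row are hypothetical names introduced for this example.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

/** In-memory illustration of cursor pagination with a composite (createdAt, id) cursor. */
final class KeysetPager {
    /** Hypothetical row type holding only the columns the cursor needs. */
    static final class Row {
        final long id;
        final long createdAt;

        Row(long id, long createdAt) {
            this.id = id;
            this.createdAt = createdAt;
        }
    }

    /** Newest first; id breaks ties between rows with the same createdAt. */
    static final Comparator<Row> NEWEST_FIRST =
            Comparator.comparingLong((Row r) -> r.createdAt)
                    .thenComparingLong(r -> r.id)
                    .reversed();

    /**
     * Equivalent in spirit to:
     *   WHERE created_at < :t OR (created_at = :t AND id < :id)
     *   ORDER BY created_at DESC, id DESC LIMIT :size
     * Pass null as the cursor to get the first page.
     */
    static List<Row> nextPage(List<Row> all, Row cursor, int size) {
        List<Row> sorted = new ArrayList<>(all);
        sorted.sort(NEWEST_FIRST);
        List<Row> page = new ArrayList<>();
        for (Row row : sorted) {
            if (cursor != null && NEWEST_FIRST.compare(row, cursor) <= 0) {
                continue; // at or before the cursor: already returned on an earlier page
            }
            page.add(row);
            if (page.size() == size) {
                break;
            }
        }
        return page;
    }
}
```

With three rows sharing one timestamp and a page size of two, the composite cursor returns the third row on the next page, whereas a timestamp-only condition would drop it. In the database, the same ordering needs an index that ends in (created_at, id) to stay cheap.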

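5.4 Hot-Spot Data Isolation

When a video goes viral, its comment section can face extremely high concurrency. The approach:

┌─────────────────────────────────────────────────────────────────────┐
│                          Hotspot detection                          │
│  - Track QPS in real time                                           │
│  - Mark objects above the threshold as "hot"                        │
└──────────────────────────────────┬──────────────────────────────────┘
                                   │
           ┌───────────────────────┼───────────────────────┐
           │                       │                       │
┌──────────▼──────────┐ ┌──────────▼──────────┐ ┌──────────▼──────────┐
│ Hot-comment service │ │ Normal service      │ │ Degraded service    │
│ - Dedicated cluster │ │ - Normal handling   │ │ - Read-only mode    │
│ - Read replicas     │ │                     │ │ - Rate limiting     │
└─────────────────────┘ └─────────────────────┘ └─────────────────────┘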
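
The detection step can be approximated with a per-object counter over one-second windows, as in the sketch below. The fixed-window approach, the class name HotspotDetector, and passing the clock in as a parameter are assumptions chosen to keep the example small and testable; a production system would more likely use a sliding window or an approximate top-K structure.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

/** Fixed one-second window request counter per object (illustrative only). */
final class HotspotDetector {
    private static final class Window {
        long second; // which one-second window the count belongs to
        long count;
    }

    private final long threshold; // requests per second above which an object counts as hot
    private final Map<String, Window> windows = new ConcurrentHashMap<>();

    HotspotDetector(long threshold) {
        this.threshold = threshold;
    }

    /** Records one request and reports whether the object is hot in the current window. */
    boolean recordAndCheck(String objectId, long nowMillis) {
        long second = nowMillis / 1000;
        Window window = windows.computeIfAbsent(objectId, k -> new Window());
        synchronized (window) {
            if (window.second != second) {
                window.second = second; // a new window starts: reset the counter
                window.count = 0;
            }
            window.count++;
            return window.count > threshold;
        }
    }
}
```

A gateway or service filter could call recordAndCheck on each read and route the request to the hot-comment cluster when it returns true.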

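6. Data Consistency

6.1 Counter Consistency

Comment counts and like counts have fairly strict consistency requirements. The approach is to write to both the DB and Redis and to reconcile them periodically:

┌─────────────────────────────────────────────────────────┐
│                       Write path                        │
├─────────────────────────────────────────────────────────┤
│  1. Update the DB first (inside a transaction)          │
│  2. Send an MQ message                                  │
│  3. A consumer updates the Redis counter                │
└─────────────────────────────────────────────────────────┘

┌─────────────────────────────────────────────────────────┐
│              Reconciliation (every minute)              │
├─────────────────────────────────────────────────────────┤
│  1. Read the true count from the DB                     │
│  2. Read the cached count from Redis                    │
│  3. If the gap exceeds a threshold, fix Redis from DB   │
│  4. Log the discrepancy and raise an alert              │
└─────────────────────────────────────────────────────────┘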
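
The reconciliation step can be sketched as a pure function over two maps, with the DB as the source of truth. The names CountReconciler and reconcile, the use of in-memory maps in place of real DB and Redis reads, and the choice to leave uncached keys alone are all assumptions made for this example.

```java
import java.util.HashMap;
import java.util.Map;

/** Repairs cached counters from database counters when they drift too far apart. */
final class CountReconciler {
    /**
     * Overwrites entries of cacheCounts whose gap to dbCounts exceeds the tolerance.
     * Keys missing from the cache are skipped; the read path is expected to backfill them.
     *
     * @return the entries that were repaired (comment ID to corrected count)
     */
    static Map<Long, Long> reconcile(Map<Long, Long> dbCounts,
                                     Map<Long, Long> cacheCounts,
                                     long tolerance) {
        Map<Long, Long> repaired = new HashMap<>();
        for (Map.Entry<Long, Long> entry : dbCounts.entrySet()) {
            Long cached = cacheCounts.get(entry.getKey());
            if (cached == null) {
                continue; // not cached, nothing to repair
            }
            long dbValue = entry.getValue();
            if (Math.abs(cached - dbValue) > tolerance) {
                cacheCounts.put(entry.getKey(), dbValue); // the DB value wins
                repaired.put(entry.getKey(), dbValue);
            }
        }
        return repaired;
    }
}
```

The returned map is what a scheduled job would log and use to decide whether to raise an alert.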

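6.2 Eventual Consistency

For scenarios that do not need strong consistency (such as a comment's ranking weight), eventual consistency is enough:

// Update a comment's heat score (eventual consistency)
public void updateCommentHeat(Long commentId) {
    // 1. Update the local cache
    localCache.incrementHeat(commentId, 1);

    // 2. Queue the change for the next batch (flushed once per second below)
    batchSubmitter.add(commentId);
}

// Batch update
@Scheduled(fixedDelay = 1000)
public void batchUpdate() {
    Map<Long, Integer> heatChanges = batchSubmitter.flush();

    // Apply the accumulated changes to the database in one batch
    commentMapper.batchIncrementHeat(heatChanges);
}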
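
The batchSubmitter used above is not shown in the article. One plausible shape, assuming it only needs to accumulate per-comment increments between flushes, is sketched below; the class name HeatBatchSubmitter is hypothetical.

```java
import java.util.HashMap;
import java.util.Map;

/** Accumulates per-comment heat increments until the next flush. */
final class HeatBatchSubmitter {
    private Map<Long, Integer> pending = new HashMap<>();

    /** Records one heat increment for the given comment. */
    synchronized void add(long commentId) {
        pending.merge(commentId, 1, Integer::sum);
    }

    /** Returns the accumulated increments and starts a fresh buffer. */
    synchronized Map<Long, Integer> flush() {
        Map<Long, Integer> snapshot = pending;
        pending = new HashMap<>();
        return snapshot;
    }
}
```

Merging increments in memory turns many single-row updates into one batched statement per second. The trade-off is that increments still in the buffer are lost if the process crashes, which is acceptable for a ranking weight but not for data that must be exact.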

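7. Monitoring and Alerting

7.1 Key Metrics

Metric	Threshold	Notes
Comment publish QPS	> 10000	Write pressure
Comment query QPS	> 50000	Read pressure
Publish latency P99	< 500ms	User experience
Query latency P99	< 200ms	User experience
Cache hit rate	> 80%	Cache effectiveness
MySQL slow queries	< 10/min	Database health
Message queue backlog	< 10000	Consumer capacity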

7.2 Alert Rules

Alert rules:
  - name: comment_publish_high_latency
    condition: p99_latency > 1000ms
    duration: 5m
    severity: P1
    action: DingTalk + phone call
    
  - name: mq_comment_backlog
    condition: backlog > 100000
    duration: 10m
    severity: P2
    action: DingTalk
    
  - name: cache_hit_rate_low
    condition: hit_rate < 60%
    duration: 15m
    severity: P3
    action: WeCom

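8. Summary and Outlook

8.1 Architecture Evolution Path

Stage	Data volume	Architecture	Main pain point
Startup	< 100,000	Single database, single table	None
Growth	1 million - 10 million	Read/write splitting + Redis	Single-table bottleneck
Rapid growth	10 million - 100 million	Sharding + MQ + ES	Sharding complexity
Mature	> 100 million	TiDB + hot/cold separation	Operations cost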

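8.2 Key Design Points

Read/write splitting: writes go to the primary, reads go to replicas or the cache
Asynchronous decoupling: keep the core flow synchronous and move non-core work to asynchronous processing
Hot/cold separation: high-performance storage for hot data, low-cost storage for cold data
Sharding: choose the shard key carefully and leave room for expansion
Multi-level cache: local cache + Redis cache + database
Observability: thorough instrumentation, monitoring, and alerting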

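8.3 Future Directions

AI comment moderation: use large models to detect violating content automatically
Real-time recommendation: rank comments by user interest
Multimodal comments: support image, video, and voice comments
Decentralized comments: blockchain-based proof of existence for comments

There is no silver bullet for designing a massive-scale comment system; it requires trade-offs among consistency, availability, performance, and cost. The best architecture is not built in one step but evolves as the business grows. Hopefully this article offers some useful reference points for designing a high-concurrency comment system.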
