Post 1732: The Cold-Start Problem for AI Products: How to Make Recommendations and Generation Meaningful Without Data

Lao Zhang · 2026/4/30 · about 10 min read

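The previous post covered the data flywheel, and a reader left a comment asking: the flywheel sounds great, but our product has just launched and doesn't have a single user, so how do we cold start?

That question hits the nail on the head. Cold start is the data flywheel's fatal weakness. It is a chicken-and-egg problem: no users means no data, no data means the model performs poorly, and a poor model cannot retain users.

Today we'll go through cold-start approaches for AI products systematically. No theory, just the pitfalls I have actually hit and the methods I have actually used.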

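The Forms Cold Start Takes

Cold start for AI products falls into roughly three cases, and the solutions are not quite the same.

Type 1: a brand-new recommendation system. A user shows up, we know nothing about them, and there is no historical data from other users either. This is the worst case.

Type 2: user data exists, but no AI-labeled data. For example, the product used to be a rule-based system that accumulated a large volume of user interaction logs, but those logs carry no labels that are useful for training.

Type 3: a base model exists and needs business adaptation. For example, you use an open-source LLM, but it performs poorly on your vertical scenario and needs domain adaptation.

I have run into all three. Let's take them one at a time.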

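Case 1: Cold Start for Recommendation Systems

New users are what an AI recommender fears most. You know nothing about them, so what do you recommend?

Strategy 1: Use content quality as the initial ranking signal

When there is no user data, fall back on quality signals from the content itself in place of personalized recommendation.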

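import java.time.Duration;
import java.util.Comparator;
import java.util.List;
import java.util.stream.Collectors;

import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.stereotype.Service;

// ContentItem, ContentQualityRepository and ContentQualityFilter are project classes.
// hasUserSeen and getCategoryHeatScore are helper methods omitted from this excerpt.
@Service
public class ColdStartRecommender {
    
    @Autowired
    private ContentQualityRepository qualityRepo;
    
    /**
     * Recommendation logic for new users: relies on the intrinsic quality
     * of the content rather than on user history.
     */
    public List<ContentItem> recommendForNewUser(String userId, int size) {
        
        // Quality score = intrinsic content quality, combined from several signals
        return qualityRepo.findTopQualityContent(
            ContentQualityFilter.builder()
                .minQualityScore(7.0)
                .publishedWithin(Duration.ofDays(30)) // skip content that is too old
                .deduplicateCategory(true)             // de-duplicate by category to avoid a single-topic feed
                .size(size * 2)                        // fetch extra so enough remain after filtering
                .build()
        ).stream()
            .filter(item -> !hasUserSeen(userId, item.getId())) // drop items the user has already seen
            .sorted(Comparator.comparingDouble(this::calculateColdStartScore).reversed())
            .limit(size)
            .collect(Collectors.toList());
    }
    
    private double calculateColdStartScore(ContentItem item) {
        // content quality * freshness decay * category heat
        double qualityScore = item.getQualityScore();
        double freshnessDecay = Math.exp(-0.1 * item.getDaysOld()); // exponential decay
        double categoryHeat = getCategoryHeatScore(item.getCategory());
        
        return qualityScore * freshnessDecay * categoryHeat;
    }
}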
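
To get a feel for how aggressive the decay rate of 0.1 per day is, here is a minimal sketch that reproduces the score formula with plain numbers. The class name ColdStartScoreSketch and its methods are illustrative only and are not part of the service above.

```java
final class ColdStartScoreSketch {

    // Decay rate per day, the same constant used in calculateColdStartScore.
    static final double DECAY_RATE = 0.1;

    /** Exponential freshness decay: 1.0 for brand-new content, approaching 0 with age. */
    static double freshnessDecay(double daysOld) {
        return Math.exp(-DECAY_RATE * daysOld);
    }

    /** Number of days after which the freshness factor has halved: ln(2) / rate. */
    static double halfLifeDays() {
        return Math.log(2.0) / DECAY_RATE;
    }

    /** Same product as calculateColdStartScore, with plain numbers instead of a ContentItem. */
    static double score(double quality, double daysOld, double categoryHeat) {
        return quality * freshnessDecay(daysOld) * categoryHeat;
    }
}
```

With this rate the freshness factor halves roughly every 6.9 days, and an item at the edge of the 30-day window keeps only about 5% of its quality score. A one-day-old item with quality 7.5 therefore outranks a ten-day-old item with quality 9.0 in the same category. If that favors new content too strongly for your catalog, the decay rate is the knob to lower.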

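Strategy 2: Collect user preferences at sign-up

This sounds old-fashioned, but it works. Add a few multiple-choice questions to the sign-up flow, such as "Which areas interest you most?" The front end offers 5 to 8 options, the user picks 1 to 3, and we immediately have an initial preference vector.

The key is to make it feel fun rather than like filling out a form: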

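import java.util.List;

import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.web.bind.annotation.PostMapping;
import org.springframework.web.bind.annotation.RequestBody;
import org.springframework.web.bind.annotation.RequestHeader;
import org.springframework.web.bind.annotation.RequestMapping;
import org.springframework.web.bind.annotation.RestController;

@RestController
@RequestMapping("/api/onboarding")
public class OnboardingController {
    
    @Autowired
    private UserPreferenceService preferenceService;
    
    @Autowired
    private RecommendService recommendService;
    
    /**
     * Once the user has picked initial interests, generate personalized
     * recommendations right away.
     */
    @PostMapping("/interests")
    public OnboardingResult submitInterests(
            @RequestHeader("X-User-Id") String userId,
            @RequestBody InterestSelectionRequest request) {
        
        // Require at least one selection
        if (request.getInterestIds() == null || request.getInterestIds().isEmpty()) {
            throw new BadRequestException("Please pick at least one area of interest");
        }
        
        // Initialize the user's preference vector
        preferenceService.initUserPreferences(userId, request.getInterestIds());
        
        // Show interest-based recommendations immediately so the user can feel
        // that personalization is already working
        List<ContentItem> initialRecommendations = recommendService
            .recommendByInterests(request.getInterestIds(), 10);
        
        return OnboardingResult.builder()
            .message("Great, we have already found content for you!")
            .recommendations(initialRecommendations)
            .build();
    }
}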
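
The post does not show what initUserPreferences stores. One simple possibility, sketched below under that caveat, is a normalized weight per interest in the catalog: selected interests split a total weight of 1.0 and everything else starts at 0. PreferenceVectorSketch and initialVector are hypothetical names, and the sketch assumes the catalog contains no duplicate entries.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

final class PreferenceVectorSketch {

    /**
     * Builds one weight per catalog interest. Selected interests share a total
     * weight of 1.0; unselected interests get 0.0. Selections that are not in
     * the catalog are ignored.
     */
    static Map<String, Double> initialVector(List<String> catalog, Set<String> selected) {
        long matched = catalog.stream().filter(selected::contains).count();
        Map<String, Double> vector = new LinkedHashMap<>(); // keeps catalog order
        for (String interest : catalog) {
            boolean chosen = selected.contains(interest);
            vector.put(interest, chosen && matched > 0 ? 1.0 / matched : 0.0);
        }
        return vector;
    }
}
```

Normalizing keeps a user who picks three interests on the same scale as a user who picks one, which makes later updates from real behavior easier to blend in.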

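Strategy 3: A "soft start" version of collaborative filtering

Collaborative filtering needs a lot of data, but there is a weakened version: collaborative inference based on content similarity.

The idea: find the "user group" profile closest to the preferences the new user selected, and recommend from that group's historical behavior. Even without any history for this new user, the group's history provides a starting signal. This only works once at least some earlier users have left behavior from which the groups can be built.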

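import java.util.List;
import java.util.stream.Collectors;

import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.stereotype.Service;

// hasUserSeen is a helper method omitted from this excerpt.
@Service
public class SoftCollaborativeFilter {
    
    @Autowired
    private UserClusterRepository clusterRepo;
    
    // Needed by recommendByClusterProxy; the original listing used it without declaring it
    @Autowired
    private UserPreferenceService preferenceService;
    
    /**
     * Map the new user to the nearest user cluster and borrow the cluster's history.
     */
    public List<ContentItem> recommendByClusterProxy(String userId) {
        
        // Initial preference tags for the user (from the sign-up selection)
        List<String> userTags = preferenceService.getUserTags(userId);
        
        // Find the cluster with the highest tag overlap
        UserCluster closestCluster = clusterRepo.findMostSimilarCluster(userTags);
        
        // Recommend the content this cluster likes most
        return closestCluster.getTopContentItems().stream()
            .filter(item -> !hasUserSeen(userId, item.getId()))
            .limit(10)
            .collect(Collectors.toList());
    }
}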
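
How findMostSimilarCluster measures "tag overlap" is not shown. A common choice is Jaccard similarity between tag sets, and the sketch below assumes exactly that. ClusterMatchSketch, jaccard and mostSimilarCluster are illustrative names, not the repository's real API.

```java
import java.util.HashSet;
import java.util.Map;
import java.util.Optional;
import java.util.Set;

final class ClusterMatchSketch {

    /** Jaccard similarity: |A ∩ B| / |A ∪ B|, defined here as 0.0 when both sets are empty. */
    static double jaccard(Set<String> a, Set<String> b) {
        if (a.isEmpty() && b.isEmpty()) return 0.0;
        Set<String> intersection = new HashSet<>(a);
        intersection.retainAll(b);
        int unionSize = a.size() + b.size() - intersection.size();
        return (double) intersection.size() / unionSize;
    }

    /**
     * Returns the id of the cluster whose tags overlap most with the user's tags,
     * or an empty Optional when no cluster shares any tag with the user.
     */
    static Optional<String> mostSimilarCluster(Set<String> userTags,
                                               Map<String, Set<String>> clusterTags) {
        String bestId = null;
        double bestScore = 0.0;
        for (Map.Entry<String, Set<String>> entry : clusterTags.entrySet()) {
            double score = jaccard(userTags, entry.getValue());
            if (score > bestScore) {
                bestScore = score;
                bestId = entry.getKey();
            }
        }
        return Optional.ofNullable(bestId);
    }
}
```

Returning an empty Optional when nothing overlaps gives the caller a clean point to fall back to the quality-based ranking from Strategy 1 instead of recommending from an arbitrary cluster.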

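Case 2: Logs Without Labels, and How to Build a Training Set Quickly

This scenario is common when building vertical-domain AI products. The company has had a rule-based system running for years and has accumulated several million user Q&A log entries, but with no corresponding quality labels.

Training on them directly clearly will not work, but labeling everything by hand costs too much.

Approach: weak supervision + confidence filtering

The idea is this: first generate "weak labels" with business rules, then train a classifier on the weak labels, and use the samples the classifier is highly confident about as the trusted training set. The labeler below covers the weak-labeling step and filters on how strongly the votes agree.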

import java.util.ArrayList;
import java.util.List;

import org.springframework.stereotype.Service;

// containsPositiveKeywords and containsNegativeKeywords are helper methods omitted from this excerpt.
@Service
public class WeakSupervisionLabeler {
    
    /**
     * Label historical logs with several weak labeling functions.
     * The functions are independent of each other; the final label
     * is decided by majority vote.
     */
    public LabelResult weakLabel(ConversationLog log) {
        
        List<WeakLabel> votes = new ArrayList<>();
        
        // Labeling function 1: follow-up behavior (continued follow-up questions suggest the answer was useful)
        if (log.getFollowUpCount() >= 2) {
            votes.add(WeakLabel.POSITIVE);
        } else if (log.isAbandoned()) {
            votes.add(WeakLabel.NEGATIVE);
        } else {
            votes.add(WeakLabel.ABSTAIN);
        }
        
        // Labeling function 2: session duration
        if (log.getSessionDurationMs() > 180_000) { // more than 3 minutes
            votes.add(WeakLabel.POSITIVE);
        } else if (log.getSessionDurationMs() < 5_000) { // less than 5 seconds
            votes.add(WeakLabel.NEGATIVE);
        } else {
            votes.add(WeakLabel.ABSTAIN);
        }
        
        // Labeling function 3: keyword matching (rules from domain experts)
        if (containsPositiveKeywords(log.getResponse())) {
            votes.add(WeakLabel.POSITIVE);
        } else if (containsNegativeKeywords(log.getResponse())) {
            votes.add(WeakLabel.NEGATIVE);
        } else {
            votes.add(WeakLabel.ABSTAIN);
        }
        
        // Labeling function 4: explicit user rating, when one exists
        if (log.getUserRating() != null) {
            if (log.getUserRating() >= 4) {
                votes.add(WeakLabel.POSITIVE);
                votes.add(WeakLabel.POSITIVE); // counted twice: double weight
            } else if (log.getUserRating() <= 2) {
                votes.add(WeakLabel.NEGATIVE);
                votes.add(WeakLabel.NEGATIVE); // counted twice: double weight
            }
        }
        
        // Tally the votes
        long positiveCount = votes.stream().filter(v -> v == WeakLabel.POSITIVE).count();
        long negativeCount = votes.stream().filter(v -> v == WeakLabel.NEGATIVE).count();
        long totalVotes = votes.stream().filter(v -> v != WeakLabel.ABSTAIN).count();
        
        if (totalVotes == 0) {
            return LabelResult.abstain();
        }
        
        // Confidence = share of non-abstaining votes that agree with the majority
        double confidence = (double) Math.max(positiveCount, negativeCount) / totalVotes;
        WeakLabel label = positiveCount > negativeCount ? WeakLabel.POSITIVE : WeakLabel.NEGATIVE;
        
        // Discard anything with confidence below 0.7 (this also covers ties, which score 0.5)
        if (confidence < 0.7) {
            return LabelResult.abstain();
        }
        
        return LabelResult.labeled(label, confidence);
    }
}

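After this processing, you can typically filter 30%-40% of the raw logs into a trusted training set. That is enough to give the first model version a decent starting point.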
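
To make the 0.7 cut-off concrete, here is a small standalone sketch of the same agreement ratio the labeler computes. VoteAgreementSketch and its Vote enum are illustrative stand-ins for the WeakLabel type, not the real classes.

```java
import java.util.List;

final class VoteAgreementSketch {

    enum Vote { POSITIVE, NEGATIVE, ABSTAIN }

    /** Share of non-abstaining votes that agree with the majority; 0.0 when every vote abstains. */
    static double confidence(List<Vote> votes) {
        long positive = votes.stream().filter(v -> v == Vote.POSITIVE).count();
        long negative = votes.stream().filter(v -> v == Vote.NEGATIVE).count();
        long total = positive + negative;
        return total == 0 ? 0.0 : (double) Math.max(positive, negative) / total;
    }

    /** Mirrors the labeler's filter: keep the sample only at or above the threshold. */
    static boolean kept(List<Vote> votes, double threshold) {
        return confidence(votes) >= threshold;
    }
}
```

Two positives against one negative give 2/3, which is about 0.67, so the sample is dropped. Three against one give 0.75 and the sample is kept. Note one consequence of the ratio: a single vote with every other function abstaining scores 1.0 and passes. If that proves too noisy in your logs, requiring a minimum number of non-abstaining votes is one possible tightening, though the post does not say whether the author does this.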

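Case 3: Cold Start for Domain Adaptation with Synthetic Data

Sometimes you have a base model but truly not a single piece of domain data, not even historical logs. In that case you need to synthesize training data.

The core idea is to use a large model to generate training data for a model. It sounds circular, but in practice you use a general-purpose large model (GPT-4, for example) to generate Q&A pairs for a specific scenario, then fine-tune a smaller domain model on that synthetic data.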

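import java.time.Instant;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

import org.slf4j.Logger;
import org.slf4j.LoggerFactory;
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.beans.factory.annotation.Value;
import org.springframework.stereotype.Service;

@Service
public class SyntheticDataGenerator {
    
    // The original listing called log.warn without declaring a logger
    private static final Logger log = LoggerFactory.getLogger(SyntheticDataGenerator.class);
    
    @Autowired
    private OpenAIClient openAIClient; // or any other strong model
    
    @Value("${domain.description}")
    private String domainDescription;
    
    /**
     * Generate training samples in bulk for a specific domain.
     */
    public List<TrainingSample> generateSamples(
            String topicArea, 
            int targetCount) {
        
        List<TrainingSample> samples = new ArrayList<>();
        
        // First have the large model produce a diverse set of questions
        List<String> questions = generateDiverseQuestions(topicArea, targetCount * 2);
        
        for (String question : questions) {
            try {
                // Then have the large model write a high-quality answer
                String answer = generateHighQualityAnswer(question);
                
                // Quality filter
                if (isQualitySufficient(question, answer)) {
                    samples.add(TrainingSample.builder()
                        .question(question)
                        .answer(answer)
                        .source("synthetic")
                        .topicArea(topicArea)
                        .generatedAt(Instant.now())
                        .build());
                }
                
                if (samples.size() >= targetCount) break;
                
            } catch (Exception e) {
                log.warn("Sample generation failed, skipping: {}", question, e);
            }
        }
        
        return samples;
    }
    
    private List<String> generateDiverseQuestions(String topicArea, int count) {
        // The prompt stays in Chinese because isQualitySufficient matches Chinese refusal phrases.
        // It asks a domain-expert persona for `count` realistic user questions on the topic:
        // varied from beginner to professional, covering conceptual, how-to and troubleshooting
        // questions, in a colloquial style, one per line, unnumbered.
        String prompt = String.format("""
            你是一个%s领域的专家。
            请生成%d个关于"%s"的真实用户问题。
            要求：
            1. 问题要多样化，覆盖初学者到专业人员不同层次
            2. 包含概念性问题、操作性问题和故障排查类问题
            3. 使用真实用户的语言风格，不要太书面化
            4. 每行一个问题，不要编号
            
            直接输出问题列表：
            """, domainDescription, count, topicArea);
        
        String response = openAIClient.complete(prompt);
        return Arrays.stream(response.split("\n"))
            .map(String::trim)
            .filter(s -> !s.isEmpty())
            .collect(Collectors.toList());
    }
    
    private String generateHighQualityAnswer(String question) {
        // Prompt (Chinese): answer accurately and professionally, with clear structure,
        // examples for abstract concepts, and a moderate length of 200-500 characters.
        String prompt = String.format("""
            请回答以下%s领域的问题，要求：
            1. 回答准确、专业
            2. 结构清晰，适当分段
            3. 举例说明抽象概念
            4. 长度适中（200-500字）
            
            问题：%s
            """, domainDescription, question);
        
        return openAIClient.complete(prompt);
    }
    
    private boolean isQualitySufficient(String question, String answer) {
        // Basic quality checks
        if (answer.length() < 100) return false;
        // Reject refusals: "我无法" means "I cannot", "抱歉" means "sorry"
        if (answer.contains("我无法") || answer.contains("抱歉")) return false;
        if (question.length() < 10) return false;
        return true;
    }
}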

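A few things to watch with synthetic data:

Diversity matters more than volume. Generating 1,000 distinct, high-quality samples beats generating 10,000 highly repetitive ones by a wide margin.
Include negative samples. Generate not only correct answers but also some typical wrong answers, for contrastive training.
Keep the manual review rate at 5% or higher. Synthetic data sometimes contains factual errors, so human spot checks are mandatory.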
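
Two of these points lend themselves to small utilities. The sketch below is one possible way to enforce them, not the author's pipeline: it drops near-duplicate questions using Jaccard similarity over character bigrams, and draws a random audit sample for human review. The class name, the method names and the similarity threshold passed in by the caller are all assumptions.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashSet;
import java.util.List;
import java.util.Random;
import java.util.Set;

final class SyntheticDataHygieneSketch {

    /** Character bigrams of a string; usable for Chinese text, which has no whitespace word boundaries. */
    static Set<String> bigrams(String text) {
        Set<String> grams = new HashSet<>();
        for (int i = 0; i + 2 <= text.length(); i++) {
            grams.add(text.substring(i, i + 2));
        }
        return grams;
    }

    /** Jaccard similarity over character bigrams; 0.0 if either string is shorter than two characters. */
    static double similarity(String a, String b) {
        Set<String> ga = bigrams(a);
        Set<String> gb = bigrams(b);
        if (ga.isEmpty() || gb.isEmpty()) return 0.0;
        Set<String> shared = new HashSet<>(ga);
        shared.retainAll(gb);
        return (double) shared.size() / (ga.size() + gb.size() - shared.size());
    }

    /** Keeps a question only if it stays below the threshold against every question kept so far. */
    static List<String> dropNearDuplicates(List<String> questions, double threshold) {
        List<String> kept = new ArrayList<>();
        for (String question : questions) {
            boolean duplicate = kept.stream().anyMatch(k -> similarity(k, question) >= threshold);
            if (!duplicate) kept.add(question);
        }
        return kept;
    }

    /** Random audit sample of at least `rate` of the items: rounded up, and never empty for a non-empty input. */
    static <T> List<T> auditSample(List<T> items, double rate, Random random) {
        if (items.isEmpty()) return List.of();
        int n = Math.max(1, (int) Math.ceil(items.size() * rate));
        List<T> shuffled = new ArrayList<>(items);
        Collections.shuffle(shuffled, random);
        return new ArrayList<>(shuffled.subList(0, Math.min(n, shuffled.size())));
    }
}
```

The duplicate check compares each candidate against everything already kept, so it is quadratic in the number of questions. That is fine for a few thousand samples; beyond that, an approximate method such as MinHash would be the usual replacement.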

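Small-Traffic Experiments: Treat Cold Start as a Scientific Experiment

During cold start traffic is low, and precisely because it is low, every user's behavior is valuable. Don't waste this phase.

I suggest designing cold start as a multi-armed bandit experiment: test several recommendation/generation strategies at once and allocate traffic dynamically with Thompson Sampling: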

import java.util.Map;
import java.util.Random;
import java.util.concurrent.ConcurrentHashMap;

import org.springframework.stereotype.Component;

@Component
public class ColdStartBandit {
    
    // Beta distribution parameters per strategy: alpha = successes + 1, beta = failures + 1.
    // Every strategy starts from a uniform Beta(1, 1) prior.
    private final Map<String, double[]> strategyParams = new ConcurrentHashMap<>(Map.of(
        "CONTENT_QUALITY",    new double[]{1.0, 1.0},  // alpha, beta
        "CLUSTER_PROXY",      new double[]{1.0, 1.0},
        "SYNTHETIC_ENHANCED", new double[]{1.0, 1.0},
        "RANDOM_BASELINE",    new double[]{1.0, 1.0}
    ));
    
    private final Random random = new Random();
    
    /**
     * Thompson Sampling: draw one sample per strategy from its current
     * posterior and pick the strategy with the highest draw.
     */
    public String selectStrategy() {
        double bestSample = -1.0;
        String bestStrategy = null;
        
        for (Map.Entry<String, double[]> entry : strategyParams.entrySet()) {
            double alpha = entry.getValue()[0];
            double beta = entry.getValue()[1];
            
            // Sample from the Beta(alpha, beta) distribution
            double sample = sampleBeta(alpha, beta);
            
            if (sample > bestSample) {
                bestSample = sample;
                bestStrategy = entry.getKey();
            }
        }
        
        return bestStrategy;
    }
    
    /**
     * Update a strategy's parameters from user feedback.
     * Unknown strategy names are ignored; the original compute() call
     * would have thrown a NullPointerException for them.
     */
    public void updateReward(String strategy, boolean positive) {
        strategyParams.computeIfPresent(strategy, (k, params) -> {
            if (positive) {
                params[0] += 1.0; // alpha++
            } else {
                params[1] += 1.0; // beta++
            }
            return params;
        });
    }
    
    private double sampleBeta(double alpha, double beta) {
        // Sample Beta via two Gamma draws: X / (X + Y)
        double x = sampleGamma(alpha);
        double y = sampleGamma(beta);
        return x / (x + y);
    }
    
    private double sampleGamma(double shape) {
        // Marsaglia-Tsang method; shapes below 1 are boosted to shape + 1 and scaled back
        if (shape < 1.0) {
            return sampleGamma(1.0 + shape) * Math.pow(random.nextDouble(), 1.0 / shape);
        }
        double d = shape - 1.0 / 3.0;
        double c = 1.0 / Math.sqrt(9.0 * d);
        while (true) {
            double x = random.nextGaussian();
            double v = Math.pow(1.0 + c * x, 3);
            if (v > 0 && Math.log(random.nextDouble()) < 0.5 * x * x + d - d * v + d * Math.log(v)) {
                return d * v;
            }
        }
    }
}

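The benefit of this approach: traffic shifts automatically toward the strategies that perform well, with no manual intervention, which keeps the cost of cold start to a minimum.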
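
The claim that traffic drifts toward the better strategy is easy to check offline. The following standalone simulation is a sketch, not part of the Spring component: it runs Thompson Sampling against arms with fixed, made-up success rates and counts how often each arm is chosen. The rates, the seed and the class name BanditSimulationSketch are assumptions for illustration.

```java
import java.util.Arrays;
import java.util.Random;

final class BanditSimulationSketch {

    /** Gamma(shape, 1) sample via Marsaglia-Tsang; sufficient here because shape is always >= 1. */
    static double sampleGamma(double shape, Random rnd) {
        double d = shape - 1.0 / 3.0;
        double c = 1.0 / Math.sqrt(9.0 * d);
        while (true) {
            double x = rnd.nextGaussian();
            double v = Math.pow(1.0 + c * x, 3);
            if (v > 0 && Math.log(rnd.nextDouble()) < 0.5 * x * x + d - d * v + d * Math.log(v)) {
                return d * v;
            }
        }
    }

    /** Beta(alpha, beta) sample built from two Gamma samples. */
    static double sampleBeta(double alpha, double beta, Random rnd) {
        double x = sampleGamma(alpha, rnd);
        double y = sampleGamma(beta, rnd);
        return x / (x + y);
    }

    /**
     * Runs Thompson Sampling for `rounds` simulated users and returns
     * how many times each arm was selected.
     */
    static int[] simulate(double[] trueRates, int rounds, long seed) {
        Random rnd = new Random(seed);
        int arms = trueRates.length;
        double[] alpha = new double[arms];
        double[] beta = new double[arms];
        Arrays.fill(alpha, 1.0); // uniform Beta(1, 1) prior for every arm
        Arrays.fill(beta, 1.0);
        int[] pulls = new int[arms];

        for (int t = 0; t < rounds; t++) {
            int best = 0;
            double bestSample = -1.0;
            for (int i = 0; i < arms; i++) {
                double sample = sampleBeta(alpha[i], beta[i], rnd);
                if (sample > bestSample) {
                    bestSample = sample;
                    best = i;
                }
            }
            pulls[best]++;
            // Simulated user feedback: success with the arm's true (hidden) rate
            if (rnd.nextDouble() < trueRates[best]) {
                alpha[best] += 1.0;
            } else {
                beta[best] += 1.0;
            }
        }
        return pulls;
    }
}
```

Calling simulate(new double[]{0.05, 0.10, 0.30}, 5000, 42L) should show the third arm collecting the large majority of pulls, with the weaker arms explored mostly in the early rounds. The same harness is handy for judging how many real users you would need before the bandit separates strategies whose rates are close together.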

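Metrics for Cold Start

How do you tell that cold start is over and the flywheel can take over? Our criteria:

Cold start counts as finished only when three conditions hold at the same time: enough users, enough positive feedback, and enough accumulated samples.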
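
The post gives no concrete numbers for "enough", so the thresholds in the sketch below are placeholders to be replaced with values from your own product. It only illustrates the all-three-at-once rule; ColdStartExitSketch and its constants are hypothetical.

```java
final class ColdStartExitSketch {

    // Hypothetical thresholds; none of these numbers come from the article.
    static final long MIN_ACTIVE_USERS = 1_000;
    static final double MIN_POSITIVE_RATE = 0.6;
    static final long MIN_LABELED_SAMPLES = 5_000;

    /** Positive feedback as a share of all feedback; 0.0 when there is no feedback yet. */
    static double positiveRate(long positiveFeedback, long totalFeedback) {
        return totalFeedback == 0 ? 0.0 : (double) positiveFeedback / totalFeedback;
    }

    /** True only when user scale, positive feedback and sample volume all clear their thresholds. */
    static boolean coldStartFinished(long activeUsers, long positiveFeedback,
                                     long totalFeedback, long labeledSamples) {
        return activeUsers >= MIN_ACTIVE_USERS
            && positiveRate(positiveFeedback, totalFeedback) >= MIN_POSITIVE_RATE
            && labeledSamples >= MIN_LABELED_SAMPLES;
    }
}
```

Requiring all three together guards against declaring victory on one strong number, for example many users who mostly give negative feedback.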

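Some Lessons Learned

Don't wait for perfect before launching. The biggest cold-start mistake is over-preparing: waiting until everything is ready before release, by which point the market window has closed. Accept "good enough" initial quality, launch fast, and let real user data drive the improvements.

Manual operations are the most valuable thing early on. During cold start, manually curating high-quality content and manually answering typical questions works far better than hoping the algorithm figures it out alone. The algorithm is a lever, but a lever without a fulcrum moves nothing.

User segmentation matters. The cost of cold start differs by user. For high-value users (paying users, for example), invest more manual effort in personalization; for ordinary users, start with generic strategies as a floor. Don't spread limited effort evenly.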