A Technical Debt Repayment Plan for AI Applications: Systematically Refactoring Legacy AI Code

Lao Zhang · 2026/4/30 · about 19 min read

date: 2026-10-11 tags: [technical debt, refactoring, code quality, Spring AI, Java]

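1. A True Story: The Sunk Cost of 20,000 Lines of Code

In July 2025, Chen Lei stared at the wall of code on his screen and, for the first time, felt despair.

Two years earlier he had been the hardest-working person on the team. The company wanted an AI customer-service prototype quickly, and in three weeks he shipped a system that ran. Back then everyone said: "Ship it first, optimize later."

Today that "prototype" has been in production for 18 months and serves an average of 300,000 conversations a day. The code base has ballooned from the original 3,000 lines to 21,000.

Chen Lei knew exactly what was wrong with the code:

All prompts are hardcoded in Java strings; changing a single word means a long hunt through the code
The same document has been vectorized 17 times (each time, a team member assumed nobody else had done it and did it again)
Every AI call is synchronous and blocking; P99 latency reaches 8 seconds
There are no unit tests; every release is a "test on real users"
Comments cover less than 3% of the code (and that 3% was only added after someone complained)
The three main feature modules have circular dependencies

The new tech lead took one look at the code base and said something Chen Lei will never forget:

"This isn't code, it's poetry. Only the author can read it, and six months later the author can't either."

The meeting's decision: a systematic refactoring, with three months and a budget of two full-time engineers plus one tester.

This time Chen Lei did not want to repeat old mistakes. He decided to pay down the technical debt properly, with a system and a method.

This article is the full retrospective of their three-month refactoring.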

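2. Types of Technical Debt Specific to AI

AI systems share some technical debt with traditional systems, but they also have pain points of their own:

2.1 The Most Common AI-Specific Debts

Type 1: Hardcoded prompts

// Debt code (before)
public String analyzeComplaint(String complaint) {
    String prompt = "你是一个客服助手，请分析以下投诉并给出处理建议：\n" + complaint;
    return llmClient.call(prompt);
}

// The same prompt appears in the code base in 5 slightly different variants
// Nobody knows which one is the "latest correct version"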

Type 2: Duplicate vectorization

// Similar code found in 3 different services
// OrderService.java:108
List<float[]> vectors1 = embed(productDocuments);

// ProductService.java:234
List<float[]> embeddings = embedAll(docs); // the same docs!

// RecommendService.java:67
var productVectors = getEmbeddings(productList); // the same content again!

Type 3: Synchronous blocking AI calls

// Debt code
@GetMapping("/recommend")
public List<Product> recommend(@RequestParam String userId) {
    String userProfile = llmClient.call(buildProfilePrompt(userId)); // blocks for 2 s
    String query = llmClient.call(buildQueryPrompt(userProfile));    // blocks for 1.5 s
    return searchService.search(query); // 0.1 s
    // 3.6 s in total, all of it serial waiting on the request thread
}

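Type 4: No fallback strategy

// When the LLM is down, the whole feature is down
public String getRecommendation(String userId) {
    return openAiClient.chat(prompt); // an OpenAI timeout or rate limit turns straight into a 500
    // no fallback, no retry, no circuit breaker
}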
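
The missing safety net can be sketched in plain Java. The example below is a minimal illustration, not the team's implementation: `FallbackCall` and `callWithFallback` are hypothetical names, the LLM call is passed in as a `Supplier`, and the fallback is a canned string. It covers only timeout and exception handling; retries and circuit breaking would need more machinery.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.TimeUnit;
import java.util.function.Supplier;

// Hypothetical helper for illustration; not part of the original code base.
final class FallbackCall {

    // Runs the (possibly slow or failing) call on another thread and returns
    // the fallback value if it throws or does not finish within timeoutMs.
    static String callWithFallback(Supplier<String> llmCall, long timeoutMs, String fallback) {
        return CompletableFuture.supplyAsync(llmCall)
                .completeOnTimeout(fallback, timeoutMs, TimeUnit.MILLISECONDS)
                .exceptionally(ex -> fallback)
                .join();
    }

    public static void main(String[] args) {
        String ok = callWithFallback(() -> "LLM answer", 2000, "canned answer");
        String failed = callWithFallback(
                () -> { throw new IllegalStateException("rate limited"); },
                2000, "canned answer");
        System.out.println(ok);     // the normal result
        System.out.println(failed); // the fallback
    }
}
```

Note that a timed-out call keeps running in the background; `completeOnTimeout` only stops the caller from waiting for it.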

3. Debt Assessment: Quantifying Technical Debt with Tools

3.1 Custom SonarQube Rules

// Custom SonarQube rule that detects hardcoded prompts
@Rule(key = "HardcodedPromptRule")
public class HardcodedPromptDetector extends IssuableSubscriptionVisitor {
    
    // Phrases that typically mark a prompt
    private static final List<String> PROMPT_KEYWORDS = List.of(
        "你是", "请你", "作为", "你的任务是", "请分析", "请回答",
        "你是一个", "As an AI", "You are a", "Please analyze"
    );
    
    @Override
    public List<Tree.Kind> nodesToVisit() {
        return List.of(Tree.Kind.STRING_LITERAL);
    }
    
    @Override
    public void visitNode(Tree tree) {
        LiteralTree literal = (LiteralTree) tree;
        String value = literal.value();
        
        // Flag long string literals that contain a prompt keyword
        boolean isLikelyPrompt = PROMPT_KEYWORDS.stream()
            .anyMatch(keyword -> value.contains(keyword))
            && value.length() > 50;
        
        if (isLikelyPrompt) {
            reportIssue(literal, 
                "Likely hardcoded prompt; consider moving it to a config file or database. "
                + "Prompt length: " + value.length() + " characters");
        }
    }
}

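// Rule that detects duplicate embedding calls.
// Sketch only: the helpers extractArgument and getLocation are not shown in the original,
// and a sonar-java rule built on BaseTreeVisitor typically also implements JavaFileScanner.
@Rule(key = "DuplicateEmbeddingRule")
public class DuplicateEmbeddingDetector extends BaseTreeVisitor {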
    
    private final Map<String, List<String>> embeddingCallLocations = new HashMap<>();
    
    @Override
    public void visitMethodInvocation(MethodInvocationTree tree) {
        String methodName = tree.methodSelect().toString();
        
        // Recognize the various embedding call patterns
        if (isEmbeddingCall(methodName)) {
            String argument = extractArgument(tree);
            embeddingCallLocations
                .computeIfAbsent(argument, k -> new ArrayList<>())
                .add(getLocation(tree));
        }
        
        super.visitMethodInvocation(tree);
    }
    
    private boolean isEmbeddingCall(String methodName) {
        return methodName.contains("embed") || 
               methodName.contains("Embedding") ||
               methodName.contains("vectorize") ||
               methodName.contains("getVector");
    }
}

3.2 Metrics for Quantifying Technical Debt

// Generator for the technical debt assessment report
@Service
public class TechnicalDebtAssessor {
    
    public TechDebtReport assess(String projectPath) {
        // 1. Code complexity
        int totalCyclomaticComplexity = sonarMetrics.getCyclomaticComplexity(projectPath);
        
        // 2. Test coverage
        double testCoverage = sonarMetrics.getLineCoverage(projectPath);
        
        // 3. Duplicated code rate
        double duplicationRate = sonarMetrics.getDuplicationRate(projectPath);
        
        // 4. AI-specific debt metrics (custom scan)
        int hardcodedPrompts = customScanner.countHardcodedPrompts(projectPath);
        int duplicateEmbeddings = customScanner.countDuplicateEmbeddings(projectPath);
        int syncAICalls = customScanner.countSyncAICalls(projectPath);
        int missingFallbacks = customScanner.countMissingFallbacks(projectPath);
        
        // 5. Repayment cost (estimate in person-days)
        double repaymentDays = 
            hardcodedPrompts * 0.5 +          // 0.5 days to externalize each hardcoded prompt
            duplicateEmbeddings * 1.0 +        // 1 day to merge each duplicate vectorization
            syncAICalls * 2.0 +               // 2 days to convert each sync call to async
            missingFallbacks * 0.5 +           // 0.5 days to add each fallback
            (100 - testCoverage) * 0.3 +       // 0.3 days per missing coverage point (relative to 100%)
            duplicationRate * 5.0;             // removing duplicated code
        
        return TechDebtReport.builder()
            .projectPath(projectPath)
            .testCoverage(testCoverage)
            .duplicationRate(duplicationRate)
            .hardcodedPrompts(hardcodedPrompts)
            .duplicateEmbeddings(duplicateEmbeddings)
            .syncAICalls(syncAICalls)
            .missingFallbacks(missingFallbacks)
            .totalRepaymentDays((int) repaymentDays)
            .debtGrade(calculateGrade(testCoverage, duplicationRate, hardcodedPrompts))
            .build();
    }
    
    // Debt grade (A-F)
    private String calculateGrade(double coverage, double duplication, int hardcoded) {
        int score = 100;
        score -= (int) Math.max(0, (80 - coverage));  // 1 point off per coverage point below 80%
        score -= (int) (duplication * 2);              // 2 points off per 1% duplication
        score -= hardcoded * 3;                        // 3 points off per hardcoded prompt
        
        if (score >= 90) return "A";
        if (score >= 80) return "B";
        if (score >= 70) return "C";
        if (score >= 60) return "D";
        if (score >= 50) return "E";
        return "F";
    }
}

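Assessment results for Chen Lei's team:

Debt type	Count	Estimated repair cost
Hardcoded prompts	47	23.5 person-days
Duplicate vectorization	17	17 person-days
Synchronous blocking AI calls	23	46 person-days
Missing fallback strategies	31	15.5 person-days
Test coverage (currently 3% → target 60%)	-	51 person-days
Total	-	153 person-days

Conclusion: split across two engineers, 153 person-days is about 76.5 working days (roughly 15 weeks) to pay everything off, but the budget is only three months (about 60 working days). Prioritization is unavoidable.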
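
The arithmetic behind that conclusion is easy to check in a few lines. This is a worked sketch with hypothetical names (`RepaymentEstimate`, `tableEstimates`); the per-item costs for the first four rows are the ones used by the assessor, while the 51-day coverage figure is taken from the table as given (it is not what the 0.3-days-per-point term in the assessor would produce).

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper that re-derives the totals from the assessment table.
final class RepaymentEstimate {

    // Per-category estimates in person-days, copied from the table.
    static Map<String, Double> tableEstimates() {
        Map<String, Double> estimates = new LinkedHashMap<>();
        estimates.put("hardcoded prompts", 47 * 0.5);        // 23.5
        estimates.put("duplicate vectorization", 17 * 1.0);  // 17
        estimates.put("sync AI calls", 23 * 2.0);            // 46
        estimates.put("missing fallbacks", 31 * 0.5);        // 15.5
        estimates.put("test coverage 3% -> 60%", 51.0);      // table value
        return estimates;
    }

    static double totalPersonDays(Map<String, Double> estimates) {
        return estimates.values().stream().mapToDouble(Double::doubleValue).sum();
    }

    // Working days needed when the work is split evenly across engineers.
    static double workingDays(double personDays, int engineers) {
        return personDays / engineers;
    }

    public static void main(String[] args) {
        double total = totalPersonDays(tableEstimates());
        double needed = workingDays(total, 2);
        double budgetPersonDays = 60 * 2; // 60 working days x 2 engineers
        System.out.printf("total: %.1f person-days%n", total);
        System.out.printf("needed: %.1f working days for 2 engineers%n", needed);
        System.out.printf("shortfall: %.1f person-days%n", total - budgetPersonDays);
    }
}
```

With a budget of 120 person-days against 153 person-days of debt, roughly 33 person-days of work have to be deferred, which is what the prioritization in the next section decides.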

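4. Prioritization: The Impact-Difficulty Matrix

4.1 The Four-Quadrant Matrix

Batch 1 (do immediately, highest ROI):

1. Add fallback strategies (easy to change, prevents cascading failures)
2. Convert synchronous calls to asynchronous (moderate effort, significant latency improvement)
3. Externalize prompts (easy to change, large payoff for later iterations)

Batch 2 (planned):

4. Add test coverage (a safety net for later refactoring)
5. Merge duplicate vectorization (direct cost savings)

Deferred (park while resources are limited):

6. Refactor circular dependencies (high risk, relatively indirect impact)
7. Add code comments (more worthwhile once test coverage exists)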

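5. Incremental Refactoring: The Strangler Fig Pattern

5.1 How the Strangler Fig Pattern Works

Do not try to rewrite the whole system in one go; that tends to fail. Instead, like a strangler fig, grow the new system around the old one and let it take over gradually.

5.2 A Complete Strangler Fig Implementation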

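// Routing layer: the core controller
@Service
@Slf4j
public class AIServiceRouter {
    
    private final LegacyAIService legacyService;
    private final NewAIService newService;
    private final MigrationConfig config;
    private final ResultComparator comparator;
    private final MigrationMetrics migrationMetrics; // used below; the type name is assumed
    
    // Controls the migration progress of each feature module
    // Format: featureKey -> newServicePercent (0-100)
    // Stored in Redis so it can be adjusted at runtime
    
    public AIResponse route(AIRequest request) {
        String featureKey = request.getFeatureKey();
        int newServicePercent = config.getMigrationPercent(featureKey);
        
        if (newServicePercent == 0) {
            // All traffic on the old code
            return legacyService.process(request);
        }
        
        if (newServicePercent == 100) {
            // All traffic on the new code
            return newService.process(request);
        }
        
        // Shadow phase (1-19%): the old code answers the user while the new code
        // runs asynchronously for comparison only
        if (newServicePercent < 20) {
            AIResponse legacyResponse = legacyService.process(request);
            
            // Run the new code off the request thread so the user response is unaffected.
            // runAsync without an executor uses the common ForkJoinPool; a dedicated
            // executor is safer for blocking LLM calls.
            CompletableFuture.runAsync(() -> {
                try {
                    AIResponse newResponse = newService.process(request);
                    comparator.compare(featureKey, request, legacyResponse, newResponse);
                } catch (Exception e) {
                    log.error("Shadow test failed: featureKey={}", featureKey, e);
                }
            });
            
            return legacyResponse;
        }
        
        // Canary phase: split traffic by percentage
        boolean useNew = shouldUseNew(request.getUserId(), newServicePercent);
        
        if (useNew) {
            try {
                return newService.process(request);
            } catch (Exception e) {
                // The new code failed: fall back to the old code automatically
                log.warn("New code failed, falling back to legacy: featureKey={}, error={}", 
                    featureKey, e.getMessage());
                migrationMetrics.recordFallback(featureKey);
                return legacyService.process(request);
            }
        } else {
            return legacyService.process(request);
        }
    }
    
    private boolean shouldUseNew(String userId, int percent) {
        // Deterministic hash bucketing (not consistent hashing): the same user always
        // lands in the same bucket. floorMod avoids the negative result that
        // Math.abs(hashCode) produces when hashCode is Integer.MIN_VALUE.
        return Math.floorMod(userId.hashCode(), 100) < percent;
    }
}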
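
The percentage split deserves a closer look, because hash bucketing has one classic pitfall: `Math.abs(Integer.MIN_VALUE)` is still negative, so an `abs(hash) % 100` bucket can fall outside 0-99. The sketch below (hypothetical class `RolloutBucket`) uses `Math.floorMod` instead and shows the property the rollout relies on: a user's bucket never changes, so raising the percentage only ever adds users to the new path.

```java
// Hypothetical helper illustrating stable percentage bucketing.
final class RolloutBucket {

    // Maps any int hash to a bucket in [0, 100), including Integer.MIN_VALUE.
    static int bucketOfHash(int hash) {
        return Math.floorMod(hash, 100);
    }

    static int bucket(String userId) {
        return bucketOfHash(userId.hashCode());
    }

    // A user is on the new path when their bucket is below the rollout percentage.
    static boolean useNew(String userId, int percent) {
        return bucket(userId) < percent;
    }

    public static void main(String[] args) {
        String user = "user-42";
        System.out.println("bucket: " + bucket(user));
        // Once a user is included at some percentage, they stay included at higher ones.
        for (int percent = 0; percent <= 100; percent += 20) {
            System.out.println(percent + "% -> " + useNew(user, percent));
        }
        // The edge case that breaks abs-based bucketing:
        System.out.println("abs-based: " + (Math.abs(Integer.MIN_VALUE) % 100));
        System.out.println("floorMod:  " + bucketOfHash(Integer.MIN_VALUE));
    }
}
```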

// Result comparator: detects differences between old and new results
@Service
@Slf4j
public class ResultComparator {
    
    public void compare(
            String featureKey, 
            AIRequest request,
            AIResponse legacyResponse, 
            AIResponse newResponse) {
        
        boolean isSemanticallyEqual = checkSemanticEquality(
            legacyResponse.getContent(), 
            newResponse.getContent()
        );
        
        // Record the comparison result
        ComparisonRecord record = ComparisonRecord.builder()
            .featureKey(featureKey)
            .requestId(request.getId())
            .legacyLatencyMs(legacyResponse.getLatencyMs())
            .newLatencyMs(newResponse.getLatencyMs())
            .contentMatch(isSemanticallyEqual)
            .legacyContent(legacyResponse.getContent())
            .newContent(newResponse.getContent())
            .recordedAt(Instant.now())
            .build();
        
        comparisonRepo.save(record);
        
        // Report metrics
        migrationMetrics.recordComparison(featureKey, isSemanticallyEqual,
            legacyResponse.getLatencyMs(), newResponse.getLatencyMs());
        
        if (!isSemanticallyEqual) {
            log.warn("Old and new results differ: featureKey={}, requestId={}", 
                featureKey, request.getId());
        }
    }
    
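    // Semantic equivalence check (wording may differ; exact equality is not required)
    private boolean checkSemanticEquality(String text1, String text2) {
        // Simplified version: compare the key entities found in both texts
        // In practice, embedding similarity can be used instead (> 0.85 counts as equivalent)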
        Set<String> entities1 = extractEntities(text1);
        Set<String> entities2 = extractEntities(text2);
        
        if (entities1.isEmpty() && entities2.isEmpty()) return true;
        if (entities1.isEmpty() || entities2.isEmpty()) return false;
        
        long intersection = entities1.stream().filter(entities2::contains).count();
        double jaccard = (double) intersection / 
            (entities1.size() + entities2.size() - intersection);
        
        return jaccard >= 0.7;
    }
}
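
The Jaccard check is simple enough to try in isolation. In this sketch, `JaccardSketch` and its `tokens` method are hypothetical stand-ins: `tokens` is a naive ASCII word splitter used in place of the undefined `extractEntities`, so it will not tokenize Chinese text and is only meant to make the set arithmetic concrete.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;
import java.util.stream.Collectors;

// Hypothetical, simplified stand-in for the comparator's similarity check.
final class JaccardSketch {

    // Naive stand-in for entity extraction: lowercased ASCII word tokens.
    static Set<String> tokens(String text) {
        return Arrays.stream(text.toLowerCase().split("\\W+"))
                .filter(t -> !t.isEmpty())
                .collect(Collectors.toSet());
    }

    // |A ∩ B| / |A ∪ B|; two empty sets are treated as identical.
    static double jaccard(Set<String> a, Set<String> b) {
        if (a.isEmpty() && b.isEmpty()) return 1.0;
        Set<String> intersection = new HashSet<>(a);
        intersection.retainAll(b);
        int union = a.size() + b.size() - intersection.size();
        return (double) intersection.size() / union;
    }

    public static void main(String[] args) {
        Set<String> legacy = tokens("refund order 123");
        Set<String> rewritten = tokens("Order 123: refund approved");
        double score = jaccard(legacy, rewritten);
        // 3 shared tokens out of 4 distinct ones
        System.out.println(score + " equivalent=" + (score >= 0.7));
    }
}
```

A threshold such as 0.7 is a tuning knob: too high and harmless rewording is flagged as a regression, too low and real differences slip through.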

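6. Externalizing Prompts (Full Before/After Comparison)

6.1 Before: Hardcoded Prompts

// before: prompts scattered through the business code
@Service
public class CustomerServiceAI_OLD {
    
    public String analyzeComplaint(String complaint, String orderId) {
        // The prompt is hardcoded in the method: it cannot be tested or versioned on its own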
        String prompt = "你是一个专业的电商客服，具有5年处理投诉的经验。\n" +
                       "请分析以下客户投诉，给出处理建议和回复话术。\n" +
                       "投诉内容：" + complaint + "\n" +
                       "订单号：" + orderId + "\n" +
                       "请按以下格式回复：\n" +
                       "1. 问题分类：\n" +
                       "2. 紧急程度：\n" + 
                       "3. 处理建议：\n" +
                       "4. 回复话术：";
        
        return llmClient.call(prompt);
    }
    
    public String generateReply(String issueType, String customerTone) {
        // Another, similar prompt with slight differences
        // Nobody knows which of the two is the "standard" one
        String prompt = "你是电商客服专员，请根据以下信息生成回复：\n" +
                       "问题类型：" + issueType + "\n" +
                       "客户情绪：" + customerTone;
        return llmClient.call(prompt);
    }
}

6.2 After: Externalized Prompts

// Prompt template entity (stored in the database)
@Entity
@Table(name = "prompt_templates")
public class PromptTemplate {
    // Surrogate key so that every version gets its own row. With templateId as the @Id,
    // saving a new version would overwrite the previous row and lose the history.
    @Id
    @GeneratedValue(strategy = GenerationType.IDENTITY)
    private Long id;
    
    private String templateId;         // customer_service.analyze_complaint
    private String name;               // e.g. complaint analysis prompt
    private String description;        // description
    
    @Column(columnDefinition = "TEXT")
    private String systemPrompt;       // system prompt
    
    @Column(columnDefinition = "TEXT")
    private String userPromptTemplate; // user prompt template (with {placeholders})
    
    private String version;            // 1.2.0
    private boolean active;            // whether this version is active
    private LocalDateTime updatedAt;
    private String updatedBy;
    
    // Associated evaluation cases (handy for regression tests)
    @OneToMany(mappedBy = "template")
    private List<PromptEvalCase> evalCases;
}

// Prompt repository
@Repository
public interface PromptTemplateRepository extends JpaRepository<PromptTemplate, Long> {
    Optional<PromptTemplate> findByTemplateIdAndActiveTrue(String templateId);
    List<PromptTemplate> findByTemplateIdOrderByVersionDesc(String templateId);
}

// Prompt service
@Service
@Slf4j
public class PromptTemplateService {
    
    private final PromptTemplateRepository repo;
    private final CacheManager cacheManager;
    
    // Note: @Cacheable only applies to calls that go through the Spring proxy, i.e. calls
    // from other beans. Calls from inside this class (render, update) hit the database.
    @Cacheable(value = "promptTemplates", key = "#templateId")
    public PromptTemplate getActive(String templateId) {
        return repo.findByTemplateIdAndActiveTrue(templateId)
            .orElseThrow(() -> new PromptTemplateNotFoundException(templateId));
    }
    
    // Render the prompt (replace placeholders)
    public String render(String templateId, Map<String, String> params) {
        PromptTemplate template = getActive(templateId);
        String rendered = template.getUserPromptTemplate();
        
        for (Map.Entry<String, String> param : params.entrySet()) {
            rendered = rendered.replace("{" + param.getKey() + "}", param.getValue());
        }
        
        // Check for placeholders that were not replaced
        // (heuristic: also fires when a parameter value itself contains braces)
        if (rendered.contains("{") && rendered.contains("}")) {
            log.warn("Prompt {} has unreplaced placeholders: {}", templateId, 
                extractUnreplacedParams(rendered));
        }
        
        return rendered;
    }
    
    // Update the prompt (creates a new version automatically)
    @Transactional
    public PromptTemplate update(String templateId, String newContent, String updatedBy) {
        PromptTemplate existing = getActive(templateId);
        
        // Deactivate the old version
        existing.setActive(false);
        repo.save(existing);
        
        // Create the new version as a separate row: do not copy the primary key,
        // and do not share the evalCases collection between two entities
        PromptTemplate newVersion = new PromptTemplate();
        BeanUtils.copyProperties(existing, newVersion, "id", "evalCases");
        newVersion.setUserPromptTemplate(newContent);
        newVersion.setVersion(incrementVersion(existing.getVersion()));
        newVersion.setActive(true);
        newVersion.setUpdatedAt(LocalDateTime.now());
        newVersion.setUpdatedBy(updatedBy);
        
        PromptTemplate saved = repo.save(newVersion);
        
        // Evict the cache entry
        Cache cache = cacheManager.getCache("promptTemplates");
        if (cache != null) {
            cache.evict(templateId);
        }
        
        log.info("Prompt updated: templateId={}, version={} -> {}", 
            templateId, existing.getVersion(), saved.getVersion());
        
        return saved;
    }
}

// after: the business code is clean and concise
@Service
public class CustomerServiceAI_NEW {
    
    private final ChatClient chatClient;
    private final PromptTemplateService promptService;
    
    public ComplaintAnalysis analyzeComplaint(String complaint, String orderId) {
        // The prompt comes from outside; the business code contains no prompt text at all
        String userPrompt = promptService.render(
            "customer_service.analyze_complaint",
            Map.of("complaint", complaint, "order_id", orderId)
        );
        
        PromptTemplate template = promptService.getActive("customer_service.analyze_complaint");
        
        String response = chatClient.prompt()
            .system(template.getSystemPrompt())
            .user(userPrompt)
            .call()
            .content();
        
        return ComplaintAnalysis.parse(response);
    }
}

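Results:

Prompts can be changed without a release: updating the database record (and evicting the cached entry) is enough
Different prompt versions can be A/B tested
Each version can be regression tested
Non-technical team members (operations staff, customer-service leads) can improve prompts directly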
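
One detail of placeholder rendering is worth sketching separately. Replacing placeholders one key at a time means a user-supplied value that happens to contain `{order_id}` can be substituted again by a later key. The sketch below is an alternative under stated assumptions, not the article's implementation: `TemplateRenderer` is a hypothetical class that renders in a single regex pass, and it fails fast on a missing parameter where the service above only logs a warning.

```java
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical single-pass renderer for {placeholder} templates.
final class TemplateRenderer {

    private static final Pattern PLACEHOLDER = Pattern.compile("\\{(\\w+)\\}");

    // Replaces every {name} in the template with params.get(name).
    // Substituted values are never rescanned, so braces inside a value stay literal.
    static String render(String template, Map<String, String> params) {
        Matcher matcher = PLACEHOLDER.matcher(template);
        StringBuilder out = new StringBuilder();
        while (matcher.find()) {
            String name = matcher.group(1);
            String value = params.get(name);
            if (value == null) {
                throw new IllegalArgumentException("missing parameter: " + name);
            }
            // quoteReplacement keeps '$' and '\' in the value literal
            matcher.appendReplacement(out, Matcher.quoteReplacement(value));
        }
        matcher.appendTail(out);
        return out.toString();
    }

    public static void main(String[] args) {
        String template = "Complaint: {complaint}\nOrder: {order_id}";
        String rendered = render(template,
                Map.of("complaint", "my text mentions {order_id}", "order_id", "ORD-1"));
        System.out.println(rendered);
    }
}
```

Whether a missing parameter should throw or merely warn is a design choice; throwing keeps a half-rendered prompt from ever reaching the model.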

7. Going Asynchronous: Converting Synchronous Calls Step by Step

7.1 Migration Steps Diagram

7.2 Complete Code for the Async Migration

// Task entity
@Entity
@Table(name = "ai_tasks")
public class AITask {
    @Id
    private String taskId;
    private String userId;
    private String featureKey;
    private String inputJson;         // input parameters (JSON)
    
    @Enumerated(EnumType.STRING)
    private TaskStatus status;         // PENDING/PROCESSING/DONE/FAILED
    
    @Column(columnDefinition = "TEXT")
    private String resultJson;         // result (JSON)
    
    private String errorMessage;
    private LocalDateTime createdAt;
    private LocalDateTime completedAt;
    private Integer retryCount;
}

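// Asynchronous AI service
@Service
@Slf4j
public class AsyncAIService {
    
    private final AITaskRepository taskRepo;
    private final ApplicationEventPublisher eventPublisher;
    private final ObjectMapper objectMapper;
    
    // Submit an asynchronous task
    public String submitTask(String userId, String featureKey, Object input) {
        String taskId = UUID.randomUUID().toString();
        
        // writeValueAsString throws a checked exception, so it has to be handled here
        String inputJson;
        try {
            inputJson = objectMapper.writeValueAsString(input);
        } catch (JsonProcessingException e) {
            throw new IllegalArgumentException("Cannot serialize task input", e);
        }
        
        AITask task = AITask.builder()
            .taskId(taskId)
            .userId(userId)
            .featureKey(featureKey)
            .inputJson(inputJson)
            .status(TaskStatus.PENDING)
            .createdAt(LocalDateTime.now())
            .retryCount(0)
            .build();
        
        taskRepo.save(task);
        
        // Publish the task event
        eventPublisher.publishEvent(new AITaskCreatedEvent(task));
        
        log.info("AI task submitted: taskId={}, featureKey={}", taskId, featureKey);
        return taskId;
    }
    
    // Query the task status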
    public TaskStatusResponse getTaskStatus(String taskId) {
        AITask task = taskRepo.findById(taskId)
            .orElseThrow(() -> new TaskNotFoundException(taskId));
        
        return TaskStatusResponse.builder()
            .taskId(taskId)
            .status(task.getStatus())
            .result(task.getStatus() == TaskStatus.DONE ? task.getResultJson() : null)
            .errorMessage(task.getErrorMessage())
            .createdAt(task.getCreatedAt())
            .completedAt(task.getCompletedAt())
            .build();
    }
}

// Task processor (consumes task events)
@Service
@Slf4j
public class AITaskProcessor {
    
    private final AITaskRepository taskRepo;
    private final Map<String, AIFeatureHandler> handlers;
    private final ObjectMapper objectMapper;
    private final ScheduledExecutorService scheduler;
    
    @EventListener
    @Async("aiTaskExecutor")
    public void processTask(AITaskCreatedEvent event) {
        AITask task = event.getTask();
        String taskId = task.getTaskId();
        
        log.info("Processing AI task: taskId={}, featureKey={}", taskId, task.getFeatureKey());
        
        // Mark the task as in progress
        task.setStatus(TaskStatus.PROCESSING);
        taskRepo.save(task);
        
        try {
            // Find the matching handler
            AIFeatureHandler handler = handlers.get(task.getFeatureKey());
            if (handler == null) {
                throw new IllegalArgumentException("Unknown featureKey: " + task.getFeatureKey());
            }
            
            // Run the AI task
            Object result = handler.execute(task.getInputJson());
            
            // Mark the task as done
            task.setStatus(TaskStatus.DONE);
            task.setResultJson(objectMapper.writeValueAsString(result));
            task.setCompletedAt(LocalDateTime.now());
            taskRepo.save(task);
            
            log.info("AI task done: taskId={}, elapsed={}ms", 
                taskId, 
                Duration.between(task.getCreatedAt(), task.getCompletedAt()).toMillis());
            
        } catch (Exception e) {
            log.error("AI task failed: taskId={}", taskId, e);
            
            int newRetryCount = task.getRetryCount() + 1;
            
            if (newRetryCount < 3) {
                // Retry
                task.setStatus(TaskStatus.PENDING);
                task.setRetryCount(newRetryCount);
                taskRepo.save(task);
                
                // Delayed retry with exponential backoff: 2 s, then 4 s
                long delay = (long) Math.pow(2, newRetryCount) * 1000;
                scheduler.schedule(() -> processTask(new AITaskCreatedEvent(task)), 
                    delay, TimeUnit.MILLISECONDS);
            } else {
                // Retries exhausted: mark the task as failed
                task.setStatus(TaskStatus.FAILED);
                task.setErrorMessage(e.getMessage());
                task.setCompletedAt(LocalDateTime.now());
                taskRepo.save(task);
            }
        }
    }
}

// Thread pool configuration, kept in its own @Configuration class rather than in the @Service
@Configuration
@EnableAsync
public class AITaskExecutorConfig {
    
    @Bean("aiTaskExecutor")
    public TaskExecutor aiTaskExecutor() {
        ThreadPoolTaskExecutor executor = new ThreadPoolTaskExecutor();
        executor.setCorePoolSize(10);      // 10 core threads
        executor.setMaxPoolSize(30);        // at most 30 threads
        executor.setQueueCapacity(500);     // queue capacity
        executor.setThreadNamePrefix("ai-task-");
        // When pool and queue are full, the submitting thread runs the task itself.
        // This throttles producers, but that request becomes synchronous again.
        executor.setRejectedExecutionHandler(new ThreadPoolExecutor.CallerRunsPolicy());
        executor.initialize();
        return executor;
    }
}

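Results (after the change the API only accepts the task and returns a taskId, so the response times below most likely measure submission rather than end-to-end completion):

Metric	Before	After	Change
API response time (P50)	3200ms	95ms	-97%
API response time (P99)	8400ms	320ms	-96%
Concurrent throughput	8 req/s	150 req/s	+1775%
Timeout error rate	12%	0.3%	-97.5%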
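
The submit-then-poll contract behind those numbers can be shown without Spring or a database. The sketch below is an in-memory illustration only: `InMemoryTaskService`, `TaskView` and `await` are hypothetical names, the map stands in for the `ai_tasks` table, and there is no retry logic.

```java
import java.util.Map;
import java.util.UUID;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.function.Supplier;

// Hypothetical in-memory version of the submit/poll pattern.
final class InMemoryTaskService {

    enum Status { PENDING, DONE, FAILED }

    record TaskView(Status status, String result) {}

    private final Map<String, TaskView> tasks = new ConcurrentHashMap<>();
    // Daemon threads so the sketch never keeps the JVM alive.
    private final ExecutorService pool = Executors.newFixedThreadPool(4, r -> {
        Thread t = new Thread(r, "ai-task");
        t.setDaemon(true);
        return t;
    });

    // Returns immediately with a task id; the work runs on the pool.
    String submit(Supplier<String> work) {
        String id = UUID.randomUUID().toString();
        tasks.put(id, new TaskView(Status.PENDING, null));
        pool.execute(() -> {
            try {
                tasks.put(id, new TaskView(Status.DONE, work.get()));
            } catch (RuntimeException e) {
                tasks.put(id, new TaskView(Status.FAILED, e.getMessage()));
            }
        });
        return id;
    }

    TaskView status(String id) {
        return tasks.get(id);
    }

    // Polls until the task leaves PENDING or the timeout expires.
    TaskView await(String id, long timeoutMs) {
        long deadline = System.currentTimeMillis() + timeoutMs;
        TaskView view = status(id);
        while (view.status() == Status.PENDING && System.currentTimeMillis() < deadline) {
            try {
                Thread.sleep(10);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                break;
            }
            view = status(id);
        }
        return view;
    }

    void shutdown() {
        pool.shutdown();
    }

    public static void main(String[] args) {
        InMemoryTaskService service = new InMemoryTaskService();
        String id = service.submit(() -> "recommendation for user 1");
        System.out.println("submitted " + id);
        System.out.println(service.await(id, 2000));
        service.shutdown();
    }
}
```

The caller's latency is now the cost of `submit`, while the total time until a result is available is unchanged; the client pays for it by polling (or by being notified).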

8. Tests First: Pinning Down Behavior Before Refactoring

8.1 Golden Master Tests (Snapshot Tests)

Before refactoring, record the system's current behavior (even if that behavior is "wrong") and use it as the baseline for comparison.

// Golden Master test
@SpringBootTest
@TestPropertySource(properties = "spring.profiles.active=test")
public class LegacyBehaviorCapture {
    
    @Autowired
    private LegacyAIService legacyService;
    
    @Test
    @Disabled("Run once only, to generate the baseline data")
    public void captureGoldenMaster() throws Exception {
        List<TestCase> testCases = loadTestCases("src/test/resources/refactoring/test_inputs.json");
        List<GoldenMasterRecord> records = new ArrayList<>();
        
        for (TestCase tc : testCases) {
            try {
                String output = legacyService.process(tc.getInput());
                records.add(GoldenMasterRecord.builder()
                    .testCaseId(tc.getId())
                    .input(tc.getInput())
                    .output(output)
                    .recordedAt(Instant.now())
                    .build());
            } catch (Exception e) {
                // Errors are part of the current behavior, so record them too
                records.add(GoldenMasterRecord.builder()
                    .testCaseId(tc.getId())
                    .input(tc.getInput())
                    .error(e.getMessage())
                    .build());
            }
        }
        
        // Save the baseline data
        objectMapper.writeValue(
            new File("src/test/resources/refactoring/golden_master.json"),
            records
        );
        
        System.out.println("Golden Master generated, " + records.size() + " records");
    }
}

// Regression test after refactoring
@SpringBootTest
public class RefactoringRegressionTest {
    
    @Autowired
    private NewAIService newService;
    
    @Test
    public void testNewServiceMatchesGoldenMaster() throws Exception {
        List<GoldenMasterRecord> goldenRecords = loadGoldenMaster();
        
        int passCount = 0;
        int failCount = 0;
        List<String> failures = new ArrayList<>();
        
        for (GoldenMasterRecord golden : goldenRecords) {
            if (golden.getError() != null) continue; // skip cases that already failed in the old code
            
            try {
                String newOutput = newService.process(golden.getInput());
                
                boolean match = isSemanticallySimilar(golden.getOutput(), newOutput, 0.8);
                if (match) {
                    passCount++;
                } else {
                    failCount++;
                    failures.add(String.format(
                        "Case %s does not match\n  expected: %s\n  actual: %s",
                        golden.getTestCaseId(),
                        golden.getOutput().substring(0, Math.min(100, golden.getOutput().length())),
                        newOutput.substring(0, Math.min(100, newOutput.length()))
                    ));
                }
            } catch (Exception e) {
                failCount++;
                failures.add("Case " + golden.getTestCaseId() + " threw: " + e.getMessage());
            }
        }
        
        double passRate = (double) passCount / (passCount + failCount) * 100;
        System.out.printf("Pass rate: %.1f%% (%d/%d)%n", passRate, passCount, passCount + failCount);
        
        if (!failures.isEmpty()) {
            System.out.println("Failed cases:\n" + String.join("\n", failures));
        }
        
        // The refactoring only counts as successful at a pass rate of 95% or more
        assertThat(passRate).isGreaterThanOrEqualTo(95.0);
    }
}

8.2 The Path to Higher Test Coverage

// Test strategy for AI features
@SpringBootTest
@ActiveProfiles("test")
public class CustomerServiceAITest {
    
    @Autowired
    private CustomerServiceAI customerServiceAI;
    
    // Mock the LLM instead of calling it. Deep stubs are required because the
    // fluent chain prompt().system(...).user(...).call() is stubbed in one expression.
    @MockBean(answer = Answers.RETURNS_DEEP_STUBS)
    private ChatClient chatClient;
    
    @Test
    public void testComplaintAnalysis_ReturnProductIssue() {
        // Given
        String complaint = "收到的商品包装破损，里面的手机屏幕也有裂缝";
        String mockLLMResponse = """
                {
                  "issueType": "PRODUCT_DAMAGE",
                  "urgency": "HIGH",
                  "suggestion": "立即申请退款或换货",
                  "replyTemplate": "非常抱歉给您带来不便..."
                }
                """;
        
        // anyString() rather than any(): system(...) and user(...) are overloaded
        when(chatClient.prompt().system(anyString()).user(anyString()).call().content())
            .thenReturn(mockLLMResponse);
        
        // When
        ComplaintAnalysis result = customerServiceAI.analyzeComplaint(complaint, "ORD-12345");
        
        // Then
        assertThat(result.getIssueType()).isEqualTo(IssueType.PRODUCT_DAMAGE);
        assertThat(result.getUrgency()).isEqualTo(Urgency.HIGH);
        assertThat(result.getSuggestion()).isNotBlank();
    }
    
    @Test
    public void testComplaintAnalysis_LLMUnavailable_ShouldFallback() {
        // Given
        when(chatClient.prompt().system(anyString()).user(anyString()).call().content())
            .thenThrow(new RuntimeException("LLM service unavailable"));
        
        // When
        ComplaintAnalysis result = customerServiceAI.analyzeComplaint("投诉内容", "ORD-999");
        
        // Then: falls back to the rule engine
        // (assumes analyzeComplaint catches the failure; that fallback is not shown in section 6.2)
        assertThat(result).isNotNull();
        assertThat(result.isFallback()).isTrue();
        assertThat(result.getIssueType()).isEqualTo(IssueType.UNKNOWN);
    }
}

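9. Verifying the Refactoring: Confirming Nothing Broke

9.1 Three-Layer Verification

// Production verification monitor
@Component
@Slf4j
public class RefactoringMonitor {
    
    private final MeterRegistry meterRegistry;
    private final AlertService alertService;
    
    // Core metrics to watch after the refactoring
    @Scheduled(fixedDelay = 60000) // check every minute
    public void checkRefactoringHealth() {
        // Metric 1: error rate (must not exceed the pre-refactoring baseline).
        // Spring Boot tags http.server.requests with outcome=SERVER_ERROR for 5xx responses.
        // These counts are cumulative since startup; a per-minute rate would need the
        // difference between two consecutive checks.
        double totalRequests = meterRegistry.find("http.server.requests")
            .timers().stream()
            .mapToLong(Timer::count)
            .sum();
        double serverErrors = meterRegistry.find("http.server.requests")
            .tag("outcome", "SERVER_ERROR")
            .timers().stream()
            .mapToLong(Timer::count)
            .sum();
        double currentErrorRate = totalRequests == 0 ? 0.0 : serverErrors / totalRequests;
        
        double baselineErrorRate = configService.getDouble("refactoring.baseline.error_rate", 0.02);
        
        if (currentErrorRate > baselineErrorRate * 1.5) {
            alertService.alert("Error rate abnormally high after refactoring",
                String.format("current: %.2f%%, baseline: %.2f%%", currentErrorRate * 100, baselineErrorRate * 100));
        }
        
        // Metric 2: AI response quality (user satisfaction score)
        double avgSatisfactionScore = surveyService.getRecentAvgScore(Duration.ofHours(1));
        double baselineScore = configService.getDouble("refactoring.baseline.satisfaction", 3.8);
        
        if (avgSatisfactionScore < baselineScore - 0.3) {
            alertService.alert("User satisfaction dropped after refactoring",
                String.format("current: %.1f, baseline: %.1f", avgSatisfactionScore, baselineScore));
        }
    }
}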

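10. Debt Prevention: Engineering Rules That Make Debt Hard to Accumulate

10.1 A Quality Gate for AI Code

# .github/workflows/ai-code-quality.yml
name: AI Code Quality Gate

on: [pull_request]

jobs:
  quality-check:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v3
      
      - name: Detect hardcoded prompts
        run: |
          HARDCODED=$(grep -rn "你是\|你的任务\|请分析\|As an AI" \
            --include="*.java" src/main/java | \
            grep -v "PromptTemplate\|test\|Test" | wc -l)
          echo "Hardcoded prompts found: $HARDCODED"
          if [ "$HARDCODED" -gt "0" ]; then
            echo "❌ Hardcoded prompts found; move them into PromptTemplate"
            exit 1
          fi
      
      - name: Detect synchronous AI calls
        run: |
          SYNC_CALLS=$(grep -rn "\.call()\|\.generate()\|\.chat(" \
            --include="*.java" src/main/java | \
            grep -v "@Async\|CompletableFuture\|Mono\|Flux\|test" | wc -l)
          echo "Potential synchronous AI calls: $SYNC_CALLS"
          if [ "$SYNC_CALLS" -gt "5" ]; then
            echo "⚠️ Many synchronous AI calls detected; consider making them asynchronous"
          fi
      
      - name: Check test coverage
        run: |
          # Assumes the JaCoCo Maven plugin is configured in the project's pom.xml
          mvn test jacoco:report
          COVERAGE=$(grep -o "Total[^%]*%" target/site/jacoco/index.html | \
            grep -o "[0-9]*%" | head -1 | tr -d '%')
          echo "Test coverage: ${COVERAGE}%"
          # Require at least 50% (this reads the project-wide total from the HTML report,
          # not the coverage of AI-related code alone)
          if [ "${COVERAGE:-0}" -lt "50" ]; then
            echo "❌ Coverage is below 50%"
            exit 1
          fi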

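10.2 Code Review Checklist (AI-Specific)

## AI Code Review Checklist

### Prompt management ✅
- [ ] Has the prompt been externalized to the PromptTemplate table?
- [ ] Are there matching evaluation test cases?
- [ ] Are the template's placeholders documented?

### Asynchronous processing ✅
- [ ] Is the AI call handled asynchronously?
- [ ] Is a timeout configured?
- [ ] Is there a fallback when the timeout fires?

### Vectorization ✅
- [ ] Are existing vectorization results reused?
- [ ] Are vectorization results cached?
- [ ] Is there an incremental update strategy (instead of redoing everything)?

### Error handling ✅
- [ ] Is there a fallback when the LLM call fails?
- [ ] Is there a retry mechanism (exponential backoff)?
- [ ] Are errors monitored with alerts?

### Testing ✅
- [ ] Is the LLM mocked in unit tests?
- [ ] Is the LLM-unavailable scenario covered?
- [ ] Is there a Golden Master comparison test?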

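11. The Team's Final Results (Three Months Later)

At the retrospective, Chen Lei said: "The biggest payoff of the refactoring is not that the code got cleaner, but that the team now dares to change it. We no longer tiptoe through every iteration."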

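12. FAQ

Q: How do you keep the business running during the refactoring?

A: This is the core value of the Strangler Fig pattern. All refactoring happens alongside the old code, and traffic is shifted gradually once the new code has been thoroughly tested. There is no risky "downtime for refactoring" window.

Q: Where does the 60% test coverage target come from?

A: There is no perfect number. 60% is a rule of thumb for "covers the main business paths". What matters more is where the coverage is: core business logic should be at 80% or more, while utility and configuration classes can be lower.

Q: The prompts are externalized, but changing them still takes someone technical. What then?

A: Build a simple prompt management console (CRUD is enough). The front end can be done in about a week, after which product and operations staff can edit prompts themselves; changes take effect immediately, with no release.

Q: With Golden Master testing, what if the old code's "correct answer" was wrong to begin with?

A: That is exactly the difference between refactoring and bug fixing. Refactoring changes the code's structure without changing its behavior; bug fixing changes behavior. Refactor first (make sure the Golden Master passes), then fix bugs on top of the new code. Take the two steps separately and do not mix them.

Q: What if the old logic turns out to be completely incomprehensible?

A: Use the "Characterization Test" strategy: run the old code against a variety of inputs and record the results. Even without understanding the code, the tests then define "the system's current behavior".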
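
The characterization approach fits in a few lines. In this sketch the names `Characterization`, `capture` and `mismatches` are hypothetical, and the "legacy" and "refactored" functions are toy lambdas. It compares outputs with exact equality, which suits deterministic code; for LLM output the comparison would be a similarity check like the one in section 8.1.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Objects;
import java.util.function.Function;
import java.util.stream.Collectors;

// Hypothetical helper: snapshot legacy behavior, then diff a candidate against it.
final class Characterization {

    // Records whatever the legacy function currently returns for each input.
    static Map<String, String> capture(Function<String, String> legacy, List<String> inputs) {
        Map<String, String> snapshot = new LinkedHashMap<>();
        for (String input : inputs) {
            snapshot.put(input, legacy.apply(input));
        }
        return snapshot;
    }

    // Returns the inputs for which the candidate's output differs from the snapshot.
    static List<String> mismatches(Map<String, String> snapshot,
                                   Function<String, String> candidate) {
        return snapshot.entrySet().stream()
                .filter(e -> !Objects.equals(e.getValue(), candidate.apply(e.getKey())))
                .map(Map.Entry::getKey)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        Function<String, String> legacy = s -> s.trim().toUpperCase();
        Map<String, String> snapshot = capture(legacy, List.of(" refund ", "order"));

        Function<String, String> faithful = s -> s.toUpperCase().trim();
        Function<String, String> broken = s -> s.toUpperCase(); // forgot to trim

        System.out.println("faithful: " + mismatches(snapshot, faithful));
        System.out.println("broken:   " + mismatches(snapshot, broken));
    }
}
```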

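Summary

Paying down technical debt is not a one-off sprint but a systematic effort:

Quantify: use tools (SonarQube plus custom rules) to put numbers on the debt and convince management to invest
Prioritize: use the impact-difficulty matrix and start with the highest-ROI debt
Strangler Fig: replace incrementally and avoid a big-bang rewrite
Externalize prompts: an AI-specific debt with the highest priority
Go asynchronous: the most visible performance gain
Tests first: a Golden Master keeps the refactoring safe
Prevent by rule: CI gates make it hard for debt to build up again

The most important point: the goal of refactoring is not to make the code "look nice" but to let the team deliver value faster and with more confidence.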