Building an Intelligent Customer Service System: An Enterprise-Grade Chatbot with Spring AI

Lao Zhang · 2026/6/15 · About 21 min read · Tags: Intelligent Customer Service, Spring AI, Chatbot, NLP, Java, Enterprise Applications

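The story: a bill that made the CEO pound the table

In November 2024, Li Jianguo, technical director at a leading e-commerce company in Hangzhou, stared at a report and drummed his fingers on the desk for a long time.

The report read: monthly labor cost of the customer service center: 520,000 yuan.

The company handled an average of 11,000 customer inquiries a day: 38% returns and exchanges, 27% logistics queries, 21% product questions, and 14% complaints. Of these, 73% were highly repetitive standard questions, the same ones the support team answered hundreds of times every day.

Li Jianguo called in the engineering team and laid out three sets of numbers:

During peak hours (20:00-22:00 daily), a single agent had to handle 15-20 chat windows at once, and reply quality dropped sharply
Agents stayed only 8 months on average, training cost about 12,000 yuan per person, and annual turnover ran as high as 40%
Peak average wait time reached 11 minutes, which pushed that month's negative-review rate up by 2.3 percentage points

"If AI could resolve 70% of the standard questions, how much would we save in a year?" he asked.

Xiao Wang, one of the engineers, did a rough calculation: "At current labor costs, roughly 3.8 million yuan a year."

Li Jianguo decided on the spot: "Go live within three months. You get full resource support."

Three months later, the system went live. The first month's numbers:

AI self-service resolution rate: 74.3%
Monthly labor cost: down from 520,000 yuan to 86,000 yuan
User satisfaction: up from 3.8 to 4.6 (on a 5-point scale)
Average first response time: down from 3 min 20 s to 12 s

These are not numbers from a slide deck; they are a real report card that came off one Java engineer's keyboard.

Today I will take this system apart for you in full, from architecture to code.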

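1. Overall Architecture of the Intelligent Customer Service System

Before writing any code, get the architecture straight. In an enterprise-grade intelligent customer service system, each user message passes through intent recognition, knowledge retrieval, and answer generation, while conversation management holds the context and a handoff check decides when a human agent takes over.

Core modules:

Module	Responsibility	Key technology
Intent recognition	Determine what the user wants to do	LLM classification + rule engine
Knowledge retrieval	Find the most relevant answer	Vector database + RAG
Conversation management	Maintain context state	Redis + Spring AI Memory
Answer generation	Generate natural-language replies	GPT-4 / Chinese domestic LLMs
Human handoff	Detect when to hand off	Emotion detection + confidence check
Script management	Standardized reply templates	Template engine
Data analytics	Operational metrics	ClickHouse + real-time computation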
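
To make the module hand-offs concrete, here is a minimal plain-Java sketch of how a message might flow through the pipeline. The class `SupportPipeline`, its records, and the fixed handoff reply are hypothetical names for illustration only; the real services in the sections that follow carry far more state.

```java
import java.util.function.BiFunction;
import java.util.function.Function;

/** Hypothetical pipeline: intent recognition -> answer generation -> handoff check. */
final class SupportPipeline {
    record Intent(String code, double confidence) {}
    record Answer(String text, double confidence) {}
    record Reply(String text, boolean handedOff) {}

    private static final String HANDOFF_TEXT = "Transferring you to a human agent.";

    private final Function<String, Intent> intentRecognizer;
    private final BiFunction<String, Intent, Answer> answerGenerator;
    private final double handoffThreshold;

    SupportPipeline(Function<String, Intent> intentRecognizer,
                    BiFunction<String, Intent, Answer> answerGenerator,
                    double handoffThreshold) {
        this.intentRecognizer = intentRecognizer;
        this.answerGenerator = answerGenerator;
        this.handoffThreshold = handoffThreshold;
    }

    Reply handle(String userInput) {
        Intent intent = intentRecognizer.apply(userInput);
        // An explicit request for a human skips answer generation entirely.
        if ("HUMAN_TRANSFER".equals(intent.code())) {
            return new Reply(HANDOFF_TEXT, true);
        }
        Answer answer = answerGenerator.apply(userInput, intent);
        // A low-confidence answer is replaced by a handoff instead of being shown.
        if (answer.confidence() < handoffThreshold) {
            return new Reply(HANDOFF_TEXT, true);
        }
        return new Reply(answer.text(), false);
    }
}
```

Passing the recognizer and generator in as functions keeps the control flow testable without an LLM or a vector store behind it.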

2. Project Dependencies and Basic Configuration

<!-- pom.xml -->
<parent>
    <groupId>org.springframework.boot</groupId>
    <artifactId>spring-boot-starter-parent</artifactId>
    <version>3.3.0</version>
</parent>

<dependencies>
    <!-- Spring AI core: OpenAI starter (artifact name used by the 1.0.0 release) -->
    <dependency>
        <groupId>org.springframework.ai</groupId>
        <artifactId>spring-ai-starter-model-openai</artifactId>
        <version>1.0.0</version>
    </dependency>

    <!-- Vector database: Milvus (artifact name used by the 1.0.0 release) -->
    <dependency>
        <groupId>org.springframework.ai</groupId>
        <artifactId>spring-ai-starter-vector-store-milvus</artifactId>
        <version>1.0.0</version>
    </dependency>

    <!-- Redis: conversation context storage -->
    <dependency>
        <groupId>org.springframework.boot</groupId>
        <artifactId>spring-boot-starter-data-redis</artifactId>
    </dependency>

    <!-- Web -->
    <dependency>
        <groupId>org.springframework.boot</groupId>
        <artifactId>spring-boot-starter-web</artifactId>
    </dependency>

    <!-- WebSocket: real-time chat -->
    <dependency>
        <groupId>org.springframework.boot</groupId>
        <artifactId>spring-boot-starter-websocket</artifactId>
    </dependency>

    <!-- MyBatis Plus -->
    <dependency>
        <groupId>com.baomidou</groupId>
        <artifactId>mybatis-plus-boot-starter</artifactId>
        <version>3.5.7</version>
    </dependency>

    <!-- Lombok -->
    <dependency>
        <groupId>org.projectlombok</groupId>
        <artifactId>lombok</artifactId>
        <optional>true</optional>
    </dependency>

    <!-- Jackson -->
    <dependency>
        <groupId>com.fasterxml.jackson.core</groupId>
        <artifactId>jackson-databind</artifactId>
    </dependency>
</dependencies>

# application.yml
spring:
  ai:
    openai:
      api-key: ${OPENAI_API_KEY}
      base-url: ${OPENAI_BASE_URL:https://api.openai.com}
      chat:
        options:
          model: gpt-4o
          temperature: 0.3
          max-tokens: 2048
      embedding:
        options:
          model: text-embedding-3-small
    # Kept under the same spring.ai key: a second "ai:" block would be a duplicate YAML key
    vectorstore:
      milvus:
        client:
          host: ${MILVUS_HOST:localhost}
          port: 19530
        collection-name: customer_service_kb
        embedding-dimension: 1536

  data:
    redis:
      host: ${REDIS_HOST:localhost}
      port: 6379
      password: ${REDIS_PASSWORD:}
      database: 0

# Customer service settings
customer-service:
  # Human handoff thresholds
  human-transfer:
    confidence-threshold: 0.4        # hand off when answer confidence is below this value
    emotion-score-threshold: -0.6    # hand off when the emotion score is below this value
    max-rounds-without-solution: 3   # hand off after 3 consecutive unresolved rounds
  # Session timeout
  session:
    timeout-minutes: 30
    max-history-rounds: 10
  # Rate limiting
  rate-limit:
    max-requests-per-second: 100

3. Core Domain Model

// Chat session
@Data
@Builder
@NoArgsConstructor
@AllArgsConstructor
public class ChatSession {
    private String sessionId;
    private String userId;
    private String channelType;      // WEB/APP/WECHAT/PHONE
    private SessionStatus status;    // ACTIVE/WAITING_HUMAN/IN_HUMAN/CLOSED
    private List<ChatMessage> history;
    private Map<String, Object> context;    // Context variables (order number, product, etc.)
    private Integer roundCount;
    private Integer unsolvableRounds;       // Consecutive unresolved rounds
    private LocalDateTime startTime;
    private LocalDateTime lastActiveTime;
    private Integer satisfactionScore;      // 1-5
}

// Chat message
// The no-args/all-args constructors let the session be deserialized when read back from Redis
@Data
@Builder
@NoArgsConstructor
@AllArgsConstructor
public class ChatMessage {
    private String messageId;
    private String role;             // USER/ASSISTANT/SYSTEM
    private String content;
    private MessageType type;        // TEXT/IMAGE/FILE
    private LocalDateTime timestamp;
    private IntentResult intent;     // User intent (USER messages only)
    private Double confidenceScore;  // Confidence of the AI reply
    private String knowledgeSource;  // Source document of the answer
}

// Intent recognition result
@Data
@Builder
@NoArgsConstructor
@AllArgsConstructor
public class IntentResult {
    private String intentCode;       // Intent code
    private String intentName;       // Intent name
    private Double confidence;       // Confidence, 0-1
    private Map<String, String> entities;  // Extracted entities (order number, product name, etc.)
    private String originalText;
}

// Knowledge base document
@Data
@Builder
@NoArgsConstructor
@AllArgsConstructor
@TableName("kb_document")
public class KnowledgeDocument {
    @TableId(type = IdType.AUTO)
    private Long id;
    private String category;         // FAQ/POLICY/CASE
    private String title;
    private String content;
    private String tags;
    private Integer hitCount;        // Number of times this document was hit
    private Boolean enabled;
    private LocalDateTime createdAt;
    private LocalDateTime updatedAt;
}

public enum SessionStatus {
    ACTIVE,           // In progress
    WAITING_HUMAN,    // Waiting for a human agent
    IN_HUMAN,         // Being handled by a human agent
    CLOSED            // Closed
}
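
The `SessionStatus` enum implies a lifecycle, though the article does not spell out which transitions are legal. The sketch below assumes one plausible set (ACTIVE to WAITING_HUMAN to IN_HUMAN to CLOSED, with early close allowed from any open state); the class `SessionTransitions` and the transition table are assumptions for illustration, not part of the original system.

```java
import java.util.EnumMap;
import java.util.EnumSet;
import java.util.Map;
import java.util.Set;

/** Hypothetical guard for session status changes; the allowed transitions are assumed. */
final class SessionTransitions {
    enum Status { ACTIVE, WAITING_HUMAN, IN_HUMAN, CLOSED }

    private static final Map<Status, Set<Status>> ALLOWED = new EnumMap<>(Status.class);
    static {
        ALLOWED.put(Status.ACTIVE, EnumSet.of(Status.WAITING_HUMAN, Status.CLOSED));
        ALLOWED.put(Status.WAITING_HUMAN, EnumSet.of(Status.IN_HUMAN, Status.CLOSED));
        ALLOWED.put(Status.IN_HUMAN, EnumSet.of(Status.CLOSED));
        ALLOWED.put(Status.CLOSED, EnumSet.noneOf(Status.class)); // terminal state
    }

    static boolean canMove(Status from, Status to) {
        return ALLOWED.get(from).contains(to);
    }

    /** Returns the new status, or throws if the transition is not in the table. */
    static Status move(Status from, Status to) {
        if (!canMove(from, to)) {
            throw new IllegalStateException("Illegal transition: " + from + " -> " + to);
        }
        return to;
    }
}
```

Centralizing the table in one place means a closed session cannot be silently reopened by a late message.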

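4. Intent Recognition Engine

Intent recognition is the entry point of the whole system, and its accuracy directly determines the quality of everything downstream. It uses a two-layer design that combines a rule engine with an LLM: the rule engine handles high-certainty intents (for example, messages containing the keyword "退款", refund), and the LLM handles ambiguous ones.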

@Service
@Slf4j
public class IntentRecognitionService {

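    private final ChatClient chatClient;
    private final RuleEngineService ruleEngineService;

    // Spring AI auto-configures a ChatClient.Builder bean rather than a ChatClient,
    // so the client is built here from the injected builder
    public IntentRecognitionService(ChatClient.Builder chatClientBuilder,
                                    RuleEngineService ruleEngineService) {
        this.chatClient = chatClientBuilder.build();
        this.ruleEngineService = ruleEngineService;
    }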

    // 预定义意图列表（实际从数据库加载）
    private static final List<IntentDefinition> INTENT_DEFINITIONS = List.of(
        new IntentDefinition("REFUND_APPLY", "申请退款", List.of("退款", "退钱", "要退", "申请退")),
        new IntentDefinition("ORDER_QUERY", "订单查询", List.of("订单", "什么时候发货", "发货了吗")),
        new IntentDefinition("LOGISTICS_QUERY", "物流查询", List.of("快递", "物流", "到哪了", "几天到")),
        new IntentDefinition("PRODUCT_CONSULT", "产品咨询", List.of("这个产品", "有没有", "支持吗")),
        new IntentDefinition("COMPLAINT", "投诉", List.of("投诉", "举报", "太差了", "不满意")),
        new IntentDefinition("EXCHANGE", "换货", List.of("换货", "换一个", "想换")),
        new IntentDefinition("RETURN", "退货", List.of("退货", "不想要了", "退回去")),
        new IntentDefinition("AFTER_SALE", "售后服务", List.of("售后", "保修", "维修")),
        new IntentDefinition("PRICE_QUERY", "价格查询", List.of("多少钱", "价格", "优惠", "打折")),
        new IntentDefinition("HUMAN_TRANSFER", "转人工", List.of("转人工", "找人工", "人工客服", "真人"))
    );

    /**
     * 识别用户意图
     */
    public IntentResult recognize(String userInput, ChatSession session) {
        // 第一层：规则引擎快速匹配
        IntentResult ruleResult = ruleEngineService.match(userInput, INTENT_DEFINITIONS);
        if (ruleResult != null && ruleResult.getConfidence() >= 0.9) {
            log.debug("Rule engine matched intent: {} with confidence: {}",
                ruleResult.getIntentCode(), ruleResult.getConfidence());
            return ruleResult;
        }

        // 第二层：LLM深度理解
        return recognizeByLLM(userInput, session);
    }

    private IntentResult recognizeByLLM(String userInput, ChatSession session) {
        // 构建最近3轮对话上下文
        String recentContext = buildRecentContext(session, 3);

        String intentList = INTENT_DEFINITIONS.stream()
            .map(d -> String.format("- %s（%s）", d.getCode(), d.getName()))
            .collect(Collectors.joining("\n"));

        String prompt = String.format("""
            你是一个电商客服意图识别系统。根据用户最新消息和对话历史，识别用户意图。
            
            可选意图列表：
            %s
            - UNKNOWN（无法识别的意图）
            
            对话历史（最近3轮）：
            %s
            
            用户最新消息：%s
            
            请以JSON格式返回，包含以下字段：
            - intentCode: 意图编码（从上面列表选择）
            - confidence: 置信度（0.0-1.0）
            - entities: 提取的关键实体，如{"orderId":"12345","productName":"手机"}
            - reasoning: 简短推理说明（不超过30字）
            
            只返回JSON，不要其他内容。
            """, intentList, recentContext, userInput);

        String response = chatClient.prompt()
            .user(prompt)
            .call()
            .content();

        return parseIntentResponse(response, userInput);
    }

    private static final ObjectMapper MAPPER = new ObjectMapper();

    private IntentResult parseIntentResponse(String response, String originalText) {
        try {
            // Strip any markdown code fences the model may have added
            String cleanJson = response.replaceAll("```json\\s*", "").replaceAll("```\\s*", "").trim();
            JsonNode node = MAPPER.readTree(cleanJson);

            // path() returns a MissingNode instead of null, so the defaults apply when a field is absent
            String intentCode = node.path("intentCode").asText("UNKNOWN");
            return IntentResult.builder()
                .intentCode(intentCode)
                .intentName(getIntentName(intentCode))
                .confidence(node.path("confidence").asDouble(0.5))
                .entities(parseEntities(node.get("entities")))
                .originalText(originalText)
                .build();
        } catch (Exception e) {
            log.warn("Failed to parse intent response: {}", response, e);
            return IntentResult.builder()
                .intentCode("UNKNOWN")
                .intentName("未知意图")
                .confidence(0.0)
                .entities(new HashMap<>())   // never null, so callers can iterate safely
                .originalText(originalText)
                .build();
        }
    }

    private String buildRecentContext(ChatSession session, int rounds) {
        if (session.getHistory() == null || session.getHistory().isEmpty()) {
            return "（无对话历史）";
        }
        List<ChatMessage> history = session.getHistory();
        int start = Math.max(0, history.size() - rounds * 2);
        return history.subList(start, history.size()).stream()
            .map(m -> String.format("%s: %s", "USER".equals(m.getRole()) ? "用户" : "客服", m.getContent()))
            .collect(Collectors.joining("\n"));
    }

    private Map<String, String> parseEntities(JsonNode entitiesNode) {
        Map<String, String> entities = new HashMap<>();
        if (entitiesNode != null && entitiesNode.isObject()) {
            entitiesNode.fields().forEachRemaining(entry ->
                entities.put(entry.getKey(), entry.getValue().asText()));
        }
        return entities;
    }

    private String getIntentName(String code) {
        return INTENT_DEFINITIONS.stream()
            .filter(d -> d.getCode().equals(code))
            .map(IntentDefinition::getName)
            .findFirst()
            .orElse("未知意图");
    }
}

// 规则引擎 - 基于关键词的快速匹配
@Service
public class RuleEngineService {

    public IntentResult match(String input, List<IntentDefinition> definitions) {
        String lowerInput = input.toLowerCase();

        // 计算每个意图的匹配分数
        IntentDefinition bestMatch = null;
        int maxMatchCount = 0;

        for (IntentDefinition def : definitions) {
            int matchCount = 0;
            for (String keyword : def.getKeywords()) {
                if (lowerInput.contains(keyword.toLowerCase())) {
                    matchCount++;
                }
            }
            if (matchCount > maxMatchCount) {
                maxMatchCount = matchCount;
                bestMatch = def;
            }
        }

        if (bestMatch == null || maxMatchCount == 0) {
            return null;
        }

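        // Heuristic confidence: 0.6 for the first matched keyword, +0.15 per additional
        // keyword, capped at 0.95. Rounded to two decimals because 0.6 + 2 * 0.15 evaluates
        // to 0.8999999999999999 in double arithmetic and would miss a >= 0.9 check.
        double raw = Math.min(0.95, 0.6 + (maxMatchCount - 1) * 0.15);
        double confidence = Math.round(raw * 100) / 100.0;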

        return IntentResult.builder()
            .intentCode(bestMatch.getCode())
            .intentName(bestMatch.getName())
            .confidence(confidence)
            .originalText(input)
            .entities(new HashMap<>())
            .build();
    }
}
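
The keyword scoring used by the rule engine can be isolated and checked on its own. The sketch below is a standalone variant, not the article's class: `KeywordScore` is a hypothetical name, and it counts in integer hundredths because a double expression such as `0.6 + 2 * 0.15` evaluates to 0.8999999999999999 and would slip under a `>= 0.9` fast-path threshold.

```java
import java.util.List;

/** Hypothetical standalone version of the keyword-hit confidence heuristic. */
final class KeywordScore {

    /** 0.60 for the first hit, +0.15 per extra hit, capped at 0.95; 0 when nothing matches. */
    static double confidence(int hits) {
        if (hits <= 0) {
            return 0.0;
        }
        // Integer hundredths avoid accumulating binary floating-point error.
        int hundredths = Math.min(95, 60 + (hits - 1) * 15);
        return hundredths / 100.0;
    }

    /** Counts how many keywords occur in the input, ignoring case. */
    static int countHits(String input, List<String> keywords) {
        String lower = input.toLowerCase();
        int hits = 0;
        for (String keyword : keywords) {
            if (lower.contains(keyword.toLowerCase())) {
                hits++;
            }
        }
        return hits;
    }
}
```

Under this heuristic a single keyword hit scores 0.6, so with a 0.9 fast-path threshold a message needs three distinct keyword hits before the rule engine answers on its own; everything else falls through to the LLM.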

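5. Knowledge Base Integration and RAG Retrieval

The knowledge base is the "brain" of the intelligent customer service system. FAQs, policy documents, and historical cases are all vectorized and stored together, and retrieval uses semantic similarity to find the most relevant content.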

@Service
@Slf4j
public class KnowledgeBaseService {

    @Autowired
    private VectorStore vectorStore;

    @Autowired
    private KnowledgeDocumentMapper documentMapper;

    @Autowired
    private EmbeddingModel embeddingModel;

    /**
     * Retrieve relevant knowledge
     */
    public List<KnowledgeSearchResult> search(String query, String category, int topK) {
        // Build the metadata filter: always require enabled documents,
        // and narrow by category only when one is given
        FilterExpressionBuilder filter = new FilterExpressionBuilder();
        FilterExpressionBuilder.Op filterOp = filter.eq("enabled", true);
        if (StringUtils.hasText(category)) {
            filterOp = filter.and(filter.eq("category", category), filterOp);
        }

        // Spring AI 1.0.0 builder API (replaces SearchRequest.query(...).withTopK(...))
        SearchRequest request = SearchRequest.builder()
            .query(query)
            .topK(topK)
            .similarityThreshold(0.65)
            .filterExpression(filterOp.build())
            .build();

        List<Document> documents = vectorStore.similaritySearch(request);

        return documents.stream()
            .map(doc -> KnowledgeSearchResult.builder()
                .documentId(doc.getMetadata().get("id").toString())
                .title(doc.getMetadata().get("title").toString())
                .content(doc.getText())
                .category(doc.getMetadata().get("category").toString())
                // getScore() is the similarity score (higher is better); the "distance"
                // metadata entry is a distance and may not be stored as a Double
                .score(doc.getScore())
                .build())
            .collect(Collectors.toList());
    }

    /**
     * 向量化存储知识文档
     */
    @Transactional
    public void indexDocument(KnowledgeDocument doc) {
        // 长文档分块处理
        List<String> chunks = splitIntoChunks(doc.getContent(), 500, 50);

        List<Document> vectorDocs = new ArrayList<>();
        for (int i = 0; i < chunks.size(); i++) {
            Map<String, Object> metadata = new HashMap<>();
            metadata.put("id", doc.getId().toString());
            metadata.put("title", doc.getTitle());
            metadata.put("category", doc.getCategory());
            metadata.put("tags", doc.getTags());
            metadata.put("enabled", doc.getEnabled());
            metadata.put("chunkIndex", i);
            metadata.put("totalChunks", chunks.size());

            vectorDocs.add(new Document(chunks.get(i), metadata));
        }

        vectorStore.add(vectorDocs);
        log.info("Indexed document: {} with {} chunks", doc.getTitle(), chunks.size());
    }

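    /**
     * Text chunking: packs whole paragraphs into chunks of roughly chunkSize characters and
     * carries the last `overlap` characters over into the next chunk. A single paragraph
     * longer than chunkSize is not split further.
     */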
    private List<String> splitIntoChunks(String text, int chunkSize, int overlap) {
        List<String> chunks = new ArrayList<>();
        // 按段落分割
        String[] paragraphs = text.split("\n\n+");
        StringBuilder currentChunk = new StringBuilder();

        for (String paragraph : paragraphs) {
            if (currentChunk.length() + paragraph.length() > chunkSize && currentChunk.length() > 0) {
                chunks.add(currentChunk.toString().trim());
                // 保留最后overlap个字符作为重叠
                String lastPart = currentChunk.toString();
                currentChunk = new StringBuilder();
                if (lastPart.length() > overlap) {
                    currentChunk.append(lastPart.substring(lastPart.length() - overlap));
                }
            }
            currentChunk.append(paragraph).append("\n\n");
        }

        if (currentChunk.length() > 0) {
            chunks.add(currentChunk.toString().trim());
        }

        return chunks.isEmpty() ? List.of(text) : chunks;
    }

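    /**
     * Incremental knowledge base update (no full rebuild needed)
     */
    @Transactional
    public void updateDocument(KnowledgeDocument doc) {
        // Delete the old chunks by metadata filter: each chunk is stored under its own
        // generated vector ID, so deleting by the business ID would match nothing
        vectorStore.delete(new FilterExpressionBuilder().eq("id", doc.getId().toString()).build());
        // Re-index
        indexDocument(doc);
        // Update the database
        documentMapper.updateById(doc);
    }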

    /**
     * 记录知识命中，用于热门问题分析
     */
    public void recordHit(String documentId) {
        documentMapper.incrementHitCount(Long.parseLong(documentId));
    }
}

// 知识库搜索结果
@Data
@Builder
public class KnowledgeSearchResult {
    private String documentId;
    private String title;
    private String content;
    private String category;
    private Double score;
}
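
A paragraph-based splitter leaves any single oversized paragraph as one large chunk. The sketch below shows one way to bound every chunk: pack short paragraphs together, and cut oversized ones with a sliding window. `TextChunker` is a hypothetical name, lengths are measured in UTF-16 chars, and as a simplification the overlap is applied only inside oversized paragraphs.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical chunker: packs paragraphs, windows oversized ones with overlap. */
final class TextChunker {

    static List<String> split(String text, int chunkSize, int overlap) {
        if (chunkSize <= 0 || overlap < 0 || overlap >= chunkSize) {
            throw new IllegalArgumentException("need 0 <= overlap < chunkSize");
        }
        List<String> chunks = new ArrayList<>();
        StringBuilder current = new StringBuilder();

        for (String paragraph : text.split("\\n\\s*\\n")) {
            String p = paragraph.strip();
            if (p.isEmpty()) {
                continue;
            }
            // Flush when appending this paragraph (plus a newline) would exceed the budget.
            if (current.length() > 0 && current.length() + 1 + p.length() > chunkSize) {
                chunks.add(current.toString());
                current.setLength(0);
            }
            if (p.length() > chunkSize) {
                // Oversized paragraph: fixed-size windows that share `overlap` characters.
                int step = chunkSize - overlap;
                for (int start = 0; start < p.length(); start += step) {
                    int end = Math.min(p.length(), start + chunkSize);
                    chunks.add(p.substring(start, end));
                    if (end == p.length()) {
                        break;
                    }
                }
            } else {
                if (current.length() > 0) {
                    current.append('\n');
                }
                current.append(p);
            }
        }
        if (current.length() > 0) {
            chunks.add(current.toString());
        }
        return chunks;
    }
}
```

With this shape no chunk exceeds `chunkSize`, which keeps embedding requests within a predictable size.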

6. Answer Generation Service (the RAG Core)

@Service
@Slf4j
public class AnswerGenerationService {

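    private final ChatClient chatClient;
    private final KnowledgeBaseService knowledgeBaseService;
    private final ScriptTemplateService scriptTemplateService;

    // Spring AI auto-configures a ChatClient.Builder bean, not a ChatClient
    public AnswerGenerationService(ChatClient.Builder chatClientBuilder,
                                   KnowledgeBaseService knowledgeBaseService,
                                   ScriptTemplateService scriptTemplateService) {
        this.chatClient = chatClientBuilder.build();
        this.knowledgeBaseService = knowledgeBaseService;
        this.scriptTemplateService = scriptTemplateService;
    }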

    /**
     * 生成客服回复
     */
    public AnswerResult generateAnswer(String userInput, IntentResult intent, ChatSession session) {
        // 1. 检索相关知识
        List<KnowledgeSearchResult> knowledge = knowledgeBaseService.search(
            buildSearchQuery(userInput, intent),
            mapIntentToCategory(intent.getIntentCode()),
            5
        );

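        // 2. Find the best retrieval score (a missing score is skipped; no results counts as 0)
        double maxKnowledgeScore = knowledge.stream()
            .map(KnowledgeSearchResult::getScore)
            .filter(score -> score != null)
            .mapToDouble(Double::doubleValue)
            .max().orElse(0.0);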

        // 3. 话术模板优先（确定性最高）
        String scriptTemplate = scriptTemplateService.getTemplate(intent.getIntentCode());
        if (scriptTemplate != null && maxKnowledgeScore >= 0.85) {
            return generateFromTemplate(scriptTemplate, intent, session);
        }

        // 4. RAG生成
        return generateWithRAG(userInput, intent, knowledge, session);
    }

    private AnswerResult generateWithRAG(String userInput, IntentResult intent,
                                         List<KnowledgeSearchResult> knowledge, ChatSession session) {
        // 构建知识上下文
        String knowledgeContext = knowledge.stream()
            .map(k -> String.format("【%s】%s", k.getTitle(), k.getContent()))
            .collect(Collectors.joining("\n\n"));

        // 构建对话历史
        String chatHistory = buildChatHistory(session, 5);

        String systemPrompt = """
            你是"优购商城"的智能客服助手。你的职责是帮助用户解决购物相关问题。
            
            回复原则：
            1. 使用亲切、简洁的中文，避免技术术语
            2. 优先使用参考资料中的信息，不要编造内容
            3. 如果参考资料无法回答用户问题，诚实告知并建议转人工
            4. 涉及金额、时间、规则等具体信息必须来自参考资料
            5. 回复长度适中，200字以内，必要时用序号列举
            6. 结尾可以询问"请问还有其他问题吗？"
            
            当前日期：%s
            """.formatted(LocalDate.now());

        String userPrompt = String.format("""
            参考资料：
            %s
            
            对话历史：
            %s
            
            用户意图：%s（置信度：%.0f%%）
            用户消息：%s
            
            请根据参考资料和对话历史，生成合适的客服回复。
            同时评估回复的置信度（0.0-1.0），以JSON格式返回：
            {"answer": "回复内容", "confidence": 0.85, "sources": ["来源文档ID列表"]}
            """, knowledgeContext, chatHistory, intent.getIntentName(),
            intent.getConfidence() * 100, userInput);

        String response = chatClient.prompt()
            .system(systemPrompt)
            .user(userPrompt)
            .call()
            .content();

        return parseAnswerResponse(response, knowledge);
    }

    private AnswerResult generateFromTemplate(String template, IntentResult intent, ChatSession session) {
        // Collect template variables; entities and context may be absent
        Map<String, Object> variables = new HashMap<>();
        if (intent.getEntities() != null) {
            variables.putAll(intent.getEntities());
        }
        if (session.getContext() != null) {
            variables.putAll(session.getContext());
        }
        variables.put("userName", session.getUserId());

        String answer = TemplateEngine.render(template, variables);
        return AnswerResult.builder()
            .answer(answer)
            .confidence(0.95)
            .fromTemplate(true)
            .build();
    }

    private String buildSearchQuery(String userInput, IntentResult intent) {
        // Append extracted entities to the search query to improve retrieval precision
        StringBuilder query = new StringBuilder(userInput);
        if (intent.getEntities() != null) {
            intent.getEntities().values().forEach(v -> query.append(" ").append(v));
        }
        return query.toString();
    }

    private String mapIntentToCategory(String intentCode) {
        return switch (intentCode) {
            case "REFUND_APPLY", "RETURN", "EXCHANGE", "AFTER_SALE" -> "POLICY";
            case "ORDER_QUERY", "LOGISTICS_QUERY" -> "FAQ";
            case "PRODUCT_CONSULT", "PRICE_QUERY" -> "FAQ";
            case "COMPLAINT" -> "CASE";
            default -> null;
        };
    }

    private String buildChatHistory(ChatSession session, int maxRounds) {
        if (session.getHistory() == null) return "";
        List<ChatMessage> history = session.getHistory();
        int start = Math.max(0, history.size() - maxRounds * 2);
        return history.subList(start, history.size()).stream()
            .map(m -> ("USER".equals(m.getRole()) ? "用户" : "客服") + ": " + m.getContent())
            .collect(Collectors.joining("\n"));
    }

    private AnswerResult parseAnswerResponse(String response, List<KnowledgeSearchResult> knowledge) {
        try {
            // Strip any markdown code fences the model may have added
            String cleanJson = response.replaceAll("```json\\s*", "").replaceAll("```\\s*", "").trim();
            ObjectMapper mapper = new ObjectMapper();
            JsonNode node = mapper.readTree(cleanJson);
            // path() never returns null; a missing or empty answer is treated as a parse failure
            String answer = node.path("answer").asText("");
            if (answer.isBlank()) {
                throw new IllegalArgumentException("Missing 'answer' field");
            }
            return AnswerResult.builder()
                .answer(answer)
                .confidence(node.path("confidence").asDouble(0.5))
                .sources(knowledge.stream().map(KnowledgeSearchResult::getDocumentId).collect(Collectors.toList()))
                .build();
        } catch (Exception e) {
            log.warn("Failed to parse answer response, using raw text", e);
            return AnswerResult.builder()
                .answer(response)
                .confidence(0.5)
                .build();
        }
    }
}
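
Stripping markdown fences with a regex covers the common case, but models sometimes wrap the JSON in a sentence or two as well. One defensive option, sketched here with the hypothetical helper `LlmJson`, is to cut out the span from the first `{` to the last `}` before handing it to the JSON parser. It does not validate the JSON; it only narrows what the parser sees.

```java
/** Hypothetical helper that isolates a JSON object inside a free-form LLM reply. */
final class LlmJson {

    /** Returns the text from the first '{' to the last '}', or null if no such span exists. */
    static String extractObject(String reply) {
        if (reply == null) {
            return null;
        }
        int start = reply.indexOf('{');
        int end = reply.lastIndexOf('}');
        if (start < 0 || end <= start) {
            return null;
        }
        return reply.substring(start, end + 1);
    }
}
```

A null result can be routed to the same fallback branch that handles a parse exception.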

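7. Multi-Turn Conversation Management

Multi-turn dialogue is the core capability that separates intelligent customer service from a simple Q&A bot. It requires maintaining session state, understanding context, and asking clarifying questions when needed.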

@Service
@Slf4j
public class ConversationManagerService {

    @Autowired
    private RedisTemplate<String, Object> redisTemplate;

    private static final String SESSION_KEY_PREFIX = "cs:session:";
    private static final int SESSION_TTL_MINUTES = 30;

    /**
     * 获取或创建会话
     */
    public ChatSession getOrCreateSession(String sessionId, String userId, String channel) {
        String key = SESSION_KEY_PREFIX + sessionId;
        ChatSession session = (ChatSession) redisTemplate.opsForValue().get(key);

        if (session == null) {
            session = ChatSession.builder()
                .sessionId(sessionId)
                .userId(userId)
                .channelType(channel)
                .status(SessionStatus.ACTIVE)
                .history(new ArrayList<>())
                .context(new HashMap<>())
                .roundCount(0)
                .unsolvableRounds(0)
                .startTime(LocalDateTime.now())
                .lastActiveTime(LocalDateTime.now())
                .build();
            log.info("Created new chat session: {} for user: {}", sessionId, userId);
        }

        return session;
    }

    /**
     * 追加用户消息到历史
     */
    public void appendUserMessage(ChatSession session, String content, IntentResult intent) {
        ChatMessage message = ChatMessage.builder()
            .messageId(UUID.randomUUID().toString())
            .role("USER")
            .content(content)
            .type(MessageType.TEXT)
            .timestamp(LocalDateTime.now())
            .intent(intent)
            .build();

        session.getHistory().add(message);
        session.setRoundCount(session.getRoundCount() + 1);
        session.setLastActiveTime(LocalDateTime.now());

        // 更新上下文实体（订单号等）
        if (intent != null && intent.getEntities() != null) {
            session.getContext().putAll(intent.getEntities());
        }
    }

    /**
     * 追加AI回复到历史
     */
    public void appendAssistantMessage(ChatSession session, AnswerResult answer) {
        ChatMessage message = ChatMessage.builder()
            .messageId(UUID.randomUUID().toString())
            .role("ASSISTANT")
            .content(answer.getAnswer())
            .type(MessageType.TEXT)
            .timestamp(LocalDateTime.now())
            .confidenceScore(answer.getConfidence())
            .build();

        session.getHistory().add(message);

        // 追踪未解决轮数
        if (answer.getConfidence() < 0.5) {
            session.setUnsolvableRounds(session.getUnsolvableRounds() + 1);
        } else {
            session.setUnsolvableRounds(0); // 有效回复则重置
        }
    }

    /**
     * Save the session (renews the TTL)
     */
    public void saveSession(ChatSession session) {
        String key = SESSION_KEY_PREFIX + session.getSessionId();
        // Cap the history so the value stored in Redis stays small
        List<ChatMessage> history = session.getHistory();
        if (history.size() > 20) {
            // Copy the window: subList() returns a view that is not Serializable
            session.setHistory(new ArrayList<>(history.subList(history.size() - 20, history.size())));
        }
        redisTemplate.opsForValue().set(key, session, SESSION_TTL_MINUTES, TimeUnit.MINUTES);
    }

    /**
     * 生成澄清问题（当意图不明确时）
     */
    public String generateClarificationQuestion(String userInput, List<IntentResult> candidates) {
        if (candidates.size() <= 1) return null;

        String options = candidates.stream()
            .map(c -> "- " + c.getIntentName())
            .collect(Collectors.joining("\n"));

        return String.format("您好，我理解您的问题可能是关于以下几个方面，请问您想了解哪个？\n%s", options);
    }

    /**
     * 关闭会话
     */
    public void closeSession(String sessionId) {
        String key = SESSION_KEY_PREFIX + sessionId;
        ChatSession session = (ChatSession) redisTemplate.opsForValue().get(key);
        if (session != null) {
            session.setStatus(SessionStatus.CLOSED);
            // 保留30分钟供数据分析
            redisTemplate.opsForValue().set(key, session, 30, TimeUnit.MINUTES);
        }
    }
}
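
The two behaviors that matter in session storage, trimming the history and expiring idle sessions, can be exercised without Redis. The sketch below is an in-memory stand-in under stated assumptions: `InMemorySessionStore` is a hypothetical class, the history is reduced to plain strings, and the current time is passed in explicitly so expiry is deterministic in tests.

```java
import java.time.Duration;
import java.time.Instant;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical in-memory stand-in for the Redis-backed session store. */
final class InMemorySessionStore {
    private record Entry(List<String> history, Instant expiresAt) {}

    private final Map<String, Entry> entries = new ConcurrentHashMap<>();
    private final Duration ttl;
    private final int maxMessages;

    InMemorySessionStore(Duration ttl, int maxMessages) {
        this.ttl = ttl;
        this.maxMessages = maxMessages;
    }

    /** Stores a trimmed copy of the history and renews the TTL. */
    void save(String sessionId, List<String> history, Instant now) {
        int from = Math.max(0, history.size() - maxMessages);
        // Copy the window: subList() is only a view of the caller's list.
        List<String> trimmed = new ArrayList<>(history.subList(from, history.size()));
        entries.put(sessionId, new Entry(trimmed, now.plus(ttl)));
    }

    /** Returns the stored history, or an empty list if the session is missing or expired. */
    List<String> load(String sessionId, Instant now) {
        Entry entry = entries.get(sessionId);
        if (entry == null) {
            return List.of();
        }
        if (!now.isBefore(entry.expiresAt())) {
            entries.remove(sessionId);
            return List.of();
        }
        return List.copyOf(entry.history());
    }
}
```

Because every save renews the expiry, an active conversation never times out mid-dialogue, while an abandoned one disappears after the TTL.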

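8. Human Handoff Trigger

Deciding when to hand off to a human is the system's key safety valve and has to be accurate: a missed handoff makes for a terrible user experience, and an unnecessary one wastes agent time.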

@Service
@Slf4j
public class HumanTransferService {

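    private final ChatClient chatClient;

    // Spring AI auto-configures a ChatClient.Builder bean, not a ChatClient
    public HumanTransferService(ChatClient.Builder chatClientBuilder) {
        this.chatClient = chatClientBuilder.build();
    }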

    @Value("${customer-service.human-transfer.confidence-threshold:0.4}")
    private double confidenceThreshold;

    @Value("${customer-service.human-transfer.emotion-score-threshold:-0.6}")
    private double emotionScoreThreshold;

    @Value("${customer-service.human-transfer.max-rounds-without-solution:3}")
    private int maxRoundsWithoutSolution;

    /**
     * 判断是否需要转人工
     */
    public TransferDecision shouldTransferToHuman(ChatSession session, String userInput,
                                                   IntentResult intent, AnswerResult answer) {
        // 规则1：用户主动要求转人工
        if ("HUMAN_TRANSFER".equals(intent.getIntentCode())) {
            return TransferDecision.transfer(TransferReason.USER_REQUEST, "用户主动请求转人工");
        }

        // 规则2：AI置信度过低
        if (answer.getConfidence() < confidenceThreshold) {
            return TransferDecision.transfer(TransferReason.LOW_CONFIDENCE,
                String.format("AI置信度%.0f%%低于阈值%.0f%%", answer.getConfidence() * 100, confidenceThreshold * 100));
        }

        // 规则3：连续多轮未解决
        if (session.getUnsolvableRounds() >= maxRoundsWithoutSolution) {
            return TransferDecision.transfer(TransferReason.REPEATED_UNSOLVED,
                String.format("连续%d轮未能解决问题", session.getUnsolvableRounds()));
        }

        // 规则4：情绪检测（投诉类意图特别关注）
        if ("COMPLAINT".equals(intent.getIntentCode())) {
            double emotionScore = detectEmotion(userInput);
            if (emotionScore < emotionScoreThreshold) {
                return TransferDecision.transfer(TransferReason.NEGATIVE_EMOTION,
                    String.format("用户情绪评分%.2f，用户情绪较差", emotionScore));
            }
        }

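        // Rule 5: high-value orders (above 5000 yuan) always go to a human
        Object orderAmount = session.getContext().get("orderAmount");
        if (orderAmount != null) {
            try {
                double amount = Double.parseDouble(orderAmount.toString());
                if (amount > 5000) {
                    return TransferDecision.transfer(TransferReason.HIGH_VALUE_ORDER,
                        String.format("订单金额%.0f元超过转人工阈值", amount));
                }
            } catch (NumberFormatException e) {
                // A malformed amount should not break the handoff decision
                log.warn("Ignoring non-numeric orderAmount in session {}: {}", session.getSessionId(), orderAmount);
            }
        }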

        return TransferDecision.noTransfer();
    }

    /**
     * 情绪检测 - 返回-1到1的情绪分数
     */
    private double detectEmotion(String text) {
        String prompt = String.format("""
            分析以下文本的情绪倾向，返回一个-1到1之间的数字：
            -1 = 极度负面（愤怒、强烈不满）
            -0.5 = 负面（不满、失望）
            0 = 中性
            0.5 = 正面（满意）
            1 = 极度正面（非常满意、感谢）
            
            文本：%s
            
            只返回数字，不要其他内容。
            """, text);

        try {
            String response = chatClient.prompt().user(prompt).call().content().trim();
            return Double.parseDouble(response);
        } catch (Exception e) {
            log.warn("Failed to detect emotion for text: {}", text, e);
            return 0.0;
        }
    }

    /**
     * 执行转人工
     */
    public void executeTransfer(ChatSession session, TransferReason reason) {
        session.setStatus(SessionStatus.WAITING_HUMAN);
        log.info("Session {} transferred to human. Reason: {}", session.getSessionId(), reason);

        // 这里接入企业的人工客服系统API（如Zendesk、自研系统等）
        // humanServiceClient.createTicket(session, reason);
    }
}

@Data
@Builder
public class TransferDecision {
    private boolean needTransfer;
    private TransferReason reason;
    private String description;

    public static TransferDecision transfer(TransferReason reason, String description) {
        return TransferDecision.builder().needTransfer(true).reason(reason).description(description).build();
    }

    public static TransferDecision noTransfer() {
        return TransferDecision.builder().needTransfer(false).build();
    }
}

public enum TransferReason {
    USER_REQUEST, LOW_CONFIDENCE, REPEATED_UNSOLVED, NEGATIVE_EMOTION, HIGH_VALUE_ORDER
}
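
Emotion detection depends on the model returning a bare number, which is not guaranteed: a reply may carry a label or trailing punctuation, or fall outside the requested range. The sketch below shows a more tolerant parse. `EmotionScoreParser` is a hypothetical helper, and taking the first number in the reply is an assumption about how such replies usually look.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical tolerant parser for an LLM-produced emotion score. */
final class EmotionScoreParser {
    // Optional sign, digits, optional fractional part.
    private static final Pattern NUMBER = Pattern.compile("[-+]?\\d+(?:\\.\\d+)?");

    /** Extracts the first number in the reply, clamped to [-1, 1]; returns fallback if none. */
    static double parse(String reply, double fallback) {
        if (reply == null) {
            return fallback;
        }
        Matcher matcher = NUMBER.matcher(reply);
        if (!matcher.find()) {
            return fallback;
        }
        double value = Double.parseDouble(matcher.group());
        return Math.max(-1.0, Math.min(1.0, value));
    }
}
```

Clamping keeps an out-of-range reply from distorting the threshold comparison, and the neutral fallback mirrors the 0.0 that the service returns when detection fails.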

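9. Script Template Management

Script templates solve the consistency problem: every user in the same scenario gets a standard, compliant reply, which lowers legal risk.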

@Service
@Slf4j
public class ScriptTemplateService {

    @Autowired
    private ScriptTemplateMapper templateMapper;

    // 本地缓存，避免每次查库
    private final Map<String, String> templateCache = new ConcurrentHashMap<>();

    /**
     * 获取话术模板
     */
    public String getTemplate(String intentCode) {
        return templateCache.computeIfAbsent(intentCode, code -> {
            ScriptTemplate template = templateMapper.findByIntentCode(code);
            return template != null ? template.getContent() : null;
        });
    }

    /**
     * 话术模板示例（数据库中存储）
     */
    @PostConstruct
    public void initDefaultTemplates() {
        // 退款申请话术
        templateCache.put("REFUND_APPLY", """
            您好！关于您的退款申请，我来为您说明退款流程：
            
            1. 退款条件：收货后7天内（质量问题15天内）可申请退款
            2. 退款方式：原路退回，一般3-5个工作日到账
            3. 申请步骤：订单页面 → 申请售后 → 选择退款原因 → 提交
            
            ${orderId ? '您的订单号：' + orderId + '，' : ''}如果您已在规定时间内，可直接在订单页面发起申请。
            
            请问还有其他需要帮助的吗？
            """);

        // 物流查询话术
        templateCache.put("LOGISTICS_QUERY", """
            您好！为您查询物流信息：
            
            ${trackingNo ? '快递单号：' + trackingNo : '请您提供订单号或快递单号，我来为您查询'}
            
            一般情况下：
            - 付款后24小时内发货（节假日除外）
            - 全国标准快递3-5个工作日
            - 顺丰快递1-3个工作日
            
            您也可以在"订单详情"页面实时查看物流状态。
            
            请问还有其他问题吗？
            """);
    }

    /**
     * 刷新模板缓存
     */
    public void refreshCache() {
        templateCache.clear();
        log.info("Script template cache cleared");
    }
}
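
The article calls a `TemplateEngine.render` helper without showing it. For plain `${name}` placeholders, a renderer can be as small as the sketch below. `SimpleTemplate` is a hypothetical stand-in, not the engine the system uses, and it deliberately does not evaluate expressions: a conditional placeholder such as `${orderId ? ... : ''}` does not match the identifier-only pattern and is left in the output unchanged.

```java
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical minimal renderer for ${name} placeholders (no expressions). */
final class SimpleTemplate {
    private static final Pattern PLACEHOLDER =
        Pattern.compile("\\$\\{([A-Za-z_][A-Za-z0-9_]*)\\}");

    /** Replaces each ${name} with its value; unknown names render as an empty string. */
    static String render(String template, Map<String, ?> variables) {
        Matcher matcher = PLACEHOLDER.matcher(template);
        StringBuilder out = new StringBuilder();
        while (matcher.find()) {
            Object value = variables.get(matcher.group(1));
            String replacement = value == null ? "" : value.toString();
            // quoteReplacement keeps '$' and '\' in values from being read as group references.
            matcher.appendReplacement(out, Matcher.quoteReplacement(replacement));
        }
        matcher.appendTail(out);
        return out.toString();
    }
}
```

Templates that need conditionals would call for a real expression-capable engine rather than an extension of this regex.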

10. Satisfaction Collection and Data Analytics

@Service
@Slf4j
public class SatisfactionService {

    @Autowired
    private SatisfactionRecordMapper satisfactionMapper;

    @Autowired
    private ConversationAnalyticsService analyticsService;

    /**
     * 处理满意度评分
     */
    @Transactional
    public void recordSatisfaction(String sessionId, Integer score, String feedback) {
        SatisfactionRecord record = SatisfactionRecord.builder()
            .sessionId(sessionId)
            .score(score)          // 1-5分
            .feedback(feedback)
            .createdAt(LocalDateTime.now())
            .build();

        satisfactionMapper.insert(record);

        // 触发分析：低分反馈需要人工审核
        if (score <= 2) {
            analyticsService.flagForReview(sessionId, score, feedback);
            log.warn("Low satisfaction score {} for session {}", score, sessionId);
        }
    }
}

@Service
@Slf4j
public class ConversationAnalyticsService {

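    private final JdbcTemplate jdbcTemplate;
    private final ChatClient chatClient;

    // Spring AI auto-configures a ChatClient.Builder bean, not a ChatClient
    public ConversationAnalyticsService(JdbcTemplate jdbcTemplate,
                                        ChatClient.Builder chatClientBuilder) {
        this.jdbcTemplate = jdbcTemplate;
        this.chatClient = chatClientBuilder.build();
    }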

    /**
     * Metrics for the real-time operations dashboard
     */
    public DashboardMetrics getDashboardMetrics(LocalDate date) {
        String dateStr = date.toString();

        // Total number of sessions
        int totalSessions = jdbcTemplate.queryForObject(
            "SELECT COUNT(*) FROM chat_session WHERE DATE(start_time) = ?",
            Integer.class, dateStr);

        // AI self-service resolution rate (closed without human transfer / total sessions)
        int aiResolvedCount = jdbcTemplate.queryForObject(
            "SELECT COUNT(*) FROM chat_session WHERE DATE(start_time) = ? AND status = 'CLOSED' AND transferred_to_human = false",
            Integer.class, dateStr);

        // Average satisfaction score
        Double avgSatisfaction = jdbcTemplate.queryForObject(
            "SELECT AVG(score) FROM satisfaction_record WHERE DATE(created_at) = ?",
            Double.class, dateStr);

        // Sessions transferred to a human agent (numerator of the transfer rate)
        int transferCount = jdbcTemplate.queryForObject(
            "SELECT COUNT(*) FROM chat_session WHERE DATE(start_time) = ? AND transferred_to_human = true",
            Integer.class, dateStr);

        // Average first response time (milliseconds)
        Long avgFirstResponseMs = jdbcTemplate.queryForObject(
            "SELECT AVG(first_response_ms) FROM chat_session WHERE DATE(start_time) = ?",
            Long.class, dateStr);

        return DashboardMetrics.builder()
            .date(date)
            .totalSessions(totalSessions)
            .aiResolutionRate(totalSessions > 0 ? (double) aiResolvedCount / totalSessions : 0)
            .humanTransferRate(totalSessions > 0 ? (double) transferCount / totalSessions : 0)
            .avgSatisfactionScore(avgSatisfaction != null ? avgSatisfaction : 0)
            .avgFirstResponseMs(avgFirstResponseMs != null ? avgFirstResponseMs : 0)
            .build();
    }

    /**
     * Mines frequent missed questions (questions the AI could not answer)
     */
    public List<String> getMissedQuestions(LocalDate date, int limit) {
        List<String> missedQuestions = jdbcTemplate.queryForList(
            "SELECT user_message FROM chat_message " +
            "WHERE DATE(timestamp) = ? AND confidence_score < 0.4 " +
            "ORDER BY timestamp DESC LIMIT ?",
            String.class, date.toString(), limit * 3);

        if (missedQuestions.isEmpty()) return List.of();

        // Use the LLM to cluster the questions and surface the main question types
        String questionsText = missedQuestions.stream()
            .map(q -> "- " + q)
            .collect(Collectors.joining("\n"));

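        String prompt = String.format("""
            The following user questions went unanswered by the customer service bot. Group them into %d main question types
            and suggest what to add to the knowledge base for each:

            %s

            Return a JSON array in which each item has the form: {"type": "question type", "count": estimated count, "suggestion": "suggestion"}
            """, limit, questionsText);

        String response = chatClient.prompt().user(prompt).call().content();
        // Simplified: returns the raw LLM response as a single element; parse the JSON in production
        return List.of(response);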
    }

    /**
     * Flags a low-scoring session for manual review
     */
    public void flagForReview(String sessionId, Integer score, String feedback) {
        jdbcTemplate.update(
            "INSERT INTO review_queue (session_id, score, feedback, created_at) VALUES (?, ?, ?, NOW())",
            sessionId, score, feedback);
    }
}

11. REST API and WebSocket Endpoints

@RestController
@RequestMapping("/api/customer-service")
@Slf4j
public class CustomerServiceController {

    @Autowired
    private IntentRecognitionService intentService;

    @Autowired
    private AnswerGenerationService answerService;

    @Autowired
    private ConversationManagerService conversationManager;

    @Autowired
    private HumanTransferService transferService;

    @Autowired
    private SatisfactionService satisfactionService;

    /**
     * Send a message
     */
    @PostMapping("/message")
    public ResponseEntity<ChatResponse> sendMessage(@RequestBody @Validated ChatRequest request) {
        long startTime = System.currentTimeMillis();

        // 1. Get or create the session
        ChatSession session = conversationManager.getOrCreateSession(
            request.getSessionId(), request.getUserId(), request.getChannel());

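        // 2. Recognize intent
        IntentResult intent = intentService.recognize(request.getMessage(), session);
        // SLF4J only understands "{}" placeholders, so format the confidence explicitly
        log.info("Session: {}, Intent: {} (confidence: {})",
            request.getSessionId(), intent.getIntentCode(),
            String.format("%.2f", intent.getConfidence()));

        // 3. Append the user message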
        conversationManager.appendUserMessage(session, request.getMessage(), intent);

        // 4. Generate the reply
        AnswerResult answer = answerService.generateAnswer(request.getMessage(), intent, session);

        // 5. Decide whether to hand off to a human agent
        TransferDecision transfer = transferService.shouldTransferToHuman(session, request.getMessage(), intent, answer);
        if (transfer.isNeedTransfer()) {
            transferService.executeTransfer(session, transfer.getReason());
            answer = AnswerResult.builder()
                .answer("Thank you for your patience. We are transferring you to a human agent; the estimated wait is 2-3 minutes.")
                .confidence(1.0)
                .build();
        }

        // 6. Append the AI reply
        conversationManager.appendAssistantMessage(session, answer);

        // 7. Save the session
        conversationManager.saveSession(session);

        long responseTime = System.currentTimeMillis() - startTime;

        return ResponseEntity.ok(ChatResponse.builder()
            .sessionId(session.getSessionId())
            .message(answer.getAnswer())
            .intentCode(intent.getIntentCode())
            .confidence(answer.getConfidence())
            .isTransferred(transfer.isNeedTransfer())
            .responseTimeMs(responseTime)
            .build());
    }

    /**
     * Satisfaction rating
     */
    @PostMapping("/satisfaction")
    public ResponseEntity<Void> rateSatisfaction(@RequestBody @Validated SatisfactionRequest request) {
        satisfactionService.recordSatisfaction(
            request.getSessionId(),
            request.getScore(),
            request.getFeedback());
        // Close the session
        conversationManager.closeSession(request.getSessionId());
        return ResponseEntity.ok().build();
    }

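    // Dashboard metrics are computed by ConversationAnalyticsService, not the conversation manager
    @Autowired
    private ConversationAnalyticsService analyticsService;

    /**
     * Operations dashboard
     */
    @GetMapping("/dashboard")
    public ResponseEntity<DashboardMetrics> getDashboard(
            @RequestParam(defaultValue = "today") String date) {
        LocalDate queryDate = "today".equals(date) ? LocalDate.now() : LocalDate.parse(date);
        return ResponseEntity.ok(analyticsService.getDashboardMetrics(queryDate));
    }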
}

12. Performance Optimization and Production Lessons

Key production settings:

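// Cache intent recognition results (identical questions do not trigger another LLM call)
@Configuration
@EnableCaching // required for @Cacheable to take effect, unless already enabled elsewhere
public class CacheConfig {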

    @Bean
    public CacheManager cacheManager(RedisConnectionFactory factory) {
        RedisCacheConfiguration config = RedisCacheConfiguration.defaultCacheConfig()
            .entryTtl(Duration.ofMinutes(10))
            .serializeValuesWith(RedisSerializationContext.SerializationPair
                .fromSerializer(new GenericJackson2JsonRedisSerializer()));

        return RedisCacheManager.builder(factory)
            .cacheDefaults(config)
            .build();
    }
}

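// Cached intent recognition. Key on the full input text: hashCode() values can collide,
// which would return another question's intent. Call this through the Spring proxy
// (from another bean), otherwise @Cacheable is bypassed.
@Cacheable(value = "intent", key = "#userInput")
public IntentResult recognizeWithCache(String userInput) {
    return recognizeByLLM(userInput, null);
}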

// Streaming output (reduces the latency users perceive)
@GetMapping(value = "/stream", produces = MediaType.TEXT_EVENT_STREAM_VALUE)
public Flux<String> streamMessage(@RequestParam String sessionId,
                                   @RequestParam String message) {
    // Use Spring AI's streaming API
    return chatClient.prompt()
        .user(message)
        .stream()
        .content();
}
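
The cache key decides both correctness and hit rate. `String.hashCode()` is only 32 bits, so different questions can collide (for example, "Aa" and "BB" share a hash), while trivially different spellings of the same question miss the cache. The sketch below shows one possible way to build a safer key, assuming inputs may be trimmed, lowercased, and whitespace-collapsed before hashing. `IntentCacheKeys` and its methods are hypothetical names, not part of the system described above.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.HexFormat;
import java.util.Locale;

/** Hypothetical helper: builds a stable, collision-resistant cache key for user input. */
final class IntentCacheKeys {

    private IntentCacheKeys() {}

    /** Trims, lowercases, and collapses whitespace so near-identical inputs share a key. */
    static String normalize(String userInput) {
        return userInput.strip().toLowerCase(Locale.ROOT).replaceAll("\\s+", " ");
    }

    /** Returns the SHA-256 hex digest of the normalized input. */
    static String keyFor(String userInput) {
        try {
            MessageDigest digest = MessageDigest.getInstance("SHA-256");
            byte[] hash = digest.digest(normalize(userInput).getBytes(StandardCharsets.UTF_8));
            return HexFormat.of().formatHex(hash);
        } catch (NoSuchAlgorithmException e) {
            // SHA-256 is required on every Java platform, so this should not happen.
            throw new IllegalStateException(e);
        }
    }
}
```

If the helper lived in a package such as `com.example.cache` (an assumed name), the SpEL key could reference it as `T(com.example.cache.IntentCacheKeys).keyFor(#userInput)`. Whether lowercasing is acceptable depends on the business: product codes that are case-sensitive would need a gentler normalization.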

13. Operations Metrics Dashboard

Three months after launch, the system's operating metrics were as follows:

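Metric	Before launch	After launch	Change
Daily conversations	11,000	11,500	+4.5%
AI self-service resolution rate	0%	74.3%	New
Monthly labor cost	520,000 yuan	86,000 yuan	-83.5%
Average first response time	3 min 20 s	12 s	-94%
Customer satisfaction (5-point scale)	3.8	4.6	+21%
Peak concurrent capacity	20 sessions per agent	2,000 sessions system-wide	100x
P99 response latency	-	2.3 s	-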
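
The Change column is plain relative-change arithmetic, (after - before) / before. As a quick sanity check, a small helper along these lines reproduces the figures. `MetricChange` is a hypothetical name introduced only for illustration.

```java
/** Hypothetical helper for reproducing the "Change" column of a before/after table. */
final class MetricChange {

    private MetricChange() {}

    /** Relative change from before to after, in percent, rounded to one decimal place. */
    static double percentChange(double before, double after) {
        if (before == 0) {
            throw new IllegalArgumentException("before must be non-zero");
        }
        double pct = (after - before) / before * 100.0;
        return Math.round(pct * 10.0) / 10.0;
    }
}
```

For example, `percentChange(52, 8.6)` gives -83.5 for monthly labor cost (in units of 10,000 yuan), and `percentChange(200, 12)` gives -94.0 for first response time in seconds. The satisfaction change comes out as 21.1, which the table rounds to +21%.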

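Top 5 most frequently missed questions (first month):

Points redemption rules (14% of misses) → resolution rate rose to 91% after the knowledge base was supplemented
Installation service for large items → dedicated knowledge base created
Customs duties on cross-border goods → handed off to human agents
Invoicing process for corporate purchases → dedicated scripts added
Pre-sale rules → regular update mechanism established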

FAQ

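Q1: What if intent recognition accuracy is not high enough?

A: Handle it in layers. Let the rule engine cover the 90% of scenarios that are deterministic (accuracy close to 100%), and send only the remaining ambiguous inputs to the LLM. Meanwhile, collect misclassified samples and periodically refine the prompt with few-shot examples.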
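
The layered approach can be sketched as a two-tier router: keyword rules answer first, and a classifier function stands in for the LLM call on everything else. This is a minimal illustration under stated assumptions, not the production implementation. `TieredIntentRouter`, the keyword patterns, and the `"OTHER"` fallback code used in the usage note are all hypothetical; only the `REFUND_APPLY` and `LOGISTICS_QUERY` codes come from the script templates earlier in the article.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.Function;
import java.util.regex.Pattern;

/** Hypothetical two-tier router: deterministic rules first, LLM classifier only as fallback. */
final class TieredIntentRouter {

    /** Result of routing: the intent code and which tier produced it. */
    record Routed(String intentCode, String source) {}

    // Illustrative keyword rules; insertion order defines priority.
    private final Map<String, Pattern> rules = new LinkedHashMap<>();
    private final Function<String, String> llmClassifier;

    TieredIntentRouter(Function<String, String> llmClassifier) {
        this.llmClassifier = llmClassifier;
        rules.put("REFUND_APPLY", Pattern.compile("退款|退货|refund", Pattern.CASE_INSENSITIVE));
        rules.put("LOGISTICS_QUERY", Pattern.compile("物流|快递|tracking", Pattern.CASE_INSENSITIVE));
    }

    Routed route(String userInput) {
        for (Map.Entry<String, Pattern> rule : rules.entrySet()) {
            if (rule.getValue().matcher(userInput).find()) {
                return new Routed(rule.getKey(), "RULE"); // no LLM call needed
            }
        }
        // Ambiguous input: delegate to the slower, costlier LLM classifier.
        return new Routed(llmClassifier.apply(userInput), "LLM");
    }
}
```

A caller would pass the real LLM-backed classifier as the function, for instance `new TieredIntentRouter(text -> "OTHER")` in a test. Recording the `source` of each decision makes it easy to measure how much traffic the rules actually absorb.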

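Q2: How do I choose a vector database?

A: For small to medium scale (fewer than 1 million documents), standalone Milvus or Qdrant works out of the box and is easy to operate. For large scale, use a Milvus cluster. If you would rather not self-host, Alibaba Cloud and Tencent Cloud both offer managed services.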

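Q3: How do I keep the AI from making things up (hallucination)?

A: Use three layers of defense: (1) RAG retrieval supplies the factual basis; (2) the prompt explicitly instructs the model to "say so honestly if the reference material does not cover it"; (3) answers whose confidence falls below the threshold are handed to a human agent instead of being forced out of the AI.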
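
The three layers can be sketched in a few lines: a prompt builder that embeds the retrieved passages together with the "do not go beyond them" instruction, and a gate that hands off when evidence or confidence is missing. This is an illustrative sketch, not the system's actual code. `GroundedAnswerGuard` is a hypothetical name, the prompt wording is an assumption, and the threshold is a parameter you would tune (0.4 is the cutoff the analytics query uses for missed questions).

```java
import java.util.List;

/** Hypothetical guard combining a grounded prompt with a confidence gate. */
final class GroundedAnswerGuard {

    enum Action { ANSWER, TRANSFER_TO_HUMAN }

    private final double minConfidence;

    GroundedAnswerGuard(double minConfidence) {
        this.minConfidence = minConfidence;
    }

    /** Layers 1 and 2: embed retrieved passages and tell the model not to go beyond them. */
    String buildPrompt(String question, List<String> retrievedPassages) {
        StringBuilder sb = new StringBuilder();
        sb.append("Answer using ONLY the reference material below. ")
          .append("If the material does not contain the answer, say so plainly.\n\n")
          .append("Reference material:\n");
        for (String passage : retrievedPassages) {
            sb.append("- ").append(passage).append('\n');
        }
        sb.append("\nQuestion: ").append(question);
        return sb.toString();
    }

    /** Layer 3: no retrieved evidence or low confidence means a human takes over. */
    Action decide(List<String> retrievedPassages, double confidence) {
        if (retrievedPassages.isEmpty() || confidence < minConfidence) {
            return Action.TRANSFER_TO_HUMAN;
        }
        return Action.ANSWER;
    }
}
```

Treating an empty retrieval result as an automatic handoff is a deliberate choice in this sketch: without reference material, the prompt instruction has nothing to anchor the model to.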

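Q4: How do I isolate knowledge bases across tenants (multiple merchants)?

A: Milvus supports isolation by Collection or by Partition. I recommend one Partition per tenant inside a shared Collection, which lowers resource usage; the trade-off is that every search must include a tenant ID filter.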
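
Because every search has to carry the tenant filter, it helps to build that filter in one place and validate the tenant ID before it is spliced into an expression. The sketch below assumes each document was stored with a metadata field named `tenantId` (an assumption, as is the helper name `TenantFilters`) and produces a string in the `field == 'value'` form used by Spring AI's textual filter expressions.

```java
import java.util.regex.Pattern;

/** Hypothetical helper that builds a per-tenant metadata filter for vector search. */
final class TenantFilters {

    // Whitelist keeps quotes and operators out of the expression (prevents filter injection).
    private static final Pattern SAFE_ID = Pattern.compile("[A-Za-z0-9_-]{1,64}");

    private TenantFilters() {}

    /** Returns an expression such as: tenantId == 'shop_42' */
    static String forTenant(String tenantId) {
        if (tenantId == null || !SAFE_ID.matcher(tenantId).matches()) {
            throw new IllegalArgumentException("Invalid tenant id: " + tenantId);
        }
        return "tenantId == '" + tenantId + "'";
    }
}
```

In a Spring AI setup the result would typically be passed to the search request's filter expression. Check the exact builder method against the Spring AI version in use, since that part is not shown here.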

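Q5: How do I evaluate the system?

A: Track three core metrics: resolution rate (the share of conversations the AI resolves on its own), human transfer rate (lower is better, but never at the cost of forcing an answer), and satisfaction (users' ratings of AI replies). Read the three together; looking at resolution rate alone can mislead decisions.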
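
To make "read the three together" concrete, here is a small sketch that computes all three from a list of closed sessions in one pass. The types are hypothetical stand-ins for the article's `chat_session` and `satisfaction_record` rows, and the sketch assumes every input session is already closed, so AI-resolved is simply the complement of transferred.

```java
import java.util.List;
import java.util.OptionalDouble;

/** Hypothetical daily summary of the three core metrics, computed from closed sessions. */
final class ServiceQualityReport {

    /** Minimal session view; score is null when the user left no rating. */
    record SessionOutcome(boolean transferredToHuman, Integer score) {}

    record Summary(double aiResolutionRate, double humanTransferRate, double avgSatisfaction) {}

    private ServiceQualityReport() {}

    static Summary summarize(List<SessionOutcome> sessions) {
        if (sessions.isEmpty()) {
            return new Summary(0, 0, 0);
        }
        long transferred = sessions.stream().filter(SessionOutcome::transferredToHuman).count();
        double transferRate = (double) transferred / sessions.size();
        // Average only over sessions that actually carry a rating.
        OptionalDouble avgScore = sessions.stream()
            .filter(s -> s.score() != null)
            .mapToInt(SessionOutcome::score)
            .average();
        // With only closed sessions as input, resolved-by-AI is the complement of transferred.
        return new Summary(1.0 - transferRate, transferRate, avgScore.orElse(0));
    }
}
```

A high resolution rate paired with a falling average score is the pattern to watch for: it suggests the bot is answering questions it should have handed off.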

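Conclusion

From 520,000 yuan a month to 86,000 yuan a month: that is not magic, it is engineering.

The core of an intelligent customer service system is not which LLM you pick but how the workflow is designed: whether intent recognition is accurate, whether the knowledge base is complete, and whether handoffs happen at the right moment. The LLM only covers the last mile.

The biggest pitfall I hit in this field was trusting the LLM too much and neglecting the rule engine. Early versions routed every intent through the LLM, which meant high latency, high cost, and frequent mistakes. Once a rule engine was placed in front, both stability and response speed improved markedly.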