Article 2242: E-commerce AI in Production, from Product Understanding to Intelligent Customer Service

Lao Zhang · 2026/4/30 · About 7 minutes

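Intended audience: e-commerce engineers, Java backend developers, AI product managers | Reading time: about 17 minutes | Key takeaway: a systematic walkthrough of end-to-end e-commerce AI engineering practice, from product information structuring to intelligent customer service, covering technology choices and rollout challenges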

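I worked at a mid-sized e-commerce platform for two years. The biggest headache during that time was not any hard technical problem but how messy the product data was.

The platform had several million SKUs from tens of thousands of merchants. Each merchant filled in product information on their own, so the same phone case might be listed as "苹果14 Pro Max手机壳透明硅胶" (Apple 14 Pro Max phone case, transparent silicone), as "适用iPhone 14ProMax保护套全包防摔" (protective cover for iPhone 14ProMax, full-wrap, drop-resistant), or simply as a string of model numbers. The color field might contain "天空蓝" (sky blue), "浅蓝色" (light blue), or "#87CEEB".

Data this messy seriously hurt search, recommendation, price comparison, and marketing. A user searching for "苹果14手机壳" (Apple 14 phone case) got incomplete results, because many product titles simply contained no standardized keywords.

Product understanding is the infrastructure of e-commerce AI. Get it right, and the quality of every application built on top of it improves.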

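Product Understanding: Multi-Task Information Extraction

The core of product understanding is turning unstructured product information (title, description, images) into structured attributes: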

@Service
public class ProductUnderstandingService {

    @Autowired
    private LLMClient llmClient;
    
    @Autowired
    private ProductCategoryMapper categoryMapper;
    
    @Autowired
    private AttributeNormalizerService normalizerService;

    /**
     * Structured extraction of product information.
     * Input: raw product information (title + description + image URL)
     * Output: a set of normalized attributes
     */
    public ProductStructuredInfo extract(RawProductInfo rawInfo) {
        // Build the prompt that asks the LLM to perform information extraction
        String prompt = buildExtractionPrompt(rawInfo);

        LLMResponse response = llmClient.complete(
            SystemPrompt.PRODUCT_EXTRACTION,
            prompt,
            LLMConfig.builder()
                .model("deepseek-v3")
                .temperature(0.1)  // low temperature for more deterministic, consistent output
                .responseFormat(ResponseFormat.JSON)
                .build()
        );

        // Parse the JSON response
        ProductRawAttributes rawAttrs = parseExtractionResult(response.getContent());

        // Attribute normalization (unify color, size, material, and so on)
        ProductStructuredInfo structured = normalizerService.normalize(rawAttrs);

        // Category classification (multi-level taxonomy)
        CategoryPrediction category = categoryMapper.predict(rawInfo.getTitle(),
            rawInfo.getDescription());
        structured.setCategory(category);

        return structured;
    }

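    private String buildExtractionPrompt(RawProductInfo rawInfo) {
        return String.format("""
            Extract structured attributes from the product information below and output them as JSON:

            Product title: %s
            Product description: %s

            Attributes to extract: brand, model, color, size, material, usage scenario, key selling points (at most 3)

            Notes:
            1. Attribute values must be normalized; always use standard Chinese color names for colors
            2. Sizes must use a number + unit format (e.g. "15.6英寸")
            3. Set any attribute that cannot be determined to null; do not guess

            Output format:
            {
              "brand": "brand name",
              "model": "model",
              "color": "color",
              "size": "size",
              "material": "material",
              "keyFeatures": ["selling point 1", "selling point 2"]
            }
            """,
            rawInfo.getTitle(),
            rawInfo.getDescription() != null ? rawInfo.getDescription() : "No description"
        );
    }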
}

// Attribute normalization: unify the different ways merchants express the same value
@Service
public class AttributeNormalizerService {

    // Marketing color name -> standard color name.
    // Keys and values stay in Chinese because they must match what merchants actually enter.
    private final Map<String, String> colorNormMap = Map.ofEntries(
        Map.entry("天空蓝", "蓝色"), Map.entry("湖水蓝", "蓝色"), Map.entry("午夜蓝", "深蓝色"),
        Map.entry("玫瑰金", "金色"), Map.entry("香槟金", "金色"), Map.entry("星光银", "银色"),
        Map.entry("曜石黑", "黑色"), Map.entry("幽灵白", "白色")
        // ... more mappings
    );

    public ProductStructuredInfo normalize(ProductRawAttributes raw) {
        ProductStructuredInfo info = new ProductStructuredInfo();

        // Color normalization: fall back to the trimmed raw value when no mapping exists
        if (raw.getColor() != null) {
            String normalizedColor = colorNormMap.getOrDefault(
                raw.getColor().trim(), raw.getColor().trim());
            info.setColor(normalizedColor);
        }

        // Size normalization: unify the format
        if (raw.getSize() != null) {
            info.setSize(normalizeSizeExpression(raw.getSize()));
        }

        // Brand normalization: unify casing and full name vs. abbreviation
        if (raw.getBrand() != null) {
            info.setBrand(normalizeBrand(raw.getBrand()));
        }

        return info;
    }
}
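
The color map above only covers named colors, while the introduction also mentions merchants entering raw hex values such as "#87CEEB". One possible way to handle those is to bucket the hex value by hue before the dictionary lookup. The sketch below is an illustration only: `HexColorNormalizer`, its thresholds, and the set of output names are assumptions, not part of the original service.

```java
/** Hypothetical helper: maps "#RRGGBB" values onto coarse standard color names. */
final class HexColorNormalizer {

    /** Returns a standard color name for hex input; any other input is returned trimmed. */
    static String normalize(String raw) {
        if (raw == null) {
            return null;
        }
        String value = raw.trim();
        if (!value.matches("#[0-9a-fA-F]{6}")) {
            return value;  // not a hex color: leave it for the dictionary lookup
        }
        int r = Integer.parseInt(value.substring(1, 3), 16);
        int g = Integer.parseInt(value.substring(3, 5), 16);
        int b = Integer.parseInt(value.substring(5, 7), 16);

        int max = Math.max(r, Math.max(g, b));
        int min = Math.min(r, Math.min(g, b));
        double brightness = max / 255.0;
        double saturation = max == 0 ? 0.0 : (max - min) / (double) max;

        // Illustrative thresholds, not tuned on real data
        if (brightness < 0.2) return "黑色";
        if (saturation < 0.15) return brightness > 0.85 ? "白色" : "灰色";

        double hue = hueDegrees(r, g, b, max, min);
        if (hue < 15 || hue >= 345) return "红色";
        if (hue < 45) return "橙色";
        if (hue < 70) return "黄色";
        if (hue < 170) return "绿色";
        if (hue < 260) return "蓝色";
        return "紫色";
    }

    /** Standard RGB-to-hue conversion in degrees [0, 360). Requires max > min. */
    private static double hueDegrees(int r, int g, int b, int max, int min) {
        double delta = max - min;
        double h;
        if (max == r) h = (g - b) / delta;
        else if (max == g) h = (b - r) / delta + 2;
        else h = (r - g) / delta + 4;
        h *= 60;
        return h < 0 ? h + 360 : h;
    }
}
```

With these assumed buckets, `HexColorNormalizer.normalize("#87CEEB")` falls into the blue range (hue of roughly 197 degrees) and yields "蓝色", the same standard name the map assigns to "天空蓝". Hue bucketing is coarse; a production system would more likely match against the platform's own color dictionary.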

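Search Optimization: Vector Retrieval Built on Product Understanding

With structured product data in place, search quality improves substantially, because queries can be matched against normalized attributes rather than free-form merchant titles: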

@Service
public class ProductSearchService {

    @Autowired
    private EmbeddingService embeddingService;
    
    @Autowired
    private ElasticsearchClient esClient;
    
    @Autowired
    private MilvusClient milvusClient;

    /**
     * Hybrid retrieval: keyword search + semantic vector search
     */
    public SearchResult search(String query, SearchContext context) {
        // 1. Query understanding: intent detection, keyword extraction, synonym expansion
        QueryUnderstanding queryInfo = analyzeQuery(query);

        // 2. Run keyword retrieval and vector retrieval in parallel
        CompletableFuture<List<String>> keywordFuture =
            CompletableFuture.supplyAsync(() ->
                keywordSearch(queryInfo, context, 500));

        CompletableFuture<List<String>> vectorFuture =
            CompletableFuture.supplyAsync(() ->
                vectorSearch(query, context, 300));

        CompletableFuture.allOf(keywordFuture, vectorFuture).join();

        // 3. Fusion: RRF (Reciprocal Rank Fusion)
        List<String> keywordResults = keywordFuture.join();
        List<String> vectorResults = vectorFuture.join();
        List<String> mergedIds = reciprocalRankFusion(keywordResults, vectorResults);

        // 4. Fine ranking (Learning to Rank)
        return rerank(mergedIds, query, context);
    }

    private List<String> vectorSearch(String query, SearchContext context, int topK) {
        float[] queryEmbedding = embeddingService.encode(query);
        
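        SearchParam param = SearchParam.newBuilder()
            .withCollectionName("product_embeddings")
            .withVectorFieldName("embedding")  // assumed field name; use the collection's actual vector field
            .withVectors(List.of(toList(queryEmbedding)))
            .withTopK(topK)
            .withExpr(buildFilterExpr(context))  // category filter
            .build();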
        
        R<SearchResults> result = milvusClient.search(param);
        return parseSearchResults(result.getData());
    }

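    /**
     * Reciprocal Rank Fusion (RRF).
     * Each document scores 1 / (k + rank) in every list it appears in (rank is 1-based),
     * and the per-list scores are summed into a combined score.
     */
    private List<String> reciprocalRankFusion(List<String> list1, List<String> list2) {
        Map<String, Double> scores = new HashMap<>();
        int k = 60;  // RRF constant

        // i is 0-based, so the 1-based rank is i + 1
        for (int i = 0; i < list1.size(); i++) {
            scores.merge(list1.get(i), 1.0 / (k + i + 1), Double::sum);
        }
        for (int i = 0; i < list2.size(); i++) {
            scores.merge(list2.get(i), 1.0 / (k + i + 1), Double::sum);
        }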
        
        return scores.entrySet().stream()
            .sorted(Map.Entry.<String, Double>comparingByValue().reversed())
            .map(Map.Entry::getKey)
            .collect(Collectors.toList());
    }
}
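
To make the fusion step concrete, here is a small standalone sketch of Reciprocal Rank Fusion on toy data. It uses the common formulation score = 1 / (k + rank) with a 1-based rank; the class name `RrfDemo`, the document IDs, and the tie-breaking by ID are illustrative assumptions rather than part of the service above.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

final class RrfDemo {

    /** Sums 1 / (k + rank) over every ranking a document appears in (rank is 1-based). */
    static Map<String, Double> scores(List<List<String>> rankings, int k) {
        Map<String, Double> scores = new HashMap<>();
        for (List<String> ranking : rankings) {
            for (int i = 0; i < ranking.size(); i++) {
                scores.merge(ranking.get(i), 1.0 / (k + i + 1), Double::sum);
            }
        }
        return scores;
    }

    /** Orders documents by fused score, highest first; ties are broken by ID for determinism. */
    static List<String> fuse(List<List<String>> rankings, int k) {
        return scores(rankings, k).entrySet().stream()
            .sorted(Map.Entry.<String, Double>comparingByValue().reversed()
                .thenComparing(Map.Entry.<String, Double>comparingByKey()))
            .map(Map.Entry::getKey)
            .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<String> keyword = List.of("A", "B", "C");
        List<String> vector = List.of("B", "D", "A");
        // B: 1/62 + 1/61, A: 1/61 + 1/63, D: 1/62, C: 1/63
        System.out.println(fuse(List.of(keyword, vector), 60));  // prints [B, A, D, C]
    }
}
```

Documents that appear in both lists (A and B) outrank those found by only one retriever, which is the property that makes RRF a reasonable default when keyword and vector scores are not directly comparable.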

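Intelligent Customer Service: From Intent Recognition to Multi-Turn Dialogue

The core challenge for an e-commerce customer service bot is to identify the user's intent quickly and accurately and to give an answer that actually helps, rather than a one-size-fits-all "Hello, I have understood your issue":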

@Service
public class SmartCustomerServiceBot {

    @Autowired
    private IntentClassifier intentClassifier;
    
    @Autowired
    private OrderQueryService orderQueryService;
    
    @Autowired
    private LLMClient llmClient;
    
    @Autowired
    private ConversationMemoryService memoryService;

    public CustomerServiceResponse respond(CustomerServiceRequest request) {
        String userId = request.getUserId();
        String message = request.getMessage();
        String sessionId = request.getSessionId();
        
        // 1. Intent recognition
        IntentResult intent = intentClassifier.classify(message);

        // 2. Route to the handler that matches the intent
        return switch (intent.getTopIntent()) {
            case ORDER_STATUS -> handleOrderStatus(userId, message, sessionId);
            case RETURN_REFUND -> handleReturnRefund(userId, message, sessionId);
            case PRODUCT_INQUIRY -> handleProductInquiry(message, sessionId);
            case LOGISTICS_QUERY -> handleLogisticsQuery(userId, message, sessionId);
            case COMPLAINT -> handleComplaint(userId, message, sessionId);
            default -> handleWithLLM(userId, message, sessionId);
        };
    }

    private CustomerServiceResponse handleOrderStatus(String userId, String message,
                                                        String sessionId) {
        // Extract the order ID from the message, if there is one
        String orderId = extractOrderId(message);

        if (orderId == null) {
            // No order ID given: look up the most recent orders
            List<Order> recentOrders = orderQueryService.getRecentOrders(userId, 3);
            if (recentOrders.isEmpty()) {
                return CustomerServiceResponse.text("You have no orders on record yet. If you have any questions, please contact customer service.");
            }

            if (recentOrders.size() == 1) {
                orderId = recentOrders.get(0).getOrderId();
            } else {
                // Several orders: let the user pick one
                return buildOrderSelectionCard(recentOrders);
            }
        }

        // Also rejects orders that belong to a different user
        Order order = orderQueryService.getOrder(orderId);
        if (order == null || !order.getUserId().equals(userId)) {
            return CustomerServiceResponse.text("Order not found. Please check that the order number is correct.");
        }

        // Build the order status reply
        String statusDescription = buildOrderStatusMessage(order);

        // If there is an exception (shipment overdue, logistics problem, etc.), escalate automatically
        if (order.hasException()) {
            autoEscalate(order, sessionId);
            statusDescription += "\n\nWe have noticed a problem with your order and are following up on it for you.";
        }

        return CustomerServiceResponse.text(statusDescription);
    }

    /**
     * Use the LLM for questions that cannot be handled by rules,
     * combining the user's order history with the product knowledge base.
     */
    private CustomerServiceResponse handleWithLLM(String userId, String message,
                                                    String sessionId) {
        // Load the conversation history (last 5 messages)
        List<ConversationMessage> history = memoryService.getHistory(sessionId, 5);

        // Load basic user information (order count, membership level, etc.)
        UserContext userContext = buildUserContext(userId);

        // Build the system prompt
        String systemPrompt = buildCustomerServiceSystemPrompt(userContext);

        // Build the message list
        List<ChatMessage> messages = new ArrayList<>();
        messages.add(new ChatMessage("system", systemPrompt));

        // Append the conversation history
        for (ConversationMessage hist : history) {
            messages.add(new ChatMessage(hist.getRole(), hist.getContent()));
        }
        messages.add(new ChatMessage("user", message));

        // Call the LLM
        String response = llmClient.chat(messages,
            LLMConfig.builder().model("gpt-4o-mini").maxTokens(500).build());

        // Save both turns to the conversation history
        memoryService.append(sessionId, "user", message);
        memoryService.append(sessionId, "assistant", response);

        return CustomerServiceResponse.text(response);
    }

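    private String buildCustomerServiceSystemPrompt(UserContext context) {
        return String.format("""
            You are an intelligent customer service assistant for an e-commerce platform. Your traits:
            1. Friendly but concise, with no filler
            2. When answering a specific question, give exact information rather than vague replies
            3. For issues you cannot handle, proactively suggest a transfer to a human agent

            User information:
            - Membership level: %s
            - Number of past orders: %d
            - Recent complaint on record: %s

            Note: do not casually promise refunds, compensation, and the like. Any decision involving money requires human review.
            """,
            context.getMemberLevel(),
            context.getOrderCount(),
            context.hasRecentComplaint() ? "yes" : "no"
        );
    }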
}
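
`handleOrderStatus` relies on an `extractOrderId` helper that is not shown. A minimal sketch of one way to implement it with a regular expression follows. The order ID format (12 to 20 consecutive digits) is purely an assumption for illustration; the real format depends on the platform, and `OrderIdExtractor` is a hypothetical name.

```java
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class OrderIdExtractor {

    // Assumed format: 12 to 20 consecutive digits. The lower bound of 12 keeps
    // 11-digit mobile numbers from being mistaken for order IDs, and the
    // lookarounds reject fragments of longer digit runs.
    private static final Pattern ORDER_ID = Pattern.compile("(?<!\\d)\\d{12,20}(?!\\d)");

    /** Returns the first order-ID-like token in the message, if any. */
    static Optional<String> extract(String message) {
        if (message == null) {
            return Optional.empty();
        }
        Matcher matcher = ORDER_ID.matcher(message);
        return matcher.find() ? Optional.of(matcher.group()) : Optional.empty();
    }
}
```

To match the null-returning contract used in `handleOrderStatus`, a caller could write `OrderIdExtractor.extract(message).orElse(null)`. Whatever the extraction rule, the ownership check against `userId` in the handler remains necessary, since a user can type any number.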

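Intent Recognition: The Trade-off Between FastText and LLMs

Intent recognition is the entry point of the customer service system. It has to be accurate, fast (<50ms), and cheap.

For common intents (20 to 30 classes), FastText or a fine-tuned BERT is enough. It is about 10x faster than calling an LLM and costs far less: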

@Service
public class IntentClassifier {

    private final TextClassifier fastTextModel;
    private final double llmFallbackThreshold = 0.6;

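    // The model path is injected from configuration; the property key below is an assumed name
    public IntentClassifier(@Value("${intent.fasttext.model-path}") String modelPath) throws IOException {
        this.fastTextModel = FastText.loadModel(modelPath);
    }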

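    public IntentResult classify(String text) {
        // Preprocessing
        String cleaned = preprocessText(text);

        // FastText classification (top 3 labels)
        FastTextPrediction prediction = fastTextModel.predict(cleaned, 3);

        IntentResult result = new IntentResult();
        result.setTopIntent(parseIntent(prediction.getLabels().get(0)));
        result.setConfidence(prediction.getProbabilities().get(0).floatValue());

        // When confidence is low, flag the result as needing LLM assistance
        if (result.getConfidence() < llmFallbackThreshold) {
            result.setNeedsLLMFallback(true);
        }

        return result;
    }

    private String preprocessText(String text) {
        // Normalize em and en dashes to a hyphen, then strip emoji and pictographic symbols
        text = text.replaceAll("[\\u2014\\u2013]", "-");
        // Explicit code-point ranges: Java regex matches by code point, so character classes
        // built from surrogate halves are error-prone. These ranges cover the main emoji
        // blocks, not every possible symbol.
        text = text.replaceAll("[\\x{1F000}-\\x{1FAFF}\\x{2600}-\\x{27BF}\\x{FE0F}]+", "");
        return text.trim();
    }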
}
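
The classifier above only flags low-confidence predictions; how the flag is consumed is not shown. One possible arrangement is a two-tier resolver that trusts the fast model above a threshold and asks an LLM to classify otherwise. The sketch below is an assumption about how that could look: `IntentFallback`, `Prediction`, and the functional parameters are hypothetical stand-ins, not the article's actual classes.

```java
import java.util.function.Function;

final class IntentFallback {

    /** Minimal stand-in for a fast-model prediction: a label plus its confidence. */
    static final class Prediction {
        final String intent;
        final double confidence;

        Prediction(String intent, double confidence) {
            this.intent = intent;
            this.confidence = confidence;
        }
    }

    /**
     * Returns the fast model's intent when it is confident enough;
     * otherwise delegates classification to the (slower, costlier) LLM.
     */
    static String resolve(String text,
                          Function<String, Prediction> fastModel,
                          Function<String, String> llmClassifier,
                          double threshold) {
        Prediction prediction = fastModel.apply(text);
        if (prediction.confidence >= threshold) {
            return prediction.intent;
        }
        return llmClassifier.apply(text);
    }
}
```

With a threshold of 0.6, as in `llmFallbackThreshold`, only messages the fast model is unsure about pay the LLM latency and cost, so the average latency stays close to that of the fast path as long as such messages are a minority.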

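Engineering Lessons: Intelligent Customer Service from an Operations Perspective

There is a trap in building intelligent customer service systems: engineers easily become obsessed with raising intent recognition accuracy, say from 97% to 98%, even though the effect on actual business value is negligible.

What really shapes the user experience is usually the following:

Handoff to a human agent: when the system knows it cannot handle a request, can it transfer to a human quickly and smoothly, rather than going around in circles for three or four turns first?
Response latency: if replies are too slow, users give up before they get an answer.
Self-service rate: how many issues can be resolved automatically without a human (return requests, logistics tracking, invoice issuance).

The self-service rate, not intent recognition accuracy, is the core KPI. After our system went live, the self-service rate rose from 32% to 58% and the human agents' workload dropped by nearly half. That is the result that actually matters.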
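
For tracking this KPI, the metric has to be defined precisely. The sketch below assumes one plausible definition: sessions resolved without a human agent divided by all sessions. The `Resolution` categories and the class name are illustrative assumptions; the article does not state how its figures were computed.

```java
import java.util.List;

final class SelfServiceRate {

    /** Assumed outcome categories for a customer service session. */
    enum Resolution { SELF_SERVICE, HUMAN_AGENT, ABANDONED }

    /** Fraction of sessions resolved without a human agent; 0.0 when there are no sessions. */
    static double rate(List<Resolution> sessions) {
        if (sessions.isEmpty()) {
            return 0.0;
        }
        long selfServed = sessions.stream()
            .filter(r -> r == Resolution.SELF_SERVICE)
            .count();
        return (double) selfServed / sessions.size();
    }
}
```

Under this definition, moving from 32% to 58% means the share of sessions not resolved automatically falls from 68% to 42%. Whether abandoned sessions belong in the denominator is a choice worth making explicit, since it changes the number noticeably.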