Building a Smart Search Engine: Replacing Traditional Keyword Search with Semantic Search
1. A True Story: The E-commerce Platform That "Couldn't Find a Red Dress"
Just before the 2025 Double 11 shopping festival, the tech team at a leading e-commerce company held an emergency post-mortem.
In the meeting room, product manager Lin put the metrics dashboard on the big screen: 34% of users who searched for "红色裙子" (red skirt) bounced straight from the results page. Worse, the warehouse held 8,000 items titled "玫瑰色连衣裙" (rose dress), "酒红旗袍" (wine-red qipao), and "红酒色吊带裙" (burgundy slip dress), yet these products received almost zero organic search exposure.
Chen Lei, the Java engineer who owned search, pulled up the Elasticsearch logs and saw the problem immediately:

```
Query: "红色裙子"
Match: {"match": {"title": "红色裙子"}}
Result count: 127 items
Miss:  "玫瑰色连衣裙" ✗   "酒红旗袍" ✗   "樱桃红吊带裙" ✗
```

Traditional keyword search demands exact lexical matches. If the user says "red", the system looks for "red"; if the user says "skirt", it looks for "skirt". "玫瑰色" (rose-colored) and "红色" (red) share no characters at all, even though any human knows they belong to the same color family.
Chen Lei spent three weeks rebuilding the search system with Spring AI + PGVector. In the first full month after launch:
- Search conversion rate: 6.2% → 8.7%, a 40% lift
- Zero-result rate: 18% → 4%, down 78%
- Search satisfaction (NPS): 31 → 58
This article is the technical post-mortem of that migration.
2. The Fundamental Limit of Keyword Search: The Vocabulary Gap
2.1 What Is the Vocabulary Gap
The "vocabulary gap" is a classic problem in information retrieval: the words users choose to describe what they want do not match the words actually used in the documents.

```
User query                    Document vocabulary
────────────────────────────────────────────────
红色裙子 (red skirt)       ≠  玫瑰色连衣裙 (rose dress)
手机 (phone)              ≠  智能手机、移动电话
笔记本电脑 (laptop)        ≠  laptop、便携电脑
跑步鞋 (running shoes)     ≠  运动鞋、慢跑鞋
```

2.2 Traditional Remedies and Their Limits
Remedy 1: synonym dictionaries

```
红色 → 朱红、玫瑰色、酒红、樱桃红、砖红...
```

Problem: dictionaries are extremely expensive to maintain, require manual curation by operations staff, and never cover new terms automatically.
Remedy 2: TF-IDF + BM25
BM25, Elasticsearch's default relevance algorithm, is at heart term-frequency statistics:

```
BM25 Score(q, d) = Σ IDF(qi) × [f(qi,d) × (k1+1)] / [f(qi,d) + k1 × (1 - b + b × |d|/avgdl)]
```

However refined the formula, it cannot fix the root problem: the words are simply different.
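To make the formula concrete, here is a minimal single-term scoring sketch. The corpus statistics (document count, document frequency, lengths) are hypothetical, not taken from any real index; the point is that a term absent from a document contributes exactly zero, which is the vocabulary gap in miniature:

```java
// Single-term BM25 scoring, matching the formula above.
// All corpus statistics here are hypothetical.
public class Bm25Sketch {

    static final double K1 = 1.2;
    static final double B = 0.75;

    // Lucene-style IDF: ln(1 + (N - df + 0.5) / (df + 0.5))
    static double idf(long docCount, long docFreq) {
        return Math.log(1 + (docCount - docFreq + 0.5) / (docFreq + 0.5));
    }

    // f = term frequency in the doc, dl = doc length, avgdl = average doc length
    static double termScore(double f, double dl, double avgdl, double idf) {
        return idf * (f * (K1 + 1)) / (f + K1 * (1 - B + B * dl / avgdl));
    }

    public static void main(String[] args) {
        double rareIdf = idf(1_000_000, 1_000); // a rare term gets a large IDF
        // "红色" appears 3 times in the document -> positive score
        System.out.printf("present: %.3f%n", termScore(3, 120, 100, rareIdf));
        // "玫瑰色" never appears -> score is exactly 0, however good the formula
        System.out.printf("absent:  %.3f%n", termScore(0, 120, 100, rareIdf));
    }
}
```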
Remedy 3: pinyin search and fuzzy matching
These only fix typos; they cannot capture semantic similarity.
2.3 Quantifying the Vocabulary Gap
From the analysis by Chen Lei's team:
| Failure type | Share of failed searches |
|---|---|
| Synonym mismatch (红色 vs 玫瑰色) | 34% |
| Hypernym/hyponym mismatch (裙子 vs 连衣裙) | 28% |
| Attribute phrasing differences (显瘦 vs 修身) | 21% |
| Plain typos | 17% |
Over 83% of failed searches stem from semantic mismatches, not spelling errors. This is exactly the problem semantic search targets.
3. How Semantic Search Works: Similarity in Embedding Space
3.1 From Bag-of-Words to Vector Space
Traditional search treats text as a set of words (the bag-of-words model):

```
"红色裙子"     → {红色:1, 裙子:1}
"玫瑰色连衣裙" → {玫瑰色:1, 连衣裙:1}
```

The two sets have no overlap, so their similarity is 0.
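The zero-overlap claim is easy to verify in code. This sketch assumes a hypothetical word segmentation of the two phrases and uses Jaccard similarity as the set-overlap measure:

```java
import java.util.Set;

// Bag-of-words overlap: "红色裙子" vs "玫瑰色连衣裙" share no tokens,
// so any set-overlap similarity (Jaccard here) is 0.
// The token sets below assume a hypothetical word segmentation.
public class BagOfWordsGap {

    static double jaccard(Set<String> a, Set<String> b) {
        long inter = a.stream().filter(b::contains).count();
        long union = a.size() + b.size() - inter;
        return union == 0 ? 0.0 : (double) inter / union;
    }

    public static void main(String[] args) {
        Set<String> query = Set.of("红色", "裙子");      // "red skirt"
        Set<String> doc   = Set.of("玫瑰色", "连衣裙");   // "rose-colored dress"
        System.out.println(jaccard(query, doc)); // no shared tokens -> 0.0
    }
}
```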
Semantic search instead maps text to high-dimensional vectors (embeddings):

```
"红色裙子"     → [0.23, -0.15, 0.87, 0.42, ..., 0.31]  // 1536 dims
"玫瑰色连衣裙" → [0.25, -0.13, 0.84, 0.39, ..., 0.29]  // 1536 dims
```

The two vectors lie close together in the vector space, with a cosine similarity of ≈ 0.94.
3.2 How Embedding Models Work
An embedding model is trained so that texts that occur in similar contexts map to nearby points in vector space. Semantic relatedness becomes geometric proximity, independent of surface wording, which is exactly what bridges the vocabulary gap.
3.3 Comparing Similarity Metrics
Cosine similarity (most common):

```
cos(A, B) = (A · B) / (|A| × |B|)
```

Euclidean distance:

```
d(A, B) = √Σ(ai - bi)²
```

Dot product:

```
A · B = Σ(ai × bi)
```

For semantic search, cosine similarity is the usual choice: it measures only direction (semantics) and is unaffected by vector magnitude (term frequency).
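A minimal implementation of the cosine formula above. Note how scaling a vector leaves the similarity unchanged, which is the "direction only" property the text describes:

```java
// Cosine similarity as defined above: direction only, invariant to vector length.
public class CosineSim {

    static double cosine(double[] a, double[] b) {
        double dot = 0, normA = 0, normB = 0;
        for (int i = 0; i < a.length; i++) {
            dot += a[i] * b[i];
            normA += a[i] * a[i];
            normB += b[i] * b[i];
        }
        return dot / (Math.sqrt(normA) * Math.sqrt(normB));
    }

    public static void main(String[] args) {
        double[] a = {0.23, -0.15, 0.87};
        double[] b = {0.46, -0.30, 1.74}; // a scaled by 2: same direction
        System.out.printf("%.4f%n", cosine(a, b)); // 1.0000
    }
}
```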
4. Implementing Semantic Search with Spring AI + PGVector
4.1 Dependencies

```xml
<!-- pom.xml -->
<dependencies>
    <!-- Spring AI core -->
    <dependency>
        <groupId>org.springframework.ai</groupId>
        <artifactId>spring-ai-openai-spring-boot-starter</artifactId>
        <version>1.0.0</version>
    </dependency>
    <!-- PGVector vector store -->
    <dependency>
        <groupId>org.springframework.ai</groupId>
        <artifactId>spring-ai-pgvector-store-spring-boot-starter</artifactId>
        <version>1.0.0</version>
    </dependency>
    <!-- Spring Data JPA -->
    <dependency>
        <groupId>org.springframework.boot</groupId>
        <artifactId>spring-boot-starter-data-jpa</artifactId>
    </dependency>
    <!-- PostgreSQL driver -->
    <dependency>
        <groupId>org.postgresql</groupId>
        <artifactId>postgresql</artifactId>
        <scope>runtime</scope>
    </dependency>
</dependencies>
```

4.2 Database Initialization
```sql
-- Enable the pgvector extension
CREATE EXTENSION IF NOT EXISTS vector;

-- Product table (with vector column)
CREATE TABLE products (
    id BIGSERIAL PRIMARY KEY,
    title VARCHAR(500) NOT NULL,
    description TEXT,
    category VARCHAR(100),
    price DECIMAL(10, 2),
    embedding VECTOR(1536),  -- dimension of OpenAI text-embedding-3-small
    created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP
);

-- HNSW index (recommended for production)
CREATE INDEX ON products USING hnsw (embedding vector_cosine_ops)
WITH (m = 16, ef_construction = 64);

-- Or an IVFFlat index (for fewer than ~1M rows)
-- CREATE INDEX ON products USING ivfflat (embedding vector_cosine_ops)
-- WITH (lists = 100);
```

4.3 Core Configuration
```yaml
# application.yml
spring:
  ai:
    openai:
      api-key: ${OPENAI_API_KEY}
      embedding:
        options:
          model: text-embedding-3-small
          dimensions: 1536
    vectorstore:
      pgvector:
        index-type: HNSW
        distance-type: COSINE_DISTANCE
        dimensions: 1536
        initialize-schema: false  # managed via Flyway in production
  datasource:
    url: jdbc:postgresql://localhost:5432/ecommerce
    username: ${DB_USERNAME}
    password: ${DB_PASSWORD}

# Custom semantic-search settings
search:
  semantic:
    top-k: 20                   # recall size
    similarity-threshold: 0.70  # similarity cutoff
    embedding-batch-size: 100   # embedding batch size
```

(Note: in the original draft the `vectorstore` block sat under a second `spring.ai` key, which is invalid YAML; it belongs under the single `spring.ai` node as shown.)
4.4 Product Embedding Service
```java
package com.ecommerce.search.service;

import lombok.extern.slf4j.Slf4j;
import org.springframework.ai.embedding.EmbeddingModel;
import org.springframework.ai.embedding.EmbeddingRequest;
import org.springframework.ai.embedding.EmbeddingResponse;
import org.springframework.ai.openai.OpenAiEmbeddingOptions;
import org.springframework.beans.factory.annotation.Value;
import org.springframework.scheduling.annotation.Async;
import org.springframework.stereotype.Service;
import org.springframework.transaction.annotation.Transactional;

import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.stream.Collectors;
import java.util.stream.IntStream;

@Service
@Slf4j
public class ProductEmbeddingService {

    private final EmbeddingModel embeddingModel;
    private final ProductRepository productRepository;

    @Value("${search.semantic.embedding-batch-size:100}")
    private int batchSize;

    public ProductEmbeddingService(EmbeddingModel embeddingModel,
                                   ProductRepository productRepository) {
        this.embeddingModel = embeddingModel;
        this.productRepository = productRepository;
    }

    /**
     * Build the text to embed for a product.
     * Concatenating title, category, and key description enriches the semantics.
     */
    public String buildEmbeddingText(Product product) {
        StringBuilder sb = new StringBuilder();
        sb.append(product.getTitle());
        if (product.getCategory() != null) {
            sb.append(" ").append(product.getCategory());
        }
        if (product.getKeywords() != null) {
            sb.append(" ").append(String.join(" ", product.getKeywords()));
        }
        // Keep only the first 200 chars of the description to stay under the token limit
        if (product.getDescription() != null) {
            String desc = product.getDescription();
            sb.append(" ").append(desc.substring(0, Math.min(200, desc.length())));
        }
        return sb.toString();
    }

    /**
     * Embed a single product (called on create/update)
     */
    @Transactional
    public void embedProduct(Long productId) {
        Product product = productRepository.findById(productId)
                .orElseThrow(() -> new ProductNotFoundException(productId));
        String text = buildEmbeddingText(product);
        EmbeddingRequest request = new EmbeddingRequest(
                List.of(text),
                OpenAiEmbeddingOptions.builder()
                        .withModel("text-embedding-3-small")
                        .build()
        );
        EmbeddingResponse response = embeddingModel.call(request);
        float[] embedding = response.getResults().get(0).getOutput();
        product.setEmbedding(embedding);
        productRepository.save(product);
        log.info("Product [{}] embedded, vector dimension: {}", productId, embedding.length);
    }

    /**
     * Bulk product embedding (initial load / full refresh).
     * Production-grade: batching + progress logging + skip-on-error.
     */
    @Async("embeddingTaskExecutor")
    @Transactional
    public CompletableFuture<BatchEmbeddingResult> batchEmbedProducts(List<Long> productIds) {
        int total = productIds.size();
        int successCount = 0;
        int failCount = 0;
        List<List<Long>> batches = partitionList(productIds, batchSize);
        for (int batchIndex = 0; batchIndex < batches.size(); batchIndex++) {
            List<Long> batch = batches.get(batchIndex);
            try {
                List<Product> products = productRepository.findAllById(batch);
                List<String> texts = products.stream()
                        .map(this::buildEmbeddingText)
                        .collect(Collectors.toList());
                // One batched embedding call per batch (fewer network round-trips)
                EmbeddingRequest request = new EmbeddingRequest(
                        texts,
                        OpenAiEmbeddingOptions.builder()
                                .withModel("text-embedding-3-small")
                                .build()
                );
                EmbeddingResponse response = embeddingModel.call(request);
                // Write the vectors back onto the products
                for (int i = 0; i < products.size(); i++) {
                    float[] embedding = response.getResults().get(i).getOutput();
                    products.get(i).setEmbedding(embedding);
                }
                productRepository.saveAll(products);
                successCount += products.size();
                log.info("Embedding progress: {}/{} ({} batches done)",
                        successCount, total, batchIndex + 1);
                // Throttle to stay under the API rate limit (60 RPM)
                if (batchIndex < batches.size() - 1) {
                    Thread.sleep(1000);
                }
            } catch (Exception e) {
                log.error("Batch {} failed to embed: {}", batchIndex + 1, e.getMessage());
                failCount += batch.size();
            }
        }
        return CompletableFuture.completedFuture(
                new BatchEmbeddingResult(total, successCount, failCount)
        );
    }

    private <T> List<List<T>> partitionList(List<T> list, int size) {
        return IntStream.range(0, (list.size() + size - 1) / size)
                .mapToObj(i -> list.subList(i * size, Math.min((i + 1) * size, list.size())))
                .collect(Collectors.toList());
    }
}
```

4.5 Core Semantic Search Implementation
```java
package com.ecommerce.search.service;

import lombok.extern.slf4j.Slf4j;
import org.springframework.ai.embedding.EmbeddingModel;
import org.springframework.ai.embedding.EmbeddingRequest;
import org.springframework.ai.openai.OpenAiEmbeddingOptions;
import org.springframework.beans.factory.annotation.Value;
import org.springframework.cache.annotation.Cacheable;
import org.springframework.jdbc.core.JdbcTemplate;
import org.springframework.stereotype.Service;

import java.util.ArrayList;
import java.util.List;
import java.util.stream.Collectors;

@Service
@Slf4j
public class SemanticSearchService {

    private final EmbeddingModel embeddingModel;
    private final JdbcTemplate jdbcTemplate;

    @Value("${search.semantic.top-k:20}")
    private int topK;

    @Value("${search.semantic.similarity-threshold:0.70}")
    private double similarityThreshold;

    public SemanticSearchService(EmbeddingModel embeddingModel,
                                 JdbcTemplate jdbcTemplate) {
        this.embeddingModel = embeddingModel;
        this.jdbcTemplate = jdbcTemplate;
    }

    /**
     * Main entry point for semantic search
     */
    public SemanticSearchResult search(String query, SearchFilter filter) {
        long startTime = System.currentTimeMillis();
        // 1. Convert the query into a vector
        float[] queryEmbedding = embedQuery(query);
        // 2. Nearest-neighbor vector search
        List<ProductScore> results = vectorSearch(queryEmbedding, filter);
        // 3. Drop low-similarity results
        results = results.stream()
                .filter(ps -> ps.getSimilarity() >= similarityThreshold)
                .collect(Collectors.toList());
        long duration = System.currentTimeMillis() - startTime;
        log.info("Semantic search [{}] took {}ms, {} results", query, duration, results.size());
        return SemanticSearchResult.builder()
                .query(query)
                .results(results)
                .totalCount(results.size())
                .searchTimeMs(duration)
                .build();
    }

    /**
     * Embed the query (with caching)
     */
    @Cacheable(value = "queryEmbedding", key = "#query",
               condition = "#query.length() <= 100")
    public float[] embedQuery(String query) {
        EmbeddingRequest request = new EmbeddingRequest(
                List.of(query),
                OpenAiEmbeddingOptions.builder()
                        .withModel("text-embedding-3-small")
                        .build()
        );
        return embeddingModel.call(request).getResults().get(0).getOutput();
    }

    /**
     * Vector search against PGVector.
     * Native SQL for the best performance.
     */
    private List<ProductScore> vectorSearch(float[] queryEmbedding,
                                            SearchFilter filter) {
        // Render the vector as a PGVector literal
        String vectorStr = buildVectorString(queryEmbedding);
        StringBuilder sql = new StringBuilder("""
                SELECT
                    p.id,
                    p.title,
                    p.price,
                    p.category,
                    p.image_url,
                    1 - (p.embedding <=> ?::vector) AS similarity
                FROM products p
                WHERE p.embedding IS NOT NULL
                """);
        // Dynamic filter conditions
        List<Object> params = new ArrayList<>();
        params.add(vectorStr);
        if (filter.getCategory() != null) {
            sql.append(" AND p.category = ?");
            params.add(filter.getCategory());
        }
        if (filter.getMinPrice() != null) {
            sql.append(" AND p.price >= ?");
            params.add(filter.getMinPrice());
        }
        if (filter.getMaxPrice() != null) {
            sql.append(" AND p.price <= ?");
            params.add(filter.getMaxPrice());
        }
        sql.append("""
                ORDER BY p.embedding <=> ?::vector
                LIMIT ?
                """);
        params.add(vectorStr); // note: the ORDER BY clause needs the vector parameter again
        params.add(topK);
        return jdbcTemplate.query(
                sql.toString(),
                params.toArray(),
                (rs, rowNum) -> ProductScore.builder()
                        .productId(rs.getLong("id"))
                        .title(rs.getString("title"))
                        .price(rs.getBigDecimal("price"))
                        .category(rs.getString("category"))
                        .imageUrl(rs.getString("image_url"))
                        .similarity(rs.getDouble("similarity"))
                        .build()
        );
    }

    /**
     * Convert float[] to the PGVector literal format
     */
    private String buildVectorString(float[] embedding) {
        StringBuilder sb = new StringBuilder("[");
        for (int i = 0; i < embedding.length; i++) {
            if (i > 0) sb.append(",");
            sb.append(embedding[i]);
        }
        sb.append("]");
        return sb.toString();
    }
}
```

4.6 REST API Layer
```java
package com.ecommerce.search.controller;

import jakarta.validation.constraints.NotBlank;
import org.springframework.http.ResponseEntity;
import org.springframework.validation.annotation.Validated;
import org.springframework.web.bind.annotation.*;

import java.math.BigDecimal;

@RestController
@RequestMapping("/api/v2/search")
@Validated
public class SearchController {

    private final HybridSearchService hybridSearchService;

    public SearchController(HybridSearchService hybridSearchService) {
        this.hybridSearchService = hybridSearchService;
    }

    @GetMapping("/products")
    public ResponseEntity<SearchResponse> search(
            @RequestParam @NotBlank String query,
            @RequestParam(defaultValue = "0") int page,
            @RequestParam(defaultValue = "20") int size,
            @RequestParam(required = false) String category,
            @RequestParam(required = false) BigDecimal minPrice,
            @RequestParam(required = false) BigDecimal maxPrice) {
        SearchFilter filter = SearchFilter.builder()
                .category(category)
                .minPrice(minPrice)
                .maxPrice(maxPrice)
                .build();
        SearchResponse response = hybridSearchService.search(query, filter, page, size);
        return ResponseEntity.ok(response);
    }
}
```

5. Vector Search in Elasticsearch: Configuring kNN Queries
5.1 Why Still Elasticsearch?
PGVector works well up to around a million rows. Once the catalog reaches tens of millions of products, you need Elasticsearch's distributed scale-out.
5.2 ES 8.x Index Mapping

```json
PUT /products
{
  "settings": {
    "number_of_shards": 3,
    "number_of_replicas": 1,
    "analysis": {
      "analyzer": {
        "ik_smart_analyzer": {
          "type": "custom",
          "tokenizer": "ik_smart",
          "filter": ["lowercase"]
        }
      }
    }
  },
  "mappings": {
    "properties": {
      "id": { "type": "long" },
      "title": {
        "type": "text",
        "analyzer": "ik_smart_analyzer",
        "fields": {
          "keyword": { "type": "keyword" }
        }
      },
      "description": {
        "type": "text",
        "analyzer": "ik_smart_analyzer"
      },
      "category": { "type": "keyword" },
      "price": { "type": "scaled_float", "scaling_factor": 100 },
      "embedding": {
        "type": "dense_vector",
        "dims": 1536,
        "index": true,
        "similarity": "cosine",
        "index_options": {
          "type": "hnsw",
          "m": 16,
          "ef_construction": 100
        }
      }
    }
  }
}
```

5.3 kNN Search in Java
```java
package com.ecommerce.search.es;

import co.elastic.clients.elasticsearch.ElasticsearchClient;
import co.elastic.clients.elasticsearch._types.query_dsl.*;
import co.elastic.clients.elasticsearch.core.BulkRequest;
import co.elastic.clients.elasticsearch.core.BulkResponse;
import co.elastic.clients.elasticsearch.core.SearchResponse;
import co.elastic.clients.json.JsonData;
import lombok.extern.slf4j.Slf4j;
import org.springframework.stereotype.Service;

import java.io.IOException;
import java.util.ArrayList;
import java.util.List;
import java.util.stream.Collectors;

@Service
@Slf4j
public class ElasticsearchSemanticService {

    private final ElasticsearchClient esClient;
    private final ProductEmbeddingService embeddingService;

    public ElasticsearchSemanticService(ElasticsearchClient esClient,
                                        ProductEmbeddingService embeddingService) {
        this.esClient = esClient;
        this.embeddingService = embeddingService;
    }

    /**
     * Pure semantic search in ES (kNN query)
     */
    public List<ProductScore> knnSearch(String query,
                                        SearchFilter filter,
                                        int topK) throws IOException {
        float[] queryVector = embeddingService.embedQuery(query);
        // Build the kNN query
        KnnQuery knnQuery = KnnQuery.of(k -> k
                .field("embedding")
                .queryVector(toFloatList(queryVector))
                .numCandidates(topK * 5) // candidate pool size, typically 5-10x topK
                .k(topK)
                // Filter applied inside the kNN search
                .filter(buildFilterQuery(filter))
        );
        SearchResponse<ProductDocument> response = esClient.search(s -> s
                        .index("products")
                        .knn(knnQuery)
                        .source(src -> src
                                .filter(f -> f
                                        .includes("id", "title", "price", "category", "image_url")
                                )
                        ),
                ProductDocument.class
        );
        return response.hits().hits().stream()
                .map(hit -> ProductScore.builder()
                        .productId(hit.source().getId())
                        .title(hit.source().getTitle())
                        .price(hit.source().getPrice())
                        .similarity(hit.score() != null ? hit.score() : 0.0)
                        .build()
                )
                .collect(Collectors.toList());
    }

    /**
     * Index a product document (with its vector)
     */
    public void indexProduct(Product product) throws IOException {
        float[] embedding = embeddingService.embedQuery(
                embeddingService.buildEmbeddingText(product)
        );
        ProductDocument doc = ProductDocument.builder()
                .id(product.getId())
                .title(product.getTitle())
                .description(product.getDescription())
                .category(product.getCategory())
                .price(product.getPrice())
                .embedding(toFloatList(embedding))
                .build();
        esClient.index(i -> i
                .index("products")
                .id(String.valueOf(product.getId()))
                .document(doc)
        );
    }

    /**
     * Bulk indexing (initial load)
     */
    public void bulkIndex(List<Product> products) throws IOException {
        BulkRequest.Builder br = new BulkRequest.Builder();
        for (Product product : products) {
            float[] embedding = embeddingService.embedQuery(
                    embeddingService.buildEmbeddingText(product)
            );
            ProductDocument doc = toDocument(product, embedding);
            br.operations(op -> op
                    .index(idx -> idx
                            .index("products")
                            .id(String.valueOf(product.getId()))
                            .document(doc)
                    )
            );
        }
        BulkResponse result = esClient.bulk(br.build());
        if (result.errors()) {
            log.error("Bulk indexing had errors, please inspect");
            result.items().stream()
                    .filter(item -> item.error() != null)
                    .forEach(item -> log.error("Document {} failed to index: {}",
                            item.id(), item.error().reason()));
        }
        log.info("Bulk indexing done: {} docs in {}ms",
                products.size(), result.took());
    }

    private List<Float> toFloatList(float[] array) {
        List<Float> list = new ArrayList<>(array.length);
        for (float f : array) list.add(f);
        return list;
    }

    private Query buildFilterQuery(SearchFilter filter) {
        BoolQuery.Builder bool = new BoolQuery.Builder();
        if (filter.getCategory() != null) {
            bool.filter(f -> f
                    .term(t -> t.field("category").value(filter.getCategory()))
            );
        }
        if (filter.getMinPrice() != null || filter.getMaxPrice() != null) {
            RangeQuery.Builder range = new RangeQuery.Builder().field("price");
            if (filter.getMinPrice() != null) {
                range.gte(JsonData.of(filter.getMinPrice()));
            }
            if (filter.getMaxPrice() != null) {
                range.lte(JsonData.of(filter.getMaxPrice()));
            }
            bool.filter(f -> f.range(range.build()));
        }
        return bool.build()._toQuery();
    }
}
```

5.4 ES Configuration Class
```java
import co.elastic.clients.elasticsearch.ElasticsearchClient;
import co.elastic.clients.json.jackson.JacksonJsonpMapper;
import co.elastic.clients.transport.ElasticsearchTransport;
import co.elastic.clients.transport.rest_client.RestClientTransport;
import org.apache.http.HttpHost;
import org.apache.http.auth.AuthScope;
import org.apache.http.auth.UsernamePasswordCredentials;
import org.apache.http.client.CredentialsProvider;
import org.apache.http.client.config.RequestConfig;
import org.apache.http.impl.client.BasicCredentialsProvider;
import org.elasticsearch.client.RestClient;
import org.elasticsearch.client.RestClientBuilder;
import org.springframework.beans.factory.annotation.Value;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;

@Configuration
public class ElasticsearchConfig {

    @Value("${elasticsearch.host:localhost}")
    private String host;

    @Value("${elasticsearch.port:9200}")
    private int port;

    @Value("${elasticsearch.username:elastic}")
    private String username;

    @Value("${elasticsearch.password}")
    private String password;

    @Bean
    public ElasticsearchClient elasticsearchClient() {
        RestClientBuilder builder = RestClient.builder(
                new HttpHost(host, port, "https")
        );
        // Authentication
        final CredentialsProvider credentialsProvider = new BasicCredentialsProvider();
        credentialsProvider.setCredentials(
                AuthScope.ANY,
                new UsernamePasswordCredentials(username, password)
        );
        builder.setHttpClientConfigCallback(httpClientBuilder ->
                httpClientBuilder
                        .setDefaultCredentialsProvider(credentialsProvider)
                        // Connection pool
                        .setMaxConnTotal(200)
                        .setMaxConnPerRoute(50)
                        // Timeouts
                        .setDefaultRequestConfig(RequestConfig.custom()
                                .setConnectTimeout(5000)
                                .setSocketTimeout(30000)
                                .build()
                        )
        );
        ElasticsearchTransport transport = new RestClientTransport(
                builder.build(),
                new JacksonJsonpMapper()
        );
        return new ElasticsearchClient(transport);
    }
}
```

6. Hybrid Search: Fusing BM25 Keywords with Vector Semantics
6.1 Why Hybrid Search
Problems with pure semantic search:
- Weak at exact matches (SKUs, model numbers)
- Poor semantic discrimination for high-frequency terms ("苹果手机" the phone vs "苹果" the fruit)
Hybrid strategy: use semantic search to boost recall, and keyword search to protect precision.
6.2 The RRF Algorithm (Reciprocal Rank Fusion)
RRF is the fusion algorithm most widely used in industry today, and is built into ES 8.8+:

```
RRF_Score(d) = Σ 1 / (k + rank_i(d))
```

where k = 60 (a constant) and rank_i(d) is document d's rank in the i-th result list.
6.3 Java实现RRF融合
package com.ecommerce.search.hybrid;
import org.springframework.stereotype.Service;
import java.util.*;
import java.util.stream.Collectors;
@Service
@Slf4j
public class HybridSearchService {
private static final int RRF_K = 60;
private final ElasticsearchSemanticService esSemanticService;
private final ElasticsearchKeywordService esKeywordService;
private final SearchResultReranker reranker;
/**
* 混合搜索主流程
*/
public SearchResponse search(String query,
SearchFilter filter,
int page,
int size) {
int recallSize = Math.max(100, size * 5); // 召回数量
// 并行执行两路搜索
CompletableFuture<List<ProductScore>> semanticFuture =
CompletableFuture.supplyAsync(() -> {
try {
return esSemanticService.knnSearch(query, filter, recallSize);
} catch (Exception e) {
log.error("语义搜索失败", e);
return Collections.emptyList();
}
});
CompletableFuture<List<ProductScore>> keywordFuture =
CompletableFuture.supplyAsync(() -> {
try {
return esKeywordService.bm25Search(query, filter, recallSize);
} catch (Exception e) {
log.error("关键词搜索失败", e);
return Collections.emptyList();
}
});
// 等待两路结果
CompletableFuture.allOf(semanticFuture, keywordFuture).join();
List<ProductScore> semanticResults = semanticFuture.join();
List<ProductScore> keywordResults = keywordFuture.join();
log.info("语义召回: {}条,关键词召回: {}条",
semanticResults.size(), keywordResults.size());
// RRF融合
List<ProductScore> merged = rrfMerge(semanticResults, keywordResults);
// 分页
int startIdx = page * size;
int endIdx = Math.min(startIdx + size, merged.size());
List<ProductScore> pageResults = startIdx < merged.size()
? merged.subList(startIdx, endIdx)
: Collections.emptyList();
return SearchResponse.builder()
.query(query)
.results(pageResults)
.totalCount(merged.size())
.page(page)
.size(size)
.build();
}
/**
* RRF融合算法实现
*
* @param list1 语义搜索结果(按相似度降序)
* @param list2 关键词搜索结果(按BM25分降序)
* @return 融合后的结果(按RRF分降序)
*/
public List<ProductScore> rrfMerge(List<ProductScore> list1,
List<ProductScore> list2) {
Map<Long, Double> rrfScores = new HashMap<>();
Map<Long, ProductScore> productMap = new HashMap<>();
// 处理第一个列表(语义搜索)
for (int rank = 0; rank < list1.size(); rank++) {
ProductScore ps = list1.get(rank);
double rrfScore = 1.0 / (RRF_K + rank + 1);
rrfScores.merge(ps.getProductId(), rrfScore, Double::sum);
productMap.put(ps.getProductId(), ps);
}
// 处理第二个列表(关键词搜索)
for (int rank = 0; rank < list2.size(); rank++) {
ProductScore ps = list2.get(rank);
double rrfScore = 1.0 / (RRF_K + rank + 1);
rrfScores.merge(ps.getProductId(), rrfScore, Double::sum);
productMap.putIfAbsent(ps.getProductId(), ps);
}
// 按RRF分降序排列
return rrfScores.entrySet().stream()
.sorted(Map.Entry.<Long, Double>comparingByValue().reversed())
.map(entry -> {
ProductScore ps = productMap.get(entry.getKey());
return ps.toBuilder()
.rrfScore(entry.getValue())
.build();
})
.collect(Collectors.toList());
}
/**
* 带权重的RRF(可调节语义/关键词权重)
*
* @param semanticWeight 语义搜索权重 (0-1)
*/
public List<ProductScore> weightedRrfMerge(List<ProductScore> semanticResults,
List<ProductScore> keywordResults,
double semanticWeight) {
double keywordWeight = 1.0 - semanticWeight;
Map<Long, Double> rrfScores = new HashMap<>();
Map<Long, ProductScore> productMap = new HashMap<>();
// 语义搜索列表(带权重)
for (int rank = 0; rank < semanticResults.size(); rank++) {
ProductScore ps = semanticResults.get(rank);
double rrfScore = semanticWeight / (RRF_K + rank + 1);
rrfScores.merge(ps.getProductId(), rrfScore, Double::sum);
productMap.put(ps.getProductId(), ps);
}
// 关键词搜索列表(带权重)
for (int rank = 0; rank < keywordResults.size(); rank++) {
ProductScore ps = keywordResults.get(rank);
double rrfScore = keywordWeight / (RRF_K + rank + 1);
rrfScores.merge(ps.getProductId(), rrfScore, Double::sum);
productMap.putIfAbsent(ps.getProductId(), ps);
}
return rrfScores.entrySet().stream()
.sorted(Map.Entry.<Long, Double>comparingByValue().reversed())
.map(entry -> {
ProductScore ps = productMap.get(entry.getKey());
return ps.toBuilder().rrfScore(entry.getValue()).build();
})
.collect(Collectors.toList());
}
}6.4 ES原生混合搜索(推荐)
ES 8.8+ ships a native hybrid search API, so no manual fusion is needed in the Java layer:

```java
/**
 * Native ES hybrid search (8.8+)
 */
public List<ProductScore> nativeHybridSearch(String query,
                                             float[] queryVector,
                                             SearchFilter filter,
                                             int topK) throws IOException {
    SearchResponse<ProductDocument> response = esClient.search(s -> s
                    .index("products")
                    // BM25 keyword search
                    .query(q -> q
                            .bool(b -> b
                                    .must(m -> m
                                            .multiMatch(mm -> mm
                                                    .query(query)
                                                    .fields("title^3", "description^1")
                                                    .analyzer("ik_smart_analyzer")
                                            )
                                    )
                                    .filter(buildFilterQuery(filter)._toQuery())
                            )
                    )
                    // kNN semantic search
                    .knn(k -> k
                            .field("embedding")
                            .queryVector(toFloatList(queryVector))
                            .numCandidates(topK * 5)
                            .k(topK)
                            .filter(buildFilterQuery(filter))
                    )
                    // Native RRF fusion
                    .rank(r -> r
                            .rrf(rrf -> rrf
                                    .rankConstant((long) RRF_K)
                                    .windowSize((long) topK * 2)
                            )
                    )
                    .size(topK),
            ProductDocument.class
    );
    return response.hits().hits().stream()
            .map(this::toProductScore)
            .collect(Collectors.toList());
}
```

7. Reranking Search Results: Cross-Encoder Models
7.1 Two-Stage Retrieval Architecture
Stage one recalls a broad candidate set cheaply (BM25 and/or ANN vector search, hundreds of items); stage two reranks only those candidates with a slower, more accurate model: a cross-encoder that scores each query-document pair jointly rather than comparing precomputed vectors.
7.2 Cross-Encoder Reranking in Java
```java
package com.ecommerce.search.rerank;

import com.fasterxml.jackson.databind.ObjectMapper;
import lombok.RequiredArgsConstructor;
import lombok.extern.slf4j.Slf4j;
import org.springframework.ai.chat.client.ChatClient;
import org.springframework.stereotype.Service;

import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.TimeUnit;

@Service
@Slf4j
@RequiredArgsConstructor
public class CrossEncoderReranker {

    private final ChatClient chatClient;
    private final ObjectMapper objectMapper;

    /**
     * LLM-based reranking (fine at small to medium scale).
     * At large scale, use a dedicated cross-encoder model instead.
     */
    public List<ProductScore> rerank(String query,
                                     List<ProductScore> candidates) {
        if (candidates.size() <= 5) {
            return candidates; // too few candidates to bother reranking
        }
        // Build the rerank prompt and call the model (Spring AI 1.0 fluent API)
        String rerankPrompt = buildRerankPrompt(query, candidates);
        String response = chatClient.prompt().user(rerankPrompt).call().content();
        try {
            RerankResult result = objectMapper.readValue(response, RerankResult.class);
            return applyRerankResult(candidates, result);
        } catch (Exception e) {
            log.error("Failed to parse rerank output, keeping original order", e);
            return candidates;
        }
    }

    private String buildRerankPrompt(String query, List<ProductScore> candidates) {
        StringBuilder sb = new StringBuilder();
        // Prompt stays in Chinese for this Chinese-language catalog:
        sb.append("你是一个电商搜索相关性评估专家。\n");         // "You are an e-commerce relevance expert."
        sb.append("用户搜索词: ").append(query).append("\n\n"); // "User query:"
        sb.append("候选商品列表:\n");                           // "Candidate products:"
        for (int i = 0; i < candidates.size(); i++) {
            ProductScore ps = candidates.get(i);
            sb.append(String.format("[%d] ID:%d | %s | ¥%.2f\n",
                    i + 1, ps.getProductId(), ps.getTitle(), ps.getPrice()));
        }
        sb.append("\n请根据与搜索词的相关性,对上述商品重新排序。"); // "Re-rank by relevance."
        sb.append("返回JSON格式: {\"ranking\": [1,3,2,...], \"reason\": \"排序原因\"}");
        return sb.toString();
    }

    /**
     * Async reranking with a timeout fallback (for high-concurrency paths)
     */
    public CompletableFuture<List<ProductScore>> asyncRerank(
            String query, List<ProductScore> candidates) {
        return CompletableFuture.supplyAsync(() -> rerank(query, candidates))
                .orTimeout(3, TimeUnit.SECONDS)
                .exceptionally(ex -> {
                    log.warn("Rerank timed out, falling back to original order");
                    return candidates;
                });
    }
}
```

8. Search Experience Optimization: Spell Correction / Synonyms / Suggestions
8.1 拼写纠错
@Service
public class SpellCorrectionService {
private final ElasticsearchClient esClient;
/**
* 基于ES Suggest的拼写纠错
*/
public String correctSpelling(String query) throws IOException {
SearchResponse<Void> response = esClient.search(s -> s
.index("products")
.suggest(sg -> sg
.suggesters("spell-check", sug -> sug
.text(query)
.term(t -> t
.field("title")
.suggestMode(SuggestMode.Missing)
.sort(SuggestSort.Score)
.maxEdits(2)
.minWordLength(4)
)
)
)
.size(0),
Void.class
);
List<TermSuggestOption> options = response.suggest()
.get("spell-check")
.get(0)
.term()
.options();
if (options.isEmpty()) {
return query; // 无纠错建议
}
// 返回最高分建议
return options.get(0).text();
}
}8.2 搜索建议(Search Suggestion)
```java
@Service
@Slf4j
@RequiredArgsConstructor
public class SearchSuggestionService {

    private final ElasticsearchClient esClient;
    private final RedisTemplate<String, Object> redisTemplate;

    /**
     * Real-time suggestions (prefix match blended with hot searches)
     */
    public List<String> getSuggestions(String prefix, int maxSuggestions) {
        List<String> suggestions = new ArrayList<>();
        // 1. Completion suggestions from ES
        try {
            suggestions.addAll(getCompletionSuggestions(prefix, maxSuggestions));
        } catch (Exception e) {
            log.warn("ES suggestions unavailable", e);
        }
        // 2. Blend in hot search terms
        List<String> hotSearches = getHotSearchKeywords(prefix, 3);
        suggestions.addAll(0, hotSearches);
        // 3. Dedupe and truncate
        return suggestions.stream()
                .distinct()
                .limit(maxSuggestions)
                .collect(Collectors.toList());
    }

    private List<String> getHotSearchKeywords(String prefix, int limit) {
        // Hot searches from a Redis ZSet, ordered by search frequency
        Set<Object> hotWords = redisTemplate.opsForZSet()
                .reverseRange("hot:search:words", 0, 50);
        if (hotWords == null) return Collections.emptyList();
        return hotWords.stream()
                .map(Object::toString)
                .filter(w -> w.startsWith(prefix))
                .limit(limit)
                .collect(Collectors.toList());
    }

    /**
     * Record the search (feeds hot-term statistics)
     */
    public void recordSearch(String query) {
        if (query == null || query.isBlank()) return;
        String normalizedQuery = query.toLowerCase().trim();
        redisTemplate.opsForZSet()
                .incrementScore("hot:search:words", normalizedQuery, 1);
    }
}
```

8.3 Synonym Configuration (ES Side)
```json
PUT /products/_settings
{
  "analysis": {
    "filter": {
      "synonym_filter": {
        "type": "synonym",
        "synonyms": [
          "红色,朱红,玫瑰色,酒红,砖红",
          "裙子,连衣裙,裙装",
          "手机,智能手机,移动电话",
          "笔记本,笔记本电脑,laptop,便携电脑"
        ]
      }
    },
    "analyzer": {
      "ik_synonym_analyzer": {
        "type": "custom",
        "tokenizer": "ik_smart",
        "filter": ["lowercase", "synonym_filter"]
      }
    }
  }
}
```

Note that analysis settings are not dynamic: close the index first (POST /products/_close), apply the change, then reopen it.
9. Performance Tuning: ANN Index Parameters (HNSW Explained)
9.1 Core HNSW Parameters
HNSW (Hierarchical Navigable Small World) is today's dominant approximate nearest neighbor (ANN) algorithm. Its three key knobs are m (maximum connections per node per layer), ef_construction (candidate list size while building the graph), and ef_search (candidate list size at query time); larger values raise recall at the cost of speed and memory.
9.2 Tuning Matrix
| Scenario | M | ef_construction | ef_search | Recall | QPS |
|---|---|---|---|---|---|
| Fast, lower recall (log search) | 8 | 32 | 20 | ~90% | 5000 |
| Balanced (e-commerce) | 16 | 64 | 40 | ~95% | 2000 |
| High recall, slower (medical retrieval) | 32 | 128 | 100 | ~99% | 500 |
9.3 PGVector HNSW Parameters

```sql
-- Create the HNSW index (recommended parameters for the balanced scenario)
CREATE INDEX CONCURRENTLY idx_products_embedding_hnsw
ON products USING hnsw (embedding vector_cosine_ops)
WITH (
    m = 16,               -- max connections per layer
    ef_construction = 64  -- candidate list size during build
);

-- Override ef dynamically at query time
SET hnsw.ef_search = 40;
SELECT id, title, 1 - (embedding <=> '[...]'::vector) AS similarity
FROM products
ORDER BY embedding <=> '[...]'::vector
LIMIT 20;
```

9.4 Performance Test Code
```java
@SpringBootTest
class SemanticSearchPerformanceTest {

    @Autowired
    private SemanticSearchService searchService;

    @Test
    void benchmarkSearchLatency() throws Exception {
        String[] testQueries = {
                "红色裙子", "白色运动鞋", "蓝色牛仔裤",
                "韩版卫衣", "显瘦连衣裙", "大码女装"
        };
        // Warm-up
        for (int i = 0; i < 10; i++) {
            searchService.search(testQueries[i % testQueries.length],
                    SearchFilter.empty());
        }
        // Measurement
        List<Long> latencies = new ArrayList<>();
        for (int i = 0; i < 100; i++) {
            long start = System.nanoTime();
            searchService.search(testQueries[i % testQueries.length],
                    SearchFilter.empty());
            long latency = (System.nanoTime() - start) / 1_000_000;
            latencies.add(latency);
        }
        latencies.sort(Long::compareTo);
        System.out.printf("P50: %dms%n", latencies.get(49));
        System.out.printf("P95: %dms%n", latencies.get(94));
        System.out.printf("P99: %dms%n", latencies.get(98));
        System.out.printf("Max: %dms%n", latencies.get(99));
    }
}
```

9.5 Benchmark Results
Measurements from Chen Lei's team (5M products, 8-core/32GB server):
| Approach | P50 latency | P99 latency | QPS |
|---|---|---|---|
| PGVector + IVFFlat | 45ms | 180ms | 800 |
| PGVector + HNSW (M=16) | 28ms | 95ms | 1500 |
| ES kNN (ef=40) | 35ms | 120ms | 1200 |
| ES hybrid search | 52ms | 160ms | 900 |
Effect of the query-embedding cache: on a cache hit for a repeated query, P50 latency drops from 28ms to 8ms.
10. A/B Testing: Proving Semantic Search Beats Keyword Search
10.1 Evaluation Framework
The core online metrics, tracked per experiment group: search CTR, search conversion rate, zero-result rate, searches per user, and latency percentiles.
10.2 A/B Test Framework Implementation
```java
@Service
@Slf4j
@RequiredArgsConstructor
public class SearchAbTestService {

    private final HybridSearchService hybridSearchService;
    private final KeywordSearchService keywordService;
    private final SearchEventTracker tracker;
    private final AbTestStatsRepository statsRepository;

    /**
     * A/B test routing.
     * Group A: traditional keyword search (control)
     * Group B: hybrid semantic search (treatment)
     */
    public SearchResponse search(String query,
                                 SearchFilter filter,
                                 String userId,
                                 int page,
                                 int size) {
        // Assign the group by hashing userId, so the same user always lands in the same group
        String group = assignGroup(userId);
        long startTime = System.currentTimeMillis();
        SearchResponse response;
        if ("B".equals(group)) {
            response = hybridSearchService.search(query, filter, page, size);
            response.setSearchType("HYBRID_SEMANTIC");
        } else {
            response = keywordService.search(query, filter, page, size);
            response.setSearchType("KEYWORD");
        }
        // Track the search event for later analysis
        tracker.trackSearchEvent(SearchEvent.builder()
                .userId(userId)
                .query(query)
                .group(group)
                .searchType(response.getSearchType())
                .resultCount(response.getTotalCount())
                .latencyMs(System.currentTimeMillis() - startTime)
                .timestamp(Instant.now())
                .build()
        );
        return response;
    }

    /**
     * User assignment (50/50 split)
     */
    private String assignGroup(String userId) {
        int hash = Math.abs(userId.hashCode() % 100);
        return hash < 50 ? "A" : "B";
    }

    /**
     * A/B test report
     */
    public AbTestReport generateReport(LocalDate startDate, LocalDate endDate) {
        // Query aggregate stats from the event-tracking store
        AbTestStats groupA = statsRepository.getStats("A", startDate, endDate);
        AbTestStats groupB = statsRepository.getStats("B", startDate, endDate);
        // Compute lifts
        double ctrLift = (groupB.getCtr() - groupA.getCtr()) / groupA.getCtr() * 100;
        double conversionLift = (groupB.getConversion() - groupA.getConversion())
                / groupA.getConversion() * 100;
        double zeroResultLift = (groupA.getZeroResultRate() - groupB.getZeroResultRate())
                / groupA.getZeroResultRate() * 100;
        // Statistical significance (chi-square test)
        boolean isSignificant = chiSquareTest(
                groupA.getClickCount(), groupA.getSearchCount(),
                groupB.getClickCount(), groupB.getSearchCount()
        );
        return AbTestReport.builder()
                .periodStart(startDate)
                .periodEnd(endDate)
                .groupA(groupA)
                .groupB(groupB)
                .ctrLiftPercent(ctrLift)
                .conversionLiftPercent(conversionLift)
                .zeroResultReductionPercent(zeroResultLift)
                .statisticallySignificant(isSignificant)
                .recommendation(isSignificant && ctrLift > 0 ? "Roll out B to 100%" : "Keep observing")
                .build();
    }

    /**
     * Chi-square significance test
     */
    private boolean chiSquareTest(long clickA, long searchA,
                                  long clickB, long searchB) {
        double pooledCtr = (double) (clickA + clickB) / (searchA + searchB);
        double chiSquare =
                Math.pow(clickA - searchA * pooledCtr, 2) / (searchA * pooledCtr) +
                Math.pow(searchA - clickA - searchA * (1 - pooledCtr), 2) / (searchA * (1 - pooledCtr)) +
                Math.pow(clickB - searchB * pooledCtr, 2) / (searchB * pooledCtr) +
                Math.pow(searchB - clickB - searchB * (1 - pooledCtr), 2) / (searchB * (1 - pooledCtr));
        // p < 0.05 at 1 degree of freedom (critical value 3.841)
        return chiSquare > 3.841;
    }
}
```

10.3 A/B Test Results from Chen Lei's Team
Test window: 2025-10-01 to 2025-10-31 (full month). Traffic split: group A (keyword) 50% vs group B (hybrid semantic) 50%.
| Metric | Group A (keyword) | Group B (hybrid semantic) | Change |
|---|---|---|---|
| Search CTR | 12.3% | 16.8% | +36.6% |
| Search conversion rate | 6.2% | 8.7% | +40.3% |
| Zero-result rate | 18.1% | 4.2% | -76.8% |
| P99 search latency | 85ms | 160ms | +88% (latency cost) |
| Searches per user | 3.2 | 2.8 | -12.5% (users find items faster) |
| Significance | p < 0.001 | -- | significant |
Conclusion: hybrid semantic search lifted conversion by 40% and cut the zero-result rate by 77%, with statistical significance, so group B was rolled out to all traffic.
FAQ
Q1: OpenAI embeddings, or a locally deployed model?
Benchmark the options against your production traffic:
- OpenAI text-embedding-3-small: top quality, ~50-100ms latency, $0.02 per 1M tokens
- Alibaba Tongyi Qianwen embeddings: excellent latency inside China, well suited to Chinese text
- Self-hosted BGE-M3: no per-call cost, but needs a GPU; 20-50ms latency (on an A10 GPU)
Chen Lei's team (Chinese e-commerce) ultimately chose Alibaba Cloud's text-embedding-v3: 30% lower latency than OpenAI, with comparable quality on Chinese text.
Q2: How often do product embeddings need refreshing?
- Title or description change: update immediately (via an MQ message listener)
- Embedding model upgrade: re-embed the full catalog (offline job)
- Routine maintenance: audit embedding quality quarterly
Q3: How long does embedding 100,000 products take?
With text-embedding-3-small (batches of 100, a 60-batches-per-minute API limit):
- 100,000 products = 1,000 batches
- Theoretical time = 1000/60 ≈ 17 minutes
- Allowing for retries and network jitter: roughly 25-30 minutes
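A quick sanity check of the arithmetic above (the 60-batches-per-minute rate limit is the assumption stated in the question):

```java
// Sanity-check the back-of-envelope estimate for full-catalog embedding.
public class EmbeddingTimeEstimate {

    // Ceiling division: number of API batches needed
    static int batches(int products, int batchSize) {
        return (products + batchSize - 1) / batchSize;
    }

    // Minutes at a fixed batches-per-minute rate limit
    static double minutes(int batches, int batchesPerMinute) {
        return (double) batches / batchesPerMinute;
    }

    public static void main(String[] args) {
        int b = batches(100_000, 100);
        System.out.printf("%d batches, ~%.1f minutes%n", b, minutes(b, 60)); // 1000 batches, ~16.7 minutes
    }
}
```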
Q4: Where does the 0.70 similarity threshold come from?
It has to be calibrated against labeled business data:
- Sample 500 random queries
- Have humans label each result relevant / not relevant
- Plot the precision-recall curve and take the threshold that maximizes F1
- For e-commerce the optimum typically lands between 0.65 and 0.75
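The threshold-selection step above can be sketched as a simple sweep over labeled (similarity, relevant) pairs, picking the candidate threshold with the best F1. The labeled data below is synthetic, purely for illustration:

```java
import java.util.List;

// Sweep candidate thresholds over human-labeled pairs and pick the best F1.
// The labeled data used here is synthetic.
public class ThresholdTuning {

    record Labeled(double similarity, boolean relevant) {}

    static double f1At(List<Labeled> data, double threshold) {
        long tp = data.stream().filter(d -> d.similarity() >= threshold && d.relevant()).count();
        long fp = data.stream().filter(d -> d.similarity() >= threshold && !d.relevant()).count();
        long fn = data.stream().filter(d -> d.similarity() < threshold && d.relevant()).count();
        if (tp == 0) return 0.0;
        double precision = (double) tp / (tp + fp);
        double recall = (double) tp / (tp + fn);
        return 2 * precision * recall / (precision + recall);
    }

    static double bestThreshold(List<Labeled> data, double[] candidates) {
        double best = candidates[0], bestF1 = -1;
        for (double t : candidates) {
            double f1 = f1At(data, t);
            if (f1 > bestF1) { bestF1 = f1; best = t; }
        }
        return best;
    }

    public static void main(String[] args) {
        List<Labeled> data = List.of(
                new Labeled(0.92, true), new Labeled(0.81, true), new Labeled(0.74, true),
                new Labeled(0.72, false), new Labeled(0.60, false), new Labeled(0.55, false));
        // On this toy data the sweep lands at 0.70
        System.out.println(bestThreshold(data, new double[]{0.50, 0.60, 0.70, 0.80}));
    }
}
```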
Q5: What about exact-product queries, where semantic search underperforms?
A query like "iPhone 15 Pro 256G 钛金黑" is an exact lookup and should be served by keyword search first.
Solution: intent detection plus routing
- Model/SKU pattern detected → favor keyword search (weight 0.9)
- Ordinary descriptive query → favor semantic search (weight 0.9)
- Mixed → balance the weights (0.5/0.5)
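A sketch of such a router. The regex heuristic and the weight values are illustrative assumptions (a real system might use a trained intent classifier); the returned semantic weight would feed a weighted fusion such as the weightedRrfMerge method from section 6.3:

```java
import java.util.regex.Pattern;

// Route queries by detected intent: SKU/model-style queries go keyword-heavy,
// descriptive queries go semantic-heavy. Pattern and weights are illustrative.
public class SearchIntentRouter {

    // Crude signal for model/SKU queries: the query mixes latin letters and digits,
    // e.g. "iPhone 15 Pro 256G".
    private static final Pattern MODEL_PATTERN =
            Pattern.compile(".*[A-Za-z].*\\d.*|.*\\d.*[A-Za-z].*");

    /** Semantic weight to feed into a weighted fusion (keyword weight = 1 - semantic). */
    static double semanticWeight(String query) {
        // Model/SKU pattern -> keyword-heavy (semantic 0.1 / keyword 0.9);
        // otherwise descriptive -> semantic-heavy. Ambiguous cases could use 0.5/0.5.
        return MODEL_PATTERN.matcher(query).matches() ? 0.1 : 0.9;
    }

    public static void main(String[] args) {
        System.out.println(semanticWeight("iPhone 15 Pro 256G")); // 0.1
        System.out.println(semanticWeight("显瘦连衣裙"));           // 0.9
    }
}
```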
Summary
The migration path to semantic search:

```
Traditional keyword search
  ↓ add embeddings + a vector database
Semantic recall
  ↓ fuse with BM25 keyword search (RRF)
Hybrid search
  ↓ cross-encoder reranking
Precise ranking
  ↓ suggestions + spell correction
Complete search experience
```

Chen Lei's team finished the migration in three weeks and gained a 40% lift in conversion. The technical core: Spring AI + PGVector for catalogs up to the millions of items, or Elasticsearch kNN for tens of millions, with RRF as the fusion strategy: simple and effective.
Most important of all: quantify the impact with A/B tests. Let the data speak, not gut feeling.
