Spring AI + Qdrant: A Complete Guide to Integrating a High-Performance Vector Database

Lao Zhang · 2026/4/30 · about 6 min read

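Who this is for: Java engineers who need to run high-performance vector search in production, and architects who are choosing a vector database
Reading time: about 16 minutes
What you get: a complete zero-to-production Qdrant integration, including how it compares with PGVector when choosing, plus advanced filtering and performance tuning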

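The discussion that started the technology choice

A member of my community, Lao Qian, built an internal enterprise knowledge base on PGVector last year. After six months in production, the corpus had grown from 20,000 documents to 500,000, and problems started to show: retrieval P99 latency jumped from 80ms to 1.2s, and the operations burden grew too. The PostgreSQL instance was originally there only for business data, and now the vector index was filling up its memory.

He asked me: "Should we add machines, or switch vector databases?"

I told him that if 500,000 were the ceiling, adding one machine to PGVector would be enough. But the corpus had grown 25-fold in six months; at anything close to that pace it would pass 10 million within a year, and by then a switch would be unavoidable.

They ended up choosing Qdrant.

This article walks through the whole path: where Qdrant fits, how to integrate it with Spring AI, and how to tune it for production.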

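What Qdrant is, and how it differs from other vector databases

Selection guidance:

Scenario	Recommendation
Fewer than 500,000 documents, team is comfortable with PostgreSQL	PGVector
500,000 to 50 million documents, complex filtering required	Qdrant
More than 50 million documents, horizontal scaling required	Milvus
Graph relationships combined with vector search	Weaviate

What sets Qdrant apart: it is written in Rust and delivers strong vector search performance, and it supports a rich set of payload filter conditions (type, range, geo, nested). In filter-heavy workloads, that has given it a clear edge in my experience.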

Environment setup

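# docker-compose.yml
# Note: the inline `content` under `configs` needs a recent Docker Compose release
# (2.23.1 or later, to my knowledge); Compose v2 ignores the top-level `version` key.
version: '3.8'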
services:
  qdrant:
    image: qdrant/qdrant:v1.9.0
    container_name: qdrant
    ports:
      - "6333:6333"   # REST API
      - "6334:6334"   # gRPC (recommended in production for better performance)
    volumes:
      - qdrant_data:/qdrant/storage
    environment:
      - QDRANT__SERVICE__GRPC_PORT=6334
      - QDRANT__LOG_LEVEL=INFO
    configs:
      - source: qdrant_config
        target: /qdrant/config/production.yaml

configs:
  qdrant_config:
    content: |
      storage:
        hnsw_index:
          m: 32
          ef_construct: 128
          full_scan_threshold: 10000

volumes:
  qdrant_data:

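Spring Boot dependencies (no versions are declared here, so this assumes the Spring AI BOM is imported in dependencyManagement):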

<dependencies>
    <dependency>
        <groupId>org.springframework.ai</groupId>
        <artifactId>spring-ai-qdrant-store-spring-boot-starter</artifactId>
    </dependency>
    <dependency>
        <groupId>org.springframework.ai</groupId>
        <artifactId>spring-ai-openai-spring-boot-starter</artifactId>
    </dependency>
</dependencies>

application.yml：

spring:
  ai:
    openai:
      api-key: ${OPENAI_API_KEY}
      embedding:
        options:
          model: text-embedding-3-small
          dimensions: 1536
    vectorstore:
      qdrant:
        host: localhost
        port: 6334
        use-tls: false
        collection-name: knowledge_base
        initialize-schema: true

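Collection creation and schema design

Spring AI can create the Collection automatically, but in production I recommend creating it manually for finer control. If you create it yourself, initialize-schema can be set to false.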

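import io.qdrant.client.QdrantClient;
import io.qdrant.client.grpc.Collections.Distance;
import io.qdrant.client.grpc.Collections.HnswConfigDiff;
import io.qdrant.client.grpc.Collections.PayloadSchemaType;
import io.qdrant.client.grpc.Collections.VectorParams;
import jakarta.annotation.PostConstruct;
import lombok.extern.slf4j.Slf4j;
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.context.annotation.Configuration;

@Configuration
@Slf4j
public class QdrantCollectionConfig {

    @Autowired
    private QdrantClient qdrantClient;

    @PostConstruct
    public void initCollection() throws Exception {
        String collectionName = "knowledge_base";

        // Check whether the Collection already exists
        boolean exists = qdrantClient.collectionExistsAsync(collectionName).get();

        if (!exists) {
            log.info("Creating Qdrant Collection: {}", collectionName);

            // The Java client takes VectorParams directly for a single unnamed vector
            qdrantClient.createCollectionAsync(
                    collectionName,
                    VectorParams.newBuilder()
                            .setSize(1536)  // vector dimension, must match the embedding model
                            .setDistance(Distance.Cosine)
                            .setHnswConfig(HnswConfigDiff.newBuilder()
                                    .setM(32)
                                    .setEfConstruct(128)
                                    .build())
                            .build()
            ).get();

            // Create payload indexes (speeds up filtered queries).
            // Trailing nulls: indexParams, wait, ordering, timeout (client defaults).
            qdrantClient.createPayloadIndexAsync(
                    collectionName,
                    "category",                 // field name
                    PayloadSchemaType.Keyword,  // field type
                    null, null, null, null
            ).get();

            qdrantClient.createPayloadIndexAsync(
                    collectionName,
                    "created_at",
                    PayloadSchemaType.Integer,
                    null, null, null, null
            ).get();

            log.info("Qdrant Collection created: {}", collectionName);
        }
    }
}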
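The collection above is configured with Distance.Cosine, and the similarity thresholds used in the search code are compared against that score. As a reminder of what the number means, here is a minimal sketch of cosine similarity in plain Java. The class name CosineSketch is my own for illustration; Qdrant computes this internally and this is not its implementation.

```java
final class CosineSketch {
    private CosineSketch() {}

    /** Cosine similarity: dot(a, b) / (|a| * |b|), in the range [-1, 1]. */
    static double cosine(float[] a, float[] b) {
        if (a.length != b.length) {
            throw new IllegalArgumentException("dimension mismatch: " + a.length + " vs " + b.length);
        }
        double dot = 0.0, normA = 0.0, normB = 0.0;
        for (int i = 0; i < a.length; i++) {
            dot += (double) a[i] * b[i];
            normA += (double) a[i] * a[i];
            normB += (double) b[i] * b[i];
        }
        if (normA == 0.0 || normB == 0.0) {
            throw new IllegalArgumentException("cosine similarity is undefined for a zero vector");
        }
        return dot / Math.sqrt(normA * normB);
    }
}
```

Vectors pointing the same way score 1, orthogonal vectors score 0, so a threshold of 0.75 keeps only fairly close matches.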

Embedding and writing documents

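import java.time.ZoneOffset;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import lombok.extern.slf4j.Slf4j;
import org.springframework.ai.document.Document;
import org.springframework.ai.transformer.splitter.TextSplitter;
import org.springframework.ai.transformer.splitter.TokenTextSplitter;
import org.springframework.ai.vectorstore.VectorStore;
import org.springframework.stereotype.Service;

@Service
@Slf4j
public class QdrantIngestionService {

    private final VectorStore vectorStore;
    private final TextSplitter textSplitter;

    public QdrantIngestionService(VectorStore vectorStore) {
        this.vectorStore = vectorStore;
        this.textSplitter = new TokenTextSplitter(800, 150, 5, 10000, true);
    }

    /**
     * Embed and store one document.
     * Qdrant payloads accept arbitrary JSON-like fields, which makes them very flexible.
     */
    public void ingestDocument(DocumentDTO dto) {
        Document rawDoc = new Document(dto.getContent(), buildMetadata(dto));

        // Split into chunks
        List<Document> chunks = textSplitter.apply(List.of(rawDoc));

        // Add per-chunk metadata
        for (int i = 0; i < chunks.size(); i++) {
            chunks.get(i).getMetadata().put("chunkIndex", i);
            chunks.get(i).getMetadata().put("totalChunks", chunks.size());
        }

        vectorStore.add(chunks);
        log.info("Document written to Qdrant: docId={}, chunks={}", dto.getId(), chunks.size());
    }

    /**
     * Batch write (higher throughput).
     */
    public void batchIngest(List<DocumentDTO> documents) {
        List<Document> allChunks = new ArrayList<>();

        for (DocumentDTO dto : documents) {
            Document rawDoc = new Document(dto.getContent(), buildMetadata(dto));
            allChunks.addAll(textSplitter.apply(List.of(rawDoc)));
        }

        // Hand all chunks to the store in one call
        vectorStore.add(allChunks);
        log.info("Batch write finished: documents={}, chunks={}",
                documents.size(), allChunks.size());
    }

    /**
     * Payload fields shared by both write paths. Unlike a relational table,
     * the payload is not bound to a fixed schema.
     * "category" and "created_at" must match the payload index names and the filter expressions.
     */
    private Map<String, Object> buildMetadata(DocumentDTO dto) {
        Map<String, Object> metadata = new HashMap<>();
        metadata.put("docId", dto.getId());
        metadata.put("title", dto.getTitle());
        metadata.put("category", dto.getCategory());
        metadata.put("department", dto.getDepartment());
        metadata.put("created_at", dto.getCreatedAt().toEpochSecond(ZoneOffset.UTC));
        metadata.put("tags", dto.getTags());                       // arrays are supported
        metadata.put("confidenceLevel", dto.getConfidenceLevel()); // numbers are supported
        return metadata;
    }
}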
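Handing every chunk to the store in a single call is simple, but with a large import the embedding request behind it can get very big. A hedged alternative is to split the chunk list into fixed-size batches first. The sketch below is plain Java; BatchWriterSketch, partition and writeInBatches are names I made up for illustration, and the right batch size depends on your embedding provider's limits.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

final class BatchWriterSketch {
    private BatchWriterSketch() {}

    /** Splits items into consecutive batches of at most batchSize elements. */
    static <T> List<List<T>> partition(List<T> items, int batchSize) {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        List<List<T>> batches = new ArrayList<>();
        for (int start = 0; start < items.size(); start += batchSize) {
            int end = Math.min(start + batchSize, items.size());
            batches.add(new ArrayList<>(items.subList(start, end)));
        }
        return batches;
    }

    /** Feeds each batch to the sink in order and returns the number of sink calls. */
    static <T> int writeInBatches(List<T> items, int batchSize, Consumer<List<T>> sink) {
        int calls = 0;
        for (List<T> batch : partition(items, batchSize)) {
            sink.accept(batch);
            calls++;
        }
        return calls;
    }
}
```

In the ingestion service this would look like BatchWriterSketch.writeInBatches(allChunks, 100, vectorStore::add), where 100 is only a starting point to tune.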

Advanced filtered search (Qdrant's core strength)

Payload filtering is Qdrant's strongest feature, and Spring AI exposes it through its filter expressions:

import java.time.LocalDateTime;
import java.time.ZoneOffset;
import java.util.ArrayList;
import java.util.List;
import lombok.RequiredArgsConstructor;
import lombok.extern.slf4j.Slf4j;
import org.springframework.ai.document.Document;
import org.springframework.ai.vectorstore.SearchRequest;
import org.springframework.ai.vectorstore.VectorStore;
import org.springframework.stereotype.Service;

@Service
@Slf4j
@RequiredArgsConstructor  // generates the constructor for the final field
public class QdrantSearchService {

    private final VectorStore vectorStore;

    /**
     * Basic similarity search
     */
    public List<Document> basicSearch(String query, int topK) {
        return vectorStore.similaritySearch(
                SearchRequest.query(query)
                        .withTopK(topK)
                        .withSimilarityThreshold(0.75)
        );
    }

    /**
     * Search filtered by department and category.
     * Qdrant evaluates payload filters during the index traversal rather than
     * after it, so filtered queries usually stay fast.
     * Values are interpolated as-is: validate them if they come from user input.
     */
    public List<Document> searchByDepartment(String query,
                                               String department,
                                               String category,
                                               int topK) {
        // Spring AI filter expression DSL
        String filterExpression = String.format(
                "department == '%s' && category == '%s'",
                department, category
        );

        return vectorStore.similaritySearch(
                SearchRequest.query(query)
                        .withTopK(topK)
                        .withSimilarityThreshold(0.7)
                        .withFilterExpression(filterExpression)
        );
    }

    /**
     * Time range filter (by creation time, stored as epoch seconds in UTC)
     */
    public List<Document> searchByTimeRange(String query,
                                              LocalDateTime from,
                                              LocalDateTime to,
                                              int topK) {
        long fromEpoch = from.toEpochSecond(ZoneOffset.UTC);
        long toEpoch = to.toEpochSecond(ZoneOffset.UTC);

        String filterExpression = String.format(
                "created_at >= %d && created_at <= %d",
                fromEpoch, toEpoch
        );

        return vectorStore.similaritySearch(
                SearchRequest.query(query)
                        .withTopK(topK)
                        .withFilterExpression(filterExpression)
        );
    }

    /**
     * Compound filter with several optional conditions.
     * Note: Qdrant applies the filter at the vector index level instead of
     * post-filtering the nearest neighbours, so the cost of filtering is usually small.
     */
    public List<Document> advancedSearch(SearchCriteria criteria) {
        List<String> conditions = new ArrayList<>();

        if (criteria.getDepartment() != null) {
            conditions.add("department == '" + criteria.getDepartment() + "'");
        }
        if (criteria.getCategory() != null) {
            conditions.add("category == '" + criteria.getCategory() + "'");
        }
        if (criteria.getMinConfidence() != null) {
            conditions.add("confidenceLevel >= " + criteria.getMinConfidence());
        }
        if (criteria.getFromDate() != null) {
            conditions.add("created_at >= " +
                    criteria.getFromDate().toEpochSecond(ZoneOffset.UTC));
        }

        // SearchRequest.query(...) returns a SearchRequest whose with* methods chain,
        // consistent with the other methods in this class
        SearchRequest request = SearchRequest.query(criteria.getQuery())
                .withTopK(criteria.getTopK())
                .withSimilarityThreshold(criteria.getMinScore());

        if (!conditions.isEmpty()) {
            request = request.withFilterExpression(String.join(" && ", conditions));
        }

        return vectorStore.similaritySearch(request);
    }
}
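The search methods above build filter strings by concatenating caller-supplied values, so a value containing a single quote could change the meaning of the expression. One way to guard against that is to validate values before they are interpolated. The following is a minimal sketch in plain Java under my own assumptions: FilterSketch and its allow-list pattern are hypothetical, it only produces the same "field == 'value' && ..." strings used above, and field names are assumed to be developer-controlled constants. Spring AI also ships a programmatic FilterExpressionBuilder, which may be the better fit if it is available in your version.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Optional;
import java.util.regex.Pattern;

final class FilterSketch {
    // Allow-list for string values: letters, digits, space, underscore, dot, hyphen.
    // Quotes and operators are rejected, so a value cannot break out of its literal.
    private static final Pattern SAFE_VALUE = Pattern.compile("[\\p{L}\\p{N} _.\\-]{1,64}");

    private final List<String> conditions = new ArrayList<>();

    /** Adds field == 'value'; a null value means "no condition". */
    FilterSketch eq(String field, String value) {
        if (value == null) {
            return this;
        }
        if (!SAFE_VALUE.matcher(value).matches()) {
            throw new IllegalArgumentException("unsafe filter value for field " + field);
        }
        conditions.add(field + " == '" + value + "'");
        return this;
    }

    /** Adds field >= value for numeric fields; a null value means "no condition". */
    FilterSketch gte(String field, Number value) {
        if (value != null) {
            conditions.add(field + " >= " + value);
        }
        return this;
    }

    /** Adds field <= value for numeric fields; a null value means "no condition". */
    FilterSketch lte(String field, Number value) {
        if (value != null) {
            conditions.add(field + " <= " + value);
        }
        return this;
    }

    /** Joins all conditions with &&, or returns empty when there are none. */
    Optional<String> build() {
        return conditions.isEmpty()
                ? Optional.empty()
                : Optional.of(String.join(" && ", conditions));
    }
}
```

A call such as new FilterSketch().eq("department", department).gte("created_at", fromEpoch).build() yields the expression string, or an empty Optional when no criteria were set, in which case the search runs without a filter.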

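Using the native Qdrant client directly (advanced usage)

Spring AI wraps the common operations, but when you need Qdrant-specific advanced features, use the native client directly: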

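import io.qdrant.client.QdrantClient;
import io.qdrant.client.grpc.Collections.CollectionInfo;
import io.qdrant.client.grpc.Points.BatchResult;
import io.qdrant.client.grpc.Points.ScoredPoint;
import io.qdrant.client.grpc.Points.SearchPoints;
import io.qdrant.client.grpc.Points.WithPayloadSelector;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;
import lombok.RequiredArgsConstructor;
import lombok.extern.slf4j.Slf4j;
import org.springframework.ai.embedding.EmbeddingModel;
import org.springframework.stereotype.Service;

@Service
@Slf4j
@RequiredArgsConstructor  // generates the constructor for the final fields
public class QdrantNativeService {

    private final QdrantClient qdrantClient;
    private final EmbeddingModel embeddingModel;

    private static final String COLLECTION = "knowledge_base";

    /**
     * Batch search through the native client.
     * The Spring AI VectorStore abstraction does not expose this (in the version
     * used here), so it goes straight to the client.
     */
    public List<List<ScoredPoint>> batchSearch(List<String> queries, int topK) {
        // Generate all embeddings in one batched call
        List<float[]> embeddings = embeddingModel.embed(queries);

        // Build one search request per query
        List<SearchPoints> searchRequests = embeddings.stream()
                .map(embedding -> SearchPoints.newBuilder()
                        .setCollectionName(COLLECTION)
                        .addAllVector(floatArrayToList(embedding))
                        .setLimit(topK)
                        .setWithPayload(WithPayloadSelector.newBuilder()
                                .setEnable(true)
                                .build())
                        .build())
                .collect(Collectors.toList());

        try {
            return qdrantClient.searchBatchAsync(COLLECTION, searchRequests, null)
                    .get()
                    .stream()
                    .map(BatchResult::getResultList)
                    .collect(Collectors.toList());
        } catch (Exception e) {
            log.error("Batch search failed", e);
            throw new RuntimeException(e);
        }
    }

    /**
     * Collection statistics (for monitoring)
     */
    public Map<String, Object> getCollectionStats() {
        try {
            CollectionInfo info = qdrantClient.getCollectionInfoAsync(COLLECTION).get();
            return Map.of(
                    "vectorCount", info.getVectorsCount(),
                    "indexedVectorCount", info.getIndexedVectorsCount(),
                    "status", info.getStatus().name(),
                    "segmentsCount", info.getSegmentsCount()
            );
        } catch (Exception e) {
            log.error("Failed to fetch Collection info", e);
            return Map.of("error", String.valueOf(e.getMessage()));
        }
    }

    // The gRPC builder expects boxed Float values
    private List<Float> floatArrayToList(float[] array) {
        List<Float> list = new ArrayList<>(array.length);
        for (float f : array) list.add(f);
        return list;
    }
}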

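Side-by-side comparison with PGVector

After two real projects, I put together this comparison. Treat the figures as rough numbers from those projects rather than as a formal benchmark:

Dimension	PGVector	Qdrant
Deployment complexity	Low (PostgreSQL extension)	Medium (standalone service)
QPS at 1 million vectors	~300	~2000
QPS at 5 million vectors	~80	~1800
Complex payload filtering	Fair (post-filtering)	Excellent (pre-filtering)
JOIN with business data	Supported	Not supported
Memory footprint (per million vectors)	~4GB	~2.5GB
Horizontal scaling	Difficult	Natively supported
Operations tooling	Mature (PostgreSQL ecosystem)	Solid (built-in Qdrant UI)

Qdrant's built-in Web UI (http://localhost:6333/dashboard) is very handy: you can inspect Collections and try out searches right in the browser, which makes debugging much easier.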
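Numbers like the QPS and P99 figures in this article depend heavily on hardware, vector dimension and query mix, so it is worth measuring on your own data before deciding. As a starting point, here is a minimal sketch of a nearest-rank percentile over recorded latencies. LatencySketch is a hypothetical helper of mine, not part of Qdrant or Spring AI, and a real benchmark would also need warm-up and concurrent load.

```java
import java.util.Arrays;

final class LatencySketch {
    private LatencySketch() {}

    /**
     * Nearest-rank percentile: the smallest sample such that at least
     * p percent of all samples are less than or equal to it.
     */
    static long percentile(long[] samplesMillis, double p) {
        if (samplesMillis.length == 0) {
            throw new IllegalArgumentException("no samples");
        }
        if (p <= 0 || p > 100) {
            throw new IllegalArgumentException("p must be in (0, 100]");
        }
        long[] sorted = samplesMillis.clone();  // leave the caller's array untouched
        Arrays.sort(sorted);
        int rank = (int) Math.ceil(p * sorted.length / 100.0);  // 1-based rank
        return sorted[rank - 1];
    }
}
```

Record one latency per query (for example with System.nanoTime() around similaritySearch), then call LatencySketch.percentile(samples, 99) to get the P99.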

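Summary

Qdrant is the vector database I currently recommend most often for production, for practical reasons:

Strong single-node performance, and it does very well in the 500,000 to 10 million range
Implemented in Rust, so it is efficient with memory and CPU
Powerful payload filtering, a good fit for complex business filtering
Native Spring AI support, so integration is simple

After Lao Qian finished the migration, retrieval P99 on the 500,000-document corpus dropped from 1.2s to 110ms, and memory usage dropped from 12GB to 5GB. He said: "If I had known, I would have started planning this at 200,000 documents instead of waiting until we were about to burst."

Exactly. A technology choice has to look at the scale you will have half a year from now, not only at today's.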