Spring AI Complete Tutorial: Integrating ChatClient, EmbeddingModel, and VectorStore with Spring Boot

老张 (Lao Zhang) · 2026/4/30 · about 9 min read

Audience: Java/Spring Boot engineers | Reading time: about 25 minutes | Dependencies: Spring AI 1.0.0, Spring Boot 3.3

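Opening Story

Two years ago I started using AI capabilities in Java projects. Spring AI did not exist yet, so integrating OpenAI meant wrapping HttpClient yourself, hand-writing JSON serialization, managing token limits, and handling streaming responses. Getting a usable chat endpoint took half a day, and every developer's wrapper looked different: the codebase ended up littered with classes like OpenAiClient, GptService, and AiHelper.

Then Spring AI came out. My first reaction was "yet another framework, is it worth learning?" After using it once I never went back. The three core abstractions, ChatClient, EmbeddingModel, and VectorStore, are cleanly designed and thoroughly Spring in style, so a Java engineer needs almost no time to adapt. More importantly, it puts OpenAI, Azure OpenAI, Anthropic, Ollama, Tongyi Qianwen (Qwen), and more than a dozen other model providers behind one abstraction: switching providers means changing configuration, not business code.

This article collects the core usage of Spring AI 1.0 into one tutorial, from environment setup to production-grade practice, covering all three core components.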

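1. Core Problem Analysis

Why use Spring AI instead of each vendor's SDK directly?

Problem 1: Vendor lock-in. If you code directly against the OpenAI SDK, moving to Claude or a domestic model such as Tongyi Qianwen means rewriting a large part of the business code. With Spring AI's unified interfaces, switching providers comes down to editing a configuration file.

Problem 2: No Spring ecosystem integration. The OpenAI SDK itself has no support for Spring Boot auto-configuration, @Bean injection, or a configuration center. Spring AI ships full starters and plugs into the Spring ecosystem directly.

Problem 3: Fragmented RAG infrastructure. Embedding, VectorStore, and Document handling get reimplemented in every project, reinventing the wheel each time. Spring AI provides standardized RAG building blocks.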

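2. How It Works

2.1 Spring AI Core Architecture

2.2 ChatClient's Fluent API Design

ChatClient in Spring AI 1.0 uses a Builder plus Fluent API pattern. Calls chain together, and each step has a clear meaning:

chatClient
  .prompt()           // start building the prompt
  .system(...)        // system prompt
  .user(...)          // user message
  .advisors(...)      // interceptors (RAG, memory, etc.)
  .options(...)       // model parameters
  .call()             // send the request
  .content()          // get the text result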
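
To make the chain above concrete, here is a toy model in plain Java of how a fluent request spec can accumulate state until a terminal call. This is only a sketch of the pattern, not Spring AI's implementation: `ToyChatClient`, `RequestSpec`, `Response`, and the echoing "model" are hypothetical names invented for illustration.

```java
import java.util.ArrayList;
import java.util.List;

/** Toy model of a fluent request chain; NOT Spring AI's real classes. */
final class ToyChatClient {

    /** Each request gets its own spec, so the client itself holds no per-request state. */
    RequestSpec prompt() {
        return new RequestSpec();
    }

    static final class RequestSpec {
        private String system = "";
        private String user = "";
        private final List<String> advisors = new ArrayList<>();

        // Every intermediate step records its input and returns the same spec for chaining.
        RequestSpec system(String text) { this.system = text; return this; }
        RequestSpec user(String text) { this.user = text; return this; }
        RequestSpec advisor(String name) { this.advisors.add(name); return this; }

        /** Terminal step: nothing is "sent" until call() is invoked. */
        Response call() {
            // A real client would invoke the model here; this stub just echoes the request.
            return new Response("[system=" + system + "][advisors=" + advisors + "] " + user);
        }
    }

    record Response(String text) {
        String content() { return text; }
    }

    public static void main(String[] args) {
        String out = new ToyChatClient().prompt()
                .system("Be brief.")
                .user("What is Spring AI?")
                .advisor("memory")
                .call()
                .content();
        System.out.println(out);
    }
}
```

The point of the design is that the intermediate steps only describe the request; the work happens in the terminal `call()`, which is why the same client object can be reused across requests.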

3. Complete Code Implementation

3.1 Project Dependencies

<parent>
    <groupId>org.springframework.boot</groupId>
    <artifactId>spring-boot-starter-parent</artifactId>
    <version>3.3.0</version>
</parent>

<dependencyManagement>
    <dependencies>
        <dependency>
            <groupId>org.springframework.ai</groupId>
            <artifactId>spring-ai-bom</artifactId>
            <version>1.0.0</version>
            <type>pom</type>
            <scope>import</scope>
        </dependency>
    </dependencies>
</dependencyManagement>

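<dependencies>
    <!-- Spring AI OpenAI integration (1.0.0 starter naming) -->
    <dependency>
        <groupId>org.springframework.ai</groupId>
        <artifactId>spring-ai-starter-model-openai</artifactId>
    </dependency>

    <!-- pgvector vector store (1.0.0 starter naming) -->
    <dependency>
        <groupId>org.springframework.ai</groupId>
        <artifactId>spring-ai-starter-vector-store-pgvector</artifactId>
    </dependency>

    <!-- QuestionAnswerAdvisor, used in section 3.6 -->
    <dependency>
        <groupId>org.springframework.ai</groupId>
        <artifactId>spring-ai-advisors-vector-store</artifactId>
    </dependency>

    <!-- PDF document reader -->
    <dependency>
        <groupId>org.springframework.ai</groupId>
        <artifactId>spring-ai-pdf-document-reader</artifactId>
    </dependency>

    <!-- Tika document reader (TikaDocumentReader in section 3.6) -->
    <dependency>
        <groupId>org.springframework.ai</groupId>
        <artifactId>spring-ai-tika-document-reader</artifactId>
    </dependency>

    <!-- Persistent chat memory (JDBC) -->
    <dependency>
        <groupId>org.springframework.ai</groupId>
        <artifactId>spring-ai-starter-model-chat-memory-repository-jdbc</artifactId>
    </dependency>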

    <dependency>
        <groupId>org.springframework.boot</groupId>
        <artifactId>spring-boot-starter-web</artifactId>
    </dependency>

    <dependency>
        <groupId>org.springframework.boot</groupId>
        <artifactId>spring-boot-starter-data-jpa</artifactId>
    </dependency>

    <dependency>
        <groupId>org.postgresql</groupId>
        <artifactId>postgresql</artifactId>
    </dependency>
</dependencies>

3.2 application.yml Configuration

spring:
  ai:
    openai:
      api-key: ${OPENAI_API_KEY}
      base-url: https://api.openai.com  # change this when going through a proxy or compatible gateway
      chat:
        options:
          model: gpt-4o
          temperature: 0.7
          max-tokens: 2048
      embedding:
        options:
          model: text-embedding-3-small

    vectorstore:
      pgvector:
        initialize-schema: true
        schema-name: public
        table-name: vector_store
        dimensions: 1536  # dimension of text-embedding-3-small

  datasource:
    url: jdbc:postgresql://localhost:5432/ragdb
    username: ${DB_USER}
    password: ${DB_PASS}
    driver-class-name: org.postgresql.Driver

  jpa:
    hibernate:
      ddl-auto: update
    show-sql: false

server:
  port: 8080

3.3 Basic ChatClient Usage

import java.util.List;

import lombok.Data;
import org.springframework.ai.chat.client.ChatClient;
import org.springframework.http.MediaType;
import org.springframework.web.bind.annotation.*;
import reactor.core.publisher.Flux;

@RestController
@RequestMapping("/api/chat")
public class ChatController {

    private final ChatClient chatClient;

    // Spring AI 1.0 recommends injecting ChatClient.Builder and building once
    public ChatController(ChatClient.Builder chatClientBuilder) {
        this.chatClient = chatClientBuilder
                .defaultSystem("You are a professional Java technical consultant, skilled in Spring Boot and AI application development.")
                .build();
    }

    /**
     * Simple chat
     */
    @PostMapping("/simple")
    public String simpleChat(@RequestBody String userMessage) {
        return chatClient.prompt()
                .user(userMessage)
                .call()
                .content();
    }

    /**
     * Chat with a parameterized system prompt
     */
    @PostMapping("/professional")
    public String professionalChat(
            @RequestParam String role,
            @RequestBody String userMessage) {

        return chatClient.prompt()
                // Set the template text and its {role} parameter in one place
                .system(s -> s
                        .text("You are a professional {role}. Answer in professional but accessible language.")
                        .param("role", role))
                .user(userMessage)
                .call()
                .content();
    }

    /**
     * Streaming response (SSE)
     */
    @GetMapping(value = "/stream", produces = MediaType.TEXT_EVENT_STREAM_VALUE)
    public Flux<String> streamChat(@RequestParam String message) {
        return chatClient.prompt()
                .user(message)
                .stream()
                .content();
    }

    /**
     * Structured output (deserialized into a Java object automatically)
     */
    @PostMapping("/structured")
    public CodeReviewResult structuredOutput(@RequestBody String code) {
        return chatClient.prompt()
                .user("Review the following Java code and return the result as JSON:\n\n" + code)
                .call()
                .entity(CodeReviewResult.class);
    }

    @Data
    public static class CodeReviewResult {
        private int score; // code quality score, 0-100
        private List<String> issues; // list of issues
        private List<String> suggestions; // improvement suggestions
        private String summary; // overall assessment
    }
}

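3.4 ChatClient with Conversation Memory

import org.springframework.ai.chat.client.ChatClient;
import org.springframework.ai.chat.client.advisor.MessageChatMemoryAdvisor;
import org.springframework.ai.chat.memory.ChatMemory;
import org.springframework.ai.chat.memory.ChatMemoryRepository;
import org.springframework.ai.chat.memory.MessageWindowChatMemory;
import org.springframework.stereotype.Service;

@Service
public class ConversationService {

    private final ChatClient chatClient;
    private final ChatMemory chatMemory;

    // The injected ChatMemoryRepository is the JDBC-backed one from the starter
    public ConversationService(ChatClient.Builder builder,
                                ChatMemoryRepository memoryRepository) {
        // Sliding window: keep only the 20 most recent messages per conversation
        this.chatMemory = MessageWindowChatMemory.builder()
                .chatMemoryRepository(memoryRepository)
                .maxMessages(20)
                .build();
        this.chatClient = builder
                .defaultSystem("You are an assistant with memory and can recall the conversation history.")
                // Memory advisor: loads history before the call and stores new messages after it
                .defaultAdvisors(MessageChatMemoryAdvisor.builder(chatMemory).build())
                .build();
    }

    /**
     * Multi-turn chat (conversation context is maintained automatically)
     */
    public String chat(String sessionId, String userMessage) {
        return chatClient.prompt()
                .user(userMessage)
                // Select which conversation's history this request reads and writes
                .advisors(a -> a.param(ChatMemory.CONVERSATION_ID, sessionId))
                .call()
                .content();
    }

    /**
     * Clear a conversation's memory
     */
    public void clearConversation(String sessionId) {
        chatMemory.clear(sessionId);
    }
}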
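
What "keep the most recent 20 messages" means in practice can be sketched with a small sliding window in plain Java. This is a toy for illustration, assuming messages are plain strings held in memory; `ToyWindowMemory` is a hypothetical class and not how Spring AI implements its chat memory.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Toy per-conversation message window; a sketch, not Spring AI's ChatMemory. */
final class ToyWindowMemory {
    private final int maxMessages;
    private final Map<String, Deque<String>> store = new HashMap<>();

    ToyWindowMemory(int maxMessages) {
        if (maxMessages <= 0) {
            throw new IllegalArgumentException("maxMessages must be positive");
        }
        this.maxMessages = maxMessages;
    }

    /** Appends a message and evicts the oldest ones that fall outside the window. */
    synchronized void add(String conversationId, String message) {
        Deque<String> window = store.computeIfAbsent(conversationId, id -> new ArrayDeque<>());
        window.addLast(message);
        while (window.size() > maxMessages) {
            window.removeFirst();
        }
    }

    /** Returns a snapshot, oldest first; empty for an unknown conversation. */
    synchronized List<String> get(String conversationId) {
        Deque<String> window = store.get(conversationId);
        return (window == null) ? List.of() : new ArrayList<>(window);
    }

    /** Drops everything stored for one conversation. */
    synchronized void clear(String conversationId) {
        store.remove(conversationId);
    }
}
```

Each conversation ID gets its own window, which is why the session ID has to be passed on every request: without it, all users would share one history.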

3.5 Using EmbeddingModel

import java.util.List;

import org.springframework.ai.embedding.EmbeddingModel;
import org.springframework.stereotype.Service;

@Service
public class EmbeddingService {

    private final EmbeddingModel embeddingModel;

    public EmbeddingService(EmbeddingModel embeddingModel) {
        this.embeddingModel = embeddingModel;
    }

    /**
     * Embed a single text
     */
    public float[] embedText(String text) {
        // In Spring AI 1.0 embeddings are float[] already, so no conversion is needed
        return embeddingModel.embed(text);
    }

    /**
     * Embed a batch of texts (fewer API calls)
     */
    public List<float[]> embedBatch(List<String> texts) {
        return embeddingModel.embed(texts);
    }

    /**
     * Semantic similarity between two texts
     */
    public double similarity(String text1, String text2) {
        List<float[]> vecs = embedBatch(List.of(text1, text2));
        return cosineSimilarity(vecs.get(0), vecs.get(1));
    }

    private double cosineSimilarity(float[] a, float[] b) {
        double dot = 0, normA = 0, normB = 0;
        for (int i = 0; i < a.length; i++) {
            dot += a[i] * b[i];
            normA += a[i] * a[i];
            normB += b[i] * b[i];
        }
        // The small epsilon avoids division by zero for an all-zero vector
        return dot / (Math.sqrt(normA) * Math.sqrt(normB) + 1e-10);
    }
}
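
To see what the cosine similarity in the service above measures, here is a small standalone sketch with hand-made vectors instead of real embeddings. `CosineExample` is a hypothetical class for illustration; it returns 0 for a zero vector instead of using the epsilon trick, which is a design choice of this sketch rather than something taken from the service.

```java
/** Standalone cosine similarity on hand-made vectors; illustration only. */
final class CosineExample {

    static double cosine(float[] a, float[] b) {
        if (a.length != b.length) {
            throw new IllegalArgumentException("dimension mismatch");
        }
        double dot = 0, normA = 0, normB = 0;
        for (int i = 0; i < a.length; i++) {
            dot += (double) a[i] * b[i];
            normA += (double) a[i] * a[i];
            normB += (double) b[i] * b[i];
        }
        double denom = Math.sqrt(normA) * Math.sqrt(normB);
        // Similarity with a zero vector is undefined; this sketch reports 0
        return denom == 0 ? 0.0 : dot / denom;
    }

    public static void main(String[] args) {
        // Same direction: close to 1
        System.out.println(cosine(new float[] {1, 2, 3}, new float[] {2, 4, 6}));
        // Orthogonal: 0
        System.out.println(cosine(new float[] {1, 0}, new float[] {0, 1}));
        // Opposite direction: close to -1
        System.out.println(cosine(new float[] {1, 2}, new float[] {-1, -2}));
    }
}
```

Only direction matters, not length: `{1, 2, 3}` and `{2, 4, 6}` score as near-identical, which is why cosine similarity is a common choice for comparing embeddings.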

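3.6 VectorStore and RAG Integration

import java.time.LocalDateTime;
import java.util.List;

import lombok.extern.slf4j.Slf4j;
import org.springframework.ai.chat.client.ChatClient;
import org.springframework.ai.chat.client.advisor.vectorstore.QuestionAnswerAdvisor;
import org.springframework.ai.chat.prompt.PromptTemplate;
import org.springframework.ai.document.Document;
import org.springframework.ai.reader.tika.TikaDocumentReader;
import org.springframework.ai.transformer.splitter.TokenTextSplitter;
import org.springframework.ai.vectorstore.SearchRequest;
import org.springframework.ai.vectorstore.VectorStore;
import org.springframework.core.io.Resource;
import org.springframework.stereotype.Service;

@Slf4j
@Service
public class RagService {

    private final VectorStore vectorStore;
    private final ChatClient chatClient;
    private final TokenTextSplitter textSplitter;

    public RagService(VectorStore vectorStore,
                       ChatClient.Builder chatClientBuilder) {
        this.vectorStore = vectorStore;
        // Spring AI's built-in token-based splitter. Arguments: chunkSize (tokens),
        // minChunkSizeChars, minChunkLengthToEmbed, maxNumChunks, keepSeparator
        this.textSplitter = new TokenTextSplitter(800, 100, 5, 10000, true);
        this.chatClient = chatClientBuilder
                .defaultAdvisors(
                        // QuestionAnswerAdvisor runs the RAG flow automatically.
                        // The custom template needs both {query} and {question_answer_context}.
                        QuestionAnswerAdvisor.builder(vectorStore)
                                .searchRequest(SearchRequest.builder()
                                        .topK(5)
                                        .similarityThreshold(0.6)
                                        .build())
                                .promptTemplate(new PromptTemplate("""
                                        {query}

                                        Answer the question using the reference material below.
                                        If it contains no relevant information, say clearly that
                                        you cannot answer. Do not make anything up.

                                        Reference material:
                                        {question_answer_context}
                                        """))
                                .build()
                )
                .build();
    }

    /**
     * Ingest a document
     */
    public void ingestDocument(Resource resource) {
        // 1. Read the document
        TikaDocumentReader reader = new TikaDocumentReader(resource);
        List<Document> docs = reader.get();

        // 2. Split into chunks
        List<Document> splitDocs = textSplitter.apply(docs);

        // 3. Attach metadata
        splitDocs.forEach(doc -> {
            doc.getMetadata().put("source", resource.getFilename());
            doc.getMetadata().put("indexed_at", LocalDateTime.now().toString());
        });

        // 4. Embed and store (Spring AI calls the EmbeddingModel for you)
        vectorStore.add(splitDocs);
        log.info("Document ingested: {}, {} chunks in total", resource.getFilename(), splitDocs.size());
    }

    /**
     * RAG Q&A (QuestionAnswerAdvisor handles retrieval and context injection)
     */
    public String askWithRag(String question) {
        return chatClient.prompt()
                .user(question)
                .call()
                .content();
    }

    /**
     * RAG Q&A with a metadata filter
     */
    public String askWithFilter(String question, String sourceFilter) {
        // Note: sourceFilter is concatenated into the expression; validate it if it comes from users
        return chatClient.prompt()
                .user(question)
                .advisors(a -> a.param(
                        QuestionAnswerAdvisor.FILTER_EXPRESSION,
                        "source == '" + sourceFilter + "'"))
                .call()
                .content();
    }

    /**
     * Delete a document by source
     */
    public void deleteDocumentBySource(String source) {
        // VectorStore.delete(String) accepts a filter expression in text form
        vectorStore.delete("source == '" + source + "'");
    }
}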
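
The filter expressions above are built by string concatenation, so a source name containing a quote could change the meaning of the expression. One possible guard, sketched in plain Java, is to accept only values that match a conservative allowlist before building the expression. `FilterValues`, `sourceEquals`, and the allowed character set are assumptions made for this example; adjust the pattern to whatever your file names actually look like.

```java
import java.util.regex.Pattern;

/** Hypothetical guard for values concatenated into a filter expression. */
final class FilterValues {

    // Assumption: source names consist of letters, digits, dot, underscore, dash, and space
    private static final Pattern SAFE = Pattern.compile("[\\p{L}\\p{N}._\\- ]{1,255}");

    /** Builds "source == '<value>'" only if the value cannot break out of the quotes. */
    static String sourceEquals(String source) {
        if (source == null || !SAFE.matcher(source).matches()) {
            throw new IllegalArgumentException("unsafe filter value");
        }
        return "source == '" + source + "'";
    }

    public static void main(String[] args) {
        System.out.println(sourceEquals("user-guide.pdf"));
        try {
            sourceEquals("x' || source != '");
        } catch (IllegalArgumentException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```

Rejecting unexpected input is simpler to reason about than trying to escape it, since the escaping rules of the filter grammar would have to be matched exactly.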

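3.7 Function Calling (Tool Calling)

import java.util.List;
import java.util.function.Function;

import org.springframework.ai.chat.client.ChatClient;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;
import org.springframework.context.annotation.Description;
import org.springframework.stereotype.Service;

// WeatherRequest/WeatherResponse, OrderQueryRequest/OrderQueryResponse, WeatherApiService,
// and OrderRepository are application classes that are not shown here.
@Configuration
public class ToolConfig {

    /**
     * Register the weather lookup tool
     */
    @Bean
    @Description("Look up real-time weather for a given city")
    public Function<WeatherRequest, WeatherResponse> weatherFunction(
            WeatherApiService weatherService) {
        return request -> weatherService.getWeather(request.getCity());
    }

    /**
     * Register the database query tool
     */
    @Bean
    @Description("Look up orders in the system, by order ID or by user ID")
    public Function<OrderQueryRequest, OrderQueryResponse> orderQueryFunction(
            OrderRepository orderRepository) {
        return request -> {
            if (request.getOrderId() != null) {
                return orderRepository.findById(request.getOrderId())
                        .map(o -> new OrderQueryResponse(List.of(o)))
                        .orElse(new OrderQueryResponse(List.of()));
            }
            return new OrderQueryResponse(
                    orderRepository.findByUserId(request.getUserId()));
        };
    }
}

@Service
public class AgentChatService {

    private final ChatClient chatClient;

    public AgentChatService(ChatClient.Builder builder) {
        this.chatClient = builder
                .defaultSystem("You are a customer service assistant that can look up weather and order information to answer users.")
                .build();
    }

    public String agentChat(String userMessage) {
        return chatClient.prompt()
                .user(userMessage)
                // Reference the function beans by bean name; tools(...) expects tool objects instead
                .toolNames("weatherFunction", "orderQueryFunction")
                .call()
                .content();
    }
}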

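4. Results and Optimization

Development effort, Spring AI vs. a hand-written wrapper:

Feature	Hand-written wrapper (lines of code)	Spring AI (lines of code)	Savings
Basic chat endpoint	~120	~15	87%
Streaming response	~180	~8	96%
Conversation memory	~250	~20	92%
Full RAG pipeline	~500	~60	88%
Function Calling	~200	~30	85%

Key improvements in Spring AI 1.0 over 0.8:

The 1.0 release cleaned up the API substantially: ChatClient was rebuilt around the fluent style, the VectorStore interface stabilized, and the Advisor mechanism replaced the earlier, more complicated Chain design. Projects still on 0.8.x should evaluate a migration, since the 0.8 API is no longer maintained.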

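5. Pitfalls I Hit

Pitfall 1: ChatClient.Builder is not thread-safe, but ChatClient is

At first I injected ChatClient.Builder into a Service and called builder.build() inside each request. Under concurrent load this caused problems, because the Builder's internal state is mutable. The right approach is to call builder.build() in the constructor or in a @PostConstruct method and keep the ChatClient as a final field; ChatClient itself is thread-safe.

Pitfall 2: The QuestionAnswerAdvisor context variable name must match exactly

When QuestionAnswerAdvisor injects retrieval results into the prompt, the placeholder it fills is {question_answer_context}, and that name is hard-coded. In my custom prompt template I wrote {context}, so the retrieved content was never injected and the LLM's answers contained nothing from the knowledge base. I spent a long time checking the code before noticing: the name has to match exactly, including case and underscores.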
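
Why a misnamed placeholder fails silently can be shown with a toy template renderer in plain Java. This is a sketch of the general mechanism only: `ToyTemplate` is hypothetical, and Spring AI's real renderer may behave differently on an unknown variable (for example by raising an error), so treat the exact failure mode as an assumption.

```java
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Toy {name} template renderer; illustration only, not Spring AI's renderer. */
final class ToyTemplate {

    private static final Pattern PLACEHOLDER = Pattern.compile("\\{(\\w+)}");

    /** Replaces {name} with values from vars; placeholders with no value are left untouched. */
    static String render(String template, Map<String, String> vars) {
        Matcher m = PLACEHOLDER.matcher(template);
        StringBuilder sb = new StringBuilder();
        while (m.find()) {
            String value = vars.get(m.group(1));
            String replacement = (value != null) ? value : m.group();
            m.appendReplacement(sb, Matcher.quoteReplacement(replacement));
        }
        m.appendTail(sb);
        return sb.toString();
    }

    public static void main(String[] args) {
        // The advisor only ever supplies this one variable name
        Map<String, String> vars = Map.of("question_answer_context", "Spring AI 1.0 docs");

        // Correct name: the retrieved text is injected
        System.out.println(render("Reference: {question_answer_context}", vars));
        // Wrong name: nothing matches, so the retrieved text never reaches the model
        System.out.println(render("Reference: {context}", vars));
    }
}
```

The renderer has no way to know that `{context}` was meant to be `{question_answer_context}`; from its point of view there is simply no value for that name.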

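Pitfall 3: Turn off the VectorStore's initialize-schema in production

spring.ai.vectorstore.pgvector.initialize-schema: true creates the table structure automatically at startup. That is convenient in development, but in production a slow database connection can make startup time out. Worse, if the table already exists with a different structure, startup may fail outright with an error. In production, set it to false and manage the schema with Flyway or Liquibase.

Pitfall 4: Incomplete error handling for streaming responses

The streaming endpoint returns Flux<String>, but I forgot to handle a broken SSE stream on the frontend. If the LLM API times out midway, the Flux simply terminates, and the frontend is left with an incomplete response and no error message. The fix is to add .onErrorReturn("Service temporarily unavailable, please try again later") to the Flux.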

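6. Summary

In my view, Spring AI 1.0 is currently the most mature framework in the Java ecosystem for integrating LLM capabilities. Its three core abstractions, ChatClient, EmbeddingModel, and VectorStore, are clearly designed and fit into Spring Boot naturally. For a Java engineer the learning cost is very low: one afternoon is enough to stand up a complete RAG question-answering system.

The most important takeaway: Spring AI's unified abstraction greatly reduces the cost of switching AI providers. With domestic regulation tightening and compliant domestic models increasingly required, that matters a lot: OpenAI today, Tongyi Qianwen tomorrow, and the business code stays the same.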