The Spring AI Ecosystem Panorama: Every Companion Project You Need to Know in 2026
Opening Story: Spring AI Is Only the Tip of the Iceberg
At the end of 2025, Zhao Feng, an architect on a team at ByteDance, shared an observation in an internal talk that left much of the room speechless:
"In our AI project, Spring AI accounts for less than 20% of the code. The other 80% is built around Spring AI: security and authentication, reactive wrappers, batch processing, monitoring instrumentation, database adapters…"
A colleague asked: "Doesn't that just mean Spring AI isn't powerful enough?"
Zhao Feng shook his head: "Quite the opposite. Precisely because Spring AI is so focused — it does exactly one thing, AI integration — it fits seamlessly into the rest of the Spring ecosystem. We protect AI endpoints with Spring Security, run large-scale document processing with Spring Batch, stream responses with Spring WebFlux… Every piece is production-grade, and nothing has to be built from scratch."
"If you only know Spring AI itself, you're using one hammer for every job."
"But if you know the whole Spring AI ecosystem, you own a complete toolbox."
I'm Laozhang. Today, let's open that toolbox and go through it end to end.
1. Spring AI Core Architecture: Getting the Fundamentals Right
1.1 Spring AI Core Modules at a Glance
1.2 ChatClient in Depth: Best Usage for 2026
ChatClient is Spring AI's core interaction interface. Its API has evolved over several iterations; the following reflects the recommended usage as of 2026:
package com.laozhang.springai.core;
import org.springframework.ai.chat.client.ChatClient;
import org.springframework.ai.chat.client.advisor.MessageChatMemoryAdvisor;
import org.springframework.ai.chat.client.advisor.QuestionAnswerAdvisor;
import org.springframework.ai.chat.memory.ChatMemory;
import org.springframework.ai.chat.memory.MessageWindowChatMemory;
import org.springframework.ai.chat.model.ChatResponse;
import org.springframework.ai.openai.OpenAiChatOptions;
import org.springframework.ai.vectorstore.SearchRequest;
import org.springframework.ai.vectorstore.VectorStore;
import org.springframework.stereotype.Service;
import reactor.core.publisher.Flux;
import lombok.extern.slf4j.Slf4j;
import java.util.*;
/**
 * ChatClient best practices.
 * Covers the mainstream usage patterns of 2026.
 */
@Slf4j
@Service
public class ChatClientBestPractices {
    private final ChatClient chatClient;
    private final ChatClient ragChatClient;
    public ChatClientBestPractices(ChatClient.Builder builder, VectorStore vectorStore) {
        // Basic ChatClient
        this.chatClient = builder
            .defaultSystem("You are a professional Java development assistant. Be accurate, concise, and give examples.")
            .defaultOptions(OpenAiChatOptions.builder()
                .model("gpt-4o")
                .temperature(0.7)
                .maxTokens(2000)
                .build())
            .build();
        // RAG-enhanced ChatClient (via the Advisor mechanism)
        this.ragChatClient = builder
            .defaultSystem("You are an enterprise knowledge-base assistant. Answer from the retrieved context, and say so when nothing relevant is found.")
            .defaultAdvisors(
                // RAG retrieval advisor: automatically retrieves relevant documents into the context
                QuestionAnswerAdvisor.builder(vectorStore)
                    .searchRequest(SearchRequest.builder().topK(5).similarityThreshold(0.7).build())
                    .build(),
                // Chat-memory advisor: automatically maintains multi-turn conversation history
                MessageChatMemoryAdvisor.builder(MessageWindowChatMemory.builder().build()).build()
            )
            .build();
    }
    /**
     * 1. Basic text chat
     */
    public String basicChat(String question) {
        return chatClient.prompt()
            .user(question)
            .call()
            .content();
    }
    /**
     * 2. Structured output (strongly typed return value)
     */
    public JavaCodeReview reviewCode(String code) {
        return chatClient.prompt()
            .user(u -> u.text("Please review the following Java code and produce a full review report:\n```java\n{code}\n```")
                .param("code", code))
            .call()
            .entity(JavaCodeReview.class); // automatic JSON deserialization
    }
    /**
     * 3. Streaming output (WebFlux)
     */
    public Flux<String> streamChat(String question) {
        return chatClient.prompt()
            .user(question)
            .stream()
            .content(); // returns Flux<String>; tokens are emitted incrementally
    }
    /**
     * 4. RAG Q&A (with multi-turn memory)
     */
    public String ragChat(String question, String sessionId) {
        return ragChatClient.prompt()
            .user(question)
            .advisors(a -> a.param(ChatMemory.CONVERSATION_ID, sessionId))
            .call()
            .content();
    }
    /**
     * 5. Chat with tool calling
     */
    public String chatWithTools(String question, Object... tools) {
        return chatClient.prompt()
            .user(question)
            .tools(tools)
            .call()
            .content();
    }
    /**
     * 6. Full response (including token usage, finish reason, and other metadata)
     */
    public ChatResponseMetadata chatWithMetadata(String question) {
        ChatResponse response = chatClient.prompt()
            .user(question)
            .call()
            .chatResponse();
        String content = response.getResult().getOutput().getText();
        var usage = response.getMetadata().getUsage();
        return new ChatResponseMetadata(
            content,
            usage.getPromptTokens(),
            usage.getCompletionTokens(),
            response.getResult().getMetadata().getFinishReason()
        );
    }
    public record JavaCodeReview(
        String summary,
        List<String> issues,
        List<String> suggestions,
        int qualityScore,
        boolean productionReady
    ) {}
    public record ChatResponseMetadata(String content, Integer promptTokens,
                                       Integer completionTokens, String finishReason) {}
}
1.3 EmbeddingModel: Vectorization Best Practices
import org.springframework.ai.embedding.EmbeddingModel;
import org.springframework.ai.embedding.EmbeddingResponse;
import org.springframework.stereotype.Service;
import java.util.List;
import java.util.stream.Collectors;

@Service
public class EmbeddingBestPractices {
    private final EmbeddingModel embeddingModel;

    public EmbeddingBestPractices(EmbeddingModel embeddingModel) {
        this.embeddingModel = embeddingModel;
    }
    /**
     * Batch embedding (recommended: embedding N texts in a single request
     * replaces N separate API calls, cutting round trips dramatically)
     */
    public List<float[]> batchEmbed(List<String> texts) {
        // Spring AI's EmbeddingModel handles batched requests automatically
        EmbeddingResponse response = embeddingModel.embedForResponse(texts);
        return response.getResults().stream()
            .map(r -> r.getOutput())
            .collect(Collectors.toList());
    }
    /**
     * Text similarity
     */
    public double cosineSimilarity(String text1, String text2) {
        float[] emb1 = embeddingModel.embed(text1);
        float[] emb2 = embeddingModel.embed(text2);
        return calculateCosine(emb1, emb2);
    }
    private double calculateCosine(float[] a, float[] b) {
        double dot = 0, normA = 0, normB = 0;
        for (int i = 0; i < a.length; i++) {
            dot += a[i] * b[i];
            normA += a[i] * a[i];
            normB += b[i] * b[i];
        }
        return dot / (Math.sqrt(normA) * Math.sqrt(normB));
    }
}
2. Official Spring AI Extensions: Providers in Depth
2.1 The Provider Ecosystem at a Glance
2.2 Configuring Multiple Providers with Dynamic Switching
package com.laozhang.springai.provider;
import org.springframework.ai.chat.client.ChatClient;
import org.springframework.ai.openai.OpenAiChatModel;
import org.springframework.ai.anthropic.AnthropicChatModel;
import org.springframework.ai.ollama.OllamaChatModel;
import org.springframework.beans.factory.annotation.Qualifier;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;
import org.springframework.stereotype.Service;
import lombok.extern.slf4j.Slf4j;
/**
 * Multi-provider configuration: supports switching providers at runtime
 */
@Configuration
public class MultiProviderConfig {
    @Bean("openaiChatClient")
    public ChatClient openaiChatClient(OpenAiChatModel openAiChatModel) {
        return ChatClient.builder(openAiChatModel)
            .defaultSystem("You are a professional assistant.")
            .build();
    }
    @Bean("anthropicChatClient")
    public ChatClient anthropicChatClient(AnthropicChatModel anthropicChatModel) {
        return ChatClient.builder(anthropicChatModel)
            .defaultSystem("You are a professional assistant.")
            .build();
    }
    @Bean("localChatClient")
    public ChatClient localChatClient(OllamaChatModel ollamaChatModel) {
        return ChatClient.builder(ollamaChatModel)
            .defaultSystem("You are a locally deployed assistant.")
            .build();
    }
}
/**
 * Smart routing: pick the most suitable model for each task type
 */
@Slf4j
@Service
public class SmartModelRouter {
    private final ChatClient openaiClient;
    private final ChatClient anthropicClient;
    private final ChatClient localClient;
    public SmartModelRouter(
            @Qualifier("openaiChatClient") ChatClient openaiClient,
            @Qualifier("anthropicChatClient") ChatClient anthropicClient,
            @Qualifier("localChatClient") ChatClient localClient) {
        this.openaiClient = openaiClient;
        this.anthropicClient = anthropicClient;
        this.localClient = localClient;
    }
    /**
     * Route to the best model based on task type and requirements
     */
    public String route(String task, TaskRequirement requirement) {
        ChatClient selectedClient = selectClient(requirement);
        log.info("Task routed: {} -> {}", requirement.type(), selectedClient.getClass().getSimpleName());
        return selectedClient.prompt().user(task).call().content();
    }
    private ChatClient selectClient(TaskRequirement req) {
        // Cost-sensitive -> local model
        if (req.costSensitive() && !req.needsHighAccuracy()) {
            return localClient;
        }
        // Long-document analysis -> Claude (larger context window)
        if (req.type() == TaskType.LONG_DOC_ANALYSIS) {
            return anthropicClient;
        }
        // Code generation -> GPT-4o (strong at code)
        if (req.type() == TaskType.CODE_GENERATION) {
            return openaiClient;
        }
        // Default: OpenAI
        return openaiClient;
    }
    public enum TaskType { CODE_GENERATION, LONG_DOC_ANALYSIS, CHAT, RAG, TRANSLATION }
    public record TaskRequirement(TaskType type, boolean costSensitive, boolean needsHighAccuracy) {}
}
2.3 Multi-Provider application.yml Configuration
spring:
  ai:
    # OpenAI
    openai:
      api-key: ${OPENAI_API_KEY}
      chat:
        options:
          model: gpt-4o
          temperature: 0.7
          max-tokens: 4096
      embedding:
        options:
          model: text-embedding-3-small
          dimensions: 1536
    # Anthropic
    anthropic:
      api-key: ${ANTHROPIC_API_KEY}
      chat:
        options:
          model: claude-3-5-sonnet-20241022
          max-tokens: 8192
    # Alibaba Cloud DashScope (Tongyi Qianwen)
    dashscope:
      api-key: ${DASHSCOPE_API_KEY}
      chat:
        options:
          model: qwen-max
    # Ollama local models
    ollama:
      base-url: http://localhost:11434
      chat:
        options:
          model: qwen2.5:14b
          temperature: 0.3
      embedding:
        options:
          model: nomic-embed-text
    # Vector store: PgVector
    vectorstore:
      pgvector:
        jdbc-url: jdbc:postgresql://localhost:5432/ai_db
        dimensions: 1536
        distance-type: COSINE_DISTANCE
        initialize-schema: true
3. Deep Spring Boot Integration: Auto-configuration Explained
3.1 How Auto-configuration Works
3.2 Writing a Custom Auto-configuration
package com.laozhang.springai.autoconfigure;
import org.springframework.ai.chat.client.ChatClient;
import org.springframework.ai.chat.client.advisor.MessageChatMemoryAdvisor;
import org.springframework.ai.chat.client.advisor.QuestionAnswerAdvisor;
import org.springframework.ai.chat.memory.MessageWindowChatMemory;
import org.springframework.ai.vectorstore.SearchRequest;
import org.springframework.ai.vectorstore.VectorStore;
import org.springframework.boot.autoconfigure.AutoConfiguration;
import org.springframework.boot.autoconfigure.condition.ConditionalOnMissingBean;
import org.springframework.boot.autoconfigure.condition.ConditionalOnProperty;
import org.springframework.boot.context.properties.ConfigurationProperties;
import org.springframework.boot.context.properties.EnableConfigurationProperties;
import org.springframework.context.annotation.Bean;
import lombok.Data;
/**
 * Custom Spring AI auto-configuration:
 * out-of-the-box defaults for an enterprise RAG setup
 */
// Note: the OpenAI auto-configuration class name and package differ across
// Spring AI versions; adjust the `after` reference to match your version.
@AutoConfiguration(after = {OpenAiAutoConfiguration.class})
@ConditionalOnProperty(prefix = "laozhang.ai", name = "enabled", havingValue = "true", matchIfMissing = true)
@EnableConfigurationProperties(LaozhangAiProperties.class)
public class LaozhangAiAutoConfiguration {
    /**
     * Default RAG ChatClient (used only when the application does not define its own)
     */
    @Bean
    @ConditionalOnMissingBean(name = "ragChatClient")
    public ChatClient ragChatClient(ChatClient.Builder builder,
                                    VectorStore vectorStore,
                                    LaozhangAiProperties properties) {
        return builder
            .defaultSystem(properties.getSystemPrompt())
            .defaultAdvisors(
                QuestionAnswerAdvisor.builder(vectorStore)
                    .searchRequest(SearchRequest.builder()
                        .topK(properties.getTopK())
                        .similarityThreshold(properties.getSimilarityThreshold())
                        .build())
                    .build(),
                MessageChatMemoryAdvisor.builder(
                        MessageWindowChatMemory.builder()
                            .maxMessages(properties.getMemoryWindowSize())
                            .build())
                    .build()
            )
            .build();
    }
}
/**
 * Configuration properties
 */
@ConfigurationProperties(prefix = "laozhang.ai")
@Data
public class LaozhangAiProperties {
    private boolean enabled = true;
    private String systemPrompt = "You are a professional enterprise knowledge-base assistant.";
    private int topK = 5;
    private double similarityThreshold = 0.7;
    private int memoryWindowSize = 10;
}
4. Integrating with Spring Security: Protecting AI Endpoints
4.1 The Threat Model for AI Endpoints
4.2 A Complete Security Configuration
package com.laozhang.security;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;
import org.springframework.security.config.annotation.web.builders.HttpSecurity;
import org.springframework.security.config.annotation.web.configuration.EnableWebSecurity;
import org.springframework.security.config.http.SessionCreationPolicy;
import org.springframework.security.web.SecurityFilterChain;
import org.springframework.web.servlet.HandlerInterceptor;
import org.springframework.stereotype.Component;
import lombok.extern.slf4j.Slf4j;
import jakarta.servlet.http.*;
import java.util.*;
import java.util.concurrent.*;
import java.util.regex.Pattern;
/**
 * Spring Security configuration for AI endpoints
 */
@Configuration
@EnableWebSecurity
public class AiSecurityConfig {
    @Bean
    public SecurityFilterChain filterChain(HttpSecurity http) throws Exception {
        http
            .authorizeHttpRequests(auth -> auth
                // AI endpoints require authentication
                .requestMatchers("/api/ai/**").authenticated()
                // Admin endpoints require the ADMIN role
                .requestMatchers("/api/admin/**").hasRole("ADMIN")
                // Public endpoints
                .requestMatchers("/api/public/**", "/actuator/health").permitAll()
                .anyRequest().authenticated()
            )
            .oauth2ResourceServer(oauth2 -> oauth2.jwt(jwt -> {}))
            .csrf(csrf -> csrf.disable()) // CSRF is typically disabled for stateless REST APIs
            .sessionManagement(session -> session
                .sessionCreationPolicy(SessionCreationPolicy.STATELESS)
            );
        return http.build();
    }
}
/**
 * Rate limiting and security interception for AI requests
 */
@Slf4j
@Component
public class AiRateLimitInterceptor implements HandlerInterceptor {
    // Per-user quotas (hourly window)
    private static final Map<String, UserQuota> userQuotas = new ConcurrentHashMap<>();
    private static final int MAX_REQUESTS_PER_HOUR = 100;
    private static final int MAX_TOKENS_PER_HOUR = 50_000; // token budget (enforcement hook not shown here)
    // Dangerous prompt patterns
    private static final List<Pattern> DANGEROUS_PATTERNS = List.of(
        Pattern.compile("(?i)ignore (previous|all) (instructions?|prompts?)"),
        Pattern.compile("(?i)you are now (a|an|the)"),
        Pattern.compile("(?i)jailbreak|DAN|do anything now"),
        Pattern.compile("(?i)reveal (your|the) (system|instructions|prompt)"),
        Pattern.compile("(?i)(root|admin|sudo) (access|权限|mode)")
    );
    @Override
    public boolean preHandle(HttpServletRequest request, HttpServletResponse response, Object handler) throws Exception {
        String userId = extractUserId(request);
        // 1. Rate-limit check
        if (!checkRateLimit(userId)) {
            response.setStatus(429); // Too Many Requests
            response.getWriter().write("{\"error\":\"Rate limit exceeded, please retry later\"}");
            log.warn("Rate limit triggered: user={}", userId);
            return false;
        }
        // 2. Prompt-injection detection on the request body.
        // Caution: reading the raw InputStream here consumes the body. In a real
        // application, cache it (e.g. with a ContentCachingRequestWrapper installed
        // by a filter) so the controller can still read it afterwards.
        if ("POST".equals(request.getMethod())) {
            String body = new String(request.getInputStream().readAllBytes());
            if (detectPromptInjection(body)) {
                response.setStatus(400);
                response.getWriter().write("{\"error\":\"Suspicious input detected, request rejected\"}");
                log.warn("Prompt injection detected: user={}, first 50 chars={}", userId,
                    body.substring(0, Math.min(50, body.length())));
                return false;
            }
        }
        return true;
    }
    private boolean checkRateLimit(String userId) {
        UserQuota quota = userQuotas.computeIfAbsent(userId, k -> new UserQuota());
        return quota.checkAndIncrement();
    }
    private boolean detectPromptInjection(String content) {
        for (Pattern pattern : DANGEROUS_PATTERNS) {
            if (pattern.matcher(content).find()) {
                return true;
            }
        }
        return false;
    }
    private String extractUserId(HttpServletRequest request) {
        // Extract the user id from the JWT
        String auth = request.getHeader("Authorization");
        if (auth != null && auth.startsWith("Bearer ")) {
            // parse the JWT claims in a real implementation
            return "user_" + auth.hashCode();
        }
        return request.getRemoteAddr();
    }
    static class UserQuota {
        private int requestCount = 0;
        private long windowStart = System.currentTimeMillis();
        synchronized boolean checkAndIncrement() {
            long now = System.currentTimeMillis();
            if (now - windowStart > 3600_000) { // reset the window
                requestCount = 0;
                windowStart = now;
            }
            if (requestCount >= MAX_REQUESTS_PER_HOUR) return false;
            requestCount++;
            return true;
        }
    }
}
/**
 * Data masking: filter sensitive information out of AI responses
 */
@Component
public class AiResponseSanitizer {
    // Pattern -> replacement; a LinkedHashMap preserves the masking order
    private static final Map<Pattern, String> SENSITIVE_PATTERNS = new LinkedHashMap<>();
    static {
        SENSITIVE_PATTERNS.put(Pattern.compile("\\d{4}[-\\s]?\\d{4}[-\\s]?\\d{4}[-\\s]?\\d{4}"), "[credit card hidden]"); // card numbers
        SENSITIVE_PATTERNS.put(Pattern.compile("[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\\.[A-Za-z]{2,}"), "[email hidden]"); // email addresses
        SENSITIVE_PATTERNS.put(Pattern.compile("1[3-9]\\d{9}"), "[phone number hidden]"); // Chinese mobile numbers
        SENSITIVE_PATTERNS.put(Pattern.compile("\\b(password|secret|key|token)\\s*[:=]\\s*\\S+"), "[credential hidden]"); // credentials
    }
    public String sanitize(String response) {
        String sanitized = response;
        for (Map.Entry<Pattern, String> entry : SENSITIVE_PATTERNS.entrySet()) {
            sanitized = entry.getKey().matcher(sanitized).replaceAll(entry.getValue());
        }
        return sanitized;
    }
}
5. Integrating with Spring Data: A Repository Pattern for Vector Stores
5.1 A Custom VectorStore Repository
package com.laozhang.data;
import org.springframework.ai.document.Document;
import org.springframework.ai.vectorstore.SearchRequest;
import org.springframework.ai.vectorstore.VectorStore;
import org.springframework.ai.vectorstore.filter.Filter;
import org.springframework.ai.vectorstore.filter.FilterExpressionBuilder;
import org.springframework.stereotype.Repository;
import java.util.*;
import java.util.stream.Collectors;
/**
 * Vector-store repository: a domain-oriented document access layer.
 * Wraps the low-level VectorStore operations behind a friendlier API.
 */
@Repository
public class KnowledgeDocumentRepository {
    private final VectorStore vectorStore;
    public KnowledgeDocumentRepository(VectorStore vectorStore) {
        this.vectorStore = vectorStore;
    }
    /**
     * Save a knowledge document (embedded automatically on insert)
     */
    public void save(KnowledgeDocument doc) {
        Document vectorDoc = new Document(
            doc.content(),
            Map.of(
                "doc_id", doc.id(),
                "category", doc.category(),
                "department", doc.department(),
                "created_at", doc.createdAt().toString(),
                "version", doc.version()
            )
        );
        vectorStore.add(List.of(vectorDoc));
    }
    /**
     * Search within a category.
     * Uses FilterExpressionBuilder instead of string concatenation,
     * which avoids filter-injection through the category value.
     */
    public List<KnowledgeDocument> findByCategory(String query, String category, int topK) {
        Filter.Expression filter = new FilterExpressionBuilder().eq("category", category).build();
        SearchRequest request = SearchRequest.builder()
            .query(query)
            .topK(topK)
            .filterExpression(filter)
            .similarityThreshold(0.7)
            .build();
        return vectorStore.similaritySearch(request).stream()
            .map(this::toKnowledgeDoc)
            .collect(Collectors.toList());
    }
    /**
     * Search restricted to accessible departments (data isolation)
     */
    public List<KnowledgeDocument> findByDepartmentAccess(String query, List<String> accessibleDepts, int topK) {
        Filter.Expression filter = new FilterExpressionBuilder()
            .in("department", accessibleDepts.toArray())
            .build();
        SearchRequest request = SearchRequest.builder()
            .query(query)
            .topK(topK)
            .filterExpression(filter)
            .build();
        return vectorStore.similaritySearch(request).stream()
            .map(this::toKnowledgeDoc)
            .collect(Collectors.toList());
    }
    /**
     * Delete a document by id (many vector stores also support delete-by-metadata-filter)
     */
    public void deleteById(String docId) {
        vectorStore.delete(List.of(docId));
    }
    private KnowledgeDocument toKnowledgeDoc(Document doc) {
        Map<String, Object> meta = doc.getMetadata();
        return new KnowledgeDocument(
            (String) meta.getOrDefault("doc_id", ""),
            doc.getText(),
            (String) meta.getOrDefault("category", ""),
            (String) meta.getOrDefault("department", ""),
            (String) meta.getOrDefault("version", "1.0"),
            null // created_at is stored as a string; parse it here if you need it back
        );
    }
    public record KnowledgeDocument(String id, String content, String category,
                                    String department, String version,
                                    java.time.LocalDateTime createdAt) {}
}
5.2 Mixing Spring Data JPA with a VectorStore
package com.laozhang.data;
import org.springframework.data.domain.Pageable;
import org.springframework.data.jpa.repository.JpaRepository;
import org.springframework.data.jpa.repository.Query;
import org.springframework.data.repository.query.Param;
import org.springframework.stereotype.Repository;
import org.springframework.stereotype.Service;
import jakarta.persistence.*;
import lombok.Getter;
import lombok.Setter;
import java.util.List;
import java.util.stream.Collectors;
/**
 * Hybrid storage strategy:
 * - JPA holds the structured metadata (document info, permissions, status)
 * - the VectorStore holds the embedded content (semantic search)
 * The two are linked by doc_id.
 */
// JPA entity: document metadata
@Entity
@Getter
@Setter
@Table(name = "document_metadata")
public class DocumentMetadata {
    @Id
    private String docId;
    private String title;
    private String category;
    private String department;
    private String author;
    private String status; // ACTIVE, ARCHIVED, DELETED
    private int viewCount;
    private double avgRating;
    @Column(name = "created_at")
    private java.time.LocalDateTime createdAt;
    @Column(name = "updated_at")
    private java.time.LocalDateTime updatedAt;
}
// JPA repository
@Repository
public interface DocumentMetadataRepository extends JpaRepository<DocumentMetadata, String> {
    List<DocumentMetadata> findByCategoryAndStatus(String category, String status);
    @Query("SELECT d FROM DocumentMetadata d WHERE d.department IN :departments AND d.status = 'ACTIVE'")
    List<DocumentMetadata> findAccessibleByDepartments(@Param("departments") List<String> departments);
    // JPQL has no LIMIT clause; pass a Pageable (e.g. PageRequest.of(0, limit)) instead
    @Query("SELECT d FROM DocumentMetadata d ORDER BY d.avgRating DESC")
    List<DocumentMetadata> findTopRated(Pageable pageable);
}
/**
 * Hybrid search service: JPA permission filtering + vector semantic search
 */
@Service
public class HybridSearchService {
    private final DocumentMetadataRepository metaRepo;
    private final KnowledgeDocumentRepository vectorRepo;
    public HybridSearchService(DocumentMetadataRepository metaRepo,
                               KnowledgeDocumentRepository vectorRepo) {
        this.metaRepo = metaRepo;
        this.vectorRepo = vectorRepo;
    }
    public List<SearchResult> hybridSearch(String query, String userId, List<String> departments) {
        // Step 1: JPA query for the set of document ids the user may access
        List<String> accessibleDocIds = metaRepo.findAccessibleByDepartments(departments)
            .stream().map(DocumentMetadata::getDocId).collect(Collectors.toList());
        // Step 2: vector search (restricted to accessible departments)
        List<KnowledgeDocumentRepository.KnowledgeDocument> vectorResults =
            vectorRepo.findByDepartmentAccess(query, departments, 10);
        // Step 3: merge — drop inaccessible documents and attach metadata
        return vectorResults.stream()
            .filter(doc -> accessibleDocIds.contains(doc.id()))
            .map(doc -> {
                DocumentMetadata meta = metaRepo.findById(doc.id()).orElse(null);
                return new SearchResult(doc, meta);
            })
            .filter(r -> r.meta() != null)
            .collect(Collectors.toList());
    }
    public record SearchResult(KnowledgeDocumentRepository.KnowledgeDocument doc, DocumentMetadata meta) {}
}
6. Integrating with Spring Batch: Large-Scale AI Data Processing
6.1 Architecture for Large-Scale Document Processing
6.2 A Complete Spring Batch Implementation
package com.laozhang.batch;
import org.springframework.ai.chat.client.ChatClient;
import org.springframework.ai.document.Document;
import org.springframework.ai.vectorstore.VectorStore;
import org.springframework.batch.core.*;
import org.springframework.batch.core.configuration.annotation.EnableBatchProcessing;
import org.springframework.batch.core.job.builder.JobBuilder;
import org.springframework.batch.core.repository.JobRepository;
import org.springframework.batch.core.step.builder.StepBuilder;
import org.springframework.batch.item.*;
import org.springframework.batch.item.database.JdbcCursorItemReader;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;
import org.springframework.transaction.PlatformTransactionManager;
import lombok.extern.slf4j.Slf4j;
import java.util.*;
/**
 * Spring Batch job: large-scale document vectorization
 */
@Slf4j
@Configuration
@EnableBatchProcessing // note: in Spring Boot 3, this annotation replaces Boot's batch auto-configuration
public class DocumentVectorizationBatchConfig {
    /**
     * Job definition
     */
    @Bean
    public Job documentVectorizationJob(JobRepository jobRepository,
                                        Step vectorizationStep,
                                        JobExecutionListener listener) {
        return new JobBuilder("documentVectorizationJob", jobRepository)
            .start(vectorizationStep)
            .listener(listener)
            .build();
    }
    /**
     * Step definition: chunk size = 50 (each batch processes 50 documents)
     */
    @Bean
    public Step vectorizationStep(JobRepository jobRepository,
                                  PlatformTransactionManager transactionManager,
                                  ItemReader<RawDocument> reader,
                                  ItemProcessor<RawDocument, List<Document>> processor,
                                  ItemWriter<List<Document>> writer) {
        return new StepBuilder("vectorizationStep", jobRepository)
            .<RawDocument, List<Document>>chunk(50, transactionManager)
            .reader(reader)
            .processor(processor)
            .writer(writer)
            .faultTolerant()
            .retry(Exception.class) // retry transient failures such as network errors
            .retryLimit(3)
            .skip(Exception.class) // skip individual documents that keep failing
            .skipLimit(100)
            .listener(new StepExecutionListener() { // the Support base class was removed in Spring Batch 5
                @Override
                public ExitStatus afterStep(StepExecution stepExecution) {
                    log.info("Step finished: read={}, written={}, skipped={}",
                        stepExecution.getReadCount(),
                        stepExecution.getWriteCount(),
                        stepExecution.getSkipCount());
                    return stepExecution.getExitStatus();
                }
            })
            .build();
    }
    /**
     * Reader: load unprocessed documents from the database
     */
    @Bean
    public JdbcCursorItemReader<RawDocument> documentReader(javax.sql.DataSource dataSource) {
        JdbcCursorItemReader<RawDocument> reader = new JdbcCursorItemReader<>();
        reader.setDataSource(dataSource);
        reader.setSql("SELECT id, title, content, category, department FROM documents WHERE vectorized = false ORDER BY id");
        reader.setRowMapper((rs, rowNum) -> new RawDocument(
            rs.getString("id"),
            rs.getString("title"),
            rs.getString("content"),
            rs.getString("category"),
            rs.getString("department")
        ));
        reader.setFetchSize(100);
        return reader;
    }
    /**
     * Processor: clean + chunk + LLM summary + prepare documents for embedding
     */
    @Bean
    public ItemProcessor<RawDocument, List<Document>> documentProcessor(ChatClient chatClient) {
        return rawDoc -> {
            log.debug("Processing document: {}", rawDoc.id());
            // 1. Clean the content
            String cleanContent = cleanContent(rawDoc.content());
            if (cleanContent.length() < 50) {
                log.warn("Document too short, skipping: {}", rawDoc.id());
                return null; // returning null filters the item out
            }
            // 2. Chunking with sentence-boundary awareness
            List<String> chunks = smartChunk(cleanContent, 600, 100);
            // 3. Generate a whole-document summary (for coarse-grained retrieval)
            String summary = generateSummary(chatClient, rawDoc.title(), cleanContent);
            // 4. Build the Document list
            List<Document> documents = new ArrayList<>();
            // summary document (level=summary)
            documents.add(new Document(summary, Map.of(
                "doc_id", rawDoc.id(),
                "title", rawDoc.title(),
                "category", rawDoc.category(),
                "department", rawDoc.department(),
                "level", "summary",
                "chunk_index", -1
            )));
            // raw chunk documents (level=chunk)
            for (int i = 0; i < chunks.size(); i++) {
                documents.add(new Document(chunks.get(i), Map.of(
                    "doc_id", rawDoc.id(),
                    "title", rawDoc.title(),
                    "category", rawDoc.category(),
                    "department", rawDoc.department(),
                    "level", "chunk",
                    "chunk_index", i,
                    "total_chunks", chunks.size()
                )));
            }
            return documents;
        };
    }
    /**
     * Writer: batch-insert into the vector store
     */
    @Bean
    public ItemWriter<List<Document>> vectorStoreWriter(VectorStore vectorStore,
                                                        javax.sql.DataSource dataSource) {
        return chunk -> {
            // Flatten all Documents (Spring Batch 5 passes a Chunk, not a List)
            List<Document> allDocs = chunk.getItems().stream()
                .filter(Objects::nonNull)
                .flatMap(Collection::stream)
                .collect(java.util.stream.Collectors.toList());
            if (!allDocs.isEmpty()) {
                vectorStore.add(allDocs);
                log.info("Wrote {} document fragments to the vector store", allDocs.size());
                // Mark the source rows as vectorized
                Set<String> docIds = allDocs.stream()
                    .map(d -> (String) d.getMetadata().get("doc_id"))
                    .filter(Objects::nonNull)
                    .collect(java.util.stream.Collectors.toSet());
                markAsVectorized(dataSource, docIds);
            }
        };
    }
    // Helpers
    private String cleanContent(String content) {
        return content.replaceAll("\\s+", " ")
            .replaceAll("[\\x00-\\x08\\x0B\\x0C\\x0E-\\x1F]", "")
            .trim();
    }
    private List<String> smartChunk(String content, int chunkSize, int overlap) {
        List<String> chunks = new ArrayList<>();
        int start = 0;
        while (start < content.length()) {
            int end = Math.min(start + chunkSize, content.length());
            // Prefer to cut at a sentence boundary
            if (end < content.length()) {
                int lastPeriod = content.lastIndexOf('。', end);
                if (lastPeriod > start + chunkSize / 2) {
                    end = lastPeriod + 1;
                }
            }
            chunks.add(content.substring(start, end));
            if (end == content.length()) break;
            // Step back by the overlap, but always make forward progress
            // (end - overlap can fall behind start, which would loop forever)
            start = Math.max(end - overlap, start + 1);
        }
        return chunks;
    }
    private String generateSummary(ChatClient chatClient, String title, String content) {
        String prompt = "Summarize the following document in under 200 characters, highlighting the main points and key data.\n\nTitle: %s\n\nContent (first 2000 chars): %s"
            .formatted(title, content.substring(0, Math.min(2000, content.length())));
        return chatClient.prompt().user(prompt).call().content();
    }
    private void markAsVectorized(javax.sql.DataSource dataSource, Set<String> docIds) {
        // update the vectorized flag in the source table...
    }
    public record RawDocument(String id, String title, String content, String category, String department) {}
}
7. Integrating with Spring WebFlux: Reactive AI Services
7.1 The Value of a Reactive Architecture
7.2 A Reactive AI Controller
package com.laozhang.reactive;
import org.springframework.ai.chat.client.ChatClient;
import org.springframework.http.MediaType;
import org.springframework.http.codec.ServerSentEvent;
import org.springframework.web.bind.annotation.*;
import reactor.core.publisher.Flux;
import reactor.core.publisher.Mono;
import reactor.core.scheduler.Schedulers;
import lombok.extern.slf4j.Slf4j;
import java.time.Duration;
import java.util.List;
/**
 * Reactive AI controller:
 * supports SSE streaming and backpressure handling
 */
@Slf4j
@RestController
@RequestMapping("/api/ai")
public class ReactiveAiController {
    private final ChatClient chatClient;
    public ReactiveAiController(ChatClient chatClient) {
        this.chatClient = chatClient;
    }
    /**
     * Streaming chat endpoint (SSE):
     * the frontend renders AI output as it arrives, improving perceived latency
     */
    @PostMapping(value = "/stream", produces = MediaType.TEXT_EVENT_STREAM_VALUE)
    public Flux<ServerSentEvent<String>> streamChat(@RequestBody ChatRequest request) {
        log.info("Streaming request: sessionId={}", request.sessionId());
        return chatClient.prompt()
            .user(request.question())
            .stream()
            .content()
            // wrap each chunk as an SSE event
            .map(chunk -> ServerSentEvent.<String>builder()
                .data(chunk)
                .build())
            // emit a completion signal when the stream ends
            .concatWith(Flux.just(ServerSentEvent.<String>builder()
                .event("done")
                .data("[DONE]")
                .build()))
            // timeout: abort after 30 seconds without output
            // (placed before onErrorResume so timeouts are handled as errors too)
            .timeout(Duration.ofSeconds(30))
            // error handling
            .onErrorResume(e -> {
                log.error("Streaming failure", e);
                return Flux.just(ServerSentEvent.<String>builder()
                    .event("error")
                    .data("The AI service is temporarily unavailable, please retry later")
                    .build());
            });
    }
    /**
     * Non-blocking chat endpoint
     */
    @PostMapping("/chat")
    public Mono<ChatResponse> chat(@RequestBody ChatRequest request) {
        return Mono.fromCallable(() ->
                chatClient.prompt()
                    .user(request.question())
                    .call()
                    .content()
            )
            // call() blocks, so shift it off the event loop onto a bounded-elastic thread
            .subscribeOn(Schedulers.boundedElastic())
            .map(content -> new ChatResponse(content, request.sessionId()))
            .timeout(Duration.ofSeconds(60));
    }
    /**
     * Concurrent batch chat endpoint (reactive parallel processing)
     */
    @PostMapping("/batch")
    public Flux<BatchChatResponse> batchChat(@RequestBody List<String> questions) {
        return Flux.fromIterable(questions)
            .flatMap(question ->
                Mono.fromCallable(() -> chatClient.prompt().user(question).call().content())
                    .subscribeOn(Schedulers.boundedElastic()) // blocking call off the event loop
                    .map(answer -> new BatchChatResponse(question, answer))
                    .onErrorReturn(new BatchChatResponse(question, "processing failed")),
                5 // max concurrency = 5, to stay under provider rate limits
            );
    }
    public record ChatRequest(String question, String sessionId) {}
    public record ChatResponse(String answer, String sessionId) {}
    public record BatchChatResponse(String question, String answer) {}
}
8. Community Projects: Spring AI Companion Libraries Worth Watching
8.1 The Ecosystem Map
8.2 Spring AI Alibaba: The Go-To Choice for Chinese Model Providers
<!-- pom.xml -->
<dependency>
    <groupId>com.alibaba.cloud.ai</groupId>
    <artifactId>spring-ai-alibaba-starter</artifactId>
    <version>1.0.0</version>
</dependency>

spring:
  ai:
    dashscope:
      api-key: ${DASHSCOPE_API_KEY}
      chat:
        options:
          model: qwen-max        # strongest Qwen variant
          # model: qwen-plus     # balanced
          # model: qwen-turbo    # speed first
          # model: qwen-long     # very long context (1M tokens)
      embedding:
        options:
          model: text-embedding-v3 # Alibaba's latest embedding model

// Alibaba Cloud extras: multimodal input
import org.springframework.ai.chat.client.ChatClient;
import org.springframework.ai.chat.messages.UserMessage;
import org.springframework.ai.model.Media; // package name varies across Spring AI versions
import org.springframework.stereotype.Service;
import org.springframework.util.MimeTypeUtils;
import java.net.MalformedURLException;
import java.net.URL;
import java.util.List;
import java.util.UUID;

@Service
public class AliyunMultimodalService {
    private final ChatClient chatClient;
    public AliyunMultimodalService(ChatClient chatClient) {
        this.chatClient = chatClient;
    }
    /**
     * Image + text understanding (Qwen-VL)
     */
    public String analyzeImageWithText(String imageUrl, String question) throws MalformedURLException {
        UserMessage message = new UserMessage(
            question,
            List.of(new Media(MimeTypeUtils.IMAGE_JPEG, new URL(imageUrl)))
        );
        return chatClient.prompt().messages(message).call().content();
    }
    /**
     * Long-document analysis (qwen-long supports a 1M-token context)
     */
    public String analyzeLongDocument(String documentPath) {
        // Upload the file to Alibaba Cloud's file service to obtain a file-id
        String fileId = uploadFile(documentPath);
        String prompt = "Please analyze this document and extract the main points and key data.";
        return chatClient.prompt()
            .user(u -> u.text(prompt).media(List.of(new Media("file-id", fileId)))) // pseudo-code: DashScope file reference
            .call()
            .content();
    }
    private String uploadFile(String path) {
        // call the Alibaba Cloud file-upload API...
        return "file-" + UUID.randomUUID();
    }
}
9. Testing Strategy for Spring AI: A Complete Approach
9.1 The Test Pyramid
9.2 Testing with a Mocked LLM
package com.laozhang.test;
import org.springframework.ai.chat.client.ChatClient;
import org.springframework.ai.chat.model.ChatModel;
import org.springframework.ai.chat.model.ChatResponse;
import org.springframework.ai.chat.model.Generation;
import org.springframework.ai.chat.messages.AssistantMessage;
import org.springframework.ai.document.Document;
import org.springframework.ai.vectorstore.SearchRequest;
import org.springframework.ai.vectorstore.VectorStore;
import org.springframework.boot.test.context.TestConfiguration;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Primary;
import org.junit.jupiter.api.Test;
import org.junit.jupiter.api.extension.ExtendWith;
import org.mockito.*;
import org.mockito.junit.jupiter.MockitoExtension;
import static org.mockito.Mockito.*;
import static org.assertj.core.api.Assertions.*;
import java.util.*;
/**
 * Unit tests for an AI service: mock the LLM, focus on business logic.
 * (RagService is the class under test, assumed to wrap ChatClient + VectorStore.)
 */
@ExtendWith(MockitoExtension.class)
class RagServiceTest {
    @Mock
    private ChatClient chatClient;
    @Mock
    private VectorStore vectorStore;
    @InjectMocks
    private RagService ragService;
    @Test
    void shouldReturnAnswerWhenDocumentsFound() {
        // Given
        String question = "How do I configure a vector store in Spring AI?";
        String expectedAnswer = "Spring AI configures vector stores under spring.ai.vectorstore.";
        // mock the vector search
        Document mockDoc = new Document("Spring AI vector store configuration...", Map.of("source", "docs.spring.io"));
        when(vectorStore.similaritySearch(any(SearchRequest.class))).thenReturn(List.of(mockDoc));
        // mock the fluent ChatClient chain
        ChatClient.ChatClientRequestSpec spec = mock(ChatClient.ChatClientRequestSpec.class);
        ChatClient.CallResponseSpec responseSpec = mock(ChatClient.CallResponseSpec.class);
        when(chatClient.prompt()).thenReturn(spec);
        when(spec.user(any(String.class))).thenReturn(spec);
        when(spec.call()).thenReturn(responseSpec);
        when(responseSpec.content()).thenReturn(expectedAnswer);
        // When
        String answer = ragService.answer(question);
        // Then
        assertThat(answer).isEqualTo(expectedAnswer);
        verify(vectorStore, times(1)).similaritySearch(any(SearchRequest.class));
        verify(chatClient, times(1)).prompt();
    }
    @Test
    void shouldHandleEmptyRetrievalGracefully() {
        // Given
        when(vectorStore.similaritySearch(any(SearchRequest.class))).thenReturn(Collections.emptyList());
        ChatClient.ChatClientRequestSpec spec = mock(ChatClient.ChatClientRequestSpec.class);
        ChatClient.CallResponseSpec responseSpec = mock(ChatClient.CallResponseSpec.class);
        when(chatClient.prompt()).thenReturn(spec);
        when(spec.user(any(String.class))).thenReturn(spec);
        when(spec.call()).thenReturn(responseSpec);
        when(responseSpec.content()).thenReturn("Nothing relevant found in the current knowledge base.");
        // When & Then
        assertThatNoException().isThrownBy(() -> ragService.answer("a question with no matching documents"));
    }
}
/**
 * Integration-test configuration: mocked LLM, real vector store
 */
@TestConfiguration
class TestAiConfig {
    @Bean
    @Primary
    public ChatModel mockChatModel() {
        ChatModel mockModel = mock(ChatModel.class);
        when(mockModel.call(any())).thenReturn(
            new ChatResponse(List.of(new Generation(new AssistantMessage("This is a test response"))))
        );
        return mockModel;
    }
}
10. Spring AI Roadmap: The Next Six Months
10.1 Roadmap (Based on Official Announcements)
10.2 FAQ
Q: What is the most stable Spring AI version today, and is it production-ready?
A: As of August 2026, Spring AI 1.0.x has been validated in hundreds of production projects and is fully production-ready. Use the latest 1.0.x patch release, and avoid SNAPSHOT versions in production.
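To keep that patch version consistent across modules, one common approach (a sketch assuming Maven; `spring-ai-bom` is the official Spring AI bill of materials, and `1.0.1` stands in for whatever the latest 1.0.x patch is) is to import the BOM and omit versions on the individual starters:

```xml
<dependencyManagement>
  <dependencies>
    <dependency>
      <groupId>org.springframework.ai</groupId>
      <artifactId>spring-ai-bom</artifactId>
      <version>1.0.1</version> <!-- pin the latest 1.0.x patch here -->
      <type>pom</type>
      <scope>import</scope>
    </dependency>
  </dependencies>
</dependencyManagement>
```

With the BOM in place, upgrading to a new patch is a one-line change instead of an edit in every module.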
Q: Spring AI or LangChain4j — how do I choose?
A: The core difference: Spring AI integrates natively with the Spring ecosystem (Security, Data, and Batch work as one) and has a lower learning curve; LangChain4j ships more features out of the box (more ready-made Chain/Agent patterns) but pulls in more dependencies. Java engineers on Spring projects: prefer Spring AI. Need to assemble complex agent chains quickly: LangChain4j is the better fit.
Q: How good is Spring AI's support for Chinese model providers?
A: Very good. Tongyi Qianwen (DashScope) is officially supported; Zhipu ChatGLM is covered by spring-ai-zhipuai; Baidu ERNIE by spring-ai-qianfan; MiniMax and Moonshot have community support. spring-ai-alibaba is the recommended starting point — Alibaba invests the most in this ecosystem.
Q: Which databases does Spring AI's VectorStore support?
A: Officially supported: PgVector (a PostgreSQL extension, recommended first choice), Redis (RediSearch), Qdrant, Weaviate, Pinecone, Chroma, Milvus, Azure AI Search, Elasticsearch. My personal recommendation: Chroma for development (lightest weight), PgVector for production (SQL ecosystem, low operational cost).
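One practical note if you go with PgVector: the extension must be enabled in the target database before Spring AI's `initialize-schema: true` can create its table — a one-time DDL step (assuming a PostgreSQL instance with the pgvector extension package installed):

```sql
-- run once per database, as a superuser or a role with CREATE privilege
CREATE EXTENSION IF NOT EXISTS vector;
```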
Q: How do I implement multi-tenant isolation in Spring AI?
A: The mainstream options: a separate collection/namespace per tenant (supported by Qdrant); or one shared collection with a tenant_id metadata field used as a filter (suited to a small number of tenants). PgVector also supports RLS (row-level security), which gives database-level isolation.
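The shared-collection option can be sketched in plain Java (the names here — `TenantFilterDemo`, `Doc`, the `tenant_id` key — are illustrative, not Spring AI APIs; inside Spring AI you would express the same predicate with its `FilterExpressionBuilder` and pass it to `SearchRequest`):

```java
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

/** Illustrative sketch: shared-collection multi-tenancy via a tenant_id metadata field. */
public class TenantFilterDemo {
    // Minimal stand-in for a vector-store document: content + metadata.
    public record Doc(String content, Map<String, Object> metadata) {}

    /** Keep only documents belonging to the given tenant (defensive post-filter). */
    public static List<Doc> filterByTenant(List<Doc> candidates, String tenantId) {
        return candidates.stream()
                .filter(d -> tenantId.equals(d.metadata().get("tenant_id")))
                .collect(Collectors.toList());
    }

    /** Build a vector-store filter expression string, validating the tenant id first. */
    public static String tenantFilterExpression(String tenantId) {
        // Reject ids that could break out of the quoted literal (filter injection).
        if (!tenantId.matches("[A-Za-z0-9_-]+")) {
            throw new IllegalArgumentException("invalid tenant id: " + tenantId);
        }
        return "tenant_id == '" + tenantId + "'";
    }

    public static void main(String[] args) {
        List<Doc> docs = List.of(
                new Doc("doc A", Map.of("tenant_id", "t1")),
                new Doc("doc B", Map.of("tenant_id", "t2")));
        System.out.println(filterByTenant(docs, "t1").size()); // prints 1
        System.out.println(tenantFilterExpression("t1"));
    }
}
```

Applying the filter at search time keeps other tenants' vectors out of the candidate set; the post-filter shown is a cheap second line of defense against misconfigured filters.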
Closing Thoughts
Zhao Feng's remark is worth coming back to:
"Spring AI accounts for less than 20% of the code."
That isn't a limitation of Spring AI — it is Spring AI's design philosophy: focus relentlessly on AI integration, and leave everything else to the Spring ecosystem.
Security handles security, Data handles storage, Batch handles bulk processing, WebFlux handles reactive streaming… every piece is a mature component hardened by a decade of production use.
As Java engineers, we stand on the shoulders of the Spring ecosystem. Not knowing that ecosystem is like walking everywhere on foot, never noticing the sports car parked right in front of you.
Learn it, master it, and your AI delivery speed will turn heads.
