Spring AI Multi-Tenant Architecture: The Design Secrets Behind One System Serving 1,000 Enterprises
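The Night I Almost Got Sued by a Client
Late one night in November 2024, my phone rang.
The caller was Chen Wei, an old friend I met while working as a CTO consultant at an AI SaaS startup. His voice was shaking:
"Lao Zhang, we're in serious trouble. Our AI assistant went live three days ago. A purchasing manager at customer A was using it and suddenly saw customer B's conversations in his history. B is his direct competitor, and that conversation covered B's purchasing plan for next quarter..."
My mind went blank.
Their product was an enterprise AI assistant SaaS. Every customer ran on the same system, and the AI recorded each company's business conversations to build a dedicated knowledge base. Sixty-seven companies signed up on launch day, which sounded great, but tenant isolation had never been done properly.
"All the embeddings in the vector database live in one collection, and we forgot the tenant filter on the query," Chen Wei said, barely audible.
Company B's purchasing manager found out, and a lawyer's letter reached Chen Wei's company the next afternoon.
I spent two full weeks helping them redesign the multi-tenant architecture so that data was fully isolated, and helping them settle a sizable compensation claim.
This article is distilled from those two weeks. If you are building enterprise AI SaaS, read it to the end.
A data leak is not a technical problem. It decides whether the company survives.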
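The conclusion first (TL;DR)
| Dimension | Row-level isolation | Schema isolation | Database isolation |
|---|---|---|---|
| Implementation complexity | Low | Medium | High |
| Resource consumption | Lowest | Medium | Highest |
| Isolation level | Weak (depends on code) | Medium | Strong |
| Best fit | Under 100 tenants, low security requirements | 100 to 1,000 tenants | Finance, healthcare, and other heavily regulated sectors |
| Data leak risk | High (one bug is a leak) | Medium | Very low |
Quick decision tree:
- Fewer than 100 tenants, and not a heavily regulated industry → row-level isolation
- 100 to 1,000 tenants, ordinary SaaS → schema isolation (recommended)
- Finance, healthcare, government → database isolation
- AI vector database: whichever strategy you pick, give each tenant its own namespace/collection where you can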
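The decision tree above is small enough to write down as code. The sketch below is only an illustration: `IsolationStrategy` and `IsolationAdvisor` are names I made up for this example, and the tree says nothing about deployments above 1,000 tenants, so the sketch simply keeps recommending schema isolation there.

```java
/** The three isolation strategies from the comparison table. */
enum IsolationStrategy { ROW_LEVEL, SCHEMA, DATABASE }

final class IsolationAdvisor {
    private IsolationAdvisor() {}

    /**
     * Mirrors the quick decision tree: regulated industries get a database per tenant,
     * small non-regulated deployments use row-level isolation, everything else uses schemas.
     * Assumption: above 1,000 tenants the tree gives no rule, so SCHEMA is returned.
     */
    static IsolationStrategy recommend(int tenantCount, boolean heavilyRegulated) {
        if (heavilyRegulated) {
            return IsolationStrategy.DATABASE;
        }
        if (tenantCount < 100) {
            return IsolationStrategy.ROW_LEVEL;
        }
        return IsolationStrategy.SCHEMA;
    }
}
```

For example, `IsolationAdvisor.recommend(67, false)` returns `ROW_LEVEL`, while any regulated tenant base returns `DATABASE` regardless of size.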
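The three core challenges of a multi-tenant AI system
Challenge 1: Data isolation
In an ordinary multi-tenant system, a tenant_id column gets you most of the way. An AI system also has a vector database to worry about.
Once company A's contracts have been embedded as vectors, a similar question from a user at company B can make the RAG system retrieve company A's content. That is not a mere "information leak"; it is precision corporate espionage.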
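To make the failure mode concrete, here is a toy in-memory index, not a real vector database. `Embedding` and `ToyVectorIndex` are hypothetical names for this sketch, and the two-dimensional vectors stand in for real embeddings. The unfiltered search ranks every tenant's vectors together; the filtered search drops other tenants' entries before ranking.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

/** A stored embedding tagged with the tenant that owns it. */
record Embedding(String tenantId, String text, double[] vector) {}

final class ToyVectorIndex {
    private final List<Embedding> entries = new ArrayList<>();

    void add(Embedding e) {
        entries.add(e);
    }

    /** Unsafe: ranks every entry regardless of owner. */
    List<Embedding> searchAll(double[] query, int topK) {
        return rank(entries, query, topK);
    }

    /** Safe: removes other tenants' entries before ranking. */
    List<Embedding> searchForTenant(String tenantId, double[] query, int topK) {
        List<Embedding> own = entries.stream()
                .filter(e -> e.tenantId().equals(tenantId))
                .toList();
        return rank(own, query, topK);
    }

    private static List<Embedding> rank(List<Embedding> candidates, double[] query, int topK) {
        // Negate the similarity so the most similar entry sorts first.
        return candidates.stream()
                .sorted(Comparator.comparingDouble((Embedding e) -> -cosine(e.vector(), query)))
                .limit(topK)
                .toList();
    }

    static double cosine(double[] a, double[] b) {
        double dot = 0, normA = 0, normB = 0;
        for (int i = 0; i < a.length; i++) {
            dot += a[i] * b[i];
            normA += a[i] * a[i];
            normB += b[i] * b[i];
        }
        return dot / (Math.sqrt(normA) * Math.sqrt(normB));
    }
}
```

If tenant A stores a vector that matches tenant B's query better than anything tenant B owns, `searchAll` hands tenant B the competitor's document, while `searchForTenant("tenant-b", ...)` can only ever return tenant B's own entries.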
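Challenge 2: Per-tenant configuration
Large customers want GPT-4o, small ones get GPT-4o-mini; customer A has its own API key; customer B wants answers in Chinese, customer C in English. Hard-code these differences and the codebase turns into a nightmare.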
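One way to keep those differences out of the code path is to store only what a tenant overrides and fall back to platform defaults for the rest. This is a minimal sketch of that idea, not the configuration model used later in this article; `AiSettings`, `TenantOverrides`, and `AiSettingsResolver` are hypothetical names.

```java
import java.util.Map;
import java.util.Optional;

/** Effective settings for one chat call. */
record AiSettings(String model, String language, double temperature) {}

/** What a tenant has overridden; an empty Optional means "use the platform default". */
record TenantOverrides(Optional<String> model, Optional<String> language, Optional<Double> temperature) {}

final class AiSettingsResolver {
    private final AiSettings defaults;
    private final Map<String, TenantOverrides> overridesByTenant;

    AiSettingsResolver(AiSettings defaults, Map<String, TenantOverrides> overridesByTenant) {
        this.defaults = defaults;
        this.overridesByTenant = Map.copyOf(overridesByTenant);
    }

    /** Merges the tenant's overrides on top of the defaults, field by field. */
    AiSettings resolve(String tenantId) {
        TenantOverrides o = overridesByTenant.get(tenantId);
        if (o == null) {
            return defaults;
        }
        return new AiSettings(
                o.model().orElse(defaults.model()),
                o.language().orElse(defaults.language()),
                o.temperature().orElse(defaults.temperature()));
    }
}
```

A large customer then only needs a row saying "model = gpt-4o"; language and temperature keep following the platform defaults, and changing a default affects every tenant that has not overridden it.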
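Challenge 3: Cost allocation
You need to know how many tokens each tenant consumed and what that cost, which takes a fine-grained metering system.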
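The arithmetic itself is simple: multiply prompt and completion tokens by their respective prices and add the two. The sketch below assumes prices are quoted per million tokens; `TokenPrice` and `TokenCostCalculator` are hypothetical names, and any numbers you pass in are your own price table, not figures from this article.

```java
import java.math.BigDecimal;
import java.math.RoundingMode;

/** Price in USD per one million tokens. The values are supplied by the caller. */
record TokenPrice(BigDecimal inputPerMillion, BigDecimal outputPerMillion) {}

final class TokenCostCalculator {
    private static final BigDecimal ONE_MILLION = BigDecimal.valueOf(1_000_000);

    private TokenCostCalculator() {}

    /** cost = (promptTokens * inputPrice + completionTokens * outputPrice) / 1,000,000 */
    static BigDecimal costUsd(TokenPrice price, long promptTokens, long completionTokens) {
        BigDecimal input = price.inputPerMillion().multiply(BigDecimal.valueOf(promptTokens));
        BigDecimal output = price.outputPerMillion().multiply(BigDecimal.valueOf(completionTokens));
        // BigDecimal avoids the rounding drift that double arithmetic accumulates over many records.
        return input.add(output).divide(ONE_MILLION, 8, RoundingMode.HALF_UP);
    }
}
```

With a placeholder price of 1.00 USD per million input tokens and 2.00 USD per million output tokens, a call with 1,000 prompt tokens and 500 completion tokens costs (1,000 × 1.00 + 500 × 2.00) / 1,000,000 = 0.002 USD.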
Overall architecture design
Core implementation 1: TenantContext, the tenant context
package com.laozhang.multitenant.context;

import com.laozhang.multitenant.entity.TenantAiConfig;
import lombok.Builder;
import lombok.Data;

import java.time.LocalDateTime;
import java.util.Map;

/** Everything the request pipeline needs to know about the current tenant and caller. */
@Data
@Builder
public class TenantContext {
    private String tenantId;
    private String tenantName;
    private TenantPlan plan;
    private TenantAiConfig aiConfig;
    private String schemaName;
    private String userId;
    private LocalDateTime tenantCreatedAt;
    private Map<String, Object> attributes;

    public enum TenantPlan {
        FREE, BASIC, PRO, ENTERPRISE
    }
}

package com.laozhang.multitenant.context;
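
import java.util.Optional;

/**
 * Thread-bound holder for the current TenantContext.
 * InheritableThreadLocal copies the value only when a child thread is created,
 * so pooled threads still need explicit propagation.
 */
public class TenantContextHolder {
    private static final ThreadLocal<TenantContext> CONTEXT_HOLDER =
            new InheritableThreadLocal<>();

    public static void setContext(TenantContext context) {
        CONTEXT_HOLDER.set(context);
    }

    /** Fails fast when no tenant is bound, so a missing context cannot silently skip filtering. */
    public static TenantContext getContext() {
        TenantContext context = CONTEXT_HOLDER.get();
        if (context == null) {
            // TenantContextNotFoundException is an application-defined unchecked exception (not shown).
            throw new TenantContextNotFoundException("No tenant context is bound to the current thread");
        }
        return context;
    }

    public static Optional<TenantContext> getContextOptional() {
        return Optional.ofNullable(CONTEXT_HOLDER.get());
    }

    public static String getTenantId() {
        return getContext().getTenantId();
    }

    public static void clear() {
        CONTEXT_HOLDER.remove();
    }

    public static boolean hasContext() {
        return CONTEXT_HOLDER.get() != null;
    }
}

package com.laozhang.multitenant.filter;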
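
import com.laozhang.multitenant.context.TenantContext;
import com.laozhang.multitenant.context.TenantContextHolder;
import jakarta.servlet.*;
import jakarta.servlet.http.HttpServletRequest;
import jakarta.servlet.http.HttpServletResponse;
import lombok.RequiredArgsConstructor;
import lombok.extern.slf4j.Slf4j;
import org.springframework.core.annotation.Order;
import org.springframework.stereotype.Component;
import org.springframework.web.filter.OncePerRequestFilter;

import java.io.IOException;

// TenantConfigService and JwtTokenParser are application classes that this article does not show.
@Slf4j
@Component
@Order(10)
@RequiredArgsConstructor
public class TenantResolutionFilter extends OncePerRequestFilter {
    private static final String BEARER_PREFIX = "Bearer ";

    private final TenantConfigService tenantConfigService;
    private final JwtTokenParser jwtTokenParser;

    @Override
    protected void doFilterInternal(HttpServletRequest request,
            HttpServletResponse response, FilterChain filterChain)
            throws ServletException, IOException {
        try {
            String token = bearerToken(request);
            String tenantId = resolveTenantId(request, token);
            if (tenantId != null) {
                // loadTenantContext must return a fresh instance per request, because setUserId mutates it.
                TenantContext context = tenantConfigService.loadTenantContext(tenantId);
                if (token != null) {
                    context.setUserId(jwtTokenParser.extractUserId(token));
                }
                TenantContextHolder.setContext(context);
                log.debug("Tenant context set: tenantId={}, plan={}", tenantId, context.getPlan());
            }
            filterChain.doFilter(request, response);
        } finally {
            // Always clear, otherwise the next request on this pooled thread inherits the tenant.
            TenantContextHolder.clear();
        }
    }

    /** Returns the raw JWT without the "Bearer " prefix, or null if the header is absent. */
    private String bearerToken(HttpServletRequest request) {
        String authHeader = request.getHeader("Authorization");
        return authHeader != null && authHeader.startsWith(BEARER_PREFIX)
                ? authHeader.substring(BEARER_PREFIX.length())
                : null;
    }

    private String resolveTenantId(HttpServletRequest request, String token) {
        // 1. The signed JWT claim comes first: unlike a plain header, a client cannot forge it.
        if (token != null) {
            String jwtTenantId = jwtTokenParser.extractTenantId(token);
            if (jwtTenantId != null) return jwtTenantId;
        }
        // 2. X-Tenant-ID is client-controlled. Accept it only when a trusted gateway sets it.
        String headerTenantId = request.getHeader("X-Tenant-ID");
        if (headerTenantId != null && !headerTenantId.isBlank()) return headerTenantId;
        // 3. Fall back to the subdomain, e.g. the "acme" in acme.example.com.
        String host = request.getServerName();
        if (host != null && host.contains(".")) {
            String subdomain = host.split("\\.")[0];
            if (tenantConfigService.existsBySubdomain(subdomain)) {
                return tenantConfigService.getTenantIdBySubdomain(subdomain);
            }
        }
        return null;
    }
}

Core implementation 2: Per-tenant AI configuration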
package com.laozhang.multitenant.entity;

import jakarta.persistence.*;
import lombok.Data;

import java.time.LocalDateTime;

/** One row per tenant: which model to call, with which key, prompt, and vector namespace. */
@Data
@Entity
@Table(name = "tenant_ai_config")
public class TenantAiConfig {
    @Id
    @GeneratedValue(strategy = GenerationType.IDENTITY)
    private Long id;

    @Column(name = "tenant_id", unique = true, nullable = false)
    private String tenantId;

    @Enumerated(EnumType.STRING)
    @Column(name = "llm_provider")
    private LlmProvider llmProvider;

    @Column(name = "model_name")
    private String modelName;

    // Stored encrypted; never persist or log the plaintext key.
    @Column(name = "api_key_encrypted")
    private String apiKeyEncrypted;

    @Column(name = "api_base_url")
    private String apiBaseUrl;

    @Column(name = "temperature")
    private Double temperature = 0.7;

    @Column(name = "max_tokens")
    private Integer maxTokens = 2048;

    @Column(name = "system_prompt", columnDefinition = "TEXT")
    private String systemPrompt;

    @Column(name = "vector_namespace")
    private String vectorNamespace;

    @Column(name = "enabled")
    private Boolean enabled = true;

    @Column(name = "created_at")
    private LocalDateTime createdAt;

    @Column(name = "updated_at")
    private LocalDateTime updatedAt;

    @PrePersist
    protected void onCreate() { createdAt = updatedAt = LocalDateTime.now(); }

    @PreUpdate
    protected void onUpdate() { updatedAt = LocalDateTime.now(); }

    public enum LlmProvider { OPENAI, ALIBABA, AZURE_OPENAI, ANTHROPIC, OLLAMA }
}

package com.laozhang.multitenant.ai;

import com.laozhang.multitenant.context.TenantContextHolder;
import com.laozhang.multitenant.entity.TenantAiConfig;
import lombok.RequiredArgsConstructor;
import lombok.extern.slf4j.Slf4j;
import org.springframework.ai.chat.client.ChatClient;
import org.springframework.ai.openai.OpenAiChatModel;
import org.springframework.ai.openai.OpenAiChatOptions;
import org.springframework.ai.openai.api.OpenAiApi;
import org.springframework.cache.annotation.Cacheable;
import org.springframework.stereotype.Component;

// TenantConfigService and ApiKeyDecryptor are application classes that this article does not show.
@Slf4j
@Component
@RequiredArgsConstructor
public class TenantAwareChatClientFactory {
    private static final String OPENAI_BASE_URL = "https://api.openai.com";
    // No trailing /v1: OpenAiApi appends /v1/chat/completions to the base URL itself.
    private static final String DASHSCOPE_BASE_URL = "https://dashscope.aliyuncs.com/compatible-mode";

    private final TenantConfigService tenantConfigService;
    private final ApiKeyDecryptor apiKeyDecryptor;
    private final ChatClient defaultChatClient;

    /**
     * Convenience lookup for the current tenant.
     * Caveat: this call goes through `this`, so it bypasses the Spring cache proxy and the
     * @Cacheable below does not apply. Call getChatClientForTenant from another bean
     * (or move it into its own bean) if this path must hit the cache.
     */
    public ChatClient getChatClient() {
        return getChatClientForTenant(TenantContextHolder.getTenantId());
    }

    @Cacheable(value = "tenant_chat_client", key = "#tenantId")
    public ChatClient getChatClientForTenant(String tenantId) {
        TenantAiConfig config = tenantConfigService.getAiConfig(tenantId);
        // Boolean.TRUE.equals also covers a null "enabled" column.
        if (config == null || !Boolean.TRUE.equals(config.getEnabled())) return defaultChatClient;
        return buildChatClient(config);
    }

    private ChatClient buildChatClient(TenantAiConfig config) {
        if (config.getLlmProvider() == null) return defaultChatClient;
        return switch (config.getLlmProvider()) {
            case OPENAI -> buildOpenAiCompatibleClient(config,
                    config.getApiBaseUrl() != null ? config.getApiBaseUrl() : OPENAI_BASE_URL,
                    "gpt-4o-mini");
            // DashScope exposes an OpenAI-compatible endpoint, so the OpenAI client is reused.
            case ALIBABA -> buildOpenAiCompatibleClient(config, DASHSCOPE_BASE_URL, "qwen-plus");
            // Other providers are not wired up yet and fall back to the shared client.
            default -> defaultChatClient;
        };
    }

    private ChatClient buildOpenAiCompatibleClient(TenantAiConfig config, String baseUrl, String fallbackModel) {
        String apiKey = apiKeyDecryptor.decrypt(config.getApiKeyEncrypted());
        OpenAiApi openAiApi = OpenAiApi.builder()
                .apiKey(apiKey)
                .baseUrl(baseUrl)
                .build();
        OpenAiChatOptions options = OpenAiChatOptions.builder()
                .model(config.getModelName() != null ? config.getModelName() : fallbackModel)
                .temperature(config.getTemperature())
                .maxTokens(config.getMaxTokens())
                .build();
        OpenAiChatModel chatModel = OpenAiChatModel.builder()
                .openAiApi(openAiApi)
                .defaultOptions(options)
                .build();
        ChatClient.Builder builder = ChatClient.builder(chatModel);
        // The tenant's system prompt applies to every provider, not only OpenAI.
        if (config.getSystemPrompt() != null && !config.getSystemPrompt().isBlank()) {
            builder.defaultSystem(config.getSystemPrompt());
        }
        return builder.build();
    }
}

Core implementation 3: Vector store isolation
package com.laozhang.multitenant.ai;

import com.laozhang.multitenant.context.TenantContextHolder;
import lombok.RequiredArgsConstructor;
import lombok.extern.slf4j.Slf4j;
import org.springframework.ai.document.Document;
import org.springframework.ai.vectorstore.SearchRequest;
import org.springframework.ai.vectorstore.VectorStore;
import org.springframework.ai.vectorstore.filter.FilterExpressionBuilder;
import org.springframework.stereotype.Component;

import java.util.HashMap;
import java.util.List;
import java.util.Map;

/**
 * Tenant-aware wrapper around the vector store.
 * Every vector operation gets the tenant filter injected automatically to keep data isolated.
 */
@Slf4j
@Component
@RequiredArgsConstructor
public class TenantAwareVectorStore {
    private final VectorStore vectorStore;

    /** Stamps every document with the current tenant before it is indexed. */
    public void add(List<Document> documents) {
        String tenantId = TenantContextHolder.getTenantId();
        List<Document> taggedDocuments = documents.stream()
                .map(doc -> {
                    Map<String, Object> metadata = new HashMap<>(doc.getMetadata());
                    // Overwrites any tenant_id supplied by the caller.
                    metadata.put("tenant_id", tenantId);
                    metadata.put("indexed_at", System.currentTimeMillis());
                    return new Document(doc.getId(), doc.getText(), metadata);
                })
                .toList();
        vectorStore.add(taggedDocuments);
    }

    /**
     * Similarity search with the tenant filter applied. This is the key defence against leaks.
     */
    public List<Document> similaritySearch(String query, int topK) {
        String tenantId = TenantContextHolder.getTenantId();
        FilterExpressionBuilder fb = new FilterExpressionBuilder();
        SearchRequest searchRequest = SearchRequest.builder()
                .query(query)
                .topK(topK)
                .filterExpression(fb.eq("tenant_id", tenantId).build())
                .similarityThreshold(0.7)
                .build();
        List<Document> results = vectorStore.similaritySearch(searchRequest);
        // Second check in application code, in case the store-side filter has a bug.
        List<Document> validResults = results.stream()
                .filter(doc -> tenantId.equals(doc.getMetadata().get("tenant_id")))
                .toList();
        if (validResults.size() != results.size()) {
            log.error("SECURITY ALERT: vector search returned documents that tenant {} must not see", tenantId);
        }
        return validResults;
    }

    public List<Document> similaritySearch(String query) {
        return similaritySearch(query, 5);
    }
}

Core implementation 4: Rate limiting and quota management
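Before the Redis-backed version, here is the fixed-window idea behind the per-minute limit in plain JDK code, so it can be tried without Redis. `FixedWindowLimiter` is a hypothetical name for this sketch. It keeps counters in memory and never evicts old windows, which the Redis version handles with key expiry, so treat it as an illustration rather than something to deploy.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;

final class FixedWindowLimiter {
    private final int maxPerMinute;
    private final ConcurrentHashMap<String, AtomicInteger> counters = new ConcurrentHashMap<>();

    FixedWindowLimiter(int maxPerMinute) {
        this.maxPerMinute = maxPerMinute;
    }

    /**
     * Counts the request against the minute that contains nowMillis and
     * returns true while the tenant is still within its limit for that minute.
     */
    boolean tryAcquire(String tenantId, long nowMillis) {
        long window = nowMillis / 60_000;               // one bucket per minute
        String key = tenantId + ":" + window;           // one counter per tenant per bucket
        int count = counters.computeIfAbsent(key, k -> new AtomicInteger()).incrementAndGet();
        return count <= maxPerMinute;
    }
}
```

Passing the clock in as `nowMillis` keeps the limiter deterministic in tests. A known weakness of fixed windows is the boundary burst: a tenant can send a full quota at the end of one minute and another full quota at the start of the next.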
package com.laozhang.multitenant.quota;

import com.laozhang.multitenant.context.TenantContext;
import com.laozhang.multitenant.context.TenantContextHolder;
import lombok.RequiredArgsConstructor;
import lombok.extern.slf4j.Slf4j;
import org.springframework.data.redis.core.StringRedisTemplate;
import org.springframework.stereotype.Service;

import java.time.Duration;
import java.time.LocalDate;
import java.time.YearMonth;
import java.util.Map;

// QuotaExceededException is an application class that this article does not show.
@Slf4j
@Service
@RequiredArgsConstructor
public class TenantQuotaManager {
    private final StringRedisTemplate redisTemplate;

    // QuotaConfig(maxRpm, maxDailyTokens, maxMonthlyTokens); -1 means unlimited.
    private static final Map<TenantContext.TenantPlan, QuotaConfig> PLAN_QUOTAS =
            Map.of(
                    TenantContext.TenantPlan.FREE, new QuotaConfig(10, 1_000, 100_000),
                    TenantContext.TenantPlan.BASIC, new QuotaConfig(30, 10_000, 1_000_000),
                    TenantContext.TenantPlan.PRO, new QuotaConfig(100, 50_000, 10_000_000),
                    TenantContext.TenantPlan.ENTERPRISE, new QuotaConfig(500, -1, -1)
            );

    /**
     * Pre-flight check. Despite the name, nothing is deducted here; actual usage is
     * added later by recordUsage. The daily limit is recorded but not enforced.
     */
    public void checkAndDecrease(int estimatedTokens) {
        TenantContext context = TenantContextHolder.getContext();
        String tenantId = context.getTenantId();
        // Map.of(...) throws on get(null), so a missing plan is treated as FREE.
        TenantContext.TenantPlan plan =
                context.getPlan() != null ? context.getPlan() : TenantContext.TenantPlan.FREE;
        QuotaConfig quota = PLAN_QUOTAS.get(plan);
        checkRpm(tenantId, quota.maxRpm());
        if (quota.maxMonthlyTokens() > 0) {
            checkMonthlyTokens(tenantId, estimatedTokens, quota.maxMonthlyTokens());
        }
    }

    public void recordUsage(String tenantId, int promptTokens, int completionTokens) {
        int totalTokens = promptTokens + completionTokens;
        String monthKey = "quota:monthly:" + tenantId + ":" + YearMonth.now();
        String dayKey = "quota:daily:" + tenantId + ":" + LocalDate.now();
        redisTemplate.opsForValue().increment(monthKey, totalTokens);
        redisTemplate.opsForValue().increment(dayKey, totalTokens);
        // Keep the counters slightly longer than the period they cover.
        redisTemplate.expire(monthKey, Duration.ofDays(35));
        redisTemplate.expire(dayKey, Duration.ofDays(2));
    }

    /** Fixed-window counter: one Redis key per tenant per minute. */
    private void checkRpm(String tenantId, int maxRpm) {
        String rpmKey = "quota:rpm:" + tenantId + ":" + System.currentTimeMillis() / 60000;
        Long currentRpm = redisTemplate.opsForValue().increment(rpmKey);
        redisTemplate.expire(rpmKey, Duration.ofMinutes(2));
        if (currentRpm != null && currentRpm > maxRpm) {
            throw new QuotaExceededException(
                    String.format("Request rate exceeded: %d/min now, limit %d/min", currentRpm, maxRpm),
                    QuotaExceededException.QuotaType.RPM);
        }
    }

    /** Soft limit: the read and the later recordUsage are not atomic, so concurrent calls can overshoot. */
    private void checkMonthlyTokens(String tenantId, int requestedTokens, long maxTokens) {
        String key = "quota:monthly:" + tenantId + ":" + YearMonth.now();
        String currentStr = redisTemplate.opsForValue().get(key);
        long current = currentStr != null ? Long.parseLong(currentStr) : 0L;
        if (current + requestedTokens > maxTokens) {
            throw new QuotaExceededException(
                    String.format("Monthly token quota exhausted: used %d, limit %d", current, maxTokens),
                    QuotaExceededException.QuotaType.MONTHLY_TOKENS);
        }
    }

    record QuotaConfig(int maxRpm, long maxDailyTokens, long maxMonthlyTokens) {}
}

Core implementation 5: Cost tracking
package com.laozhang.multitenant.cost;

import com.laozhang.multitenant.context.TenantContextHolder;
import lombok.RequiredArgsConstructor;
import lombok.extern.slf4j.Slf4j;
import org.springframework.ai.chat.metadata.Usage;
import org.springframework.ai.chat.model.ChatResponse;
import org.springframework.scheduling.annotation.Async;
import org.springframework.stereotype.Service;

import java.math.BigDecimal;
import java.math.RoundingMode;
import java.time.LocalDateTime;

// CostRecord, CostRecordRepository, ModelPricing and ModelPricingService are application classes (not shown).
@Slf4j
@Service
@RequiredArgsConstructor
public class TenantCostTracker {
    private final CostRecordRepository costRecordRepository;
    private final ModelPricingService modelPricingService;

    /**
     * Runs on the "costTrackingExecutor" pool. That executor must propagate the tenant
     * context (see TenantAwareThreadPoolExecutor), otherwise the lookups below fail.
     */
    @Async("costTrackingExecutor")
    public void recordCost(ChatResponse response, String modelName) {
        try {
            String tenantId = TenantContextHolder.getTenantId();
            String userId = TenantContextHolder.getContext().getUserId();
            Usage usage = response.getMetadata().getUsage();
            if (usage == null) return;
            int promptTokens = (int) usage.getPromptTokens();
            int completionTokens = (int) usage.getCompletionTokens();
            ModelPricing pricing = modelPricingService.getPricing(modelName);
            // cost = promptTokens * inputPrice + completionTokens * outputPrice
            BigDecimal totalCost = pricing.inputPricePerToken()
                    .multiply(BigDecimal.valueOf(promptTokens))
                    .add(pricing.outputPricePerToken().multiply(BigDecimal.valueOf(completionTokens)))
                    .setScale(8, RoundingMode.HALF_UP);
            costRecordRepository.save(CostRecord.builder()
                    .tenantId(tenantId)
                    .userId(userId)
                    .modelName(modelName)
                    .promptTokens(promptTokens)
                    .completionTokens(completionTokens)
                    .totalTokens(promptTokens + completionTokens)
                    .costUsd(totalCost)
                    .recordedAt(LocalDateTime.now())
                    .build());
        } catch (Exception e) {
            // Cost tracking must never break the user-facing call.
            log.error("Failed to record cost", e);
        }
    }
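}

// CostTrackingAdvisor.java
import lombok.RequiredArgsConstructor;
import org.springframework.ai.chat.client.advisor.api.AdvisedRequest;
import org.springframework.ai.chat.client.advisor.api.AdvisedResponse;
import org.springframework.ai.chat.client.advisor.api.CallAroundAdvisor;
import org.springframework.ai.chat.client.advisor.api.CallAroundAdvisorChain;
import org.springframework.core.Ordered;

/**
 * Records cost after each non-streaming call.
 * CallAroundAdvisor is the advisor API of the Spring AI 1.0 milestone releases;
 * later versions replace it with CallAdvisor.
 */
@RequiredArgsConstructor
public class CostTrackingAdvisor implements CallAroundAdvisor {
    private final TenantCostTracker costTracker;
    private final String modelName;

    @Override
    public AdvisedResponse aroundCall(AdvisedRequest request, CallAroundAdvisorChain chain) {
        AdvisedResponse response = chain.nextAroundCall(request);
        costTracker.recordCost(response.response(), modelName);
        return response;
    }

    @Override public String getName() { return "CostTrackingAdvisor"; }

    // Lowest precedence, so the advisor sits closest to the model call.
    @Override public int getOrder() { return Ordered.LOWEST_PRECEDENCE; }
}

Core implementation 6: Multi-tenant RAG service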
package com.laozhang.multitenant.service;

import com.laozhang.multitenant.ai.TenantAwareChatClientFactory;
import com.laozhang.multitenant.ai.TenantAwareVectorStore;
import com.laozhang.multitenant.context.TenantContextHolder;
import com.laozhang.multitenant.cost.TenantCostTracker;
import com.laozhang.multitenant.entity.TenantAiConfig;
import com.laozhang.multitenant.quota.TenantQuotaManager;
import lombok.RequiredArgsConstructor;
import lombok.extern.slf4j.Slf4j;
import org.springframework.ai.chat.client.ChatClient;
import org.springframework.ai.document.Document;
import org.springframework.stereotype.Service;
import reactor.core.publisher.Flux;

import java.util.List;

@Slf4j
@Service
@RequiredArgsConstructor
public class MultiTenantRagService {
    private static final String PROMPT_TEMPLATE = "Based on this context: {context}, answer: {question}";

    private final TenantAwareChatClientFactory chatClientFactory;
    private final TenantAwareVectorStore vectorStore;
    private final TenantQuotaManager quotaManager;
    private final TenantCostTracker costTracker;

    public String ask(String question) {
        // Rough token estimate from the question length; real usage is recorded after the call.
        quotaManager.checkAndDecrease((int) (question.length() * 1.2));
        ChatClient chatClient = chatClientFactory.getChatClient();
        List<Document> relevantDocs = vectorStore.similaritySearch(question, 5);
        String context = buildContext(relevantDocs);
        return chatClient.prompt()
                .user(u -> u.text(PROMPT_TEMPLATE)
                        .param("context", context).param("question", question))
                .advisors(new CostTrackingAdvisor(costTracker, currentModelName()))
                .call()
                .content();
    }

    /** Streaming variant. Retrieval runs eagerly on the calling thread; cost is not tracked on this path. */
    public Flux<String> askStream(String question) {
        quotaManager.checkAndDecrease((int) (question.length() * 1.2));
        ChatClient chatClient = chatClientFactory.getChatClient();
        List<Document> docs = vectorStore.similaritySearch(question, 5);
        String context = buildContext(docs);
        return chatClient.prompt()
                .user(u -> u.text(PROMPT_TEMPLATE)
                        .param("context", context).param("question", question))
                .stream().content();
    }

    public void indexDocuments(List<Document> documents) {
        vectorStore.add(documents);
    }

    /** Tolerates a context without AI config, e.g. one built by hand in tests. */
    private String currentModelName() {
        TenantAiConfig config = TenantContextHolder.getContext().getAiConfig();
        return config != null && config.getModelName() != null ? config.getModelName() : "unknown";
    }

    private String buildContext(List<Document> docs) {
        if (docs.isEmpty()) return "No relevant knowledge base content found";
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < docs.size(); i++) {
            sb.append("[Document ").append(i + 1).append("]\n").append(docs.get(i).getText()).append("\n\n");
        }
        return sb.toString();
    }
}

Core implementation 7: Passing tenant context into async code
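The pattern in this section is: capture the tenant on the submitting thread, bind it on the worker thread for the duration of the task, then put the worker back the way it was. Here is that pattern reduced to plain JDK code. `CurrentTenant` is a hypothetical stand-in for the holder class and stores only a tenant ID string, so the sketch runs without Spring.

```java
final class CurrentTenant {
    private static final ThreadLocal<String> HOLDER = new ThreadLocal<>();

    private CurrentTenant() {}

    static void set(String tenantId) { HOLDER.set(tenantId); }

    static String get() { return HOLDER.get(); }

    static void clear() { HOLDER.remove(); }

    /**
     * Wraps a task so that it runs with the tenant that was current when
     * propagating(...) was called, whichever thread ends up executing it.
     */
    static Runnable propagating(Runnable task) {
        String captured = HOLDER.get();              // evaluated on the submitting thread
        return () -> {
            String previous = HOLDER.get();          // evaluated on the worker thread
            if (captured == null) HOLDER.remove(); else HOLDER.set(captured);
            try {
                task.run();
            } finally {
                // Restore instead of blindly clearing, so a task that runs on the
                // caller's own thread does not wipe the caller's context.
                if (previous == null) HOLDER.remove(); else HOLDER.set(previous);
            }
        };
    }
}
```

Usage is `executor.execute(CurrentTenant.propagating(() -> doWork()))`. The wrapping has to happen at submission time; wrapping inside the worker would capture the worker's context, not the caller's.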
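package com.laozhang.multitenant.async;

import com.laozhang.multitenant.context.TenantContext;
import com.laozhang.multitenant.context.TenantContextHolder;
import lombok.extern.slf4j.Slf4j;
import org.springframework.scheduling.concurrent.ThreadPoolTaskExecutor;

import java.util.concurrent.Callable;
import java.util.concurrent.Future;

/** Executor that copies the submitter's tenant context onto the worker thread for each task. */
@Slf4j
public class TenantAwareThreadPoolExecutor extends ThreadPoolTaskExecutor {
    @Override
    public void execute(Runnable task) { super.execute(wrap(task)); }

    @Override
    public Future<?> submit(Runnable task) { return super.submit(wrap(task)); }

    @Override
    public <T> Future<T> submit(Callable<T> task) { return super.submit(wrapCallable(task)); }

    private Runnable wrap(Runnable task) {
        // Captured on the submitting thread, at submission time.
        TenantContext captured = TenantContextHolder.getContextOptional().orElse(null);
        return () -> {
            TenantContext previous = TenantContextHolder.getContextOptional().orElse(null);
            try {
                bind(captured);
                task.run();
            } finally {
                bind(previous);
            }
        };
    }

    private <T> Callable<T> wrapCallable(Callable<T> task) {
        TenantContext captured = TenantContextHolder.getContextOptional().orElse(null);
        return () -> {
            TenantContext previous = TenantContextHolder.getContextOptional().orElse(null);
            try {
                bind(captured);
                return task.call();
            } finally {
                bind(previous);
            }
        };
    }

    /**
     * Sets the given context, or clears the holder when it is null. Restoring the previous
     * value (rather than always clearing) keeps the caller's context intact if the task
     * ends up running on the caller's own thread, e.g. under CallerRunsPolicy.
     */
    private static void bind(TenantContext context) {
        if (context != null) TenantContextHolder.setContext(context);
        else TenantContextHolder.clear();
    }
}

Core implementation 8: Admin configuration API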
package com.laozhang.multitenant.controller;

import io.swagger.v3.oas.annotations.Operation;
import jakarta.validation.Valid;
import lombok.RequiredArgsConstructor;
import org.springframework.format.annotation.DateTimeFormat;
import org.springframework.http.ResponseEntity;
import org.springframework.security.access.prepost.PreAuthorize;
import org.springframework.web.bind.annotation.*;

import java.time.LocalDateTime;
import java.util.List;

// TenantAdminService and the DTO/request types are application classes that this article does not show.
@RestController
@RequestMapping("/admin/tenants")
@RequiredArgsConstructor
@PreAuthorize("hasRole('SYSTEM_ADMIN')")
public class TenantAdminController {
    private final TenantAdminService adminService;

    @GetMapping
    @Operation(summary = "List all tenants")
    public ResponseEntity<List<TenantSummaryDTO>> listTenants(
            @RequestParam(defaultValue = "0") int page,
            @RequestParam(defaultValue = "20") int size) {
        return ResponseEntity.ok(adminService.listTenants(page, size));
    }

    @GetMapping("/{tenantId}/ai-config")
    @Operation(summary = "Get a tenant's AI configuration")
    public ResponseEntity<TenantAiConfigDTO> getAiConfig(@PathVariable String tenantId) {
        return ResponseEntity.ok(TenantAiConfigDTO.fromEntity(adminService.getAiConfig(tenantId)));
    }

    @PutMapping("/{tenantId}/ai-config")
    @Operation(summary = "Update a tenant's AI configuration")
    public ResponseEntity<Void> updateAiConfig(@PathVariable String tenantId,
            @Valid @RequestBody UpdateAiConfigRequest request) {
        adminService.updateAiConfig(tenantId, request);
        return ResponseEntity.ok().build();
    }

    @GetMapping("/{tenantId}/cost")
    @Operation(summary = "Query a tenant's cost summary")
    public ResponseEntity<CostSummary> getTenantCost(@PathVariable String tenantId,
            @RequestParam @DateTimeFormat(iso = DateTimeFormat.ISO.DATE_TIME) LocalDateTime start,
            @RequestParam @DateTimeFormat(iso = DateTimeFormat.ISO.DATE_TIME) LocalDateTime end) {
        return ResponseEntity.ok(adminService.getCostSummary(tenantId, start, end));
    }

    @PostMapping("/{tenantId}/suspend")
    @Operation(summary = "Suspend a tenant's AI service")
    public ResponseEntity<Void> suspendTenant(@PathVariable String tenantId,
            @RequestParam String reason) {
        adminService.suspendTenant(tenantId, reason);
        return ResponseEntity.ok().build();
    }

    @PostMapping("/{tenantId}/resume")
    @Operation(summary = "Resume a tenant's AI service")
    public ResponseEntity<Void> resumeTenant(@PathVariable String tenantId) {
        adminService.resumeTenant(tenantId);
        return ResponseEntity.ok().build();
    }

    @GetMapping("/cost-ranking")
    @Operation(summary = "Cost ranking across all tenants (current month)")
    public ResponseEntity<List<TenantCostRankingDTO>> getCostRanking(
            @RequestParam(defaultValue = "20") int limit) {
        return ResponseEntity.ok(adminService.getCostRanking(limit));
    }
}

Production notes
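Monitoring metrics
management:
  metrics:
    tags:
      application: multi-tenant-ai
  endpoints:
    web:
      exposure:
        include: health, metrics, prometheus

import io.micrometer.core.instrument.Counter;
import io.micrometer.core.instrument.MeterRegistry;
import io.micrometer.core.instrument.Timer;
import lombok.RequiredArgsConstructor;
import org.springframework.stereotype.Component;

@Component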
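@RequiredArgsConstructor
public class TenantMetricsRecorder {
    private final MeterRegistry meterRegistry;

    /**
     * Records one AI request. Tagging by tenant_id creates one time series per tenant
     * and model, so watch metric cardinality as the tenant count grows.
     */
    public void recordRequest(String tenantId, String model, long latencyMs, boolean success) {
        Counter.builder("ai.requests.total")
                .tag("tenant_id", tenantId).tag("model", model)
                .tag("success", String.valueOf(success))
                .register(meterRegistry).increment();
        Timer.builder("ai.request.latency")
                .tag("tenant_id", tenantId).tag("model", model)
                .register(meterRegistry)
                .record(latencyMs, java.util.concurrent.TimeUnit.MILLISECONDS);
    }
}

Pitfall 1: The InheritableThreadLocal thread-pool trap. Pool threads are reused, and InheritableThreadLocal copies the value only when a thread is created, so its automatic hand-off is unreliable in a pool. Pass the context explicitly with TenantAwareThreadPoolExecutor.
Pitfall 2: A missing filter on the vector database query. The search filter is the key defence against leaks. Add a second check that walks the results and drops any document that does not belong to the current tenant.
Pitfall 3: Stale tenant configuration in the cache. After a configuration change, evict the cached ChatClient explicitly: cacheManager.getCache("tenant_chat_client").evict(tenantId).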
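Pitfall 1 is easy to reproduce with nothing but the JDK. In the sketch below (`InheritablePitfallDemo` is a hypothetical name), the pool's single worker thread is created while the first tenant is bound, copies that value once, and keeps it for every later task.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

final class InheritablePitfallDemo {
    static final InheritableThreadLocal<String> TENANT = new InheritableThreadLocal<>();

    private InheritablePitfallDemo() {}

    /** Returns what the pooled worker sees while handling two "requests" from different tenants. */
    static String[] run() {
        ExecutorService pool = Executors.newFixedThreadPool(1);
        try {
            TENANT.set("tenant-a");
            // The worker thread is created for this first task and copies "tenant-a" at creation.
            String first = CompletableFuture.supplyAsync(TENANT::get, pool).join();

            TENANT.set("tenant-b");
            // The same worker is reused; nothing is copied again, so it still holds "tenant-a".
            String second = CompletableFuture.supplyAsync(TENANT::get, pool).join();

            return new String[] { first, second };
        } finally {
            TENANT.remove();
            pool.shutdown();
        }
    }
}
```

Both elements of the returned array are "tenant-a": the second task ran on behalf of tenant B but saw tenant A's identity. In a real system that is exactly the cross-tenant leak this article is about, which is why the context has to be set explicitly per task.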
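Multi-tenant isolation tests
import com.laozhang.multitenant.ai.TenantAwareVectorStore;
import com.laozhang.multitenant.context.TenantContext;
import com.laozhang.multitenant.context.TenantContextHolder;
import com.laozhang.multitenant.service.MultiTenantRagService;
import org.junit.jupiter.api.AfterEach;
import org.junit.jupiter.api.Test;
import org.springframework.ai.document.Document;
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.boot.test.context.SpringBootTest;

import java.util.List;
import java.util.concurrent.CompletableFuture;

import static org.assertj.core.api.Assertions.assertThat;

@SpringBootTest
class TenantIsolationTest {
    @Autowired
    private MultiTenantRagService ragService;

    // The service does not expose its vector store, so the wrapper is injected directly.
    @Autowired
    private TenantAwareVectorStore vectorStore;

    @AfterEach
    void cleanup() { TenantContextHolder.clear(); }

    @Test
    void testDocumentIsolation() {
        setTenantContext("tenant-a");
        ragService.indexDocuments(List.of(new Document("tenant-a-secret-content-12345")));
        // Sanity check: the owner can find the document, so an empty result below is meaningful.
        assertThat(vectorStore.similaritySearch("tenant-a-secret-content")).isNotEmpty();

        setTenantContext("tenant-b");
        List<Document> results = vectorStore.similaritySearch("tenant-a-secret-content");
        assertThat(results).isEmpty();
    }

    @Test
    void testConcurrentTenantIsolation() throws Exception {
        CompletableFuture<String> futureA =
                CompletableFuture.supplyAsync(() -> bindAndRead("tenant-concurrent-a", 100));
        CompletableFuture<String> futureB =
                CompletableFuture.supplyAsync(() -> bindAndRead("tenant-concurrent-b", 50));
        assertThat(futureA.get()).isEqualTo("tenant-concurrent-a");
        assertThat(futureB.get()).isEqualTo("tenant-concurrent-b");
    }

    /** Binds a tenant on the current thread, waits, reads it back, and always cleans up. */
    private String bindAndRead(String tenantId, long sleepMs) {
        setTenantContext(tenantId);
        try {
            Thread.sleep(sleepMs);
            return TenantContextHolder.getTenantId();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } finally {
            TenantContextHolder.clear();
        }
    }

    private void setTenantContext(String tenantId) {
        TenantContextHolder.setContext(TenantContext.builder()
                .tenantId(tenantId)
                .tenantName("Test tenant " + tenantId)
                .plan(TenantContext.TenantPlan.BASIC)
                .build());
    }
}

Frequently asked questions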
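Q1: Is row-level isolation really unsafe? A: Its safety depends entirely on the code being correct; one missing filter is a leak. For a SaaS that holds commercially sensitive information, use schema isolation at minimum.
Q2: Does a separate ChatClient per tenant use too much memory? A: In my experience, no. ChatClient is a lightweight object, and instances can share the underlying HTTP connection pool when they are built from a shared HTTP client. By my rough estimate, ChatClient instances for 1,000 tenants take about 200 MB, which is acceptable; measure it in your own deployment.
Q3: What if an API key leaks? A: API keys must be stored encrypted (AES-256). Tenants can rotate their key in the admin console at any time, and the cached client is evicted immediately after rotation. Set up alerts for abnormal call patterns.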
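For the encrypted storage mentioned in Q3, here is one way to do it with the JDK's built-in crypto. The article only specifies AES-256; the choice of GCM mode, the 12-byte random IV stored in front of the ciphertext, and the Base64 encoding are my assumptions, and `ApiKeyCipher` is a hypothetical name, not the ApiKeyDecryptor used earlier. Where the 32-byte master key comes from (a KMS, a vault, an environment secret) is outside this sketch.

```java
import javax.crypto.Cipher;
import javax.crypto.SecretKey;
import javax.crypto.spec.GCMParameterSpec;
import javax.crypto.spec.SecretKeySpec;
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.security.GeneralSecurityException;
import java.security.SecureRandom;
import java.util.Base64;

final class ApiKeyCipher {
    private static final int IV_BYTES = 12;    // commonly used IV length for GCM
    private static final int TAG_BITS = 128;   // authentication tag length

    private final SecretKey key;
    private final SecureRandom random = new SecureRandom();

    ApiKeyCipher(byte[] rawKey) {
        if (rawKey.length != 32) {
            throw new IllegalArgumentException("AES-256 needs a 32-byte key");
        }
        this.key = new SecretKeySpec(rawKey, "AES");
    }

    /** Returns Base64(iv || ciphertext || tag). A fresh random IV is used for every call. */
    String encrypt(String plaintext) {
        try {
            byte[] iv = new byte[IV_BYTES];
            random.nextBytes(iv);
            Cipher cipher = Cipher.getInstance("AES/GCM/NoPadding");
            cipher.init(Cipher.ENCRYPT_MODE, key, new GCMParameterSpec(TAG_BITS, iv));
            byte[] ciphertext = cipher.doFinal(plaintext.getBytes(StandardCharsets.UTF_8));
            byte[] out = ByteBuffer.allocate(iv.length + ciphertext.length).put(iv).put(ciphertext).array();
            return Base64.getEncoder().encodeToString(out);
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException("API key encryption failed", e);
        }
    }

    /** Reverses encrypt; fails if the stored value was tampered with or the key is wrong. */
    String decrypt(String encoded) {
        try {
            byte[] in = Base64.getDecoder().decode(encoded);
            Cipher cipher = Cipher.getInstance("AES/GCM/NoPadding");
            cipher.init(Cipher.DECRYPT_MODE, key, new GCMParameterSpec(TAG_BITS, in, 0, IV_BYTES));
            byte[] plaintext = cipher.doFinal(in, IV_BYTES, in.length - IV_BYTES);
            return new String(plaintext, StandardCharsets.UTF_8);
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException("API key decryption failed", e);
        }
    }
}
```

Because the IV is random, encrypting the same key twice produces different stored values, so identical API keys across tenants are not visible in the database.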
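Q4: Is cost tracking accurate enough? A: Token counts come from the usage field the AI API actually returns, not from estimates, so they match what the provider reports. The cost figure is then only as accurate as the price table you maintain.
Q5: With thousands of tenants, will the number of vector DB collections hurt performance? A: It can. As a rule of thumb, keep a single Qdrant node under about 1,000 collections. Beyond 1,000 tenants, evaluate metadata filtering in shared collections, or split tenants across clusters by size.
Q6: How do you pass TenantContext in WebFlux? A: ThreadLocal does not work in WebFlux, because a request can hop between threads. Use the Reactor Context instead: Mono.deferContextual(ctx -> ...).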
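Summary
Chen Wei's story makes the point: tenant isolation is the baseline for going live, not an optional extra.
Action checklist:
Getting the code right is what makes you a responsible engineer.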
