Feature Flags for AI Applications: Fine-Grained Control over AI Feature Releases
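The Afternoon the CEO Pulled the Plug
On December 20, 2024, 12 days before New Year's Day, Liu Qiang, technical director at a consumer finance company, was sitting in a routine meeting.
Then the CEO called:
"That AI loan approval assistant of yours: shut it down, now!"
It turned out that when a user asked why their application had been rejected, the company's AI assistant gave a poorly worded reply. The answer was not wrong, but its tone was cold. In a regulation-sensitive financial setting that was enough: the user posted a screenshot on social media and set off a small wave of public backlash.
"Shut it down now" is easy to say. Doing it is another matter.
Liu Qiang called the tech lead:
"Turn off the AI assistant. Now!"
"OK, I'm on it..."
Thirty seconds of silence.
"Mr. Liu, the AI assistant is written directly into the main business flow. There is no switch. To turn it off we have to change the code, build, test, and deploy. Going through the whole process takes, at the fastest... two hours."
"Two hours?! The backlash is still growing. What good is two hours?"
In the end, the team carried out the riskiest kind of production operation: they edited the nginx configuration by hand and pointed every AI-assistant route at a static "under maintenance" page. The operation was risky in itself, and its blast radius extended to other features that shared those routes.
After the postmortem, the team spent two weeks adding feature flags to every AI feature.
That is the price of having no feature flags: at the moment you most need to react quickly, your hands are tied.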
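Feature Flags: Decoupling Release from Deployment
The core value of a feature flag (also called a feature toggle or feature switch) fits in one sentence:
Completely separate "deploying code" from "releasing a feature".
Typical uses of feature flags in AI applications:
- Model switching: quietly move 5% of traffic from gpt-4o-mini to gpt-4o and see whether the quality gain is worth the cost
- New prompt versions: validate an optimized prompt on a slice of traffic without affecting all users
- New AI features: open them to VIP users first, collect feedback, then roll out to everyone
- Regional compliance: turn off AI features dynamically in regions where regulation restricts them
- Cost control: downgrade automatically to a cheaper model when the token budget is exceeded
- Emergency shutdown: as in the CEO scenario, one click to turn everything off, taking effect within seconds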
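The last two items in the list (cost control and emergency shutdown) can be sketched without any flag platform. The class below is a minimal illustration, not code from any SDK: `ModelRouter`, its method names, and the daily token budget are all assumptions made for this example. It shows the order of precedence that the rest of this article also follows: kill switch first, budget second, rollout flag last.

```java
import java.util.Optional;
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.concurrent.atomic.AtomicLong;

/** Hypothetical router: every name here is illustrative, not part of a real SDK. */
final class ModelRouter {
    private final AtomicBoolean killSwitch = new AtomicBoolean(false);
    private final AtomicLong tokensUsedToday = new AtomicLong();
    private final long dailyTokenBudget;

    ModelRouter(long dailyTokenBudget) {
        this.dailyTokenBudget = dailyTokenBudget;
    }

    /** Flipped by operations at runtime; no redeploy involved. */
    void setKillSwitch(boolean on) {
        killSwitch.set(on);
    }

    /** Called after each model response with the tokens it consumed. */
    void recordUsage(long tokens) {
        tokensUsedToday.addAndGet(tokens);
    }

    /**
     * Picks the model for one request.
     * An empty result means "AI is off, serve the non-AI fallback".
     */
    Optional<String> resolveModel(boolean upgradeFlagOn) {
        if (killSwitch.get()) {
            return Optional.empty();              // emergency shutdown wins over everything
        }
        if (tokensUsedToday.get() >= dailyTokenBudget) {
            return Optional.of("gpt-4o-mini");    // over budget: downgrade to the cheaper model
        }
        return Optional.of(upgradeFlagOn ? "gpt-4o" : "gpt-4o-mini");
    }
}
```

In a real service the `upgradeFlagOn` argument would come from the flag platform, and the usage counter would be reset daily and shared across instances (for example in Redis) rather than held in one JVM.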
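Unleash Integration: Enterprise-Grade Feature Flags
Unleash is one of the most widely used open-source feature flag platforms. It supports multiple rollout strategies and ships with a full management UI.
Environment Setup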
# docker-compose.yml (development environment)
version: '3.8'
services:
  unleash-db:
    image: postgres:15
    environment:
      POSTGRES_USER: unleash
      POSTGRES_PASSWORD: unleash_password
      POSTGRES_DB: unleash
    volumes:
      - unleash_db_data:/var/lib/postgresql/data
  unleash:
    image: unleashorg/unleash-server:latest
    ports:
      - "4242:4242"
    environment:
      DATABASE_URL: "postgresql://unleash:unleash_password@unleash-db/unleash"
      # The local Postgres container has no TLS configured
      DATABASE_SSL: "false"
      INIT_CLIENT_API_TOKENS: "default:development.unleash-insecure-api-token"
    depends_on:
      - unleash-db
volumes:
  unleash_db_data:

Maven Dependencies
<dependencies>
    <!-- Unleash Java SDK -->
    <dependency>
        <groupId>io.getunleash</groupId>
        <artifactId>unleash-client-java</artifactId>
        <version>9.2.0</version>
    </dependency>
    <!-- Spring AI (the OpenAI starter is named spring-ai-starter-model-openai as of 1.0.0) -->
    <dependency>
        <groupId>org.springframework.ai</groupId>
        <artifactId>spring-ai-starter-model-openai</artifactId>
        <version>1.0.0</version>
    </dependency>
    <!-- Spring Boot Web -->
    <dependency>
        <groupId>org.springframework.boot</groupId>
        <artifactId>spring-boot-starter-web</artifactId>
    </dependency>
    <!-- Redis (used for custom feature flags) -->
    <dependency>
        <groupId>org.springframework.boot</groupId>
        <artifactId>spring-boot-starter-data-redis</artifactId>
    </dependency>
    <!-- FF4J (embedded option) -->
    <dependency>
        <groupId>org.ff4j</groupId>
        <artifactId>ff4j-spring-boot-starter</artifactId>
        <version>2.1.0</version>
    </dependency>
    <dependency>
        <groupId>org.ff4j</groupId>
        <artifactId>ff4j-store-redis</artifactId>
        <version>2.1.0</version>
    </dependency>
    <!-- Caffeine local cache -->
    <dependency>
        <groupId>com.github.ben-manes.caffeine</groupId>
        <artifactId>caffeine</artifactId>
        <version>3.1.8</version>
    </dependency>
</dependencies>

Unleash: Full Configuration and Usage
package com.laozhang.feature;

import io.getunleash.DefaultUnleash;
import io.getunleash.Unleash;
import io.getunleash.strategy.Strategy;
import io.getunleash.util.UnleashConfig;
import org.springframework.beans.factory.ObjectProvider;
import org.springframework.beans.factory.annotation.Value;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;

/**
 * Unleash configuration
 */
@Configuration
public class UnleashConfiguration {

    @Value("${unleash.url:http://localhost:4242/api}")
    private String unleashUrl;

    @Value("${unleash.api-token:default:development.unleash-insecure-api-token}")
    private String apiToken;

    @Value("${spring.application.name:ai-service}")
    private String appName;

    @Bean
    public Unleash unleash(ObjectProvider<Strategy> customStrategies) {
        UnleashConfig config = UnleashConfig.builder()
                .appName(appName)
                .instanceId(getInstanceId())
                .unleashAPI(unleashUrl)
                .apiKey(apiToken)
                // Pull the latest toggle configuration from the server every 10 seconds
                .fetchTogglesInterval(10)
                // Report usage metrics every 60 seconds
                .sendMetricsInterval(60)
                // Local backup file (used when the server is unreachable)
                .backupFile("/tmp/unleash-backup.json")
                .build();
        // Custom Strategy beans (UserTierStrategy etc.) must be handed to the client;
        // otherwise Unleash cannot evaluate flags that use them.
        return new DefaultUnleash(config,
                customStrategies.orderedStream().toArray(Strategy[]::new));
    }

    private String getInstanceId() {
        // Use the pod name / container ID as the instance ID for easier tracing
        String podName = System.getenv("HOSTNAME");
        return podName != null ? podName : "local-" + ProcessHandle.current().pid();
    }
}

package com.laozhang.feature;

import io.getunleash.Unleash;
import io.getunleash.UnleashContext;
import io.getunleash.Variant;
import org.springframework.stereotype.Service;

/**
 * AI feature flag service
 *
 * Wraps Unleash behind a semantic API so that business code does not depend on the Unleash SDK directly
 */
@Service
public class AiFeatureService {

    private final Unleash unleash;

    // ========== Feature flag name constants ==========
    // Naming convention: {module}.{feature}.{variant}

    /** Master switch for AI chat */
    public static final String AI_CHAT_ENABLED = "ai.chat.enabled";
    /** New prompt version (tuned for gpt-4o quality) */
    public static final String AI_CHAT_PROMPT_V2 = "ai.chat.prompt.v2";
    /** Model upgrade: gpt-4o-mini → gpt-4o */
    public static final String AI_MODEL_UPGRADE = "ai.model.gpt4o";
    /** RAG switch */
    public static final String AI_RAG_ENABLED = "ai.rag.enabled";
    /** Streaming responses (some clients do not support them) */
    public static final String AI_STREAMING_ENABLED = "ai.streaming.enabled";
    /** Speech-to-text (new feature in gradual rollout) */
    public static final String AI_VOICE_ENABLED = "ai.voice.enabled";
    /** Emergency shutdown: turn off all AI with one switch */
    public static final String AI_EMERGENCY_SHUTDOWN = "ai.emergency.shutdown";

    public AiFeatureService(Unleash unleash) {
        this.unleash = unleash;
    }

    /**
     * Checks the AI master switch.
     * The emergency shutdown is checked first, then the feature switch.
     */
    public boolean isAiEnabled(String userId, String sessionId) {
        // Emergency shutdown takes precedence
        if (isEmergencyShutdown()) {
            return false;
        }
        UnleashContext context = buildContext(userId, sessionId);
        return unleash.isEnabled(AI_CHAT_ENABLED, context, true); // on by default
    }

    /**
     * Checks whether the new prompt version should be used
     */
    public boolean usePromptV2(String userId, String sessionId) {
        if (!isAiEnabled(userId, sessionId)) return false;
        UnleashContext context = buildContext(userId, sessionId);
        return unleash.isEnabled(AI_CHAT_PROMPT_V2, context, false); // off by default
    }

    /**
     * Returns the model that should be used right now.
     * Supports A/B test variants.
     */
    public String getActiveModel(String userId, String sessionId) {
        UnleashContext context = buildContext(userId, sessionId);
        // Check the emergency downgrade first
        if (isEmergencyShutdown()) {
            return "gpt-4o-mini"; // fall back to the cheapest model
        }
        // Fetch the variant (A/B test)
        Variant variant = unleash.getVariant(AI_MODEL_UPGRADE, context);
        if (variant.isEnabled() && variant.getName() != null) {
            return switch (variant.getName()) {
                case "gpt-4o" -> "gpt-4o";
                case "gpt-4o-mini" -> "gpt-4o-mini";
                case "claude-3-5-sonnet" -> "claude-3-5-sonnet-20241022";
                default -> "gpt-4o-mini";
            };
        }
        return "gpt-4o-mini"; // default
    }

    /**
     * Checks the emergency shutdown state
     */
    public boolean isEmergencyShutdown() {
        // The emergency shutdown needs no context (it applies globally)
        return unleash.isEnabled(AI_EMERGENCY_SHUTDOWN, false);
    }

    /**
     * Checks whether a given feature is open to a given user
     */
    public boolean isFeatureEnabled(String featureName, String userId,
                                    String sessionId) {
        if (isEmergencyShutdown()) return false;
        UnleashContext context = buildContext(userId, sessionId);
        return unleash.isEnabled(featureName, context, false);
    }

    /**
     * Builds the Unleash context.
     * The context determines which rollout strategies apply.
     */
    private UnleashContext buildContext(String userId, String sessionId) {
        return UnleashContext.builder()
                .userId(userId)                 // for user-ID-based rollout strategies
                .sessionId(sessionId)           // for consistent-hash gradual rollout
                .remoteAddress(getCurrentIp())  // for IP restriction strategies
                .addProperty("userRole", getUserRole(userId))   // custom properties
                .addProperty("userRegion", getUserRegion(userId))
                .addProperty("userTier", getUserTier(userId))   // VIP / standard user
                .build();
    }

    private String getCurrentIp() {
        // In a real application, read this from RequestContextHolder
        return "127.0.0.1";
    }

    private String getUserRole(String userId) {
        // In a real application, query the user service
        return "user";
    }

    private String getUserRegion(String userId) {
        // In a real application, read this from the user profile
        return "mainland_china";
    }

    private String getUserTier(String userId) {
        // In a real application, query the membership service
        return "standard";
    }
}

FF4J: A Complete Embedded Feature Flag Implementation
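If you do not want to run a separate flag server, FF4J offers an embedded alternative that runs inside the Java process and, in this setup, stores its data in Redis.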
package com.laozhang.feature.ff4j;

import org.ff4j.FF4j;
import org.ff4j.redis.RedisContainerClientLettuceImpl;
import org.ff4j.store.redis.FeatureStoreRedis;
import org.ff4j.store.redis.PropertyStoreRedis;
import org.ff4j.web.FF4jWebServlet;
import org.springframework.beans.factory.annotation.Value;
import org.springframework.boot.web.servlet.ServletRegistrationBean;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;

/**
 * FF4J configuration.
 * Uses Redis as the storage backend, so it works in clustered deployments.
 */
@Configuration
public class FF4jConfiguration {

    @Value("${spring.data.redis.host:localhost}")
    private String redisHost;

    @Value("${spring.data.redis.port:6379}")
    private int redisPort;

    @Bean
    public FF4j ff4j() {
        // Redis connection
        RedisContainerClientLettuceImpl redisClient =
                new RedisContainerClientLettuceImpl(redisHost, redisPort);
        FF4j ff4j = new FF4j();
        // Store feature flags and properties in Redis
        ff4j.setFeatureStore(new FeatureStoreRedis(redisClient, "ff4j:features:"));
        ff4j.setPropertiesStore(new PropertyStoreRedis(redisClient, "ff4j:properties:"));
        // Enable auditing (records every feature flag change and usage event)
        ff4j.audit(true);
        // Create a feature automatically when it does not exist yet.
        // Convenient in development; consider turning it off in production.
        ff4j.autoCreate(true);
        return ff4j;
    }

    /**
     * The web console bundled with FF4J.
     * Available at /ff4j-web-console
     */
    @Bean
    public ServletRegistrationBean<FF4jWebServlet> ff4jWebServlet(FF4j ff4j) {
        FF4jWebServlet servlet = new FF4jWebServlet();
        servlet.setFf4j(ff4j);
        ServletRegistrationBean<FF4jWebServlet> reg =
                new ServletRegistrationBean<>(servlet, "/ff4j-web-console/*");
        reg.setName("FF4J Console");
        return reg;
    }

    /**
     * Initializes the default feature flags.
     * Production can initialize from a configuration file; development uses code.
     */
    @Bean
    public FF4jFeatureInitializer ff4jInitializer(FF4j ff4j) {
        return new FF4jFeatureInitializer(ff4j);
    }
}

package com.laozhang.feature.ff4j;

import org.ff4j.FF4j;
import org.ff4j.core.Feature;
import org.ff4j.strategy.PonderationStrategy;
import org.springframework.beans.factory.InitializingBean;

/**
 * FF4J feature initializer
 *
 * Creates the default value of every feature flag on first deployment.
 * Afterwards the values can be changed in the management console.
 */
public class FF4jFeatureInitializer implements InitializingBean {

    private final FF4j ff4j;

    public FF4jFeatureInitializer(FF4j ff4j) {
        this.ff4j = ff4j;
    }

    @Override
    public void afterPropertiesSet() {
        initAiChatFeature();
        initModelUpgradeFeature();
        initRagFeature();
        initEmergencyShutdown();
    }

    private void initAiChatFeature() {
        if (!ff4j.exist("ai.chat.enabled")) {
            Feature feature = new Feature("ai.chat.enabled");
            feature.setDescription("Master switch for AI chat");
            feature.enable(); // enabled by default
            ff4j.createFeature(feature);
        }
    }

    private void initModelUpgradeFeature() {
        if (!ff4j.exist("ai.model.upgrade")) {
            Feature feature = new Feature("ai.model.upgrade");
            feature.setDescription("Upgrade to a stronger model (gpt-4o)");
            feature.disable(); // disabled by default, must be turned on manually
            // Percentage-based rollout strategy: start at 5% of traffic
            feature.setFlippingStrategy(new PonderationStrategy(0.05));
            ff4j.createFeature(feature);
        }
    }

    private void initRagFeature() {
        if (!ff4j.exist("ai.rag.enabled")) {
            Feature feature = new Feature("ai.rag.enabled");
            feature.setDescription("RAG knowledge-base augmentation");
            feature.enable();
            ff4j.createFeature(feature);
        }
    }

    private void initEmergencyShutdown() {
        if (!ff4j.exist("ai.emergency.shutdown")) {
            Feature feature = new Feature("ai.emergency.shutdown");
            feature.setDescription("Emergency shutdown: turns off all AI features (dangerous!)");
            feature.disable(); // off by default, only turned on in an emergency
            ff4j.createFeature(feature);
        }
    }
}

Design Patterns for AI Feature Flags
Core Business Integration
package com.laozhang.ai.service;

import com.laozhang.feature.AiFeatureService;
import org.springframework.ai.chat.client.ChatClient;
import org.springframework.ai.openai.OpenAiChatOptions;
import org.springframework.stereotype.Service;

/**
 * AI chat service with feature flags integrated
 *
 * Design principles:
 * 1. Keep feature flag logic separate from business logic (handle it in the service layer, not the controller)
 * 2. Every feature has an explicit fallback strategy
 * 3. Flag state never affects system availability (only functionality)
 */
@Service
public class AiChatService {

    private final ChatClient.Builder chatClientBuilder;
    private final AiFeatureService featureService;
    private final PromptTemplateManager promptManager;

    public AiChatService(ChatClient.Builder chatClientBuilder,
                         AiFeatureService featureService,
                         PromptTemplateManager promptManager) {
        this.chatClientBuilder = chatClientBuilder;
        this.featureService = featureService;
        this.promptManager = promptManager;
    }

    /**
     * Main chat method
     *
     * Integrates four feature flags:
     * 1. Emergency shutdown check
     * 2. Model selection (A/B test)
     * 3. Prompt version selection
     * 4. RAG augmentation switch
     */
    public ChatResult chat(String userId, String sessionId, String userMessage) {
        // 1. Check the master switch (includes the emergency shutdown)
        if (!featureService.isAiEnabled(userId, sessionId)) {
            return ChatResult.fallback("The AI service is temporarily under maintenance. Please try again later.");
        }
        // 2. Pick the model (driven by feature flags)
        String model = featureService.getActiveModel(userId, sessionId);
        // 3. Pick the prompt version
        String systemPrompt = featureService.usePromptV2(userId, sessionId)
                ? promptManager.getPromptV2()
                : promptManager.getPromptV1();
        // 4. Build the ChatClient (from the current feature configuration)
        ChatClient chatClient = buildChatClient(model, userId, sessionId);
        // 5. Call the model
        try {
            ChatClient.ChatClientRequestSpec spec = chatClient.prompt()
                    .system(systemPrompt)
                    .user(userMessage);
            // 5a. Decide whether to enable RAG
            if (featureService.isFeatureEnabled("ai.rag.enabled", userId, sessionId)) {
                // RAG augmentation: inject relevant knowledge
                spec = spec.advisors(/* QuestionAnswerAdvisor */);
            }
            String response = spec.call().content();
            return ChatResult.success(response, model);
        } catch (Exception e) {
            // Model call failed: return the fallback response
            return ChatResult.fallback("Sorry, the AI assistant cannot respond right now. Please contact a human agent.");
        }
    }

    private ChatClient buildChatClient(String model, String userId, String sessionId) {
        // Spring AI 1.0.0 option builders use model()/temperature();
        // the withModel()/withTemperature() names belong to earlier milestone releases.
        return chatClientBuilder
                .defaultOptions(OpenAiChatOptions.builder()
                        .model(model)
                        .temperature(0.7)
                        .build())
                .build();
    }

    public record ChatResult(
            boolean success,
            String content,
            String modelUsed,
            boolean isFallback
    ) {
        static ChatResult success(String content, String model) {
            return new ChatResult(true, content, model, false);
        }

        static ChatResult fallback(String message) {
            return new ChatResult(false, message, null, true);
        }
    }
}

User-Group Control: VIP Users Go First
package com.laozhang.feature.strategy;

import com.laozhang.user.UserService;
import io.getunleash.strategy.Strategy;
import org.springframework.stereotype.Component;

import java.util.*;
import java.util.stream.Collectors;

/**
 * Custom Unleash rollout strategy based on user tier
 *
 * Use case: release a new AI feature to VIP users first, then widen to everyone a week later
 *
 * Configuration in the Unleash management UI:
 *   Strategy: UserTierStrategy
 *   Parameters: allowedTiers = vip,premium
 */
@Component
public class UserTierStrategy implements Strategy {

    private static final String STRATEGY_NAME = "UserTierStrategy";
    private static final String PARAM_ALLOWED_TIERS = "allowedTiers";

    private final UserService userService;

    public UserTierStrategy(UserService userService) {
        this.userService = userService;
    }

    @Override
    public String getName() {
        return STRATEGY_NAME;
    }

    @Override
    public boolean isEnabled(Map<String, String> parameters,
                             io.getunleash.UnleashContext context) {
        String allowedTiersParam = parameters.get(PARAM_ALLOWED_TIERS);
        if (allowedTiersParam == null || allowedTiersParam.isBlank()) {
            return false;
        }
        // Normalize both sides so "VIP, Premium" and "vip,premium" behave the same
        Set<String> allowedTiers = Arrays.stream(allowedTiersParam.split(","))
                .map(tier -> tier.trim().toLowerCase(Locale.ROOT))
                .collect(Collectors.toSet());
        String userId = context.getUserId().orElse(null);
        if (userId == null) return false;
        // Look up the user's tier
        String userTier = userService.getUserTier(userId);
        return userTier != null
                && allowedTiers.contains(userTier.toLowerCase(Locale.ROOT));
    }
}

package com.laozhang.feature.strategy;

import io.getunleash.strategy.Strategy;
import org.springframework.stereotype.Component;

import java.util.Map;

/**
 * Custom strategy: tell new users from existing ones by registration date
 *
 * Use cases:
 * - New users get the new AI experience by default (improves retention)
 * - Existing users keep the experience they know (less migration friction)
 */
@Component
public class NewUserStrategy implements Strategy {

    private static final String PARAM_DAYS_THRESHOLD = "registeredWithinDays";

    private final com.laozhang.user.UserService userService;

    public NewUserStrategy(com.laozhang.user.UserService userService) {
        this.userService = userService;
    }

    @Override
    public String getName() {
        return "NewUserStrategy";
    }

    @Override
    public boolean isEnabled(Map<String, String> parameters,
                             io.getunleash.UnleashContext context) {
        String daysParam = parameters.getOrDefault(PARAM_DAYS_THRESHOLD, "30");
        int daysThreshold;
        try {
            daysThreshold = Integer.parseInt(daysParam.trim());
        } catch (NumberFormatException e) {
            return false; // a malformed parameter must not break flag evaluation
        }
        String userId = context.getUserId().orElse(null);
        if (userId == null) return false;
        long registrationDaysAgo = userService.getDaysSinceRegistration(userId);
        return registrationDaysAgo <= daysThreshold;
    }
}

Region-Based Control: Turning Off AI Features in Compliance-Sensitive Regions
package com.laozhang.feature.strategy;

import io.getunleash.strategy.Strategy;
import io.getunleash.UnleashContext;
import org.springframework.stereotype.Component;

import java.util.*;

/**
 * Regional compliance strategy
 *
 * Use cases:
 * 1. GDPR compliance: AI features in the EU need an additional consent flow
 * 2. Financial regulation: some AI-assisted decision features are restricted in certain provinces
 * 3. Content compliance: features such as image generation need review in some regions
 *
 * Unleash configuration:
 *   Strategy: RegionComplianceStrategy
 *   Parameters:
 *     blockedRegions = eu,hk,tw
 *     requireConsentRegions = guangdong,beijing
 */
@Component
public class RegionComplianceStrategy implements Strategy {

    private static final String PARAM_BLOCKED_REGIONS = "blockedRegions";
    private static final String PARAM_REQUIRE_CONSENT = "requireConsentRegions";

    @Override
    public String getName() {
        return "RegionComplianceStrategy";
    }

    @Override
    public boolean isEnabled(Map<String, String> parameters,
                             UnleashContext context) {
        String userRegion = context.getProperties()
                .getOrDefault("userRegion", "unknown").trim().toLowerCase(Locale.ROOT);
        // 1. Regions that are blocked outright
        String blockedRegionsParam = parameters.get(PARAM_BLOCKED_REGIONS);
        if (blockedRegionsParam != null
                && parseRegions(blockedRegionsParam).contains(userRegion)) {
            return false; // reject immediately
        }
        // 2. Regions that require user consent
        String requireConsentParam = parameters.get(PARAM_REQUIRE_CONSENT);
        if (requireConsentParam != null
                && parseRegions(requireConsentParam).contains(userRegion)) {
            // Additionally check that the user has accepted the privacy policy
            if (context.getUserId().isEmpty()) {
                return false;
            }
            // The consent state is read from a context property (injected by the application layer)
            String consent = context.getProperties()
                    .getOrDefault("aiPrivacyConsent", "false");
            return "true".equals(consent);
        }
        return true; // all other regions are allowed by default
    }

    /** Splits a comma-separated list, trimming and lower-casing each entry. */
    private static Set<String> parseRegions(String csv) {
        Set<String> regions = new HashSet<>();
        for (String region : csv.split(",")) {
            String normalized = region.trim().toLowerCase(Locale.ROOT);
            if (!normalized.isEmpty()) regions.add(normalized);
        }
        return regions;
    }
}

Percentage-Based Control: Consistent-Hash Gradual Rollout
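Before the Unleash strategy class, the bucketing idea itself can be sketched without any SDK. The snippet is an illustration under stated assumptions: `RolloutBucket`, `bucketOf`, and `inRollout` are names invented here, and mixing the flag name into the hash is a design choice of this sketch (it keeps the same users from landing in the first 5% of every flag), not something the Unleash API prescribes.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

/** Hypothetical helper: deterministic percentage bucketing for gradual rollouts. */
final class RolloutBucket {
    private RolloutBucket() {}

    /** Maps (flagName, stableId) to a bucket in the range [0, 100). */
    static int bucketOf(String flagName, String stableId) {
        try {
            MessageDigest md = MessageDigest.getInstance("MD5");
            byte[] hash = md.digest((flagName + ":" + stableId).getBytes(StandardCharsets.UTF_8));
            // Take the first 4 bytes as a (possibly negative) int
            int value = ((hash[0] & 0xFF) << 24)
                    | ((hash[1] & 0xFF) << 16)
                    | ((hash[2] & 0xFF) << 8)
                    | (hash[3] & 0xFF);
            // floorMod never returns a negative bucket, unlike Math.abs(value) % 100,
            // which misbehaves for Integer.MIN_VALUE
            return Math.floorMod(value, 100);
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("MD5 should be available on every Java platform", e);
        }
    }

    /** True when the id falls inside the first `percentage` buckets. */
    static boolean inRollout(String flagName, String stableId, int percentage) {
        if (percentage <= 0) return false;
        if (percentage >= 100) return true;
        return bucketOf(flagName, stableId) < percentage;
    }
}
```

Because a user's bucket never changes, raising the percentage from 5 to 20 only adds users; nobody who already had the feature loses it.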
package com.laozhang.feature.strategy;

import io.getunleash.strategy.Strategy;
import io.getunleash.UnleashContext;
import org.springframework.stereotype.Component;

import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.util.Map;

/**
 * Consistent-hash gradual rollout strategy
 *
 * Key property:
 *   The same user gets the same result on every request (consistency).
 *   This avoids an inconsistent experience (new feature on one request, old one on the next).
 *
 * How it works:
 *   hash(identifier) % 100 < percentage → enabled
 *   The feature name is not part of the hash here, so the same users fall into
 *   the early buckets of every flag that uses this strategy.
 *
 * Usage (configured in the Unleash management UI):
 *   Strategy: ConsistentHashGradual
 *   Parameters: percentage = 5 (5% rollout)
 */
@Component
public class ConsistentHashGradualStrategy implements Strategy {

    private static final String PARAM_PERCENTAGE = "percentage";

    @Override
    public String getName() {
        return "ConsistentHashGradual";
    }

    @Override
    public boolean isEnabled(Map<String, String> parameters,
                             UnleashContext context) {
        String percentageStr = parameters.getOrDefault(PARAM_PERCENTAGE, "0");
        int percentage;
        try {
            percentage = Integer.parseInt(percentageStr.trim());
        } catch (NumberFormatException e) {
            return false;
        }
        if (percentage <= 0) return false;
        if (percentage >= 100) return true;
        // Identifier to hash: prefer userId so a user stays in the same bucket
        // across sessions; fall back to sessionId for anonymous traffic
        String identifier = context.getUserId()
                .orElseGet(() -> context.getSessionId().orElse(null));
        if (identifier == null) return false;
        // Compute the bucket
        int hashValue = consistentHash(identifier);
        return (hashValue % 100) < percentage;
    }

    /**
     * MD5-based hash, always non-negative.
     * MD5 is used only for its even distribution, not for security;
     * a faster non-cryptographic hash such as MurmurHash3 would also work.
     */
    private int consistentHash(String input) {
        try {
            MessageDigest md = MessageDigest.getInstance("MD5");
            byte[] hash = md.digest(input.getBytes(StandardCharsets.UTF_8));
            // Take the first 4 bytes as an int
            int result = ((hash[0] & 0xFF) << 24) |
                    ((hash[1] & 0xFF) << 16) |
                    ((hash[2] & 0xFF) << 8) |
                    (hash[3] & 0xFF);
            // Clear the sign bit; Math.abs(Integer.MIN_VALUE) would stay negative
            return result & Integer.MAX_VALUE;
        } catch (Exception e) {
            return input.hashCode() & Integer.MAX_VALUE;
        }
    }
}

Emergency Shutdown: Turning Off All AI Features with One Switch
package com.laozhang.feature.emergency;

import io.getunleash.Unleash;
import org.springframework.http.ResponseEntity;
import org.springframework.security.access.prepost.PreAuthorize;
import org.springframework.web.bind.annotation.*;

import java.time.Instant;

/**
 * Operations endpoint for emergency shutdown
 *
 * Capabilities:
 * 1. Turn off all AI features with one call (the answer to the Liu Qiang incident)
 * 2. Tiered shutdown (everything / per service / per feature)
 * 3. Scheduled automatic recovery (so nobody forgets to turn things back on)
 * 4. Recorded shutdown reason (for the postmortem)
 *
 * Security: only the ROLE_OPS role may call these endpoints
 */
@RestController
@RequestMapping("/ops/ai/emergency")
public class EmergencyShutdownController {

    private final Unleash unleash;
    private final EmergencyShutdownService shutdownService;
    private final AuditLogger auditLogger;

    public EmergencyShutdownController(Unleash unleash,
                                       EmergencyShutdownService shutdownService,
                                       AuditLogger auditLogger) {
        this.unleash = unleash;
        this.shutdownService = shutdownService;
        this.auditLogger = auditLogger;
    }

    /**
     * Shuts down all AI features
     *
     * POST /ops/ai/emergency/shutdown
     * {
     *   "reason": "Requested by the CEO, handling public backlash",
     *   "operator": "liuqiang",
     *   "autoRecoverAfterMinutes": 120
     * }
     */
    @PostMapping("/shutdown")
    @PreAuthorize("hasRole('ROLE_OPS')")
    public ResponseEntity<ShutdownResponse> emergencyShutdown(
            @RequestBody ShutdownRequest request) {
        // 1. Perform the shutdown
        shutdownService.shutdown(request.reason());
        // 2. Write the audit log
        auditLogger.log(AuditEvent.builder()
                .action("EMERGENCY_SHUTDOWN")
                .operator(request.operator())
                .reason(request.reason())
                .timestamp(Instant.now())
                .build());
        // 3. Send the alert notification
        shutdownService.notifyTeam(request.reason(), request.operator());
        // 4. Schedule automatic recovery (if requested)
        if (request.autoRecoverAfterMinutes() > 0) {
            shutdownService.scheduleAutoRecover(request.autoRecoverAfterMinutes());
        }
        return ResponseEntity.ok(new ShutdownResponse(
                true,
                "AI features are shut down, current time: " + Instant.now(),
                request.autoRecoverAfterMinutes() > 0
                        ? "Will recover automatically in " + request.autoRecoverAfterMinutes() + " minutes"
                        : "Manual recovery required"
        ));
    }

    /**
     * Restores AI features
     */
    @PostMapping("/recover")
    @PreAuthorize("hasRole('ROLE_OPS')")
    public ResponseEntity<ShutdownResponse> recover(
            @RequestBody RecoverRequest request) {
        shutdownService.recover();
        auditLogger.log(AuditEvent.builder()
                .action("EMERGENCY_RECOVER")
                .operator(request.operator())
                .reason(request.reason())
                .timestamp(Instant.now())
                .build());
        return ResponseEntity.ok(new ShutdownResponse(
                true,
                "AI features are restored",
                null
        ));
    }

    /**
     * Returns the current shutdown status
     */
    @GetMapping("/status")
    @PreAuthorize("hasAnyRole('ROLE_OPS', 'ROLE_DEV')")
    public ResponseEntity<ShutdownStatus> getStatus() {
        return ResponseEntity.ok(shutdownService.getCurrentStatus());
    }

    /**
     * Fine-grained shutdown: turn off one specific feature only
     *
     * POST /ops/ai/emergency/shutdown-feature
     * {
     *   "featureName": "ai.voice.enabled",
     *   "reason": "Quality problems in the voice feature"
     * }
     */
    @PostMapping("/shutdown-feature")
    @PreAuthorize("hasRole('ROLE_OPS')")
    public ResponseEntity<ShutdownResponse> shutdownSpecificFeature(
            @RequestBody FeatureShutdownRequest request) {
        shutdownService.shutdownFeature(request.featureName(), request.reason());
        return ResponseEntity.ok(new ShutdownResponse(
                true,
                "Feature [" + request.featureName() + "] is shut down",
                null
        ));
    }

    // Request/response DTOs
    public record ShutdownRequest(
            String reason,
            String operator,
            int autoRecoverAfterMinutes
    ) {}

    public record RecoverRequest(
            String operator,
            String reason
    ) {}

    public record FeatureShutdownRequest(
            String featureName,
            String reason
    ) {}

    public record ShutdownResponse(
            boolean success,
            String message,
            String note
    ) {}

    public record ShutdownStatus(
            boolean isShutdown,
            String shutdownReason,
            Instant shutdownTime,
            String shutdownBy,
            Instant scheduledRecoveryTime
    ) {}
}

package com.laozhang.feature.emergency;

import io.getunleash.Unleash;
import org.springframework.scheduling.TaskScheduler;
import org.springframework.stereotype.Service;

import java.time.Instant;
import java.util.concurrent.ScheduledFuture;
import java.util.concurrent.atomic.AtomicReference;

/**
 * Emergency shutdown service
 */
@Service
public class EmergencyShutdownService {

    private static final String EMERGENCY_FLAG = "ai.emergency.shutdown";

    private final Unleash unleash;
    private final TaskScheduler taskScheduler;
    private final DingTalkNotifier notifier;

    // Shutdown state (in memory, for status queries only; the real switch lives in Unleash)
    private final AtomicReference<ShutdownState> currentState =
            new AtomicReference<>(ShutdownState.NORMAL);
    private ScheduledFuture<?> autoRecoverTask;
    private volatile Instant scheduledRecoveryTime;

    public EmergencyShutdownService(Unleash unleash,
                                    TaskScheduler taskScheduler,
                                    DingTalkNotifier notifier) {
        this.unleash = unleash;
        this.taskScheduler = taskScheduler;
        this.notifier = notifier;
    }

    /**
     * Performs the emergency shutdown
     *
     * Shutdown strategy:
     * 1. Turn ON the ai.emergency.shutdown flag in Unleash
     * 2. Also turn OFF every other AI feature flag, as a second line of defense
     */
    public void shutdown(String reason) {
        // Flags are changed remotely through the Unleash Admin API, not the client SDK:
        // POST /api/admin/projects/:projectId/features/:featureName/environments/:environment/off
        enableFeatureViaAdminApi(EMERGENCY_FLAG);
        unleash.more().getFeatureToggleNames().stream()
                .filter(name -> name.startsWith("ai."))
                // The kill switch itself starts with "ai." and must stay ON
                .filter(name -> !name.equals(EMERGENCY_FLAG))
                .forEach(this::disableFeatureViaAdminApi);
        // The thread name is a placeholder; record the real operator in production
        currentState.set(new ShutdownState(
                true, reason, Instant.now(), Thread.currentThread().getName()));
    }

    public void recover() {
        // Release the kill switch, then re-enable the core features through the Admin API
        disableFeatureViaAdminApi(EMERGENCY_FLAG);
        enableFeatureViaAdminApi("ai.chat.enabled");
        enableFeatureViaAdminApi("ai.rag.enabled");
        // Note: experimental features are not re-enabled automatically, only core ones
        currentState.set(ShutdownState.NORMAL);
        scheduledRecoveryTime = null;
        if (autoRecoverTask != null) {
            autoRecoverTask.cancel(false);
        }
    }

    public void shutdownFeature(String featureName, String reason) {
        disableFeatureViaAdminApi(featureName);
    }

    public void scheduleAutoRecover(int minutes) {
        Instant recoverTime = Instant.now().plusSeconds(minutes * 60L);
        scheduledRecoveryTime = recoverTime;
        autoRecoverTask = taskScheduler.schedule(
                this::recover,
                recoverTime
        );
        // Send a reminder 30 minutes before recovery
        if (minutes > 30) {
            taskScheduler.schedule(
                    () -> notifier.send("Reminder: AI features will recover automatically in 30 minutes. Please confirm whether the shutdown should continue."),
                    recoverTime.minusSeconds(30 * 60)
            );
        }
    }

    public void notifyTeam(String reason, String operator) {
        String message = String.format(
                "[URGENT] AI features have been shut down\n" +
                "Operator: %s\n" +
                "Reason: %s\n" +
                "Time: %s\n" +
                "Please follow up and confirm when to recover",
                operator, reason, Instant.now()
        );
        notifier.sendUrgent(message);
    }

    public EmergencyShutdownController.ShutdownStatus getCurrentStatus() {
        ShutdownState state = currentState.get();
        return new EmergencyShutdownController.ShutdownStatus(
                state.isShutdown(),
                state.reason(),
                state.shutdownTime(),
                state.shutdownBy(),
                getScheduledRecoveryTime()
        );
    }

    private void disableFeatureViaAdminApi(String featureName) {
        // A real implementation calls the Unleash Admin REST API;
        // simplified to a log statement here
        org.slf4j.LoggerFactory.getLogger(getClass())
                .warn("EMERGENCY: turning OFF feature flag: {}", featureName);
    }

    private void enableFeatureViaAdminApi(String featureName) {
        // Same simplification as above: log only
        org.slf4j.LoggerFactory.getLogger(getClass())
                .info("Turning ON feature flag: {}", featureName);
    }

    private Instant getScheduledRecoveryTime() {
        return autoRecoverTask != null && !autoRecoverTask.isCancelled()
                ? scheduledRecoveryTime
                : null;
    }

    record ShutdownState(
            boolean isShutdown,
            String reason,
            Instant shutdownTime,
            String shutdownBy
    ) {
        static final ShutdownState NORMAL = new ShutdownState(false, null, null, null);
    }
}

The Complete Feature Flag Flow
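Feature Flag Lifecycle Management: Avoiding the "Flag Graveyard"
The biggest technical debt of feature flags is that they never get deleted. Over time the code fills up with switches that nobody knows are still needed.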
package com.laozhang.feature.governance;

import org.springframework.scheduling.annotation.Scheduled;
import org.springframework.stereotype.Component;

import java.time.LocalDate;
import java.time.temporal.ChronoUnit;
import java.util.List;

/**
 * Feature flag lifecycle management
 *
 * How to prevent a "flag graveyard":
 * 1. Every feature flag must have an expiry date
 * 2. Automatic reminder 7 days before expiry
 * 3. Automatic report after expiry (no automatic deletion, to prevent accidents)
 * 4. Automatic cleanup reminder for flags unused for more than 180 days
 */
@Component
public class FeatureFlagGovernance {

    private final FeatureFlagMetadataRepository metadataRepo;
    private final DingTalkNotifier notifier;

    public FeatureFlagGovernance(FeatureFlagMetadataRepository metadataRepo,
                                 DingTalkNotifier notifier) {
        this.metadataRepo = metadataRepo;
        this.notifier = notifier;
    }

    /**
     * Checks feature flag health every Monday at 9 a.m.
     */
    @Scheduled(cron = "0 0 9 * * MON")
    public void weeklyGovernanceCheck() {
        List<FeatureFlagMetadata> allFlags = metadataRepo.findAll();
        LocalDate today = LocalDate.now();
        StringBuilder report = new StringBuilder("## Weekly Feature Flag Governance Report\n\n");

        // 1. Expiring soon (within 7 days)
        List<FeatureFlagMetadata> expiringSoon = allFlags.stream()
                .filter(f -> f.expiryDate() != null)
                .filter(f -> {
                    long daysUntilExpiry = ChronoUnit.DAYS.between(today, f.expiryDate());
                    return daysUntilExpiry >= 0 && daysUntilExpiry <= 7;
                })
                .toList();
        if (!expiringSoon.isEmpty()) {
            report.append("### ⚠️ Expiring soon (within 7 days)\n");
            expiringSoon.forEach(f -> report.append(String.format(
                    "- `%s`: expires %s (owner: %s)\n",
                    f.name(), f.expiryDate(), f.owner())));
            report.append("\n");
        }

        // 2. Already expired
        List<FeatureFlagMetadata> expired = allFlags.stream()
                .filter(f -> f.expiryDate() != null)
                .filter(f -> today.isAfter(f.expiryDate()))
                .toList();
        if (!expired.isEmpty()) {
            report.append("### 🚨 Expired (action required)\n");
            expired.forEach(f -> report.append(String.format(
                    "- `%s`: %d days overdue (%s, is it still needed?)\n",
                    f.name(),
                    ChronoUnit.DAYS.between(f.expiryDate(), today),
                    f.owner())));
            report.append("\n");
        }

        // 3. Unused for a long time (180 days)
        List<FeatureFlagMetadata> stale = allFlags.stream()
                .filter(f -> f.lastUsedDate() != null)
                .filter(f -> ChronoUnit.DAYS.between(f.lastUsedDate(), today) > 180)
                .toList();
        if (!stale.isEmpty()) {
            report.append("### 💀 Unused for a long time (possible flag graveyard)\n");
            stale.forEach(f -> report.append(String.format(
                    "- `%s`: unused for %d days (suggestion: confirm whether it can be deleted)\n",
                    f.name(),
                    ChronoUnit.DAYS.between(f.lastUsedDate(), today))));
        }

        // Send the report
        if (!expiringSoon.isEmpty() || !expired.isEmpty() || !stale.isEmpty()) {
            notifier.send(report.toString());
        }
    }

    /**
     * Standard metadata required when a feature flag is created
     */
    public void registerFeatureFlag(FeatureFlagRegistration registration) {
        // Mandatory fields: name, owner, purpose, expected expiry date
        if (registration.owner() == null || registration.owner().isBlank()) {
            throw new IllegalArgumentException("A feature flag must have an owner");
        }
        if (registration.expiryDate() == null) {
            throw new IllegalArgumentException(
                    "A feature flag must have an expiry date (at most 6 months out). " +
                    "For a permanent switch, set it to today + 180 days and review at expiry whether to extend it");
        }
        if (ChronoUnit.DAYS.between(LocalDate.now(), registration.expiryDate()) > 180) {
            throw new IllegalArgumentException(
                    "A feature flag may not expire more than 180 days out; extend it later if needed");
        }
        metadataRepo.save(new FeatureFlagMetadata(
                registration.name(),
                registration.owner(),
                registration.description(),
                registration.expiryDate(),
                registration.linkedJiraTicket(),
                LocalDate.now(),
                null
        ));
    }

    public record FeatureFlagMetadata(
            String name,
            String owner,
            String description,
            LocalDate expiryDate,
            String linkedJiraTicket,
            LocalDate createdDate,
            LocalDate lastUsedDate
    ) {}

    public record FeatureFlagRegistration(
            String name,
            String owner,
            String description,
            LocalDate expiryDate,
            String linkedJiraTicket
    ) {}
}

Feature Flag Naming Convention
/**
 * Feature flag naming convention
 *
 * Format: {team}.{module}.{feature description}.{optional variant}
 *
 * Examples (AI application):
 */
public class FeatureFlagNamingConvention {
    // Feature master switches
    static final String AI_CHAT = "ai.chat.enabled";
    static final String AI_VOICE = "ai.voice.enabled";

    // Experimental features (with a defined end date)
    // The "exp" segment reminds the team that these are temporary
    static final String EXP_GPT4O = "ai.exp.model-gpt4o.enabled";      // until 2025-03-01
    static final String EXP_PROMPT_V2 = "ai.exp.prompt-v2.enabled";    // until 2025-02-15

    // Emergency switches
    static final String KILL_SWITCH_ALL = "ai.killswitch.all";
    static final String KILL_SWITCH_CHAT = "ai.killswitch.chat";

    // Switches grouped by function (convenient for bulk operations)
    static final String GROUP_CUSTOMER_SERVICE = "ai.group.customer-service";
    static final String GROUP_RECOMMENDATION = "ai.group.recommendation";

    // Operations switches (not business features)
    static final String OPS_DEBUG_LOG = "ai.ops.debug-logging";
    static final String OPS_VERBOSE_ERROR = "ai.ops.verbose-errors";
}

Complete Architecture Diagram
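FAQ
Q: Unleash or FF4J, which one should I choose?
A: If the team can operate its own infrastructure (K8s, Docker Compose), use Unleash: it has a full management UI, team collaboration features, and audit logs. If you run a monolith or do not want to maintain an extra service, use FF4J, which is embedded in the Java process. Small teams can start with FF4J to get going quickly and migrate to Unleash later if the need arises.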
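Q: What happens to feature flags when the Unleash server goes down?
A: The Unleash Java SDK keeps a local backup (the backupFile parameter): the state from the last successful fetch is written to a local file. If the Unleash server goes down, the SDK keeps working from that cached state. So Unleash availability affects the ability to change flags dynamically, not the ability to read flag state.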
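The fallback behavior described in this answer can be illustrated with a small stand-alone sketch. It is not the Unleash SDK's implementation: `BackedUpFlags`, the properties-file format, and the `Callable` used as the remote source are all assumptions chosen to keep the example self-contained.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.HashMap;
import java.util.Map;
import java.util.Properties;
import java.util.concurrent.Callable;

/** Hypothetical flag client that survives an unreachable flag server. */
final class BackedUpFlags {
    private final Path backupFile;
    private volatile Map<String, Boolean> current = Map.of();

    BackedUpFlags(Path backupFile) {
        this.backupFile = backupFile;
    }

    /** Tries the remote source; on failure keeps or restores the last known state. */
    void refresh(Callable<Map<String, Boolean>> remoteFetch) {
        Map<String, Boolean> fresh;
        try {
            fresh = Map.copyOf(remoteFetch.call());
        } catch (Exception remoteDown) {
            if (current.isEmpty()) {
                current = loadBackup();   // e.g. first refresh after a restart
            }
            return;                       // otherwise keep serving the in-memory state
        }
        current = fresh;
        saveBackup(fresh);
    }

    boolean isEnabled(String name, boolean defaultValue) {
        return current.getOrDefault(name, defaultValue);
    }

    private void saveBackup(Map<String, Boolean> flags) {
        Properties props = new Properties();
        flags.forEach((name, on) -> props.setProperty(name, on.toString()));
        try (OutputStream out = Files.newOutputStream(backupFile)) {
            props.store(out, "feature flag backup");
        } catch (IOException e) {
            // Best effort: a failed backup must not break flag evaluation
        }
    }

    private Map<String, Boolean> loadBackup() {
        Properties props = new Properties();
        try (InputStream in = Files.newInputStream(backupFile)) {
            props.load(in);
        } catch (IOException e) {
            return Map.of();              // no backup yet: callers fall back to their defaults
        }
        Map<String, Boolean> loaded = new HashMap<>();
        for (String name : props.stringPropertyNames()) {
            loaded.put(name, Boolean.parseBoolean(props.getProperty(name)));
        }
        return Map.copyOf(loaded);
    }
}
```

The one case nothing can save is a first start with the server down and no backup file on disk. Then only the per-call default applies, which is why the default passed to each lookup deserves the same care as the flag itself.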
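Q: Do feature flags hurt performance?
A: The Unleash SDK keeps a copy of the flag state in local memory (synced every 10 seconds in the configuration above), so a lookup is a pure in-memory operation whose cost is negligible next to a model call. The only network request is the periodic background sync, which is not on the request path. One caveat: a custom strategy that calls a remote service, such as UserTierStrategy querying UserService, does put that call on the request path, so cache the result or pass the value in through the context.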
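Why the lookup stays off the network can be shown with a minimal snapshot pattern. This is a sketch of the general idea, not Unleash internals: `SnapshotFlags` and the `Supplier` standing in for the remote fetch are hypothetical names.

```java
import java.util.Map;
import java.util.concurrent.atomic.AtomicReference;
import java.util.function.Supplier;

/** Hypothetical flag cache: reads hit memory, only refresh() touches the remote source. */
final class SnapshotFlags {
    private final AtomicReference<Map<String, Boolean>> snapshot =
            new AtomicReference<>(Map.of());
    private final Supplier<Map<String, Boolean>> remote;

    SnapshotFlags(Supplier<Map<String, Boolean>> remote) {
        this.remote = remote;
    }

    /** Meant to be called by a background scheduler, never by request threads. */
    void refresh() {
        // Swap in an immutable copy so readers never observe a half-updated map
        snapshot.set(Map.copyOf(remote.get()));
    }

    /** Hot path: one atomic read plus a hash lookup, no I/O. */
    boolean isEnabled(String name, boolean defaultValue) {
        return snapshot.get().getOrDefault(name, defaultValue);
    }
}
```

A service would typically drive `refresh()` from a `ScheduledExecutorService` at a fixed delay. The trade-off is staleness: a flag change becomes visible only after the next refresh, so the polling interval bounds how quickly a kill switch takes effect.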
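Q: How do we keep feature flags from being abused (switches added everywhere)?
A: Three mechanisms. First, enforced governance: an owner and an expiryDate are mandatory at registration (the FeatureFlagGovernance class in this article). Second, code review: every new feature flag must go through PR review, and the reviewer checks that a removal plan exists. Third, periodic cleanup: run a "feature flag audit" every quarter to decide which flags can be deleted and their winning behavior folded into the code.
Q: What happens to a feature flag after an A/B test ends?
A: The standard flow: run the A/B test → analyze the data → pick the winning variant → make the winning variant's code the default → delete the flag-related code → archive the flag in Unleash. Never let a permanently-on flag stay in the code for more than 6 months.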
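Summary
After Liu Qiang's "CEO pulled the plug" incident, it took the team two weeks to build the engineering safeguards that should have been in place from the start.
The core value of feature flags is not just "turning a feature off quickly". It is an upgrade of the whole release philosophy:
- Unleash integration → an enterprise-grade feature flag platform with a management UI and multiple strategies
- Embedded FF4J → suits small teams, quick to adopt, no separate flag server to run
- User-group strategies → VIP users first, collect feedback, then roll out to everyone
- Regional compliance strategies → adapt to regulatory requirements dynamically, with no release needed
- Consistent-hash gradual rollout → the same user always sees a consistent experience
- Emergency shutdown → a response in seconds, not two hours
- Lifecycle management → prevents the flag graveyard
A good AI application takes more than good code: control over releases must stay firmly in the hands of the people who operate it.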
