AI Engineer Coding Standards: 10 Best Practices for Engineering LLM Applications

Lao Zhang (老张) · 2026/4/30 · about 9 min read

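Audience: Java engineers who write AI application code and want to improve its quality and maintainability
Reading time: about 16 minutes
What you get: 10 coding rules for AI applications that you can apply directly, steering clear of common engineering pitfalls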

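The code review that made my skin crawl

Our team took over an AI module whose original developer had already left the company.

The first time I opened the code, I went silent:

The system prompt was hardcoded inside methods, packed densely through the whole class
chatClient.call() was invoked bare, with no retry, timeout, or fallback
The result was read straight from response.getContent() as a string, with no exception handling
Production was still running the temperature=1.5 that someone had set casually during testing
There was no logging, so when something went wrong nobody knew what the LLM had said

The code was not wrong as such; what was missing was any engineering discipline.

This article writes down the 10 rules I have distilled from AI engineering practice. Most of them come with a bad example and a good example side by side.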

Rule 1: Prompts must be externalized, never hardcoded

Wrong:

// Bad: the prompt is hardcoded, so changing one word means a new release
@Service
public class BadReviewService {
    public String review(String code) {
        return chatClient.prompt()
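                .user("You are a code review expert. Review the following Java code, focusing on security vulnerabilities, performance problems, and coding standards. Flag any SQL injection risk explicitly. Code:\n" + code)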
                .call()
                .content();
    }
}

Right:

// Good: the prompt lives in a config file or a dedicated prompt-management module
@Service
@Slf4j
public class GoodReviewService {

    private final ChatClient chatClient;
    
    // Read from configuration; changing it without a restart additionally needs a refreshable config source
    @Value("${ai.prompts.code-review}")
    private String codeReviewPromptTemplate;

    public GoodReviewService(ChatClient.Builder builder) {
        this.chatClient = builder.build();
    }

    public String review(String code) {
        String prompt = String.format(codeReviewPromptTemplate, code);
        return chatClient.prompt()
                .user(prompt)
                .call()
                .content();
    }
}

# application.yml
ai:
  prompts:
    code-review: |
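      You are a code review expert. Review the following Java code.
      Focus on:
      1. Security vulnerabilities (SQL injection, XSS, serialization issues)
      2. Performance problems (N+1 queries, resource leaks)
      3. Coding standards (naming, comments, structure)
      
      Code:
      %s
      
      List the problems by priority. For each one, give its location and a suggested fix.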

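Why: A prompt is business logic and changes often. Once it is externalized, a prompt change no longer touches Java code, and with an external config source (a config center, or a config file outside the jar) it does not need a rebuild either. Complex systems can use a dedicated prompt-management service (database plus version control).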
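
One detail worth sketching: the example above fills the template with String.format, which fails if the externalized prompt ever contains a literal % (for instance "keep 100% of the comments"). A minimal alternative, assuming a hypothetical PromptRenderer helper and a {name} placeholder convention that are not part of the original code or of Spring AI, could look like this:

```java
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical helper: fills {name} placeholders in an externalized prompt template. */
final class PromptRenderer {

    // Matches placeholders such as {code}; JSON-like braces with quotes are not matched
    private static final Pattern PLACEHOLDER = Pattern.compile("\\{(\\w+)\\}");

    private PromptRenderer() {}

    /**
     * Replaces each {name} in the template with its value in a single pass.
     * Literal '%' characters are left untouched, and placeholders that appear
     * inside a substituted value are not expanded again.
     */
    static String render(String template, Map<String, String> variables) {
        Matcher matcher = PLACEHOLDER.matcher(template);
        StringBuilder out = new StringBuilder();
        while (matcher.find()) {
            String name = matcher.group(1);
            if (!variables.containsKey(name)) {
                throw new IllegalArgumentException("Missing prompt variable: " + name);
            }
            // quoteReplacement keeps '$' and '\' in user code from being treated as group references
            matcher.appendReplacement(out, Matcher.quoteReplacement(variables.get(name)));
        }
        matcher.appendTail(out);
        return out.toString();
    }
}
```

With this shape the YAML template would use {code} instead of %s, and review() would call PromptRenderer.render(template, Map.of("code", code)). Failing fast on a missing variable is a design choice here; a lenient variant could leave unknown placeholders in place.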

Rule 2: Every call must have a timeout

Wrong:

// Bad: a bare call with no timeout; if the LLM hangs, your thread hangs with it
public String badCall(String message) {
    return chatClient.prompt()
            .user(message)
            .call()
            .content();
}

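Right:

// Good: enforce a timeout and fall back when it fires
@Service
@Slf4j
public class TimeoutSafeChatService {

    private final ChatClient chatClient;
    
    @Value("${ai.timeout.seconds:30}")
    private int timeoutSeconds;

    public TimeoutSafeChatService(ChatClient.Builder builder) {
        this.chatClient = builder.build();
    }

    public String safeCall(String message) {
        try {
            return Mono.fromCallable(() ->
                            chatClient.prompt()
                                    .user(message)
                                    .call()
                                    .content()
                    )
                    // Run the blocking call on a worker thread. Without this it executes on
                    // the calling thread, and timeout() cannot release the caller.
                    .subscribeOn(Schedulers.boundedElastic())
                    .timeout(Duration.ofSeconds(timeoutSeconds))
                    .onErrorResume(TimeoutException.class, e -> {
                        // Log only the first 50 characters of the message
                        log.warn("AI call timed out: message={}", 
                                message.substring(0, Math.min(50, message.length())));
                        return Mono.just(fallbackResponse());
                    })
                    .block();
        } catch (Exception e) {
            log.error("AI call failed", e);
            return fallbackResponse();
        }
    }

    private String fallbackResponse() {
        return "Sorry, the AI service is responding slowly right now. Please try again later or contact customer support.";
    }
}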
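
For code paths that do not already depend on Reactor, the same idea can be sketched with the JDK alone. This is a minimal illustration, not the article's implementation: TimeoutGuard and callWithTimeout are hypothetical names, and the Supplier stands in for the chatClient call.

```java
import java.time.Duration;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.TimeUnit;
import java.util.function.Supplier;

/** Hypothetical helper: bounds a blocking call with a timeout using only the JDK. */
final class TimeoutGuard {

    private TimeoutGuard() {}

    /**
     * Runs the blocking call on a worker thread and returns the fallback if the call
     * does not finish within the timeout or fails with an exception.
     * Note: the worker is not interrupted on timeout; it keeps running in the background.
     */
    static String callWithTimeout(Supplier<String> blockingCall, Duration timeout, String fallback) {
        return CompletableFuture.supplyAsync(blockingCall)
                .completeOnTimeout(fallback, timeout.toMillis(), TimeUnit.MILLISECONDS)
                .exceptionally(error -> fallback)
                .join();
    }
}
```

A call site would look like TimeoutGuard.callWithTimeout(() -> chatClient.prompt().user(message).call().content(), Duration.ofSeconds(30), fallbackResponse()). Because the abandoned call still occupies a worker thread, a production version would probably pass a dedicated, bounded executor to supplyAsync rather than rely on the common pool.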

Rule 3: Use entity() for structured output; do not parse JSON yourself

Wrong:

// Bad: hand-parsing the JSON returned by the LLM is fragile and hard to maintain
public SentimentResult badAnalyze(String text) {
    String response = chatClient.prompt()
            .user("Analyze the sentiment and return JSON in the form: {\"sentiment\":\"positive/negative/neutral\",\"score\":0.0}\n" + text)
            .call()
            .content();
    
    // Hand parsing: the LLM may wrap the JSON in ```json fences, add comments, or vary the format in other ways
    String json = response.replaceAll("```json", "").replaceAll("```", "").trim();
    return objectMapper.readValue(json, SentimentResult.class); // may throw
}

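Right:

// Good: use Spring AI's entity() method and let the framework handle parsing
@Service
public class GoodSentimentService {

    private final ChatClient chatClient;

    public GoodSentimentService(ChatClient.Builder builder) {
        this.chatClient = builder.build();
    }

    public SentimentResult analyze(String text) {
        // entity() adds format instructions for the target type and converts the reply.
        // A reply that still cannot be parsed raises an exception, so callers should handle it.
        return chatClient.prompt()
                .user("Analyze the sentiment of the following text:\n" + text)
                .call()
                .entity(SentimentResult.class);  // parsed by Spring AI
    }

    // With a record, Spring AI can derive the output-format instructions from the type
    record SentimentResult(
            @JsonProperty("sentiment") String sentiment,  // positive/negative/neutral
            @JsonProperty("score") double score,          // 0.0-1.0
            @JsonProperty("reason") String reason         // basis for the judgment
    ) {}
}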
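
Even when the framework handles parsing, the model can still return values outside the documented range (a score of 1.7, or a sentiment label nobody defined). One way to catch that early is to validate in the record itself. The sketch below is an assumption layered on top of the example, not something the original code does; ValidatedSentiment is a hypothetical name, and it assumes the JSON mapper builds the record through its canonical constructor.

```java
import java.util.Set;

/** Hypothetical variant of SentimentResult that rejects out-of-range model output. */
record ValidatedSentiment(String sentiment, double score, String reason) {

    private static final Set<String> ALLOWED = Set.of("positive", "negative", "neutral");

    // Compact constructor: runs on every construction, including deserialization
    // that goes through the canonical constructor
    ValidatedSentiment {
        if (sentiment == null || !ALLOWED.contains(sentiment)) {
            throw new IllegalArgumentException("Unexpected sentiment: " + sentiment);
        }
        if (Double.isNaN(score) || score < 0.0 || score > 1.0) {
            throw new IllegalArgumentException("Score out of range [0.0, 1.0]: " + score);
        }
    }
}
```

Failing loudly here turns a silent data-quality problem into an exception that the caller's fallback path can handle.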

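Rule 4: Build one ChatClient per scenario; do not share a single global one

Wrong:

// Bad: every scenario shares one ChatClient, so the configuration becomes a mess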
@Service
public class BadAiService {
    
    @Autowired
    private ChatClient chatClient;  // shared globally
    
    public String translate(String text) {
        // Translation should use a low temperature
        return chatClient.prompt().user("Translate: " + text).call().content();
    }
    
    public String brainstorm(String topic) {
        // Brainstorming should use a high temperature
        return chatClient.prompt().user("Brainstorm: " + topic).call().content();
    }
}

Right:

// Good: each scenario gets its own ChatClient with parameters tuned for it
@Service
public class GoodAiService {

    private final ChatClient translateClient;   // deterministic task, low temperature
    private final ChatClient brainstormClient;  // creative task, high temperature
    private final ChatClient analysisClient;    // analytical task, medium temperature

    public GoodAiService(ChatClient.Builder builder) {
        // The builder is mutable and reused below, so every chain sets both the
        // system prompt and the options explicitly
        this.translateClient = builder
                .defaultSystem("You are a professional translator. Preserve the meaning of the source and produce accurate, natural output.")
                .defaultOptions(OpenAiChatOptions.builder()
                        .withTemperature(0.1)  // low randomness
                        .withMaxTokens(2000)
                        .build())
                .build();
        
        this.brainstormClient = builder
                .defaultSystem("You are a creative consultant. Think divergently and offer a varied set of ideas.")
                .defaultOptions(OpenAiChatOptions.builder()
                        .withTemperature(0.9)  // high randomness
                        .withMaxTokens(1000)
                        .build())
                .build();
        
        this.analysisClient = builder
                .defaultSystem("You are a data analyst. Reason rigorously and base your analysis on facts.")
                .defaultOptions(OpenAiChatOptions.builder()
                        .withTemperature(0.3)
                        .withMaxTokens(3000)
                        .build())
                .build();
    }

    public String translate(String text) {
        return translateClient.prompt().user(text).call().content();
    }

    public String brainstorm(String topic) {
        return brainstormClient.prompt().user(topic).call().content();
    }
}

Rule 5: Every AI call must be logged

Wrong:

// Bad: no logging, so any problem is a black box
public String badChat(String message) {
    return chatClient.prompt().user(message).call().content();
}

Right:

// Good: record the key information with SimpleLoggerAdvisor or a custom Advisor
@Service
@Slf4j
public class LoggedChatService {

    private final ChatClient chatClient;

    public LoggedChatService(ChatClient.Builder builder) {
        this.chatClient = builder
                .defaultAdvisors(
                        new SimpleLoggerAdvisor(),  // built into Spring AI; logs request/response at DEBUG level
                        new TokenUsageAdvisor()     // custom; logs token usage
                )
                .build();
    }
}

@Component
@Slf4j
public class TokenUsageAdvisor implements CallAroundAdvisor {
    
    @Override
    public String getName() { return "TokenUsageAdvisor"; }
    
    @Override
    public int getOrder() { return Ordered.LOWEST_PRECEDENCE; }

    @Override
    public AdvisedResponse aroundCall(AdvisedRequest request, CallAroundAdvisorChain chain) {
        AdvisedResponse response = chain.nextAroundCall(request);
        
        if (response.response() != null) {
            var usage = response.response().getMetadata().getUsage();
            if (usage != null) {
                log.info("Token usage: prompt={}, completion={}, total={}",
                        usage.getPromptTokens(),
                        usage.getGenerationTokens(),
                        usage.getTotalTokens());
            }
        }
        
        return response;
    }
}
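
Logging whole prompts and replies can flood the log with multi-kilobyte, multi-line entries, which is why the timeout example earlier logs only the first 50 characters. A small helper can make that truncation uniform. This is a sketch under assumptions: LogPreview and preview are hypothetical names that do not appear in the original code.

```java
/** Hypothetical helper: produces a short, single-line preview of a prompt or reply for logs. */
final class LogPreview {

    private LogPreview() {}

    /**
     * Collapses whitespace (including newlines) to single spaces and keeps at most
     * maxChars code points, appending the full length when the text was truncated.
     */
    static String preview(String text, int maxChars) {
        if (text == null) {
            return "<null>";
        }
        String singleLine = text.replaceAll("\\s+", " ").trim();
        if (singleLine.codePointCount(0, singleLine.length()) <= maxChars) {
            return singleLine;
        }
        // offsetByCodePoints avoids cutting a surrogate pair (emoji, rare CJK) in half
        int end = singleLine.offsetByCodePoints(0, maxChars);
        return singleLine.substring(0, end) + "... (" + singleLine.length() + " chars total)";
    }
}
```

A custom advisor could then log LogPreview.preview(userText, 200) instead of the raw text. Truncation does not replace masking: anything sensitive should already be removed before the prompt is built.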

Rule 6: Keep sensitive data out of prompts; anonymize before calling

Wrong:

// Bad: raw user data goes straight into the prompt and may leak to a third-party LLM
public String analyzeUser(User user) {
    return chatClient.prompt()
            .user("Analyze this user: name=" + user.getRealName() + 
                  ", phone=" + user.getPhone() + 
                  ", ID number=" + user.getIdCard())
            .call()
            .content();
}

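Right:

// Good: anonymize before calling the LLM and pass only the non-sensitive data it needs
@Service
public class PrivacySafeAnalysisService {

    private final ChatClient chatClient;

    public PrivacySafeAnalysisService(ChatClient.Builder builder) {
        this.chatClient = builder.build();
    }

    public String analyzeUser(User user) {
        // Pass only the non-sensitive fields the analysis needs
        // (getAgeGroup, getCityTier, and getIncomeRange are bucketing helpers not shown here)
        UserAnonymized anonymized = UserAnonymized.builder()
                .ageGroup(getAgeGroup(user.getBirthDate()))  // no exact age
                .cityTier(getCityTier(user.getCity()))        // no exact city
                .incomeRange(getIncomeRange(user.getIncome())) // no exact income
                .purchaseCount(user.getPurchaseCount())        // safe to pass
                .build();
        
        return chatClient.prompt()
                .user("Analyze this user profile: " + anonymized.toDescription())
                .call()
                .content();
    }
    
    @Builder
    record UserAnonymized(
            String ageGroup,     // "25-35" rather than the exact age
            String cityTier,     // "tier-1 city" rather than the exact city
            String incomeRange,  // "middle income" rather than the exact amount
            int purchaseCount
    ) {
        String toDescription() {
            return String.format("Age group: %s, city tier: %s, income level: %s, purchases: %d",
                    ageGroup, cityTier, incomeRange, purchaseCount);
        }
    }
}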
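
The example calls getAgeGroup and getIncomeRange without showing them. As a rough illustration of what such bucketing helpers might look like, here is a sketch; the class name UserBuckets, the bucket boundaries, and the income thresholds are all assumptions chosen for the example, not values from the original.

```java
import java.time.LocalDate;
import java.time.Period;

/** Hypothetical bucketing helpers: turn exact values into coarse ranges before prompting. */
final class UserBuckets {

    private UserBuckets() {}

    /** Maps a birth date to an age band; 'today' is a parameter so the result is testable. */
    static String ageGroup(LocalDate birthDate, LocalDate today) {
        int age = Period.between(birthDate, today).getYears();
        if (age < 18) return "under 18";
        if (age < 25) return "18-24";
        if (age < 35) return "25-34";
        if (age < 50) return "35-49";
        return "50+";
    }

    /** Maps a monthly income to a coarse band; the thresholds are illustrative only. */
    static String incomeRange(long monthlyIncome) {
        if (monthlyIncome < 5_000) return "lower income";
        if (monthlyIncome < 20_000) return "middle income";
        return "higher income";
    }
}
```

Wider buckets leak less but give the model less to work with, so the band sizes are a product decision as much as a privacy one.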

Rule 7: Define each Function as a standalone, testable bean

Wrong:

// Bad: an inline lambda is hard to test and hard to reuse
public String badFunctionCall(String message) {
    return chatClient.prompt()
            .user(message)
            .functions(FunctionCallback.builder()
                    .function("getWeather", (Map<String, Object> req) -> {
                        String city = (String) req.get("city");
                        // Business logic lives here and cannot be tested on its own
                        return weatherApi.query(city);
                    })
                    .build())
            .call()
            .content();
}

Right:

// Good: a standalone Function class that can be unit-tested on its own
@Slf4j
public class GetWeatherFunction implements Function<WeatherRequest, WeatherResponse> {

    private final WeatherApiService weatherApiService;

    public GetWeatherFunction(WeatherApiService weatherApiService) {
        this.weatherApiService = weatherApiService;
    }

    @Override
    public WeatherResponse apply(WeatherRequest request) {
        log.info("Weather lookup: city={}", request.city());
        return weatherApiService.getWeather(request.city());
    }
}

// Register it once as a Spring bean (inside a @Configuration class). The class itself is
// not annotated with @Component, which would register a second bean of the same type.
@Bean
@Description("Looks up the current weather by city name; returns temperature, conditions, and humidity")
public Function<WeatherRequest, WeatherResponse> getWeather(WeatherApiService service) {
    return new GetWeatherFunction(service);
}

// At call time, refer to it by name only
public String chat(String message) {
    return chatClient.prompt()
            .user(message)
            .functions("getWeather")  // the bean name
            .call()
            .content();
}
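
The payoff of a standalone Function is that it can be exercised without Spring or an LLM. The sketch below shows the idea in plain Java; the record fields, the single-method WeatherApiService interface, and the blank-city check are assumptions made so the example is self-contained, since the original does not show those types.

```java
import java.util.function.Function;

// Assumed shapes for the request/response types, for illustration only
record WeatherRequest(String city) {}
record WeatherResponse(String city, double temperatureCelsius, String condition) {}

/** Stand-in for the article's WeatherApiService; a single method keeps it easy to stub. */
interface WeatherApiService {
    WeatherResponse getWeather(String city);
}

/** Plain-Java version of the function: no framework needed to construct or call it. */
final class GetWeatherFunction implements Function<WeatherRequest, WeatherResponse> {

    private final WeatherApiService weatherApiService;

    GetWeatherFunction(WeatherApiService weatherApiService) {
        this.weatherApiService = weatherApiService;
    }

    @Override
    public WeatherResponse apply(WeatherRequest request) {
        // Model-supplied arguments are untrusted input, so validate before calling the API
        if (request == null || request.city() == null || request.city().isBlank()) {
            throw new IllegalArgumentException("city is required");
        }
        return weatherApiService.getWeather(request.city().trim());
    }
}
```

A unit test can pass a lambda as the service, for example new GetWeatherFunction(city -> new WeatherResponse(city, 21.5, "sunny")), and assert on the result of apply(...) directly.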

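Rule 8: Advisor order must be defined explicitly

The order in which Advisors run changes behavior, so it must be declared explicitly:

// Define the order of every Advisor explicitly; do not rely on defaults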
@Configuration
public class AdvisorOrderConfig {

    @Bean
    public ChatClient chatClient(ChatClient.Builder builder,
                                  AuditAdvisor auditAdvisor,
                                  RateLimitAdvisor rateLimitAdvisor,
                                  SafeGuardAdvisor safeGuardAdvisor,
                                  MessageChatMemoryAdvisor memoryAdvisor) {
        
        return builder
                .defaultAdvisors(
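                        // Listed outermost to innermost (a smaller number is further out).
                        // Execution order comes from each advisor's getOrder(), not from
                        // the position in this list.
                        rateLimitAdvisor,      // Order=100: outermost, rate-limit first
                        auditAdvisor,          // Order=200: audit, records every request
                        safeGuardAdvisor,      // Order=300: content safety filtering
                        memoryAdvisor,         // Order=400: injects memory, closest to the LLM
                        new SimpleLoggerAdvisor() // built-in, uses its own default order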
                )
                .build();
    }
}

// Every custom Advisor must declare its Order explicitly
@Component
public class RateLimitAdvisor implements CallAroundAdvisor {
    
    @Override
    public int getOrder() {
        return 100;  // Write the number explicitly instead of something vague like Ordered.HIGHEST_PRECEDENCE
    }
    
    // ...
}
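
To make the ordering rule concrete, here is a small simulation in plain Java. It is not Spring AI's implementation; it only assumes the usual Spring Ordered contract (lower value runs first, that is, further out). OrderedStep, AdvisorOrderDemo, and the order value 0 used for the logger in the usage note are hypothetical.

```java
import java.util.Comparator;
import java.util.List;

/** Hypothetical stand-in for an advisor: just a name and its declared order. */
record OrderedStep(String name, int order) {}

final class AdvisorOrderDemo {

    private AdvisorOrderDemo() {}

    /** Returns advisor names outermost-first, sorted by order rather than by list position. */
    static List<String> executionOrder(List<OrderedStep> registered) {
        return registered.stream()
                .sorted(Comparator.comparingInt(OrderedStep::order))
                .map(OrderedStep::name)
                .toList();
    }
}
```

Registering memory (400), rateLimit (100), logger (0), and audit (200) in that sequence yields logger, rateLimit, audit, memory. The point of the sketch is that a built-in advisor with a low default order can end up outside the custom ones, which is one more reason to give every advisor an explicit number.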

Rule 9: Tests must mock the LLM call

// Good: mock the LLM call so the test covers business logic without a real LLM
@ExtendWith(MockitoExtension.class)
class LoanReviewServiceTest {

    // RETURNS_SELF: fluent builder methods (defaultSystem, defaultAdvisors, ...) return the mock itself
    @Mock(answer = Answers.RETURNS_SELF)
    private ChatClient.Builder mockBuilder;
    
    @Mock
    private ChatClient mockChatClient;
    
    @Mock
    private ChatClient.ChatClientRequestSpec mockRequest;  // the type returned by prompt()
    
    @Mock
    private ChatClient.CallResponseSpec mockResponse;

    private LoanReviewService loanReviewService;

    @BeforeEach
    void setUp() {
        when(mockBuilder.build()).thenReturn(mockChatClient);

        // Build the service after stubbing. @InjectMocks would run the constructor before
        // setUp(), while build() still returns null.
        // Assumes the constructor is LoanReviewService(ChatClient.Builder).
        loanReviewService = new LoanReviewService(mockBuilder);
    }

    @Test
    void testHighRiskLoanApplication() {
        // Simulate the LLM returning a high-risk assessment
        when(mockChatClient.prompt()).thenReturn(mockRequest);
        when(mockRequest.user(anyString())).thenReturn(mockRequest);
        when(mockRequest.call()).thenReturn(mockResponse);
        when(mockResponse.entity(LoanReviewResult.class))
                .thenReturn(new LoanReviewResult("high risk", 8.5, "reject"));

        // Run the code under test
        LoanReviewResult result = loanReviewService.review(
                LoanApplication.builder()
                        .amount(500000)
                        .monthlyIncome(3000)
                        .debtRatio(0.85)
                        .build()
        );

        // Verify the business logic
        assertThat(result.getDecision()).isEqualTo("reject");
        verify(mockChatClient).prompt();
    }
}

Rule 10: Production must have circuit breaking and fallback

// Good: configure a circuit breaker and degrade gracefully when the LLM is unavailable
@Service
@Slf4j
public class ResilientAiService {

    private final ChatClient chatClient;
    private final CircuitBreaker circuitBreaker;
    
    // Fallback responses (generic answers prepared in advance)
    private static final Map<String, String> FALLBACK_RESPONSES = Map.of(
            "customer_service", "Hello, the AI service is temporarily unavailable. Please call our support hotline: 400-xxx-xxxx",
            "code_review", "The code review service is temporarily unavailable. Please try again later",
            "default", "The service is temporarily unavailable. Please try again later"
    );

    public ResilientAiService(ChatClient.Builder builder,
                               CircuitBreakerRegistry circuitBreakerRegistry) {
        this.chatClient = builder.build();
        this.circuitBreaker = circuitBreakerRegistry.circuitBreaker(
                "llm-service",
                CircuitBreakerConfig.custom()
                        .failureRateThreshold(50)         // open the breaker at a 50% failure rate
                        .waitDurationInOpenState(Duration.ofSeconds(30))
                        .slidingWindowSize(10)
                        .build()
        );
    }

    public String callWithCircuitBreaker(String message, String scene) {
        try {
            // executeSupplier takes only the protected call; the fallback is handled in catch
            return circuitBreaker.executeSupplier(
                    () -> chatClient.prompt().user(message).call().content());
        } catch (Exception e) {
            // Covers real failures and CallNotPermittedException (breaker is open)
            log.warn("AI service degraded to fallback: scene={}, error={}", scene, e.getMessage());
            return FALLBACK_RESPONSES.getOrDefault(scene, FALLBACK_RESPONSES.get("default"));
        }
    }
}
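
For readers who have not used a circuit breaker before, the state machine is small enough to sketch in plain Java. This is a deliberately simplified stand-in, not Resilience4j's algorithm: it opens after N consecutive failures instead of tracking a failure rate over a sliding window, and it uses one lock for brevity, which serializes calls. SimpleCircuitBreaker and all of its parameters are hypothetical names.

```java
import java.util.function.LongSupplier;
import java.util.function.Supplier;

/** Simplified breaker: opens after N consecutive failures and stays open for a fixed time. */
final class SimpleCircuitBreaker {

    private final int failureThreshold;
    private final long openMillis;
    private final LongSupplier clock;   // injectable so tests can control time
    private int consecutiveFailures = 0;
    private boolean open = false;
    private long openedAt = 0;

    SimpleCircuitBreaker(int failureThreshold, long openMillis, LongSupplier clock) {
        this.failureThreshold = failureThreshold;
        this.openMillis = openMillis;
        this.clock = clock;
    }

    /** Runs the action unless the breaker is open; any failure or open state yields the fallback. */
    synchronized <T> T call(Supplier<T> action, Supplier<T> fallback) {
        if (open) {
            if (clock.getAsLong() - openedAt < openMillis) {
                return fallback.get();          // open: skip the LLM entirely
            }
            open = false;                       // wait elapsed: allow one trial call (half-open)
        }
        try {
            T result = action.get();
            consecutiveFailures = 0;            // success closes the breaker fully
            return result;
        } catch (RuntimeException e) {
            consecutiveFailures++;
            if (consecutiveFailures >= failureThreshold) {
                open = true;                    // too many failures in a row: (re)open
                openedAt = clock.getAsLong();
            }
            return fallback.get();
        }
    }
}
```

The key property is the open state: once the LLM is clearly down, callers get the fallback immediately instead of each waiting for its own timeout. A production system would normally rely on a maintained library for this, as the Resilience4j example above does.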

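Quick reference for the 10 rules

Highest priority (must do): timeout control, logging, circuit breaking and fallback
High priority (should do): externalized prompts, structured output, sensitive-data anonymization
Medium priority (nice to have): per-scenario ChatClient, standalone Functions, Advisor order
Baseline safeguard: mock coverage in tests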

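Summary

Every one of these 10 rules was learned the hard way.

At its core, AI engineering is no different from ordinary backend engineering: testable, observable, degradable, maintainable. The LLM is simply a black-box component that brings a few new challenges (non-deterministic output, high latency, dependence on an external service), and those need targeted hardening.

Starting today, check your AI project against the list one rule at a time. Fix timeout control and circuit breaking first; those two have the biggest impact on production stability.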