Post #1844: A Documentation Auto-Generation System (an Automated Pipeline from Code Comments to API Docs)
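For a while our team had an embarrassing problem: every time the frontend developers integrated with an endpoint, they had to come and ask the backend team "what does this field mean?" or "what does it mean when this status code comes back?". It was not that the API docs had never been written; the docs simply did not match the code. The code had changed and nobody had updated the docs.
This is not an attitude problem. It is human nature. Nobody remembers to update the documentation right after fixing an urgent bug.
So the solution is not to ask engineers to "be more responsible". It is to automate documentation generation so that the docs are extracted from the code and do not depend on anyone taking the initiative.
In this post I walk through a complete documentation generation system, from Java code comments to Swagger/OpenAPI documents and on to Markdown-style technical docs, forming a pipeline that can actually be put into practice.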
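Why existing approaches fall short
Let's start with the status quo. Most teams use one of two documentation approaches:
- Swagger (Springdoc): you add a large number of annotations to the code. You do get documentation, but the code becomes verbose. For a simple Controller method, the Swagger annotations can outnumber the lines of business code.
- Postman/Apifox: you maintain the collection of endpoints by hand. It is completely detached from the code, and keeping it up to date depends entirely on people's diligence.
The problem the two approaches share is that the documentation and the code are two separate things, kept in sync by people.
The ideal state: the code is the single source of truth, the documentation is generated from the code, and when the code changes the documentation changes with it.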
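System architecture
The core logic of the whole system is: parse the source code with JavaParser, extract the comments and type information, and then generate documents in different formats from what was extracted.
The advantages of this approach:
- The documentation stays in sync with the code, because it is generated from the code
- Comments live in the code and are not separated from the business logic
- Generation can be triggered automatically in CI/CD, with no dependence on people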
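The flow just described (extract once, then render the same extracted model into several formats) can be sketched in plain Java. This is a minimal illustration rather than the system's actual implementation: `Endpoint`, `DocRenderer`, `MarkdownRenderer`, `PlainTextRenderer`, and `DocPipeline` are hypothetical names, and the extraction step is replaced by a hand-built list instead of JavaParser.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical, simplified stand-in for the extracted endpoint model.
record Endpoint(String httpMethod, String path, String summary) {}

// One renderer per output format; every renderer consumes the same model.
interface DocRenderer {
    String format();
    String render(List<Endpoint> endpoints);
}

final class MarkdownRenderer implements DocRenderer {
    public String format() { return "markdown"; }

    public String render(List<Endpoint> endpoints) {
        StringBuilder sb = new StringBuilder("# API\n");
        for (Endpoint e : endpoints) {
            sb.append("- `").append(e.httpMethod()).append(' ').append(e.path())
              .append("` ").append(e.summary()).append('\n');
        }
        return sb.toString();
    }
}

final class PlainTextRenderer implements DocRenderer {
    public String format() { return "text"; }

    public String render(List<Endpoint> endpoints) {
        StringBuilder sb = new StringBuilder();
        for (Endpoint e : endpoints) {
            sb.append(e.httpMethod()).append(' ').append(e.path())
              .append(": ").append(e.summary()).append('\n');
        }
        return sb.toString();
    }
}

final class DocPipeline {
    private final List<DocRenderer> renderers;

    DocPipeline(List<DocRenderer> renderers) {
        this.renderers = renderers;
    }

    // Renders the same extracted model once per format; keys are format names.
    Map<String, String> run(List<Endpoint> extracted) {
        Map<String, String> outputs = new LinkedHashMap<>();
        for (DocRenderer renderer : renderers) {
            outputs.put(renderer.format(), renderer.render(extracted));
        }
        return outputs;
    }
}
```

Because every renderer reads the same list, adding a new output format in this sketch means adding one `DocRenderer` and nothing else.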
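Core implementation: extracting information from code
First, define a data model that represents what we want to extract from the code: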
import java.util.List;
import java.util.Map;
import lombok.Builder;
import lombok.Data;

// Each class below lives in its own file; RequestBodyInfo and FieldInfo are not shown.

// All the information about a single API endpoint
@Data
@Builder
public class ApiEndpointInfo {
    private String controllerClass;
    private String controllerDescription;
    private String methodName;
    private String httpMethod;   // GET/POST/PUT/DELETE
    private String path;         // endpoint path
    private String summary;      // one-line summary
    private String description;  // detailed description
    private List<ParamInfo> pathParams;
    private List<ParamInfo> queryParams;
    private List<ParamInfo> headerParams;
    private RequestBodyInfo requestBody;
    private ResponseInfo responseInfo;
    private List<String> tags;
    private boolean deprecated;
    private String deprecatedReason;
    private List<String> throwsExceptions;
}

@Data
@Builder
public class ParamInfo {
    private String name;
    private String type;
    private String description;
    private boolean required;
    private String defaultValue;
    private String example;
}

@Data
@Builder
public class ResponseInfo {
    private String type;
    private String description;
    private List<FieldInfo> fields;           // response body fields
    private Map<Integer, String> errorCodes;  // error code descriptions
}
Next, parse the source with JavaParser:
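import com.github.javaparser.JavaParser;
import com.github.javaparser.ParseResult;
import com.github.javaparser.ParserConfiguration;
import com.github.javaparser.ast.CompilationUnit;
import com.github.javaparser.ast.body.ClassOrInterfaceDeclaration;
import com.github.javaparser.ast.body.MethodDeclaration;
import com.github.javaparser.ast.body.Parameter;
import com.github.javaparser.javadoc.Javadoc;
import com.github.javaparser.javadoc.JavadocBlockTag;
// TypeSolver's package depends on the JavaParser version; older releases use
// com.github.javaparser.symbolsolver.model.resolution.TypeSolver
import com.github.javaparser.resolution.TypeSolver;
import com.github.javaparser.symbolsolver.JavaSymbolSolver;
import com.github.javaparser.symbolsolver.resolution.typesolvers.CombinedTypeSolver;
import com.github.javaparser.symbolsolver.resolution.typesolvers.JavaParserTypeSolver;
import com.github.javaparser.symbolsolver.resolution.typesolvers.ReflectionTypeSolver;
import java.io.File;
import java.io.FileNotFoundException;
import java.util.*;
import org.springframework.stereotype.Service;

// Helper methods that are called but not listed (isMappingMethod, extractHttpMethod,
// buildParamInfo, and so on) are omitted from this listing.
@Service
public class JavaSourceAnalyzer {
    private final JavaParser javaParser;
    private final TypeSolver typeSolver;

    // Note: as a @Service, Spring must be told how to supply this String (for example via @Value).
    public JavaSourceAnalyzer(String projectSourceRoot) {
        CombinedTypeSolver combinedTypeSolver = new CombinedTypeSolver();
        combinedTypeSolver.add(new ReflectionTypeSolver());
        combinedTypeSolver.add(new JavaParserTypeSolver(new File(projectSourceRoot)));
        this.typeSolver = combinedTypeSolver;
        this.javaParser = new JavaParser(
            new ParserConfiguration().setSymbolResolver(
                new JavaSymbolSolver(combinedTypeSolver)
            )
        );
    }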

    public List<ApiEndpointInfo> analyzeController(File sourceFile) throws FileNotFoundException {
        ParseResult<CompilationUnit> result = javaParser.parse(sourceFile);
        if (!result.isSuccessful()) {
            throw new RuntimeException("Parse failed: " + result.getProblems());
        }
        CompilationUnit cu = result.getResult().orElseThrow();
        List<ApiEndpointInfo> endpoints = new ArrayList<>();
        // Find every Controller class
        cu.findAll(ClassOrInterfaceDeclaration.class).stream()
            .filter(this::isRestController)
            .forEach(controllerClass -> {
                String basePath = extractBasePath(controllerClass);
                String controllerDesc = extractJavadocSummary(controllerClass.getJavadoc());
                // Walk every mapped method
                controllerClass.getMethods().stream()
                    .filter(this::isMappingMethod)
                    .forEach(method -> {
                        ApiEndpointInfo info = analyzeMethod(
                            method, basePath, controllerClass.getNameAsString(), controllerDesc
                        );
                        endpoints.add(info);
                    });
            });
        return endpoints;
    }

    private boolean isRestController(ClassOrInterfaceDeclaration classDecl) {
        return classDecl.getAnnotationByName("RestController").isPresent() ||
               classDecl.getAnnotationByName("Controller").isPresent();
    }

    private String extractBasePath(ClassOrInterfaceDeclaration classDecl) {
        return classDecl.getAnnotationByName("RequestMapping")
            .map(ann -> extractAnnotationValue(ann, "value"))
            .orElse("");
    }

    private ApiEndpointInfo analyzeMethod(MethodDeclaration method,
            String basePath, String controllerName, String controllerDesc) {
        // Extract the HTTP method and path
        HttpMethodInfo httpMethodInfo = extractHttpMethod(method);
        // Extract the Javadoc information
        Javadoc javadoc = method.getJavadoc().orElse(null);
        // extractJavadocSummary takes an Optional<Javadoc>, so pass the Optional directly
        String summary = extractJavadocSummary(method.getJavadoc());
        String description = extractJavadocDescription(javadoc);
        Map<String, String> paramDocs = extractParamDocs(javadoc);
        String returnDoc = extractReturnDoc(javadoc);
        // Extract the method parameter information
        List<ParamInfo> pathParams = new ArrayList<>();
        List<ParamInfo> queryParams = new ArrayList<>();
        List<ParamInfo> headerParams = new ArrayList<>();
        RequestBodyInfo requestBodyInfo = null;
        // Spring treats @RequestParam and @RequestHeader as required unless required = false
        // or a defaultValue is set, so buildParamInfo (not shown) should read the annotation
        // rather than rely only on the flag passed here.
        for (Parameter param : method.getParameters()) {
            if (param.getAnnotationByName("PathVariable").isPresent()) {
                pathParams.add(buildParamInfo(param, paramDocs, true));
            } else if (param.getAnnotationByName("RequestParam").isPresent()) {
                queryParams.add(buildParamInfo(param, paramDocs, false));
            } else if (param.getAnnotationByName("RequestHeader").isPresent()) {
                headerParams.add(buildParamInfo(param, paramDocs, false));
            } else if (param.getAnnotationByName("RequestBody").isPresent()) {
                requestBodyInfo = buildRequestBodyInfo(param, paramDocs);
            }
        }
        return ApiEndpointInfo.builder()
            .controllerClass(controllerName)
            .controllerDescription(controllerDesc)
            .methodName(method.getNameAsString())
            .httpMethod(httpMethodInfo.getMethod())
            // Plain concatenation: assumes basePath has no trailing slash
            .path(basePath + httpMethodInfo.getPath())
            .summary(summary)
            .description(description)
            .pathParams(pathParams)
            .queryParams(queryParams)
            .headerParams(headerParams)
            .requestBody(requestBodyInfo)
            .responseInfo(buildResponseInfo(method, returnDoc))
            .deprecated(method.getAnnotationByName("Deprecated").isPresent())
            .build();
    }

    // The summary is the first line of the Javadoc description, or "" when there is no Javadoc
    private String extractJavadocSummary(Optional<Javadoc> javadocOpt) {
        return javadocOpt
            .map(doc -> doc.getDescription().toText().split("\\n")[0].trim())
            .orElse("");
    }

    // Collects the @param tags into a name -> description map
    private Map<String, String> extractParamDocs(Javadoc javadoc) {
        if (javadoc == null) return Collections.emptyMap();
        Map<String, String> paramDocs = new HashMap<>();
        javadoc.getBlockTags().stream()
            .filter(tag -> tag.getType() == JavadocBlockTag.Type.PARAM)
            .forEach(tag -> {
                String name = tag.getName().orElse("");
                String desc = tag.getContent().toText();
                paramDocs.put(name, desc);
            });
        return paramDocs;
    }
}
Using AI to improve comment quality
Pure parsing can only recover information that already exists. Many code comments are sketchy or missing altogether, and that is where AI comes in.
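To make that limitation concrete, here is a small standalone sketch of what plain extraction recovers from a Javadoc comment: the first description line as the summary and the `@param` tags as a map. It uses regular expressions on the raw comment text purely for illustration; the analyzer in this post uses JavaParser's Javadoc API instead, and `JavadocSketch` and its methods are hypothetical names. A parameter without a `@param` tag has no entry at all, which is the gap the AI step is meant to fill.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class JavadocSketch {
    // Matches "@param name description"; the description stops at the end of the line.
    private static final Pattern PARAM_TAG =
            Pattern.compile("@param\\s+(\\w+)[ \\t]*([^\\n]*)");

    // Strips the comment delimiters and the leading '*' of each line.
    static String clean(String rawComment) {
        String body = rawComment
                .replaceAll("^\\s*/\\*\\*", "")
                .replaceAll("\\*/\\s*$", "");
        StringBuilder sb = new StringBuilder();
        for (String line : body.split("\\R")) {
            sb.append(line.replaceFirst("^\\s*\\*?\\s?", "")).append('\n');
        }
        return sb.toString().trim();
    }

    // First non-empty description line, or "" when the comment has no description.
    static String summary(String rawComment) {
        for (String line : clean(rawComment).split("\\R")) {
            String text = line.trim();
            if (text.startsWith("@")) break;   // block tags reached, no description found
            if (!text.isEmpty()) return text;
        }
        return "";
    }

    // Parameter name -> description, in the order the tags appear.
    static Map<String, String> paramDocs(String rawComment) {
        Map<String, String> docs = new LinkedHashMap<>();
        Matcher m = PARAM_TAG.matcher(clean(rawComment));
        while (m.find()) {
            docs.put(m.group(1), m.group(2).trim());
        }
        return docs;
    }

    // Parameters that appear in the signature but have no usable @param text.
    static List<String> undocumented(List<String> paramNames, String rawComment) {
        Map<String, String> docs = paramDocs(rawComment);
        List<String> missing = new ArrayList<>();
        for (String name : paramNames) {
            if (docs.getOrDefault(name, "").isEmpty()) missing.add(name);
        }
        return missing;
    }
}
```

A list like the one returned by `undocumented` is one possible signal for deciding whether an endpoint needs the enhancement step that follows.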
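import java.util.List;
import java.util.Objects;
import java.util.stream.Stream;
import org.springframework.stereotype.Service;

@Service
public class AiDocumentEnhancer {
    private final AnthropicClient anthropicClient;

    // A final field needs to be assigned; constructor injection does that
    public AiDocumentEnhancer(AnthropicClient anthropicClient) {
        this.anthropicClient = anthropicClient;
    }

    public ApiEndpointInfo enhanceWithAi(ApiEndpointInfo info, String sourceCode) {
        // If the documentation is already good enough, skip the AI step
        if (isDocumentationSufficient(info)) {
            return info;
        }
        String prompt = buildEnhancementPrompt(info, sourceCode);
        // complete(...) is kept as written; the real call depends on the client library in use
        String enhancement = anthropicClient.complete(prompt);
        return applyEnhancement(info, enhancement);
    }

    private boolean isDocumentationSufficient(ApiEndpointInfo info) {
        // Sufficient means: a summary exists and every path, query, and header parameter is described
        if (info.getSummary() == null || info.getSummary().isEmpty()) return false;
        return Stream.of(info.getPathParams(), info.getQueryParams(), info.getHeaderParams())
            .filter(Objects::nonNull)
            .flatMap(List::stream)
            .allMatch(p -> p.getDescription() != null && !p.getDescription().isEmpty());
    }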
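
    private String buildEnhancementPrompt(ApiEndpointInfo info, String sourceCode) {
        return String.format("""
            Below is information about a Java API method. Please fill in the missing documentation.
            Endpoint: %s %s
            Method name: %s
            Existing description: %s
            Method source code:
            ```java
            %s
            ```
            Generate the documentation in the following JSON format (only the missing parts):
            {
              "summary": "one-sentence description of what the endpoint does",
              "description": "detailed description (optional)",
              "paramDescriptions": {
                "parameter name": "parameter description"
              },
              "responseDescription": "description of the return value",
              "possibleErrors": [
                {"code": <error code>, "description": "when this error occurs"}
              ]
            }
            Notes:
            1. Keep descriptions concise and professional, written for API consumers
            2. Parameter descriptions should include the value range or an example value
            3. Infer error codes from the business logic
            4. Output JSON only, nothing else
            """,
            info.getHttpMethod(), info.getPath(),
            info.getMethodName(),
            info.getSummary() != null ? info.getSummary() : "none",
            sourceCode
        );
    }
}
Generating the OpenAPI document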
The extracted information is then converted into the standard OpenAPI 3.0 format.
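One detail of this conversion is worth a closer look before the generator itself: mapping Java type names to OpenAPI schemas. A flat name-to-type lookup sends everything it does not recognize, including `List<String>`, to `object`, and it cannot distinguish `Integer` from `Long`. The sketch below, under the hypothetical name `SchemaMapper`, shows one way to add `format` hints and array handling on top of a simple lookup. It works on type names as plain strings, and maps and custom DTOs still fall back to `object`; those are simplifications of this illustration, not features of the generator shown in this section.

```java
import java.util.LinkedHashMap;
import java.util.Map;

final class SchemaMapper {
    // Java simple type name -> {OpenAPI type, format}; "" means no format.
    private static final Map<String, String[]> SIMPLE_TYPES = Map.ofEntries(
            Map.entry("String", new String[] {"string", ""}),
            Map.entry("Integer", new String[] {"integer", "int32"}),
            Map.entry("int", new String[] {"integer", "int32"}),
            Map.entry("Long", new String[] {"integer", "int64"}),
            Map.entry("long", new String[] {"integer", "int64"}),
            Map.entry("Boolean", new String[] {"boolean", ""}),
            Map.entry("boolean", new String[] {"boolean", ""}),
            Map.entry("Float", new String[] {"number", "float"}),
            Map.entry("float", new String[] {"number", "float"}),
            Map.entry("Double", new String[] {"number", "double"}),
            Map.entry("double", new String[] {"number", "double"}),
            Map.entry("BigDecimal", new String[] {"number", ""}),
            Map.entry("LocalDate", new String[] {"string", "date"})
    );

    static Map<String, Object> schemaFor(String javaType) {
        String type = javaType.trim();
        Map<String, Object> schema = new LinkedHashMap<>();

        // List<T>, Set<T>, Collection<T> and T[] become arrays of T.
        String element = elementType(type);
        if (element != null) {
            schema.put("type", "array");
            schema.put("items", schemaFor(element));
            return schema;
        }

        String[] mapped = SIMPLE_TYPES.get(type);
        if (mapped == null) {
            schema.put("type", "object");   // unknown types fall back to object
            return schema;
        }
        schema.put("type", mapped[0]);
        if (!mapped[1].isEmpty()) {
            schema.put("format", mapped[1]);
        }
        return schema;
    }

    // Returns the element type of a collection or array type name, or null.
    private static String elementType(String type) {
        if (type.endsWith("[]")) {
            return type.substring(0, type.length() - 2);
        }
        for (String prefix : new String[] {"List<", "Set<", "Collection<"}) {
            if (type.startsWith(prefix) && type.endsWith(">")) {
                return type.substring(prefix.length(), type.length() - 1);
            }
        }
        return null;
    }
}
```

With this sketch, `schemaFor("List<String>")` yields an `array` schema whose `items` is a `string` schema, instead of a bare `object`.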
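import com.fasterxml.jackson.core.JsonProcessingException;
import com.fasterxml.jackson.databind.ObjectMapper;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.stream.Collectors;
import org.springframework.stereotype.Service;

@Service
public class OpenApiGenerator {
    // writeValueAsString throws a checked exception, so it has to be declared
    public String generateOpenApiJson(List<ApiEndpointInfo> endpoints, ProjectInfo projectInfo)
            throws JsonProcessingException {
        Map<String, Object> openApi = new LinkedHashMap<>();
        // Basic OpenAPI information
        openApi.put("openapi", "3.0.3");
        openApi.put("info", buildInfo(projectInfo));
        openApi.put("servers", buildServers(projectInfo));
        openApi.put("paths", buildPaths(endpoints));
        openApi.put("components", buildComponents(endpoints));
        return new ObjectMapper()
            .writerWithDefaultPrettyPrinter()
            .writeValueAsString(openApi);
    }

    private Map<String, Object> buildPaths(List<ApiEndpointInfo> endpoints) {
        Map<String, Object> paths = new LinkedHashMap<>();
        // Group by path; a LinkedHashMap keeps discovery order so the output is stable between runs
        Map<String, List<ApiEndpointInfo>> groupedByPath = endpoints.stream()
            .collect(Collectors.groupingBy(ApiEndpointInfo::getPath,
                LinkedHashMap::new, Collectors.toList()));
        groupedByPath.forEach((path, pathEndpoints) -> {
            Map<String, Object> pathItem = new LinkedHashMap<>();
            pathEndpoints.forEach(endpoint -> {
                String method = endpoint.getHttpMethod().toLowerCase(Locale.ROOT);
                pathItem.put(method, buildOperation(endpoint));
            });
            paths.put(path, pathItem);
        });
        return paths;
    }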

    private Map<String, Object> buildOperation(ApiEndpointInfo endpoint) {
        Map<String, Object> operation = new LinkedHashMap<>();
        operation.put("summary", endpoint.getSummary());
        if (endpoint.getDescription() != null && !endpoint.getDescription().isEmpty()) {
            operation.put("description", endpoint.getDescription());
        }
        operation.put("tags", List.of(endpoint.getControllerClass()));
        // operationId must be unique across the whole document, and two controllers may
        // share a method name, so the controller name is used as a prefix
        operation.put("operationId",
            endpoint.getControllerClass() + "_" + endpoint.getMethodName());
        if (endpoint.isDeprecated()) {
            operation.put("deprecated", true);
        }
        // Parameter list
        List<Map<String, Object>> parameters = new ArrayList<>();
        endpoint.getPathParams().forEach(param ->
            parameters.add(buildParameter(param, "path")));
        endpoint.getQueryParams().forEach(param ->
            parameters.add(buildParameter(param, "query")));
        endpoint.getHeaderParams().forEach(param ->
            parameters.add(buildParameter(param, "header")));
        if (!parameters.isEmpty()) {
            operation.put("parameters", parameters);
        }
        // Request body
        if (endpoint.getRequestBody() != null) {
            operation.put("requestBody", buildRequestBody(endpoint.getRequestBody()));
        }
        // Responses
        operation.put("responses", buildResponses(endpoint.getResponseInfo()));
        return operation;
    }

    private Map<String, Object> buildParameter(ParamInfo param, String in) {
        Map<String, Object> parameter = new LinkedHashMap<>();
        parameter.put("name", param.getName());
        parameter.put("in", in);
        parameter.put("required", param.isRequired());
        if (param.getDescription() != null) {
            parameter.put("description", param.getDescription());
        }
        Map<String, Object> schema = new LinkedHashMap<>();
        schema.put("type", mapJavaTypeToOpenApi(param.getType()));
        if (param.getExample() != null) {
            schema.put("example", param.getExample());
        }
        parameter.put("schema", schema);
        return parameter;
    }

    private String mapJavaTypeToOpenApi(String javaType) {
        Map<String, String> typeMapping = Map.of(
            "String", "string",
            "Integer", "integer",
            "int", "integer",
            "Long", "integer",
            "long", "integer",
            "Boolean", "boolean",
            "boolean", "boolean",
            "Double", "number",
            "double", "number",
            "BigDecimal", "number"
        );
        return typeMapping.getOrDefault(javaType, "object");
    }
}
Generating the Markdown document
OpenAPI is well suited to machines, but sometimes you need documentation that people can read directly.
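When descriptions taken from Javadoc are placed into Markdown tables, two things commonly break the output: a missing description is rendered as the literal text `null`, and a `|` or a line break inside a description splits the row. A small helper can guard against both. The name `MarkdownCells` and the choice of `-` as the placeholder are assumptions of this sketch; it is shown separately from the generator in this section.

```java
final class MarkdownCells {
    // Makes a value safe to place inside a single Markdown table cell.
    static String cell(String value) {
        if (value == null || value.isBlank()) {
            return "-";                            // placeholder for missing text
        }
        return value.trim()
                .replace("|", "\\|")               // a bare pipe would end the cell
                .replaceAll("\\s*\\R\\s*", " ");   // a table row must stay on one line
    }

    // Builds one table row, escaping every cell on the way.
    static String row(String... values) {
        StringBuilder sb = new StringBuilder("|");
        for (String value : values) {
            sb.append(' ').append(cell(value)).append(" |");
        }
        return sb.append('\n').toString();
    }
}
```

A call such as `MarkdownCells.row(param.getName(), param.getType(), param.getDescription())` could then replace a chain of `append` calls for one table row.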
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.stream.Collectors;
import org.springframework.stereotype.Service;

@Service
public class MarkdownDocGenerator {
    public String generateMarkdown(List<ApiEndpointInfo> endpoints) {
        StringBuilder sb = new StringBuilder();
        // Group by Controller; a TreeMap keeps the section order stable between runs
        Map<String, List<ApiEndpointInfo>> grouped = endpoints.stream()
            .collect(Collectors.groupingBy(ApiEndpointInfo::getControllerClass,
                TreeMap::new, Collectors.toList()));
        sb.append("# API Documentation\n\n");
        sb.append("> This document is generated from code. Do not edit it by hand.\n\n");
        sb.append("## Table of contents\n\n");
        // Generate the table of contents
        grouped.keySet().forEach(controller -> {
            sb.append("- [").append(controller).append("](#")
              .append(controller.toLowerCase()).append(")\n");
        });
        sb.append("\n---\n\n");
        // Generate the documentation for each Controller
        grouped.forEach((controller, controllerEndpoints) -> {
            sb.append("## ").append(controller).append("\n\n");
            if (!controllerEndpoints.isEmpty() &&
                controllerEndpoints.get(0).getControllerDescription() != null) {
                sb.append(controllerEndpoints.get(0).getControllerDescription()).append("\n\n");
            }
            controllerEndpoints.forEach(endpoint -> {
                appendEndpointDoc(sb, endpoint);
            });
        });
        return sb.toString();
    }

    private void appendEndpointDoc(StringBuilder sb, ApiEndpointInfo endpoint) {
        // Endpoint heading; fall back to the method name when there is no Javadoc summary
        String badge = endpoint.isDeprecated() ? " ~~(deprecated)~~" : "";
        String title = (endpoint.getSummary() == null || endpoint.getSummary().isEmpty())
            ? endpoint.getMethodName() : endpoint.getSummary();
        sb.append("### ").append(title).append(badge).append("\n\n");
        // HTTP method and path
        sb.append("```\n");
        sb.append(endpoint.getHttpMethod()).append(" ").append(endpoint.getPath()).append("\n");
        sb.append("```\n\n");
        if (endpoint.getDescription() != null && !endpoint.getDescription().isEmpty()) {
            sb.append(endpoint.getDescription()).append("\n\n");
        }
        // Path parameters
        if (!endpoint.getPathParams().isEmpty()) {
            sb.append("**Path parameters**\n\n");
            sb.append("| Name | Type | Required | Description |\n");
            sb.append("|------|------|----------|-------------|\n");
            endpoint.getPathParams().forEach(param -> {
                sb.append("| ").append(param.getName())
                  .append(" | ").append(param.getType())
                  // A missing description would otherwise be printed as "null"
                  .append(" | ✅ | ").append(param.getDescription() != null ? param.getDescription() : "-")
                  .append(" |\n");
            });
            sb.append("\n");
        }
        // Query parameters
        if (!endpoint.getQueryParams().isEmpty()) {
            sb.append("**Query parameters**\n\n");
            sb.append("| Name | Type | Required | Default | Description |\n");
            sb.append("|------|------|----------|---------|-------------|\n");
            endpoint.getQueryParams().forEach(param -> {
                sb.append("| ").append(param.getName())
                  .append(" | ").append(param.getType())
                  .append(" | ").append(param.isRequired() ? "✅" : "❌")
                  .append(" | ").append(param.getDefaultValue() != null ? param.getDefaultValue() : "-")
                  .append(" | ").append(param.getDescription() != null ? param.getDescription() : "-")
                  .append(" |\n");
            });
            sb.append("\n");
        }
        sb.append("---\n\n");
    }
}
CI/CD integration
The last step is to integrate documentation generation into the build pipeline.
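The workflow in this section calls an entry point named `com.example.DocGeneratorMain` with `--source`, `--output`, and `--enhance-with-ai` arguments, but that class is not shown in this post. A minimal sketch of how such `--key=value` arguments might be parsed is given here; `DocGeneratorArgs`, its defaults, and its validation rules are all assumptions made for illustration.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical holder for the three options the workflow passes in.
record DocGeneratorArgs(String source, String output, boolean enhanceWithAi) {

    static DocGeneratorArgs parse(String[] args) {
        Map<String, String> options = new LinkedHashMap<>();
        for (String arg : args) {
            if (!arg.startsWith("--") || !arg.contains("=")) {
                throw new IllegalArgumentException("Expected --key=value, got: " + arg);
            }
            int eq = arg.indexOf('=');
            options.put(arg.substring(2, eq), arg.substring(eq + 1));
        }

        String source = options.get("source");
        if (source == null || source.isEmpty()) {
            throw new IllegalArgumentException("--source is required");
        }
        // Assumed defaults: write to "docs" and keep AI enhancement off
        // unless it is requested explicitly.
        String output = options.getOrDefault("output", "docs");
        boolean enhance = Boolean.parseBoolean(
                options.getOrDefault("enhance-with-ai", "false"));
        return new DocGeneratorArgs(source, output, enhance);
    }
}
```

Keeping AI enhancement off by default in this sketch means a run without the flag still produces parser-only documentation.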
# .github/workflows/generate-docs.yml
name: Generate API Documentation
on:
  push:
    branches: [main, develop]
    paths:
      - 'src/main/java/**/*.java'
# Lets the built-in GITHUB_TOKEN push the generated docs to the Pages branch
permissions:
  contents: write
jobs:
  generate-docs:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - name: Set up JDK 17
        uses: actions/setup-java@v4
        with:
          java-version: '17'
          distribution: 'temurin'
      - name: Generate documentation
        env:
          ANTHROPIC_API_KEY: ${{ secrets.ANTHROPIC_API_KEY }}
        run: |
          mvn compile -q
          mvn exec:java -Dexec.mainClass="com.example.DocGeneratorMain" \
            -Dexec.args="--source=src/main/java --output=docs --enhance-with-ai=true"
      - name: Deploy to GitHub Pages
        uses: peaceiris/actions-gh-pages@v3
        with:
          github_token: ${{ secrets.GITHUB_TOKEN }}
          publish_dir: ./docs
          destination_dir: api-docs
      - name: Notify Slack
        if: always()
        uses: 8398a7/action-slack@v3
        with:
          status: ${{ job.status }}
          # This step runs on failure too, so the message reports the job status
          text: "API docs job ${{ job.status }}: https://${{ github.repository_owner }}.github.io/${{ github.event.repository.name }}/api-docs"
        env:
          # action-slack reads the webhook from this variable; the secret name is an assumption
          SLACK_WEBHOOK_URL: ${{ secrets.SLACK_WEBHOOK_URL }}
A mistake I almost made
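While building this system I almost made a mistake: letting the AI directly "infer" an endpoint's business logic and write it into the documentation.
For example, we have an endpoint called /orders/cancel. When I asked the AI to infer the documentation from the method signature, it produced a description along the lines of "The user cancels the order, which also triggers the refund process."
That sounds right, but in our system cancelling an order and refunding are two separate endpoints. /orders/cancel only cancels; it does not refund.
A description like that in the docs gives frontend developers the wrong expectation, and they then spend time investigating why the refund was never triggered.
The lesson: AI enhancement may only improve the "how to use it" part of the docs (parameter descriptions, format requirements). It must not infer the "what it does" part (business semantics). Business semantics must come from the code itself or be written by a person.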
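One way to enforce this boundary in code rather than by convention is to make the merge step accept only parameter-level suggestions and ignore anything the model says about what the endpoint does. The sketch below is a hypothetical illustration: `AiSuggestion`, `EndpointDoc`, and `SafeMerge` are not part of the system described in this post, and both the rule that existing text always wins and the review marker are assumptions.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// What the model returned, already parsed from its JSON reply.
record AiSuggestion(String summary, String description,
                    Map<String, String> paramDescriptions) {}

// Documentation extracted from the code: summary plus per-parameter text.
record EndpointDoc(String summary, Map<String, String> paramDescriptions) {}

final class SafeMerge {
    static final String AI_MARK = " (AI-generated, please review)";

    static EndpointDoc merge(EndpointDoc fromCode, AiSuggestion fromAi) {
        Map<String, String> merged = new LinkedHashMap<>(fromCode.paramDescriptions());
        for (Map.Entry<String, String> suggestion : fromAi.paramDescriptions().entrySet()) {
            String name = suggestion.getKey();
            String text = suggestion.getValue();
            boolean known = merged.containsKey(name);
            String existing = merged.get(name);
            boolean missing = existing == null || existing.isBlank();
            // Only fill gaps for parameters that really exist in the code,
            // and label the text so readers know where it came from.
            if (known && missing && text != null && !text.isBlank()) {
                merged.put(name, text.trim() + AI_MARK);
            }
        }
        // The AI summary and description are deliberately dropped: "what the
        // endpoint does" stays whatever the code says, even if that is empty.
        return new EndpointDoc(fromCode.summary(), merged);
    }
}
```

With a merge like this, a suggested summary such as "cancels the order and triggers a refund" never reaches the published docs, while an empty description for an `orderId` parameter can still be filled in and flagged for review.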
