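Fusing Enterprise Knowledge Graphs with LLMs: An Architecture for Next-Generation Intelligent Q&A
2026/10/4 · about 13 min read · Knowledge Graph, LLM Fusion, GraphRAG, Neo4j, Java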
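Opening story: Zheng Ying's "knowledge silos"
In September 2025, Zheng Ying, IT director at a large manufacturing company, ran into a thorny problem.
Her company had accumulated 18 years of technical documentation:
- more than 3,000 equipment maintenance manuals (PDF)
- more than 50,000 fault-handling records
- a complex web of relationships among equipment, parts, and suppliers
The company had launched a RAG question-answering system, but the results gave Zheng Ying a headache:
A user asked: "For the last 3 faults of the air compressor on Production Line 3, did any of the spare parts used come from the same supplier?"
The RAG system's answer: it retrieved 3 documents about air compressor faults and stitched together a vague summary that could not answer the relational question about supplier overlap.
The core of the problem is that this question calls for multi-step reasoning over relationships, not retrieval:
- find the 3 most recent fault records for the Line 3 air compressor
- extract the spare part models used in each fault
- look up the suppliers of those parts
- check whether any supplier appears more than once
Vector-only RAG cannot solve this; it takes a knowledge graph.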
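The four steps above amount to a multi-hop traversal. As a minimal sketch, assuming a tiny in-memory stand-in for the graph (the `Fault` record, the part-to-supplier map, and every name here are hypothetical, not the system's actual data model), the reasoning could look like this:

```java
import java.time.LocalDate;
import java.util.Comparator;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical in-memory stand-in for the graph; illustrative only.
class SupplierOverlap {
    record Fault(String equipment, LocalDate occurredAt, List<String> sparePartNos) {}

    /** Suppliers whose parts were used in at least two of the lastN most recent faults. */
    static Set<String> sharedSuppliers(List<Fault> faults, Map<String, String> supplierByPart,
                                       String equipment, int lastN) {
        // Hop 1: the most recent faults of the given equipment
        List<Fault> recent = faults.stream()
                .filter(f -> f.equipment().equals(equipment))
                .sorted(Comparator.comparing(Fault::occurredAt).reversed())
                .limit(lastN)
                .toList();
        // Hops 2 and 3: fault -> spare parts -> suppliers, counted once per fault
        Map<String, Integer> faultsPerSupplier = new HashMap<>();
        for (Fault f : recent) {
            Set<String> suppliers = new HashSet<>();
            for (String part : f.sparePartNos()) {
                String supplier = supplierByPart.get(part);
                if (supplier != null) suppliers.add(supplier);
            }
            suppliers.forEach(s -> faultsPerSupplier.merge(s, 1, Integer::sum));
        }
        // Step 4: overlap means the supplier shows up in two or more faults
        Set<String> shared = new TreeSet<>();
        faultsPerSupplier.forEach((s, n) -> { if (n >= 2) shared.add(s); });
        return shared;
    }

    public static void main(String[] args) {
        List<Fault> faults = List.of(
                new Fault("compressor", LocalDate.of(2025, 8, 1), List.of("P1")),
                new Fault("compressor", LocalDate.of(2025, 7, 1), List.of("P2", "P3")));
        System.out.println(sharedSuppliers(faults, Map.of("P1", "Acme", "P2", "Acme", "P3", "Bolt"),
                "compressor", 3));
    }
}
```

A similarity search has no notion of these hops, whereas a graph database expresses each one as a relationship traversal.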
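Zheng Ying's team spent three months building an enterprise equipment knowledge graph and fusing it with an LLM:
- Neo4j stores the relationship network of equipment, faults, spare parts, and suppliers
- the LLM translates natural-language questions into Cypher queries
- the graph query results are passed back to the LLM to produce a natural-language answer
After launch:
- accuracy on relational questions rose from 23% to 91%
- user satisfaction rose from 3.2 to 4.6 (out of 5)
- the average troubleshooting time for maintenance engineers dropped from 2.3 hours to 45 minutes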
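TL;DR
- Limits of vector-only RAG: it cannot handle relational reasoning or multi-hop queries
- Strengths of a knowledge graph: exact relationship queries and explainable reasoning paths
- GraphRAG architecture: natural language → Cypher → graph query → LLM-generated answer
- Java integration: Spring AI + Neo4j Java Driver + Cypher generation
- KG + Vector hybrid: for complex questions, run both paths in parallel so each covers the other's weaknesses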
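1. Knowledge graphs vs. vector databases: a complementary relationship
1.1 Core comparison
| Dimension | Vector database (RAG) | Knowledge graph |
|---|---|---|
| Data type | Unstructured text | Structured relationships |
| Query method | Semantic similarity | Exact relationship traversal |
| Good at | "Documents about X" | "How are X and Y related?" |
| Explainability | Low (similarity scores) | High (explicit paths) |
| Update cost | Low (re-embed) | Medium (maintain relationships) |
| Multi-hop reasoning | Weak | Strong |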
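1.2 Typical question types
Questions suited to vector RAG:
✓ "What are the main risks of smart contracts?"
✓ "What are the installation steps for device XX?"
✓ "What did last quarter's sales analysis report say?"
Questions suited to a knowledge graph:
✓ "Of device A's faults in the last 3 months, how many involve supplier B?"
✓ "Which colleagues are in the same department as employee C and serve the same customers?"
✓ "Which other projects have the engineers who worked on project D taken part in?"
Questions best served by combining both:
✓ "What are the main causes of the historical faults of the Line 3 air compressor, and are the spare-part suppliers concentrated?"
(requires text analysis + relationship queries)
2. Knowledge graph modeling: an equipment maintenance scenario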
2.1 Graph data model design
Node types (Node):
├── Equipment: id, name, model, location, installDate
├── FaultRecord: id, faultType, description, occurredAt, resolvedAt
├── Spare (spare part): id, partNo, name, specification
├── Supplier: id, name, contactInfo, qualificationLevel
├── Engineer: id, name, specialty, certifications
└── ProductionLine: id, name, department
Relationship types (Relationship):
├── BELONGS_TO: Equipment → ProductionLine (equipment belongs to a production line)
├── HAS_FAULT: Equipment → FaultRecord (equipment has a fault record)
├── USED_IN: Spare → FaultRecord (spare part was used in a fault repair)
├── SUPPLIED_BY: Spare → Supplier (spare part is supplied by a supplier)
├── HANDLED_BY: FaultRecord → Engineer (fault was handled by an engineer)
└── HAS_MODEL: Equipment → Model (equipment is linked to its model)
2.2 Neo4j graph modeling (Java SDK)
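Before data is imported, each relationship can be checked against the model from 2.1. The following is a minimal sketch, assuming the schema is held as a set of (source label, relationship type, target label) triples; `GraphSchema` and `Triple` are hypothetical helper names, and `HAS_MODEL` is left out because 2.1 does not define a `Model` node.

```java
import java.util.Set;

// Hypothetical helper: the relationship triples of section 2.1 as data.
class GraphSchema {
    record Triple(String from, String type, String to) {}

    static final Set<Triple> ALLOWED = Set.of(
            new Triple("Equipment", "BELONGS_TO", "ProductionLine"),
            new Triple("Equipment", "HAS_FAULT", "FaultRecord"),
            new Triple("Spare", "USED_IN", "FaultRecord"),
            new Triple("Spare", "SUPPLIED_BY", "Supplier"),
            new Triple("FaultRecord", "HANDLED_BY", "Engineer"));

    /** True when the relationship exists in the schema with this direction. */
    static boolean isAllowed(String from, String type, String to) {
        return ALLOWED.contains(new Triple(from, type, to));
    }

    public static void main(String[] args) {
        System.out.println(isAllowed("Spare", "SUPPLIED_BY", "Supplier"));
        System.out.println(isAllowed("Supplier", "SUPPLIED_BY", "Spare"));
    }
}
```

Because records compare by value, a reversed direction such as Supplier → Spare is rejected without extra code.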
// KnowledgeGraphBuilder.java
import java.util.List;
import java.util.Map;

import lombok.extern.slf4j.Slf4j;
import org.neo4j.driver.Driver;
import org.neo4j.driver.Session;
import org.springframework.stereotype.Service;

@Service
@Slf4j
public class KnowledgeGraphBuilder {

    private final Driver neo4jDriver;

    // Reuse the shared Driver bean (configured in section 6.1) rather than opening a second connection pool
    public KnowledgeGraphBuilder(Driver neo4jDriver) {
        this.neo4jDriver = neo4jDriver;
    }

    // Create indexes (to speed up lookups)
    public void createIndexes() {
        try (Session session = neo4jDriver.session()) {
            session.run("CREATE INDEX equipment_id IF NOT EXISTS FOR (e:Equipment) ON (e.id)");
            session.run("CREATE INDEX fault_record_id IF NOT EXISTS FOR (f:FaultRecord) ON (f.id)");
            session.run("CREATE INDEX supplier_name IF NOT EXISTS FOR (s:Supplier) ON (s.name)");
            session.run("CREATE FULLTEXT INDEX fault_description IF NOT EXISTS FOR (f:FaultRecord) ON EACH [f.description]");
            log.info("Knowledge graph indexes created");
        }
    }
    // Import equipment data
    public void importEquipment(List<EquipmentData> equipmentList) {
        String cypher = """
                UNWIND $equipments AS eq
                MERGE (e:Equipment {id: eq.id})
                SET e.name = eq.name,
                    e.model = eq.model,
                    e.location = eq.location,
                    e.installDate = date(eq.installDate)
                WITH e, eq
                MATCH (pl:ProductionLine {id: eq.productionLineId})
                MERGE (e)-[:BELONGS_TO]->(pl)
                """;
        List<Map<String, Object>> params = equipmentList.stream()
                .map(this::toMap)
                .toList();
        try (Session session = neo4jDriver.session()) {
            // Spring's @Transactional does not cover sessions opened directly on the driver,
            // so the write runs in a driver-managed transaction (executeWrite, driver 5.x)
            session.executeWrite(tx -> tx.run(cypher, Map.of("equipments", params)).consume());
            log.info("Imported {} equipment records", equipmentList.size());
        }
    }
    // Import a fault record
    public void importFaultRecord(FaultRecord fault) {
        String cypher = """
                MATCH (e:Equipment {id: $equipmentId})
                MERGE (f:FaultRecord {id: $faultId})
                SET f.faultType = $faultType,
                    f.description = $description,
                    f.occurredAt = datetime($occurredAt),
                    f.resolvedAt = datetime($resolvedAt)
                MERGE (e)-[:HAS_FAULT]->(f)
                WITH f
                // Link the engineer first: with an empty $spares list, UNWIND would
                // produce no rows and any clause placed after it would never run
                MATCH (eng:Engineer {id: $engineerId})
                MERGE (f)-[:HANDLED_BY]->(eng)
                WITH f
                // Link spare parts; quantity is set after MERGE so that a re-import
                // updates the relationship instead of creating a duplicate
                UNWIND $spares AS spare
                MATCH (s:Spare {id: spare.id})
                MERGE (s)-[r:USED_IN]->(f)
                SET r.quantity = spare.quantity
                """;
        try (Session session = neo4jDriver.session()) {
            // Driver-managed write transaction (executeWrite, driver 5.x)
            session.executeWrite(tx -> tx.run(cypher, Map.of(
                    "equipmentId", fault.getEquipmentId(),
                    "faultId", fault.getId(),
                    "faultType", fault.getFaultType(),
                    "description", fault.getDescription(),
                    "occurredAt", fault.getOccurredAt().toString(),
                    "resolvedAt", fault.getResolvedAt().toString(),
                    "spares", fault.getUsedSpares(),
                    "engineerId", fault.getHandledBy()
            )).consume());
        }
    }
}
3. The core of GraphRAG: natural language to Cypher
3.1 Text2Cypher implementation
// Text2CypherService.java
@Service
@Slf4j
@RequiredArgsConstructor // Lombok: generates the constructor that injects chatClient
public class Text2CypherService {

    private final ChatClient chatClient;

    // Declared as a constant: a final field cannot be assigned from a @PostConstruct method
    private static final String GRAPH_SCHEMA = """
            ## Knowledge graph schema
            ### Node types
            - Equipment: {id, name, model, location, installDate}
            - FaultRecord: {id, faultType, description, occurredAt, resolvedAt}
            - Spare: {id, partNo, name, specification, unitPrice}
            - Supplier: {id, name, contactInfo, qualificationLevel}
            - Engineer: {id, name, specialty}
            - ProductionLine: {id, name, department}
            ### Relationship types
            - (Equipment)-[:BELONGS_TO]->(ProductionLine)
            - (Equipment)-[:HAS_FAULT]->(FaultRecord)
            - (Spare)-[:USED_IN {quantity}]->(FaultRecord)
            - (Spare)-[:SUPPLIED_BY]->(Supplier)
            - (FaultRecord)-[:HANDLED_BY]->(Engineer)
            ### Time formats
            - date: YYYY-MM-DD (used for installDate)
            - datetime: ISO 8601 (used for occurredAt, resolvedAt)
            """;
    public String generateCypher(String naturalLanguageQuestion) {
        String prompt = String.format("""
                You are a Neo4j Cypher expert. Based on the knowledge graph schema
                and the user's question, generate an accurate Cypher query.
                %s
                ## Rules
                1. Return only the Cypher statement, with no explanation
                2. Inline literal values; do not use $parameters, because the query is executed without a parameter map
                3. Add a LIMIT to avoid returning too much data (default LIMIT 50)
                4. Use the datetime() function for time conditions
                5. Use CONTAINS or =~ for fuzzy matching
                ## Few-shot examples
                Question: Which fault records has Production Line 3 had in the last 3 months?
                Cypher:
                MATCH (pl:ProductionLine {name: "Production Line 3"})<-[:BELONGS_TO]-(e:Equipment)
                      -[:HAS_FAULT]->(f:FaultRecord)
                WHERE f.occurredAt > datetime() - duration("P3M")
                RETURN e.name AS equipment, f.faultType, f.description,
                       f.occurredAt AS time
                ORDER BY f.occurredAt DESC
                LIMIT 50
                Question: For the last 3 air compressor faults, do the spare parts used share a supplier?
                Cypher:
                MATCH (e:Equipment)-[:HAS_FAULT]->(f:FaultRecord)
                WHERE e.name CONTAINS "air compressor"
                WITH f ORDER BY f.occurredAt DESC LIMIT 3
                MATCH (f)<-[:USED_IN]-(s:Spare)-[:SUPPLIED_BY]->(sup:Supplier)
                WITH sup.name AS supplierName, count(DISTINCT f) AS faultCount
                RETURN supplierName, faultCount
                ORDER BY faultCount DESC
                ## User question
                %s
                Now generate the query:
                """,
                GRAPH_SCHEMA,
                naturalLanguageQuestion
        );
        String cypher = chatClient.prompt()
                .user(prompt)
                .call()
                .content()
                .trim();
        // Strip a markdown code fence if the model added one
        cypher = cypher.replaceAll("```cypher\\s*", "")
                .replaceAll("```\\s*", "")
                .trim();
        log.debug("Generated Cypher: {}", cypher);
        return cypher;
    }
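    // Check that the Cypher is read-only. Whole-word matching avoids false positives on
    // identifiers such as createdAt; DROP, FOREACH, CALL and LOAD CSV are blocked as well.
    private static final java.util.regex.Pattern WRITE_CLAUSE = java.util.regex.Pattern.compile(
            "\\b(CREATE|MERGE|DELETE|DETACH|REMOVE|SET|DROP|FOREACH|CALL|LOAD\\s+CSV)\\b",
            java.util.regex.Pattern.CASE_INSENSITIVE);

    public boolean isSafeCypher(String cypher) {
        return !WRITE_CLAUSE.matcher(cypher).find();
    }
}
3.2 Cypher execution and result handling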
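Both the generator in 3.1 and the executor in this section gate queries with a keyword check. A plain substring test flags harmless identifiers (for example a property named `createdAt`) and text inside string literals. The sketch below is one possible stricter guard in plain Java; the class name `ReadOnlyCypherGuard` and the keyword list are assumptions, and it deliberately rejects every `CALL`, including read-only procedures.

```java
import java.util.regex.Pattern;

// Hypothetical guard: a conservative lexical filter, not a Cypher parser.
class ReadOnlyCypherGuard {
    // Single- or double-quoted string literals, with backslash escapes
    private static final Pattern STRING_LITERAL =
            Pattern.compile("'(?:\\\\.|[^'\\\\])*'|\"(?:\\\\.|[^\"\\\\])*\"");

    // Whole-word match, so createdAt or OFFSET do not trigger the filter
    private static final Pattern WRITE_CLAUSE = Pattern.compile(
            "\\b(CREATE|MERGE|DELETE|DETACH|REMOVE|SET|DROP|FOREACH|CALL|LOAD\\s+CSV)\\b",
            Pattern.CASE_INSENSITIVE);

    static boolean isReadOnly(String cypher) {
        // Blank out literals first so that words inside quoted text are ignored
        String withoutLiterals = STRING_LITERAL.matcher(cypher).replaceAll("''");
        return !WRITE_CLAUSE.matcher(withoutLiterals).find();
    }

    public static void main(String[] args) {
        System.out.println(isReadOnly("MATCH (e:Equipment) RETURN e.createdAt LIMIT 50"));
        System.out.println(isReadOnly("MATCH (n) DETACH DELETE n"));
    }
}
```

A lexical filter like this is only a first line of defense; running generated queries with a read-only database user or in a read-mode session is the sturdier safeguard.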
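// GraphQueryExecutor.java
import java.util.ArrayList;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

import lombok.RequiredArgsConstructor;
import lombok.extern.slf4j.Slf4j;
import org.neo4j.driver.AccessMode;
import org.neo4j.driver.Driver;
import org.neo4j.driver.Record; // explicit import so that it wins over java.lang.Record
import org.neo4j.driver.Result;
import org.neo4j.driver.Session;
import org.neo4j.driver.SessionConfig;
import org.neo4j.driver.Value;
import org.neo4j.driver.types.Node;
import org.neo4j.driver.types.Relationship;
import org.springframework.stereotype.Service;

@Service
@Slf4j
@RequiredArgsConstructor
public class GraphQueryExecutor {

    private final Driver neo4jDriver;

    public GraphQueryResult executeCypher(String cypher) {
        if (!isSafe(cypher)) {
            throw new SecurityException("Illegal Cypher query (contains write operations)");
        }
        // A READ-mode session is a second line of defense on top of the keyword check
        SessionConfig readOnly = SessionConfig.builder()
                .withDefaultAccessMode(AccessMode.READ)
                .build();
        try (Session session = neo4jDriver.session(readOnly)) {
            Result result = session.run(cypher);
            List<Map<String, Object>> rows = new ArrayList<>();
            List<String> columns = null;
            while (result.hasNext()) {
                Record record = result.next();
                if (columns == null) {
                    columns = record.keys();
                }
                Map<String, Object> row = new LinkedHashMap<>();
                for (String key : record.keys()) {
                    Value value = record.get(key);
                    row.put(key, convertValue(value));
                }
                rows.add(row);
            }
            return GraphQueryResult.builder()
                    .cypher(cypher)
                    .columns(columns != null ? columns : Collections.emptyList())
                    .rows(rows)
                    .rowCount(rows.size())
                    .build();
        } catch (Exception e) {
            log.error("Cypher execution failed: {}", cypher, e);
            throw new GraphQueryException("Query execution failed: " + e.getMessage(), e);
        }
    }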
    // Convert Neo4j values to plain Java types
    private Object convertValue(Value value) {
        if (value.isNull()) return null;
        return switch (value.type().name()) {
            case "STRING" -> value.asString();
            case "INTEGER" -> value.asLong();
            case "FLOAT" -> value.asDouble();
            case "BOOLEAN" -> value.asBoolean();
            case "DATE" -> value.asLocalDate().toString();
            case "DATE_TIME" -> value.asZonedDateTime().toString();
            case "NODE" -> convertNode(value.asNode());
            case "RELATIONSHIP" -> convertRelationship(value.asRelationship());
            // Some driver versions report list types as "LIST OF ANY?", so both spellings are accepted
            case "LIST", "LIST OF ANY?" -> value.asList(this::convertValue);
            // Maps, such as {fault: ..., supplier: ...} projections, are converted entry by entry
            case "MAP" -> value.asMap(this::convertValue);
            default -> value.toString();
        };
    }
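    private Map<String, Object> convertNode(Node node) {
        Map<String, Object> result = new LinkedHashMap<>();
        result.put("_labels", node.labels());
        node.keys().forEach(key -> result.put(key, convertValue(node.get(key))));
        return result;
    }

    // Counterpart of convertNode for relationships: the type plus all properties
    private Map<String, Object> convertRelationship(Relationship relationship) {
        Map<String, Object> result = new LinkedHashMap<>();
        result.put("_type", relationship.type());
        relationship.keys().forEach(key -> result.put(key, convertValue(relationship.get(key))));
        return result;
    }

    // Whole-word, case-insensitive check for write clauses (MERGE and DETACH included)
    private boolean isSafe(String cypher) {
        return !cypher.matches("(?is).*\\b(CREATE|MERGE|DELETE|DETACH|REMOVE|SET|DROP)\\b.*");
    }
}
3.3 Graph + LLM answer generation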
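The generate, execute, repair cycle that the service in this section relies on can be sketched independently of Spring and Neo4j. In the sketch below the LLM and the database are replaced by plain functions; `CypherRepairLoop`, `Outcome`, and the parameter names are hypothetical, and the repair function receives the failed query together with the error message so the model can see what it is fixing.

```java
import java.util.List;
import java.util.Map;
import java.util.Optional;
import java.util.function.BiFunction;
import java.util.function.Function;

// Hypothetical, framework-free sketch of a bounded repair loop.
class CypherRepairLoop {
    /** The query that finally succeeded, its rows, and how many attempts it took. */
    record Outcome(String cypher, List<Map<String, Object>> rows, int attempts) {}

    static Optional<Outcome> run(String question,
                                 Function<String, String> generate,          // question -> Cypher
                                 BiFunction<String, String, String> repair,  // (failed Cypher, error) -> Cypher
                                 Function<String, List<Map<String, Object>>> execute,
                                 int maxAttempts) {
        String cypher = generate.apply(question);
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                return Optional.of(new Outcome(cypher, execute.apply(cypher), attempt));
            } catch (RuntimeException e) {
                if (attempt == maxAttempts) break;              // budget exhausted
                cypher = repair.apply(cypher, e.getMessage());  // feed the error back
            }
        }
        return Optional.empty();
    }

    public static void main(String[] args) {
        Optional<Outcome> outcome = run("demo question",
                q -> "MATCH (n) RETURN n LIMIT 1",
                (failed, error) -> failed,
                c -> List.of(Map.<String, Object>of("n", 1)),
                2);
        System.out.println(outcome.isPresent());
    }
}
```

Capping the number of attempts keeps latency and token cost bounded when the model keeps producing invalid queries.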
// GraphRagService.java
@Service
@Slf4j
@RequiredArgsConstructor // Lombok: constructor injection for the three final fields
public class GraphRagService {

    private final Text2CypherService text2CypherService;
    private final GraphQueryExecutor queryExecutor;
    private final ChatClient answerGenerationClient;

    public GraphRagAnswer answer(String question) {
        long startTime = System.currentTimeMillis();
        try {
            // Step 1: translate the question into Cypher
            String cypher = text2CypherService.generateCypher(question);
            if (!text2CypherService.isSafeCypher(cypher)) {
                return GraphRagAnswer.error("The generated query contains unsafe operations");
            }
            // Step 2: execute the Cypher query
            GraphQueryResult queryResult = queryExecutor.executeCypher(cypher);
            if (queryResult.getRows().isEmpty()) {
                return GraphRagAnswer.noData(question, cypher,
                        "No matching data was found in the knowledge graph");
            }
            // Step 3: format the query result as text the LLM can read
            String dataContext = formatQueryResult(queryResult);
            // Step 4: let the LLM write the natural-language answer
            String answer = generateAnswer(question, cypher, dataContext);
            return GraphRagAnswer.builder()
                    .question(question)
                    .answer(answer)
                    .cypher(cypher)
                    .dataContext(dataContext)
                    .rowCount(queryResult.getRowCount())
                    .processingTimeMs(System.currentTimeMillis() - startTime)
                    .build();
        } catch (GraphQueryException e) {
            // Cypher execution failed: try an automatic repair
            log.warn("Cypher execution failed, attempting automatic repair: {}", e.getMessage());
            return retryWithFix(question, e);
        }
    }
    private String formatQueryResult(GraphQueryResult result) {
        StringBuilder sb = new StringBuilder();
        sb.append("Query result (").append(result.getRowCount()).append(" rows):\n\n");
        // Markdown table layout
        sb.append("| ");
        result.getColumns().forEach(col -> sb.append(col).append(" | "));
        sb.append("\n|");
        result.getColumns().forEach(col -> sb.append("---|"));
        sb.append("\n");
        for (Map<String, Object> row : result.getRows()) {
            sb.append("| ");
            result.getColumns().forEach(col -> {
                Object val = row.get(col);
                sb.append(val != null ? val.toString() : "N/A").append(" | ");
            });
            sb.append("\n");
        }
        return sb.toString();
    }

    // The cypher argument is kept for logging and tracing; it is intentionally not shown to the LLM
    private String generateAnswer(String question, String cypher, String data) {
        String prompt = String.format("""
                User question: %s
                Data retrieved from the knowledge graph:
                %s
                Answer the user's question clearly and concisely, based on the data above.
                If the data is not sufficient for a complete answer, say so.
                Do not mention Cypher or technical implementation details.
                """, question, data);
        return answerGenerationClient.prompt()
                .system("You are an enterprise knowledge assistant that helps users get information from the equipment maintenance knowledge graph.")
                .user(prompt)
                .call()
                .content();
    }
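    // Ask the LLM for a corrected query after a failed execution (a single retry)
    private GraphRagAnswer retryWithFix(String question, GraphQueryException error) {
        String fixPrompt = String.format("""
                A Cypher query generated for the question below failed to execute.
                Original question: %s
                Error message: %s
                Generate a new, correct Cypher query.
                """, question, error.getMessage());
        String fixedCypher = text2CypherService.generateCypher(fixPrompt);
        // The repaired query goes through the same safety check as the first attempt
        if (!text2CypherService.isSafeCypher(fixedCypher)) {
            return GraphRagAnswer.error("The generated query contains unsafe operations");
        }
        try {
            GraphQueryResult result = queryExecutor.executeCypher(fixedCypher);
            if (result.getRows().isEmpty()) {
                return GraphRagAnswer.noData(question, fixedCypher,
                        "No matching data was found in the knowledge graph");
            }
            String data = formatQueryResult(result);
            String answer = generateAnswer(question, fixedCypher, data);
            return GraphRagAnswer.builder()
                    .question(question)
                    .answer(answer)
                    .cypher(fixedCypher)
                    .wasRetried(true)
                    .build();
        } catch (Exception e) {
            return GraphRagAnswer.error("The knowledge graph query could not be executed; please simplify the question and try again");
        }
    }
}
4. Hybrid architecture: KG + Vector in parallel
4.1 Hybrid question routing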
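When both paths run in parallel, a failure or a slow response on one side should not sink the whole answer. The sketch below shows one way to guard each branch with a timeout and a fallback using only `CompletableFuture`; `ParallelFanOut`, `Both`, and the empty-string fallback are illustrative assumptions rather than part of the service code in this section.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.TimeUnit;
import java.util.function.Supplier;

// Hypothetical sketch: two retrieval paths, each degraded to "" on error or timeout.
class ParallelFanOut {
    record Both(String graph, String vector) {}

    static Both askBoth(Supplier<String> graphPath, Supplier<String> vectorPath, long timeoutMs) {
        CompletableFuture<String> graph = guarded(graphPath, timeoutMs);
        CompletableFuture<String> vector = guarded(vectorPath, timeoutMs);
        // join() cannot throw here because every failure was mapped to a fallback value
        return new Both(graph.join(), vector.join());
    }

    private static CompletableFuture<String> guarded(Supplier<String> path, long timeoutMs) {
        return CompletableFuture.supplyAsync(path)
                .completeOnTimeout("", timeoutMs, TimeUnit.MILLISECONDS) // slow branch
                .exceptionally(ex -> "");                                // failed branch
    }

    public static void main(String[] args) {
        Both both = askBoth(() -> "graph rows", () -> "document snippets", 1000);
        System.out.println(both);
    }
}
```

The merge step can then treat an empty branch as "no information from this source" and answer from the other one.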
// HybridRagService.java
@Service
@Slf4j
@RequiredArgsConstructor // Lombok: constructor injection for the three final fields
public class HybridRagService {

    private final GraphRagService graphRagService;
    private final VectorRagService vectorRagService;
    private final ChatClient routerClient;

    public HybridAnswer answer(String question) {
        // 1. Decide which kind of question this is
        QuestionType questionType = classifyQuestion(question);
        return switch (questionType) {
            case RELATIONSHIP_QUERY -> {
                // Graph query only
                GraphRagAnswer graphAnswer = graphRagService.answer(question);
                yield HybridAnswer.fromGraph(graphAnswer);
            }
            case DOCUMENT_RETRIEVAL -> {
                // Vector retrieval only
                VectorRagAnswer vectorAnswer = vectorRagService.answer(question);
                yield HybridAnswer.fromVector(vectorAnswer);
            }
            case HYBRID -> {
                // Run both paths in parallel, then merge
                CompletableFuture<GraphRagAnswer> graphFuture =
                        CompletableFuture.supplyAsync(() -> graphRagService.answer(question));
                CompletableFuture<VectorRagAnswer> vectorFuture =
                        CompletableFuture.supplyAsync(() -> vectorRagService.answer(question));
                GraphRagAnswer graphAnswer = graphFuture.join();
                VectorRagAnswer vectorAnswer = vectorFuture.join();
                // Fuse the two results
                String mergedAnswer = mergeAnswers(question, graphAnswer, vectorAnswer);
                yield HybridAnswer.fromBoth(mergedAnswer, graphAnswer, vectorAnswer);
            }
        };
    }
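    // Keyword-based question classification. The keyword lists stay in Chinese
    // because they are matched against Chinese-language user questions.
    private QuestionType classifyQuestion(String question) {
        // Relationship keywords (which, how many, is there any, linked, related, from, belongs to,
        // responsible for, takes part in, supplier, production line, fault count)
        List<String> relationshipKeywords = List.of(
                "哪些", "多少", "有没有", "关联", "相关", "来自", "属于",
                "负责", "参与", "供应商", "生产线", "故障次数"
        );
        // Document keywords (how, how to, steps, cause, description, explain, introduce)
        List<String> documentKeywords = List.of(
                "怎么", "如何", "步骤", "原因", "说明", "解释", "介绍"
        );
        long relCount = relationshipKeywords.stream()
                .filter(question::contains).count();
        long docCount = documentKeywords.stream()
                .filter(question::contains).count();
        // A pure route needs at least 3 hits on one side and at most 1 on the other;
        // everything else is answered through both paths
        if (relCount > 2 && docCount <= 1) return QuestionType.RELATIONSHIP_QUERY;
        if (docCount > 2 && relCount <= 1) return QuestionType.DOCUMENT_RETRIEVAL;
        return QuestionType.HYBRID;
    }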
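    private String mergeAnswers(String question,
                                GraphRagAnswer graphAnswer,
                                VectorRagAnswer vectorAnswer) {
        String prompt = String.format("""
                User question: %s
                Source 1 (structured data from the knowledge graph):
                %s
                Source 2 (retrieved document content):
                %s
                Combine the information from both sources into a complete, accurate answer.
                Prefer the exact numbers and relationships from the structured data (source 1).
                Use the document content (source 2) to add background and detail.
                """,
                question,
                graphAnswer.getAnswer(),
                vectorAnswer.getAnswer()
        );
        return routerClient.prompt()
                .user(prompt)
                .call()
                .content();
    }
}
5. Building and maintaining the knowledge graph
5.1 Building the graph automatically from unstructured documents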
// KnowledgeGraphExtractor.java
@Service
@Slf4j
@RequiredArgsConstructor
public class KnowledgeGraphExtractor {

    private final ChatClient extractionClient;
    private final KnowledgeGraphBuilder graphBuilder;

    // Extract entities and relationships from a fault repair report
    public ExtractionResult extractFromDocument(String documentContent) {
        String extractionPrompt = """
                Extract all entities and relationships from the equipment repair report below.
                Report content:
                """ + documentContent + "\n" + """
                Extraction rules:
                1. Equipment: name, model, number, production line
                2. Fault: fault type, time of occurrence, description
                3. Spare parts: part number, name, quantity
                4. Supplier: name
                5. Handling engineer: name
                Return JSON in this format:
                {
                  "equipment": {"id": "...", "name": "...", "model": "...", "productionLine": "..."},
                  "fault": {"type": "...", "description": "...", "occurredAt": "YYYY-MM-DDTHH:mm:ss"},
                  "spares": [{"partNo": "...", "name": "...", "quantity": 1, "supplier": "..."}],
                  "engineer": {"name": "..."}
                }
                """;
        String jsonResponse = extractionClient.prompt()
                .user(extractionPrompt)
                .call()
                .content();
        ExtractionResult result = parseExtractionResult(jsonResponse);
        if (result.isValid()) {
            // Write to the knowledge graph. importFaultRecord matches Equipment, Spare and
            // Engineer by id, so toFaultRecord() is assumed to resolve the extracted names
            // and part numbers to existing ids.
            graphBuilder.importFaultRecord(result.toFaultRecord());
        }
        return result;
    }
    // Process historical documents in bulk
    @Async
    public void batchExtractFromDocuments(List<String> documentPaths) {
        int success = 0, failed = 0;
        for (String path : documentPaths) {
            try {
                String content = readDocument(path);
                ExtractionResult result = extractFromDocument(content);
                if (result.isValid()) {
                    success++;
                } else {
                    failed++;
                    log.warn("Incomplete extraction result for document: {}", path);
                }
            } catch (Exception e) {
                failed++;
                log.error("Document extraction failed [{}]: {}", path, e.getMessage());
            }
        }
        log.info("Batch extraction finished: success={}, failed={}", success, failed);
    }
}
5.2 Maintaining knowledge graph quality
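One of the checks in this section looks for faults whose resolution time precedes their occurrence time. As a minimal sketch of that rule on plain Java objects (the `FaultTimes` record is a hypothetical projection of a FaultRecord node; against Neo4j the same rule would be a `WHERE f.resolvedAt < f.occurredAt` filter):

```java
import java.time.LocalDateTime;
import java.util.List;

// Hypothetical sketch of the time-anomaly rule on already-loaded records.
class TimeAnomalyCheck {
    record FaultTimes(String faultId, LocalDateTime occurredAt, LocalDateTime resolvedAt) {}

    /** Ids of faults that were "resolved" before they occurred; open faults are skipped. */
    static List<String> findAnomalies(List<FaultTimes> faults) {
        return faults.stream()
                .filter(f -> f.resolvedAt() != null && f.resolvedAt().isBefore(f.occurredAt()))
                .map(FaultTimes::faultId)
                .toList();
    }

    public static void main(String[] args) {
        LocalDateTime noon = LocalDateTime.of(2025, 9, 1, 12, 0);
        System.out.println(findAnomalies(List.of(
                new FaultTimes("F1", noon, noon.plusHours(2)),
                new FaultTimes("F2", noon, noon.minusHours(1)))));
    }
}
```

Treating a missing `resolvedAt` as "still open" rather than as an anomaly avoids flooding the alert with unresolved faults.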
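// GraphQualityChecker.java
@Service
@Slf4j
@RequiredArgsConstructor
public class GraphQualityChecker {

    private final Driver neo4jDriver;
    private final NotificationService notificationService;

    // @Scheduled belongs on the method, not the class; scheduling must also be
    // switched on with @EnableScheduling on a configuration class
    @Scheduled(cron = "0 0 2 * * *") // runs every day at 02:00
    public void runDailyQualityCheck() {
        List<QualityIssue> issues = new ArrayList<>();
        // Check 1: orphan nodes (equipment without any relationship)
        issues.addAll(findOrphanEquipments());
        // Check 2: missing data (fault records without a linked engineer)
        issues.addAll(findFaultsWithoutEngineer());
        // Check 3: time anomalies (resolved before it occurred)
        issues.addAll(findTimeAnomalies());
        if (!issues.isEmpty()) {
            log.warn("Knowledge graph quality check found {} issues", issues.size());
            notificationService.sendQualityAlert(issues);
        } else {
            log.info("Knowledge graph quality check passed");
        }
    }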
    private List<QualityIssue> findOrphanEquipments() {
        String cypher = """
                MATCH (e:Equipment)
                WHERE NOT (e)-[:HAS_FAULT]->()
                  AND NOT (e)-[:BELONGS_TO]->()
                RETURN e.id, e.name
                LIMIT 100
                """;
        // ... run the query and return the list of issues
        return new ArrayList<>();
    }
}
6. Spring AI integration
6.1 Spring AI GraphRAG configuration
// GraphRagAutoConfiguration.java
@Configuration
@EnableAspectJAutoProxy
public class GraphRagAutoConfiguration {
@Bean
public Driver neo4jDriver(
@Value("${spring.neo4j.uri}") String uri,
@Value("${spring.neo4j.authentication.username}") String username,
@Value("${spring.neo4j.authentication.password}") String password) {
return GraphDatabase.driver(uri, AuthTokens.basic(username, password),
Config.builder()
.withMaxConnectionPoolSize(50)
.withConnectionAcquisitionTimeout(30, TimeUnit.SECONDS)
.withMaxTransactionRetryTime(15, TimeUnit.SECONDS)
.build());
}
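    @Bean
    public ChatClient answerGenerationClient(OpenAiChatModel model) {
        return ChatClient.builder(model)
                .defaultSystem("""
                        You are an enterprise knowledge base assistant focused on extracting and explaining information from structured data.
                        Keep answers concise and accurate, and include concrete numbers and relationships.
                        """)
                .build();
    }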
    // HybridRagService is registered through its own @Service annotation and receives
    // GraphRagService, VectorRagService and a ChatClient bean named routerClient by
    // constructor injection, so no explicit @Bean method is declared for it here.
}
7. Frequently asked questions (FAQ)
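Q1: What if Text2Cypher accuracy is not high enough?
A: Strategies for improving Text2Cypher accuracy:
- Add few-shot examples: cover the common query patterns (aggregation / most recent N / relationship lookup)
- Schema injection: put the complete node and relationship definitions into the prompt
- Cypher validation + repair loop: when execution fails, feed the error back to the LLM and let it fix the query
- Fine-tune a dedicated model: collect successful (question → Cypher) pairs and fine-tune a small model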
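Q2: How do you keep the graph current when it is updated frequently?
A: Build a change data capture (CDC) pipeline:
- Listen to the business database's binlog (MySQL CDC)
- Send change events through Kafka to the graph update service
- Use Neo4j's MERGE operation to keep writes idempotent
- Update asynchronously so the main business flow is not affected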
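Idempotency matters here because CDC pipelines typically deliver events at least once, so the same change can arrive twice. The sketch below imitates MERGE-then-SET semantics on an in-memory map to show the property; `IdempotentUpsert` and `ChangeEvent` are hypothetical names, and a real update service would issue a Cypher `MERGE` instead.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical in-memory imitation of "MERGE (n {id: ...}) SET n += props".
class IdempotentUpsert {
    record ChangeEvent(String nodeId, Map<String, Object> properties) {}

    private final Map<String, Map<String, Object>> nodes = new HashMap<>();

    /** Creates the node if it is missing, then overwrites the given properties. */
    void apply(ChangeEvent event) {
        nodes.computeIfAbsent(event.nodeId(), id -> new HashMap<>())
             .putAll(event.properties());
    }

    int size() { return nodes.size(); }

    Map<String, Object> get(String nodeId) { return nodes.get(nodeId); }

    public static void main(String[] args) {
        IdempotentUpsert graph = new IdempotentUpsert();
        ChangeEvent event = new ChangeEvent("EQ-1", Map.<String, Object>of("name", "Air compressor"));
        graph.apply(event);
        graph.apply(event); // redelivered event: state is unchanged
        System.out.println(graph.size());
    }
}
```

Replaying an event leaves the state unchanged, which is what makes retries and consumer restarts safe.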
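Q3: How do you evaluate the query quality of the knowledge graph?
A: Build an evaluation set:
- Hand-write 100 typical questions, each paired with a reference Cypher query
- Measure Text2Cypher accuracy
- Compare the graph query results with what a human expects
- Re-run the set as a regression test after every schema change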
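One common way to score such an evaluation set is execution accuracy: a generated query counts as correct when it returns the same rows as the reference query, even if the Cypher text differs. The sketch below assumes the generator and the executor are passed in as plain functions; `Text2CypherEval` and `EvalCase` are hypothetical names.

```java
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.function.Function;

// Hypothetical sketch: execution accuracy over (question, reference Cypher) pairs.
class Text2CypherEval {
    record EvalCase(String question, String goldCypher) {}

    static double executionAccuracy(List<EvalCase> cases,
                                    Function<String, String> generate,
                                    Function<String, List<Map<String, Object>>> execute) {
        if (cases.isEmpty()) return 0.0;
        int correct = 0;
        for (EvalCase c : cases) {
            try {
                // Rows are compared as sets, so row order and duplicates are ignored
                Set<Map<String, Object>> expected = new HashSet<>(execute.apply(c.goldCypher()));
                Set<Map<String, Object>> actual =
                        new HashSet<>(execute.apply(generate.apply(c.question())));
                if (expected.equals(actual)) correct++;
            } catch (RuntimeException e) {
                // A query that fails to generate or execute counts as a miss
            }
        }
        return (double) correct / cases.size();
    }

    public static void main(String[] args) {
        List<EvalCase> cases = List.of(new EvalCase("demo", "Q"));
        System.out.println(executionAccuracy(cases, q -> "Q",
                c -> List.of(Map.<String, Object>of("n", 1))));
    }
}
```

For questions where ordering matters (for example "the most recent N"), comparing rows as ordered lists would be the stricter choice.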
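Q4: Will Cypher queries become slow when the graph is very large?
A: Key points for performance tuning:
- Create indexes on frequently queried properties (CREATE INDEX)
- Use a full-text index for complex text search (FULLTEXT INDEX)
- Filter on relationship properties (WHERE r.year = 2025) to reduce traversal
- Use EXPLAIN/PROFILE to analyze the query plan
- For very large graphs (more than 100 million nodes), consider read/write separation in Neo4j Enterprise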
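Q5: How can a small company without dedicated knowledge engineers maintain the graph?
A:
- AI-assisted modeling: let GPT-4o generate an initial schema from a description of the business
- Automatic extraction: build the first version of the graph from existing documents
- Incremental refinement: start with the core domain graph (about 20% of the entities and relationships) and expand step by step
- Crowdsourced validation: let domain experts (the engineers) correct errors through a friendly interface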
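8. Summary
Fusing knowledge graphs with LLMs is an important direction in the evolution of AI question-answering systems:
| Architecture | Best for | Limitation |
|---|---|---|
| RAG only | Document retrieval, concept explanation | Cannot handle relational reasoning |
| KG only | Exact relationship queries | Cannot handle unstructured content |
| KG + LLM (GraphRAG) | Relationships + explanation | High graph-building cost |
| KG + Vector (Hybrid) | Full coverage | Complex architecture |
The experience of Zheng Ying's team shows that not every question suits RAG: when the business involves complex entity relationships, a knowledge graph is RAG's best partner, not its competitor.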
