第1686篇：AI应用的OWASP Top 10——2024年大模型应用十大安全风险解析

老张2026/4/30大约 12 分钟

第1686篇：AI应用的OWASP Top 10——2024年大模型应用十大安全风险解析

OWASP（开放式网络应用程序安全项目）是安全领域最重要的非营利组织之一，他们发布的"Top 10"风险清单一直是 Web 安全的权威参考。2023年他们发布了专门针对大模型应用的 OWASP LLM Top 10，这份清单在2024年进行了更新。

今天我们就来认真解读这十个风险，不光说"是什么"，更要说"怎么防"。

一、清单总览

OWASP LLM Top 10 2025（对应2024年发布的更新版）十大风险：

排名	风险名称	中文描述
LLM01	Prompt Injection	提示词注入
LLM02	Sensitive Information Disclosure	敏感信息泄露
LLM03	Supply Chain Vulnerabilities	供应链漏洞
LLM04	Data and Model Poisoning	数据与模型投毒
LLM05	Improper Output Handling	不当的输出处理
LLM06	Excessive Agency	过度自主行为
LLM07	System Prompt Leakage	系统提示词泄露
LLM08	Vector and Embedding Weaknesses	向量与嵌入漏洞
LLM09	Misinformation	错误信息
LLM10	Unbounded Consumption	无限制资源消耗

我们逐个展开分析。

二、LLM01 — 提示词注入（Prompt Injection）

风险等级：严重

这是排名第一的风险，覆盖面最广，影响最深。本系列前几篇已经详细讲过直接注入和间接注入，这里补充一些通常被忽略的变体。

多模态注入：对于支持图片、文件输入的 AI 系统，攻击者可以把注入指令藏在图片的元数据（EXIF）里，或者在白色背景上用白色文字写指令，肉眼不可见但 OCR 会读到。

// 处理用户上传图片时的安全清洗
@Component
public class ImageInputSanitizer {
    
    public byte[] sanitize(byte[] imageBytes, String filename) {
        try {
            BufferedImage image = ImageIO.read(new ByteArrayInputStream(imageBytes));
            
            // 1. 清除所有 EXIF 元数据（可能包含注入指令）
            ByteArrayOutputStream cleanOutput = new ByteArrayOutputStream();
            
            // 重新编码图片，会清除元数据
            String format = getImageFormat(filename);
            ImageIO.write(image, format, cleanOutput);
            
            // 2. 如果要做 OCR，对识别出的文本再做注入检测
            // (OCR 调用省略)
            
            return cleanOutput.toByteArray();
            
        } catch (IOException e) {
            throw new SecurityException("图片处理失败: " + e.getMessage());
        }
    }
}

防护要点：

区分指令上下文和数据上下文，永远不要让用户数据直接充当指令
对所有外部输入（包括文件、图片、外部数据库内容）做净化
用 XML 标签、特殊分隔符等方式明确标注数据边界

三、LLM02 — 敏感信息泄露（Sensitive Information Disclosure）

风险等级：高

模型可能泄露训练数据中的敏感信息，包括个人身份信息（PII）、医疗数据、财务数据，甚至一些公司内部信息（如果你用了公司数据微调）。

这个风险有两个来源：模型记忆了训练数据里的敏感信息；或者应用直接把敏感数据放进了上下文，结果被输出出去。

// PII 检测和脱敏
@Component
public class PIIDetectorAndRedactor {
    
    private static final Map<String, Pattern> PII_PATTERNS = Map.of(
        "PHONE", Pattern.compile("(?:(?:\\+|00)86)?1[3-9]\\d{9}"),
        "ID_CARD", Pattern.compile("[1-9]\\d{5}(?:18|19|20)\\d{2}(?:0[1-9]|1[0-2])(?:0[1-9]|[1-2]\\d|3[0-1])\\d{3}[0-9Xx]"),
        "EMAIL", Pattern.compile("[a-zA-Z0-9._%+\\-]+@[a-zA-Z0-9.\\-]+\\.[a-zA-Z]{2,}"),
        "BANK_CARD", Pattern.compile("\\b(?:4[0-9]{12}(?:[0-9]{3})?|5[1-5][0-9]{14}|[0-9]{16,19})\\b"),
        "IP_ADDRESS", Pattern.compile("\\b(?:(?:25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\\.){3}(?:25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\\b")
    );
    
    // 在数据进入LLM上下文之前脱敏
    public RedactedContent redactForLLM(String content) {
        List<RedactionRecord> records = new ArrayList<>();
        String redacted = content;
        
        for (Map.Entry<String, Pattern> entry : PII_PATTERNS.entrySet()) {
            Matcher matcher = entry.getValue().matcher(redacted);
            StringBuffer sb = new StringBuffer();
            while (matcher.find()) {
                String original = matcher.group();
                String placeholder = "[" + entry.getKey() + "_" + records.size() + "]";
                records.add(new RedactionRecord(placeholder, original, entry.getKey()));
                matcher.appendReplacement(sb, placeholder);
            }
            matcher.appendTail(sb);
            redacted = sb.toString();
        }
        
        return new RedactedContent(redacted, records);
    }
    
    // 如果需要，在LLM输出之后还原（谨慎使用，通常不需要还原）
    public String restore(String redactedOutput, List<RedactionRecord> records) {
        String restored = redactedOutput;
        for (RedactionRecord record : records) {
            restored = restored.replace(record.getPlaceholder(), record.getOriginal());
        }
        return restored;
    }
    
    // 检测 LLM 输出中是否意外包含了 PII
    public List<PIIFinding> detectInOutput(String output) {
        List<PIIFinding> findings = new ArrayList<>();
        for (Map.Entry<String, Pattern> entry : PII_PATTERNS.entrySet()) {
            Matcher matcher = entry.getValue().matcher(output);
            while (matcher.find()) {
                findings.add(new PIIFinding(entry.getKey(), matcher.group(), matcher.start()));
            }
        }
        return findings;
    }
}

四、LLM03 — 供应链漏洞（Supply Chain Vulnerabilities）

风险等级：高

AI 应用的供应链比传统软件更复杂，漏洞面更广。

模型来源风险：从 HuggingFace、ModelScope 等平台下载的模型可能被篡改，包含后门。

依赖库风险：AI 框架（LangChain、LlamaIndex 等）的漏洞会直接影响上层应用。LangChain 历史上有过不少 RCE 漏洞。

数据集风险：使用的公开数据集可能包含投毒数据。

插件/工具风险：AI 应用里配置的外部工具，如果有安全漏洞，攻击者可以通过 AI 触发这些漏洞。

// 模型文件完整性校验
@Service
public class ModelIntegrityVerifier {
    
    // 预期的模型文件哈希（来自官方渠道，人工记录）
    private final Map<String, String> trustedModelHashes = Map.of(
        "models/llama-3.1-8b", "sha256:e3b0c44298fc1c149afbf4c8996fb92427ae41e4649b934ca495991b7852b855",
        "models/embedding-large", "sha256:abc123..."
    );
    
    public VerificationResult verify(String modelPath) {
        try {
            String actualHash = computeHash(modelPath);
            String expectedHash = trustedModelHashes.get(modelPath);
            
            if (expectedHash == null) {
                return VerificationResult.unknownModel(modelPath);
            }
            
            if (!actualHash.equals(expectedHash)) {
                log.error("模型文件哈希不匹配！模型: {}, 期望: {}, 实际: {}", 
                    modelPath, expectedHash, actualHash);
                return VerificationResult.tampered(modelPath, expectedHash, actualHash);
            }
            
            return VerificationResult.verified(modelPath);
            
        } catch (IOException e) {
            return VerificationResult.error(modelPath, e.getMessage());
        }
    }
    
    private String computeHash(String filePath) throws IOException {
        try (InputStream is = new DigestInputStream(
                new FileInputStream(filePath), 
                MessageDigest.getInstance("SHA-256"))) {
            byte[] buffer = new byte[8192];
            while (is.read(buffer) != -1) {}
            return "sha256:" + Hex.encodeHexString(
                ((DigestInputStream) is).getMessageDigest().digest());
        } catch (NoSuchAlgorithmException e) {
            throw new RuntimeException(e);
        }
    }
}

防护要点：

锁定依赖版本，定期用 Snyk/Dependabot 扫描依赖漏洞
从官方渠道下载模型，校验哈希值
对 AI 框架（LangChain 等）保持警惕，定期更新，关注安全公告
工具调用做最小权限控制

五、LLM05 — 不当的输出处理（Improper Output Handling）

风险等级：高

模型的输出如果不经过处理就直接使用，会引入很多安全问题。这是传统 Web 安全问题在 AI 场景的延伸。

XSS via LLM：如果把模型输出直接渲染成 HTML，模型可能被诱导输出 <script>alert(1)</script>。

SQL 注入 via LLM：如果把模型输出用于拼接 SQL，可能导致 SQL 注入。

代码执行 via LLM：AI Agent 场景里，如果模型生成的代码被直接执行，注入的恶意代码会被运行。

// 输出后处理：根据使用场景做不同的转义
@Component
public class LLMOutputProcessor {
    
    // 用于 HTML 渲染场景
    public String processForHTML(String rawOutput) {
        // 不要直接渲染，先转义
        String escaped = HtmlUtils.htmlEscape(rawOutput);
        
        // 如果需要支持 Markdown，用白名单 HTML 库
        // 而不是允许任意 HTML
        return markdownSafeRenderer.render(escaped);
    }
    
    // 用于 SQL 场景（更推荐用参数化查询，不要让LLM输出拼SQL）
    public String processForSQL(String rawOutput) {
        // 这里只做最后一道防线的检测
        if (containsSQLKeywords(rawOutput)) {
            log.warn("LLM输出包含SQL关键词，需要审查: {}", rawOutput);
            throw new SecurityException("LLM输出包含不安全内容，拒绝执行");
        }
        return rawOutput;
    }
    
    // 用于代码执行场景（AI Agent）
    public CodeExecutionResult processForExecution(String generatedCode, ExecutionContext context) {
        // 1. 静态分析：检测危险 API 调用
        List<String> dangerousCalls = detectDangerousCalls(generatedCode);
        if (!dangerousCalls.isEmpty()) {
            return CodeExecutionResult.blocked("包含危险API调用: " + dangerousCalls);
        }
        
        // 2. 在沙箱中执行，不是直接执行
        return sandboxExecutor.execute(generatedCode, context);
    }
    
    private List<String> detectDangerousCalls(String code) {
        List<String> dangerous = new ArrayList<>();
        String[] dangerousPatterns = {
            "Runtime.exec", "ProcessBuilder", "System.exit",
            "File.delete", "Files.delete", "new File(",
            "ClassLoader", "Reflection",
            "os.system", "subprocess", "eval(", "exec(",
            "__import__", "open(", "os.path"
        };
        
        for (String pattern : dangerousPatterns) {
            if (code.contains(pattern)) {
                dangerous.add(pattern);
            }
        }
        return dangerous;
    }
}

六、LLM06 — 过度自主行为（Excessive Agency）

风险等级：严重

这个风险在 AI Agent 大行其道的今天变得越来越重要。当 AI 被赋予了太多"自主行动"的能力——比如可以发邮件、操作文件、调用 API、执行代码——一旦被注入或出现判断失误，后果可能是灾难性的。

我见过一个案例：一个 AI 助手被配置了"可以根据用户要求发送邮件"的工具，结果用户写了一个刁钻的 prompt，让 AI 把整个联系人列表都发了一封测试邮件。

防护原则：最小权限（Least Privilege）+ 人工确认（Human in the Loop）。

@Service
public class AgentActionGovernor {
    
    // 操作风险等级定义
    public enum ActionRiskLevel {
        LOW,     // 只读操作，无外部影响
        MEDIUM,  // 有副作用，但可撤销
        HIGH,    // 有显著外部影响，部分不可撤销
        CRITICAL // 不可撤销的重大操作
    }
    
    private static final Map<String, ActionRiskLevel> ACTION_RISK_MAP = Map.of(
        "search_knowledge_base", ActionRiskLevel.LOW,
        "read_file", ActionRiskLevel.LOW,
        "create_draft_email", ActionRiskLevel.MEDIUM,
        "send_email", ActionRiskLevel.HIGH,
        "delete_file", ActionRiskLevel.HIGH,
        "make_payment", ActionRiskLevel.CRITICAL,
        "modify_database", ActionRiskLevel.CRITICAL
    );
    
    public ActionApprovalResult requestApproval(
            String actionName, 
            Map<String, Object> actionParams,
            UserContext userContext) {
        
        ActionRiskLevel riskLevel = ACTION_RISK_MAP.getOrDefault(actionName, ActionRiskLevel.HIGH);
        
        switch (riskLevel) {
            case LOW:
                // 低风险操作直接执行
                return ActionApprovalResult.autoApproved();
                
            case MEDIUM:
                // 中等风险：记录日志，但不需要用户确认
                auditLog.record(userContext.getUserId(), actionName, actionParams);
                return ActionApprovalResult.autoApproved();
                
            case HIGH:
                // 高风险：需要用户显式确认
                String confirmationId = pendingActionStore.store(actionName, actionParams);
                return ActionApprovalResult.requiresConfirmation(
                    confirmationId,
                    buildConfirmationMessage(actionName, actionParams)
                );
                
            case CRITICAL:
                // 关键操作：需要用户二次认证 + 记录
                String criticalId = pendingActionStore.store(actionName, actionParams);
                return ActionApprovalResult.requiresMFA(
                    criticalId,
                    buildConfirmationMessage(actionName, actionParams)
                );
                
            default:
                return ActionApprovalResult.denied("未知风险等级");
        }
    }
    
    private String buildConfirmationMessage(String actionName, Map<String, Object> params) {
        return String.format(
            "AI 助手请求执行操作：\n操作：%s\n参数：%s\n\n请确认是否允许此操作？",
            actionName, 
            formatParams(params)
        );
    }
    
    // 执行前的最后防线：范围限制
    public boolean isWithinAllowedScope(String actionName, Map<String, Object> params, UserContext ctx) {
        // 例如：发邮件时，只允许发给白名单中的收件人
        if ("send_email".equals(actionName)) {
            String recipient = (String) params.get("to");
            if (!emailWhitelistService.isAllowed(recipient, ctx.getUserId())) {
                log.warn("用户 {} 尝试发送邮件到未授权地址: {}", ctx.getUserId(), recipient);
                return false;
            }
        }
        
        // 例如：文件操作只允许在特定目录内
        if (actionName.startsWith("file_")) {
            String path = (String) params.get("path");
            if (!isWithinAllowedDirectory(path, ctx.getUserId())) {
                log.warn("用户 {} 尝试操作未授权路径: {}", ctx.getUserId(), path);
                return false;
            }
        }
        
        return true;
    }
}

七、LLM08 — 向量与嵌入漏洞（Vector and Embedding Weaknesses）

风险等级：中

RAG 系统中的向量库是一个容易被忽视的攻击面。

向量数据库越权访问：不同用户的数据如果没有做好隔离，一个用户可能检索到另一个用户的数据。

对抗性嵌入：攻击者可以精心构造输入，使其向量表示与某个特定文档非常接近，从而影响检索结果。

嵌入模型的预测：如果攻击者知道你用的嵌入模型，他们可以提前计算好触发特定检索结果所需的输入。

// 向量库多租户隔离
@Service
public class TenantAwareVectorStore {
    
    @Autowired
    private QdrantClient qdrantClient;
    
    // 每个查询都强制附加租户过滤条件
    public List<DocumentChunk> search(
            String tenantId,
            float[] queryVector,
            int topK) {
        
        // 强制过滤：只检索当前租户的数据
        Filter tenantFilter = Filter.newBuilder()
            .addMust(Condition.newBuilder()
                .setField(FieldCondition.newBuilder()
                    .setKey("tenant_id")
                    .setMatch(Match.newBuilder()
                        .setKeyword(tenantId)
                        .build())
                    .build())
                .build())
            .build();
        
        SearchPoints searchRequest = SearchPoints.newBuilder()
            .setCollectionName("knowledge_base")
            .addAllVector(Floats.asList(queryVector))
            .setLimit(topK)
            .setFilter(tenantFilter)  // 这个过滤条件不能由用户控制
            .setWithPayload(WithPayloadSelector.newBuilder().setEnable(true).build())
            .build();
        
        try {
            List<ScoredPoint> results = qdrantClient.searchAsync(searchRequest).get();
            return results.stream()
                .map(this::toDocumentChunk)
                .collect(Collectors.toList());
        } catch (Exception e) {
            throw new VectorSearchException("向量检索失败", e);
        }
    }
    
    // 写入时同样强制打租户标签
    public void upsert(String tenantId, String docId, float[] vector, Map<String, Object> metadata) {
        // 强制在 payload 中加入 tenant_id，不信任调用方传入的值
        Map<String, Object> secureMetadata = new HashMap<>(metadata);
        secureMetadata.put("tenant_id", tenantId);  // 覆盖任何可能的恶意 tenant_id
        secureMetadata.put("doc_id", docId);
        secureMetadata.put("ingested_at", Instant.now().toString());
        
        // 数据写入逻辑...
    }
}

八、LLM09 — 错误信息（Misinformation）

风险等级：中

大模型会一本正经地"胡说八道"（幻觉问题）。在医疗、法律、金融等高风险领域，错误信息可能造成真实伤害。

// 高风险领域的输出验证
@Component
public class HighRiskOutputValidator {
    
    // 需要额外验证的高风险领域关键词
    private static final Map<String, List<String>> HIGH_RISK_DOMAIN_KEYWORDS = Map.of(
        "medical", Arrays.asList("诊断", "治疗", "药物", "剂量", "手术", "症状"),
        "legal", Arrays.asList("法律责任", "违法", "合同", "诉讼", "刑事"),
        "financial", Arrays.asList("投资", "收益", "风险", "理财", "股票", "基金")
    );
    
    public ValidationResult validate(String output, String domain) {
        List<String> keywords = HIGH_RISK_DOMAIN_KEYWORDS.getOrDefault(domain, Collections.emptyList());
        
        boolean containsHighRiskContent = keywords.stream().anyMatch(output::contains);
        
        if (containsHighRiskContent) {
            // 检查是否包含必要的免责声明
            boolean hasDisclaimer = output.contains("仅供参考") || 
                                    output.contains("建议咨询专业") ||
                                    output.contains("请遵医嘱") ||
                                    output.contains("不构成");
            
            if (!hasDisclaimer) {
                return ValidationResult.needsDisclaimer(domain);
            }
            
            // 检查是否有过于确定性的表述
            if (hasTooDefinitiveStatements(output)) {
                return ValidationResult.tooDefinitive(
                    "AI输出包含过于确定性的高风险内容，建议增加不确定性表述");
            }
        }
        
        return ValidationResult.valid();
    }
    
    private boolean hasTooDefinitiveStatements(String output) {
        String[] definitivePatterns = {
            "一定是", "肯定是", "确定是", "必然是", "绝对是",
            "you must", "you should definitely", "it is certain"
        };
        return Arrays.stream(definitivePatterns).anyMatch(output::contains);
    }
}

九、LLM10 — 无限制资源消耗（Unbounded Consumption）

风险等级：中

没有做速率限制和成本控制的 AI 应用，很容易被滥用或被 DoS 攻击打垮。

@Component
public class ResourceConsumptionGuard {
    
    @Autowired
    private RedisRateLimiter rateLimiter;
    
    public ConsumptionCheckResult check(String userId, ChatRequest request) {
        // 1. 用户级别速率限制
        RateLimitResult userRateLimit = rateLimiter.checkLimit(
            "user:" + userId, 
            10,          // 每分钟最多10次
            Duration.ofMinutes(1)
        );
        
        if (!userRateLimit.isAllowed()) {
            return ConsumptionCheckResult.rateLimited(
                "请求过于频繁，请" + userRateLimit.getRetryAfterSeconds() + "秒后重试");
        }
        
        // 2. 输入 token 限制（防止超长输入消耗大量 token）
        int estimatedInputTokens = estimateTokens(request.getMessage());
        if (estimatedInputTokens > 4096) {
            return ConsumptionCheckResult.inputTooLong(
                "输入内容过长，请精简后重试（当前约" + estimatedInputTokens + " tokens）");
        }
        
        // 3. 并发请求限制
        if (!concurrencyLimiter.tryAcquire(userId)) {
            return ConsumptionCheckResult.tooManyConcurrent("您有请求正在处理中，请稍候");
        }
        
        // 4. 上下文窗口限制（历史消息不能无限堆积）
        List<Message> history = request.getHistory();
        if (history != null && history.size() > 50) {
            return ConsumptionCheckResult.contextTooLong("对话历史过长，请开始新对话");
        }
        
        return ConsumptionCheckResult.allowed();
    }
    
    private int estimateTokens(String text) {
        // 粗略估算：中文1字约1.5 token，英文4字约1 token
        // 生产环境可以用 tiktoken 精确计算
        return (int) (text.length() * 1.5);
    }
}

十、一个综合性防御框架

把以上十个风险对应的防护措施组合起来，形成一个统一的防御框架：

很多团队看完 OWASP Top 10 觉得要做的事太多，不知道从哪里下手。我的建议是按照风险等级和你的业务场景来排优先级：

优先做：LLM01（注入防御）、LLM06（过度自主行为）、LLM10（资源消耗限制）。这三个既是高风险，也是工程上相对容易落地的。

其次做：LLM02（敏感信息保护）、LLM05（输出处理）、LLM07（提示词保护）。

持续跟进：LLM03（供应链）、LLM04（数据投毒）、LLM08（向量漏洞）、LLM09（错误信息）。这些往往需要跨团队协作，是更长期的工程建设。

安全是一个永远不会"做完"的事，但至少要确保基础防护到位，不给攻击者留明显的漏洞。