Post 2243: AI for Financial Risk Control: LLM-Based Credit Assessment and Anomaly Detection

Lao Zhang | 2026/4/30 | About 7 minutes

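Intended readers: fintech engineers, risk-control system developers, Java backend engineers | Reading time: about 17 minutes | Core value: lays out the core technical challenges of AI in financial risk control and the engineering path from traditional rule engines to LLM-enhanced risk control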

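I spent a year building risk-control systems at a consumer finance company, and that experience completely changed how I understand "risk".

One day the risk team found that a batch of users' credit applications had passed the initial review, but a senior risk manager felt something was off. The basic information and income proofs of these users all looked great, yet certain behavior patterns did not: the applications clustered between 2 and 4 p.m. on weekdays, the phone models were highly concentrated, and the IP addresses were distributed with suspicious regularity.

This was classic ring fraud. A broker had collected a batch of "high-quality" documents in bulk, filed loan applications on people's behalf, and took a commission. A traditional rule engine could not see it, because every applicant's individual indicators satisfied the rules.

That incident made me realize: the core value of AI in financial risk control is not to make the rules more numerous and more fine-grained, but to find anomalous patterns that rules cannot express.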
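The batch-level signals in this story (submission hours, phone models, and IP addresses all clustering) can be made concrete with a simple concentration measure. The following is a minimal sketch, not the logic of the system described here: `BatchConcentration`, `topShare`, and the threshold passed to `isSuspicious` are hypothetical names and values chosen for illustration.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical helper: measures how concentrated a batch of applications is on one value. */
final class BatchConcentration {

    /**
     * Share of the batch taken by the single most common value,
     * e.g. the phone model or the hour of submission.
     * Returns 0.0 for an empty batch.
     */
    static <T> double topShare(List<T> values) {
        if (values.isEmpty()) {
            return 0.0;
        }
        Map<T, Integer> counts = new HashMap<>();
        int max = 0;
        for (T value : values) {
            int count = counts.merge(value, 1, Integer::sum);
            max = Math.max(max, count);
        }
        return (double) max / values.size();
    }

    /** Flags a batch when one value's share exceeds the (assumed) threshold. */
    static <T> boolean isSuspicious(List<T> values, double thresholdShare) {
        return topShare(values) > thresholdShare;
    }
}
```

If 4 of 5 applications in a batch come from the same phone model, `topShare` returns 0.8, which a threshold of 0.6 would flag. No single application looks wrong on its own; only the batch view exposes the pattern. A real threshold would have to be calibrated against normal traffic.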

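Credit Risk Control System Architecture

Feature Engineering: The Core Moat of Risk Control

The core competitive edge in risk control is not the model algorithm but feature engineering. Good features often come from business insight: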

import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.stereotype.Service;

@Service
public class CreditFeatureEngineering {

    @Autowired
    private BankStatementAnalyzer bankStatementAnalyzer;

    @Autowired
    private SocialGraphAnalyzer socialGraphAnalyzer;

    @Autowired
    private DeviceFingerprintService deviceFingerprint;

    /**
     * Builds the complete feature vector for an applicant.
     */
    public CreditFeatureVector buildFeatures(LoanApplication app) {
        CreditFeatureVector.Builder builder = CreditFeatureVector.newBuilder();

        // 1. Basic statistical features
        builder.addAll(buildBasicFeatures(app));

        // 2. Bank statement features (the key ones)
        if (app.hasBankStatement()) {
            builder.addAll(bankStatementAnalyzer.analyze(app.getBankStatement()));
        }

        // 3. Behavior features (anomaly detection on the application behavior itself)
        builder.addAll(buildBehaviorFeatures(app));

        // 4. Social graph features (risk from related parties)
        builder.addAll(socialGraphAnalyzer.analyze(app.getUserId()));

        // 5. Device risk features
        builder.addAll(deviceFingerprint.analyze(app.getDeviceInfo()));

        return builder.build();
    }

    // Bank statement analysis (income stability, debt load, and so on)
    // is implemented in BankStatementAnalyzer.
}

import java.util.ArrayList;
import java.util.List;

import org.springframework.stereotype.Service;

@Service
public class BankStatementAnalyzer {

    public List<FeatureValue> analyze(BankStatement statement) {
        List<Transaction> txns = statement.getTransactions();
        List<FeatureValue> features = new ArrayList<>();

        // Average monthly income
        double monthlyIncome = calculateMonthlyIncome(txns);
        features.add(new FeatureValue("monthly_income", monthlyIncome));

        // Income stability (coefficient of variation)
        double incomeCV = calculateIncomeCV(txns);
        features.add(new FeatureValue("income_cv", incomeCV));

        // Average monthly expense
        double monthlyExpense = calculateMonthlyExpense(txns);
        features.add(new FeatureValue("monthly_expense", monthlyExpense));

        // Debt service ratio (DSR): monthly repayment amount / monthly income
        double dsr = calculateDSR(txns, monthlyIncome);
        features.add(new FeatureValue("debt_service_ratio", dsr));

        // Share of gambling/entertainment spending (high-risk behavior indicator)
        double gamblingRatio = calculateGamblingExpenseRatio(txns);
        features.add(new FeatureValue("gambling_expense_ratio", gamblingRatio));

        // Frequent small withdrawals (cash-out risk)
        int frequentSmallWithdrawals = countFrequentSmallWithdrawals(txns);
        features.add(new FeatureValue("frequent_small_withdrawals",
            (double) frequentSmallWithdrawals));

        // Salary regularity (fixed employer vs. scattered income)
        double salaryRegularity = calculateSalaryRegularity(txns);
        features.add(new FeatureValue("salary_regularity", salaryRegularity));

        // Minimum balance (how tight the applicant's funds get)
        double minBalance = txns.stream()
            .mapToDouble(Transaction::getBalance).min().orElse(0);
        features.add(new FeatureValue("min_balance_3m", minBalance));

        return features;
    }

    private double calculateDSR(List<Transaction> txns, double monthlyIncome) {
        // Identify repayment transactions by keyword matching on the description
        double totalRepayment = txns.stream()
            .filter(t -> t.getType() == TransactionType.DEBIT)
            .filter(t -> isRepaymentTransaction(t.getDescription()))
            .mapToDouble(Transaction::getAmount)
            .sum();

        // Monthly repayment = total repayment amount / months covered by the statement.
        // countMonthsCovered is an assumed helper, elided like the other calculate* methods.
        int months = Math.max(1, countMonthsCovered(txns));
        double monthlyRepayment = totalRepayment / months;

        // With no usable income, fall back to 1.0 as the most conservative ratio
        return monthlyIncome > 0 ? monthlyRepayment / monthlyIncome : 1.0;
    }

    private boolean isRepaymentTransaction(String description) {
        if (description == null) {
            return false;
        }
        String lower = description.toLowerCase();
        // Chinese keywords match raw statement text:
        // 还款 = repayment, 贷款 = loan, 分期 = installment, 信用卡 = credit card
        return lower.contains("还款") || lower.contains("贷款") ||
               lower.contains("分期") || lower.contains("信用卡") ||
               lower.contains("repayment") || lower.contains("loan");
    }
}
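The `income_cv` feature depends on `calculateIncomeCV`, whose body is not shown. As a rough sketch of what a coefficient of variation could look like, assume the transactions have already been aggregated into monthly income totals. The aggregation step, the class name `IncomeStability`, and the 1.0 fallback (borrowed from the DSR fallback above) are assumptions, not details of the original system.

```java
import java.util.List;

/** Hypothetical sketch of the statistic behind the income_cv feature. */
final class IncomeStability {

    /**
     * Coefficient of variation = population standard deviation / mean.
     * A stable salary gives a value near 0; irregular income gives larger values.
     * Returns 1.0 (an assumed conservative fallback) when there is no data
     * or the mean income is not positive.
     */
    static double coefficientOfVariation(List<Double> monthlyIncomes) {
        if (monthlyIncomes.isEmpty()) {
            return 1.0;
        }
        double sum = 0.0;
        for (double income : monthlyIncomes) {
            sum += income;
        }
        double mean = sum / monthlyIncomes.size();
        if (mean <= 0) {
            return 1.0;
        }
        double squaredDeviations = 0.0;
        for (double income : monthlyIncomes) {
            squaredDeviations += (income - mean) * (income - mean);
        }
        double stdDev = Math.sqrt(squaredDeviations / monthlyIncomes.size());
        return stdDev / mean;
    }
}
```

With monthly incomes of 6000 and 10000 the mean is 8000 and the standard deviation is 2000, so the CV is 0.25; three identical months of 8000 give 0. Because the CV is scale-free, one threshold can be compared across applicants with very different income levels.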

LLM Enhancement: Capturing Patterns the Rules Do Not Cover

For high-amount or suspected-fraud applications, use an LLM for deeper analysis:

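import java.util.ArrayList;
import java.util.List;

import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.stereotype.Service;

@Service
public class LLMEnhancedRiskAnalyzer {

    @Autowired
    private LLMClient llmClient;

    @Autowired
    private FeatureInterpreter featureInterpreter;

    /**
     * Deep risk analysis with an LLM.
     * Intended for applications that the rule engine did not reject
     * but whose score falls in the borderline band.
     */
    public LLMRiskAnalysis analyze(LoanApplication app,
                                    CreditFeatureVector features,
                                    double modelScore) {
        // Turn the feature vector into a natural-language description
        String featureDescription = featureInterpreter.interpret(features);

        // Extract the key anomaly signals
        List<String> anomalies = detectAnomalies(app, features);

        String prompt = String.format("""
            You are a senior credit risk analyst. Please analyze the risk of the following loan application.

            Basic application information:
            - Requested amount: %s yuan
            - Term: %d months
            - Purpose: %s

            Key applicant features:
            %s

            Preliminary model score: %s (out of 1000; 600 or above passes)

            Anomaly signals (if any):
            %s

            Analyze the risk along these dimensions:
            1. Repayment capacity (income, debt, cash flow)
            2. Willingness to repay (historical behavior, application behavior)
            3. Fraud risk (information consistency, behavior patterns)
            4. Overall recommendation (approve / reject / request more documents / lower the limit)

            Note: give concrete evidence for each point; do not give empty conclusions.
            """,
            app.getRequestAmount(),
            app.getTermMonths(),
            app.getPurpose(),
            featureDescription,
            modelScore,
            anomalies.isEmpty() ? "No obvious anomaly signals" : String.join("\n", anomalies)
        );

        LLMResponse response = llmClient.complete(
            "You are a professional credit risk analyst with more than 10 years of risk assessment experience.",
            prompt,
            LLMConfig.builder()
                .model("deepseek-v3")
                .temperature(0.2)
                .maxTokens(800)
                .build()
        );

        return parseLLMRiskAnalysis(response.getContent());
    }

    private List<String> detectAnomalies(LoanApplication app, CreditFeatureVector features) {
        List<String> anomalies = new ArrayList<>();

        // Abnormal application behavior: form filled in under one minute
        if (app.getApplicationDuration() < 60) {
            anomalies.add("Application form completed unusually fast (" +
                app.getApplicationDuration() + " seconds), possible batch submission");
        }

        // Income barely covers, or falls short of, spending
        double incomeRatio = features.getMonthlyIncome() /
            Math.max(1, features.getMonthlyExpense());
        if (incomeRatio < 1.1) {
            anomalies.add(String.format(
                "Income-to-expense ratio is only %.2f: income barely covers or falls short of spending",
                incomeRatio));
        }

        // Heavy gambling spending
        if (features.getGamblingExpenseRatio() > 0.1) {
            anomalies.add(String.format(
                "Gambling/entertainment spending was %.0f%% of expenses over the last 3 months, above the alert threshold",
                features.getGamblingExpenseRatio() * 100));
        }

        // Multi-lender borrowing signal
        if (features.getActiveLoanCount() > 3) {
            anomalies.add("Currently has " + features.getActiveLoanCount() +
                " active loans, indicating multi-lender borrowing risk");
        }

        return anomalies;
    }
}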
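The text and the Javadoc above describe which applications should reach the LLM: those the rule engine did not reject, with a borderline score, a high amount, or fraud suspicion. That routing step is not shown, so here is a hedged sketch of one way to gate it. Only the 600 pass line comes from the prompt above; the class name `LlmEscalationPolicy`, the 50-point band, and the 100,000 yuan amount threshold are invented for illustration.

```java
/** Hypothetical gate deciding whether an application is worth an LLM call. */
final class LlmEscalationPolicy {

    // Pass line taken from the prompt text ("600 or above passes")
    static final double PASS_SCORE = 600.0;
    // Assumed: scores within 50 points of the pass line count as borderline
    static final double BORDERLINE_BAND = 50.0;
    // Assumed: requests at or above this amount (yuan) always get a deeper look
    static final double HIGH_AMOUNT = 100_000.0;

    static boolean shouldEscalate(boolean rejectedByRules,
                                  double modelScore,
                                  double requestAmount,
                                  int anomalyCount) {
        // Hard rule rejections are final; do not spend an LLM call on them
        if (rejectedByRules) {
            return false;
        }
        boolean borderline = Math.abs(modelScore - PASS_SCORE) <= BORDERLINE_BAND;
        boolean highAmount = requestAmount >= HIGH_AMOUNT;
        boolean suspicious = anomalyCount > 0;
        return borderline || highAmount || suspicious;
    }
}
```

Gating like this keeps LLM latency and cost confined to the minority of cases where the score alone is least informative, while clear approvals and clear rejections stay on the fast path.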

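Ring Fraud Detection: A Graph Neural Network Approach

Anomaly detection on individual applications is not enough; the relationships between applicants also need to be identified. The excerpt below shows the graph-query and device-collision part; the injected FraudGNNClient is not called in it: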

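import java.util.ArrayList;
import java.util.List;

import org.springframework.beans.factory.annotation.Autowired;
// Assumes Spring Data Neo4j, whose Neo4jClient matches the fluent query API used below
import org.springframework.data.neo4j.core.Neo4jClient;
import org.springframework.stereotype.Service;

@Service
public class GroupFraudDetector {

    @Autowired
    private Neo4jClient neo4jClient;

    @Autowired
    private FraudGNNClient gnnClient;

    /**
     * Association graph analysis: finds links between the applicant
     * and known fraudulent users.
     */
    public GroupFraudResult analyzeGroupRisk(LoanApplication app) {
        String userId = app.getUserId();

        // 1. Build the applicant's association network.
        // Dimensions: shared IP, shared device, shared phone number,
        // shared address, shared guarantor.
        // Note: in this excerpt the connections are not fed into the score below.
        List<FraudConnection> connections = findConnections(app);

        // 2. Query Neo4j for a path (up to 3 hops) to any known fraudster
        String cypherQuery = """
            MATCH path = shortestPath(
                (u:User {userId: $userId})-[*1..3]-(f:User {isFraudster: true})
            )
            RETURN path, length(path) as distance
            LIMIT 5
            """;

        List<FraudPath> fraudPaths = neo4jClient.query(cypherQuery)
            .bind(userId).to("userId")
            .fetchAs(FraudPath.class)
            .all();

        // 3. Device fingerprint collision: one device used by several applicants
        List<String> deviceCollisions = checkDeviceCollisions(app.getDeviceFingerprint());

        GroupFraudResult result = new GroupFraudResult();
        result.setFraudPaths(fraudPaths);
        result.setDeviceCollisions(deviceCollisions);

        // Combined risk score
        double riskScore = calculateGroupRiskScore(fraudPaths, deviceCollisions);
        result.setGroupRiskScore(riskScore);

        if (riskScore > 0.7) {
            result.setRiskLevel(RiskLevel.HIGH);
            result.setRecommendation("Suspected ring fraud; recommend rejecting and reporting");
        } else if (riskScore > 0.4) {
            result.setRiskLevel(RiskLevel.MEDIUM);
            result.setRecommendation("Association risk present; recommend manual review");
        } else {
            // Always set a level so callers never see a null risk level.
            // RiskLevel.LOW is an assumed enum constant.
            result.setRiskLevel(RiskLevel.LOW);
            result.setRecommendation("No significant association risk found");
        }

        return result;
    }

    private List<FraudConnection> findConnections(LoanApplication app) {
        List<FraudConnection> connections = new ArrayList<>();

        // IP address association
        List<String> sameIpUsers = findUsersByIP(app.getIpAddress());
        if (sameIpUsers.size() > 5) {
            connections.add(new FraudConnection(ConnectionType.IP,
                app.getIpAddress(), sameIpUsers));
        }

        // Device fingerprint association
        List<String> sameDeviceUsers = findUsersByDevice(app.getDeviceFingerprint());
        if (sameDeviceUsers.size() > 1) {
            connections.add(new FraudConnection(ConnectionType.DEVICE,
                app.getDeviceFingerprint(), sameDeviceUsers));
        }

        // Address association (same building / same residential compound)
        List<String> sameAddressUsers = findUsersByAddress(app.getAddress());
        connections.addAll(buildAddressConnections(sameAddressUsers));

        return connections;
    }
}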
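`calculateGroupRiskScore` is called above but not shown. The sketch below is one possible way to fold the two signals into a score in [0, 1] that works with the 0.7 and 0.4 thresholds. It assumes the inputs have been reduced to path lengths (the `distance` column of the Cypher query) and a count of other users on the same device. The class name `GroupRiskScoring`, the 1/length weighting, and the 0.25-per-collision weight are invented for illustration, not taken from the original system.

```java
import java.util.List;

/** Hypothetical scoring sketch for the group fraud signals. */
final class GroupRiskScoring {

    static double score(List<Integer> fraudPathLengths, int deviceCollisionCount) {
        // The closest fraudster dominates: 1 hop -> 1.0, 2 hops -> 0.5, 3 hops -> 0.33
        double pathRisk = 0.0;
        for (int length : fraudPathLengths) {
            if (length > 0) {
                pathRisk = Math.max(pathRisk, 1.0 / length);
            }
        }

        // Each additional user seen on the same device adds 0.25, capped at 1.0
        double deviceRisk = Math.min(1.0, 0.25 * Math.max(0, deviceCollisionCount));

        // Noisy-OR combination: either signal alone can raise the score,
        // both together raise it further, and the result stays within [0, 1]
        return 1.0 - (1.0 - pathRisk) * (1.0 - deviceRisk);
    }
}
```

Under these assumed weights, a 3-hop path plus two device collisions gives 1 - (2/3)(0.5) ≈ 0.67, which the thresholds above would send to manual review, while a direct 1-hop link to a known fraudster gives 1.0. In a system that actually uses a graph neural network, the GNN's output would presumably replace or supplement hand-set weights like these.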

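Explainability: Risk-Control AI Under Regulatory Requirements

Financial risk control has a hard constraint: you must be able to explain why an application was rejected. Regulators require lenders to give rejected applicants the reasons for the rejection; "the algorithm thinks you are risky" is not an acceptable answer.

SHAP (SHapley Additive exPlanations) is one of the most widely used solutions today: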

import java.util.ArrayList;
import java.util.List;

import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.stereotype.Service;

@Service
public class ModelExplainabilityService {

    @Autowired
    private SHAPExplainerClient shapClient;

    /**
     * Generates an explanation for a decision by finding
     * the features that contributed most to the final score.
     */
    public DecisionExplanation explain(String applicationId,
                                        CreditFeatureVector features,
                                        String decision) {
        // Call the SHAP computation service (computed on the Python side)
        SHAPResponse shapResponse = shapClient.computeSHAP(
            applicationId, features.toArray());

        // Features with the largest positive and negative contributions
        List<FeatureContribution> topPositive = shapResponse.getTopPositiveFeatures(3);
        List<FeatureContribution> topNegative = shapResponse.getTopNegativeFeatures(3);

        // Produce human-readable rejection reasons
        List<String> rejectReasons = new ArrayList<>();
        for (FeatureContribution neg : topNegative) {
            rejectReasons.add(translateToPlainLanguage(neg));
        }

        return DecisionExplanation.builder()
            .applicationId(applicationId)
            .decision(decision)
            .rejectReasons(rejectReasons)
            .topRiskFactors(topNegative)
            .topProtectiveFactors(topPositive)
            .build();
    }

    private String translateToPlainLanguage(FeatureContribution contrib) {
        return switch (contrib.getFeatureName()) {
            case "debt_service_ratio" ->
                String.format("Your average monthly repayments amount to %.0f%% of your income, above the recommended ceiling of 40%%",
                    contrib.getFeatureValue() * 100);
            case "income_cv" ->
                "Your recent income has fluctuated significantly, so income stability is insufficient";
            case "active_loan_count" ->
                String.format("You currently have %d active loans, so your debt concentration is too high",
                    (int) contrib.getFeatureValue());
            case "gambling_expense_ratio" ->
                "Your recent records show a high amount of gambling-related spending";
            default ->
                String.format("Risk indicator [%s] is outside the normal range", contrib.getFeatureName());
        };
    }
}
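The service above delegates the ranking to `getTopNegativeFeatures`, which lives behind the Python-side SHAP service. To show what that step amounts to, here is a small sketch under stated assumptions: the SHAP values arrive as a plain array aligned with the feature order, and a negative value means the feature pushed the credit score down (the score in this article is "higher is better"). `ShapRanking` and its methods are hypothetical names, not part of any SHAP library.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

/** Hypothetical sketch of ranking features by their SHAP contribution. */
final class ShapRanking {

    /** Indices of the k features with the most negative SHAP values, most negative first. */
    static List<Integer> topNegative(double[] shapValues, int k) {
        List<Integer> negatives = new ArrayList<>();
        for (int i = 0; i < shapValues.length; i++) {
            if (shapValues[i] < 0) {
                negatives.add(i);
            }
        }
        // Ascending by SHAP value puts the largest score reductions first
        negatives.sort(Comparator.comparingDouble(i -> shapValues[i]));
        int limit = Math.max(0, Math.min(k, negatives.size()));
        return new ArrayList<>(negatives.subList(0, limit));
    }

    /** SHAP additivity: model output = base value + sum of per-feature contributions. */
    static double reconstruct(double baseValue, double[] shapValues) {
        double total = baseValue;
        for (double value : shapValues) {
            total += value;
        }
        return total;
    }
}
```

For example, with a base value of 650 and contributions of -40, +25, -90, +10, and -5, the reconstructed score is 550, and the top two negative features are the ones at index 2 (-90) and index 0 (-40). The additivity property is what makes the rejection reasons defensible: each stated reason corresponds to a measurable share of the gap between the applicant's score and the baseline.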

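Looking at This from a Risk Engineer's Perspective

After building this system, one of my core beliefs changed: the value of risk-control AI lies not in lowering the bad-debt rate, but in raising the approval rate at the same bad-debt rate.

When a bank rejects 100 applications, perhaps 30 of them are truly risky and the other 70 are good users caught by mistake. What AI can do is accurately identify those 70 good users and let them pass review, without loosening control over the 30 high-risk ones.

This is the real commercial value of financial AI: not simply being "stricter", but being "more accurate".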