Post 1781: Engineering GDPR Compliance in AI Systems (Data Subject Rights and Automated Response)

老张 (Lao Zhang) · 2026/4/30 · about 13 minutes

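If you build AI systems, you probably share one particular scar: the business is running smoothly, and then one day an email arrives from legal saying that an EU user has submitted a GDPR erasure request, and asking whether engineering can respond within the statutory deadline.

Then you start digging through the code, the databases, the vector store... and discover that nobody can actually say where all of that user's data lives.

This is not an edge case. It is the real state of compliance in most AI teams I have seen. Everyone's energy goes into model quality, inference performance, and prompt tuning, and compliance is a "later" problem, until later actually arrives.

In this post we take a serious look at the engineering constraints GDPR places on AI systems, and at how to turn the response to data subject rights into working Java code.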

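I. GDPR's core constraints on AI systems are more detailed than you think

GDPR (the General Data Protection Regulation) has applied since 25 May 2018 and covers organisations that process the personal data of people in the EU. Fines can reach 4% of worldwide annual turnover or EUR 20 million, whichever is higher.

Many people assume GDPR just means "publish a privacy policy". In fact it imposes several concrete technical requirements on AI systems: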

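1. The six rights of the data subject

Right of access (Right to Access): the user can obtain all personal data the company holds about them
Right to rectification (Right to Rectification): the user can have inaccurate data corrected
Right to erasure (Right to Erasure, the "right to be forgotten"): the user can have their personal data deleted
Right to restriction of processing (Right to Restriction): the user can have processing of their data paused
Right to data portability (Right to Portability): the user can receive their data in a structured format
Right to object (Right to Object): the user can object to processing based on legitimate interests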

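2. Special constraints on automated decision-making

Article 22 provides that a data subject has the right not to be subject to a decision based solely on automated processing (including profiling) when that decision produces legal effects concerning them or similarly significantly affects them.

This lands squarely on AI-driven recommendation, credit scoring, and résumé screening.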

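3. Response deadline

The company must respond within one calendar month of receiving the request. The period can be extended by two further months (three in total) for complex or numerous requests, but the requester must be told about the extension within the first month.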
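The deadline arithmetic is simple enough to sketch. The class below is a hypothetical helper (`DsrDeadlines` and its method names are not from the system described here) and assumes the clock starts on the day the request is received. How weekends, public holidays, and identity checks affect the clock is a question for legal, not for this code.

```java
import java.time.LocalDate;

/** Hypothetical helper: derives DSR response deadlines from the receipt date. */
final class DsrDeadlines {
    private DsrDeadlines() {}

    /**
     * One calendar month after receipt. LocalDate.plusMonths clamps to the
     * last valid day when the target month is shorter (31 Jan -> 28/29 Feb).
     */
    static LocalDate initialDeadline(LocalDate receivedOn) {
        return receivedOn.plusMonths(1);
    }

    /** Latest possible deadline once the two-month extension has been invoked. */
    static LocalDate extendedDeadline(LocalDate receivedOn) {
        return receivedOn.plusMonths(3);
    }
}
```

For a request received on 31 January 2026, `initialDeadline` returns 28 February 2026 and `extendedDeadline` returns 30 April 2026.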

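II. Where data lives in an AI system: you have more of it than you know

Start with one exercise: list every place in the AI system that might store personal data.

The list is usually much longer than the team first expects:

1. Relational databases (user, session, and feedback tables)
2. Vector databases (embeddings of the user's past interactions)
3. Model fine-tuning datasets (may contain user-generated content)
4. RAG knowledge bases (if they include user-uploaded documents)
5. Logging systems (request logs, inference logs)
6. Cache layer (conversation context in Redis)
7. Message queues (request messages still in flight)
8. Object storage (files uploaded by users)
9. Monitoring systems (metrics that carry user behaviour)
10. Third-party call logs (data passed along when calling external APIs)

Every one of these layers needs a matching "delete capability" or "export capability". That is why the right to be forgotten is so hard to engineer: it demands a system-wide purge, not the deletion of one table.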
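One lightweight way to keep that list honest is to hold it in code and query it for gaps. This is only a sketch: `DataStore` and `DataStoreInventory` are hypothetical names, and the two capability flags are a simplification of whatever each store really supports.

```java
import java.util.List;
import java.util.stream.Collectors;

/** Hypothetical inventory entry: one place where personal data may live. */
record DataStore(String name, boolean supportsErasure, boolean supportsExport) {}

final class DataStoreInventory {
    private DataStoreInventory() {}

    /** Names of the stores that cannot yet erase a single user's data. */
    static List<String> missingErasure(List<DataStore> stores) {
        return stores.stream()
                .filter(store -> !store.supportsErasure())
                .map(DataStore::name)
                .collect(Collectors.toList());
    }
}
```

Running `missingErasure` in CI against the declared inventory turns "we forgot the request logs" from a production surprise into a failing build.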

III. Designing the data subject rights response system

3.1 Overall architecture

3.2 Core data model

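import jakarta.persistence.*;
import java.time.LocalDateTime;
import lombok.Getter;
import lombok.Setter;

// Data subject rights (DSR) request entity.
// Lombok accessors are added because later listings call getRequestId(), getDeadlineAt(), etc.
@Getter
@Setter
@Entity
@Table(name = "dsr_requests")
public class DataSubjectRequest {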
    
    @Id
    @GeneratedValue(strategy = GenerationType.UUID)
    private String requestId;
    
    @Column(nullable = false)
    private String userId;
    
    @Enumerated(EnumType.STRING)
    @Column(nullable = false)
    private RequestType requestType;
    
    @Enumerated(EnumType.STRING)
    private RequestStatus status;
    
    @Column(name = "submitted_at", nullable = false)
    private LocalDateTime submittedAt;
    
    @Column(name = "deadline_at", nullable = false)
    private LocalDateTime deadlineAt;  // statutory response deadline
    
    @Column(name = "completed_at")
    private LocalDateTime completedAt;
    
    @Column(name = "identity_verified")
    private boolean identityVerified;
    
    @Column(name = "verification_method")
    private String verificationMethod;
    
    // Request details (JSON)
    @Column(columnDefinition = "TEXT")
    private String requestDetails;
    
    // Summary of the processing outcome
    @Column(columnDefinition = "TEXT")
    private String processingNotes;
    
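    public enum RequestType {
        ACCESS,           // right of access
        ERASURE,          // right to erasure
        PORTABILITY,      // right to data portability
        RECTIFICATION,    // right to rectification
        RESTRICTION,      // right to restriction of processing
        OBJECTION         // right to object
    }
    
    public enum RequestStatus {
        RECEIVED,         // received
        IDENTITY_CHECK,   // identity verification in progress
        IN_PROGRESS,      // being processed
        COMPLETED,        // completed
        REJECTED,         // rejected (with a reason)
        EXTENDED          // deadline extended (up to 3 months in total)
    }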
}

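3.3 Identity verification service

This is the step most often overlooked in the whole flow. Before acting on a request, the company needs reasonable assurance that the requester really is the data subject; otherwise someone else could obtain or delete that person's data.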

@Service
@Slf4j
public class IdentityVerificationService {
    
    @Autowired
    private UserRepository userRepository;
    
    @Autowired
    private EmailService emailService;
    
    @Autowired
    private StringRedisTemplate redisTemplate;
    
    // Start verification by emailing a one-time token
    public String initiateEmailVerification(String userId, String requestId) {
        User user = userRepository.findById(userId)
            .orElseThrow(() -> new UserNotFoundException(userId));
        
        String verificationToken = generateSecureToken();
        String cacheKey = "dsr:verify:" + requestId;
        
        // Store the token; valid for 30 minutes
        redisTemplate.opsForValue().set(
            cacheKey,
            verificationToken,
            Duration.ofMinutes(30)
        );
        
        // Send the verification email
        emailService.sendVerificationEmail(
            user.getEmail(),
            requestId,
            verificationToken
        );
        
        log.info("DSR identity verification email sent requestId={} userId={}", requestId, userId);
        return verificationToken;
    }
    
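    // Verify the token
    public boolean verifyToken(String requestId, String token) {
        String cacheKey = "dsr:verify:" + requestId;
        String storedToken = redisTemplate.opsForValue().get(cacheKey);
        
        // A null submitted token would otherwise throw a NullPointerException below
        if (storedToken == null || token == null) {
            log.warn("DSR verification token missing, expired, or not found requestId={}", requestId);
            return false;
        }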
        
        boolean valid = MessageDigest.isEqual(
            storedToken.getBytes(StandardCharsets.UTF_8),
            token.getBytes(StandardCharsets.UTF_8)
        );
        
        if (valid) {
            // Delete the token after a successful check to prevent reuse
            redisTemplate.delete(cacheKey);
            log.info("DSR identity verification succeeded requestId={}", requestId);
        }
        
        return valid;
    }
    
    private String generateSecureToken() {
        byte[] bytes = new byte[32];
        new SecureRandom().nextBytes(bytes);
        return Base64.getUrlEncoder().withoutPadding().encodeToString(bytes);
    }
}

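IV. Engineering the right to be forgotten: the hardest part

The right to erasure (Right to Erasure) is the hardest GDPR right to engineer. The difficulty is that you must delete the data thoroughly while still keeping evidence that the deletion happened, two goals that look contradictory on the surface.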
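One common way to soften that contradiction is to keep only a keyed hash of the user ID in the erasure evidence, so the record can confirm "this subject was erased" without storing the raw identifier. The sketch below assumes HMAC-SHA256 with a secret held outside the audit store; `ErasureEvidence` is a hypothetical name. Note that pseudonymised data still counts as personal data under GDPR, so whether this is sufficient is a call for legal.

```java
import java.nio.charset.StandardCharsets;
import java.security.GeneralSecurityException;
import java.util.Base64;
import javax.crypto.Mac;
import javax.crypto.spec.SecretKeySpec;

/** Hypothetical helper for writing erasure evidence without the raw user ID. */
final class ErasureEvidence {
    private ErasureEvidence() {}

    /** Deterministic keyed hash: the same userId and key always give the same pseudonym. */
    static String pseudonymize(String userId, byte[] secretKey) {
        try {
            Mac mac = Mac.getInstance("HmacSHA256");
            mac.init(new SecretKeySpec(secretKey, "HmacSHA256"));
            byte[] digest = mac.doFinal(userId.getBytes(StandardCharsets.UTF_8));
            return Base64.getUrlEncoder().withoutPadding().encodeToString(digest);
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException("HmacSHA256 is unavailable", e);
        }
    }
}
```

To check later whether a given user was erased, recompute the pseudonym from the ID they present and look it up. Without the key, the stored value cannot be tied back to a person by simply hashing candidate IDs.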

4.1 Data discovery service

@Service
@Slf4j
public class DataDiscoveryService {
    
    @Autowired
    private List<DataSourceAdapter> dataSourceAdapters;
    
    /**
     * Discover the given user's data across every registered data source
     */
    public DataInventory discoverUserData(String userId) {
        DataInventory inventory = new DataInventory(userId);
        
        for (DataSourceAdapter adapter : dataSourceAdapters) {
            try {
                DataSourceInventory sourceInventory = adapter.scanForUser(userId);
                inventory.addSourceInventory(sourceInventory);
                log.info("Data source scan finished source={} userId={} recordCount={}", 
                    adapter.getSourceName(), userId, sourceInventory.getRecordCount());
            } catch (Exception e) {
                log.error("Data source scan failed source={} userId={}", 
                    adapter.getSourceName(), userId, e);
                inventory.addError(adapter.getSourceName(), e.getMessage());
            }
        }
        
        return inventory;
    }
}

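// Relational database adapter
@Component
@Slf4j  // deleteUserData logs via `log`, which the original listing left undeclared
public class RelationalDbDataSourceAdapter implements DataSourceAdapter {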
    
    @Autowired
    private JdbcTemplate jdbcTemplate;
    
    // Tables to scan and, for each, the column that holds the user ID
    private static final Map<String, String> USER_DATA_TABLES = Map.of(
        "user_profiles", "user_id",
        "conversation_history", "user_id",
        "user_feedback", "user_id",
        "ai_interaction_logs", "user_id",
        "user_preferences", "user_id"
    );
    
    @Override
    public DataSourceInventory scanForUser(String userId) {
        DataSourceInventory inventory = new DataSourceInventory(getSourceName());
        
        for (Map.Entry<String, String> entry : USER_DATA_TABLES.entrySet()) {
            String tableName = entry.getKey();
            String userIdColumn = entry.getValue();
            
            String countSql = String.format(
                "SELECT COUNT(*) FROM %s WHERE %s = ?", tableName, userIdColumn
            );
            
            Integer count = jdbcTemplate.queryForObject(countSql, Integer.class, userId);
            if (count != null && count > 0) {
                inventory.addTable(tableName, count);
            }
        }
        
        return inventory;
    }
    
    @Override
    public void deleteUserData(String userId) {
        // Note: delete in foreign-key dependency order (child tables first)
        List<String> deletionOrder = List.of(
            "ai_interaction_logs",
            "user_feedback",
            "conversation_history",
            "user_preferences",
            "user_profiles"
        );
        
        for (String tableName : deletionOrder) {
            String userIdColumn = USER_DATA_TABLES.get(tableName);
            String deleteSql = String.format(
                "DELETE FROM %s WHERE %s = ?", tableName, userIdColumn
            );
            int deleted = jdbcTemplate.update(deleteSql, userId);
            log.info("Relational data deleted table={} userId={} rows={}", tableName, userId, deleted);
        }
    }
    
    @Override
    public String getSourceName() {
        return "relational-db";
    }
}

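4.2 Deleting data from the vector database

This is the real hard part. A vector database (Milvus, Weaviate, Pinecone, and so on) stores embeddings of the user's past conversations, and deleting them is considerably more involved than deleting rows from a relational database.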

@Component
@Slf4j  // deleteUserData logs via `log`, which the original listing left undeclared
public class VectorDbDataSourceAdapter implements DataSourceAdapter {
    
    @Autowired
    private MilvusClient milvusClient;
    
    private static final String COLLECTION_NAME = "user_conversation_vectors";
    
    @Override
    public DataSourceInventory scanForUser(String userId) {
        DataSourceInventory inventory = new DataSourceInventory(getSourceName());
        
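        // Count the user's rows with a metadata filter.
        // userId is interpolated into the expression, so validate or escape it upstream.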
        QueryParam queryParam = QueryParam.newBuilder()
            .withCollectionName(COLLECTION_NAME)
            .withExpr(String.format("user_id == \"%s\"", userId))
            .withOutFields(List.of("pk"))
            .build();
        
        R<QueryResults> response = milvusClient.query(queryParam);
        if (response.getStatus() == R.Status.Success.getCode()) {
            long count = response.getData().getRowCount();
            inventory.addCollection(COLLECTION_NAME, (int) count);
        }
        
        return inventory;
    }
    
    @Override
    public void deleteUserData(String userId) {
        // Milvus deletes by filter expression
        DeleteParam deleteParam = DeleteParam.newBuilder()
            .withCollectionName(COLLECTION_NAME)
            .withExpr(String.format("user_id == \"%s\"", userId))
            .build();
        
        R<MutationResult> response = milvusClient.delete(deleteParam);
        
        if (response.getStatus() == R.Status.Success.getCode()) {
            long deletedCount = response.getData().getDeleteCnt();
            log.info("Vector data deleted userId={} deletedVectors={}", userId, deletedCount);
        } else {
            throw new DataDeletionException(
                "Vector database deletion failed: " + response.getMessage()
            );
        }
    }
    
    @Override
    public String getSourceName() {
        return "vector-db-milvus";
    }
}
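Both methods above build the filter with `String.format`, so a user ID containing a double quote could change the meaning of the expression. A minimal guard, assuming user IDs are limited to letters, digits, `_`, and `-` (an assumption about this system, as is the helper name `VectorFilters`), might look like this:

```java
import java.util.regex.Pattern;

/** Hypothetical helper that builds the user filter only from IDs known to be safe. */
final class VectorFilters {
    // Assumed ID format; adjust to whatever your user IDs actually look like.
    private static final Pattern SAFE_USER_ID = Pattern.compile("[A-Za-z0-9_-]{1,64}");

    private VectorFilters() {}

    /** Returns a filter such as: user_id == "abc-123"; rejects anything outside the allow-list. */
    static String userIdEquals(String userId) {
        if (userId == null || !SAFE_USER_ID.matcher(userId).matches()) {
            throw new IllegalArgumentException("userId is not safe to embed in a filter expression");
        }
        return "user_id == \"" + userId + "\"";
    }
}
```

An allow-list is used here instead of escaping because it does not depend on the exact quoting rules of the expression grammar.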

4.3 Erasure orchestration service

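@Service
@Slf4j
// No class-level @Transactional: the adapters span separate stores that cannot share one
// transaction, and audit records should survive even if a later step fails.
public class ErasureOrchestrationService {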
    
    @Autowired
    private DataDiscoveryService dataDiscoveryService;
    
    @Autowired
    private List<DataSourceAdapter> dataSourceAdapters;
    
    @Autowired
    private DsrAuditLogRepository auditLogRepository;
    
    @Autowired
    private UserRepository userRepository;
    
    /**
     * Execute an erasure (right to be forgotten) request.
     * 
     * Note: the work is deliberately not wrapped in a single transaction,
     * because the data sources do not share a transaction boundary.
     * Each source is deleted independently and every outcome is audited,
     * so failed sources can be retried until the system converges
     * (a Saga-style, eventually consistent approach).
     */
    public ErasureResult executeErasure(String userId, String requestId) {
        ErasureResult result = new ErasureResult(requestId, userId);
        
        log.info("Starting data erasure requestId={} userId={}", requestId, userId);
        
        // 1. Run discovery first to record the state before deletion
        DataInventory beforeInventory = dataDiscoveryService.discoverUserData(userId);
        result.setDataInventoryBefore(beforeInventory);
        
        // 2. Delete from each data source in turn
        for (DataSourceAdapter adapter : dataSourceAdapters) {
            try {
                adapter.deleteUserData(userId);
                result.markSourceDeleted(adapter.getSourceName());
                
                // Audit the deletion for this data source
                recordDeletionAudit(requestId, userId, adapter.getSourceName(), true, null);
                
            } catch (Exception e) {
                log.error("Data source deletion failed source={} userId={}", 
                    adapter.getSourceName(), userId, e);
                result.markSourceFailed(adapter.getSourceName(), e.getMessage());
                recordDeletionAudit(requestId, userId, adapter.getSourceName(), false, e.getMessage());
            }
        }
        
        // 3. Anonymize the user's master record (not a physical delete; the account
        //    shell is kept to meet other legal obligations)
        anonymizeUserAccount(userId);
        
        // 4. Final verification: scan again to confirm the data is gone
        DataInventory afterInventory = dataDiscoveryService.discoverUserData(userId);
        result.setDataInventoryAfter(afterInventory);
        
        log.info("Data erasure finished requestId={} userId={} successSources={} failedSources={}", 
            requestId, userId, result.getSuccessCount(), result.getFailureCount());
        
        return result;
    }
    
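    /**
     * Anonymize the user account.
     * Some data must be retained for legitimate purposes (financial records, for example),
     * but the personally identifying fields can be replaced with an anonymous identifier.
     */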
    private void anonymizeUserAccount(String userId) {
        userRepository.findById(userId).ifPresent(user -> {
            String anonymousId = "DELETED_" + UUID.randomUUID().toString().substring(0, 8);
            user.setEmail(anonymousId + "@deleted.invalid");
            user.setName("已删除用户");
            user.setPhone(null);
            user.setProfileData(null);
            user.setDeletedAt(LocalDateTime.now());
            user.setAnonymized(true);
            userRepository.save(user);
            log.info("User account anonymized userId={}", userId);
        });
    }
    
    private void recordDeletionAudit(String requestId, String userId, 
                                      String sourceName, boolean success, String error) {
        DsrAuditLog auditLog = new DsrAuditLog();
        auditLog.setRequestId(requestId);
        auditLog.setUserId(userId);
        auditLog.setAction("DELETE");
        auditLog.setDataSource(sourceName);
        auditLog.setSuccess(success);
        auditLog.setErrorMessage(error);
        auditLog.setTimestamp(LocalDateTime.now());
        auditLogRepository.save(auditLog);
    }
}

V. Implementing the rights of access and portability

5.1 Data export service

@Service
@Slf4j
public class DataPortabilityService {
    
    @Autowired
    private List<DataExportAdapter> exportAdapters;
    
    @Autowired
    private ObjectStorageService objectStorageService;
    
    /**
     * Build the user's data export package.
     * Format: JSON, structured, machine-readable
     */
    public DataExportPackage generateExportPackage(String userId, String requestId) {
        DataExportPackage pkg = new DataExportPackage();
        pkg.setRequestId(requestId);
        pkg.setUserId(userId);
        pkg.setExportedAt(LocalDateTime.now());
        pkg.setFormatVersion("1.0");
        
        Map<String, Object> exportData = new LinkedHashMap<>();
        
        for (DataExportAdapter adapter : exportAdapters) {
            try {
                Object data = adapter.exportUserData(userId);
                exportData.put(adapter.getDataCategory(), data);
            } catch (Exception e) {
                log.error("Data export failed category={} userId={}", 
                    adapter.getDataCategory(), userId, e);
                exportData.put(adapter.getDataCategory(), 
                    Map.of("error", "Data extraction failed: " + e.getMessage()));
            }
        }
        
        pkg.setData(exportData);
        
        // Serialize to JSON and upload to temporary storage
        String jsonContent = serializeToJson(pkg);
        String downloadUrl = objectStorageService.uploadTempFile(
            "dsr-export-" + requestId + ".json",
            jsonContent.getBytes(StandardCharsets.UTF_8),
            Duration.ofDays(7)  // expires automatically after 7 days
        );
        
        pkg.setDownloadUrl(downloadUrl);
        return pkg;
    }
    
    private String serializeToJson(DataExportPackage pkg) {
        try {
            ObjectMapper mapper = new ObjectMapper();
            mapper.registerModule(new JavaTimeModule());
            mapper.enable(SerializationFeature.INDENT_OUTPUT);
            return mapper.writeValueAsString(pkg);
        } catch (JsonProcessingException e) {
            throw new RuntimeException("JSON serialization failed", e);
        }
    }
}

// Conversation history export adapter
@Component
public class ConversationHistoryExportAdapter implements DataExportAdapter {
    
    @Autowired
    private ConversationRepository conversationRepository;
    
    @Override
    public Object exportUserData(String userId) {
        List<Conversation> conversations = conversationRepository
            .findByUserIdOrderByCreatedAtAsc(userId);
        
        return conversations.stream().map(conv -> {
            Map<String, Object> convData = new LinkedHashMap<>();
            convData.put("conversationId", conv.getId());
            convData.put("startedAt", conv.getCreatedAt());
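            convData.put("messages", conv.getMessages().stream()
                .map(msg -> {
                    // LinkedHashMap keeps key order and, unlike Map.of, tolerates null values
                    Map<String, Object> msgData = new LinkedHashMap<>();
                    msgData.put("role", msg.getRole());
                    msgData.put("content", msg.getContent());
                    msgData.put("timestamp", msg.getTimestamp());
                    return msgData;
                })
                .collect(Collectors.toList())
            );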
            return convData;
        }).collect(Collectors.toList());
    }
    
    @Override
    public String getDataCategory() {
        return "conversation_history";
    }
}

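VI. Transparency requirements for automated decisions

GDPR Article 22's rules on automated decision-making matter most in AI recommendation and scoring scenarios.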

@Service
@Slf4j
public class AutomatedDecisionService {
    
    @Autowired
    private AiModelClient aiModelClient;
    
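    @Autowired
    private DecisionAuditRepository decisionAuditRepository;
    
    // Assumed repository type: isUserOptedOut() uses this field, but the original listing never declared it
    @Autowired
    private UserPreferenceRepository userPreferenceRepository;
    
    /**
     * Make an automated decision and record how it was reached,
     * to support the transparency expected around GDPR Article 22
     */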
    public AutomatedDecisionResult makeDecision(
            String userId, 
            DecisionRequest request) {
        
        // Check whether the user has opted out of automated decisions
        if (isUserOptedOut(userId, request.getDecisionType())) {
            return AutomatedDecisionResult.optedOut(userId, request.getDecisionType());
        }
        
        // Run the AI inference
        ModelInferenceResult inference = aiModelClient.infer(request.getInputData());
        
        // Build the decision result, including explainability information
        AutomatedDecisionResult result = AutomatedDecisionResult.builder()
            .userId(userId)
            .decisionType(request.getDecisionType())
            .outcome(inference.getOutcome())
            .confidence(inference.getConfidence())
            .decisionFactors(inference.getFeatureImportance())  // factors that influenced the decision
            .modelVersion(inference.getModelVersion())
            .decisionTimestamp(LocalDateTime.now())
            .isFullyAutomated(true)
            .humanReviewAvailable(true)  // tells the user that human review can be requested
            .build();
        
        // Write the audit log
        recordDecisionAudit(result);
        
        return result;
    }
    
    /**
     * The user asks for a human review of an automated decision
     */
    public HumanReviewRequest requestHumanReview(String userId, String decisionId) {
        AutomatedDecisionAudit decision = decisionAuditRepository
            .findById(decisionId)
            .orElseThrow(() -> new DecisionNotFoundException(decisionId));
        
        // Make sure the decision belongs to this user
        if (!decision.getUserId().equals(userId)) {
            throw new UnauthorizedException("This decision does not belong to the current user");
        }
        
        HumanReviewRequest reviewRequest = new HumanReviewRequest();
        reviewRequest.setDecisionId(decisionId);
        reviewRequest.setUserId(userId);
        reviewRequest.setRequestedAt(LocalDateTime.now());
        reviewRequest.setStatus(HumanReviewRequest.Status.PENDING);
        
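        // Note: this listing only builds the request; persist or enqueue it before returning
        log.info("User requested human review userId={} decisionId={}", userId, decisionId);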
        
        return reviewRequest;
    }
    
    private boolean isUserOptedOut(String userId, String decisionType) {
        // Look up the opt-out flag in the user's preference settings
        return userPreferenceRepository
            .findByUserIdAndPreferenceKey(userId, "opt_out_automated_" + decisionType)
            .map(pref -> "true".equals(pref.getValue()))
            .orElse(false);
    }
    
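    private void recordDecisionAudit(AutomatedDecisionResult result) {
        AutomatedDecisionAudit audit = new AutomatedDecisionAudit();
        audit.setUserId(result.getUserId());
        audit.setDecisionType(result.getDecisionType());
        audit.setOutcome(result.getOutcome().toString());
        audit.setConfidence(result.getConfidence());
        audit.setDecisionFactors(serializeToJson(result.getDecisionFactors()));
        audit.setModelVersion(result.getModelVersion());
        audit.setDecisionTimestamp(result.getDecisionTimestamp());
        decisionAuditRepository.save(audit);
    }
    
    // Assumed helper (called above but not defined in the original listing): Jackson-based serialization
    private String serializeToJson(Object value) {
        try {
            return new ObjectMapper().writeValueAsString(value);
        } catch (JsonProcessingException e) {
            throw new IllegalStateException("JSON serialization failed", e);
        }
    }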
}

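VII. Request state machine and deadline monitoring

Compliance comes with deadlines, so you need a dedicated monitor to keep requests from running overdue.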

@Component
@Slf4j
public class DsrDeadlineMonitor {
    
    @Autowired
    private DataSubjectRequestRepository dsrRepository;
    
    @Autowired
    private AlertService alertService;
    
    // Check once an hour
    @Scheduled(fixedRate = 3600000)
    public void checkApproachingDeadlines() {
        LocalDateTime now = LocalDateTime.now();
        LocalDateTime warningThreshold = now.plusDays(7);  // warn 7 days ahead
        
        // Open requests only: COMPLETED and REJECTED are both terminal, so neither should alert.
        // The derived query method must be declared on the repository.
        List<DataSubjectRequest> approachingRequests = dsrRepository
            .findByStatusNotInAndDeadlineAtBefore(
                List.of(
                    DataSubjectRequest.RequestStatus.COMPLETED,
                    DataSubjectRequest.RequestStatus.REJECTED),
                warningThreshold
            );
        
        for (DataSubjectRequest request : approachingRequests) {
            // Whole days left; truncates, so under 24 hours remaining shows as 0
            long daysRemaining = ChronoUnit.DAYS.between(now, request.getDeadlineAt());
            
            if (!request.getDeadlineAt().isAfter(now)) {
                // Deadline has passed: high-priority alert
                alertService.sendCriticalAlert(
                    String.format("GDPR request is overdue! requestId=%s userId=%s type=%s",
                        request.getRequestId(),
                        request.getUserId(),
                        request.getRequestType())
                );
                log.error("GDPR data subject request is overdue requestId={} userId={}", 
                    request.getRequestId(), request.getUserId());
            } else if (daysRemaining <= 3) {
                // Due within 3 days: urgent alert
                alertService.sendUrgentAlert(
                    String.format("GDPR request is about to go overdue (%d full days left) requestId=%s",
                        daysRemaining, request.getRequestId())
                );
            } else {
                // Due within 7 days: normal warning
                alertService.sendWarningAlert(
                    String.format("GDPR request is due in %d days requestId=%s",
                        daysRemaining, request.getRequestId())
                );
            }
        }
    }
}
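The monitor covers the deadline half of this section. For the state machine half, one small sketch is an explicit table of allowed transitions between the request statuses. The transition table below is my assumption about a sensible flow, not something GDPR prescribes, and `DsrStateMachine` is a hypothetical name; the enum is repeated only so the snippet stands alone.

```java
import java.util.EnumMap;
import java.util.EnumSet;
import java.util.Map;
import java.util.Set;

/** Hypothetical guard that rejects illegal DSR status changes. */
final class DsrStateMachine {
    enum Status { RECEIVED, IDENTITY_CHECK, IN_PROGRESS, COMPLETED, REJECTED, EXTENDED }

    // Assumed flow: verify identity first, then process; EXTENDED only prolongs processing.
    private static final Map<Status, Set<Status>> ALLOWED = new EnumMap<>(Status.class);
    static {
        ALLOWED.put(Status.RECEIVED, EnumSet.of(Status.IDENTITY_CHECK, Status.REJECTED));
        ALLOWED.put(Status.IDENTITY_CHECK, EnumSet.of(Status.IN_PROGRESS, Status.REJECTED));
        ALLOWED.put(Status.IN_PROGRESS, EnumSet.of(Status.COMPLETED, Status.REJECTED, Status.EXTENDED));
        ALLOWED.put(Status.EXTENDED, EnumSet.of(Status.COMPLETED, Status.REJECTED));
        ALLOWED.put(Status.COMPLETED, EnumSet.noneOf(Status.class)); // terminal
        ALLOWED.put(Status.REJECTED, EnumSet.noneOf(Status.class));  // terminal
    }

    private DsrStateMachine() {}

    static boolean canTransition(Status from, Status to) {
        return ALLOWED.get(from).contains(to);
    }

    /** Returns the new status, or throws if the move is not in the table. */
    static Status transition(Status from, Status to) {
        if (!canTransition(from, to)) {
            throw new IllegalStateException("Illegal DSR transition: " + from + " -> " + to);
        }
        return to;
    }
}
```

Routing every status update through `transition` means a completed or rejected request can never be silently reopened, which also keeps the deadline query predictable.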

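VIII. A few pitfalls I hit, to save you some time

Pitfall 1: assuming that deleting from the database is the end of it

In a pre-launch erasure test we deleted all of a user's data from the relational database, only to find conversation context still in the Redis cache, embeddings still in the vector store, and complete request logs in the logging system carrying the user's data in plaintext. It took two weeks to trace everything one store at a time.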

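Pitfall 2: identity verification that is too lax

An early version accepted a request from anyone who typed in a registered email address, and testing showed that you could file an erasure request with somebody else's email. We later switched to a verification code plus a second confirmation with the account password.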

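Pitfall 3: the deletion itself was not idempotent

The same request was processed twice. The second run hit a foreign-key constraint error on some tables, and the whole erasure flow was marked as failed. We later added a request status check and an existence check before each delete.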
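The request status check can be sketched as a guard that lets each request ID run at most once. This version keeps state in memory purely for illustration; in a real deployment the status would live in the `dsr_requests` table or another shared store, and `IdempotentErasureGuard` is a hypothetical name.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical guard: runs the erasure for a given request ID at most once. */
final class IdempotentErasureGuard {
    enum State { IN_PROGRESS, DONE }

    private final Map<String, State> states = new ConcurrentHashMap<>();

    /** Returns true if the erasure ran now, false if it was already running or finished. */
    boolean runOnce(String requestId, Runnable erasure) {
        // putIfAbsent is atomic, so two concurrent callers cannot both claim the request
        if (states.putIfAbsent(requestId, State.IN_PROGRESS) != null) {
            return false;
        }
        try {
            erasure.run();
            states.put(requestId, State.DONE);
            return true;
        } catch (RuntimeException e) {
            states.remove(requestId); // release the claim so the request can be retried
            throw e;
        }
    }
}
```

The guard complements deletes that are themselves safe to repeat: a `DELETE ... WHERE user_id = ?` that matches zero rows should count as success, not as an error.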

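Pitfall 4: the log retention policy conflicts with GDPR

The security team required six months of complete request logs (including user data), while GDPR says users can have their data erased. We resolved the conflict by relying on "legitimate interest" (Legitimate Interest) as the basis for keeping security logs: they can be retained, but the privacy notice has to say so explicitly and the retention period has to be reasonable.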
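One way to shrink this conflict is to keep obvious identifiers out of log lines before they are written. The sketch below masks email addresses only, with a deliberately simple pattern that will miss unusual addresses; `LogRedactor` is a hypothetical name, and real redaction would also need to cover phone numbers, names, and free-text prompts.

```java
import java.util.regex.Pattern;

/** Hypothetical helper that masks email addresses before a line reaches the log. */
final class LogRedactor {
    // Simplified email pattern: good enough for common addresses, not RFC-complete.
    private static final Pattern EMAIL =
            Pattern.compile("[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\\.[A-Za-z]{2,}");

    private LogRedactor() {}

    static String redact(String message) {
        return EMAIL.matcher(message).replaceAll("[redacted-email]");
    }
}
```

For example, `LogRedactor.redact("login user=alice@example.com ok")` returns `login user=[redacted-email] ok`.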

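Pitfall 5: data held by third-party services

The system calls third-party analytics platforms, customer service systems, and marketing tools, and those platforms hold user data too. GDPR makes the data controller responsible for coordinating the deletion obligations of its data processors. Add a data processing agreement (DPA) to every third-party contract, and contact those platforms to carry out the deletion within the DSR response window.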

IX. Accompanying tests

@SpringBootTest
@Transactional
class ErasureOrchestrationServiceTest {
    
    @Autowired
    private ErasureOrchestrationService erasureService;
    
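    @Autowired
    private DataDiscoveryService discoveryService;
    
    // Needed by shouldAnonymizeUserPII; the original listing used it without declaring it
    @Autowired
    private UserRepository userRepository;
    
    @Test
    @DisplayName("All user data is gone after erasure")
    void shouldEraseAllUserData() {
        // Given (createTestUserData and createTestUser are fixture helpers not shown in this post)
        String userId = "test-user-gdpr-001";
        createTestUserData(userId);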
        
        DataInventory before = discoveryService.discoverUserData(userId);
        assertThat(before.getTotalRecordCount()).isGreaterThan(0);
        
        // When
        ErasureResult result = erasureService.executeErasure(userId, "req-test-001");
        
        // Then
        assertThat(result.getFailureCount()).isEqualTo(0);
        
        DataInventory after = discoveryService.discoverUserData(userId);
        assertThat(after.getTotalRecordCount()).isEqualTo(0);
    }
    
    @Test
    @DisplayName("Original PII cannot be retrieved after anonymization")
    void shouldAnonymizeUserPII() {
        String userId = "test-user-gdpr-002";
        createTestUser(userId, "real@email.com", "真实姓名");
        
        erasureService.executeErasure(userId, "req-test-002");
        
        User user = userRepository.findById(userId).orElseThrow();
        assertThat(user.getEmail()).doesNotContain("real@email.com");
        assertThat(user.getName()).isEqualTo("已删除用户");
        assertThat(user.isAnonymized()).isTrue();
    }
}

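X. Summary

GDPR compliance is not legal's job. It is engineering's job.

From data discovery, identity verification, erasure orchestration, and vector store cleanup through to automated decision transparency and deadline monitoring, every step needs a concrete implementation in code.

A few principles are worth remembering:

Data minimization: an AI system should collect only the data it needs; the less it holds, the easier compliance becomes
Design for deletion: the ability to delete should be a first-class citizen of the data architecture, not a patch applied afterwards
Keep audit logs separate: evidence of a deletion must not live alongside the data being deleted
Test the deletion path: deletion logic deserves the same testing as feature code, plus regular drills

In the next post we will look at 《生成式AI管理办法》, China's generative AI regulation: the domestic counterpart, with a different approach from GDPR and different challenges in practice.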