Article #2238: Quality-Inspection AI — hold on, no: Quality-Inspection AI: Engineering Practice of Visual Inspection on Industrial Production Lines (Part 2238)
Audience: industrial AI engineers, computer-vision developers, manufacturing technology teams | Reading time: ~16 minutes | Core value: a deep dive into the full engineering chain of an industrial visual inspection system, from camera to verdict, and how to resolve the core tension between defect-detection accuracy and real-time constraints
I once spent half a day in the QC room of a PCB factory, watching a dozen inspectors sit at light tables, examining circuit boards one by one through magnifiers. The plant manager told me each board has more than two hundred solder joints to check. A skilled inspector gets through three to four hundred boards a day, but accuracy falls off noticeably over the shift: the miss rate after 4 p.m. is roughly double the morning's.
What worried him more was hiring. Young people don't want the job: staring through a high-power magnifier all day exhausts the eyes, and most quit within two or three years. And even the most experienced inspectors have a hard limit at the micron scale. Certain subtle opens and cold solder joints simply cannot be seen by the human eye and only surface in downstream testing, by which point rework is already expensive.
That is the real demand behind industrial visual inspection: it replaces not just labor, but the perceptual limits of the human eye.
Architectural Considerations for an Industrial Visual Inspection System
Visual inspection in an industrial setting differs fundamentally from internet-style image recognition:
| Dimension | Industrial vision | Internet image recognition |
|---|---|---|
| Accuracy requirement | Micron-level; misses are extremely costly | Good accuracy suffices |
| Latency | Must match the line takt (100-500 ms) | Relatively relaxed |
| Data volume | Defect samples are scarce | Abundant labeled data |
| Deployment | Edge devices beside the line | Cloud servers |
| Explainability | Must localize and classify each defect | Classification suffices |
These differences drive several key design decisions:
Camera Integration and Image Preprocessing
Industrial cameras typically speak GigE Vision or USB3 Vision; on the Java side we call the vendor SDK through JNA:

```java
@Component
public class IndustrialCameraService {

    private static final Logger log = LoggerFactory.getLogger(IndustrialCameraService.class);

    private final CameraSDK cameraSDK;
    private volatile boolean running = false;
    private final BlockingQueue<RawFrame> frameQueue = new LinkedBlockingQueue<>(100);

    public IndustrialCameraService(CameraConfig config) {
        // Initialize the camera SDK (Hikvision MV camera as the example)
        this.cameraSDK = new HikVisionCameraSDK(config);
    }

    /**
     * Start the capture thread. Trigger mode: external
     * (fired by a sensor on the production line).
     */
    public void startCapture() {
        cameraSDK.openCamera();
        cameraSDK.setTriggerMode(TriggerMode.EXTERNAL);
        cameraSDK.setExposureTime(5000); // 5 ms exposure
        cameraSDK.setGain(0);            // minimum gain to reduce noise
        cameraSDK.startGrabbing();
        running = true;
        Thread captureThread = new Thread(this::captureLoop, "camera-capture");
        captureThread.setDaemon(true);
        captureThread.start();
        log.info("Camera capture started: {}", cameraSDK.getCameraInfo());
    }

    private void captureLoop() {
        while (running) {
            try {
                // Wait for a trigger signal, 1 s timeout
                RawFrame frame = cameraSDK.grabFrame(1000);
                if (frame != null) {
                    if (!frameQueue.offer(frame, 100, TimeUnit.MILLISECONDS)) {
                        log.warn("Frame queue full, dropping frame: frameId={}", frame.getFrameId());
                    }
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                break;
            } catch (CameraException e) {
                log.error("Camera capture error, attempting reconnect", e);
                handleCameraError();
            }
        }
    }

    private void handleCameraError() {
        // Sketch only: real recovery depends on the vendor SDK
        // (typically stop, back off, reopen, and re-arm grabbing)
        cameraSDK.stopGrabbing();
        cameraSDK.openCamera();
        cameraSDK.startGrabbing();
    }

    public RawFrame nextFrame(long timeoutMs) throws InterruptedException {
        return frameQueue.poll(timeoutMs, TimeUnit.MILLISECONDS);
    }
}
```
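A design choice buried in `frameQueue.offer(...)` above is worth making explicit: when the queue is full, the newest frame is dropped. A self-contained toy (the class name and frame numbering are mine, not part of the service) contrasting the two common overflow policies:

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

public class OverflowPolicyDemo {

    /** Feed frames 1..5 into a capacity-3 queue under the given overflow policy. */
    static String simulate(boolean dropOldest) {
        BlockingQueue<Integer> q = new ArrayBlockingQueue<>(3);
        for (int frame = 1; frame <= 5; frame++) {
            if (dropOldest) {
                while (!q.offer(frame)) {
                    q.poll(); // evict the stalest frame to make room for the new one
                }
            } else {
                q.offer(frame); // drop-newest: reject the incoming frame when full
            }
        }
        return q.toString();
    }

    public static void main(String[] args) {
        System.out.println(simulate(false)); // drop-newest keeps the oldest frames: [1, 2, 3]
        System.out.println(simulate(true));  // drop-oldest keeps the newest frames: [3, 4, 5]
    }
}
```

Either policy loses a frame, and on an inspection line a lost frame is an uninspected product, so the drop path should also raise an alarm or divert that part to manual review.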
```java
@Service
public class ImagePreprocessor {

    private static final Logger log = LoggerFactory.getLogger(ImagePreprocessor.class);

    /**
     * Image preprocessing pipeline:
     * 1. Undistortion (camera calibration parameters)
     * 2. Illumination normalization (remove vignetting)
     * 3. Sharpening
     * 4. Normalization
     */
    public Mat preprocess(RawFrame frame, CameraCalibration calibration) {
        // Convert to an OpenCV Mat (frameToMat: vendor buffer conversion, omitted)
        Mat raw = frameToMat(frame);

        // 1. Undistortion
        Mat undistorted = new Mat();
        Calib3d.undistort(raw, undistorted,
                calibration.getCameraMatrix(),
                calibration.getDistCoeffs());

        // 2. Illumination normalization: CLAHE (contrast-limited adaptive histogram equalization)
        Mat gray;
        if (undistorted.channels() == 3) {
            gray = new Mat();
            Imgproc.cvtColor(undistorted, gray, Imgproc.COLOR_BGR2GRAY);
        } else {
            gray = undistorted;
        }
        CLAHE clahe = Imgproc.createCLAHE(2.0, new Size(8, 8));
        Mat equalized = new Mat();
        clahe.apply(gray, equalized);

        // 3. Gaussian denoising + unsharp-mask (USM) sharpening
        Mat blurred = new Mat();
        Imgproc.GaussianBlur(equalized, blurred, new Size(0, 0), 1.5);
        Mat sharpened = new Mat();
        Core.addWeighted(equalized, 1.5, blurred, -0.5, 0, sharpened);
        return sharpened;
    }

    /**
     * ROI extraction: locate the product by template matching,
     * then crop out the predefined inspection regions.
     */
    public List<Mat> extractROIs(Mat image, ProductTemplate template) {
        List<Mat> rois = new ArrayList<>();

        // Template matching to find the product position
        Mat result = new Mat();
        Imgproc.matchTemplate(image, template.getTemplate(), result, Imgproc.TM_CCOEFF_NORMED);
        Core.MinMaxLocResult mmr = Core.minMaxLoc(result);
        if (mmr.maxVal < 0.85) {
            log.warn("Template match confidence too low: {}", mmr.maxVal);
            return rois;
        }

        // Crop each inspection region from the match position plus its predefined offset
        Point topLeft = mmr.maxLoc;
        for (ROIDefinition roiDef : template.getRoiDefinitions()) {
            Rect roi = new Rect(
                    (int) (topLeft.x + roiDef.getOffsetX()),
                    (int) (topLeft.y + roiDef.getOffsetY()),
                    roiDef.getWidth(),
                    roiDef.getHeight()
            );
            // Clamp the ROI to the image bounds (OpenCV's Java Rect has no intersect())
            int x = Math.max(roi.x, 0);
            int y = Math.max(roi.y, 0);
            int w = Math.min(roi.x + roi.width, image.width()) - x;
            int h = Math.min(roi.y + roi.height, image.height()) - y;
            if (w > 0 && h > 0) {
                rois.add(new Mat(image, new Rect(x, y, w, h)));
            }
        }
        return rois;
    }
}
```

AI Inference Engine: Model Deployment and Invocation
We deploy the model with ONNX Runtime, with GPU acceleration enabled:

```java
@Service
public class DefectDetectionEngine {

    private static final Logger log = LoggerFactory.getLogger(DefectDetectionEngine.class);

    private final OrtEnvironment env;
    private final OrtSession session;
    private final DefectModelConfig config;

    public DefectDetectionEngine(DefectModelConfig config) throws OrtException {
        this.config = config;
        this.env = OrtEnvironment.getEnvironment();
        // Configure the ONNX Runtime session with CUDA enabled
        OrtSession.SessionOptions options = new OrtSession.SessionOptions();
        options.addCUDA(0); // use GPU 0
        options.setOptimizationLevel(OrtSession.SessionOptions.OptLevel.ALL_OPT);
        options.setInterOpNumThreads(4);
        this.session = env.createSession(config.getModelPath(), options);
        log.info("ONNX model loaded: {}", config.getModelPath());
    }

    /**
     * Run defect detection on a single ROI.
     * Returns the detected defects (location + type + confidence).
     */
    public List<DefectDetection> detect(Mat roiImage) throws OrtException {
        // Preprocess: resize + normalize
        Mat resized = new Mat();
        Imgproc.resize(roiImage, resized, new Size(config.getInputWidth(), config.getInputHeight()));
        float[] inputData = matToFloatArray(resized);
        long[] shape = {1, 3, config.getInputHeight(), config.getInputWidth()};
        OnnxTensor inputTensor = OnnxTensor.createTensor(env, FloatBuffer.wrap(inputData), shape);

        // Inference
        long startTime = System.currentTimeMillis();
        OrtSession.Result output = session.run(Collections.singletonMap("input", inputTensor));
        log.debug("Inference took {} ms", System.currentTimeMillis() - startTime);

        // Parse the output (post-processed YOLOv8 layout: [batch, num_detections, 6],
        // where 6 = x, y, w, h, confidence, class); note the leading batch dimension
        float[][][] raw = (float[][][]) output.get(0).getValue();
        // NMS removes overlapping boxes
        return applyNMS(parseDetections(raw[0]), config.getNmsThreshold());
    }

    /**
     * Batch inference to improve GPU utilization.
     */
    public List<List<DefectDetection>> detectBatch(List<Mat> roiImages) throws OrtException {
        int batchSize = roiImages.size();
        float[] batchData = new float[batchSize * 3 * config.getInputHeight() * config.getInputWidth()];
        for (int i = 0; i < batchSize; i++) {
            Mat resized = new Mat();
            Imgproc.resize(roiImages.get(i), resized,
                    new Size(config.getInputWidth(), config.getInputHeight()));
            float[] imgData = matToFloatArray(resized);
            System.arraycopy(imgData, 0, batchData, i * imgData.length, imgData.length);
        }
        long[] shape = {batchSize, 3, config.getInputHeight(), config.getInputWidth()};
        OnnxTensor inputTensor = OnnxTensor.createTensor(env, FloatBuffer.wrap(batchData), shape);
        OrtSession.Result output = session.run(Collections.singletonMap("input", inputTensor));

        float[][][] raw = (float[][][]) output.get(0).getValue();
        List<List<DefectDetection>> results = new ArrayList<>();
        for (int i = 0; i < batchSize; i++) {
            results.add(applyNMS(parseDetections(raw[i]), config.getNmsThreshold()));
        }
        return results;
    }

    private List<DefectDetection> parseDetections(float[][] predictions) {
        List<DefectDetection> detections = new ArrayList<>();
        for (float[] pred : predictions) {
            float confidence = pred[4];
            if (confidence < config.getConfidenceThreshold()) continue;
            int classId = (int) pred[5];
            DefectType defectType = config.getDefectTypeById(classId);
            detections.add(DefectDetection.builder()
                    .defectType(defectType)
                    .confidence(confidence)
                    .boundingBox(new BoundingBox(pred[0], pred[1], pred[2], pred[3]))
                    .build());
        }
        return detections;
    }

    /**
     * Greedy NMS: keep the highest-confidence box, drop any box that
     * overlaps an already-kept one beyond the IoU threshold.
     */
    private List<DefectDetection> applyNMS(List<DefectDetection> detections, float iouThreshold) {
        detections.sort(Comparator.comparing(DefectDetection::getConfidence).reversed());
        List<DefectDetection> kept = new ArrayList<>();
        for (DefectDetection candidate : detections) {
            boolean suppressed = kept.stream().anyMatch(
                    k -> iou(k.getBoundingBox(), candidate.getBoundingBox()) > iouThreshold);
            if (!suppressed) kept.add(candidate);
        }
        return kept;
    }

    private float iou(BoundingBox a, BoundingBox b) {
        // Boxes are (center-x, center-y, width, height), as in the YOLO output;
        // assumes BoundingBox exposes center/size accessors
        float x1 = Math.max(a.getX() - a.getW() / 2, b.getX() - b.getW() / 2);
        float y1 = Math.max(a.getY() - a.getH() / 2, b.getY() - b.getH() / 2);
        float x2 = Math.min(a.getX() + a.getW() / 2, b.getX() + b.getW() / 2);
        float y2 = Math.min(a.getY() + a.getH() / 2, b.getY() + b.getH() / 2);
        float inter = Math.max(0, x2 - x1) * Math.max(0, y2 - y1);
        float union = a.getW() * a.getH() + b.getW() * b.getH() - inter;
        return union <= 0 ? 0 : inter / union;
    }

    private float[] matToFloatArray(Mat mat) {
        // BGR -> RGB, scale to [0,1], ImageNet mean/std normalization, CHW layout
        Mat rgbMat = new Mat();
        Imgproc.cvtColor(mat, rgbMat, Imgproc.COLOR_BGR2RGB);
        float[] data = new float[3 * mat.rows() * mat.cols()];
        float[] meanValues = {0.485f, 0.456f, 0.406f};
        float[] stdValues = {0.229f, 0.224f, 0.225f};
        int idx = 0;
        for (int c = 0; c < 3; c++) {
            for (int h = 0; h < mat.rows(); h++) {
                for (int w = 0; w < mat.cols(); w++) {
                    double[] pixel = rgbMat.get(h, w);
                    data[idx++] = (float) ((pixel[c] / 255.0 - meanValues[c]) / stdValues[c]);
                }
            }
        }
        return data;
    }
}
```

Verdict Logic: A Two-Layer Filtering Mechanism
Industrial inspection tolerates very few wrong verdicts. We designed a two-layer decision process: a rule layer on top of the AI layer:
```java
@Service
public class QualityJudgementService {

    private static final Logger log = LoggerFactory.getLogger(QualityJudgementService.class);

    @Autowired
    private DefectDetectionEngine detectionEngine;
    @Autowired
    private InspectionRuleRepository ruleRepo;
    @Autowired
    private QualityResultRepository resultRepo;

    public InspectionResult inspect(String productId, List<Mat> roiImages,
                                    String productType) {
        // Batch inference over all ROIs
        List<List<DefectDetection>> allDetections;
        try {
            allDetections = detectionEngine.detectBatch(roiImages);
        } catch (Exception e) {
            log.error("Inference engine failure, product {} falls back to manual review", productId, e);
            return InspectionResult.manualReview(productId, "inference service error");
        }

        // Aggregate detections across ROIs
        List<DefectDetection> allDefects = new ArrayList<>();
        for (int i = 0; i < allDetections.size(); i++) {
            List<DefectDetection> roiDefects = allDetections.get(i);
            // Tag each defect with its source ROI so coordinates can be mapped
            // back to the full image later; the loop variable must be copied to
            // an effectively-final local for use inside the lambda
            final int roiIndex = i;
            roiDefects.forEach(d -> d.setRoiIndex(roiIndex));
            allDefects.addAll(roiDefects);
        }

        // Load the verdict rules for this product type and apply them
        List<InspectionRule> rules = ruleRepo.findByProductType(productType);
        JudgementDecision decision = applyRules(allDefects, rules);

        InspectionResult result = InspectionResult.builder()
                .productId(productId)
                .productType(productType)
                .decision(decision.isPass() ? "PASS" : "FAIL")
                .defects(allDefects)
                .failureReasons(decision.getFailureReasons())
                .inspectionTime(Instant.now())
                .build();
        resultRepo.save(result);
        return result;
    }

    private JudgementDecision applyRules(List<DefectDetection> defects,
                                         List<InspectionRule> rules) {
        List<String> failureReasons = new ArrayList<>();
        for (InspectionRule rule : rules) {
            // Count the defects that match this rule
            long count = defects.stream()
                    .filter(d -> d.getDefectType() == rule.getDefectType())
                    .filter(d -> d.getConfidence() >= rule.getMinConfidence())
                    .count();
            if (count > rule.getMaxAllowed()) {
                failureReasons.add(String.format(
                        "defect type [%s]: %d found, exceeds the allowed %d",
                        rule.getDefectType().getDisplayName(), count, rule.getMaxAllowed()));
            }
            // Critical defects: a single occurrence scraps the part
            if (rule.isCritical() && count > 0) {
                failureReasons.add(String.format(
                        "critical defect [%s] found, part rejected outright",
                        rule.getDefectType().getDisplayName()));
            }
        }
        return new JudgementDecision(failureReasons.isEmpty(), failureReasons);
    }
}
```

Few-Shot Learning: Coping with Scarce Defect Data
When a line first goes live, some rare defect types may have only a few dozen samples. We handle those cases with few-shot learning:
```java
@Service
public class FewShotDefectClassifier {

    private static final Logger log = LoggerFactory.getLogger(FewShotDefectClassifier.class);

    @Autowired
    private FeatureExtractor featureExtractor; // pretrained backbone

    private final Map<DefectType, float[]> prototypeVectors = new ConcurrentHashMap<>();

    /**
     * Register the prototype vector for a few-shot defect type:
     * the class center computed over the support-set samples.
     */
    public void registerDefectPrototype(DefectType defectType, List<Mat> supportImages) {
        List<float[]> features = supportImages.stream()
                .map(featureExtractor::extract)
                .collect(Collectors.toList());
        // Class center (prototype) = element-wise mean of the support features
        prototypeVectors.put(defectType, averageVectors(features));
        log.info("Prototype registered for defect type {}, support-set size: {}",
                defectType, supportImages.size());
    }

    /**
     * Prototypical-network classification: compare the query sample
     * against each class prototype and pick the nearest.
     */
    public DefectClassificationResult classify(Mat queryImage) {
        float[] queryFeature = featureExtractor.extract(queryImage);
        DefectType bestMatch = null;
        float minDistance = Float.MAX_VALUE;
        for (Map.Entry<DefectType, float[]> entry : prototypeVectors.entrySet()) {
            float distance = euclideanDistance(queryFeature, entry.getValue());
            if (distance < minDistance) {
                minDistance = distance;
                bestMatch = entry.getKey();
            }
        }
        // Turn the distance into a confidence score (a softmax over all class
        // distances is the fuller treatment; this uses a simple exponential decay)
        float confidence = distanceToConfidence(minDistance);
        return new DefectClassificationResult(bestMatch, confidence, minDistance);
    }

    private float[] averageVectors(List<float[]> vectors) {
        float[] avg = new float[vectors.get(0).length];
        for (float[] v : vectors) {
            for (int i = 0; i < v.length; i++) avg[i] += v[i];
        }
        for (int i = 0; i < avg.length; i++) avg[i] /= vectors.size();
        return avg;
    }

    private float distanceToConfidence(float distance) {
        return (float) Math.exp(-distance);
    }

    private float euclideanDistance(float[] a, float[] b) {
        float sum = 0;
        for (int i = 0; i < a.length; i++) {
            float diff = a[i] - b[i];
            sum += diff * diff;
        }
        return (float) Math.sqrt(sum);
    }
}
```

Real-Time Optimization: Beating the Line Takt
The line takt may be 2-3 parts per second, so each inspection must complete within 300 ms. Three optimizations matter most:
1. Model quantization: FP32 -> INT8 gives a 3-4x inference speedup, typically with under 1% accuracy loss.
2. TensorRT acceleration: converting the ONNX model to TensorRT buys another 2-3x on NVIDIA GPUs.
3. Asynchronous pipelining: capture, preprocessing, inference, and post-processing run in parallel, each stage working on a different frame.
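The arithmetic behind item 1 fits in a few lines. Real quantization happens offline in the model toolchain (which also calibrates activation ranges), but a toy sketch of symmetric per-tensor INT8 weight quantization shows where the "under 1% loss" comes from; the class name and values here are illustrative, not from any toolchain:

```java
public class Int8QuantDemo {

    /** Symmetric per-tensor scale: map [-maxAbs, maxAbs] onto [-127, 127]. */
    static float scaleFor(float[] weights) {
        float maxAbs = 0;
        for (float w : weights) maxAbs = Math.max(maxAbs, Math.abs(w));
        return maxAbs / 127f;
    }

    static byte[] quantize(float[] weights, float scale) {
        byte[] q = new byte[weights.length];
        for (int i = 0; i < weights.length; i++) {
            // Round to the nearest integer step, clamped to the INT8 range
            q[i] = (byte) Math.max(-127, Math.min(127, Math.round(weights[i] / scale)));
        }
        return q;
    }

    static float[] dequantize(byte[] q, float scale) {
        float[] out = new float[q.length];
        for (int i = 0; i < q.length; i++) out[i] = q[i] * scale;
        return out;
    }

    public static void main(String[] args) {
        float[] weights = {0.31f, -1.27f, 0.002f, 0.9f};
        float scale = scaleFor(weights);     // 1.27 / 127, about 0.01
        byte[] q = quantize(weights, scale); // [31, -127, 0, 90]
        System.out.println(java.util.Arrays.toString(dequantize(q, scale)));
    }
}
```

The round-trip error per weight is bounded by half the scale step, which is why a well-calibrated INT8 model loses so little accuracy while its arithmetic becomes far cheaper.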
The asynchronous pipeline (item 3) is wired up as four single-threaded stages connected by bounded queues:

```java
@Component
public class InspectionPipeline {

    // Multi-stage pipeline, one bounded queue per stage boundary
    private final BlockingQueue<RawFrame> captureQueue = new LinkedBlockingQueue<>(10);
    private final BlockingQueue<ProcessedImage> preprocessQueue = new LinkedBlockingQueue<>(10);
    private final BlockingQueue<InferenceTask> inferenceQueue = new LinkedBlockingQueue<>(10);
    private final ExecutorService pipelineExecutor = Executors.newFixedThreadPool(4);

    @PostConstruct
    public void start() {
        // Stage 1: camera capture (single thread, synchronized with the camera SDK)
        pipelineExecutor.submit(this::captureStage);
        // Stage 2: image preprocessing (single thread, CPU-bound)
        pipelineExecutor.submit(this::preprocessStage);
        // Stage 3: AI inference (single thread, owns the GPU)
        pipelineExecutor.submit(this::inferenceStage);
        // Stage 4: post-processing and reporting (single thread)
        pipelineExecutor.submit(this::postprocessStage);
    }

    // Each stage body is a take-from-upstream / put-downstream loop (omitted here)
}
```

Engineering Lessons
After several factory vision-inspection projects, a few of my beliefs were overturned:
I used to think model accuracy mattered most. It turns out lighting stability matters more. The same model that hits 95% accuracy under standard illumination can drop to 70% when the light source ages or the product's reflectivity shifts. Time spent on the lighting design and light-source maintenance pays off better than time spent on the model.
I used to think misses were the biggest risk. It turns out false rejects (scrapping good product) are just as damaging, especially for high-value parts. Tighten the thresholds too far and the false-reject rate climbs, operators start manually releasing every alarm, and the system becomes decoration.
I used to think go-live was the finish line. It turns out the model needs continuous learning. Product specs change, material batches change, and the "normal range" of good parts drifts; without model updates, accuracy quietly decays.
