Article #2238: Quality-Inspection AI — hold on, no: Quality-Inspection AI: Engineering Practice of Visual Inspection on Industrial Production Lines (Part 2238)
Audience: industrial AI engineers, computer-vision developers, manufacturing technology teams | Reading time: ~16 minutes | Core value: a deep dive into the full engineering chain of an industrial visual inspection system, from camera to verdict, and how to resolve the core tension between defect-detection accuracy and real-time constraints
I once spent half a day in the QC room of a PCB factory, watching a dozen inspectors sit at light tables, examining circuit boards one by one through magnifiers. The plant manager told me each board has more than two hundred solder joints to check. A skilled inspector gets through three to four hundred boards a day, but accuracy falls off noticeably over the shift: the miss rate after 4 p.m. is roughly double the morning's.
What worried him more was hiring. Young people don't want the job: staring through a high-power magnifier all day exhausts the eyes, and most quit within two or three years. And even the most experienced inspectors have a hard limit at the micron scale. Certain subtle opens and cold solder joints simply cannot be seen by the human eye and only surface in downstream testing, by which point rework is already expensive.
That is the real demand behind industrial visual inspection: it replaces not just labor, but the perceptual limits of the human eye.
Architectural Considerations for an Industrial Visual Inspection System
Visual inspection in an industrial setting differs fundamentally from internet-style image recognition:
| Dimension | Industrial vision | Internet image recognition |
|---|---|---|
| Accuracy requirement | Micron-level; misses are extremely costly | Good accuracy suffices |
| Latency | Must match the line takt (100-500 ms) | Relatively relaxed |
| Data volume | Defect samples are scarce | Abundant labeled data |
| Deployment | Edge devices beside the line | Cloud servers |
| Explainability | Must localize and classify each defect | Classification suffices |
These differences drive several key design decisions:
Camera Integration and Image Preprocessing
Industrial cameras typically speak GigE Vision or USB3 Vision; on the Java side we call the vendor SDK through JNA:

```java
@Component
public class IndustrialCameraService {

    private static final Logger log = LoggerFactory.getLogger(IndustrialCameraService.class);

    private final CameraSDK cameraSDK;
    private volatile boolean running = false;
    private final BlockingQueue<RawFrame> frameQueue = new LinkedBlockingQueue<>(100);

    public IndustrialCameraService(CameraConfig config) {
        // Initialize the camera SDK (Hikvision MV camera as the example)
        this.cameraSDK = new HikVisionCameraSDK(config);
    }

    /**
     * Start the capture thread. Trigger mode: external
     * (fired by a sensor on the production line).
     */
    public void startCapture() {
        cameraSDK.openCamera();
        cameraSDK.setTriggerMode(TriggerMode.EXTERNAL);
        cameraSDK.setExposureTime(5000); // 5 ms exposure
        cameraSDK.setGain(0);            // minimum gain to reduce noise
        cameraSDK.startGrabbing();
        running = true;
        Thread captureThread = new Thread(this::captureLoop, "camera-capture");
        captureThread.setDaemon(true);
        captureThread.start();
        log.info("Camera capture started: {}", cameraSDK.getCameraInfo());
    }

    private void captureLoop() {
        while (running) {
            try {
                // Wait for a trigger signal, 1 s timeout
                RawFrame frame = cameraSDK.grabFrame(1000);
                if (frame != null) {
                    if (!frameQueue.offer(frame, 100, TimeUnit.MILLISECONDS)) {
                        log.warn("Frame queue full, dropping frame: frameId={}", frame.getFrameId());
                    }
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                break;
            } catch (CameraException e) {
                log.error("Camera capture error, attempting reconnect", e);
                handleCameraError();
            }
        }
    }

    private void handleCameraError() {
        // Sketch only: real recovery depends on the vendor SDK
        // (typically stop, back off, reopen, and re-arm grabbing)
        cameraSDK.stopGrabbing();
        cameraSDK.openCamera();
        cameraSDK.startGrabbing();
    }

    public RawFrame nextFrame(long timeoutMs) throws InterruptedException {
        return frameQueue.poll(timeoutMs, TimeUnit.MILLISECONDS);
    }
}
```
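A design choice buried in `frameQueue.offer(...)` above is worth making explicit: when the queue is full, the newest frame is dropped. A self-contained toy (the class name and frame numbering are mine, not part of the service) contrasting the two common overflow policies:

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

public class OverflowPolicyDemo {

    /** Feed frames 1..5 into a capacity-3 queue under the given overflow policy. */
    static String simulate(boolean dropOldest) {
        BlockingQueue<Integer> q = new ArrayBlockingQueue<>(3);
        for (int frame = 1; frame <= 5; frame++) {
            if (dropOldest) {
                while (!q.offer(frame)) {
                    q.poll(); // evict the stalest frame to make room for the new one
                }
            } else {
                q.offer(frame); // drop-newest: reject the incoming frame when full
            }
        }
        return q.toString();
    }

    public static void main(String[] args) {
        System.out.println(simulate(false)); // drop-newest keeps the oldest frames: [1, 2, 3]
        System.out.println(simulate(true));  // drop-oldest keeps the newest frames: [3, 4, 5]
    }
}
```

Either policy loses a frame, and on an inspection line a lost frame is an uninspected product, so the drop path should also raise an alarm or divert that part to manual review.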
```java
@Service
public class ImagePreprocessor {

    private static final Logger log = LoggerFactory.getLogger(ImagePreprocessor.class);

    /**
     * Image preprocessing pipeline:
     * 1. Undistortion (camera calibration parameters)
     * 2. Illumination normalization (remove vignetting)
     * 3. Sharpening
     * 4. Normalization
     */
    public Mat preprocess(RawFrame frame, CameraCalibration calibration) {
        // Convert to an OpenCV Mat (frameToMat: vendor buffer conversion, omitted)
        Mat raw = frameToMat(frame);

        // 1. Undistortion
        Mat undistorted = new Mat();
        Calib3d.undistort(raw, undistorted,
                calibration.getCameraMatrix(),
                calibration.getDistCoeffs());

        // 2. Illumination normalization: CLAHE (contrast-limited adaptive histogram equalization)
        Mat gray;
        if (undistorted.channels() == 3) {
            gray = new Mat();
            Imgproc.cvtColor(undistorted, gray, Imgproc.COLOR_BGR2GRAY);
        } else {
            gray = undistorted;
        }
        CLAHE clahe = Imgproc.createCLAHE(2.0, new Size(8, 8));
        Mat equalized = new Mat();
        clahe.apply(gray, equalized);

        // 3. Gaussian denoising + unsharp-mask (USM) sharpening
        Mat blurred = new Mat();
        Imgproc.GaussianBlur(equalized, blurred, new Size(0, 0), 1.5);
        Mat sharpened = new Mat();
        Core.addWeighted(equalized, 1.5, blurred, -0.5, 0, sharpened);
        return sharpened;
    }

    /**
     * ROI extraction: locate the product by template matching,
     * then crop out the predefined inspection regions.
     */
    public List<Mat> extractROIs(Mat image, ProductTemplate template) {
        List<Mat> rois = new ArrayList<>();

        // Template matching to find the product position
        Mat result = new Mat();
        Imgproc.matchTemplate(image, template.getTemplate(), result, Imgproc.TM_CCOEFF_NORMED);
        Core.MinMaxLocResult mmr = Core.minMaxLoc(result);
        if (mmr.maxVal < 0.85) {
            log.warn("Template match confidence too low: {}", mmr.maxVal);
            return rois;
        }

        // Crop each inspection region from the match position plus its predefined offset
        Point topLeft = mmr.maxLoc;
        for (ROIDefinition roiDef : template.getRoiDefinitions()) {
            Rect roi = new Rect(
                    (int) (topLeft.x + roiDef.getOffsetX()),
                    (int) (topLeft.y + roiDef.getOffsetY()),
                    roiDef.getWidth(),
                    roiDef.getHeight()
            );
            // Clamp the ROI to the image bounds (OpenCV's Java Rect has no intersect())
            int x = Math.max(roi.x, 0);
            int y = Math.max(roi.y, 0);
            int w = Math.min(roi.x + roi.width, image.width()) - x;
            int h = Math.min(roi.y + roi.height, image.height()) - y;
            if (w > 0 && h > 0) {
                rois.add(new Mat(image, new Rect(x, y, w, h)));
            }
        }
        return rois;
    }
}
```

AI Inference Engine: Model Deployment and Invocation
We deploy the model with ONNX Runtime, with GPU acceleration enabled:

```java
@Service
public class DefectDetectionEngine {

    private static final Logger log = LoggerFactory.getLogger(DefectDetectionEngine.class);

    private final OrtEnvironment env;
    private final OrtSession session;
    private final DefectModelConfig config;

    public DefectDetectionEngine(DefectModelConfig config) throws OrtException {
        this.config = config;
        this.env = OrtEnvironment.getEnvironment();
        // Configure the ONNX Runtime session with CUDA enabled
        OrtSession.SessionOptions options = new OrtSession.SessionOptions();
        options.addCUDA(0); // use GPU 0
        options.setOptimizationLevel(OrtSession.SessionOptions.OptLevel.ALL_OPT);
        options.setInterOpNumThreads(4);
        this.session = env.createSession(config.getModelPath(), options);
        log.info("ONNX model loaded: {}", config.getModelPath());
    }

    /**
     * Run defect detection on a single ROI.
     * Returns the detected defects (location + type + confidence).
     */
    public List<DefectDetection> detect(Mat roiImage) throws OrtException {
        // Preprocess: resize + normalize
        Mat resized = new Mat();
        Imgproc.resize(roiImage, resized, new Size(config.getInputWidth(), config.getInputHeight()));
        float[] inputData = matToFloatArray(resized);
        long[] shape = {1, 3, config.getInputHeight(), config.getInputWidth()};
        OnnxTensor inputTensor = OnnxTensor.createTensor(env, FloatBuffer.wrap(inputData), shape);

        // Inference
        long startTime = System.currentTimeMillis();
        OrtSession.Result output = session.run(Collections.singletonMap("input", inputTensor));
        log.debug("Inference took {} ms", System.currentTimeMillis() - startTime);

        // Parse the output (post-processed YOLOv8 layout: [batch, num_detections, 6],
        // where 6 = x, y, w, h, confidence, class); note the leading batch dimension
        float[][][] raw = (float[][][]) output.get(0).getValue();
        // NMS removes overlapping boxes
        return applyNMS(parseDetections(raw[0]), config.getNmsThreshold());
    }

    /**
     * Batch inference to improve GPU utilization.
     */
    public List<List<DefectDetection>> detectBatch(List<Mat> roiImages) throws OrtException {
        int batchSize = roiImages.size();
        float[] batchData = new float[batchSize * 3 * config.getInputHeight() * config.getInputWidth()];
        for (int i = 0; i < batchSize; i++) {
            Mat resized = new Mat();
            Imgproc.resize(roiImages.get(i), resized,
                    new Size(config.getInputWidth(), config.getInputHeight()));
            float[] imgData = matToFloatArray(resized);
            System.arraycopy(imgData, 0, batchData, i * imgData.length, imgData.length);
        }
        long[] shape = {batchSize, 3, config.getInputHeight(), config.getInputWidth()};
        OnnxTensor inputTensor = OnnxTensor.createTensor(env, FloatBuffer.wrap(batchData), shape);
        OrtSession.Result output = session.run(Collections.singletonMap("input", inputTensor));

        float[][][] raw = (float[][][]) output.get(0).getValue();
        List<List<DefectDetection>> results = new ArrayList<>();
        for (int i = 0; i < batchSize; i++) {
            results.add(applyNMS(parseDetections(raw[i]), config.getNmsThreshold()));
        }
        return results;
    }

    private List<DefectDetection> parseDetections(float[][] predictions) {
        List<DefectDetection> detections = new ArrayList<>();
        for (float[] pred : predictions) {
            float confidence = pred[4];
            if (confidence < config.getConfidenceThreshold()) continue;
            int classId = (int) pred[5];
            DefectType defectType = config.getDefectTypeById(classId);
            detections.add(DefectDetection.builder()
                    .defectType(defectType)
                    .confidence(confidence)
                    .boundingBox(new BoundingBox(pred[0], pred[1], pred[2], pred[3]))
                    .build());
        }
        return detections;
    }

    /**
     * Greedy NMS: keep the highest-confidence box, drop any box that
     * overlaps an already-kept one beyond the IoU threshold.
     */
    private List<DefectDetection> applyNMS(List<DefectDetection> detections, float iouThreshold) {
        detections.sort(Comparator.comparing(DefectDetection::getConfidence).reversed());
        List<DefectDetection> kept = new ArrayList<>();
        for (DefectDetection candidate : detections) {
            boolean suppressed = kept.stream().anyMatch(
                    k -> iou(k.getBoundingBox(), candidate.getBoundingBox()) > iouThreshold);
            if (!suppressed) kept.add(candidate);
        }
        return kept;
    }

    private float iou(BoundingBox a, BoundingBox b) {
        // Boxes are (center-x, center-y, width, height), as in the YOLO output;
        // assumes BoundingBox exposes center/size accessors
        float x1 = Math.max(a.getX() - a.getW() / 2, b.getX() - b.getW() / 2);
        float y1 = Math.max(a.getY() - a.getH() / 2, b.getY() - b.getH() / 2);
        float x2 = Math.min(a.getX() + a.getW() / 2, b.getX() + b.getW() / 2);
        float y2 = Math.min(a.getY() + a.getH() / 2, b.getY() + b.getH() / 2);
        float inter = Math.max(0, x2 - x1) * Math.max(0, y2 - y1);
        float union = a.getW() * a.getH() + b.getW() * b.getH() - inter;
        return union <= 0 ? 0 : inter / union;
    }

    private float[] matToFloatArray(Mat mat) {
        // BGR -> RGB, scale to [0,1], ImageNet mean/std normalization, CHW layout
        Mat rgbMat = new Mat();
        Imgproc.cvtColor(mat, rgbMat, Imgproc.COLOR_BGR2RGB);
        float[] data = new float[3 * mat.rows() * mat.cols()];
        float[] meanValues = {0.485f, 0.456f, 0.406f};
        float[] stdValues = {0.229f, 0.224f, 0.225f};
        int idx = 0;
        for (int c = 0; c < 3; c++) {
            for (int h = 0; h < mat.rows(); h++) {
                for (int w = 0; w < mat.cols(); w++) {
                    double[] pixel = rgbMat.get(h, w);
                    data[idx++] = (float) ((pixel[c] / 255.0 - meanValues[c]) / stdValues[c]);
                }
            }
        }
        return data;
    }
}
```

Verdict Logic: A Two-Layer Filtering Mechanism
Industrial inspection tolerates very few wrong verdicts. We designed a two-layer decision process: a rule layer on top of the AI layer:
```java
@Service
public class QualityJudgementService {

    private static final Logger log = LoggerFactory.getLogger(QualityJudgementService.class);

    @Autowired
    private DefectDetectionEngine detectionEngine;
    @Autowired
    private InspectionRuleRepository ruleRepo;
    @Autowired
    private QualityResultRepository resultRepo;

    public InspectionResult inspect(String productId, List<Mat> roiImages,
                                    String productType) {
        // Batch inference over all ROIs
        List<List<DefectDetection>> allDetections;
        try {
            allDetections = detectionEngine.detectBatch(roiImages);
        } catch (Exception e) {
            log.error("Inference engine failure, product {} falls back to manual review", productId, e);
            return InspectionResult.manualReview(productId, "inference service error");
        }

        // Aggregate detections across ROIs
        List<DefectDetection> allDefects = new ArrayList<>();
        for (int i = 0; i < allDetections.size(); i++) {
            List<DefectDetection> roiDefects = allDetections.get(i);
            // Tag each defect with its source ROI so coordinates can be mapped
            // back to the full image later; the loop variable must be copied to
            // an effectively-final local for use inside the lambda
            final int roiIndex = i;
            roiDefects.forEach(d -> d.setRoiIndex(roiIndex));
            allDefects.addAll(roiDefects);
        }

        // Load the verdict rules for this product type and apply them
        List<InspectionRule> rules = ruleRepo.findByProductType(productType);
        JudgementDecision decision = applyRules(allDefects, rules);

        InspectionResult result = InspectionResult.builder()
                .productId(productId)
                .productType(productType)
                .decision(decision.isPass() ? "PASS" : "FAIL")
                .defects(allDefects)
                .failureReasons(decision.getFailureReasons())
                .inspectionTime(Instant.now())
                .build();
        resultRepo.save(result);
        return result;
    }

    private JudgementDecision applyRules(List<DefectDetection> defects,
                                         List<InspectionRule> rules) {
        List<String> failureReasons = new ArrayList<>();
        for (InspectionRule rule : rules) {
            // Count the defects that match this rule
            long count = defects.stream()
                    .filter(d -> d.getDefectType() == rule.getDefectType())
                    .filter(d -> d.getConfidence() >= rule.getMinConfidence())
                    .count();
            if (count > rule.getMaxAllowed()) {
                failureReasons.add(String.format(
                        "defect type [%s]: %d found, exceeds the allowed %d",
                        rule.getDefectType().getDisplayName(), count, rule.getMaxAllowed()));
            }
            // Critical defects: a single occurrence scraps the part
            if (rule.isCritical() && count > 0) {
                failureReasons.add(String.format(
                        "critical defect [%s] found, part rejected outright",
                        rule.getDefectType().getDisplayName()));
            }
        }
        return new JudgementDecision(failureReasons.isEmpty(), failureReasons);
    }
}
```

Few-Shot Learning: Coping with Scarce Defect Data
When a line first goes live, some rare defect types may have only a few dozen samples. We handle those cases with few-shot learning:
```java
@Service
public class FewShotDefectClassifier {

    private static final Logger log = LoggerFactory.getLogger(FewShotDefectClassifier.class);

    @Autowired
    private FeatureExtractor featureExtractor; // pretrained backbone

    private final Map<DefectType, float[]> prototypeVectors = new ConcurrentHashMap<>();

    /**
     * Register the prototype vector for a few-shot defect type:
     * the class center computed over the support-set samples.
     */
    public void registerDefectPrototype(DefectType defectType, List<Mat> supportImages) {
        List<float[]> features = supportImages.stream()
                .map(featureExtractor::extract)
                .collect(Collectors.toList());
        // Class center (prototype) = element-wise mean of the support features
        prototypeVectors.put(defectType, averageVectors(features));
        log.info("Prototype registered for defect type {}, support-set size: {}",
                defectType, supportImages.size());
    }

    /**
     * Prototypical-network classification: compare the query sample
     * against each class prototype and pick the nearest.
     */
    public DefectClassificationResult classify(Mat queryImage) {
        float[] queryFeature = featureExtractor.extract(queryImage);
        DefectType bestMatch = null;
        float minDistance = Float.MAX_VALUE;
        for (Map.Entry<DefectType, float[]> entry : prototypeVectors.entrySet()) {
            float distance = euclideanDistance(queryFeature, entry.getValue());
            if (distance < minDistance) {
                minDistance = distance;
                bestMatch = entry.getKey();
            }
        }
        // Turn the distance into a confidence score (a softmax over all class
        // distances is the fuller treatment; this uses a simple exponential decay)
        float confidence = distanceToConfidence(minDistance);
        return new DefectClassificationResult(bestMatch, confidence, minDistance);
    }

    private float[] averageVectors(List<float[]> vectors) {
        float[] avg = new float[vectors.get(0).length];
        for (float[] v : vectors) {
            for (int i = 0; i < v.length; i++) avg[i] += v[i];
        }
        for (int i = 0; i < avg.length; i++) avg[i] /= vectors.size();
        return avg;
    }

    private float distanceToConfidence(float distance) {
        return (float) Math.exp(-distance);
    }

    private float euclideanDistance(float[] a, float[] b) {
        float sum = 0;
        for (int i = 0; i < a.length; i++) {
            float diff = a[i] - b[i];
            sum += diff * diff;
        }
        return (float) Math.sqrt(sum);
    }
}
```

Real-Time Optimization: Beating the Line Takt
The line takt may be 2-3 parts per second, so each inspection must complete within 300 ms. Three optimizations matter most:
1. Model quantization: FP32 -> INT8 gives a 3-4x inference speedup, typically with under 1% accuracy loss.
2. TensorRT acceleration: converting the ONNX model to TensorRT buys another 2-3x on NVIDIA GPUs.
3. Asynchronous pipelining: capture, preprocessing, inference, and post-processing run in parallel, each stage working on a different frame.
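The arithmetic behind item 1 fits in a few lines. Real quantization happens offline in the model toolchain (which also calibrates activation ranges), but a toy sketch of symmetric per-tensor INT8 weight quantization shows where the "under 1% loss" comes from; the class name and values here are illustrative, not from any toolchain:

```java
public class Int8QuantDemo {

    /** Symmetric per-tensor scale: map [-maxAbs, maxAbs] onto [-127, 127]. */
    static float scaleFor(float[] weights) {
        float maxAbs = 0;
        for (float w : weights) maxAbs = Math.max(maxAbs, Math.abs(w));
        return maxAbs / 127f;
    }

    static byte[] quantize(float[] weights, float scale) {
        byte[] q = new byte[weights.length];
        for (int i = 0; i < weights.length; i++) {
            // Round to the nearest integer step, clamped to the INT8 range
            q[i] = (byte) Math.max(-127, Math.min(127, Math.round(weights[i] / scale)));
        }
        return q;
    }

    static float[] dequantize(byte[] q, float scale) {
        float[] out = new float[q.length];
        for (int i = 0; i < q.length; i++) out[i] = q[i] * scale;
        return out;
    }

    public static void main(String[] args) {
        float[] weights = {0.31f, -1.27f, 0.002f, 0.9f};
        float scale = scaleFor(weights);     // 1.27 / 127, about 0.01
        byte[] q = quantize(weights, scale); // [31, -127, 0, 90]
        System.out.println(java.util.Arrays.toString(dequantize(q, scale)));
    }
}
```

The round-trip error per weight is bounded by half the scale step, which is why a well-calibrated INT8 model loses so little accuracy while its arithmetic becomes far cheaper.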
The asynchronous pipeline (item 3) is wired up as four single-threaded stages connected by bounded queues:

```java
@Component
public class InspectionPipeline {

    // Multi-stage pipeline, one bounded queue per stage boundary
    private final BlockingQueue<RawFrame> captureQueue = new LinkedBlockingQueue<>(10);
    private final BlockingQueue<ProcessedImage> preprocessQueue = new LinkedBlockingQueue<>(10);
    private final BlockingQueue<InferenceTask> inferenceQueue = new LinkedBlockingQueue<>(10);
    private final ExecutorService pipelineExecutor = Executors.newFixedThreadPool(4);

    @PostConstruct
    public void start() {
        // Stage 1: camera capture (single thread, synchronized with the camera SDK)
        pipelineExecutor.submit(this::captureStage);
        // Stage 2: image preprocessing (single thread, CPU-bound)
        pipelineExecutor.submit(this::preprocessStage);
        // Stage 3: AI inference (single thread, owns the GPU)
        pipelineExecutor.submit(this::inferenceStage);
        // Stage 4: post-processing and reporting (single thread)
        pipelineExecutor.submit(this::postprocessStage);
    }

    // Each stage body is a take-from-upstream / put-downstream loop (omitted here)
}
```

Engineering Lessons
After several factory vision-inspection projects, a few of my beliefs were overturned:
I used to think model accuracy mattered most. It turns out lighting stability matters more. The same model that hits 95% accuracy under standard illumination can drop to 70% when the light source ages or the product's reflectivity shifts. Time spent on the lighting design and light-source maintenance pays off better than time spent on the model.
I used to think misses were the biggest risk. It turns out false rejects (scrapping good product) are just as damaging, especially for high-value parts. Tighten the thresholds too far and the false-reject rate climbs, operators start manually releasing every alarm, and the system becomes decoration.
I used to think go-live was the finish line. It turns out the model needs continuous learning. Product specs change, material batches change, and the "normal range" of good parts drifts; without model updates, accuracy quietly decays.
