Face Detection, Object Detection, QR Code Recognition, and Image Edge Detection with OpenCV on the VisionFive 2 (昉·星光 2)

This article was originally posted by EEWorld user lijinlei.

It describes in detail how to connect a USB camera to the VisionFive 2 RISC-V single-board computer and use OpenCV models to implement face detection, object detection, QR code recognition, and image edge detection, providing a reference for AI and edge-AI application development on this device.

1. Preparation

After completing the package-update steps described in the previous post, enter the following in a terminal to verify that OpenCV can be imported:

python3
import cv2

If an error appears, such as:

ImportError: libtiff.so.5: cannot open shared object file: No such file or directory

this means the libtiff shared library (libtiff.so.5) that OpenCV (cv2) depends on is missing from the system or not installed correctly.

2. Resolving the libtiff dependency

First run sudo find / -name "libtiff*.so*" to check whether a libtiff library is already installed.

If the search returns matching libraries:

/usr/lib/riscv64-linux-gnu/evince/4/backends/libtiffdocument.so
/usr/lib/riscv64-linux-gnu/libtiffxx.so
/usr/lib/riscv64-linux-gnu/libtiff.so.6.0.0
/usr/lib/riscv64-linux-gnu/libtiffxx.so.6.0.0
/usr/lib/riscv64-linux-gnu/libtiffxx.so.6
/usr/lib/riscv64-linux-gnu/libtiff.so
/usr/lib/riscv64-linux-gnu/libtiff.so.6

then libtiff.so is already present, just under a different version (here libtiff.so.6), and a symbolic link can satisfy the missing name:
sudo ln -s /usr/lib/riscv64-linux-gnu/libtiff.so.6 /usr/lib/riscv64-linux-gnu/libtiff.so.5

If no matching library is found, install the latest shared library with sudo apt install libtiff-dev.
After that, import cv2 runs without errors.
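
As a quick sanity check, the installed OpenCV version can be printed straight from the shell (the exact version depends on the system image):

python3 -c "import cv2; print(cv2.__version__)"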

3. Hardware connection

Connect a camera to the development board; either a CSI camera or a USB camera can be used.


Run v4l2-ctl --list-devices to list the connected capture devices and their index numbers.

The Logitech C270 used here is a standard UVC device, and its index is /dev/video4.
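
If v4l2-ctl is not installed, a short Python probe can also locate usable capture nodes (a sketch; which indices respond varies from board to board):

import cv2

# Try the first few V4L2 device indices and report which ones deliver
# frames; on this setup the Logitech C270 appears at /dev/video4.
for idx in range(8):
    cap = cv2.VideoCapture(idx, cv2.CAP_V4L2)
    if cap.isOpened():
        ok, frame = cap.read()
        if ok:
            print(f"/dev/video{idx}: {frame.shape[1]}x{frame.shape[0]}")
        cap.release()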

4. Face detection

This section uses an OpenCV model to detect faces and draw bounding boxes around them; a confidence readout is added further below.

Code

Create a file named face_detection.py with the following code:

import cv2

# Load the pre-trained Haar cascade shipped with the opencv package.
face_cascade = cv2.CascadeClassifier("/usr/share/opencv4/haarcascades/haarcascade_frontalface_default.xml")

# Open the USB camera at /dev/video4 via the V4L2 backend.
cap = cv2.VideoCapture(4, cv2.CAP_V4L2)

while True:
    ret, frame = cap.read()
    if not ret:
        break

    # Haar cascades operate on grayscale images.
    gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)

    faces = face_cascade.detectMultiScale(
        gray,
        scaleFactor=1.1,   # image pyramid step between scales
        minNeighbors=5,    # higher = fewer, more reliable detections
        minSize=(30, 30)   # ignore faces smaller than 30x30 px
    )

    # Draw a green rectangle around each detected face.
    for (x, y, w, h) in faces:
        cv2.rectangle(frame, (x, y), (x+w, y+h), (0, 255, 0), 2)

    cv2.imshow('Face Detection', frame)

    # Press 'q' to quit.
    if cv2.waitKey(1) & 0xFF == ord('q'):
        break

cap.release()
cv2.destroyAllWindows()

Save the file and run python3 face_detection.py; a window with the live camera feed opens automatically.
The window shows the detection results in real time, with green boxes marking the face positions.
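
Full-resolution Haar scanning can be slow on the VisionFive 2's CPU. One common speed-up, not part of the original demo, is to detect on a downscaled copy and map the boxes back up; a single-frame sketch under the same cascade path and camera index:

import cv2

face_cascade = cv2.CascadeClassifier(
    "/usr/share/opencv4/haarcascades/haarcascade_frontalface_default.xml")
cap = cv2.VideoCapture(4, cv2.CAP_V4L2)
ok, frame = cap.read()
if ok:
    # Detect on a half-size copy: 4x fewer pixels to scan, at the cost
    # of missing very small faces.
    small = cv2.resize(frame, None, fx=0.5, fy=0.5)
    gray = cv2.cvtColor(small, cv2.COLOR_BGR2GRAY)
    faces = face_cascade.detectMultiScale(gray, scaleFactor=1.1, minNeighbors=5)
    for (x, y, w, h) in faces:
        # Multiply coordinates by 2 to map boxes back to full resolution.
        cv2.rectangle(frame, (2*x, 2*y), (2*(x+w), 2*(y+h)), (0, 255, 0), 2)
    cv2.imwrite("detected.png", frame)
cap.release()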

Result

Point the camera at a photo of a person and faces are detected and boxed automatically:


Test results on more photos of people:


Reference: VisionFive 2 and OpenCV.

Displaying the confidence

To display the detection accuracy (confidence) alongside face detection, a deep-learning-based detector is required (such as OpenCV's DNN module or FaceNet); the traditional Haar cascade classifier does not directly provide confidence scores.
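
Strictly speaking, the Haar API can expose raw stage scores through detectMultiScale3 with outputRejectLevels=True, though these are uncalibrated weights rather than probabilities; a sketch, assuming a local test image named photo.jpg:

import cv2

face_cascade = cv2.CascadeClassifier(
    "/usr/share/opencv4/haarcascades/haarcascade_frontalface_default.xml")
img = cv2.imread("photo.jpg")  # hypothetical test image
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
# levelWeights are raw classifier scores: larger generally means more
# confident, but they are not bounded to [0, 1].
faces, reject_levels, level_weights = face_cascade.detectMultiScale3(
    gray, scaleFactor=1.1, minNeighbors=5, outputRejectLevels=True)
for (x, y, w, h), weight in zip(faces, level_weights):
    print(f"face at ({x},{y},{w},{h}): raw score {float(weight):.2f}")

For calibrated-looking percentages, the DNN approach below is the more practical choice.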

Download the pre-trained model files (deploy.prototxt and res10_300x300_ssd_iter_140000.caffemodel).

If GitHub is hard to reach, changing https://github.com to https://bgithub.xyz can speed up the download.
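
The two files can also be fetched from Python. The URLs below are the commonly cited locations in the opencv and opencv_3rdparty repositories, an assumption worth verifying (the mirror substitution above applies to them as well):

import urllib.request

# Assumed download locations; verify against the OpenCV repositories.
FILES = {
    "deploy.prototxt":
        "https://raw.githubusercontent.com/opencv/opencv/master/samples/dnn/face_detector/deploy.prototxt",
    "res10_300x300_ssd_iter_140000.caffemodel":
        "https://github.com/opencv/opencv_3rdparty/raw/dnn_samples_face_detector_20170830/res10_300x300_ssd_iter_140000.caffemodel",
}
for name, url in FILES.items():
    urllib.request.urlretrieve(url, name)
    print("downloaded", name)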

Code

Create a folder holding the pre-trained model files and a face_detection.py script with the following code:

import cv2
import numpy as np

# Load the ResNet-10 SSD face detector (Caffe format).
net = cv2.dnn.readNetFromCaffe("deploy.prototxt", "res10_300x300_ssd_iter_140000.caffemodel")

cap = cv2.VideoCapture(4, cv2.CAP_V4L2)

while True:
    ret, frame = cap.read()
    if not ret:
        break

    (h, w) = frame.shape[:2]
    # The model expects 300x300 input with its training-set mean subtracted.
    blob = cv2.dnn.blobFromImage(cv2.resize(frame, (300, 300)), 1.0,
                                 (300, 300), (104.0, 177.0, 123.0))
    net.setInput(blob)
    detections = net.forward()

    for i in range(detections.shape[2]):
        confidence = detections[0, 0, i, 2]

        # Keep detections scoring above 50%.
        if confidence > 0.5:
            # Box coordinates are normalized; scale them to the frame size.
            box = detections[0, 0, i, 3:7] * np.array([w, h, w, h])
            (startX, startY, endX, endY) = box.astype("int")

            cv2.rectangle(frame, (startX, startY), (endX, endY), (0, 255, 0), 2)
            text = f"Face: {confidence * 100:.2f}%"
            cv2.putText(frame, text, (startX, startY - 10),
                        cv2.FONT_HERSHEY_SIMPLEX, 0.5, (0, 255, 0), 2)

    cv2.imshow("Face Detection with Confidence", frame)
    if cv2.waitKey(1) & 0xFF == ord('q'):
        break

cap.release()
cv2.destroyAllWindows()

Run python3 face_detection.py in the terminal; the script loads the pre-trained model files.

Result

Testing again with face photos:

The live video window shows the detection results: green boxes mark the face positions, each labeled with its confidence.

5. Object detection

Run the Python demo:

Enter the yolo-v5 third-party demo directory to run the YOLOv5 ONNX model demo on the VisionFive 2's Debian system:

cd /usr/share/opencv4/yolo-v5/
python3 yolov5.py --device 4

/dev/video4 is the USB camera's index number; run ls /dev/video* to list the available capture device indices.

Code

Open yolov5.py:

import cv2
import numpy as np
import argparse

# Constants.
INPUT_WIDTH = 640
INPUT_HEIGHT = 640
SCORE_THRESHOLD = 0.5
NMS_THRESHOLD = 0.45
CONFIDENCE_THRESHOLD = 0.45

# Text parameters.
FONT_FACE = cv2.FONT_HERSHEY_SIMPLEX
FONT_SCALE = 0.7
THICKNESS = 1

# Colors
BLACK  = (0,0,0)
BLUE   = (255,178,50)
YELLOW = (0,255,255)
RED = (0,0,255)


def draw_label(input_image, label, left, top):
    """Draw text onto image at location."""
    # Get text size.
    text_size = cv2.getTextSize(label, FONT_FACE, FONT_SCALE, THICKNESS)
    dim, baseline = text_size[0], text_size[1]
    # Use text size to create a BLACK rectangle.
    cv2.rectangle(input_image, (left, top), (left + dim[0], top + dim[1] + baseline), BLACK, cv2.FILLED)
    # Display text inside the rectangle.
    cv2.putText(input_image, label, (left, top + dim[1]), FONT_FACE, FONT_SCALE, YELLOW, THICKNESS, cv2.LINE_AA)


def pre_process(input_image, net):
    # Create a 4D blob from a frame.
    blob = cv2.dnn.blobFromImage(input_image, 1/255, (INPUT_WIDTH, INPUT_HEIGHT), [0,0,0], 1, crop=False)

    # Sets the input to the network.
    net.setInput(blob)

    # Runs the forward pass to get output of the output layers.
    output_layers = net.getUnconnectedOutLayersNames()
    outputs = net.forward(output_layers)
    # print(outputs[0].shape)

    return outputs


def post_process(input_image, outputs):
    # Lists to hold respective values while unwrapping.
    class_ids = []
    confidences = []
    boxes = []

    # Rows.
    rows = outputs[0].shape[1]

    image_height, image_width = input_image.shape[:2]

    # Resizing factor.
    x_factor = image_width / INPUT_WIDTH
    y_factor =  image_height / INPUT_HEIGHT

    # Iterate through 25200 detections.
    for r in range(rows):
        row = outputs[0][0][r]
        confidence = row[4]

        # Discard bad detections and continue.
        if confidence >= CONFIDENCE_THRESHOLD:
            classes_scores = row[5:]

            # Get the index of max class score.
            class_id = np.argmax(classes_scores)

            #  Continue if the class score is above threshold.
            if (classes_scores[class_id] > SCORE_THRESHOLD):
                confidences.append(confidence)
                class_ids.append(class_id)

                cx, cy, w, h = row[0], row[1], row[2], row[3]

                left = int((cx - w/2) * x_factor)
                top = int((cy - h/2) * y_factor)
                width = int(w * x_factor)
                height = int(h * y_factor)

                box = np.array([left, top, width, height])
                boxes.append(box)

    # Perform non maximum suppression to eliminate redundant overlapping boxes with
    # lower confidences.
    indices = cv2.dnn.NMSBoxes(boxes, confidences, CONFIDENCE_THRESHOLD, NMS_THRESHOLD)
    for i in indices:
        box = boxes[i]
        left = box[0]
        top = box[1]
        width = box[2]
        height = box[3]
        cv2.rectangle(input_image, (left, top), (left + width, top + height), BLUE, 3*THICKNESS)
        label = "{}:{:.2f}".format(classes[class_ids<i>], confidences<i>)
        draw_label(input_image, label, left, top)

    return input_image


if __name__ == '__main__':
    parser = argparse.ArgumentParser()
    parser.add_argument('--device', type=int, default=4, help='camera device number, 1 for CSI Camera with ISP, 4 for UVC in common case.')
    args = parser.parse_args()

    # open selected camera using selected API
    deviceID = args.device
    apiID = cv2.CAP_V4L2
    cap = cv2.VideoCapture()
    cap.open(deviceID, apiPreference=apiID)

    if deviceID == 1:
        cap.set(cv2.CAP_PROP_CONVERT_RGB, 0)


    # Load class names.
    classesFile = "coco.names"
    classes = None
    with open(classesFile, 'rt') as f:
        classes = f.read().rstrip('\n').split('\n')

    # Give the weight files to the model and load the network using them.
    modelWeights = "models/yolov5n.onnx"
    net = cv2.dnn.readNet(modelWeights)
    net.setPreferableBackend(cv2.dnn.DNN_BACKEND_OPENCV)
    net.setPreferableTarget(cv2.dnn.DNN_TARGET_OPENCL)
    while(cv2.waitKey(1) < 0):
        # Process image.
        ret, frame = cap.read()
        if deviceID == 1:
            frame = cv2.cvtColor(frame, cv2.COLOR_YUV2BGR_NV12)
        detections = pre_process(frame, net)
        img = post_process(frame.copy(), detections)

        # Put efficiency information. The function getPerfProfile returns the overall time for inference(t) and the timings for each of the layers(in layersTimes)
        t, _ = net.getPerfProfile()
        label = 'Inference time: %.2f ms' % (t * 1000.0 / cv2.getTickFrequency())
        cv2.putText(img, label, (20, 40), FONT_FACE, FONT_SCALE, RED, THICKNESS, cv2.LINE_AA)

        cv2.imshow('Output', img)
    cap.release()
    cv2.destroyAllWindows()

Run python3 yolov5.py; a window shows the live camera feed with detections in real time.

Result

When a recognizable object enters the frame, a blue box and label appear; detection runs reasonably fast.


References: Object detection with the YOLOv5 ONNX model; Object Detection using YOLOv5 OpenCV DNN in C++ and Python.

6. QR code recognition

Run the Python demo:

Enter the Python examples directory to run the QR code detection and decoding demo on the VisionFive 2's Debian system:

cd /usr/share/doc/opencv-doc/examples/python/
python3 qrcode.py --device=4

Code

Open qrcode.py:

#!/usr/bin/env python

'''
This program detects the QR-codes using OpenCV Library.

Usage:
   qrcode.py
'''


# Python 2/3 compatibility
from __future__ import print_function

import numpy as np
import cv2 as cv

import argparse
import sys
import webbrowser
import re
PY3 = sys.version_info[0] == 3
if PY3:
    xrange = range

class QrSample:
    def __init__(self, args):
        self.fname = ''
        self.fext = ''
        self.fsaveid = 0
        self.input = args.input
        self.detect = args.detect
        self.out = args.out
        self.multi = args.multi
        self.saveDetections = args.save_detections
        self.saveAll = args.save_all

    def getQRModeString(self):
        msg1 = "multi " if self.multi else ""
        msg2 = "detector" if self.detect else "decoder"
        msg = "QR {:s}{:s}".format(msg1, msg2)
        return msg

    def drawFPS(self, result, fps):
        message = '{:.2f} FPS({:s})'.format(fps, self.getQRModeString())
        cv.putText(result, message, (20, 20), 1,
                   cv.FONT_HERSHEY_DUPLEX, (0, 0, 255))

    def drawQRCodeContours(self, image, cnt):
        if cnt.size != 0:
            rows, cols, _ = image.shape
            show_radius = 2.813 * ((rows / cols) if rows > cols else (cols / rows))
            contour_radius = show_radius * 0.4
            cv.drawContours(image, [cnt], 0, (0, 255, 0), int(round(contour_radius)))
            tpl = cnt.reshape((-1, 2))
            for x in tuple(tpl.tolist()):
                color = (255, 0, 0)
                cv.circle(image, tuple(x), int(round(contour_radius)), color, -1)

    def drawQRCodeResults(self, result, points, decode_info, fps):
        if (not hasattr(self, 'saved_url')):
            self.saved_url = ''

        n = len(points)
        if isinstance(decode_info, str):
            decode_info = [decode_info]
        if n > 0:
            for i in range(n):
                cnt = np.array(points[i]).reshape((-1, 1, 2)).astype(np.int32)
                self.drawQRCodeContours(result, cnt)
                msg = 'QR[{:d}]@{} : '.format(i, *(cnt.reshape(1, -1).tolist()))
                print(msg, end="")
                if len(decode_info) > i:
                    if decode_info[i]:
                        print("'", decode_info[i], "'")
                        match = re.findall(r'https?://(?:[-\w.]|(?:%[\da-fA-F]{2}))+', decode_info[i])
                        if match and decode_info[i] != self.saved_url:
                            print("Open url:", match)
                            webbrowser.open_new(decode_info[i])
                            self.saved_url = decode_info[i]
                    else:
                        print("Can't decode QR code")
                else:
                    print("Decode information is not available (disabled)")
        else:
            print("QRCode not detected!")
        self.drawFPS(result, fps)

    def runQR(self, qrCode, inputimg):
        if not self.multi:
            if not self.detect:
                decode_info, points, _ = qrCode.detectAndDecode(inputimg)
                dec_info = decode_info
            else:
                _, points = qrCode.detect(inputimg)
                dec_info = []
        else:
            if not self.detect:
                _, decode_info, points, _ = qrCode.detectAndDecodeMulti(
                    inputimg)
                dec_info = decode_info
            else:
                _, points = qrCode.detectMulti(inputimg)
                dec_info = []
        if points is None:
            points = []
        return points, dec_info

    def DetectQRFrmImage(self, inputfile):
        inputimg = cv.imread(inputfile, cv.IMREAD_COLOR)
        if inputimg is None:
            print('ERROR: Can not read image: {}'.format(inputfile))
            return
        print('Run {:s} on image [{:d}x{:d}]'.format(
            self.getQRModeString(), inputimg.shape[1], inputimg.shape[0]))
        qrCode = cv.QRCodeDetector()
        count = 10
        timer = cv.TickMeter()
        for _ in range(count):
            timer.start()
            points, decode_info = self.runQR(qrCode, inputimg)
            timer.stop()
        fps = count / timer.getTimeSec()
        print('FPS: {}'.format(fps))
        result = inputimg
        self.drawQRCodeResults(result, points, decode_info, fps)
        cv.imshow("QR", result)
        cv.waitKey(1)
        if self.out != '':
            outfile = self.fname + self.fext
            print("Saving Result: {}".format(outfile))
            cv.imwrite(outfile, result)

        print("Press any key to exit ...")
        cv.waitKey(0)
        print("Exit")

    def processQRCodeDetection(self, qrcode, frame):
        if len(frame.shape) == 2:
            result = cv.cvtColor(frame, cv.COLOR_GRAY2BGR)
        else:
            result = frame
        print('Run {:s} on video frame [{:d}x{:d}]'.format(
            self.getQRModeString(), frame.shape[1], frame.shape[0]))
        timer = cv.TickMeter()
        timer.start()
        points, decode_info = self.runQR(qrcode, frame)
        timer.stop()

        fps = 1 / timer.getTimeSec()
        self.drawQRCodeResults(result, points, decode_info, fps)
        return fps, result, points

    def DetectQRFrmCamera(self, device):
        cap = cv.VideoCapture(device, cv.CAP_V4L2)
        # FIXME: Auto convert property failed in CSI camera(NV12 format).
        if device == 1:
            cap.set(cv.CAP_PROP_CONVERT_RGB, 0)
        # Set UVC in 640*480 format as default
        cap.set(cv.CAP_PROP_FRAME_WIDTH, 640)
        cap.set(cv.CAP_PROP_FRAME_HEIGHT, 480)
        if not cap.isOpened():
            print("Cannot open the camera")
            return
        print("Press 'm' to switch between detectAndDecode and detectAndDecodeMulti")
        print("Press 'd' to switch between decoder and detector")
        print("Press ' ' (space) to save result into images")
        print("Press 'ESC' to exit")

        qrcode = cv.QRCodeDetector()
        while True:
            if device == 1:
                ret, frame_rgb24 = cap.read()
                frame = cv.cvtColor(frame_rgb24, cv.COLOR_YUV2BGR_NV12)
            else:
                ret, frame = cap.read()
            if not ret:
                print("End of video stream")
                break
            forcesave = self.saveAll
            result = frame
            try:
                fps, result, corners = self.processQRCodeDetection(qrcode, frame)
                print('FPS: {:.2f}'.format(fps))
                forcesave |= self.saveDetections and (len(corners) != 0)
            except cv.error as e:
                print("Error exception: ", e)
                forcesave = True
            cv.imshow("QR code", result)
            code = cv.waitKey(1)
            if code < 0 and (not forcesave):
                continue
            if code == ord(' ') or forcesave:
                fsuffix = '-{:05d}'.format(self.fsaveid)
                self.fsaveid += 1
                fname_in = self.fname + fsuffix + "_input.png"
                print("Saving QR code detection result: '{}' ...".format(fname_in))
                cv.imwrite(fname_in, frame)
                print("Saved")
            if code == ord('m'):
                self.multi = not self.multi
                msg = 'Switching QR code mode ==> {:s}'.format(
                    "detectAndDecodeMulti" if self.multi else "detectAndDecode")
                print(msg)
            if code == ord('d'):
                self.detect = not self.detect
                msg = 'Switching QR code mode ==> {:s}'.format(
                    "detect" if self.detect else "decode")
                print(msg)
            if code == 27:
                print("'ESC' is pressed. Exiting...")
                break
        print("Exit.")


def main():
    parser = argparse.ArgumentParser(
        description='This program detects the QR-codes input images using OpenCV Library.')
    parser.add_argument(
        '-i',
        '--input',
        help="input image path (for example, 'opencv_extra/testdata/cv/qrcode/multiple/*_qrcodes.png)",
        default="",
        metavar="")
    parser.add_argument(
        '-D',
        '--device',
        help='camera device number, 1 for CSI Camera with ISP, 4 for UVC in common case.',
        type=int,
        default=4)
    parser.add_argument(
        '-d',
        '--detect',
        help="detect QR code only (skip decoding) (default: False)",
        action='store_true')
    parser.add_argument(
        '-m',
        '--multi',
        help="enable multiple qr-codes detection",
        action='store_true')
    parser.add_argument(
        '-o',
        '--out',
        help="path to result file (default: qr_code.png)",
        default="qr_code.png",
        metavar="")
    parser.add_argument(
        '--save_detections',
        help="save all QR detections (video mode only)",
        action='store_true')
    parser.add_argument(
        '--save_all',
        help="save all processed frames (video mode only)",
        action='store_true')
    args = parser.parse_args()
    qrinst = QrSample(args)
    if args.out != '':
        index = args.out.rfind('.')
        if index != -1:
            qrinst.fname = args.out[:index]
            qrinst.fext = args.out[index:]
        else:
            qrinst.fname = args.out
            qrinst.fext = ".png"
    if args.input != '':
        qrinst.DetectQRFrmImage(args.input)
    else:
        qrinst.DetectQRFrmCamera(args.device)


if __name__ == '__main__':
    print(__doc__)
    main()
    cv.destroyAllWindows()

Run python3 qrcode.py; a window shows the live camera feed.

Result

When a QR code is detected, its decoded value is printed to the terminal; recognition is fast.
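
The sample above wraps a lot of interface logic around a small core: QR decoding is essentially one detectAndDecode call per frame. A stripped-down sketch using the same /dev/video4 index:

import cv2

cap = cv2.VideoCapture(4, cv2.CAP_V4L2)
qr = cv2.QRCodeDetector()
while True:
    ok, frame = cap.read()
    if not ok:
        break
    # detectAndDecode returns (decoded_text, corner_points, rectified_qr).
    text, points, _ = qr.detectAndDecode(frame)
    if text:
        print("QR payload:", text)
    cv2.imshow("QR", frame)
    if cv2.waitKey(1) & 0xFF == ord('q'):
        break
cap.release()
cv2.destroyAllWindows()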


See: QR code detection and decoding.

7. Image edge detection

Run the Python demo:

Enter the Python examples directory to run the image edge detection demo on the VisionFive 2's Debian system:

cd /usr/share/doc/opencv-doc/examples/python/
python3 edge.py 4

The demo draws the detected edges over the live image.

Code

Open edge.py; its code is as follows:

#!/usr/bin/env python


'''
This sample demonstrates Canny edge detection.

Usage:
  edge.py [<video source>]

  Trackbars control edge thresholds.
'''



# Python 2/3 compatibility
from __future__ import print_function

import cv2 as cv
import numpy as np

# relative module
import video

# built-in module
import sys


def main():
    try:
        fn = sys.argv[1]
        cam = int(fn)
    except:
        cam = 0  # default to /dev/video0 when no camera index is given

    def nothing(*arg):
        pass

    cv.namedWindow('edge', 0)
    cv.resizeWindow('edge', 640, 480)
    cv.createTrackbar('thrs1', 'edge', 2000, 5000, nothing)
    cv.createTrackbar('thrs2', 'edge', 4000, 5000, nothing)

    cap = cv.VideoCapture(cam, cv.CAP_V4L2)
    if cam == 1:
        cap.set(cv.CAP_PROP_CONVERT_RGB, 0)

    while True:
        if cam == 1:
            ret, img_origin = cap.read()
            img = cv.cvtColor(img_origin, cv.COLOR_YUV2BGR_NV12)
        else:
            ret, img = cap.read()
        gray = cv.cvtColor(img, cv.COLOR_BGR2GRAY)
        thrs1 = cv.getTrackbarPos('thrs1', 'edge')
        thrs2 = cv.getTrackbarPos('thrs2', 'edge')
        edge = cv.Canny(gray, thrs1, thrs2, apertureSize=5)
        vis = img.copy()
        vis = np.uint8(vis/2.)
        vis[edge != 0] = (0, 255, 0)
        cv.imshow('edge', vis)
        ch = cv.waitKey(5)
        if ch == 27:
            break

    print('Done')


if __name__ == '__main__':
    print(__doc__)
    main()
    cv.destroyAllWindows()

Run python3 edge.py 4 in the terminal (4 is the camera index; without it the script falls back to /dev/video0); a window with the live feed opens automatically.

Result

The window shows real-time edge detection of the camera feed; processing is fast, with almost no perceptible latency.
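
The trackbar values in the thousands make sense because the demo calls cv.Canny with apertureSize=5: the larger Sobel kernel produces much bigger gradient magnitudes than the default 3x3 kernel, so the hysteresis thresholds must scale up too. A minimal still-image version of the same pipeline, assuming a local test image named test.jpg:

import cv2

img = cv2.imread("test.jpg")  # hypothetical test image
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
# Same defaults as the demo's trackbars: low threshold 2000, high 4000.
edges = cv2.Canny(gray, 2000, 4000, apertureSize=5)
cv2.imwrite("edges.png", edges)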


Reference: Detecting image edges.
