OpenCV Face detection vs YOLO Face detection

This is 2018, and face detection has become extremely easy thanks to an explosion in computer vision capabilities.  OpenCV has long been a favourite open source library for students and researchers.  Its face detection module is based on Haar cascades, which are quite good at detecting faces.  Let’s see how to use it.

Installing OpenCV

If you haven’t installed it yet, then the easiest way is to run this command in a terminal.

pip install opencv-python

If pip is not available on your system, consider installing Python from these Windows and Linux websites.  Follow this for troubleshooting.

Performing Face detection in OpenCV using Haar Cascades

Create a new file face_det.py and paste the following code:

# OpenCV program to detect faces in real time
# import the OpenCV bindings and the argument parser
import cv2
import argparse

# parse arguments
parser = argparse.ArgumentParser(description='OpenCV Face Detection')
parser.add_argument('--src', action='store', default=0, nargs='?', help='Set video source; default is usb webcam')
parser.add_argument('--w', action='store', default=320, nargs='?', help='Set video width')
parser.add_argument('--h', action='store', default=240, nargs='?', help='Set video height')
args = parser.parse_args()

# load the required trained XML classifier
# https://github.com/Itseez/opencv/blob/master/
# data/haarcascades/haarcascade_frontalface_default.xml
# A trained XML classifier describes the features of the
# object we want to detect. The cascade function is trained
# from many positive (face) and negative (non-face)
# images.
face_cascade = cv2.CascadeClassifier(cv2.data.haarcascades + 'haarcascade_frontalface_default.xml')

# capture frames from a camera
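# a numeric --src selects a camera index; anything else is treated as a file path or URL
src = int(args.src) if str(args.src).isdigit() else args.src
cap = cv2.VideoCapture(src)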

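while 1:
    # read a frame from the camera
    ret, img = cap.read()

    # stop when no frame is returned (camera unavailable or end of video)
    if not ret:
        break

    # apply the requested --w and --h so they take effect
    img = cv2.resize(img, (int(args.w), int(args.h)))

    # convert the frame to grayscale
    gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)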
 
    # Detect faces of different sizes in the input image
    faces = face_cascade.detectMultiScale(gray, scaleFactor=1.3, minNeighbors=5)
    for (x,y,w,h) in faces:
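        # draw a circle around the face (OpenCV expects integer coordinates)
        center = (int(x + w // 2), int(y + h // 2))
        radius = int(max(w, h) // 2)
        cv2.circle(img, center, radius, (255, 255, 0), 2)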
        roi_gray = gray[y:y+h, x:x+w]
        roi_color = img[y:y+h, x:x+w]
 
 
    # Display an image in a window
    cv2.imshow('opencv face detection', img)

    # Wait for Esc key to stop    
    if cv2.waitKey(10) & 0xFF == 27:
        break
 
# Release the capture device
cap.release()

# Close all OpenCV windows
cv2.destroyAllWindows()

Run the program with python face_det.py and it will start detecting faces from the webcam.  Use the --src argument to specify a different video source, such as a video file.  Use --help to print usage details.

How Haar Cascade face detection works

The Haar classifier is a machine learning based approach created by Paul Viola and Michael Jones.  As mentioned before, it is trained on a large number of positive images (with faces) and negative images (without faces).

It starts by extracting Haar features from each image as shown by the windows below:

[Image: face detection Haar features]

Each window is placed on the picture to calculate a single feature. This feature is a single value obtained by subtracting the sum of pixels under the white part of the window from the sum of the pixels under the black part of the window.  All possible sizes of each window are placed on all possible locations of each image to calculate a huge set of features.
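To make the feature calculation concrete, here is a minimal pure-Python sketch of a single two-rectangle feature.  It uses an integral image (summed-area table), the trick Viola and Jones use to get rectangle sums in constant time; the text above does not cover it, so treat it as an added detail.  The function names and the toy 4x4 patch are illustrative assumptions, not OpenCV code.

```python
def integral_image(img):
    """Return a summed-area table with an extra zero row and column."""
    h, w = len(img), len(img[0])
    ii = [[0] * (w + 1) for _ in range(h + 1)]
    for y in range(h):
        row_sum = 0
        for x in range(w):
            row_sum += img[y][x]
            ii[y + 1][x + 1] = ii[y][x + 1] + row_sum
    return ii


def rect_sum(ii, x, y, w, h):
    """Sum of the pixels in the rectangle with top-left (x, y) and size w x h."""
    return ii[y + h][x + w] - ii[y][x + w] - ii[y + h][x] + ii[y][x]


def two_rect_feature(ii, x, y, w, h):
    """Edge feature: black top half stacked on a white bottom half.

    Uses the convention from the text: (sum under black) - (sum under white).
    """
    half = h // 2
    black = rect_sum(ii, x, y, w, half)
    white = rect_sum(ii, x, y + half, w, half)
    return black - white


# Toy 4x4 grayscale patch: a dark "eye" band above a brighter "cheek" band.
patch = [
    [10, 10, 10, 10],
    [10, 10, 10, 10],
    [200, 200, 200, 200],
    [200, 200, 200, 200],
]
ii = integral_image(patch)
feature = two_rect_feature(ii, 0, 0, 4, 4)
print(feature)  # -1520: a large magnitude means strong dark-over-bright contrast
```

On a flat region such as a cheek, the two sums are nearly equal and the feature value is close to zero, which is why such placements carry little information.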

[Image: Haar features applied to a face]

Different stages in visualization. Source: docs.opencv.org

For example, in the image above, we are extracting two features. The first one relies on the property that the region of the eyes is often darker than the area of the nose and cheeks. The second relies on the property that the eyes are darker than the bridge of the nose.

But most of the features calculated this way are irrelevant. For example, when placed on the cheek, the windows tell us nothing, because no part of the cheek is noticeably darker or lighter than the rest; all sectors there look the same.

So we discard the irrelevant features and keep only the relevant ones with a technique called AdaBoost. AdaBoost is a training process that selects only those features that improve the classification (face/non-face) accuracy of our classifier.
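The selection step can be sketched with a textbook discrete AdaBoost over one-feature threshold classifiers ("stumps").  This is a simplified illustration, not the exact Viola-Jones training procedure or OpenCV's implementation; the feature values and labels below are made up so that feature 0 behaves like an eye-band feature and feature 1 like an uninformative cheek feature.

```python
import math


def stump_predict(value, threshold, polarity):
    """Predict +1 (face) or -1 (non-face) from a single feature value."""
    return 1 if polarity * value < polarity * threshold else -1


def best_stump(features, labels, weights):
    """Return (error, feature index, threshold, polarity) with the lowest weighted error."""
    best = None
    for j in range(len(features[0])):
        for threshold in sorted({row[j] for row in features}):
            for polarity in (1, -1):
                error = sum(
                    weight
                    for row, label, weight in zip(features, labels, weights)
                    if stump_predict(row[j], threshold, polarity) != label
                )
                if best is None or error < best[0]:
                    best = (error, j, threshold, polarity)
    return best


def adaboost_select(features, labels, rounds):
    """Return the feature index chosen in each boosting round."""
    n = len(labels)
    weights = [1.0 / n] * n
    chosen = []
    for _ in range(rounds):
        error, j, threshold, polarity = best_stump(features, labels, weights)
        error = min(max(error, 1e-10), 1 - 1e-10)  # keep the log finite
        alpha = 0.5 * math.log((1 - error) / error)
        chosen.append(j)
        # Raise the weight of misclassified samples, lower the weight of the rest.
        for i, (row, label) in enumerate(zip(features, labels)):
            predicted = stump_predict(row[j], threshold, polarity)
            weights[i] *= math.exp(-alpha * label * predicted)
        total = sum(weights)
        weights = [w / total for w in weights]
    return chosen


# Made-up data: column 0 separates faces from non-faces, column 1 is noise.
features = [[-1500, 3], [-1400, -2], [-1600, 1], [-100, 2], [50, -1], [0, 3]]
labels = [1, 1, 1, -1, -1, -1]
chosen = adaboost_select(features, labels, rounds=3)
print(chosen)  # [0, 0, 0]: the noisy feature is never selected
```

In this toy data one feature separates the classes perfectly, so it wins every round.  With real images no single feature does, and the reweighting pushes later rounds toward features that fix earlier mistakes.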

Finally, the algorithm exploits the fact that most of an image is a non-face region. It is therefore better to have a simple, cheap test for whether a window is clearly not a face; if the window fails that test, discard it right away and don’t process it again. The more expensive checks can then focus on the areas where a face might be.  This saves a lot of time and makes the detector usable in real time.
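The early-rejection idea can be sketched as a list of stages that a window must pass in order.  The stage functions, feature names and thresholds here are hypothetical values chosen for illustration; a real cascade learns them during training.

```python
def run_cascade(window_features, stages):
    """Return (is_face, stages_evaluated) for one window.

    Each stage is a (score_function, threshold) pair. The window is rejected
    as soon as one stage scores it below that stage's threshold.
    """
    for index, (score, threshold) in enumerate(stages, start=1):
        if score(window_features) < threshold:
            return False, index  # early rejection: later stages never run
    return True, len(stages)


# Hypothetical stages: a cheap first test, then progressively stricter ones.
stages = [
    (lambda f: f["eye_band_contrast"], 50),
    (lambda f: f["eye_band_contrast"] + f["nose_bridge_contrast"], 150),
    (lambda f: f["eye_band_contrast"] + 2 * f["nose_bridge_contrast"], 300),
]

background = {"eye_band_contrast": 5, "nose_bridge_contrast": 0}
face_like = {"eye_band_contrast": 120, "nose_bridge_contrast": 100}

print(run_cascade(background, stages))  # (False, 1): rejected by the first stage
print(run_cascade(face_like, stages))   # (True, 3): passed every stage
```

Because the vast majority of windows look like `background`, most of them cost only one cheap stage evaluation.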

However, this technique is far from perfect and is known to miss many faces.  With the advent of deep learning, Haar cascades have largely been superseded by neural network detectors.

YOLO Face Detector

You only look once (YOLO) is a state-of-the-art, real-time object detection system. It is based on Deep Learning.  Its authors describe how it works:

Prior detection systems repurpose classifiers or localizers to perform detection. They apply the model to an image at multiple locations and scales. High scoring regions of the image are considered detections.

We use a totally different approach. We apply a single neural network to the full image. This network divides the image into regions and predicts bounding boxes and probabilities for each region. These bounding boxes are weighted by the predicted probabilities.

Our model has several advantages over classifier-based systems. It looks at the whole image at test time so its predictions are informed by global context in the image. It also makes predictions with a single network evaluation unlike systems like R-CNN which require thousands for a single image. This makes it extremely fast, more than 1000x faster than R-CNN and 100x faster than Fast R-CNN. See our paper for more details on the full system.
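As a rough illustration of "divides the image into regions and predicts bounding boxes", the sketch below decodes a grid of per-cell predictions into pixel-space boxes.  It assumes a simplified layout: one box per cell, centre offsets relative to the cell, width and height relative to the whole image, and a single confidence value.  Real YOLO models predict several boxes and class probabilities per cell and apply further post-processing, so this is not the actual network output format.

```python
def decode_grid(predictions, image_w, image_h, min_confidence=0.5):
    """Convert per-cell predictions into (left, top, width, height, confidence).

    predictions[row][col] is (cx, cy, w, h, confidence), where cx and cy are
    offsets inside the cell in [0, 1] and w and h are fractions of the image.
    """
    grid_size = len(predictions)
    boxes = []
    for row in range(grid_size):
        for col in range(grid_size):
            cx, cy, w, h, confidence = predictions[row][col]
            if confidence < min_confidence:
                continue  # low-scoring cells produce no detection
            center_x = (col + cx) / grid_size * image_w
            center_y = (row + cy) / grid_size * image_h
            box_w = w * image_w
            box_h = h * image_h
            boxes.append((
                round(center_x - box_w / 2),
                round(center_y - box_h / 2),
                round(box_w),
                round(box_h),
                confidence,
            ))
    return boxes


# Made-up 2x2 grid for a 320x240 frame: only the top-right cell is confident.
empty = (0.5, 0.5, 0.1, 0.1, 0.1)
predictions = [
    [empty, (0.5, 0.5, 0.25, 0.5, 0.9)],
    [empty, empty],
]
boxes = decode_grid(predictions, image_w=320, image_h=240)
print(boxes)  # [(200, 0, 80, 120, 0.9)]
```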

Deepsight Face SDK

Deepsight Face SDK comes with a bundled YOLO face detector.  Visit the Deepsight SDK home page to learn more. A free version of the SDK is available here: deepsight face free sdk download

Download and install it in order to follow the tutorial.

Install requests for making API calls to Deepsight.

pip install requests

Create a new file yolo_face_det.py and paste the following code:

# Program to detect faces in real time with the Deepsight YOLO detector
# OpenCV handles capture and display; requests talks to the Deepsight service
import cv2
import requests
import argparse

# parse arguments
parser = argparse.ArgumentParser(description='YOLO Face Detection')
parser.add_argument('--src', action='store', default=0, nargs='?', help='Set video source; default is usb webcam')
parser.add_argument('--w', action='store', default=320, nargs='?', help='Set video width')
parser.add_argument('--h', action='store', default=240, nargs='?', help='Set video height')
args = parser.parse_args()

# face detection endpoint (deepsight sdk runs as http service on port 5000)
face_api = "http://127.0.0.1:5000/inferImage?detector=yolo"

# capture frames from a camera
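# a numeric --src selects a camera index; anything else is treated as a file path or URL
src = int(args.src) if str(args.src).isdigit() else args.src
cap = cv2.VideoCapture(src)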

# loop until the stream ends or Esc is pressed
while 1:

    # read a frame from the camera
    ret, img = cap.read()

    # stop when no frame is returned (camera unavailable or end of video)
    if not ret:
        break

    img = cv2.resize(img, (int(args.w), int(args.h)))
    r, imgbuf = cv2.imencode(".bmp", img)    
    image = {'pic':bytearray(imgbuf)}
     
    r = requests.post(face_api, files=image)
    result = r.json()   
     
    if len(result) > 1:
        faces = result[:-1]
        for face in faces:
            rect = face['faceRectangle']
            # cast to int because OpenCV drawing functions expect integer coordinates
            x, y, w, h = [int(rect[k]) for k in ['left', 'top', 'width', 'height']]
            confidence = rect['confidence']
            # discard if confidence is too low
            if confidence < 0.6:
                continue
                
            cv2.rectangle(img, (x,y), (x+w,y+h), (255,0,255),4,8) 

    
    cv2.imshow('YOLO Face detection',img)
 
    # Wait for Esc key to stop
    if cv2.waitKey(1) & 0xff == 27:
        break
 
# Release the capture device
cap.release()

# Close all OpenCV windows
cv2.destroyAllWindows()

Run the program with python yolo_face_det.py and it will start detecting faces from the webcam.  Use the --src argument to specify a different video source, such as a video file.  Use --help to print usage details.

You should immediately notice that YOLO is much more robust.  Vary the confidence threshold (0.6 in the script) to tune the sensitivity.  The video below compares the OpenCV and YOLO detectors.

Circles = OpenCV Haar Cascade Face Detector
Rectangle = Deepsight YOLO Face Detector
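To experiment with the confidence threshold without a camera, the filtering step can be pulled out into a small function.  This sketch assumes the response shape that yolo_face_det.py relies on (a list whose last element is not a face, with each face carrying a faceRectangle dict); the sample response is made up for illustration and is not real Deepsight output.

```python
def confident_faces(result, min_confidence=0.6):
    """Return (x, y, w, h, confidence) tuples for faces at or above the threshold.

    Assumes the last element of `result` is not a face, as in the script.
    """
    faces = []
    for face in result[:-1]:
        rect = face["faceRectangle"]
        if rect["confidence"] < min_confidence:
            continue  # discard low-confidence detections
        faces.append((
            rect["left"], rect["top"], rect["width"], rect["height"],
            rect["confidence"],
        ))
    return faces


# Made-up response: two detections plus a trailing non-face element
# (its real contents are not known here, so it is left empty).
sample = [
    {"faceRectangle": {"left": 40, "top": 30, "width": 80, "height": 80, "confidence": 0.9}},
    {"faceRectangle": {"left": 200, "top": 50, "width": 60, "height": 60, "confidence": 0.4}},
    {},
]

print(len(confident_faces(sample)))       # 1: only the 0.9 detection survives
print(len(confident_faces(sample, 0.3)))  # 2: a lower threshold keeps both
```

Lowering the threshold finds more faces at the cost of more false positives; raising it does the opposite.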

Deepsight also offers mmod hog and yolohd detectors in addition to yolo.  Change the detector=yolo value in the face_api URL in the source code above to try them out.
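One way to make switching detectors less error-prone is to build the endpoint URL from a parameter instead of editing the string by hand.  The helper name face_api_url is a hypothetical convenience; the host, port, path and detector query key are taken from the script above.

```python
from urllib.parse import urlencode


def face_api_url(detector="yolo", host="127.0.0.1", port=5000):
    """Build the Deepsight inference URL for the given detector name."""
    query = urlencode({"detector": detector})
    return f"http://{host}:{port}/inferImage?{query}"


print(face_api_url())          # http://127.0.0.1:5000/inferImage?detector=yolo
print(face_api_url("yolohd"))  # http://127.0.0.1:5000/inferImage?detector=yolohd
```

In yolo_face_det.py this could replace the hard-coded face_api string, with the detector name exposed as an extra command line argument.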
