Azmath Moosa

Performing Realtime Face Detection & Analysis on Surveillance Videos

February 13, 2018


Face recognition technology has become increasingly accessible thanks to recent strides in machine learning and AI, so the time is ripe to build solutions around it and capitalise on this emerging market. Face analytics now appears in many applications, from social media to security systems. Many estimates indicate steady annual growth of 13% in the net market worth of face recognition technologies, which is expected to reach at least 7.7 billion USD by 2022 [ref].

If you have entered this market and are planning to develop a custom solution for a client, you may find our article comparing existing face recognition SDKs and APIs useful. These can be integrated into your application to do the heavy lifting.

This article demonstrates how to use the Deepsight SDK for real-time face detection and demographic analysis on surveillance videos. The video below demonstrates our objective. Installing Deepsight is easy and takes only a couple of minutes.

How to

  1. First download and install Deepsight SDK by following the getting started guide.
  2. You will need OpenCV for the tutorial.  Follow the guide here for Ubuntu or Windows to install it.
  3. Copy and paste the following Python code into a new file:

    import cv2
    import requests
    import numpy as np
    import json
    import argparse
    import signal
    import logging
    import datetime, time
    import os

    face_api = "http://localhost:5000/inferImage?returnFaceAttributes=true"
    compare_api = "http://localhost:5000/compareFaces"

    # parse arguments
    parser = argparse.ArgumentParser(description='Realtime Face Analytics')
    parser.add_argument('--src', action='store', default=0, nargs='?', help='Set video source; default is usb webcam')
    parser.add_argument('--w', action='store', default=320, nargs='?', help='Set video width')
    parser.add_argument('--h', action='store', default=240, nargs='?', help='Set video height')
    args = parser.parse_args()

    inp_w = int(args.w)
    inp_h = int(args.h)

    # start the camera
    cap = cv2.VideoCapture(args.src)
    cap.set(cv2.CAP_PROP_FRAME_WIDTH, float(args.w))
    cap.set(cv2.CAP_PROP_FRAME_HEIGHT, float(args.h))
    ret, frame = cap.read()

    # To record video uncomment these lines
    #fourcc = cv2.VideoWriter_fourcc(*'XVID') # Define the codec and create VideoWriter object
    #out = cv2.VideoWriter(os.path.basename(args.src)[:-4]+"_out.avi",fourcc, 25.0, (inp_w,inp_h))

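    # catch the exit signal (Ctrl+C) and release the camera before exiting
    def signal_handler(signum, stack_frame):
        cap.release()
        cv2.destroyAllWindows()
        raise SystemExit(0)
    signal.signal(signal.SIGINT, signal_handler)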

    male_icon = cv2.imread("m.png", -1)
    female_icon = cv2.imread("f.png", -1)

    # start processing
    count = 0
    while True:
        ok, framex = cap.read()
        if not ok:
            break  # end of the video, or the camera could not be read
        key = cv2.waitKey(1) & 0xFF

        count += 1
        # To process only every third frame, uncomment these lines
        # if count % 3 != 0:
        #     continue

        frame = cv2.resize(framex, (inp_w, inp_h))

        # encode the frame and send it to the Deepsight server
        _, imgbuf = cv2.imencode(".bmp", frame)
        image = {'pic': bytearray(imgbuf)}

        start_time = time.time()
        r = requests.post(face_api, files=image, timeout=10)
        result = r.json()
        t = time.time() - start_time
        print("---%0.3fs %0.3fFPS ---" % (t, 1/t))

        # the last element holds diagnostics; the ones before it are faces
        if len(result) > 1:
            faces = result[:-1]
            diag = result[-1]['diagnostics']

            for face in faces:
                rect, gender, age = [face[i] for i in ['faceRectangle', 'gender', 'age']]
                x, y, w, h, confidence = [rect[i] for i in ['left', 'top', 'width', 'height', 'confidence']]

                # skip low-confidence detections
                if confidence < 0.4:
                    continue

                # filled label bars along the top and bottom of the face box
                cv2.rectangle(frame, (x, y-10), (x+w, y), (0, 255, 255), -1, 8)
                cv2.rectangle(frame, (x, y+h-10), (x+w, y+h), (0, 255, 255), -1, 8)

                # age on the top bar; confidence and gender initial on the bottom bar
                cv2.putText(frame, "%s" % (age), (x, y-2), cv2.FONT_HERSHEY_COMPLEX_SMALL, 0.5, (0, 0, 255), 1)
                cv2.putText(frame, "%0.2f" % (confidence), (x, y+h-2), cv2.FONT_HERSHEY_COMPLEX_SMALL, 0.5, (0, 0, 255), 1)
                cv2.putText(frame, "%s" % (gender[0]), (x+w-10, y+h-2), cv2.FONT_HERSHEY_COMPLEX_SMALL, 0.5, (0, 0, 255), 1)

            # server-side processing rate, derived from the reported elapsed time
            cv2.putText(frame, "fps:%0.2f" % (1000/float(diag['elapsedTime'].split(" ")[0])), (0, 20), cv2.FONT_HERSHEY_SIMPLEX, 0.5, (0, 0, 255))

        # To record to disk, uncomment the line below
        #out.write(frame)
        cv2.imshow("frame", frame)
        if key == ord('q'):
            break

    # release the camera and close the preview window
    cap.release()
    cv2.destroyAllWindows()

  4. Install the dependencies: pip3 install requests
  5. Run the program: python3 <your_file>.py --src /path/to/vid (where <your_file>.py is the file you saved in step 3)
  6. You should be able to see the preview.
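The loop in step 3 treats the server's reply as a list whose last element carries diagnostics and whose earlier elements are faces. As a minimal sketch, that unpacking can be pulled into a small helper that is easy to test without a running server. The names `split_response` and `server_fps` and the sample payload are hypothetical; only the field names come from the tutorial code, and the actual response format should be checked against the Deepsight documentation.

```python
def split_response(result, min_confidence=0.4):
    """Split a parsed reply into (faces, diagnostics).

    Assumes the layout the tutorial code relies on: a list whose last
    element holds 'diagnostics' and whose other elements are faces.
    """
    if len(result) <= 1:
        return [], None
    diagnostics = result[-1]['diagnostics']
    # Keep only detections at or above the confidence threshold.
    faces = [face for face in result[:-1]
             if face['faceRectangle']['confidence'] >= min_confidence]
    return faces, diagnostics


def server_fps(diagnostics):
    """Convert an 'elapsedTime' string such as '40 ms' into frames per second."""
    elapsed_ms = float(diagnostics['elapsedTime'].split(" ")[0])
    return 1000 / elapsed_ms


# Hypothetical payload, shaped only after the fields the tutorial reads.
sample = [
    {'faceRectangle': {'left': 10, 'top': 20, 'width': 50, 'height': 60,
                       'confidence': 0.9},
     'gender': 'Male', 'age': 31},
    {'faceRectangle': {'left': 80, 'top': 25, 'width': 40, 'height': 45,
                       'confidence': 0.2},
     'gender': 'Female', 'age': 27},
    {'diagnostics': {'elapsedTime': '40 ms'}},
]

faces, diag = split_response(sample)
print(len(faces), server_fps(diag))
```

Keeping the parsing separate from the drawing code means the confidence threshold can be tuned in one place.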

You can explore the other command-line arguments with the --help switch.
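One argument worth a closer look is --src. Its default is the integer 0, but any value typed on the command line arrives as a string, and cv2.VideoCapture generally interprets a string as a file name or stream URL rather than a camera index. The sketch below shows one way to handle both cases; the helper name `parse_source` is hypothetical and not part of the tutorial code or the Deepsight SDK.

```python
import argparse


def parse_source(value):
    """Return an int for camera indices like '1', else the value unchanged."""
    text = str(value)
    return int(text) if text.isdecimal() else value


parser = argparse.ArgumentParser(description='Realtime Face Analytics')
parser.add_argument('--src', default=0, type=parse_source,
                    help='Camera index or path to a video file')

# Explicit argument lists are parsed here so the sketch runs on its own.
print(parser.parse_args(['--src', '1']).src)         # camera index, as an int
print(parser.parse_args(['--src', 'clip.avi']).src)  # file path, as a str
print(parser.parse_args([]).src)                     # default webcam
```

The parsed value can then be passed to cv2.VideoCapture unchanged.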


This tutorial demonstrates how Deepsight SDK makes developing face analytics applications extremely easy.  It is possible to modify the tutorial code to improve upon the features.
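As one example of such a modification, the sketch below keeps a running tally of the gender labels returned for each frame. It assumes each face is a dictionary with a 'gender' key, as the tutorial code reads it; the function name `tally_genders` and the sample values are hypothetical. Note that it counts detections per frame, not unique people.

```python
from collections import Counter


def tally_genders(faces, counts=None):
    """Add the gender label of each face to a running Counter."""
    counts = Counter() if counts is None else counts
    for face in faces:
        counts[face['gender']] += 1
    return counts


# Hypothetical per-frame face lists; the labels are made up for illustration.
frame_1 = [{'gender': 'Male', 'age': 31}, {'gender': 'Female', 'age': 27}]
frame_2 = [{'gender': 'Female', 'age': 45}]

totals = tally_genders(frame_1)
totals = tally_genders(frame_2, totals)
print(dict(totals))
```

In the tutorial loop, a call like this would sit right after the list of faces is extracted from the response.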

Feel free to comment below for any queries.

