Face Detection using Webcam

Unless you have been living in a cave, you have probably noticed the sudden explosion of face detection technology all around you.  Phones, offices, shopping malls, public surveillance networks, and seemingly every other device with a camera now come equipped with this snooping tech.

But if you are one of those who like to pursue the how-do-they-do-that questions, then this post is for you!  Read on to learn how to build a simple face detector using your webcam.  Our face detector will also be able to estimate gender and age and to locate facial landmarks.


How Does it Work?

Although face detection technology has existed for decades, it was largely ignored because of its poor accuracy, which made it unfit for most practical applications until Deep Learning took off in 2012.  Since then, accuracy has improved dramatically, on some benchmarks matching or exceeding human performance, and the technology has become indispensable in many applications.  Deep Learning refers to the process of training an artificial neural network with many layers.  Wait… what?!  An artificial neural network is basically a set of interlinked nodes that loosely emulates a biological nervous system.  That’s right: after struggling to hand-program computers to understand vision, we took cues from Nature and found that a layered set of neurons can be used to understand images!

neural network

We have observed that a biological neuron, at the fundamental level, does only one thing: it scales each signal it receives by a ‘weight’ and forwards (activates) the result to the next neuron if it crosses a threshold.  A network of these neurons is, remarkably, able to transform input signals from the sensory organs such that the last layer of neurons activates meaningfully in response to a given stimulus.  The exact weights are learned naturally by the network over time.
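
To make the weight-and-threshold idea concrete, here is a minimal sketch of a single artificial neuron in Python.  The function name, the weights and the threshold are illustrative choices rather than anything taken from a library: the neuron multiplies each input by its weight, adds up the results, and fires only if the sum reaches the threshold.

```python
def neuron(inputs, weights, threshold):
    """Return 1 (fire) if the weighted sum of the inputs reaches the threshold, else 0."""
    total = sum(x * w for x, w in zip(inputs, weights))
    return 1 if total >= threshold else 0


# Illustrative values only: two input signals, two weights, one threshold.
strong_signal = neuron([1.0, 0.5], [0.6, 0.4], threshold=0.7)  # weighted sum is about 0.8
weak_signal = neuron([0.2, 0.5], [0.6, 0.4], threshold=0.7)    # weighted sum is about 0.32
print(strong_signal, weak_signal)
```

Real networks usually replace the hard threshold with a smooth activation function, but the weighted sum at the core stays the same.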

When we emulate neural networks on a computer, we first initialise the weights randomly and then adjust them through the process of training.  We take an image, feed it to the input layer, and compare the network’s output to the expected output.  We then calculate the error and adjust the weights of the network so that, on the next pass, the error is reduced.  This process involves solving some convoluted math equations and demands powerful computing resources.  Once trained, the network performs remarkably well.  There is quite a lot of detail that we just skipped in order to keep this article sane, but if you are interested you can learn more here.
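
The initialise, compare, adjust cycle described above can be sketched on a toy problem.  The example below is an assumption-laden illustration, not how a face detector is trained: it uses the classic perceptron learning rule on a single neuron, which is far simpler than the backpropagation used for deep networks, and it learns the logical AND function instead of images.  All names and hyperparameters are hypothetical.

```python
import random


def predict(weights, bias, inputs):
    """Fire (1) if the weighted sum plus bias is positive, else 0."""
    total = sum(w * x for w, x in zip(weights, inputs)) + bias
    return 1 if total > 0 else 0


def train(samples, learning_rate=0.1, max_epochs=1000, seed=0):
    """Perceptron rule: nudge the weights in the direction that reduces each error."""
    rng = random.Random(seed)
    # Start from random weights, as described in the text.
    weights = [rng.uniform(-1, 1) for _ in samples[0][0]]
    bias = rng.uniform(-1, 1)
    for _ in range(max_epochs):
        mistakes = 0
        for inputs, expected in samples:
            error = expected - predict(weights, bias, inputs)
            if error != 0:
                mistakes += 1
                weights = [w + learning_rate * error * x
                           for w, x in zip(weights, inputs)]
                bias += learning_rate * error
        if mistakes == 0:  # a full pass with no errors: training is done
            break
    return weights, bias


# Toy training set: the logical AND of two binary inputs.
AND_GATE = [((0, 0), 0), ((0, 1), 0), ((1, 0), 0), ((1, 1), 1)]
weights, bias = train(AND_GATE)
print([predict(weights, bias, inputs) for inputs, _ in AND_GATE])
```

A face detection network follows the same loop in spirit, only with millions of weights, images as input, and a gradient-based update in place of this simple rule.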

Face Detection using Webcam

You don’t have to be a math wizard to use this technology anymore.  There are plenty of free and paid services out there that provide pre-trained neural networks for face detection and analysis.  You only need basic programming skills to use their APIs.  In this tutorial we will be using Deepsight Face, an image recognition SDK that provides pre-trained neural networks accessible through a RESTful API.

Deepsight Face


If you haven’t already, the first step is to install Deepsight.  The documentation covers this topic quite well, so following it is highly recommended.  Once Deepsight is launched and running, you can continue with the steps below.

Writing Python Code

  1. Install Python and dependencies.  Follow this in the documentation.
  2. Install OpenCV.  Follow the guides for Windows and Linux.
  3. Create a file face.py with the following code
import cv2
import requests

# Open the default webcam and request a small frame size to keep requests fast
cap = cv2.VideoCapture(0)
cap.set(cv2.CAP_PROP_FRAME_WIDTH, 320)
cap.set(cv2.CAP_PROP_FRAME_HEIGHT, 240)

face_api = "http://localhost:5000/inferImage?returnFaceAttributes=true&returnFaceLandmarks=true"

while True:
    ok, frame = cap.read()
    if not ok:
        break  # no frame available from the camera

    # Encode the frame as JPEG and post it to Deepsight
    _, buf = cv2.imencode(".jpg", frame)
    image = {'pic': buf.tobytes()}
    r = requests.post(face_api, files=image)
    result = r.json()

    # The last element carries the diagnostics; everything before it is a face
    if len(result) > 1:
        faces = result[:-1]
        diag = result[-1]['diagnostics']

        for face in faces:
            rect, gender, age, lmk = [face[i] for i in ['faceRectangle', 'gender', 'age', 'faceLandmarks']]
            x, y, w, h = [int(rect[i]) for i in ['left', 'top', 'width', 'height']]

            cv2.rectangle(frame, (x, y), (x+w, y+h), (0, 255, 0), 1, 8)
            # str() and int() guard against value types that OpenCV rejects
            cv2.putText(frame, str(gender), (x, y), cv2.FONT_HERSHEY_SIMPLEX, 0.5, (255, 0, 0))
            cv2.putText(frame, str(age), (x, y+10), cv2.FONT_HERSHEY_SIMPLEX, 0.5, (255, 0, 0))
            for pt in lmk.values():
                cv2.circle(frame, (int(pt['x']), int(pt['y'])), 2, (0, 255, 255), -1, 8)

        cv2.putText(frame, str(diag['elapsedTime']), (0, 20), cv2.FONT_HERSHEY_SIMPLEX, 0.5, (255, 0, 0))

    cv2.imshow("frame", frame)
    # The window only refreshes inside waitKey; press q to quit
    if cv2.waitKey(1) & 0xFF == ord('q'):
        break

cap.release()
cv2.destroyAllWindows()

Run the application using python face.py.
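
If you would rather work with the results than draw them, the parsing step can be pulled out into a small helper.  The sketch below assumes the response layout that face.py relies on: a list whose last element holds the diagnostics and whose other elements are faces.  The helper name, the sample values and the landmark names are hypothetical and only there to show the shape of the data.

```python
def summarize(result):
    """Split a parsed response into per-face summaries and the diagnostics entry."""
    if len(result) <= 1:
        return [], None  # mirrors face.py, which ignores responses without faces
    faces, diag = result[:-1], result[-1]['diagnostics']
    summaries = []
    for face in faces:
        rect = face['faceRectangle']
        summaries.append({
            'box': (rect['left'], rect['top'], rect['width'], rect['height']),
            'gender': face['gender'],
            'age': face['age'],
            'landmark_count': len(face['faceLandmarks']),
        })
    return summaries, diag


# Hypothetical response: the keys follow face.py, the values are invented.
sample = [
    {
        'faceRectangle': {'left': 40, 'top': 30, 'width': 80, 'height': 80},
        'gender': 'female',
        'age': '25-32',
        'faceLandmarks': {'point_1': {'x': 60, 'y': 55}, 'point_2': {'x': 100, 'y': 55}},
    },
    {'diagnostics': {'elapsedTime': '12 ms'}},
]

summaries, diag = summarize(sample)
print(summaries, diag)
```

Keeping the parsing separate from the drawing code makes it easier to log results or feed them into another program later.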

Congratulations on writing your first face detection app!
