Real-Time Head Pose Estimation in Python

Category: IT Technology · Published: 5 years ago


Requirements

For this project we need OpenCV and TensorFlow, so let's install them.

# Using pip
pip install opencv-python
pip install tensorflow

# Using conda
conda install -c conda-forge opencv
conda install -c conda-forge tensorflow

Face Detection

Our first step is to find the faces in the image, on which we can then locate facial landmarks. For this task, we will use a Caffe model with OpenCV's DNN module. If you are wondering how it fares against other models such as Haar cascades or Dlib's frontal face detector, or you want to know more about it in depth, you can refer to this article:

You can download the required models from my GitHub repository.

import cv2
import numpy as np

# Paths to the pre-trained Caffe face detector (weights and network definition)
modelFile = "models/res10_300x300_ssd_iter_140000.caffemodel"
configFile = "models/deploy.prototxt.txt"
net = cv2.dnn.readNetFromCaffe(configFile, modelFile)

img = cv2.imread('test.jpg')
h, w = img.shape[:2]
# Resize to 300x300 and subtract the per-channel mean values
blob = cv2.dnn.blobFromImage(cv2.resize(img, (300, 300)), 1.0,
                             (300, 300), (104.0, 117.0, 123.0))
net.setInput(blob)
faces = net.forward()

# Draw the detected faces on the image
for i in range(faces.shape[2]):
    confidence = faces[0, 0, i, 2]
    if confidence > 0.5:
        # Box corners are normalized, so scale them back to the image size
        box = faces[0, 0, i, 3:7] * np.array([w, h, w, h])
        # Convert to plain Python ints before passing them to cv2.rectangle
        x, y, x1, y1 = box.astype(int).tolist()
        cv2.rectangle(img, (x, y), (x1, y1), (0, 0, 255), 2)

Load the network using cv2.dnn.readNetFromCaffe, passing the network definition (the .prototxt file) and the trained weights (the .caffemodel file) as its arguments. The model performs best on images resized to 300×300.
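The detection loop can also be wrapped in a small helper that returns the boxes instead of drawing them. This is a minimal sketch, assuming the output layout used in the snippet above (shape (1, 1, N, 7), confidence at index 2, normalized corners at indices 3 to 6). The name decode_detections and the clipping to the image bounds are additions for illustration and are not part of the original code. A synthetic array stands in for the output of net.forward().

```python
import numpy as np

def decode_detections(faces, w, h, threshold=0.5):
    """Convert raw detector output of shape (1, 1, N, 7) into pixel boxes."""
    boxes = []
    for i in range(faces.shape[2]):
        confidence = float(faces[0, 0, i, 2])
        if confidence <= threshold:
            continue
        # Corners are normalized to [0, 1]; scale back to the image size.
        box = faces[0, 0, i, 3:7] * np.array([w, h, w, h])
        x, y, x1, y1 = box.astype(int).tolist()
        # Clip so boxes partly outside the frame stay valid for cropping.
        x, x1 = max(0, x), min(w - 1, x1)
        y, y1 = max(0, y), min(h - 1, y1)
        boxes.append((x, y, x1, y1))
    return boxes

# Synthetic detections standing in for net.forward() output.
fake = np.zeros((1, 1, 2, 7), dtype=np.float32)
fake[0, 0, 0] = [0, 1, 0.9, 0.25, 0.25, 0.75, 0.75]  # confident detection
fake[0, 0, 1] = [0, 1, 0.2, 0.10, 0.10, 0.20, 0.20]  # below the threshold
boxes = decode_detections(fake, w=400, h=200)
print(boxes)
```

Returning the boxes rather than drawing them makes it easier to pass each face on to the landmark detector.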

Facial Landmark Detection

The most commonly used detector is Dlib's facial landmark detector, which gives us 68 landmarks; however, it does not give good accuracy. Instead, we will use a facial landmark detector provided by Yin Guobing in this GitHub repo. It also gives 68 landmarks, and it is a TensorFlow CNN trained on 5 datasets. The pre-trained model can be found here. The author has also written a series of posts covering the background, dataset, preprocessing, model architecture, training, and deployment, which can be found here. I have provided a very brief summary below, but I would strongly encourage you to read them.

In the first article of the series, he describes the problem of landmark stability in videos and then lays out existing solutions such as OpenFace and Dlib's facial landmark detection, along with the available datasets. The third article is all about data preprocessing and making the data ready to use. In the next two articles, the work is to extract the faces and apply facial landmarks to them so they are ready for training a CNN, and to store them as TFRecord files. In the sixth article, a model is trained using TensorFlow. That article shows how important the choice of loss function is: he first used tf.losses.mean_pairwise_squared_error, which uses the relationships between points as the basis for optimization, and the model could not generalize well. In contrast, tf.losses.mean_squared_error worked well. In the final article, the model is exported as an API and he shows how to use it in Python.
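To make the difference between the two losses concrete, here is a small NumPy sketch. It is a simplified illustration of the idea behind a pairwise loss, not TensorFlow's exact implementation or normalization, and the function names are hypothetical. It shows one property that may help explain the behaviour described above: a loss built only on differences between pairs of points cannot see a constant offset applied to every coordinate, whereas the plain mean squared error can.

```python
import numpy as np

def mean_squared_error(labels, preds):
    """Average squared difference between corresponding elements."""
    return float(np.mean((labels - preds) ** 2))

def pairwise_squared_error(labels, preds):
    """Average squared difference between pairs of element differences.

    Simplified illustration only; not TensorFlow's implementation.
    """
    d = preds - labels
    n = d.size
    total = 0.0
    for i in range(n):
        for j in range(i + 1, n):
            # ((labels_i - labels_j) - (preds_i - preds_j))^2 == (d_j - d_i)^2
            total += (d[i] - d[j]) ** 2
    return float(total / (n * (n - 1) / 2))

labels = np.array([10.0, 20.0, 30.0, 40.0])
shifted = labels + 5.0  # every coordinate is off by the same 5 pixels

mse = mean_squared_error(labels, shifted)
pairwise = pairwise_squared_error(labels, shifted)
print(mse, pairwise)
```

Here the shifted prediction is penalized by the mean squared error but not by the pairwise version, because the relative layout of the points is unchanged.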

The model takes square boxes of size 128×128 that contain faces and returns 68 facial landmarks. The code provided below is taken from here, and it can also be used to draw 3D annotation boxes on the faces. The code has been modified to draw facial landmarks on all the faces, unlike the original code, which would draw on only one.
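Because the landmark model expects a square input, a detected face box usually has to be made square before it is cropped and resized to 128×128 (for example with cv2.resize). The following is a rough sketch of that step under my own assumptions; square_box and fits_in_image are hypothetical helper names and not the functions used in the referenced code.

```python
def square_box(x, y, x1, y1):
    """Grow the shorter side of a box so it becomes square, keeping its center."""
    width, height = x1 - x, y1 - y
    diff = abs(height - width)
    half = diff // 2
    if height > width:    # tall box: widen it
        x -= half
        x1 += diff - half
    elif width > height:  # wide box: make it taller
        y -= half
        y1 += diff - half
    return x, y, x1, y1

def fits_in_image(box, w, h):
    """Check that a box lies fully inside an image of size w x h."""
    x, y, x1, y1 = box
    return x >= 0 and y >= 0 and x1 <= w and y1 <= h

box = square_box(100, 50, 180, 170)  # 80 wide, 120 tall
print(box, fits_in_image(box, 640, 480))
```

A box that no longer fits inside the frame after squaring needs to be shifted or skipped before cropping.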

This code will draw facial landmarks on the faces.

Drawing facial landmarks

Using draw_annotation_box(), we can also draw the annotation box, as shown below.

With annotation box

Pose Estimation

There is a great article on Learn OpenCV that explains head pose estimation on images, with a lot of math about converting the points to 3D space and using cv2.solvePnP to find the rotation and translation vectors. A quick read-through of that article is a good way to understand the inner workings, so I will cover it only briefly here.

We need six points of the face: the nose tip, the chin, the extreme left and right points of the lips, the left corner of the left eye, and the right corner of the right eye. We take standard 3D coordinates of these facial landmarks and try to estimate the rotation and translation vectors at the nose tip. For an accurate estimate, we need the intrinsic parameters of the camera, such as the focal length, the optical center, and the radial distortion parameters. We can approximate the first two and assume there is no distortion to make our work easier. After obtaining the required vectors, we can project those 3D points onto a 2D surface, which is our image.
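The inputs described above can be sketched as follows. The 3D model points are generic face coordinates of the kind used in the Learn OpenCV tutorial, and the camera approximation (focal length equal to the image width, optical center at the image center, zero distortion) is a common simplification rather than a calibrated result. The names MODEL_POINTS and approximate_camera are my own.

```python
import numpy as np

# Generic 3D face model points in arbitrary units, nose tip at the origin.
MODEL_POINTS = np.array([
    (0.0, 0.0, 0.0),           # nose tip
    (0.0, -330.0, -65.0),      # chin
    (-225.0, 170.0, -135.0),   # left corner of the left eye
    (225.0, 170.0, -135.0),    # right corner of the right eye
    (-150.0, -150.0, -125.0),  # left corner of the mouth
    (150.0, -150.0, -125.0),   # right corner of the mouth
])

def approximate_camera(h, w):
    """Approximate the intrinsics of an uncalibrated camera (assumption)."""
    focal_length = float(w)         # focal length ~ image width
    center = (w / 2.0, h / 2.0)     # optical center ~ image center
    camera_matrix = np.array([
        [focal_length, 0.0, center[0]],
        [0.0, focal_length, center[1]],
        [0.0, 0.0, 1.0],
    ])
    dist_coeffs = np.zeros((4, 1))  # assume no lens distortion
    return camera_matrix, dist_coeffs

camera_matrix, dist_coeffs = approximate_camera(h=480, w=640)
print(camera_matrix)
```

Given the six matching 2D landmarks as a float array of shape (6, 2), these values can be passed to cv2.solvePnP(MODEL_POINTS, image_points, camera_matrix, dist_coeffs), which returns a success flag together with the rotation and translation vectors. cv2.projectPoints can then map any 3D point back onto the image.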

If we only use the code available and find the angle with the x-axis we can obtain the result shown below.
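As a rough sketch of that angle computation, assume the pointer is the line from the nose tip to a point projected in front of the face, both given as pixel coordinates. The function name and the sample coordinates are illustrative only.

```python
import math

def angle_with_x_axis(p1, p2):
    """Angle in degrees between the line through p1 and p2 and the x-axis."""
    dx = p2[0] - p1[0]
    dy = p2[1] - p1[1]
    if dx == 0:
        return 90.0  # vertical line: the slope is undefined
    return math.degrees(math.atan(dy / dx))

nose_tip = (320, 240)   # hypothetical pixel coordinates
projected = (420, 340)  # hypothetical projected point
angle = angle_with_x_axis(nose_tip, projected)
print(angle)
```

Keep in mind that image y coordinates grow downward, so the sign of the angle depends on that convention.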

Result

It works great for recording the head moving up and down, but not for left or right movement. So how do we do that? Above, we saw an annotation box drawn on the face. We can use it to measure the left and right movements.

With annotation box

We can find the line in the middle of the two dark blue lines to act as our pointer and find the angle with the y-axis to find the angle of movement.
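One way to sketch this is to average the endpoints of the two lines to get the middle line, and then measure its angle against the vertical axis. This is an illustration under my own assumptions about how the lines are represented (each as a pair of pixel points); the function names and sample coordinates are hypothetical.

```python
import math

def midline(line_a, line_b):
    """Line running midway between two line segments."""
    (a1, a2), (b1, b2) = line_a, line_b
    start = ((a1[0] + b1[0]) / 2.0, (a1[1] + b1[1]) / 2.0)
    end = ((a2[0] + b2[0]) / 2.0, (a2[1] + b2[1]) / 2.0)
    return start, end

def angle_with_y_axis(p1, p2):
    """Angle in degrees between the line through p1 and p2 and the y-axis."""
    dx = p2[0] - p1[0]
    dy = p2[1] - p1[1]
    if dy == 0:
        return 90.0  # horizontal line: perpendicular to the y-axis
    return math.degrees(math.atan(dx / dy))

left_line = ((100, 100), (120, 200))   # hypothetical box edges
right_line = ((200, 100), (220, 200))
start, end = midline(left_line, right_line)
angle = angle_with_y_axis(start, end)
print(start, end, angle)
```

The sign of the result depends on the order of the two points, so the same order should be used for every frame.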

Result

Combining both angles, we can tell in which direction the head is facing. The complete code can also be found here at my GitHub repository, along with various other sub-models for an online proctoring solution.
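The combination step can be as simple as thresholding the two angles. The sketch below is illustrative: the function name, the 48-degree threshold, and the sign convention (positive vertical angle means down, positive horizontal angle means right) are assumptions that should be tuned for the camera setup.

```python
def head_direction(vertical_angle, horizontal_angle, threshold=48.0):
    """Turn two angles in degrees into coarse direction labels."""
    labels = []
    if vertical_angle >= threshold:
        labels.append("down")
    elif vertical_angle <= -threshold:
        labels.append("up")
    if horizontal_angle >= threshold:
        labels.append("right")
    elif horizontal_angle <= -threshold:
        labels.append("left")
    # No threshold crossed: treat the head as facing the camera.
    return labels or ["forward"]

print(head_direction(60, 0))
print(head_direction(-50, -70))
print(head_direction(10, -10))
```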

