Virtual Background For Video Conferencing In Python and OpenCV - A Silly Approach

Category: IT Technology · Published: 5 years ago

Virtual backgrounds are one of the hot topics among remote workers at the moment. With some of us isolated because of the Covid-19 pandemic, a lot of people have to take video calls in order to carry on with their work. Some video conferencing tools allow setting a virtual background so that users can create a friendlier atmosphere for these calls.

Photo by timJ / Unsplash

Interested in more stories like this? Follow me on Twitter at @b_dmarius, where I post every new article.

As a programmer, I was naturally intrigued the first time I used such a virtual background. How does it work, I wondered. Can I build such a virtual background? And if so, how? Spoiler: it did not go well! Still, I think it was a good educational exercise, and I didn't find much information on this topic while researching it. Therefore, as I do with everything I learn, I decided to document it here; maybe someone else will benefit from it.

So in this tutorial we are going to try a basic approach for building a virtual background with Computer Vision techniques, using Python and OpenCV.

Introduction

The goal of this project is to take a video, figure out which parts of it are background and which are foreground, remove the background, and replace it with a picture: the virtual background. Because we are going to use trivial methods in this project, we need to assume that the foreground will, in general, have colors different from the background. But first, let's see what our tools are.

Computer Vision

Computer Vision is an interdisciplinary field that deals with how computers can process and (maybe) understand images and videos. We say it is interdisciplinary because it borrows a lot of concepts from different disciplines (computer science, algebra, geometry and so on) and combines them to solve many different and complex tasks, like object tracking, object detection, object recognition and object segmentation in images and videos.

OpenCV

OpenCV is a library built for solving computer vision tasks. It is open-source and available for several programming languages, including Python and C++. It has a tremendous number of features for computer vision, some of them based on maths and statistical approaches, and others based on Machine Learning.

Python

If you've made it this far in this article, you probably know what Python is :grinning:

Building a virtual background

The approach I tried for this was the following. I'll show code snippets for every step and at the end of the article you'll have the full code.

1. Import dependencies

import numpy as np
import cv2

2. Load the video from the local environment and initialize data

cap = cv2.VideoCapture('video6.mp4')
ret = True
frameCounter = 0
previousFrame = None
nextFrame = None
iterations = 0

3. Load the substitute background image from the local environment

backgroundImage = cv2.imread("image1.jpg")

4. Split the video frame by frame

while ret:
    ret, frame = cap.read()
    if not ret:
        break  # no more frames to read, so frame is None here

5. Take every pair of two frames

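# Odd counter: this frame becomes the second frame of the pair
if frameCounter % 2 == 1:
    nextFrame = frame

# Even counter: reset the counter and store this frame as the first of the pair
if frameCounter % 2 == 0:
    frameCounter = 0
    previousFrame = frame

frameCounter = frameCounter + 1
iterations = iterations + 1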
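
To see which frames actually end up being compared, the bookkeeping above can be replayed with integers standing in for frames. This is only a sketch: the function name `trace_pairs` is made up for illustration, and the `iterations > 2` guard is borrowed from step 6.

```python
def trace_pairs(frames):
    """Replay the pairing logic with arbitrary stand-in 'frames'."""
    frame_counter = 0
    iterations = 0
    previous_frame = None
    next_frame = None
    compared = []

    for frame in frames:
        if frame_counter % 2 == 1:
            next_frame = frame
        if frame_counter % 2 == 0:
            frame_counter = 0
            previous_frame = frame
        frame_counter += 1
        iterations += 1

        # Same guard as in step 6: only compare from the third frame onward
        if iterations > 2:
            compared.append((previous_frame, next_frame))
    return compared


pairs = trace_pairs([1, 2, 3, 4, 5, 6])
print(pairs)
```

In effect, every frame from the third onward is compared with its immediate neighbour. The order inside each pair does not matter, because the difference taken in the next step is absolute.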

6. Find the absolute difference between the two frames and convert it to grayscale to obtain a mask.

if iterations > 2:
    diff = cv2.absdiff(previousFrame, nextFrame)
    mask = cv2.cvtColor(diff, cv2.COLOR_BGR2GRAY)

Every image consists of pixels. You can imagine it as a 2D matrix with rows and columns, where every cell in the matrix is a pixel in the image (of course, color images have more than just 2 dimensions, but for simplicity we can ignore this).

We obtain the difference by going pixel by pixel through the first image (so cell by cell in the first matrix), subtracting the corresponding pixel of the other image (so the corresponding cell of the other matrix) and keeping the absolute value of the result.

Now here's the trick: if a pixel has not changed between the two frames, the result for that pixel will be 0. How can a pixel differ between two frames? If the video is completely static (nothing moves in the image), the difference will be 0 for every pixel in every pair of frames, because nothing has changed. But if something moves, we can identify where it moved by detecting the pixel differences. And we can assume that, in a video conference, the things that move are in the foreground (that's you) and the static part is the background.

And what's so important about this 0? The image will show black for every pixel that is 0, and we are going to use that to our advantage.
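
A tiny worked example may help show why an absolute difference is used instead of a plain subtraction. The pixel values below are invented, and the NumPy expression is only a stand-in that should give the same values as `cv2.absdiff` for 8-bit images; it is not how OpenCV implements it.

```python
import numpy as np

# Two tiny 2x2 single-channel "frames" (values invented for illustration)
previous_frame = np.array([[10, 200], [50, 50]], dtype=np.uint8)
next_frame = np.array([[10, 190], [60, 50]], dtype=np.uint8)

# Plain uint8 subtraction wraps around below zero: 50 - 60 becomes 246
wrapped = previous_frame - next_frame

# Widening to a signed type first gives the true absolute difference
abs_diff = np.abs(
    previous_frame.astype(np.int16) - next_frame.astype(np.int16)
).astype(np.uint8)

print(wrapped)
print(abs_diff)
```

Unchanged pixels come out as 0 in both results, but the pixel that got slightly brighter shows up as 246 in the wrapped version and as 10 in the absolute difference, which is the value the threshold in the next step expects.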

7. Find the cells in the mask that are over a threshold value. I've chosen 3 as the threshold, but you can play with different values. A larger value will remove more of the background, but may also remove more of the foreground.

th = 3
isMask = mask > th
nonMask = mask <= th
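
As a small illustration of what these two boolean arrays look like, here is the same comparison on an invented 2x3 mask. The values are made up; only the threshold of 3 comes from the step above.

```python
import numpy as np

# Invented grayscale difference values for a 2x3 image
mask = np.array([[0, 2, 3],
                 [4, 40, 0]], dtype=np.uint8)

th = 3
is_mask = mask > th    # True where enough change was detected (foreground)
non_mask = mask <= th  # True everywhere else (background)

print(is_mask)
print(int(is_mask.sum()), int(non_mask.sum()))

# The two masks are exact complements of each other
print(np.array_equal(non_mask, ~is_mask))
```

Note that a value exactly equal to the threshold counts as background here, since only values strictly greater than `th` are kept.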

8. Create an empty image (0 for every cell) with the same size as the frames.

result = np.zeros_like(nextFrame, np.uint8)

9. Resize the background image so that it has the same size as the frames.

resized = cv2.resize(backgroundImage, (result.shape[1], result.shape[0]), interpolation = cv2.INTER_AREA)

10. For every cell in the mask that is above the threshold, copy the pixel from the original frame.

result[isMask] = nextFrame[isMask]

11. For every cell in the mask that is at or below the threshold, copy the pixel from the substitute background image.

result[nonMask] = resized[nonMask]
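
Steps 8, 10 and 11 can also be written as a single `np.where` call. The sketch below uses small constant arrays as stand-ins for `nextFrame` and `resized` (the values are invented) and checks that both ways of compositing give the same image.

```python
import numpy as np

h, w = 2, 3
frame = np.full((h, w, 3), 200, dtype=np.uint8)      # stand-in for nextFrame
background = np.full((h, w, 3), 50, dtype=np.uint8)  # stand-in for resized
is_mask = np.array([[True, False, False],
                    [False, True, True]])

# Steps 8, 10 and 11 as described above
result = np.zeros_like(frame, np.uint8)
result[is_mask] = frame[is_mask]
result[~is_mask] = background[~is_mask]

# The same composite in one expression: the 2D mask gets a trailing axis
# so that it broadcasts over the three color channels
combined = np.where(is_mask[..., None], frame, background)

print(np.array_equal(result, combined))
```

Which form to use is mostly a matter of taste; the step-by-step version makes it easier to inspect the intermediate image.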

12. Save the result frame to the local environment.

cv2.imwrite("output" + str(iterations) + ".jpg", result)

Results and conclusion

So what are the results? Honestly, I was a bit disappointed by the result. Then I did more research and the reason became more obvious. You need a more advanced approach for this, and it's no surprise that big companies invest lots of resources in this type of problem.

Here's a screenshot of the video I tried. It's basically a video of my hand moving in front of a wall.

Virtual background Python and OpenCV tutorial - input

And here's a screenshot of the output image. For the background I used a photo of me in Rasnov, Romania.

Virtual background Python and OpenCV tutorial - output

As I said, I am not very satisfied with the result. But I am satisfied with what I learned from this project. It was a fun learning experience and a nice way to spend my time working with concepts I am not comfortable working with.

Other approaches to creating a virtual background

If you think a problem is very complicated and requires a level of intelligence beyond what you usually see in computer software, then the answer might be Machine Learning. :grinning:

There are already Deep Learning models out there that can perform this sort of task. But such a model requires large datasets to train on and lots of processing power, neither of which I had at the time of writing this article. The task solved by such a deep learning model is called image segmentation.

Another approach would be a computer vision method for finding the distance between the camera and the objects in the image. Then you would establish a threshold for separating the foreground from the background. After that, you can use the same masking technique I used to remove the background and introduce a new one.
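
To make the distance idea a bit more concrete, here is a minimal sketch that assumes a per-pixel depth map is already available, for example from a depth camera. Everything in it is hypothetical: the depth values, the 1.5 metre cutoff and the variable names are invented for illustration, and estimating the depth map itself is the hard part that is not shown.

```python
import numpy as np

# Hypothetical depth map in metres, one value per pixel
depth = np.array([[0.6, 0.7, 3.0],
                  [0.5, 2.8, 3.1]])
frame = np.full((2, 3, 3), 200, dtype=np.uint8)      # stand-in camera frame
background = np.full((2, 3, 3), 50, dtype=np.uint8)  # stand-in virtual background

max_foreground_distance = 1.5  # assumed cutoff between foreground and background
is_foreground = depth < max_foreground_distance

# Keep the camera pixel where something is close, else use the virtual background
result = np.where(is_foreground[..., None], frame, background)
print(result[..., 0])
```

Unlike the frame difference, a mask built this way does not depend on the person moving.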

Thank you so much for reading this. Interested in more stories like this? Follow me on Twitter at @b_dmarius, where I post every new article.

