Virtual backgrounds are one of the hot topics among employees who work remotely at the moment. With some of us isolated because of the Covid-19 pandemic, a lot of people have to take video calls in order to carry on their work. Some video conferencing tools allow setting a virtual background so that users can build a friendlier atmosphere for these calls.
Interested in more stories like this? Follow me on Twitter at @b_dmarius and I'll post there every new article.
As a programmer, I was naturally intrigued the first time I used such a virtual background. How does it work, I wondered. Can I build one myself? And if yes, how can I do it? Spoiler: it did not go well! Still, I think it was a good educational exercise, and I didn't find much information on this topic while researching it. Therefore, as I do with everything I learn, I decided to document it here; maybe someone else will benefit from it.
So in this tutorial we are going to try a basic approach for building a virtual background with Computer Vision techniques, using Python and OpenCV.
Introduction
The goal of this project is to take a video, try to figure out which part is the background and which part is the foreground, remove the background and replace it with a picture - the virtual background. Because we are going to use trivial methods in this project, we need to assume that the foreground will, in general, have colors different from the background. But first, let's see what our tools are.
Computer Vision
Computer Vision is an interdisciplinary field that deals with how computers can process and (maybe) understand images and videos. We say it is interdisciplinary because it borrows a lot of concepts from different disciplines (computer science, algebra, geometry and so on) and combines them to solve many different and complex tasks, like object tracking, object detection, object recognition and object segmentation in images and videos.
OpenCV
OpenCV is a library built for solving computer vision tasks. It is open-source and available for several programming languages, including Python and C++. It has a tremendous number of features for computer vision, some of them based on maths and statistical approaches, and others based on Machine Learning.
Python
If you've made it this far in this article, you probably know what Python is 😀
Building a virtual background
The approach I tried for this was the following. I'll show code snippets for every step and at the end of the article you'll have the full code.
1. Import dependencies
import numpy as np
import cv2
2. Load the video from the local environment and initialize data
cap = cv2.VideoCapture('video6.mp4')
ret = True
frameCounter = 0
previousFrame = None
nextFrame = None
iterations = 0
3. Load the substitute background image from the local environment
backgroundImage = cv2.imread("image1.jpg")
4. Split the video frame by frame
while ret:
    ret, frame = cap.read()
    if not ret:  # no frame was returned, so the video has ended
        break
5. Take every pair of two frames
if frameCounter % 2 == 1:
    nextFrame = frame
if frameCounter % 2 == 0:
    frameCounter = 0
    previousFrame = frame
frameCounter = frameCounter + 1
iterations = iterations + 1
6. Find the absolute difference between the two frames and convert it to grayscale -> obtaining a mask.
if iterations > 2:
    diff = cv2.absdiff(previousFrame, nextFrame)
    mask = cv2.cvtColor(diff, cv2.COLOR_BGR2GRAY)
Every image consists of pixels. You can imagine it as a 2D matrix with rows and columns, where every cell in the matrix is a pixel in the image (of course, for color images we have more dimensions than just 2, but for simplicity we can ignore this).
We obtain the difference by going pixel by pixel through the first image (so cell by cell in the first matrix), subtracting the corresponding pixel of the other image (so the corresponding cell of the other matrix) and keeping the absolute value.
Now here's the trick: if a pixel has not changed between the 2 frames, then of course the result will be 0. How can a pixel be different between 2 frames? If the video is completely static (nothing moves in the image), then the difference will be 0 between each and every frame for all the pixels, because nothing has changed. But if something moves in the image, then we can identify where it has moved by detecting the pixel differences. And we can assume that, in a video conference, the things that move are in the foreground (that's you) and the static part is the background.
And what's so important about this 0? The image will show a black color for every pixel that is 0, and we are going to use that to our advantage.
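To make the subtraction concrete, here is a small sketch on two made-up 2x2 grayscale frames. It uses plain NumPy instead of OpenCV, and the pixel values are invented for illustration. The detour through int16 is there because subtracting uint8 arrays directly wraps around instead of going negative.

```python
import numpy as np

# Two hypothetical 2x2 grayscale frames (values invented for illustration).
previous_frame = np.array([[10, 200], [50, 50]], dtype=np.uint8)
next_frame = np.array([[10, 190], [50, 53]], dtype=np.uint8)

# Naive uint8 subtraction wraps around: 50 - 53 becomes 253, not -3.
naive = previous_frame - next_frame

# Widen to a signed type, subtract, take the absolute value, go back to uint8.
# This gives the per-pixel |a - b| that an absolute difference is meant to give.
diff = np.abs(
    previous_frame.astype(np.int16) - next_frame.astype(np.int16)
).astype(np.uint8)

print(naive)
print(diff)
```

The two unchanged pixels come out as 0 (black) in `diff`, while the two changed pixels keep a small non-zero value. The wrapped 253 in `naive` shows why a plain subtraction would wrongly flag a tiny change as a huge one.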
7. Find the cells in the mask that are over a threshold value - I've chosen 3 as a threshold, but you can play with different values. A larger value will remove more from the background, but may also remove more from the foreground.
th = 3
isMask = mask > th
nonMask = mask <= th
8. Create an empty image (0 for every cell) with the same size as the two frames.
result = np.zeros_like(nextFrame, np.uint8)
9. Resize the background image so that it has the same size as the frames.
resized = cv2.resize(backgroundImage, (result.shape[1], result.shape[0]), interpolation = cv2.INTER_AREA)
10. For every cell from the mask that is bigger than the threshold, copy from the original frame.
result[isMask] = nextFrame[isMask]
11. For every cell from the mask that is at or below the threshold, copy from the substitute background image.
result[nonMask] = resized[nonMask]
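Steps 7 to 11 rely on indexing a 3-channel image with a 2D boolean mask, which can look surprising at first. Here is a minimal sketch on a made-up 2x2 image; the arrays and their values are invented, and `frame` and `background` stand in for `nextFrame` and `resized`.

```python
import numpy as np

# Hypothetical 2x2 color frame and background (3 channels each), values invented.
frame = np.full((2, 2, 3), 200, dtype=np.uint8)       # stands in for nextFrame
background = np.full((2, 2, 3), 30, dtype=np.uint8)   # stands in for resized
# Hypothetical grayscale difference mask with the same height and width.
mask = np.array([[0, 9], [2, 40]], dtype=np.uint8)

th = 3
is_mask = mask > th      # True where the pixel changed by more than th
non_mask = mask <= th    # True where the pixel stayed (almost) the same

result = np.zeros_like(frame)
# A 2D boolean index on a 3D image selects whole pixels (all three channels).
result[is_mask] = frame[is_mask]
result[non_mask] = background[non_mask]

# Equivalent one-liner: broadcast the mask over the channel axis.
alternative = np.where(is_mask[..., None], frame, background)
```

Since `is_mask` and `non_mask` are exact complements, every pixel of `result` is written exactly once, so the initial zeros never show up in the output.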
12. Save the result frame to the local environment.
cv2.imwrite("output" + str(iterations) + ".jpg", result)
Results and conclusion
So what are the results? Honestly, I've been a bit disappointed by the result. Then I did more research and the reason became more obvious. You need a more advanced approach for this, and it's no surprise that big companies invest lots of resources in this type of problem.
Here's a screenshot of the video I tried. It's basically a video of my hand moving in front of a wall.
And here's a screenshot of the output image. For the background I used a photo of me in Rasnov, Romania.
As I said, I am not very satisfied with the result. But I am satisfied with what I learned from this project. It was a fun learning experience and a nice way to spend my time working with concepts I am not yet comfortable with.
Other approaches to creating a virtual background
If you think a problem is very complicated and requires a level of intelligence unusual for computer software, then the answer might be Machine Learning. 😀
There are already Deep Learning models out there that can perform this sort of task. But such a model requires large datasets to train on and lots of processing power, neither of which I had at the moment of writing this article. The task to be solved by such a deep learning model is called image segmentation.
Another approach would be a computer vision method for finding the distance between the camera and the objects in the image. Then you would establish a threshold for separating the foreground from the background. After that, you can use the same kind of mask I used to remove the background and introduce a new one.
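As a rough illustration of the distance idea, here is a sketch that assumes a per-pixel depth map is already available (for example from a depth camera or a stereo method, neither of which is covered here). The depth values, the images and the 1-meter cut-off are all invented for the example.

```python
import numpy as np

# Hypothetical depth map in meters, one value per pixel (invented values).
depth = np.array([[0.6, 0.7, 3.0],
                  [0.5, 2.8, 3.1]], dtype=np.float32)
# Hypothetical frame and replacement background with the same height and width.
frame = np.full((2, 3, 3), 200, dtype=np.uint8)
background = np.full((2, 3, 3), 30, dtype=np.uint8)

max_foreground_distance = 1.0  # assumed cut-off in meters
is_foreground = depth < max_foreground_distance

# Keep the frame where the object is close, use the background elsewhere.
result = np.where(is_foreground[..., None], frame, background)
```

Unlike the frame-difference mask, this one does not depend on movement, so a person sitting still would stay in the foreground.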
Thank you so much for reading this.