Object Detection using YOLOv3

Category: IT Technology · Published: 4 years ago

A journey into detecting objects in real-time using YOLOv3 and OpenCV

Jun 21 · 5 min read

Deep learning has revolutionized the realm of computer vision. Neural networks are widely used in cutting-edge technology such as Tesla’s Autopilot feature. They perform so well that they sometimes raise ethical issues and conflicts. We won’t be diving into those today. Let’s focus on a sub-category of computer vision called “detection”.

What does it mean to detect an object? When we see an object, we can point to exactly where it is and determine what it is with ease. For computers, though, the task is not so simple. This has been an active area of research for years and continues to be so today. In the past decade, with the advent (or rather the resurgence) of deep learning, results have become good enough that detection can be used in real-time scenarios.

Object Detection using YOLOv3 — Image from Wikipedia

Overview

There are several neural network architectures for detection:

R-CNN family of architectures
Single Shot Detectors
YOLO — You Only Look Once

Today we will look at an implementation of YOLOv3 (a variant of the original YOLO architecture) without going into much detail about how it works. The language of choice is Python (❤) because of its extensive library support, including OpenCV. It is also worth mentioning that YOLO models are generally not as accurate as R-CNN models, but they are fast and well suited to real-time applications.

Implementation

Let’s begin by importing the necessary libraries. The OpenCV library is going to be our best friend for this tutorial as it has several helpful functions for manipulating images as well as useful modules such as ‘dnn’.

Since we’ll be using a pre-trained model, we have to download three files: the “weights” file, the “configuration” file, and the “coco-names” file. The weights and the configuration file can be found at this link, and the coco-names file can be downloaded or copied from here. There are several pre-trained models available, and we will be using the “YOLOv3-416” model. The models are trained on the MS COCO dataset, which has 80 classes of objects.

After downloading all the files, it’s time to create and load our model. The dnn module has several built-in functions to help us here. The names of the objects that our model has been trained to identify are given in the “coco.names” file, which we store in a list called classes. We also retrieve the names of the output layers with the help of the getLayerNames() and getUnconnectedOutLayers() functions and store them in the output_layers list.
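The loading step described above might look roughly like the following sketch. The file names used in the usage note (yolov3.weights, yolov3.cfg, coco.names) are assumptions about where the downloaded files were saved, and the helper names load_classes and load_network are hypothetical. The flattening step is there because getUnconnectedOutLayers() returns differently shaped arrays in different OpenCV versions.

```python
def load_classes(names_path):
    """Read one class label per line from a coco.names-style file."""
    with open(names_path, "r") as f:
        return [line.strip() for line in f if line.strip()]


def load_network(weights_path, config_path):
    """Load a Darknet model with OpenCV's dnn module and find its output layers."""
    # Imported here because only this helper needs OpenCV and NumPy.
    import cv2
    import numpy as np

    net = cv2.dnn.readNet(weights_path, config_path)
    layer_names = net.getLayerNames()
    # getUnconnectedOutLayers() returns 1-based indices; flatten() copes with
    # both the (N, 1) and (N,) shapes returned by different OpenCV versions.
    out_indices = np.array(net.getUnconnectedOutLayers()).flatten()
    output_layers = [layer_names[i - 1] for i in out_indices]
    return net, output_layers
```

With the files in the working directory, this would be called as classes = load_classes("coco.names") followed by net, output_layers = load_network("yolov3.weights", "yolov3.cfg").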

We now have to pass the image through our model. We cannot do so directly, however, because the model expects its input to have a particular shape. This is where the cv2.dnn.blobFromImage() function comes in handy. It resizes the image, normalizes its pixel values, and puts the color channels in the order the model expects.
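As a rough illustration of this preprocessing, the sketch below wraps cv2.dnn.blobFromImage() with settings commonly used for YOLOv3 (a 1/255 scale factor, a 416×416 input, and a BGR-to-RGB swap). These settings are assumptions, not values taken from the article. The second function, blob_from_resized, is a hypothetical NumPy-only helper that mirrors the same steps for an image that is already 416×416, to show what ends up inside the blob.

```python
import numpy as np


def make_blob(image, size=(416, 416)):
    """Build a network input blob from a BGR image using OpenCV."""
    import cv2  # imported here because only this helper needs OpenCV

    return cv2.dnn.blobFromImage(
        image, scalefactor=1 / 255.0, size=size,
        mean=(0, 0, 0), swapRB=True, crop=False,
    )


def blob_from_resized(resized_bgr):
    """NumPy-only view of the same steps for an image already at network size."""
    rgb = resized_bgr[:, :, ::-1]             # BGR -> RGB
    scaled = rgb.astype(np.float32) / 255.0   # pixel values into 0..1
    chw = scaled.transpose(2, 0, 1)           # height-width-channel -> channel-height-width
    return chw[np.newaxis, ...]               # add the batch dimension
```

The result is a four-dimensional array laid out as (batch, channels, height, width), which is the layout the network input expects.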

The image is then given to the model and a forward pass is performed, which gives us a list of detections. From this list, a set of bounding-box coordinates is obtained for each detected object. We use a confidence threshold to filter out weak detections; the default value I’ve used is 0.5. The bounding-box coordinates, their class IDs, and their corresponding confidence values are stored in the lists “boxes”, “class_ids”, and “confidences” respectively.
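The filtering step can be sketched as follows. In OpenCV the raw detections would come from net.setInput(blob) followed by outputs = net.forward(output_layers). The function name extract_boxes is hypothetical, and the sketch assumes the usual YOLOv3 row layout (center x, center y, width, height, objectness, then one score per class, with box values normalized to 0..1). Using the best class score as the confidence is a common convention and is also an assumption here.

```python
import numpy as np


def extract_boxes(outputs, width, height, conf_threshold=0.5):
    """Turn raw YOLO output rows into pixel boxes, confidences, and class ids."""
    boxes, confidences, class_ids = [], [], []
    for output in outputs:      # one array per output layer
        for row in output:      # one row per candidate detection
            scores = row[5:]
            class_id = int(np.argmax(scores))
            confidence = float(scores[class_id])
            if confidence <= conf_threshold:
                continue        # drop weak detections
            # Scale the normalized box back to pixel units.
            center_x, center_y = row[0] * width, row[1] * height
            w, h = row[2] * width, row[3] * height
            x, y = center_x - w / 2, center_y - h / 2   # top-left corner
            boxes.append([int(x), int(y), int(w), int(h)])
            confidences.append(confidence)
            class_ids.append(class_id)
    return boxes, confidences, class_ids
```

Boxes are stored as [x, y, width, height] with the top-left corner as the origin, which is the format cv2.dnn.NMSBoxes() accepts.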

Now that we have obtained the locations of the objects in our image, it’s time to draw their bounding boxes and label them. The draw_boxes() function does this for us. One problem we might encounter is that an object is sometimes detected more than once. To avoid this, we employ Non-Maximum Suppression (also known as Non-Maxima Suppression), which is performed by the cv2.dnn.NMSBoxes() function. The default value I’ve used for the NMS threshold is 0.4. Finally, we display the output image using the cv2.imshow() function.

Now that we have seen all the required components, let us glue them together to perform object detection on an image file.
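To make the idea behind NMS concrete, here is a simplified pure-Python sketch of greedy suppression. It is an illustration only, not OpenCV's implementation; in practice the call would be cv2.dnn.NMSBoxes(boxes, confidences, 0.5, 0.4). The names iou, greedy_nms, and draw_kept_boxes are hypothetical, and draw_kept_boxes is only a stand-in for the article's draw_boxes(), whose actual code is not shown here.

```python
def iou(box_a, box_b):
    """Intersection over union of two [x, y, w, h] boxes."""
    ax, ay, aw, ah = box_a
    bx, by, bw, bh = box_b
    inter_w = min(ax + aw, bx + bw) - max(ax, bx)
    inter_h = min(ay + ah, by + bh) - max(ay, by)
    if inter_w <= 0 or inter_h <= 0:
        return 0.0
    inter = inter_w * inter_h
    return inter / (aw * ah + bw * bh - inter)


def greedy_nms(boxes, confidences, conf_threshold=0.5, nms_threshold=0.4):
    """Return indices of the boxes kept after greedy non-maximum suppression."""
    candidates = [i for i, c in enumerate(confidences) if c > conf_threshold]
    candidates.sort(key=lambda i: confidences[i], reverse=True)
    kept = []
    for i in candidates:
        # Keep a box only if it does not overlap too much with a stronger one.
        if all(iou(boxes[i], boxes[k]) <= nms_threshold for k in kept):
            kept.append(i)
    return kept


def draw_kept_boxes(image, boxes, confidences, class_ids, classes, kept):
    """Draw a rectangle and a label for every box that survived suppression."""
    import cv2  # imported here because only this helper needs OpenCV

    for i in kept:
        x, y, w, h = boxes[i]
        label = f"{classes[class_ids[i]]}: {confidences[i]:.2f}"
        cv2.rectangle(image, (x, y), (x + w, y + h), (0, 255, 0), 2)
        cv2.putText(image, label, (x, y - 5),
                    cv2.FONT_HERSHEY_SIMPLEX, 0.5, (0, 255, 0), 2)
    return image
```

Two heavily overlapping boxes around the same person collapse to the one with the higher confidence, while a box elsewhere in the image is left alone.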

We can perform the same task on video files as well as on webcam streams.
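One way to handle video is to read frames one at a time and treat each frame like a single image. The generator below is a minimal sketch under that assumption. The name iter_frames and the open_capture parameter are hypothetical; by default it uses cv2.VideoCapture, which accepts either a file path or a camera index such as 0.

```python
def iter_frames(source=0, open_capture=None):
    """Yield frames from a video file path or a webcam index until the stream ends."""
    if open_capture is None:
        import cv2  # only needed when reading a real video source
        open_capture = cv2.VideoCapture
    capture = open_capture(source)
    try:
        while True:
            ok, frame = capture.read()
            if not ok:      # end of file or camera failure
                break
            yield frame
    finally:
        capture.release()   # always free the device or file handle
```

Each yielded frame can go through the same blob, forward-pass, and NMS steps as a still image, and then be shown with cv2.imshow() followed by a short cv2.waitKey(1) so the window refreshes.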

The image on the right shows multiple bounding-boxes around the same person. After employing NMS, we obtain the image on the left as output. The duplicate bounding-boxes have been dealt with.

The entire code for this article, along with a clean interface for you to build upon, can be found in my GitHub repository through the link below.

https://github.com/GSNCodes/YOLOv3_Object_Detection_OpenCV

Recent advances in deep learning have opened up a lot of avenues for research and exploration. If you are thinking of diving deeper into the field, I encourage you to do so. Create and innovate new tech, but do so ethically. I hope you found this article helpful, and I’m glad to be a part of your journey :)

~~ G.SowmiyaNarayanan

P.S.:

This is my first article on Medium, and I am open to any sort of criticism that helps me improve my work so that I can better cater to the needs of explorers such as yourself in the future. Feel free to comment and let me know what you think. You can also connect with me on LinkedIn.

MindBytes:-

“If Nothing Changes, Nothing Changes.”
