深度有趣 | 11 TensorFlow物体检测

栏目: 编程工具 · 发布时间: 7年前

内容简介：TensorFlow提供了用于检测图片或视频中所包含物体的API，详情可参考以下链接物体检测和图片分类不同

TensorFlow提供了用于检测图片或视频中所包含物体的API，详情可参考以下链接

github.com/tensorflow/…

物体检测和图片分类不同

图片分类是将图片分为某一类别，即从多个可能的分类中选择一个，即使可以按照概率输出最可能的多个分类，但理论上的正确答案只有一个
物体检测是检测图片中所出现的全部物体并且用矩形（Anchor Box）进行标注，物体的类别可以包括多种，例如人、车、动物、路标等，即正确答案可以是多个

通过多个例子，了解TensorFlow物体检测API的使用方法

这里使用预训练好的 ssd_mobilenet_v1_coco 模型（Single Shot MultiBox Detector），更多可用的物体检测模型可以参考这里

github.com/tensorflow/…

举个例子

加载库

# -*- coding: utf-8 -*-

import numpy as np
import tensorflow as tf
import matplotlib.pyplot as plt
from PIL import Image

from utils import label_map_util
from utils import visualization_utils as vis_util
复制代码

定义一些常量

PATH_TO_CKPT = 'ssd_mobilenet_v1_coco_2017_11_17/frozen_inference_graph.pb'
PATH_TO_LABELS = 'ssd_mobilenet_v1_coco_2017_11_17/mscoco_label_map.pbtxt'
NUM_CLASSES = 90
复制代码

加载预训练好的模型

detection_graph = tf.Graph()
with detection_graph.as_default():
	od_graph_def = tf.GraphDef()
	with tf.gfile.GFile(PATH_TO_CKPT, 'rb') as fid:
		od_graph_def.ParseFromString(fid.read())
		tf.import_graph_def(od_graph_def, name='')
复制代码

加载分类标签数据

label_map = label_map_util.load_labelmap(PATH_TO_LABELS)
categories = label_map_util.convert_label_map_to_categories(label_map, max_num_classes=NUM_CLASSES, use_display_name=True)
category_index = label_map_util.create_category_index(categories)
复制代码

一个将图片转为数组的辅助函数，以及测试图片路径

def load_image_into_numpy_array(image):
	(im_width, im_height) = image.size
	return np.array(image.getdata()).reshape((im_height, im_width, 3)).astype(np.uint8)
	
TEST_IMAGE_PATHS = ['test_images/image1.jpg', 'test_images/image2.jpg']
复制代码

使用模型进行物体检测

with detection_graph.as_default():
	with tf.Session(graph=detection_graph) as sess:
	    image_tensor = detection_graph.get_tensor_by_name('image_tensor:0')
	    detection_boxes = detection_graph.get_tensor_by_name('detection_boxes:0')
	    detection_scores = detection_graph.get_tensor_by_name('detection_scores:0')
	    detection_classes = detection_graph.get_tensor_by_name('detection_classes:0')
	    num_detections = detection_graph.get_tensor_by_name('num_detections:0')
	    for image_path in TEST_IMAGE_PATHS:
	    	image = Image.open(image_path)
	    	image_np = load_image_into_numpy_array(image)
	    	image_np_expanded = np.expand_dims(image_np, axis=0)
	    	(boxes, scores, classes, num) = sess.run(
	    		[detection_boxes, detection_scores, detection_classes, num_detections], 
	    		feed_dict={image_tensor: image_np_expanded})
	    	
	    	vis_util.visualize_boxes_and_labels_on_image_array(image_np, np.squeeze(boxes), np.squeeze(classes).astype(np.int32), np.squeeze(scores), category_index, use_normalized_coordinates=True, line_thickness=8)
	    	plt.figure(figsize=[12, 8])
	    	plt.imshow(image_np)
	    	plt.show()
复制代码

检测结果如下，第一张图片检测出了两只狗狗

第二张图片检测出了一些人和风筝

摄像头检测

安装 OpenCV ，用于实现和计算机视觉相关的功能，版本为 3.3.0.10

pip install opencv-python opencv-contrib-python -i https://pypi.tuna.tsinghua.edu.cn/simple
复制代码

查看是否安装成功，没有报错即可

import cv2
tracker = cv2.TrackerMedianFlow_create()
复制代码

在以上代码的基础上进行修改

cv2

完整代码如下

# -*- coding: utf-8 -*-

import numpy as np
import tensorflow as tf

from utils import label_map_util
from utils import visualization_utils as vis_util

import cv2
cap = cv2.VideoCapture(0)

PATH_TO_CKPT = 'ssd_mobilenet_v1_coco_2017_11_17/frozen_inference_graph.pb'
PATH_TO_LABELS = 'ssd_mobilenet_v1_coco_2017_11_17/mscoco_label_map.pbtxt'
NUM_CLASSES = 90

detection_graph = tf.Graph()
with detection_graph.as_default():
	od_graph_def = tf.GraphDef()
	with tf.gfile.GFile(PATH_TO_CKPT, 'rb') as fid:
		od_graph_def.ParseFromString(fid.read())
		tf.import_graph_def(od_graph_def, name='')

label_map = label_map_util.load_labelmap(PATH_TO_LABELS)
categories = label_map_util.convert_label_map_to_categories(label_map, max_num_classes=NUM_CLASSES, use_display_name=True)
category_index = label_map_util.create_category_index(categories)

with detection_graph.as_default():
	with tf.Session(graph=detection_graph) as sess:
	    image_tensor = detection_graph.get_tensor_by_name('image_tensor:0')
	    detection_boxes = detection_graph.get_tensor_by_name('detection_boxes:0')
	    detection_scores = detection_graph.get_tensor_by_name('detection_scores:0')
	    detection_classes = detection_graph.get_tensor_by_name('detection_classes:0')
	    num_detections = detection_graph.get_tensor_by_name('num_detections:0')
	    while True:
	    	ret, image_np = cap.read()
	    	image_np = cv2.cvtColor(image_np, cv2.COLOR_BGR2RGB)
	    	image_np_expanded = np.expand_dims(image_np, axis=0)
	    	(boxes, scores, classes, num) = sess.run(
	    		[detection_boxes, detection_scores, detection_classes, num_detections], 
	    		feed_dict={image_tensor: image_np_expanded})
	    	
	    	vis_util.visualize_boxes_and_labels_on_image_array(image_np, np.squeeze(boxes), np.squeeze(classes).astype(np.int32), np.squeeze(scores), category_index, use_normalized_coordinates=True, line_thickness=8)
	    	
	    	cv2.imshow('object detection', cv2.cvtColor(image_np, cv2.COLOR_RGB2BGR))
	    	if cv2.waitKey(25) & 0xFF == ord('q'):
	    		cap.release()
	    		cv2.destroyAllWindows()
	    		break
复制代码

视频检测

使用 cv2 读取视频并获取每一帧图片，然后将检测后的每一帧写入新的视频文件

生成的视频文件只有图像、没有声音，关于音频的处理以及视频和音频的合成，后面再进一步探索

完整代码如下

# -*- coding: utf-8 -*-

import numpy as np
import tensorflow as tf

from utils import label_map_util
from utils import visualization_utils as vis_util

import cv2
cap = cv2.VideoCapture('绝地逃亡.mov')
ret, image_np = cap.read()
out = cv2.VideoWriter('output.mov', -1, cap.get(cv2.CAP_PROP_FPS), (image_np.shape[1], image_np.shape[0]))

PATH_TO_CKPT = 'ssd_mobilenet_v1_coco_2017_11_17/frozen_inference_graph.pb'
PATH_TO_LABELS = 'ssd_mobilenet_v1_coco_2017_11_17/mscoco_label_map.pbtxt'
NUM_CLASSES = 90

detection_graph = tf.Graph()
with detection_graph.as_default():
	od_graph_def = tf.GraphDef()
	with tf.gfile.GFile(PATH_TO_CKPT, 'rb') as fid:
		od_graph_def.ParseFromString(fid.read())
		tf.import_graph_def(od_graph_def, name='')

label_map = label_map_util.load_labelmap(PATH_TO_LABELS)
categories = label_map_util.convert_label_map_to_categories(label_map, max_num_classes=NUM_CLASSES, use_display_name=True)
category_index = label_map_util.create_category_index(categories)

with detection_graph.as_default():
	with tf.Session(graph=detection_graph) as sess:
	    image_tensor = detection_graph.get_tensor_by_name('image_tensor:0')
	    detection_boxes = detection_graph.get_tensor_by_name('detection_boxes:0')
	    detection_scores = detection_graph.get_tensor_by_name('detection_scores:0')
	    detection_classes = detection_graph.get_tensor_by_name('detection_classes:0')
	    num_detections = detection_graph.get_tensor_by_name('num_detections:0')
	    while cap.isOpened():
	    	ret, image_np = cap.read()
	    	if len((np.array(image_np)).shape) == 0:
	    		break

	    	image_np = cv2.cvtColor(image_np, cv2.COLOR_BGR2RGB)
	    	image_np_expanded = np.expand_dims(image_np, axis=0)
	    	
	    	(boxes, scores, classes, num) = sess.run(
	    		[detection_boxes, detection_scores, detection_classes, num_detections], 
	    		feed_dict={image_tensor: image_np_expanded})
	    	
	    	vis_util.visualize_boxes_and_labels_on_image_array(image_np, np.squeeze(boxes), np.squeeze(classes).astype(np.int32), np.squeeze(scores), category_index, use_normalized_coordinates=True, line_thickness=8)
	    	out.write(cv2.cvtColor(image_np, cv2.COLOR_RGB2BGR))
	    	
cap.release()
out.release()
cv2.destroyAllWindows()
复制代码

播放处理好的视频，可以看到很多地方都有相应的检测结果

以上就是本文的全部内容，希望本文的内容对大家的学习或者工作能带来一定的帮助，也希望大家多多支持码农网

查看所有标签

猜你喜欢:

本站部分资源来源于网络，本站转载出于传递更多信息之目的，版权归原作者或者来源机构所有，如转载稿涉及版权问题，请联系我们。

码农书籍

程序员健康指南

Joe Kutner / 陈少芸 / 人民邮电出版社 / 2014-9-20 / 31.60元

本书是为程序员量身制作的健康指南，针对头痛、眼部疲劳、背部疼痛和手腕疼痛等常见的问题，简要介绍了其成因、测试方法，并列出了每天的行动计划，从运动、饮食等方面给出详细指导，帮助程序员在不改变工作方式的情况下轻松拥有健康。本书适合程序员、长期伏案工作的其他人群以及所有关心健康的人士阅读。一起来看看《程序员健康指南》这本书的介绍吧!

码农工具

CSS 压缩/解压工具

在线压缩/解压 CSS 代码

RGB转16进制工具

RGB HEX 互转工具

HEX CMYK 转换工具

HEX CMYK 互转工具