内容简介:TensorFlow提供了用于检测图片或视频中所包含物体的API,详情可参考以下链接物体检测和图片分类不同
TensorFlow提供了用于检测图片或视频中所包含物体的API,详情可参考以下链接
物体检测和图片分类不同
- 图片分类是将图片分为某一类别,即从多个可能的分类中选择一个,即使可以按照概率输出最可能的多个分类,但理论上的正确答案只有一个
- 物体检测是检测图片中所出现的全部物体并且用矩形(Anchor Box)进行标注,物体的类别可以包括多种,例如人、车、动物、路标等,即正确答案可以是多个
通过多个例子,了解TensorFlow物体检测API的使用方法
这里使用预训练好的 ssd_mobilenet_v1_coco
模型(Single Shot MultiBox Detector),更多可用的物体检测模型可以参考这里
举个例子
加载库
# -*- coding: utf-8 -*- import numpy as np import tensorflow as tf import matplotlib.pyplot as plt from PIL import Image from utils import label_map_util from utils import visualization_utils as vis_util 复制代码
定义一些常量
PATH_TO_CKPT = 'ssd_mobilenet_v1_coco_2017_11_17/frozen_inference_graph.pb' PATH_TO_LABELS = 'ssd_mobilenet_v1_coco_2017_11_17/mscoco_label_map.pbtxt' NUM_CLASSES = 90 复制代码
加载预训练好的模型
detection_graph = tf.Graph() with detection_graph.as_default(): od_graph_def = tf.GraphDef() with tf.gfile.GFile(PATH_TO_CKPT, 'rb') as fid: od_graph_def.ParseFromString(fid.read()) tf.import_graph_def(od_graph_def, name='') 复制代码
加载分类标签数据
label_map = label_map_util.load_labelmap(PATH_TO_LABELS) categories = label_map_util.convert_label_map_to_categories(label_map, max_num_classes=NUM_CLASSES, use_display_name=True) category_index = label_map_util.create_category_index(categories) 复制代码
一个将图片转为数组的辅助函数,以及测试图片路径
def load_image_into_numpy_array(image): (im_width, im_height) = image.size return np.array(image.getdata()).reshape((im_height, im_width, 3)).astype(np.uint8) TEST_IMAGE_PATHS = ['test_images/image1.jpg', 'test_images/image2.jpg'] 复制代码
使用模型进行物体检测
with detection_graph.as_default(): with tf.Session(graph=detection_graph) as sess: image_tensor = detection_graph.get_tensor_by_name('image_tensor:0') detection_boxes = detection_graph.get_tensor_by_name('detection_boxes:0') detection_scores = detection_graph.get_tensor_by_name('detection_scores:0') detection_classes = detection_graph.get_tensor_by_name('detection_classes:0') num_detections = detection_graph.get_tensor_by_name('num_detections:0') for image_path in TEST_IMAGE_PATHS: image = Image.open(image_path) image_np = load_image_into_numpy_array(image) image_np_expanded = np.expand_dims(image_np, axis=0) (boxes, scores, classes, num) = sess.run( [detection_boxes, detection_scores, detection_classes, num_detections], feed_dict={image_tensor: image_np_expanded}) vis_util.visualize_boxes_and_labels_on_image_array(image_np, np.squeeze(boxes), np.squeeze(classes).astype(np.int32), np.squeeze(scores), category_index, use_normalized_coordinates=True, line_thickness=8) plt.figure(figsize=[12, 8]) plt.imshow(image_np) plt.show() 复制代码
检测结果如下,第一张图片检测出了两只狗狗
第二张图片检测出了一些人和风筝
摄像头检测
安装 OpenCV
,用于实现和计算机视觉相关的功能,版本为 3.3.0.10
pip install opencv-python opencv-contrib-python -i https://pypi.tuna.tsinghua.edu.cn/simple 复制代码
查看是否安装成功,没有报错即可
import cv2 tracker = cv2.TrackerMedianFlow_create() 复制代码
在以上代码的基础上进行修改
cv2
完整代码如下
# -*- coding: utf-8 -*- import numpy as np import tensorflow as tf from utils import label_map_util from utils import visualization_utils as vis_util import cv2 cap = cv2.VideoCapture(0) PATH_TO_CKPT = 'ssd_mobilenet_v1_coco_2017_11_17/frozen_inference_graph.pb' PATH_TO_LABELS = 'ssd_mobilenet_v1_coco_2017_11_17/mscoco_label_map.pbtxt' NUM_CLASSES = 90 detection_graph = tf.Graph() with detection_graph.as_default(): od_graph_def = tf.GraphDef() with tf.gfile.GFile(PATH_TO_CKPT, 'rb') as fid: od_graph_def.ParseFromString(fid.read()) tf.import_graph_def(od_graph_def, name='') label_map = label_map_util.load_labelmap(PATH_TO_LABELS) categories = label_map_util.convert_label_map_to_categories(label_map, max_num_classes=NUM_CLASSES, use_display_name=True) category_index = label_map_util.create_category_index(categories) with detection_graph.as_default(): with tf.Session(graph=detection_graph) as sess: image_tensor = detection_graph.get_tensor_by_name('image_tensor:0') detection_boxes = detection_graph.get_tensor_by_name('detection_boxes:0') detection_scores = detection_graph.get_tensor_by_name('detection_scores:0') detection_classes = detection_graph.get_tensor_by_name('detection_classes:0') num_detections = detection_graph.get_tensor_by_name('num_detections:0') while True: ret, image_np = cap.read() image_np = cv2.cvtColor(image_np, cv2.COLOR_BGR2RGB) image_np_expanded = np.expand_dims(image_np, axis=0) (boxes, scores, classes, num) = sess.run( [detection_boxes, detection_scores, detection_classes, num_detections], feed_dict={image_tensor: image_np_expanded}) vis_util.visualize_boxes_and_labels_on_image_array(image_np, np.squeeze(boxes), np.squeeze(classes).astype(np.int32), np.squeeze(scores), category_index, use_normalized_coordinates=True, line_thickness=8) cv2.imshow('object detection', cv2.cvtColor(image_np, cv2.COLOR_RGB2BGR)) if cv2.waitKey(25) & 0xFF == ord('q'): cap.release() cv2.destroyAllWindows() break 复制代码
视频检测
使用 cv2
读取视频并获取每一帧图片,然后将检测后的每一帧写入新的视频文件
生成的视频文件只有图像、没有声音,关于音频的处理以及视频和音频的合成,后面再进一步探索
完整代码如下
# -*- coding: utf-8 -*- import numpy as np import tensorflow as tf from utils import label_map_util from utils import visualization_utils as vis_util import cv2 cap = cv2.VideoCapture('绝地逃亡.mov') ret, image_np = cap.read() out = cv2.VideoWriter('output.mov', -1, cap.get(cv2.CAP_PROP_FPS), (image_np.shape[1], image_np.shape[0])) PATH_TO_CKPT = 'ssd_mobilenet_v1_coco_2017_11_17/frozen_inference_graph.pb' PATH_TO_LABELS = 'ssd_mobilenet_v1_coco_2017_11_17/mscoco_label_map.pbtxt' NUM_CLASSES = 90 detection_graph = tf.Graph() with detection_graph.as_default(): od_graph_def = tf.GraphDef() with tf.gfile.GFile(PATH_TO_CKPT, 'rb') as fid: od_graph_def.ParseFromString(fid.read()) tf.import_graph_def(od_graph_def, name='') label_map = label_map_util.load_labelmap(PATH_TO_LABELS) categories = label_map_util.convert_label_map_to_categories(label_map, max_num_classes=NUM_CLASSES, use_display_name=True) category_index = label_map_util.create_category_index(categories) with detection_graph.as_default(): with tf.Session(graph=detection_graph) as sess: image_tensor = detection_graph.get_tensor_by_name('image_tensor:0') detection_boxes = detection_graph.get_tensor_by_name('detection_boxes:0') detection_scores = detection_graph.get_tensor_by_name('detection_scores:0') detection_classes = detection_graph.get_tensor_by_name('detection_classes:0') num_detections = detection_graph.get_tensor_by_name('num_detections:0') while cap.isOpened(): ret, image_np = cap.read() if len((np.array(image_np)).shape) == 0: break image_np = cv2.cvtColor(image_np, cv2.COLOR_BGR2RGB) image_np_expanded = np.expand_dims(image_np, axis=0) (boxes, scores, classes, num) = sess.run( [detection_boxes, detection_scores, detection_classes, num_detections], feed_dict={image_tensor: image_np_expanded}) vis_util.visualize_boxes_and_labels_on_image_array(image_np, np.squeeze(boxes), np.squeeze(classes).astype(np.int32), np.squeeze(scores), category_index, use_normalized_coordinates=True, line_thickness=8) out.write(cv2.cvtColor(image_np, cv2.COLOR_RGB2BGR)) cap.release() out.release() cv2.destroyAllWindows() 复制代码
播放处理好的视频,可以看到很多地方都有相应的检测结果
以上就是本文的全部内容,希望本文的内容对大家的学习或者工作能带来一定的帮助,也希望大家多多支持 码农网
猜你喜欢:- ICLR 2019 | 骑驴找马:利用深度强化学习模型定位新物体
- 深度学习在 iOS 上的实践 —— 通过 YOLO 在 iOS 上实现实时物体检测
- 旷视发布通用物体检测数据集 Objects365,开启 CVPR 物体检测挑战赛
- c++ 投掷活动物体
- WebGL之物体选择
- 简易版物体识别
本站部分资源来源于网络,本站转载出于传递更多信息之目的,版权归原作者或者来源机构所有,如转载稿涉及版权问题,请联系我们。
Web开发秘方
Brian P. Hogan、Chris Warren、Mike Weber、Chris Johnson、Aaron Godin / 七印部落 / 华中科技大学出版社 / 2013-7-10 / 66.00元
猜猜硅谷的前端工程师怎么折腾JS的?想知道无限下拉的列表怎么做吗?你知道DropBox可以当Web服务器用吗?你知道怎么做出跨平台的幻灯片效果吗?不借助插件,怎样在移动设备上实现动画效果?怎样快速搭建和测试HTML电子邮箱?怎样制作跨PC和移动设备显示的应用界面?怎样利用最新的JavaScript框架(Backbone和Knockout)提高应用的响应速度?怎样有效利用CoffeeScript和S......一起来看看 《Web开发秘方》 这本书的介绍吧!