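Summary: how to generate raw image files (.png or .jpg) from commonly used image datasets
Computer vision work often calls for many image datasets. A famous dataset such as ImageNet is more than my hundred-odd gigabytes of disk can hold, so this post instead shows how to generate some "Hello world"-level image datasets and rounds up related image data resources.
This post covers:
- how to generate .png or .jpg versions of datasets such as MNIST, CIFAR-10 and CIFAR-100;
- how to write a script that generates the image data and sorts it into classes automatically according to a label file;
- how to generate these datasets with the DIGITS tool;
- how to inspect the structure of .h5 data files;
- download addresses for the related datasets.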
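Generating image data
MNIST, CIFAR-10, CIFAR-100 and similar datasets are all described on their official sites; the official addresses of these datasets are given here:
As the official descriptions show, most of these datasets are distributed in binary form or in Python and MATLAB formats. Sometimes we need the raw image files instead, and then we have to generate them ourselves, either with code or with the help of other tools.
There are many code-based approaches online, but their quality is uneven. Most require you to decode the raw data yourself according to the data format documented on the official site; since they are easy to find online, I will not walk through them here. Instead, this post introduces some simpler and quicker ways to get the raw image data.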
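For readers who do want the decode-it-yourself route mentioned above, here is a minimal sketch for MNIST using only the Python standard library. It assumes the gzipped IDX files from the official MNIST site (big-endian header, magic number 2051 for images and 2049 for labels) and writes 8-bit grayscale PNGs by hand. The function names (`read_idx_images`, `gray_png`, `export_mnist`, and so on) and the output layout `out_dir/<label>/<index>.png` are my own illustrative choices, not part of any library.

```python
import gzip
import os
import struct
import zlib


def read_idx_images(data):
    """Parse IDX3 bytes into (rows, cols, [pixel bytes per image])."""
    magic, count, rows, cols = struct.unpack(">IIII", data[:16])
    if magic != 2051:  # 0x00000803: unsigned bytes, 3 dimensions
        raise ValueError("not an IDX3 image file")
    size = rows * cols
    return rows, cols, [data[16 + i * size:16 + (i + 1) * size] for i in range(count)]


def read_idx_labels(data):
    """Parse IDX1 bytes into a list of integer labels."""
    magic, count = struct.unpack(">II", data[:8])
    if magic != 2049:  # 0x00000801: unsigned bytes, 1 dimension
        raise ValueError("not an IDX1 label file")
    return list(data[8:8 + count])


def png_chunk(tag, payload):
    """Build one PNG chunk: length, type, data, CRC over type + data."""
    crc = zlib.crc32(tag + payload) & 0xFFFFFFFF
    return struct.pack(">I", len(payload)) + tag + payload + struct.pack(">I", crc)


def gray_png(rows, cols, pixels):
    """Encode one 8-bit grayscale image as PNG bytes."""
    # IHDR: width, height, bit depth 8, color type 0 (grayscale), no interlace
    ihdr = struct.pack(">IIBBBBB", cols, rows, 8, 0, 0, 0, 0)
    # Every scanline starts with filter type 0 (no filtering).
    raw = b"".join(b"\x00" + pixels[r * cols:(r + 1) * cols] for r in range(rows))
    return (b"\x89PNG\r\n\x1a\n" + png_chunk(b"IHDR", ihdr)
            + png_chunk(b"IDAT", zlib.compress(raw)) + png_chunk(b"IEND", b""))


def export_mnist(image_gz, label_gz, out_dir):
    """Write out_dir/<label>/<index>.png per image; return how many were written."""
    with gzip.open(image_gz, "rb") as f:
        rows, cols, images = read_idx_images(f.read())
    with gzip.open(label_gz, "rb") as f:
        labels = read_idx_labels(f.read())
    written = 0
    for index, (pixels, label) in enumerate(zip(images, labels)):
        class_dir = os.path.join(out_dir, str(label))
        os.makedirs(class_dir, exist_ok=True)
        with open(os.path.join(class_dir, f"{index}.png"), "wb") as f:
            f.write(gray_png(rows, cols, pixels))
        written += 1
    return written
```

A call would look something like `export_mnist("train-images-idx3-ubyte.gz", "train-labels-idx1-ubyte.gz", "mnist_png")`, assuming the two archives have been downloaded under those names. With an imaging library installed, the hand-written PNG encoder could of course be swapped for that library's save routine.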
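CIFAR-10 image data
I generated this part from the CIFAR-10 dataset provided on the Kaggle CIFAR-10 page. The data there is already in .png format, which suits our needs, but there is one problem: the images are all mixed together in the train directory rather than sorted into the original 10 classes. Fortunately a class-mapping file, trainLabels.csv, is provided. So the first problem to solve is to split the images into 10 classes automatically according to this mapping file and store them in 10 directories.
Here is my code: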
# coding: utf-8
import csv
import os
import shutil


# Return the file name without its ".png" suffix (None for other files)
def getImageFilePre(filename):
    if filename.endswith(".png"):
        return filename.split(".")[0]
    return None


# string -> int
def str2Int(stringValue):
    return int(stringValue)


# int -> string
def int2Str(intValue):
    return str(intValue)


# Rename the images in every class directory to <class>.<n>.png
def fileRename(dirPath):
    # os.walk yields three values per directory:
    # 1. the parent directory
    # 2. the names of its subdirectories (without path)
    # 3. the names of its files
    for parent, dirnames, filenames in os.walk(dirPath):
        for dirname in dirnames:
            count = 1
            newTmpPath = os.path.join(parent, dirname)
            for curFile in os.listdir(newTmpPath):
                if curFile.endswith(".png"):
                    newName = dirname + "." + int2Str(count) + ".png"
                    count = count + 1
                    shutil.move(os.path.join(newTmpPath, curFile),
                                os.path.join(newTmpPath, newName))
                    print(curFile + " -> " + newName + " ------> OK!")


def main():
    # Source directory with the unsorted images and target directory
    # for the per-class folders (placeholder paths)
    dirPath = "F:\\xxxxx\\data_origin\\train_200"
    baseDirPath = "F:\\xxxxx\\data_origin\\train_with_class"

    # Read the label file into a list; row 0 is the header
    with open('trainLabels.csv', newline='') as csvfile:
        reader = list(csv.reader(csvfile))

    # List the images and sort them numerically by file name
    dirContents = [f for f in os.listdir(dirPath) if f.endswith(".png")]
    dirContents.sort(key=lambda x: int(x[:-4]))

    # Label rows 1..N line up with images 0..N-1 (N = 50000 for the full set)
    totalFiles = min(len(reader), len(dirContents) + 1)
    for num in range(1, totalFiles):
        labelID = reader[num][0]
        labelName = reader[num][1]
        imageFilename = dirContents[num - 1]
        tmpFilePre = getImageFilePre(imageFilename)
        if str2Int(labelID) == str2Int(tmpFilePre):
            print("labelID == filePre !!!")
            new_dir_path = os.path.join(baseDirPath, labelName)
            if not os.path.isdir(new_dir_path):
                os.makedirs(new_dir_path)
                print(new_dir_path + " created!")
            else:
                print(new_dir_path + " already exists!")
            shutil.copy(os.path.join(dirPath, imageFilename), new_dir_path)
            print(">>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>")

    fileRename(baseDirPath)


if __name__ == '__main__':
    main()
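This splits the images into 10 classes, stored in separate directories by class, with 5,000 images per class. On my Windows machine the run took 1.5 hours (including the file renaming), which is admittedly rather slow. The figure below shows the final result:
[Figure: final result]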
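As a possible simplification, the sorting and renaming can be done in a single pass by looking each image up by its id instead of listing and sorting the directory first. The sketch below is not the code used above: it assumes trainLabels.csv has a header row with columns named `id` and `label`, and that each image is stored as `<id>.png`. The function name `sort_by_label` and its parameters are hypothetical.

```python
import csv
import os
import shutil
from collections import Counter


def sort_by_label(csv_path, src_dir, dst_dir):
    """Copy src_dir/<id>.png to dst_dir/<label>/<label>.<n>.png in one pass."""
    counts = Counter()  # images copied so far, per class
    with open(csv_path, newline="") as f:
        # Assumption: the header row names the columns "id" and "label".
        for row in csv.DictReader(f):
            src = os.path.join(src_dir, row["id"] + ".png")
            if not os.path.isfile(src):
                continue  # label without a matching image: skip it
            label = row["label"]
            counts[label] += 1
            class_dir = os.path.join(dst_dir, label)
            os.makedirs(class_dir, exist_ok=True)
            new_name = f"{label}.{counts[label]}.png"
            shutil.copy(src, os.path.join(class_dir, new_name))
    return counts
```

Called as `sort_by_label("trainLabels.csv", train_dir, out_dir)`, it returns the number of images copied per class, which makes it easy to check that each class ended up with the expected count. Copying directly under the final name also avoids the separate renaming pass.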
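Generating image datasets with DIGITS, the graphical tool for Caffe
For detailed usage, see this blog post: http://www.cnblogs.com/denny402/p/5136155.html
This requires installing Caffe and DIGITS. The tool can directly produce image data that is already sorted by class, and it is fast, so it is worth a try.
.h5 file structure viewer
When working with convolutional neural networks we often save .h5 data files, and sometimes we need to make use of them later. In transfer learning, for example, we need to know the layout of the .h5 file in order to freeze layers.
Besides peeking at the structure of an .h5 file with your own code, is there a quicker tool? Yes: MATLAB provides a ready-made function. Its documentation is here: http://cn.mathworks.com/help/matlab/ref/h5disp.html
For example, we can use a MATLAB command to view the weight structure of a VGG16 model: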
>> h5disp('vgg16_weights.h5')
The output is as follows:
>> h5disp('vgg16_weights.h5')
HDF5 vgg16_weights.h5
Group '/'
    Attributes:
        'nb_layers': 37
    Group '/layer_0'
        Attributes:
            'nb_params': 0
    Group '/layer_1'
        Attributes:
            'nb_params': 2
        Dataset 'param_0'
            Size: 3x3x3x64
            MaxSize: 3x3x3x64
            Datatype: H5T_IEEE_F32LE (single)
            ChunkSize: []
            Filters: none
            FillValue: 0.000000
        Dataset 'param_1'
            Size: 64
            MaxSize: 64
            Datatype: H5T_IEEE_F32LE (single)
            ChunkSize: []
            Filters: none
            FillValue: 0.000000
    Group '/layer_10'
        Attributes:
            'nb_params': 0
    Group '/layer_11'
        Attributes:
            'nb_params': 2
        Dataset 'param_0'
            Size: 3x3x128x256
            MaxSize: 3x3x128x256
            Datatype: H5T_IEEE_F32LE (single)
            ChunkSize: []
            Filters: none
            FillValue: 0.000000
        Dataset 'param_1'
            Size: 256
            MaxSize: 256
            Datatype: H5T_IEEE_F32LE (single)
            ChunkSize: []
            Filters: none
            FillValue: 0.000000
    Group '/layer_12'
        Attributes:
            'nb_params': 0
    Group '/layer_13'
        Attributes:
            'nb_params': 2
        Dataset 'param_0'
            Size: 3x3x256x256
            MaxSize: 3x3x256x256
            Datatype: H5T_IEEE_F32LE (single)
            ChunkSize: []
            Filters: none
            FillValue: 0.000000
        Dataset 'param_1'
            Size: 256
            MaxSize: 256
            Datatype: H5T_IEEE_F32LE (single)
            ChunkSize: []
            Filters: none
            FillValue: 0.000000
    Group '/layer_14'
        Attributes:
            'nb_params': 0
    Group '/layer_15'
        Attributes:
            'nb_params': 2
        Dataset 'param_0'
            Size: 3x3x256x256
            MaxSize: 3x3x256x256
            Datatype: H5T_IEEE_F32LE (single)
            ChunkSize: []
            Filters: none
            FillValue: 0.000000
        Dataset 'param_1'
            Size: 256
            MaxSize: 256
            Datatype: H5T_IEEE_F32LE (single)
            ChunkSize: []
            Filters: none
            FillValue: 0.000000
    Group '/layer_16'
        Attributes:
            'nb_params': 0
    Group '/layer_17'
        Attributes:
            'nb_params': 0
    Group '/layer_18'
        Attributes:
            'nb_params': 2
        Dataset 'param_0'
            Size: 3x3x256x512
            MaxSize: 3x3x256x512
            Datatype: H5T_IEEE_F32LE (single)
            ChunkSize: []
            Filters: none
            FillValue: 0.000000
        Dataset 'param_1'
            Size: 512
            MaxSize: 512
            Datatype: H5T_IEEE_F32LE (single)
            ChunkSize: []
            Filters: none
            FillValue: 0.000000
    Group '/layer_19'
        Attributes:
            'nb_params': 0
    Group '/layer_2'
        Attributes:
            'nb_params': 0
    Group '/layer_20'
        Attributes:
            'nb_params': 2
        Dataset 'param_0'
            Size: 3x3x512x512
            MaxSize: 3x3x512x512
            Datatype: H5T_IEEE_F32LE (single)
            ChunkSize: []
            Filters: none
            FillValue: 0.000000
        Dataset 'param_1'
            Size: 512
            MaxSize: 512
            Datatype: H5T_IEEE_F32LE (single)
            ChunkSize: []
            Filters: none
            FillValue: 0.000000
    Group '/layer_21'
        Attributes:
            'nb_params': 0
    Group '/layer_22'
        Attributes:
            'nb_params': 2
        Dataset 'param_0'
            Size: 3x3x512x512
            MaxSize: 3x3x512x512
            Datatype: H5T_IEEE_F32LE (single)
            ChunkSize: []
            Filters: none
            FillValue: 0.000000
        Dataset 'param_1'
            Size: 512
            MaxSize: 512
            Datatype: H5T_IEEE_F32LE (single)
            ChunkSize: []
            Filters: none
            FillValue: 0.000000
    Group '/layer_23'
        Attributes:
            'nb_params': 0
    Group '/layer_24'
        Attributes:
            'nb_params': 0
    Group '/layer_25'
        Attributes:
            'nb_params': 2
        Dataset 'param_0'
            Size: 3x3x512x512
            MaxSize: 3x3x512x512
            Datatype: H5T_IEEE_F32LE (single)
            ChunkSize: []
            Filters: none
            FillValue: 0.000000
        Dataset 'param_1'
            Size: 512
            MaxSize: 512
            Datatype: H5T_IEEE_F32LE (single)
            ChunkSize: []
            Filters: none
            FillValue: 0.000000
    Group '/layer_26'
        Attributes:
            'nb_params': 0
    Group '/layer_27'
        Attributes:
            'nb_params': 2
        Dataset 'param_0'
            Size: 3x3x512x512
            MaxSize: 3x3x512x512
            Datatype: H5T_IEEE_F32LE (single)
            ChunkSize: []
            Filters: none
            FillValue: 0.000000
        Dataset 'param_1'
            Size: 512
            MaxSize: 512
            Datatype: H5T_IEEE_F32LE (single)
            ChunkSize: []
            Filters: none
            FillValue: 0.000000
    Group '/layer_28'
        Attributes:
            'nb_params': 0
    Group '/layer_29'
        Attributes:
            'nb_params': 2
        Dataset 'param_0'
            Size: 3x3x512x512
            MaxSize: 3x3x512x512
            Datatype: H5T_IEEE_F32LE (single)
            ChunkSize: []
            Filters: none
            FillValue: 0.000000
        Dataset 'param_1'
            Size: 512
            MaxSize: 512
            Datatype: H5T_IEEE_F32LE (single)
            ChunkSize: []
            Filters: none
            FillValue: 0.000000
    Group '/layer_3'
        Attributes:
            'nb_params': 2
        Dataset 'param_0'
            Size: 3x3x64x64
            MaxSize: 3x3x64x64
            Datatype: H5T_IEEE_F32LE (single)
            ChunkSize: []
            Filters: none
            FillValue: 0.000000
        Dataset 'param_1'
            Size: 64
            MaxSize: 64
            Datatype: H5T_IEEE_F32LE (single)
            ChunkSize: []
            Filters: none
            FillValue: 0.000000
    Group '/layer_30'
        Attributes:
            'nb_params': 0
    Group '/layer_31'
        Attributes:
            'nb_params': 0
    Group '/layer_32'
        Attributes:
            'nb_params': 2
        Dataset 'param_0'
            Size: 4096x25088
            MaxSize: 4096x25088
            Datatype: H5T_IEEE_F32LE (single)
            ChunkSize: []
            Filters: none
            FillValue: 0.000000
        Dataset 'param_1'
            Size: 4096
            MaxSize: 4096
            Datatype: H5T_IEEE_F32LE (single)
            ChunkSize: []
            Filters: none
            FillValue: 0.000000
    Group '/layer_33'
        Attributes:
            'nb_params': 0
    Group '/layer_34'
        Attributes:
            'nb_params': 2
        Dataset 'param_0'
            Size: 4096x4096
            MaxSize: 4096x4096
            Datatype: H5T_IEEE_F32LE (single)
            ChunkSize: []
            Filters: none
            FillValue: 0.000000
        Dataset 'param_1'
            Size: 4096
            MaxSize: 4096
            Datatype: H5T_IEEE_F32LE (single)
            ChunkSize: []
            Filters: none
            FillValue: 0.000000
    Group '/layer_35'
        Attributes:
            'nb_params': 0
    Group '/layer_36'
        Attributes:
            'nb_params': 2
        Dataset 'param_0'
            Size: 1000x4096
            MaxSize: 1000x4096
            Datatype: H5T_IEEE_F32LE (single)
            ChunkSize: []
            Filters: none
            FillValue: 0.000000
        Dataset 'param_1'
            Size: 1000
            MaxSize: 1000
            Datatype: H5T_IEEE_F32LE (single)
            ChunkSize: []
            Filters: none
            FillValue: 0.000000
    Group '/layer_4'
        Attributes:
            'nb_params': 0
    Group '/layer_5'
        Attributes:
            'nb_params': 0
    Group '/layer_6'
        Attributes:
            'nb_params': 2
        Dataset 'param_0'
            Size: 3x3x64x128
            MaxSize: 3x3x64x128
            Datatype: H5T_IEEE_F32LE (single)
            ChunkSize: []
            Filters: none
            FillValue: 0.000000
        Dataset 'param_1'
            Size: 128
            MaxSize: 128
            Datatype: H5T_IEEE_F32LE (single)
            ChunkSize: []
            Filters: none
            FillValue: 0.000000
    Group '/layer_7'
        Attributes:
            'nb_params': 0
    Group '/layer_8'
        Attributes:
            'nb_params': 2
        Dataset 'param_0'
            Size: 3x3x128x128
            MaxSize: 3x3x128x128
            Datatype: H5T_IEEE_F32LE (single)
            ChunkSize: []
            Filters: none
            FillValue: 0.000000
        Dataset 'param_1'
            Size: 128
            MaxSize: 128
            Datatype: H5T_IEEE_F32LE (single)
            ChunkSize: []
            Filters: none
            FillValue: 0.000000
    Group '/layer_9'
        Attributes:
            'nb_params': 0
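For readers without MATLAB, the "peek with your own code" route can be sketched in Python. The example below assumes the third-party h5py package is installed (it is not used anywhere in this post) and walks every group and dataset with `visititems`, collecting names, shapes, dtypes and attributes. The function name `describe_h5` is hypothetical.

```python
import h5py  # third-party package; assumed to be installed


def describe_h5(path):
    """Return text lines describing every group and dataset in an HDF5 file."""
    lines = []

    def visit(name, obj):
        # Called once per group or dataset; name is relative to the root.
        if isinstance(obj, h5py.Dataset):
            lines.append(f"Dataset '/{name}' shape={obj.shape} dtype={obj.dtype}")
        else:
            lines.append(f"Group '/{name}'")
        for key, value in obj.attrs.items():
            lines.append(f"    attr {key!r}: {value}")

    with h5py.File(path, "r") as f:
        # visititems does not visit the root group itself, so list its
        # attributes separately.
        for key, value in f.attrs.items():
            lines.append(f"attr {key!r}: {value}")
        f.visititems(visit)
    return lines
```

Printing the lines returned by `describe_h5('vgg16_weights.h5')` should list the same groups and datasets as the h5disp output, although the dimensions of each shape may appear in reverse order, since MATLAB reports sizes in column-major order. As in the listing above, groups come back in name order (layer_0, layer_1, layer_10, ...), so sort by the numeric suffix if you need the layers in network order.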