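How to Read Data Efficiently with TensorFlow

Introduction

How do you read data efficiently with TensorFlow? This article walks through the problem and one solution in detail, in the hope that it helps anyone facing the same question find a simpler, more practical approach.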

TFRecords

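TFRecords is a binary file format. It is harder to inspect than other formats, but it makes better use of memory, is easier to copy and move, and does not need a separate label file (you will see why shortly). In short, the format has a lot going for it, so let's put it to use.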

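A TFRecords file contains tf.train.Example protocol buffers (each of which holds a Features field). You can write a short program that fetches your data, fills it into an Example protocol buffer, serializes the protocol buffer to a string, and writes it to a TFRecords file with tf.python_io.TFRecordWriter.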

To read data back from a TFRecords file, use tf.TFRecordReader together with the tf.parse_single_example parser. This op parses an Example protocol buffer into tensors.
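To make "binary file of serialized strings" more concrete, here is a minimal pure-Python sketch of how each serialized record is framed on disk, assuming the layout described in the TFRecord format documentation: a little-endian uint64 length, a masked CRC32C of that length, the payload bytes, and a masked CRC32C of the payload. The helper names (crc32c, masked_crc, frame_record, iter_records) are made up for this illustration and are not TensorFlow APIs; in practice TFRecordWriter and the reader do this for you.

```python
import struct

_CRC32C_POLY = 0x82F63B78  # reflected Castagnoli polynomial


def _build_table():
    """Precompute the 256-entry lookup table for byte-wise CRC32C."""
    table = []
    for byte in range(256):
        crc = byte
        for _ in range(8):
            crc = (crc >> 1) ^ _CRC32C_POLY if crc & 1 else crc >> 1
        table.append(crc)
    return table


_TABLE = _build_table()


def crc32c(data):
    """Plain CRC32C of a bytes object."""
    crc = 0xFFFFFFFF
    for byte in data:
        crc = _TABLE[(crc ^ byte) & 0xFF] ^ (crc >> 8)
    return crc ^ 0xFFFFFFFF


def masked_crc(data):
    """CRC32C rotated right by 15 bits plus a constant (assumed TFRecord mask)."""
    crc = crc32c(data)
    rotated = ((crc >> 15) | (crc << 17)) & 0xFFFFFFFF
    return (rotated + 0xA282EAD8) & 0xFFFFFFFF


def frame_record(payload):
    """Wrap one serialized payload: length, length CRC, data, data CRC."""
    length = struct.pack("<Q", len(payload))
    return (length + struct.pack("<I", masked_crc(length))
            + payload + struct.pack("<I", masked_crc(payload)))


def iter_records(blob):
    """Yield payloads from concatenated framed records, checking both CRCs."""
    offset = 0
    while offset < len(blob):
        length_bytes = blob[offset:offset + 8]
        (length,) = struct.unpack("<Q", length_bytes)
        (length_crc,) = struct.unpack("<I", blob[offset + 8:offset + 12])
        if length_crc != masked_crc(length_bytes):
            raise ValueError("corrupt length field")
        payload = blob[offset + 12:offset + 12 + length]
        (data_crc,) = struct.unpack(
            "<I", blob[offset + 12 + length:offset + 16 + length])
        if data_crc != masked_crc(payload):
            raise ValueError("corrupt payload")
        yield payload
        offset += 16 + length


# Two stand-in payloads; a real file would hold Example.SerializeToString() output.
blob = frame_record(b"first example") + frame_record(b"second example")
records = list(iter_records(blob))
```

Each record carries 16 bytes of framing on top of its payload, which is why a file of many small records is a little larger than the raw data it stores.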

Now let's start reading some data.

Generating a TFRecords file

We use tf.train.Example to define the format of the data we want to store, and tf.python_io.TFRecordWriter to write it.

    import os
    import tensorflow as tf
    from PIL import Image

    cwd = os.getcwd()

    '''
    The data directory I load here looks like this:
    0 -- img1.jpg
         img2.jpg
         img3.jpg
         ...
    1 -- img1.jpg
         img2.jpg
         ...
    2 -- ...
    Here 0, 1, 2... are the categories, i.e. the `classes` used below.
    `classes` is a list I defined for my own data; adapt it to yours.
    ...
    '''
    classes = ["0", "1", "2"]  # placeholder: one sub-directory name per category

    writer = tf.python_io.TFRecordWriter("train.tfrecords")
    for index, name in enumerate(classes):
        class_path = os.path.join(cwd, name)
        for img_name in os.listdir(class_path):
            img_path = os.path.join(class_path, img_name)
            # Force 3 channels so the bytes match the later [224, 224, 3] reshape
            img = Image.open(img_path).convert("RGB")
            img = img.resize((224, 224))
            img_raw = img.tobytes()  # convert the image to raw bytes
            example = tf.train.Example(features=tf.train.Features(feature={
                "label": tf.train.Feature(int64_list=tf.train.Int64List(value=[index])),
                "img_raw": tf.train.Feature(bytes_list=tf.train.BytesList(value=[img_raw])),
            }))
            writer.write(example.SerializeToString())  # serialize to a string
    writer.close()
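The img_raw feature is just a flat run of bytes, so it may help to see how pixels are laid out inside it. The sketch below uses a tiny synthetic 2x3 buffer in place of a real image and assumes the row-major, channel-interleaved order that an RGB image's tobytes() produces; flat_index is a hypothetical helper written for this illustration only.

```python
def flat_index(row, col, channel, width, channels=3):
    """Offset of one channel value inside a row-major, interleaved buffer."""
    return (row * width + col) * channels + channel


height, width, channels = 2, 3, 3

# Synthetic stand-in for img.tobytes(): pixel (row, col) is (10*row + col, 100, 200).
raw = bytes(
    value
    for row in range(height)
    for col in range(width)
    for value in (10 * row + col, 100, 200)
)

# Rebuild the nested [height][width][channels] structure from the flat bytes.
pixels = [
    [
        list(raw[flat_index(row, col, 0, width):
                 flat_index(row, col, 0, width) + channels])
        for col in range(width)
    ]
    for row in range(height)
]

# Size of the raw buffer for one 224x224 RGB image, as stored per example.
expected_bytes_224 = 224 * 224 * 3
```

Because the buffer carries no shape information of its own, whoever reads it back has to know the height, width, and channel count used when it was written.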

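For the definitions and details of Example and Feature, I recommend checking the API documentation on the official site.

Basically, an Example contains a Features message, and Features contains a dictionary of Feature entries (no trailing s here). Finally, each Feature holds a FloatList, a BytesList, or an Int64List.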

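That is how we store all the relevant information in a single file, which is why I said earlier that no separate label file is needed. Reading it back is also easy.

Here is a small reading example: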

    for serialized_example in tf.python_io.tf_record_iterator("train.tfrecords"):
        example = tf.train.Example()
        example.ParseFromString(serialized_example)

        # Keys must match the ones used when the file was written
        image = example.features.feature["img_raw"].bytes_list.value
        label = example.features.feature["label"].int64_list.value
        # you can do some preprocessing here
        print(image, label)

Reading with a queue

Once the TFRecords file has been generated, TensorFlow reads it through a queue so that the data can be loaded efficiently.

    def read_and_decode(filename):
        # Build a queue from the file name
        filename_queue = tf.train.string_input_producer([filename])

        reader = tf.TFRecordReader()
        # read() returns the record key and the serialized record
        _, serialized_example = reader.read(filename_queue)
        features = tf.parse_single_example(
            serialized_example,
            features={
                "label": tf.FixedLenFeature([], tf.int64),
                "img_raw": tf.FixedLenFeature([], tf.string),
            })

        img = tf.decode_raw(features["img_raw"], tf.uint8)
        img = tf.reshape(img, [224, 224, 3])
        # Scale pixel values from [0, 255] to [-0.5, 0.5]
        img = tf.cast(img, tf.float32) * (1. / 255) - 0.5
        label = tf.cast(features["label"], tf.int32)

        return img, label
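The idea behind the filename queue can be pictured with a plain producer/consumer pair from the standard library. This is only a conceptual analogy under my own assumptions, not how TensorFlow implements its queues: a background thread keeps a bounded queue filled with file names while the main thread takes them off as it needs them. The names filename_producer and consume, and the num_epochs and capacity parameters, are hypothetical.

```python
import queue
import threading


def filename_producer(filenames, num_epochs, out_queue):
    """Push every file name once per epoch, then a None sentinel."""
    for _ in range(num_epochs):
        for name in filenames:
            out_queue.put(name)  # blocks while the queue is full
    out_queue.put(None)  # tells the consumer there is nothing left


def consume(filenames, num_epochs=2, capacity=4):
    """Drain the queue on the calling thread and return what was seen."""
    q = queue.Queue(maxsize=capacity)
    worker = threading.Thread(
        target=filename_producer,
        args=(filenames, num_epochs, q),
        daemon=True,
    )
    worker.start()

    seen = []
    while True:
        name = q.get()  # blocks until the producer has something ready
        if name is None:
            break
        seen.append(name)  # a real reader would open and parse the file here
    worker.join()
    return seen


order = consume(["a.tfrecords", "b.tfrecords"], num_epochs=2)
```

The bounded capacity is the point of the design: the producer can run ahead of the consumer, but only by a fixed amount, so reading overlaps with computation without unbounded memory growth.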