Statistical-Learning-Method.../transMnist/transMnist.py
2018-11-16 00:00:27 +08:00

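#coding=utf-8
'''
The original MNIST dataset is stored in a binary (byte) format; this script converts it to CSV.
All later code is written against the CSV files, which also makes that code easier to follow.
The code comes from the following page, with thanks:
https://pjreddie.com/projects/mnist-in-csv/
This .py file is supplementary; skipping it does not affect the later algorithm implementations.
The converted CSV files are in the Mnist folder.
'''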
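def convert(imgf, labelf, outf, n):
    # Convert the first n samples of an MNIST image/label file pair into one CSV file.
    # Each output row holds the label followed by the 28*28 pixel values (0-255).
    with open(imgf, "rb") as f, open(labelf, "rb") as l, open(outf, "w") as o:
        f.read(16)  # skip the image file header (magic number, count, rows, columns)
        l.read(8)   # skip the label file header (magic number, count)
        images = []
        for i in range(n):
            # one label byte, then the 784 pixel bytes of the matching image
            image = [ord(l.read(1))]
            for j in range(28*28):
                image.append(ord(f.read(1)))
            images.append(image)
        for image in images:
            o.write(",".join(str(pix) for pix in image)+"\n")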
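if __name__ == '__main__':
    # Forward slashes avoid the invalid "\M" and "\m" escape sequences of the
    # original paths and are accepted on Windows as well as on other systems.
    convert("./Mnist/t10k-images.idx3-ubyte", "./Mnist/t10k-labels.idx1-ubyte",
            "./Mnist/mnist_test.csv", 10000)
    convert("./Mnist/train-images.idx3-ubyte", "./Mnist/train-labels.idx1-ubyte",
            "./Mnist/mnist_train.csv", 60000)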
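
The script skips 16 bytes of the image file and 8 bytes of the label file without saying why. Those bytes are the IDX headers: a 4-byte magic number followed by one big-endian 32-bit size per dimension. The sketch below is not part of the original script; `read_idx_header` is a hypothetical helper, and the headers are built in memory so the example runs without the dataset files.

```python
import struct


def read_idx_header(raw):
    """Parse an IDX header from the leading bytes of a file.

    Hypothetical helper for illustration only.
    Returns (dtype_code, dims), where dims is a tuple of dimension sizes.
    """
    # Magic number: two zero bytes, one data-type byte, one dimension-count byte.
    zero, dtype_code, ndim = struct.unpack(">HBB", raw[:4])
    if zero != 0:
        raise ValueError("not an IDX header")
    # Each dimension size is a big-endian unsigned 32-bit integer.
    dims = struct.unpack(">" + "I" * ndim, raw[4:4 + 4 * ndim])
    return dtype_code, dims


# Synthetic headers shaped like the MNIST test set files (assumed values).
image_header = struct.pack(">IIII", 2051, 10000, 28, 28)  # magic, count, rows, cols
label_header = struct.pack(">II", 2049, 10000)            # magic, count

print(read_idx_header(image_header))
print(read_idx_header(label_header))
print(len(image_header), len(label_header))
```

The header lengths of 16 and 8 bytes match the `f.read(16)` and `l.read(8)` calls in `convert`; a data-type code of 8 means unsigned bytes, which is why each label and pixel can be read one byte at a time.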