There is a lot of code online for reading the MNIST dataset, and I have produced my own version, which uses the BinaryReader and BinaryWriter classes I talked about in a previous post. So, you could be thinking: binary image files in C++, oh no! I have seen some of the parsers easily obtainable online, written in C++, so I know what poor code looks like. I found none of the examples to be particularly expressive C++, and have since written my own code, which uses features up to C++23.
In this post I’m going to describe loading the MNIST binary files and parsing them, and I will specifically highlight my custom binary file class. It should seem magical how simple working with a binary file in C++ actually is, and indeed, it is somewhat magical.
The part of MNIST used here is the 10,000-image test set (the t10k files), which comes as two files: an images file and a labels file. Each file has a binary header followed by a binary data segment.
The header of the images file is as follows; all fields are uint32_t:
magic number = 2051
number of images = 10000
image height (rows) = 28
image width (columns) = 28
The data segment of the images file contains the grayscale pixels of the 10,000 images, each 28×28, stored one image after another. The pixel type is uint8_t:
uint8_t pixels[10000][28*28];
The labels file holds 10,000 labels, one uint8_t each, after a header consisting of a magic number and the number of labels (both uint32_t):
magic number = 2049
number of labels = 10000
uint8_t label[10000];
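The layout above also fixes the size of each file, which makes for a cheap sanity check before parsing. The following is a minimal sketch of that arithmetic; the names (`expectedImageFileSize`, `expectedLabelFileSize`, `imageOffset`) are made up for illustration and are not part of my reader classes.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Header sizes follow from the field lists above: four uint32_t fields
// for the images file, two for the labels file.
constexpr std::size_t kImageHeaderBytes = 4 * sizeof(std::uint32_t); // 16
constexpr std::size_t kLabelHeaderBytes = 2 * sizeof(std::uint32_t); // 8

// Total bytes in an images file: header plus one byte per pixel.
constexpr std::size_t expectedImageFileSize(std::size_t count,
                                            std::size_t rows,
                                            std::size_t cols) {
    return kImageHeaderBytes + count * rows * cols;
}

// Total bytes in a labels file: header plus one byte per label.
constexpr std::size_t expectedLabelFileSize(std::size_t count) {
    return kLabelHeaderBytes + count;
}

// Byte offset of image `index` inside the images file.
inline std::size_t imageOffset(std::size_t index, std::size_t count,
                               std::size_t rows = 28, std::size_t cols = 28) {
    assert(index < count); // asking for an image past the end is a caller bug
    return kImageHeaderBytes + index * rows * cols;
}

// Checked at compile time for the 10,000-image files described above.
static_assert(expectedImageFileSize(10000, 28, 28) == 7'840'016);
static_assert(expectedLabelFileSize(10000) == 10'008);
```

If a downloaded file does not have exactly this size, it is probably still compressed or was truncated.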
I read these files in lockstep, one image and one label at a time.
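The values in the headers are stored big-endian (most significant byte first), so on a little-endian machine such as x86 their byte order needs to be reversed after reading. For the actual data this is not the case, because every pixel and label is a single byte.
Here is my code to read MNIST, followed by a few helper functions.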
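For comparison, the header fields can also be decoded without swapping at all, by assembling each value from its four bytes with shifts. This is a different technique from the `std::byteswap` approach in my code, and it works the same on any host. A minimal sketch, assuming the header bytes have already been read into a buffer; `readBigEndian32`, `ImageHeader` and `parseImageHeader` are hypothetical names used only here.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <stdexcept>
#include <vector>

// Decode a big-endian uint32_t starting at `offset` in a byte buffer.
// Shifting gives the same result on little- and big-endian hosts.
inline std::uint32_t readBigEndian32(const std::vector<std::uint8_t>& bytes,
                                     std::size_t offset) {
    assert(offset + 4 <= bytes.size()); // caller must supply four bytes
    return (static_cast<std::uint32_t>(bytes[offset]) << 24)
         | (static_cast<std::uint32_t>(bytes[offset + 1]) << 16)
         | (static_cast<std::uint32_t>(bytes[offset + 2]) << 8)
         |  static_cast<std::uint32_t>(bytes[offset + 3]);
}

// The four header fields of the images file, in file order.
struct ImageHeader {
    std::uint32_t magic;
    std::uint32_t count;
    std::uint32_t rows;
    std::uint32_t cols;
};

// Parse the 16-byte images header from the start of a buffer.
inline ImageHeader parseImageHeader(const std::vector<std::uint8_t>& bytes) {
    if (bytes.size() < 16) {
        throw std::runtime_error("images header needs 16 bytes");
    }
    return ImageHeader{ readBigEndian32(bytes, 0), readBigEndian32(bytes, 4),
                        readBigEndian32(bytes, 8), readBigEndian32(bytes, 12) };
}
```

The bytes `00 00 08 03` decode to 2051 this way, which is the images magic number.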
// Needs <bit>, <cstdint>, <stdexcept>, <string> and <vector>, plus BinaryReader.
void parseMNIST() {
    std::string imagesFileName = "mnist//t10k-images.idx3-ubyte"
        , labelsFileName = "mnist//t10k-labels.idx1-ubyte";

    // Header fields are stored big-endian; swap only on little-endian hosts.
    auto readHeading = [](auto& file, auto& h) {
        file >> h;
        if constexpr (std::endian::native == std::endian::little) {
            h = std::byteswap(h);
        }
    };
    // Read one header field and compare it with the expected magic number.
    auto checkMagic = [&](auto& file, uint32_t magic) {
        uint32_t input{ 0 }; readHeading(file, input);
        if (input != magic) {
            throw std::logic_error("Incorrect file magic");
        }
    };

    BinaryReader imagesIn(imagesFileName)
        , labelsIn(labelsFileName);
    uint32_t numImages{ 0 }
        , numLabels{ 0 }
        , rows{ 0 }
        , cols{ 0 };

    checkMagic(imagesIn, 2051);
    checkMagic(labelsIn, 2049);
    readHeading(imagesIn, numImages);
    readHeading(labelsIn, numLabels);
    if (numImages != numLabels) {
        throw std::logic_error("image num should equal label num");
    }
    readHeading(imagesIn, rows);
    readHeading(imagesIn, cols);

    // Pixels and labels are single bytes, so they need no byte swapping.
    std::size_t imageSize = rows * cols;
    uint8_t label{ 0 };
    std::vector<uint8_t> image(imageSize);
    for (std::size_t i = 0; i < numImages; ++i) {
        imagesIn >> image;
        labelsIn >> label;
        mData[label].push_back(image); // group the images by digit label
    }
}
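For readers who have not seen the previous post, here is a rough idea of the kind of interface the code above relies on: `>>` for fixed-size values, `>>` for a pre-sized byte buffer, and a conversion to `bool`. This is a hypothetical stand-in written for this post, not my actual BinaryReader, and the names `SketchBinaryReader` and `countRecords` are invented for the sketch.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <fstream>
#include <stdexcept>
#include <string>
#include <type_traits>
#include <vector>

// Hypothetical stand-in for a binary reader; the real class may differ.
class SketchBinaryReader {
public:
    explicit SketchBinaryReader(const std::string& fileName)
        : mIn(fileName, std::ios::binary) {
        if (!mIn) throw std::runtime_error("cannot open " + fileName);
    }

    // Read the raw bytes of a fixed-size value, in host byte order.
    template <typename T>
    SketchBinaryReader& operator>>(T& value) {
        static_assert(std::is_trivially_copyable_v<T>,
                      "raw reads need a trivially copyable type");
        mIn.read(reinterpret_cast<char*>(&value), sizeof(T));
        return *this;
    }

    // Fill an already-sized buffer; the caller chooses how many bytes to read.
    SketchBinaryReader& operator>>(std::vector<std::uint8_t>& buffer) {
        mIn.read(reinterpret_cast<char*>(buffer.data()),
                 static_cast<std::streamsize>(buffer.size()));
        return *this;
    }

    // False after a failed or short read, so `while (reader >> x)` terminates.
    explicit operator bool() const { return !mIn.fail(); }

private:
    std::ifstream mIn;
};

// Usage sketch: count how many complete fixed-size records a file holds.
inline std::size_t countRecords(const std::string& fileName,
                                std::size_t recordSize) {
    assert(recordSize > 0);
    SketchBinaryReader reader(fileName);
    std::vector<std::uint8_t> record(recordSize);
    std::size_t count = 0;
    while (reader >> record) ++count;
    return count;
}
```

Because the vector overload reads exactly `buffer.size()` bytes, sizing the buffer to `rows * cols` up front is what makes each `>>` deliver one whole image.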
To draw an image in the console I do the following:
void drawDigit(uint8_t digit, std::size_t index = 0) {
    const auto& image = mData[digit][index];
    using namespace std;
    cout << to_string(digit) << endl;
    // Print the 28x28 pixel values row by row, comma separated.
    for (size_t y = 0; y < 28; ++y) {
        for (size_t x = 0; x < 28; ++x) {
            cout << to_string(image[y * 28 + x]) << ",";
        }
        cout << endl;
    }
    cout << endl;
}
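Printing raw numbers is good for checking values, but the digit shape is easier to see when each pixel becomes a single character. Here is a small sketch of that idea, assuming the same row-major layout as above; `renderDigit` and the character ramp are my own choices for illustration, not something MNIST prescribes.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Render a row-major grayscale image as text, one character per pixel.
// In MNIST, 0 is background and 255 is full ink, so the ramp runs from
// a blank to the densest character.
inline std::string renderDigit(const std::vector<std::uint8_t>& image,
                               std::size_t rows = 28, std::size_t cols = 28) {
    assert(image.size() == rows * cols);
    static constexpr char kRamp[] = " .:-=+*#%@"; // 10 levels, sparse to dense
    constexpr std::size_t kLevels = sizeof(kRamp) - 1;

    std::string out;
    out.reserve(rows * (cols + 1));
    for (std::size_t y = 0; y < rows; ++y) {
        for (std::size_t x = 0; x < cols; ++x) {
            // Map 0..255 onto ramp indices 0..9.
            const std::size_t level = image[y * cols + x] * kLevels / 256;
            out += kRamp[level];
        }
        out += '\n';
    }
    return out;
}
```

With the loader above, something like `std::cout << renderDigit(mData[7][0]);` would print the first stored 7 as a 28-line block of characters.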
I also split the MNIST data into one file per digit. Here’s what reading and creating those files ended up looking like:
// Read every 28x28 image back from the file for one digit.
void loadDigit(uint8_t digit) {
    BinaryReader imagesIn("mnist_" + std::to_string(digit));
    std::vector<uint8_t> image(28 * 28);
    while (imagesIn >> image) {
        mData[digit].push_back(image);
    }
}

// Write one headerless file per label: just the raw pixels, image after image.
void makeDigitFiles() {
    for (auto& [label, images] : mData) {
        std::string fileName = "mnist_" + std::to_string(label);
        BinaryWriter imagesOut(fileName);
        for (auto& image : images) {
            imagesOut << image;
        }
    }
}