1. Use Keras and TensorFlow to load and explore the Fashion-MNIST dataset
Import the Fashion-MNIST dataset and set the log level of TensorFlow.
Fashion-MNIST is a dataset of 28×28 pixel grayscale clothing images in 10 categories, split into a training set of 60,000 images and a test set of 10,000 images.
import os

# Control what TensorFlow prints to the output: '2' hides INFO and WARNING
# messages so that only errors are shown. Setting it before importing
# TensorFlow is the commonly recommended order.
os.environ['TF_CPP_MIN_LOG_LEVEL'] = '2'

import tensorflow as tf
from tensorflow import keras
# Dataset loaders, layers, optimizers, the Sequential container, and metrics
from tensorflow.keras import datasets, layers, optimizers, Sequential, metrics

# Load the dataset as NumPy arrays
(x, y), (x_test, y_test) = datasets.fashion_mnist.load_data()

# View dataset information
print(f'x.shape={x.shape}, y.shape={y.shape}')  # size of the training set
print(f'x_test.shape={x_test.shape}, y_test.shape={y_test.shape}')  # size of the test set
print(f'y[:5]={y[:5]}')  # first 5 training labels
First, import the required libraries: TensorFlow, Keras, and os.
Next, we use the tf.keras.datasets.fashion_mnist.load_data() function to load the Fashion-MNIST dataset. This function returns two tuples of NumPy arrays, which the code above names (x, y) and (x_test, y_test): x and x_test contain the image data, and y and y_test contain the corresponding labels.
We then print the shapes of these arrays to get an idea of the size of the dataset. x has the shape (60000, 28, 28), representing 60,000 images of 28×28 pixels, and y has the shape (60000,), representing 60,000 labels. x_test has the shape (10000, 28, 28) and y_test has the shape (10000,), representing the 10,000 test images and their labels.
Finally, we view the first five labels by printing y[:5]. Each label is an integer from 0 to 9 indicating the clothing category of the image, such as 0 for a T-shirt/top, 1 for trousers, etc.
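As a small illustration of how integer labels map to category names, here is a minimal sketch. The helper name `label_to_name` and the example label values are hypothetical stand-ins (the real labels come from load_data()); the names follow the label order given in the Fashion-MNIST documentation.

```python
# Class names in label order (0-9), as listed in the Fashion-MNIST documentation.
CLASS_NAMES = ['T-shirt/top', 'Trouser', 'Pullover', 'Dress', 'Coat',
               'Sandal', 'Shirt', 'Sneaker', 'Bag', 'Ankle boot']


def label_to_name(label):
    """Return the category name for an integer label in the range 0-9."""
    return CLASS_NAMES[int(label)]


# Hypothetical labels standing in for y[:5]; real values come from load_data().
example_labels = [9, 0, 0, 3, 0]
example_names = [label_to_name(v) for v in example_labels]
print(example_names)
```

The lookup is plain list indexing, so it works equally well on NumPy integers taken from the label array.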
*View data set
# Dataset display
import matplotlib.pyplot as plt
import numpy as np

# Name of each category, in label order
class_names = ['Tshirt', 'Trouser', 'Pullover', 'Dress', 'Coat',
               'Sandal', 'Shirt', 'Sneaker', 'Bag', 'Ankle boot']

# Draw the first 10 images
for i in range(0, 10):
    plt.subplot(2, 5, i + 1)  # position i + 1 in a grid of 2 rows and 5 columns
    plt.imshow(x[i])
    plt.xlabel(class_names[y[i]])  # y[i] is the integer label of image i
    plt.xticks([])  # hide the x and y axis ticks
    plt.yticks([])
plt.show()
*Image data set display
This code is a simple example of displaying an image dataset. It first imports matplotlib.pyplot for plotting; numpy is imported as well, although this snippet does not call it directly.
Next, a class_names list is defined, containing the names of the 10 categories. The position of each name in the list matches the integer label of that category.
Then, a for loop iterates over the first 10 samples in the dataset (i from 0 to 9). For each sample, the code creates a subplot in a 2×5 grid (using plt.subplot) and plots the image using the imshow function.
When the image is drawn, the code adds a label under the image showing the category it belongs to (using the xlabel function). The label name is obtained via y[i], which is the integer category label of the image and serves as the index into class_names.
Displaying the images this way gives a more intuitive understanding of the content and format of the dataset.
2. Use TensorFlow for image data set preprocessing and loading
# Data preprocessing function: convert data types and normalize
def processing(x, y):
    x = tf.cast(x, tf.float32) / 255.0  # cast pixels to float32 and scale to [0, 1]
    y = tf.cast(y, tf.int32)            # cast the target y to int32
    return x, y

# Preprocess the training set
y = tf.one_hot(y, depth=10)  # one-hot encoding: a length-10 vector with 1 at the label index
ds_train = tf.data.Dataset.from_tensor_slices((x, y))  # slice x and y into (image, label) tensor pairs
# batch(128) groups samples into batches of 128; because shuffle comes after
# batch here, it reorders whole batches rather than individual samples
ds_train = ds_train.map(processing).batch(128).shuffle(10000)

# Preprocess the test set
ds_test = tf.data.Dataset.from_tensor_slices((x_test, y_test))
ds_test = ds_test.map(processing).batch(128)  # batches of 128; the test set does not need shuffling

# Create an iterator and fetch one batch to check that the data loads correctly
sample = next(iter(ds_train))  # one batch, i.e. up to 128 samples
print('x_batch:', sample[0].shape, 'y_batch:', sample[1].shape)
- processing function: This function preprocesses the input data. For input x, it converts the data type to tf.float32 and divides by 255.0, which scales the pixel values into the range [0, 1]. For input y, it converts the data type to tf.int32.
- Data loading: The code uses tf.data.Dataset.from_tensor_slices to create a dataset from tensor slices. For the training set, it first uses tf.one_hot to one-hot encode the target y, and then combines x and y into a single dataset. Next, it uses the map function to apply the processing function to each element of the dataset, the batch function to divide the dataset into batches of 128, and the shuffle function to shuffle the dataset. Because shuffle is applied after batch, it changes the order of the batches, not the samples inside each batch. For the test set, the same preprocessing and batching are performed, but no shuffling is required. Note that the test labels are not one-hot encoded; they stay as integers.
- Check data loading: Use the iter function to create an iterator, and then use the next function to retrieve one batch of data from it. Finally, use the print function to check the shape of the batch and confirm that the data is loaded correctly.
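To make the preprocessing steps above concrete, here is a minimal NumPy sketch of what normalization and one-hot encoding do to the arrays, plus the arithmetic for how many batches of 128 a 60,000-sample training set yields. The helper names `normalize_images` and `one_hot` and the synthetic input arrays are assumptions for illustration; they mimic, but are not, the TensorFlow calls used in the pipeline.

```python
import math

import numpy as np


def normalize_images(images):
    """Cast uint8 pixels to float32 and scale them into [0, 1]."""
    return images.astype(np.float32) / 255.0


def one_hot(labels, depth=10):
    """Turn integer labels into one-hot rows of length `depth`."""
    return np.eye(depth, dtype=np.float32)[labels]


# Synthetic stand-ins for real data (assumption: 4 all-white 28x28 images).
fake_images = np.full((4, 28, 28), 255, dtype=np.uint8)
fake_labels = np.array([9, 0, 3, 1])

scaled = normalize_images(fake_images)
encoded = one_hot(fake_labels)
print(scaled.dtype, scaled.max())
print(encoded[0])

# Number of batches for 60,000 samples with a batch size of 128.
num_batches = math.ceil(60000 / 128)
last_batch_size = 60000 - (num_batches - 1) * 128
print(num_batches, last_batch_size)
```

Since 60,000 is not a multiple of 128, the final batch is smaller than the others, which is why the batch dimension is written as b rather than a fixed 128 in the shape comments.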
3. Build a simple TensorFlow fully connected neural network
# ==1== Define the fully connected layers
# [b,784]=>[b,256]=>[b,128]=>[b,64]=>[b,32]=>[b,10]; hidden layer widths generally shrink from large to small
model = Sequential([
    layers.Dense(256, activation=tf.nn.relu),  # first fully connected layer, outputs 256 features
    layers.Dense(128, activation=tf.nn.relu),  # second fully connected layer
    layers.Dense(64, activation=tf.nn.relu),   # third fully connected layer
    layers.Dense(32, activation=tf.nn.relu),   # fourth fully connected layer
    layers.Dense(10),  # output layer: no activation, outputs 10 logits (one per category)
])

# ==2== Set the input dimensions
model.build(input_shape=[None, 28*28])

# ==3== View the network structure
model.summary()

# ==4== Optimizer
# Updates the weights from the gradients. Plain gradient descent would be
# w = w - lr * grad; Adam additionally adapts the step size per parameter.
optimizer = optimizers.Adam(learning_rate=1e-3)
- Building the network: The code uses the Sequential class to build a neural network composed of multiple fully connected layers. Each fully connected layer is created using the layers.Dense class, which specifies the number of neurons, the activation function, and other parameters. The output layer (layer 5) has 10 outputs, one per category, and does not use an activation function: it returns raw scores (logits), and the softmax is applied later inside the loss function.
- Set input layer dimensions: Use the model.build function to set the input dimensions. Since each 28×28 image is flattened into a vector of 784 values, the input shape is set to [None, 28*28], where None stands for any batch size.
- View the network structure: Use the model.summary function to view the network structure, including the output shape and number of parameters of each layer.
- Choosing an optimizer: An Adam optimizer is created using optimizers.Adam with a learning rate of 0.001. The optimizer updates the weights of the model from the gradients.
When training a neural network, this optimizer updates the model weights (w) using the learning rate (lr) and the gradient (grad) of the loss function, so that the loss gradually decreases. Plain gradient descent would apply w = w - lr * grad directly; Adam also keeps running averages of the gradient and its square to adapt the step size for each parameter.
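The weight update can be sketched for a single scalar weight as follows. This is the textbook form of the Adam update, written in plain Python for illustration; it is not the code TensorFlow runs internally, and the function name `adam_step` and the default values for beta1, beta2, and eps are assumptions chosen to match commonly quoted defaults.

```python
import math


def adam_step(w, grad, m, v, t, lr=1e-3, beta1=0.9, beta2=0.999, eps=1e-7):
    """Apply one textbook Adam update to a scalar weight; return (w, m, v)."""
    m = beta1 * m + (1 - beta1) * grad        # running mean of gradients
    v = beta2 * v + (1 - beta2) * grad ** 2   # running mean of squared gradients
    m_hat = m / (1 - beta1 ** t)              # bias correction, t is the 1-based step
    v_hat = v / (1 - beta2 ** t)
    w = w - lr * m_hat / (math.sqrt(v_hat) + eps)
    return w, m, v


# One update starting from w = 1.0 with a gradient of 0.5.
w, m, v = 1.0, 0.0, 0.0
w, m, v = adam_step(w, grad=0.5, m=m, v=v, t=1)
print(w)
```

On the very first step the bias-corrected ratio is close to the sign of the gradient, so the weight moves by roughly lr regardless of the gradient's magnitude. This is one way Adam differs from plain gradient descent.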
*Output results
model.summary() prints the architecture and parameter information of the Sequential model. This is a simple feedforward neural network containing 5 fully connected layers (Dense layers). Each layer takes the output of the previous layer as input and applies a linear transformation followed, in the hidden layers, by a ReLU activation. Fully connected means that each node in a layer is connected to all nodes in the previous layer.
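The parameter counts reported in the summary can be checked by hand: a Dense layer with n_in inputs and n_out outputs has n_in × n_out weights plus n_out biases. A minimal sketch of that arithmetic for the layer sizes used here (the helper name `dense_param_count` is hypothetical):

```python
def dense_param_count(n_in, n_out):
    """Weights (n_in * n_out) plus one bias per output unit."""
    return n_in * n_out + n_out


# Layer widths of the network: 784 inputs, four hidden layers, 10 outputs.
layer_sizes = [28 * 28, 256, 128, 64, 32, 10]
per_layer = [dense_param_count(a, b)
             for a, b in zip(layer_sizes, layer_sizes[1:])]
total_params = sum(per_layer)
print(per_layer)     # [200960, 32896, 8256, 2080, 330]
print(total_params)  # 244522
```

Most of the parameters sit in the first layer, because it connects all 784 input pixels to 256 hidden units.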
4. Training and optimization
for epoch in range(20):  # train for 20 epochs
    # Iterate over every batch
    for step, (x, y) in enumerate(ds_train):
        # x from ds_train has shape [b, 28, 28]; the model expects [b, 28*28], so flatten it
        x = tf.reshape(x, [-1, 28*28])  # -1 lets TensorFlow infer the 0th (batch) dimension

        # Record the forward pass so that gradients can be computed
        with tf.GradientTape() as tape:
            # Forward pass through the network: [b,784]=>[b,10]
            logits = model(x)  # output of the last layer
            # Mean squared error between the one-hot y and the outputs (for monitoring only)
            loss1 = tf.reduce_mean(tf.losses.MSE(y, logits))
            # Cross-entropy loss between the one-hot y and the outputs;
            # from_logits=True applies softmax to the logits internally
            loss2 = tf.reduce_mean(tf.losses.categorical_crossentropy(y, logits, from_logits=True))

        # Gradients of loss2 with respect to all weights and biases
        grads = tape.gradient(loss2, model.trainable_variables)
        # zip pairs each gradient with its variable; the optimizer applies the update
        optimizer.apply_gradients(zip(grads, model.trainable_variables))

        # Print the losses every 100 steps
        if step % 100 == 0:
            print(f'epochs:{epoch}, step:{step}, loss_MSE:{loss1}, loss_CE:{loss2}')
The code above performs forward propagation, loss computation, and backpropagation in TensorFlow.
- Looping through epochs and batches: The code uses nested for loops to iterate over the dataset. In each of the 20 epochs, the entire training set is traversed batch by batch, so the model sees every training sample once per epoch and gradually optimizes its parameters.
- Reshaping the input: At each iteration, the input data x is reshaped from [b, 28, 28] to [b, 28*28] to match the input of the model.
- Calculating losses: The code calculates both the mean squared error loss (MSE) and the cross-entropy loss (CE). Only the cross-entropy loss is used to compute gradients; the MSE is printed for comparison.
- Gradient calculation and update: tf.GradientTape records the forward computation so that the gradients can be calculated. The optimizer (Adam) then updates the model's weights based on the gradients.
- Printing results: Every 100 batches (steps), the code prints the current epoch, step number, and loss values for monitoring and debugging.
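The two loss values printed during training can be reproduced on a toy batch with NumPy. The following is a minimal sketch of what cross-entropy with from_logits=True and the mean squared error compute; the function names and the example batch are hypothetical, and TensorFlow's own implementation may differ in numerical details.

```python
import numpy as np


def softmax_cross_entropy(y_onehot, logits):
    """Mean cross-entropy between one-hot targets and raw logits."""
    # Subtract the row maximum before exponentiating for numerical stability.
    shifted = logits - logits.max(axis=1, keepdims=True)
    log_probs = shifted - np.log(np.exp(shifted).sum(axis=1, keepdims=True))
    return float(-(y_onehot * log_probs).sum(axis=1).mean())


def mean_squared_error(y_onehot, logits):
    """Mean squared difference between targets and raw outputs."""
    return float(((y_onehot - logits) ** 2).mean())


# Hypothetical batch: 2 samples, 10 classes, all logits zero (uniform prediction).
y_batch = np.eye(10)[[3, 7]]
logits = np.zeros((2, 10))
ce = softmax_cross_entropy(y_batch, logits)
mse = mean_squared_error(y_batch, logits)
print(ce, mse)
```

With all-zero logits the softmax assigns probability 1/10 to every class, so the cross-entropy equals ln(10), about 2.30. That is roughly the value to expect from an untrained 10-class model at the start of training.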
5. Conduct testing and evaluation
total_correct = 0  # total number of correct predictions
total_sum = 0      # total number of samples evaluated

for (x, y) in ds_test:  # iterate over the batches of the test set
    # Change the shape of x from [b,28,28] to [b,28*28]
    x = tf.reshape(x, [-1, 28*28])
    # Compute the output layer, shape [b,10]
    logits = model(x)
    # Convert logits to probabilities; each row sums to 1
    prob = tf.nn.softmax(logits, axis=1)
    # Index of the largest probability for each sample, shape [b]
    predict = tf.argmax(prob, axis=1)
    # y is int32 with shape [b]; predict is int64 with shape [b], so cast it to match
    predict = tf.cast(predict, dtype=tf.int32)
    # y and predict are both vectors of category indices; compare them element-wise
    correct = tf.equal(y, predict)  # True where the prediction is correct
    # True and False become 1 and 0; their sum is the number of correct predictions in the batch
    correct = tf.reduce_sum(tf.cast(correct, dtype=tf.int32))
    # correct is a tensor; convert it to a Python int before accumulating
    total_correct += int(correct)
    total_sum += x.shape[0]  # 0th dimension: number of images in this batch

# Compute the accuracy after the whole test set has been processed
acc = total_correct / total_sum
print(f'acc: {acc}')
The purpose of this code is to test the trained model and calculate its accuracy on the test set.
First, the code defines two variables: the total number of correct predictions and the total number of samples. It then traverses the test set batch by batch, performs forward propagation on each batch to obtain the predictions, compares the predictions with the actual labels, and accumulates both the number of correct predictions and the number of samples.
The for loop traverses the test dataset (ds_test). For each batch of inputs x and corresponding labels y:
- Adjust the shape of x from [b,28,28] to [b,28*28].
- Predict the output through the model.
- Convert the predicted logits into probability values with softmax.
- Use tf.argmax to find the index of the largest probability, which is the predicted label.
- Convert the prediction result from int64 to int32.
- Compare the predicted result with the actual label using tf.equal to get the boolean tensor correct.
- Convert correct from boolean to integer, sum it, and add the result to total_correct.
- Add x.shape[0] to total_sum.
Finally, dividing total_correct by total_sum gives the accuracy of the model on the test set.
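The accumulation logic described above can be sketched with NumPy on a few made-up batches. The helper `count_correct` and the example logits and labels are hypothetical; the point is only to show how per-batch counts combine into one accuracy figure.

```python
import numpy as np


def count_correct(logits, labels):
    """Number of rows whose arg-max class equals the integer label."""
    predictions = np.argmax(logits, axis=1)
    return int((predictions == labels).sum())


# Two hypothetical batches of (logits, labels) with 3 classes each.
batches = [
    (np.array([[2.0, 0.1, 0.3], [0.2, 1.5, 0.1]]), np.array([0, 2])),
    (np.array([[0.1, 0.2, 3.0]]), np.array([2])),
]

total_correct = 0
total_sum = 0
for batch_logits, batch_labels in batches:
    total_correct += count_correct(batch_logits, batch_labels)
    total_sum += batch_logits.shape[0]

acc = total_correct / total_sum
print(acc)
```

The sketch takes the arg-max of the logits directly. Softmax preserves the ordering of the values in each row, so the predicted class is the same whether or not the logits are first converted to probabilities.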
*Model accuracy
In each epoch (cycle), the model traverses the entire training dataset once.
"acc: 0.8878" is the accuracy of the model on the test dataset (ds_test), computed by the evaluation code above after the 20 training epochs (epochs 0 through 19) have finished.
As the number of epochs increases, the loss value generally decreases and the accuracy generally increases, which indicates that the model is gradually being optimized.