This topic explains how to use the Model class in the Keras framework. It covers:
1. Two ways to use the model;
2. Implementing a classic model (custom model);
3. Customized training of the model.
1. Two ways to use the model
- Keras provides two ways to build a model, through two APIs:
  - Functional model: through the Model class API;
  - Sequential model: through the Sequential class API.
- The running example in this article is again a deep fully connected network: a 4 -> 8 -> 4 -> 1 network.
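To make the 4 -> 8 -> 4 -> 1 structure concrete, the short sketch below counts the trainable parameters such a fully connected network would have, assuming every layer has a weight matrix plus one bias per unit. The helper name `dense_param_count` is made up for this illustration and is not part of Keras.

```python
def dense_param_count(layer_sizes):
    """Count weights and biases of a fully connected network.

    layer_sizes lists the width of every layer, input first, e.g. [4, 8, 4, 1].
    """
    total = 0
    for fan_in, fan_out in zip(layer_sizes[:-1], layer_sizes[1:]):
        total += fan_in * fan_out + fan_out  # weight matrix + bias vector
    return total


iris_network = [4, 8, 4, 1]
param_total = dense_param_count(iris_network)
print(param_total)  # 81 = (4*8+8) + (8*4+4) + (4*1+1)
```

The same figure should appear as the total parameter count when the Keras models built later in this article are summarized.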
1. Functional model
- The programming characteristics of the functional model are:
  - The programmer builds the layers and chains them as function calls, either by calling the Layer object directly (it is callable) or through apply;
  - Model only needs to be given inputs and outputs.
- This mode is called a functional model because Layer provides function-style calling methods, and the connections between layers are established through them:
  - Layer is a callable object: it provides the __call__ operator, so layer(…) works;
  - the apply function.
- Schematic diagram of a functional model
- Sample code for the functional model: using the callable operator of Layer
    # Author: Louis Young
    # Note: 4-8-4-1 fully connected neural network built using the Iris dataset
    import tensorflow as tf
    import tensorflow.keras as keras
    import tensorflow.keras.layers as layers
    import sklearn.datasets as datasets

    # 1. Define the layers
    # 1.1. Input layer: must be a tensor created by InputLayer or Input
    input_layer = keras.Input(shape=(4,))  # 4 is the feature dimension of Iris (see the iris dataset description in sklearn)
    # 1.2. Hidden layers: 8-4
    hide1_layer = layers.Dense(units=8, activation='relu')
    hide2_layer = layers.Dense(units=4, activation='relu')
    # 1.3. Output layer: 1
    output_layer = layers.Dense(units=1, activation='sigmoid')

    # 2. Chain the layers as function calls
    hide1_layer_tensor = hide1_layer(input_layer)  # <---- use the callable feature
    hide2_layer_tensor = hide2_layer(hide1_layer_tensor)
    output_layer_tensor = output_layer(hide2_layer_tensor)

    # 3. Build the model from inputs and outputs
    # inputs and outputs must be tensors produced by layer calls
    model = keras.Model(inputs=input_layer, outputs=output_layer_tensor)

    # 4. Training
    # 4.1. Training parameters
    model.compile(optimizer='adam', loss='binary_crossentropy', metrics=['mse'])
    # 4.2. Training data
    data, target = datasets.load_iris(return_X_y=True)
    data = data[:100, :]  # take the first 100 samples (the first and second classes)
    target = target[:100]
    model.fit(x=data, y=target, batch_size=10, epochs=1000, verbose=0)

    # 5. Prediction and evaluation
    pre_result = model.predict(data)
    category = [0 if item <= 0.5 else 1 for item in pre_result]
    accuracy = (target == category).mean()
    print(f'Classification accuracy: {accuracy * 100.0:5.2f}%')
Classification accuracy: 100.00%
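The evaluation at the end of the sample turns each sigmoid output into a class label by thresholding at 0.5 and then takes the fraction of labels that match. A dependency-free sketch of that step follows; `probabilities` and `labels` are made-up values for illustration, not results from the model, and `to_category` is a hypothetical helper.

```python
def to_category(probability, threshold=0.5):
    """Map a sigmoid output to class 0 or 1, like the list comprehension in the sample."""
    return 0 if probability <= threshold else 1


# Illustrative values only; a trained model would produce its own probabilities.
probabilities = [0.05, 0.40, 0.50, 0.93, 0.61]
labels = [0, 0, 1, 1, 1]

predicted = [to_category(p) for p in probabilities]
matches = sum(1 for p, y in zip(predicted, labels) if p == y)
accuracy = matches / len(labels)

print(predicted)  # [0, 0, 0, 1, 1]
print(f'Classification accuracy: {accuracy * 100.0:5.2f}%')  # Classification accuracy: 80.00%
```

Note that a value of exactly 0.5 falls into class 0 because the comparison is `<=`.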
- Example code for the functional model: using the apply function of Layer
The apply function is an alias for the __call__ operator. TensorFlow 2.x marks it as deprecated in favor of calling the layer directly, so it may not be available in newer releases.
    # Author: Louis Young
    # Note: 4-8-4-1 fully connected neural network built using the Iris dataset
    import tensorflow as tf
    import tensorflow.keras as keras
    import tensorflow.keras.layers as layers
    import sklearn.datasets as datasets

    # 1. Define the layers
    # 1.1. Input layer: must be a tensor created by InputLayer or Input
    input_layer = keras.Input(shape=(4,))  # 4 is the feature dimension of Iris (see the iris dataset description in sklearn)
    # 1.2. Hidden layers: 8-4
    hide1_layer = layers.Dense(units=8, activation='relu')
    hide2_layer = layers.Dense(units=4, activation='relu')
    # 1.3. Output layer: 1
    output_layer = layers.Dense(units=1, activation='sigmoid')

    # 2. Chain the layers as function calls
    hide1_layer_tensor = hide1_layer.apply(input_layer)  # <---- use apply
    hide2_layer_tensor = hide2_layer.apply(hide1_layer_tensor)
    output_layer_tensor = output_layer.apply(hide2_layer_tensor)

    # 3. Build the model from inputs and outputs
    # inputs and outputs must be tensors produced by layer calls
    model = keras.Model(inputs=input_layer, outputs=output_layer_tensor)

    # 4. Training
    # 4.1. Training parameters
    model.compile(optimizer='adam', loss='binary_crossentropy', metrics=['mse'])
    # 4.2. Training data
    data, target = datasets.load_iris(return_X_y=True)
    data = data[:100, :]  # take the first 100 samples (the first and second classes)
    target = target[:100]
    model.fit(x=data, y=target, batch_size=10, epochs=1000, verbose=0)

    # 5. Prediction and evaluation
    pre_result = model.predict(data)
    category = [0 if item <= 0.5 else 1 for item in pre_result]
    accuracy = (target == category).mean()
    print(f'Classification accuracy: {accuracy * 100.0:5.2f}%')
Tensor("dense_29/Identity:0", shape=(None, 1), dtype=float32)
Classification accuracy: 100.00%
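The idea that a layer is simply a callable object, with apply as a second name for the same operation, can be illustrated without Keras. The class below is a hypothetical stand-in written for this sketch; it is not how Keras implements Layer.

```python
class ScaleLayer:
    """Hypothetical stand-in for a layer: multiplies every input value by a factor."""

    def __init__(self, factor):
        self.factor = factor

    def __call__(self, values):
        # Defining __call__ is what makes `layer(values)` legal Python.
        return [self.factor * v for v in values]

    # An alias: `layer.apply(values)` runs exactly the same code as `layer(values)`.
    apply = __call__


double = ScaleLayer(2)
scale_up = ScaleLayer(1.5)

# Chaining: the output of one call is the input of the next.
chained_by_call = scale_up(double([1, 2, 3]))
chained_by_apply = scale_up.apply(double.apply([1, 2, 3]))

print(chained_by_call)   # [3.0, 6.0, 9.0]
print(chained_by_apply)  # [3.0, 6.0, 9.0]
```

Both spellings give the same result, which is why the two Keras samples above differ only in how the layers are invoked.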
2. Sequential model
Programming features of the sequential model:
- Layer provides input and output attributes;
- The Sequential class uses the input and output attributes of each Layer to maintain the relationship between layers and build the network model;
- The first entry describes the input: in these examples it is a tensor constructed by the Input function (an InputLayer also works).
1. Schematic diagram of the sequential model
2. Sample code of the sequential model: building the model with the layers parameter
    # Author: Louis Young
    # Note: 4-8-4-1 fully connected neural network built using the Iris dataset
    import tensorflow as tf
    import tensorflow.keras as keras
    import tensorflow.keras.layers as layers
    import sklearn.datasets as datasets

    # 1. Define the layers
    # 1.1. Input layer: must be a tensor created by InputLayer or Input
    input_layer = keras.Input(shape=(4,))  # 4 is the feature dimension of Iris (see the iris dataset description in sklearn)
    # 1.2. Hidden layers: 8-4
    hide1_layer = layers.Dense(units=8, activation='relu')
    hide2_layer = layers.Dense(units=4, activation='relu')
    # 1.3. Output layer: 1
    output_layer = layers.Dense(units=1, activation='sigmoid')

    # 2. Use Sequential to build a sequential model
    seq_model = keras.Sequential(layers=[input_layer, hide1_layer, hide2_layer, output_layer])

    # ---- The following part is the same as in the functional samples
    # 3. Training
    # 3.1. Training parameters
    seq_model.compile(optimizer='adam', loss='binary_crossentropy', metrics=['mse'])
    # 3.2. Training data
    data, target = datasets.load_iris(return_X_y=True)
    data = data[:100, :]  # take the first 100 samples (the first and second classes)
    target = target[:100]
    seq_model.fit(x=data, y=target, batch_size=10, epochs=1000, verbose=0)

    # 4. Prediction and evaluation
    pre_result = seq_model.predict(data)
    category = [0 if item <= 0.5 else 1 for item in pre_result]
    accuracy = (target == category).mean()
    print(f'Classification accuracy: {accuracy * 100.0:5.2f}%')
Classification accuracy: 100.00%
3. Sample code for the sequential model – use the add method to build the model
    # Author: Louis Young
    # Note: 4-8-4-1 fully connected neural network built using the Iris dataset
    import tensorflow as tf
    import tensorflow.keras as keras
    import tensorflow.keras.layers as layers
    import sklearn.datasets as datasets

    # 1. Define the layers
    # 1.1. Input layer: must be a tensor created by InputLayer or Input
    input_layer = keras.Input(shape=(4,))  # 4 is the feature dimension of Iris (see the iris dataset description in sklearn)
    # 1.2. Hidden layers: 8-4
    hide1_layer = layers.Dense(units=8, activation='relu')
    hide2_layer = layers.Dense(units=4, activation='relu')
    # 1.3. Output layer: 1
    output_layer = layers.Dense(units=1, activation='sigmoid')

    # 2. Use Sequential to build a sequential model
    seq_model = keras.Sequential()
    seq_model.add(input_layer)
    seq_model.add(hide1_layer)
    seq_model.add(hide2_layer)
    seq_model.add(output_layer)

    # ---- The following part is the same as in the previous sample
    # 3. Training
    # 3.1. Training parameters
    seq_model.compile(optimizer='adam', loss='binary_crossentropy', metrics=['mse'])
    # 3.2. Training data
    data, target = datasets.load_iris(return_X_y=True)
    data = data[:100, :]  # take the first 100 samples (the first and second classes)
    target = target[:100]
    seq_model.fit(x=data, y=target, batch_size=10, epochs=1000, verbose=0)

    # 4. Prediction and evaluation
    pre_result = seq_model.predict(data)
    category = [0 if item <= 0.5 else 1 for item in pre_result]
    accuracy = (target == category).mean()
    print(f'Classification accuracy: {accuracy * 100.0:5.2f}%')
Classification accuracy: 100.00%
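Passing a list to the constructor and calling add repeatedly are two routes to the same ordered chain. The miniature container below sketches that idea in plain Python. `TinySequential` and the two toy functions are hypothetical names for this illustration, and the real Sequential class does considerably more (shape inference, weight tracking, and so on).

```python
class TinySequential:
    """Hypothetical miniature of a sequential container: applies callables in order."""

    def __init__(self, layers=None):
        self.layers = list(layers) if layers else []

    def add(self, layer):
        # Same effect as including the layer in the constructor list.
        self.layers.append(layer)

    def __call__(self, x):
        for layer in self.layers:
            x = layer(x)  # the output of one layer is the input of the next
        return x


def add_one(x):
    return x + 1


def times_ten(x):
    return x * 10


from_list = TinySequential(layers=[add_one, times_ten])

from_add = TinySequential()
from_add.add(add_one)
from_add.add(times_ten)

print(from_list(2), from_add(2))  # 30 30
```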
4. The meaning of the input and output attributes of the Layer class
    # Author: Louis Young
    # Note: 4-8-4-1 fully connected neural network built using the Iris dataset
    import tensorflow as tf
    import tensorflow.keras as keras
    import tensorflow.keras.layers as layers
    import sklearn.datasets as datasets

    # 1. Define the layers
    # 1.1. Input layer: must be a tensor created by InputLayer or Input
    input_layer = keras.Input(shape=(4,))  # 4 is the feature dimension of Iris (see the iris dataset description in sklearn)
    # 1.2. Hidden layers: 8-4
    hide1_layer = layers.Dense(units=8, activation='relu')
    hide2_layer = layers.Dense(units=4, activation='relu')
    # 1.3. Output layer: 1
    output_layer = layers.Dense(units=1, activation='sigmoid')

    # 2. Use Sequential to build a sequential model
    seq_model = keras.Sequential()
    seq_model.add(input_layer)
    seq_model.add(hide1_layer)
    # seq_model = keras.Sequential([input_layer, hide1_layer]) has the same effect as the three statements above.

    # In essence, add calls the layer on the output of the previous one.
    # A layer only has input and output attributes after it has been called.
    # If hide1_layer has not been added (and so never called), this print raises an error,
    # which shows what add does.
    print(hide1_layer.input)
Tensor("input_17:0", shape=(None, 4), dtype=float32)
2. Classic model implementation – custom model Model
For professional use, it is recommended to subclass Model to implement your own model.
1. Subclass Model description
- Subclassing the model means overriding the call function.
- The call function comes from the parent class of the model: tensorflow.python.keras.engine.base_layer.Layer
- Model is therefore a special kind of Layer.
- Model inheritance structure (internal module paths as in the TensorFlow version used here; they vary between versions)
    tf.keras.models.Model
    |- tensorflow.python.keras.engine.network.Network
       |- tensorflow.python.keras.engine.base_layer.Layer
          |- tensorflow.python.module.module.Module
             |- tensorflow.python.training.tracking.tracking.AutoTrackable
Create your own fully customized models by extending the Model class and implementing your own forward pass in the call method. Note: The Model class inheritance API was introduced in Keras 2.2.0.
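The relationship described here (the base class owns `__call__`, subclasses only override `call`, and a model is itself a layer) can be sketched in plain Python. All three class names below are hypothetical and stand in for the Keras classes; this is an illustration of the pattern, not the Keras implementation.

```python
class BaseLayer:
    """Hypothetical base class: __call__ is fixed, subclasses supply call()."""

    def __call__(self, inputs):
        # Shared bookkeeping would live here; the real work is delegated to call().
        return self.call(inputs)

    def call(self, inputs):
        raise NotImplementedError('subclasses must override call()')


class BaseModel(BaseLayer):
    """In this sketch a model is just a special layer."""


class ClipModel(BaseModel):
    def call(self, inputs):
        # Custom forward pass: clamp every value into [0, 1].
        return [min(1.0, max(0.0, v)) for v in inputs]


model = ClipModel()
print(model([-0.5, 0.25, 3.0]))      # [0.0, 0.25, 1.0]
print(isinstance(model, BaseLayer))  # True
```

Because the user only ever writes `model(x)`, overriding `call` is enough to change what the model computes.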
2. Use a custom Model to build a model
- Customization process
  - Inherit from the Model class;
  - Define a constructor to pass in custom attributes;
  - Override the call function to produce the model output.
- Custom Model sample code
    # Author: Louis Young
    # Note: 4-8-4-1 fully connected neural network built using the Iris dataset
    import tensorflow as tf
    import tensorflow.keras as keras
    import tensorflow.keras.layers as layers
    import sklearn.datasets as datasets


    class ModelException(Exception):
        def __init__(self, msg='model construction exception'):
            super().__init__(msg)
            self.message = msg


    # 1. Build the model
    class IrisModel(keras.Model):
        # constructor
        def __init__(self, networks):
            super(IrisModel, self).__init__(name='iris_model')
            # Check that the parameter is a list
            if not isinstance(networks, list):
                raise ModelException('networks specifies the network structure and must be a list')
            # Store the networks attribute
            self.networks = networks
            # Build the layers (the input layer is not included).
            # The list is named dense_layers because Keras uses _layers internally.
            self.dense_layers = []
            for _net in networks[:-1]:
                layer = layers.Dense(units=_net, activation='relu')
                self.dense_layers.append(layer)
            # The last layer uses the sigmoid function
            layer = layers.Dense(units=networks[-1], activation='sigmoid')
            self.dense_layers.append(layer)

        # forward pass: build the model output from the layers
        def call(self, inputs, **kwargs):
            # The input of the first layer comes from the parameter inputs
            x = self.dense_layers[0](inputs)
            for _layer in self.dense_layers[1:]:
                # The output of the previous layer is the input of the next layer
                x = _layer(x)
            # return the output of the last layer
            return x


    # 2. Create a model instance
    model = IrisModel([8, 4, 1])  # the 4-feature input layer is not listed
    # ---- the above is a typical way to construct a model

    # 3. Define training parameters
    model.compile(
        optimizer='adam',            # specify the optimizer
        loss='binary_crossentropy',  # specify the loss function
        metrics=['accuracy']
    )

    # 4. Data loading
    data, target = datasets.load_iris(return_X_y=True)
    data = data[:100, :]  # take the first 100 samples (the first and second classes)
    target = target[:100]

    # 5. Training
    model.fit(x=data, y=target, batch_size=10, epochs=100, verbose=0)

    # 6. Prediction
    # pre_result = model.predict(data)
    # Calling the model runs the same forward pass as predict, but returns a tensor instead of a NumPy array
    pre_result = model(data)
    category = [0 if item <= 0.5 else 1 for item in pre_result]
    accuracy = (target == category).mean()
    print(f'Classification accuracy: {accuracy * 100.0:5.2f}%')
Classification accuracy: 100.00%
3. Customized training of the model
- Training with the compile and fit functions provided by Model is convenient and easy to read. However, finer control over training requires understanding what compile and fit do internally.
- These functions are wrappers around lower-level TensorFlow operations, so the following looks at model training from that lower level.
- A worked example is used to explain the training details.
1. Define training parameters
1.1. Optimizer object
- Adam constructor description
    __init__(
        learning_rate=0.001,  # learning rate
        beta_1=0.9,
        beta_2=0.999,
        epsilon=1e-07,
        amsgrad=False,
        name='Adam',
        **kwargs
    )
- Applying gradients with the optimizer
    apply_gradients(  # gradient update
        grads_and_vars,
        name=None
    )
- Code to create the optimizer
    import tensorflow.keras.optimizers as optimizers
    optimizer = optimizers.Adam(learning_rate=0.0001)
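To show what the constructor parameters learning_rate, beta_1, beta_2 and epsilon control, here is a single Adam update for one scalar weight, written in plain Python from the textbook form of the rule. This is a sketch for intuition: the function name `adam_step` is made up, the numbers are illustrative, and the Keras implementation may arrange the arithmetic differently.

```python
import math


def adam_step(param, grad, m, v, t,
              learning_rate=0.001, beta_1=0.9, beta_2=0.999, epsilon=1e-07):
    """One Adam update for a single scalar parameter (textbook form)."""
    m = beta_1 * m + (1.0 - beta_1) * grad         # running mean of gradients
    v = beta_2 * v + (1.0 - beta_2) * grad * grad  # running mean of squared gradients
    m_hat = m / (1.0 - beta_1 ** t)                # bias correction for early steps
    v_hat = v / (1.0 - beta_2 ** t)
    param = param - learning_rate * m_hat / (math.sqrt(v_hat) + epsilon)
    return param, m, v


# Illustrative numbers: weight 0.5, gradient 4.0, first step (t=1).
new_weight, m, v = adam_step(param=0.5, grad=4.0, m=0.0, v=0.0, t=1)
print(round(new_weight, 6))  # 0.499
```

On the first step the update is close to learning_rate times the sign of the gradient, regardless of the gradient's magnitude, which is why the weight moves by about 0.001 here.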
1.2. Loss object
- This object is a callable object used to compute the loss.
- Constructor description of the CategoricalCrossentropy loss class:
    __init__(
        from_logits=False,
        label_smoothing=0,
        reduction=losses_utils.ReductionV2.AUTO,
        name='categorical_crossentropy'
    )
- Callable operator
    __call__(
        y_true,  # label value
        y_pred,  # predicted value
        sample_weight=None
    )
  The call returns the loss value directly, as a tensor.
- Loss object construction
    import tensorflow.keras.losses as losses
    loss = losses.CategoricalCrossentropy()
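As a worked example of what a categorical cross-entropy loss computes, the sketch below evaluates the standard formula, the mean over samples of -sum(y_true * log(y_pred)), on two made-up samples. It assumes y_pred already holds probabilities (the from_logits=False case) and leaves out label smoothing and the numerical clipping a library would apply.

```python
import math


def categorical_crossentropy(y_true, y_pred):
    """Mean over samples of -sum(y_true * log(y_pred)); y_pred rows are probabilities."""
    per_sample = [
        -sum(t * math.log(p) for t, p in zip(true_row, pred_row) if t > 0)
        for true_row, pred_row in zip(y_true, y_pred)
    ]
    return sum(per_sample) / len(per_sample)


# Illustrative one-hot labels and predicted probabilities for two samples.
y_true = [[1.0, 0.0], [0.0, 1.0]]
y_pred = [[0.5, 0.5], [0.25, 0.75]]

loss_value = categorical_crossentropy(y_true, y_pred)
print(round(loss_value, 4))  # 0.4904 = (-ln 0.5 - ln 0.75) / 2
```

The function returns the number directly, which mirrors how calling a loss object yields the loss value itself.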
1.3. Gradient calculation
- Use `tf.GradientTape` to calculate the gradient
- Context manager methods
    __enter__()
    __exit__(
        type,
        value,
        traceback
    )
- Calculate the gradient
  Returns a gradient structure with the same structure as sources.
    gradient(
        target,  # the tensor to differentiate, usually the loss
        sources,
        output_gradients=None,
        unconnected_gradients=tf.UnconnectedGradients.NONE
    )
- Gradient object and gradient calculation code
    # model, x, y and loss_fn are assumed to be defined elsewhere
    with tf.GradientTape() as tape:
        loss = loss_fn(y, model(x))  # loss tensor (not the loss object), recorded by the tape
    gradients = tape.gradient(
        loss,                        # loss tensor
        model.trainable_variables)   # trainable weights that need to be updated
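To make "a gradient structure with the same structure as sources" concrete, the sketch below computes the gradients of a tiny squared-error loss with respect to a list of two parameters and checks them against finite differences. GradientTape uses automatic differentiation rather than finite differences; this plain-Python example only illustrates what the returned values mean. All names and numbers are illustrative.

```python
def loss_fn(w, b, x=2.0, y=1.0):
    """Squared error of a one-weight linear model on a single sample."""
    return (w * x + b - y) ** 2


def analytic_gradients(w, b, x=2.0, y=1.0):
    residual = w * x + b - y
    return [2.0 * residual * x, 2.0 * residual]  # [dL/dw, dL/db]


def numeric_gradients(w, b, h=1e-6):
    # Central differences, used here only as an independent check.
    dw = (loss_fn(w + h, b) - loss_fn(w - h, b)) / (2 * h)
    db = (loss_fn(w, b + h) - loss_fn(w, b - h)) / (2 * h)
    return [dw, db]


sources = [0.5, 0.25]  # [w, b]
gradients = analytic_gradients(*sources)
checked = numeric_gradients(*sources)

print(gradients)                       # [1.0, 0.5]
print(len(gradients) == len(sources))  # True: one gradient per source
```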
1.4. Evaluation metrics
- Evaluation metrics are used to calculate accuracy, precision, recall, and so on.
- The following describes the Accuracy class.
- Accuracy constructor
    __init__(
        name='accuracy',
        dtype=None
    )
- Callable operator
  The inheritance structure of the Accuracy class is:
    Accuracy
    |- MeanMetricWrapper
       |- Mean
          |- Reduce
             |- Metric
                |- tensorflow.python.keras.engine.base_layer.Layer
  where the callable operator is defined as:
    def __call__(self, *args, **kwargs):
  After calling, you can use the result function to get the result.
- Metric object construction
    import tensorflow.keras.metrics as metrics
    metric = metrics.Accuracy()
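A metric object differs from a loss object in that it keeps state across calls, and result reports the accumulated value. The plain-Python class below sketches that behaviour for exact-match accuracy. `RunningAccuracy` is a hypothetical name, and the sketch assumes, as the Accuracy metric does, that predictions are compared with labels for equality.

```python
class RunningAccuracy:
    """Hypothetical stateful metric: accumulates exact matches across calls."""

    def __init__(self):
        self.total = 0
        self.count = 0

    def __call__(self, y_true, y_pred):
        # Accumulate this batch, then report the running value.
        self.total += sum(1 for t, p in zip(y_true, y_pred) if t == p)
        self.count += len(y_true)
        return self.result()

    def result(self):
        return self.total / self.count if self.count else 0.0


metric = RunningAccuracy()
metric([0, 1, 1], [0, 1, 0])  # batch 1: 2 of 3 match
metric([1, 0], [1, 0])        # batch 2: 2 of 2 match
print(metric.result())        # 0.8

# Exact matching needs class labels, not raw sigmoid outputs:
raw = RunningAccuracy()
raw([0, 1], [0.1, 0.9])
print(raw.result())           # 0.0
```

The second case suggests why sigmoid outputs would need to be thresholded (or a binary-accuracy style metric used) before an exact-match metric gives a meaningful value.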
2. Implementation of training process
Implementation steps:
- Compute the model output
- Compute the loss value (the loss tensor, not the loss object)
- From the loss value, compute the gradients
- Update the weights using the optimizer
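The four steps can be followed end to end on a model small enough to differentiate by hand. The sketch below trains a one-feature logistic model in plain Python. It is an illustration under stated assumptions: the toy data is made up, the gradients are written out analytically instead of being recorded by a tape, and plain gradient descent stands in for Adam.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def mean_bce(w, b, xs, ys):
    """Mean binary cross-entropy of a one-feature logistic model."""
    total = 0.0
    for x, y in zip(xs, ys):
        p = sigmoid(w * x + b)
        total += -(y * math.log(p) + (1 - y) * math.log(1 - p))
    return total / len(xs)


# Illustrative, linearly separable toy data.
xs = [-2.0, -1.0, 1.0, 2.0]
ys = [0, 0, 1, 1]
w, b, learning_rate = 0.0, 0.0, 0.5

initial_loss = mean_bce(w, b, xs, ys)
for step in range(50):
    # 1. compute the model output
    outputs = [sigmoid(w * x + b) for x in xs]
    # 2. compute the loss value (kept here only for inspection)
    loss_value = mean_bce(w, b, xs, ys)
    # 3. compute the gradients of the loss with respect to w and b
    grad_w = sum((p - y) * x for p, x, y in zip(outputs, xs, ys)) / len(xs)
    grad_b = sum(p - y for p, y in zip(outputs, ys)) / len(xs)
    # 4. update the weights
    w -= learning_rate * grad_w
    b -= learning_rate * grad_b

final_loss = mean_bce(w, b, xs, ys)
print(final_loss < initial_loss)  # True
```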
2.1. Training preparation-creating model
This reuses the model code from above, with the extra imports needed for custom training:
    # Author: Louis Young
    # Note: 4-8-4-1 fully connected neural network built using the Iris dataset
    import tensorflow as tf
    import tensorflow.keras as keras
    import tensorflow.keras.layers as layers
    import tensorflow.keras.optimizers as optimizers
    import tensorflow.keras.losses as losses
    import tensorflow.keras.metrics as metrics
    import sklearn.datasets as datasets


    class ModelException(Exception):
        def __init__(self, msg='model construction exception'):
            super().__init__(msg)
            self.message = msg


    # 1. Build the model
    class IrisModel(keras.Model):
        # constructor
        def __init__(self, networks):
            super(IrisModel, self).__init__(name='iris_model')
            # Check that the parameter is a list
            if not isinstance(networks, list):
                raise ModelException('networks specifies the network structure and must be a list')
            # Store the networks attribute
            self.networks = networks
            # Build the layers (the input layer is not included).
            # The list is named dense_layers because Keras uses _layers internally.
            self.dense_layers = []
            for _net in networks[:-1]:
                layer = layers.Dense(units=_net, activation='relu')
                self.dense_layers.append(layer)
            # The last layer uses the sigmoid function
            layer = layers.Dense(units=networks[-1], activation='sigmoid')
            self.dense_layers.append(layer)

        # forward pass: build the model output from the layers
        def call(self, inputs, **kwargs):
            # The input of the first layer comes from the parameter inputs
            x = self.dense_layers[0](inputs)
            for _layer in self.dense_layers[1:]:
                # The output of the previous layer is the input of the next layer
                x = _layer(x)
            # return the output of the last layer
            return x


    # 2. Create a model instance
    model = IrisModel([8, 4, 1])  # the 4-feature input layer is not listed
    # ---- the above is a typical way to construct a model
2.2. Define the objects needed for training
Objects required for general training:
- Optimizer: optimizers package
- Loss: losses package
- Gradient calculation: tf.GradientTape class
Optional object, useful if you need evaluation data during training:
- Evaluation metrics: the metrics package
    # 1. Optimizer object
    optimizer = optimizers.Adam()
    # 2. Loss calculation object
    losser = losses.BinaryCrossentropy()
    # 3. Accuracy calculation object
    accuracier = metrics.Accuracy()
    # 4. Gradient calculation object
    # A (non-persistent) GradientTape can compute gradients only once,
    # so create a new one in a with block for every batch
    # tape = tf.GradientTape()
2.3. Training Implementation
- Prepare data
    # prepare data
    data, target = datasets.load_iris(return_X_y=True)
    data = data[:100, :]  # take the first 100 samples (the first and second classes)
    target = target[:100]
- Train
    import math

    epochs = 100
    batch_size = 10
    batch_num = int(math.ceil(len(data) / batch_size))

    for epoch in range(epochs):
        for i in range(batch_num):  # iterate over batches, not over individual samples
            with tf.GradientTape() as tape:
                start_ = i * batch_size
                end_ = (i + 1) * batch_size
                # compute the output
                predictions = model(data[start_: end_])
                # compute the loss
                loss_value = losser(target[start_: end_], predictions)
            # compute the gradients after leaving the with block
            gradients = tape.gradient(loss_value, model.trainable_variables)
            # update the weights
            optimizer.apply_gradients(zip(gradients, model.trainable_variables))

    pre_result = model(data)
    category = [0 if item <= 0.5 else 1 for item in pre_result]
    accuracy = (target == category).mean()
    print(f'Classification accuracy: {accuracy * 100.0:5.2f}%')
    print(pre_result)
Classification accuracy: 100.00%
tf.Tensor(
[[0.05429551]
 [0.06767229]
 [0.06427257]
 [0.06895139]
 ...omitted
 [0.99639302]
 [0.95713528]
 [0.99546698]], shape=(100, 1), dtype=float64)
Training runs for 100 epochs with a batch size of 10, so each epoch has 10 batches and the weights are updated 100 * 10 = 1000 times in total. Looking at the predicted values after training, the result is good (the iris data is a simple dataset :D).
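The batch arithmetic can be checked independently of TensorFlow. The sketch below lists the index ranges that cover 100 samples in batches of 10 and multiplies out the number of weight updates. `batch_slices` is a hypothetical helper name used only for this illustration.

```python
import math


def batch_slices(sample_count, batch_size):
    """Return (start, end) index pairs that cover sample_count items in order."""
    batch_num = math.ceil(sample_count / batch_size)
    return [(i * batch_size, min((i + 1) * batch_size, sample_count))
            for i in range(batch_num)]


slices = batch_slices(sample_count=100, batch_size=10)
epochs = 100
total_updates = epochs * len(slices)

print(len(slices))            # 10
print(slices[0], slices[-1])  # (0, 10) (90, 100)
print(total_updates)          # 1000
```

When the sample count is not a multiple of the batch size, the last slice is simply shorter, for example (20, 25) for 25 samples in batches of 10.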