Using the Model Class in Keras

This article explains how to use the Model class in the Keras framework. It covers:

1. Two ways to use the model;

2. Implementing a classic model (custom model);

3. Customized training of the model;

1. Two ways to use the model

  • Keras provides two ways to build a model, through two APIs:

  • Functional model: through the Model class API;

  • Sequential model: through the Sequential class API;

  • The running example in this article is again a deep fully connected network:
    a 4 -> 8 -> 4 -> 1 network.
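As a quick sanity check on the size of this network, the sketch below counts the trainable parameters of a fully connected 4 -> 8 -> 4 -> 1 network. It assumes every Dense layer has a bias vector, which is the Keras default. The helper name dense_param_count is hypothetical and not part of Keras.

```python
def dense_param_count(layer_sizes):
    """Count the weights and biases of a fully connected network."""
    total = 0
    for fan_in, fan_out in zip(layer_sizes[:-1], layer_sizes[1:]):
        # One weight per (input, output) pair plus one bias per output unit.
        total += fan_in * fan_out + fan_out
    return total


param_count = dense_param_count([4, 8, 4, 1])
print(param_count)  # 40 + 36 + 5 = 81
```

For a model that has already been built, the same number should appear as the total parameter count in the output of model.summary().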

1. Functional model

  • The programming characteristics of the functional model are:

  1. The programmer builds the layers and chains them through function calls, using either the callable behavior of the Layer object or the apply and call methods;

  1. Model only needs to be given inputs and outputs;

  • This mode is called functional because Layer provides several function-style calling methods, and the network between layers is built through them.

  1. Layer is a callable object: it provides the __call__ operator (…);

  1. the apply function;
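To make the chaining idea concrete without TensorFlow, here is a minimal pure-Python sketch. ToyLayer is a hypothetical stand-in for a Keras Layer, not the real class: it only shows how a __call__ operator, and an apply alias for it, let the output of one layer be passed straight into the next.

```python
class ToyLayer:
    """Hypothetical stand-in for a Keras Layer; it multiplies every input by a factor."""

    def __init__(self, factor):
        self.factor = factor

    def call(self, inputs):
        # The actual computation of the layer lives in call.
        return [value * self.factor for value in inputs]

    def __call__(self, inputs):
        # Making the object callable lets us write layer(inputs).
        return self.call(inputs)

    # apply is just another name for the same callable operator.
    apply = __call__


double = ToyLayer(2)
triple = ToyLayer(3)

# Chain the layers: the output of one call is the input of the next.
chained_by_call = triple(double([1, 2]))
chained_by_apply = triple.apply(double.apply([1, 2]))
print(chained_by_call, chained_by_apply)  # [6, 12] [6, 12]
```

A real Keras layer does much more inside __call__ (building weights, tracking the graph), but the chaining pattern is the same.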

  1. Schematic diagram of a functional model

  1. Sample code for a functional model – using Layer as a callable

# Author: Louis Young
# Note: 4-8-4-1 fully connected neural network built using the Iris dataset
import tensorflow as tf
import tensorflow.keras as keras
import tensorflow.keras.layers as layers
import sklearn.datasets as datasets
 
# 1. Define the Layer layer;
# 1.1. Input layer: must be Tensor created by InputLayer or Input;
input_layer = keras.Input(shape=(4,)) # 4 is the feature dimension of Iris (4 features: you can refer to the feature description of the iris dataset in sklearn)
# 1.2. Hidden layers: 8-4
hide1_layer = layers.Dense(units=8, activation='relu')
hide2_layer = layers.Dense(units=4, activation='relu')
# 1.3. Output layer: 1
output_layer = layers.Dense(units=1, activation='sigmoid')
 
# 2. Construct the function chain relationship between Layers
hide1_layer_tensor = hide1_layer(input_layer) # <---------------------- use callable features
hide2_layer_tensor = hide2_layer(hide1_layer_tensor)
output_layer_tensor = output_layer(hide2_layer_tensor)
 
# 3. Use inputs and outputs to build a function chain model;
model = keras.Model(inputs=input_layer, outputs=output_layer_tensor) # inputs and outputs must be the tensor output by Layer call
 
# 4. Training
# 4.1. Training parameters
model.compile(optimizer='adam', loss='binary_crossentropy', metrics=['mse'])
# 4.2. Training
data, target = datasets.load_iris(return_X_y=True)
data = data[:100, :] # Take the first 100 samples (the first and second categories)
target = target[:100]
model.fit(x=data, y=target, batch_size=10, epochs=1000, verbose=0)
# 4.3. Prediction and evaluation
pre_result = model.predict(data)
category = [0 if item <= 0.5 else 1 for item in pre_result]
accuracy = (target == category).mean()
print(f'Classification accuracy: {accuracy * 100.0:5.2f}%')
Classification accuracy: 100.00%

  1. Example code for a functional model – using Layer’s apply function
    The apply function is an alias for the __call__ operator. In TensorFlow 2.x it is marked as deprecated in favor of calling the layer directly.

# Author: Louis Young
# Note: 4-8-4-1 fully connected neural network built using the Iris dataset
import tensorflow as tf
import tensorflow.keras as keras
import tensorflow.keras.layers as layers
import sklearn.datasets as datasets
 
# 1. Define the Layer layer;
# 1.1. Input layer: must be Tensor created by InputLayer or Input;
input_layer = keras.Input(shape=(4,)) # 4 is the feature dimension of Iris (4 features: you can refer to the feature description of the iris dataset in sklearn)
# 1.2. Hidden layers: 8-4
hide1_layer = layers.Dense(units=8, activation='relu')
hide2_layer = layers.Dense(units=4, activation='relu')
# 1.3. Output layer: 1
output_layer = layers.Dense(units=1, activation='sigmoid')
 
# 2. Construct the function chain relationship between Layers
hide1_layer_tensor = hide1_layer.apply(input_layer) # <----------------------use apply
hide2_layer_tensor = hide2_layer.apply(hide1_layer_tensor)
output_layer_tensor = output_layer.apply(hide2_layer_tensor)
# 3. Use inputs and outputs to build a function chain model;
model = keras.Model(inputs=input_layer, outputs=output_layer_tensor) # inputs and outputs must be the tensor output by Layer call
 
# 4. Training
# 4.1. Training parameters
model.compile(optimizer='adam', loss='binary_crossentropy', metrics=['mse'])
# 4.2. Training
data, target = datasets.load_iris(return_X_y=True)
data = data[:100, :] # Take the first 100 samples (the first and second categories)
target = target[:100]
model.fit(x=data, y=target, batch_size=10, epochs=1000, verbose=0)
# 4.3. Prediction and evaluation
pre_result = model.predict(data)
category = [0 if item <= 0.5 else 1 for item in pre_result]
accuracy = (target == category).mean()
print(f'Classification accuracy: {accuracy * 100.0:5.2f}%')
Tensor("dense_29/Identity:0", shape=(None, 1), dtype=float32)
Classification accuracy: 100.00%

2. Sequential model

Programming features of the sequential model:

  1. Layer provides input and output attributes;

  1. The Sequential class uses the input and output attributes of each Layer to maintain the relationship between layers and build the network model;

  • The first entry must be an input created by InputLayer or the Input function;
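The idea behind the sequential model can be sketched in a few lines of plain Python. ToySequential below is a hypothetical stand-in, not the Keras class: it keeps an ordered list of callables and feeds the output of each one into the next. It supports the same two construction styles used in this article, a layers list and the add method.

```python
class ToySequential:
    """Hypothetical stand-in for keras.Sequential, holding an ordered list of callables."""

    def __init__(self, layers=None):
        self.layers = list(layers) if layers else []

    def add(self, layer):
        # Append one more step to the end of the chain.
        self.layers.append(layer)

    def __call__(self, inputs):
        outputs = inputs
        for layer in self.layers:
            # The output of the previous step is the input of the next one.
            outputs = layer(outputs)
        return outputs


def add_one(x):
    return x + 1


def double(x):
    return x * 2


# Style 1: pass all steps through the layers parameter.
from_list = ToySequential(layers=[add_one, double])

# Style 2: start empty and call add for each step.
from_add = ToySequential()
from_add.add(add_one)
from_add.add(double)

result_from_list = from_list(3)
result_from_add = from_add(3)
print(result_from_list, result_from_add)  # 8 8
```

Both styles produce the same chain, which is why the two Keras listings in this section behave identically.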

1. Schematic diagram of sequential model

2. Sample code for the sequential model – building the model with the layers parameter

# Author: Louis Young
# Note: 4-8-4-1 fully connected neural network built using the Iris dataset
import tensorflow as tf
import tensorflow.keras as keras
import tensorflow.keras.layers as layers
import sklearn.datasets as datasets
 
 
# 1. Build the Layer layer
# 1. Define the Layer layer;
# 1.1. Input layer: must be Tensor created by InputLayer or Input;
input_layer = keras.Input(shape=(4,)) # 4 is the feature dimension of Iris (4 features: you can refer to the feature description of the iris dataset in sklearn)
# 1.2. Hidden layers: 8-4
hide1_layer = layers.Dense(units=8, activation='relu')
hide2_layer = layers.Dense(units=4, activation='relu')
# 1.3. Output layer: 1
output_layer = layers.Dense(units=1, activation='sigmoid')
 
# 2. Use Sequential to build a sequential model
seq_model = keras.Sequential(layers=[input_layer, hide1_layer, hide2_layer, output_layer])
 
# ------------------- The following part is exactly the same as the above code
# 3. Training
# 3.1. Training parameters
seq_model.compile(optimizer='adam', loss='binary_crossentropy', metrics=['mse'])
# 3.2. Training
data, target = datasets.load_iris(return_X_y=True)
data = data[:100, :] # Take the first 100 samples (the first and second categories)
target = target[:100]
seq_model.fit(x=data, y=target, batch_size=10, epochs=1000, verbose=0)
# 3.3. Prediction and evaluation
pre_result = seq_model.predict(data)
category = [0 if item <= 0.5 else 1 for item in pre_result]
accuracy = (target == category).mean()
print(f'Classification accuracy: {accuracy * 100.0:5.2f}%')
Classification accuracy: 100.00%

3. Sample code for the sequential model – use the add method to build the model

# Author: Louis Young
# Note: 4-8-4-1 fully connected neural network built using the Iris dataset
import tensorflow as tf
import tensorflow.keras as keras
import tensorflow.keras.layers as layers
import sklearn.datasets as datasets
 
 
# 1. Build the Layer layer
# 1. Define the Layer layer;
# 1.1. Input layer: must be Tensor created by InputLayer or Input;
input_layer = keras.Input(shape=(4,)) # 4 is the feature dimension of Iris (4 features: you can refer to the feature description of the iris dataset in sklearn)
# 1.2. Hidden layers: 8-4
hide1_layer = layers.Dense(units=8, activation='relu')
hide2_layer = layers.Dense(units=4, activation='relu')
# 1.3. Output layer: 1
output_layer = layers.Dense(units=1, activation='sigmoid')
 
# 2. Use Sequential to build a sequential model
seq_model = keras.Sequential()
seq_model.add(input_layer)
seq_model.add(hide1_layer)
seq_model.add(hide2_layer)
seq_model.add(output_layer)
 
# ------------------- The following part is exactly the same as the above code
# 3. Training
# 3.1. Training parameters
seq_model.compile(optimizer='adam', loss='binary_crossentropy', metrics=['mse'])
# 3.2. Training
data, target = datasets.load_iris(return_X_y=True)
data = data[:100, :] # Take the first 100 samples (the first and second categories)
target = target[:100]
seq_model.fit(x=data, y=target, batch_size=10, epochs=1000, verbose=0)
# 3.3. Prediction and evaluation
pre_result = seq_model.predict(data)
category = [0 if item <= 0.5 else 1 for item in pre_result]
accuracy = (target == category).mean()
print(f'Classification accuracy: {accuracy * 100.0:5.2f}%')
Classification accuracy: 100.00%

4. The meaning of the input and output attributes of the Layer class

# Author: Louis Young
# Note: 4-8-4-1 fully connected neural network built using the Iris dataset
import tensorflow as tf
import tensorflow.keras as keras
import tensorflow.keras.layers as layers
import sklearn.datasets as datasets
 
 
# 1. Build the Layer layer
# 1. Define the Layer layer;
# 1.1. Input layer: must be Tensor created by InputLayer or Input;
input_layer = keras.Input(shape=(4,)) # 4 is the feature dimension of Iris (4 features: you can refer to the feature description of the iris dataset in sklearn)
# 1.2. Hidden layers: 8-4
hide1_layer = layers.Dense(units=8, activation='relu')
hide2_layer = layers.Dense(units=4, activation='relu')
# 1.3. Output layer: 1
output_layer = layers.Dense(units=1, activation='sigmoid')
 
# 2. Use Sequential to build a sequential model
seq_model = keras.Sequential()
seq_model.add(input_layer)
seq_model.add(hide1_layer)
 
# seq_model = keras.Sequential([input_layer, hide1_layer]) has the same effect as the above three statements.
 
# In essence, the add method also calls the Layer object automatically. A Layer has input and output attributes only after it has been called.
print(hide1_layer.input) # Comment out seq_model.add(hide1_layer) above and this line raises an error, which shows what add does.
Tensor("input_17:0", shape=(None, 4), dtype=float32)

2. Classic model implementation – custom model Model

For production-grade models, subclassing Model to implement your own model is recommended.

1. Subclass Model description

  • Subclassing the model means overriding the call method.

  • The call method comes from a parent class of Model: tensorflow.python.keras.engine.base_layer.Layer

  • A Model is therefore a special kind of Layer.

  • Model inheritance structure

tf.keras.models.Model
        |- tensorflow.python.keras.engine.network.Network
            |- tensorflow.python.keras.engine.base_layer.Layer
                |- tensorflow.python.module.module.Module
                   |- tensorflow.python.training.tracking.tracking.AutoTrackable

Create your own fully customized models by extending the Model class and implementing your own forward pass in the call method. Note: The Model class inheritance API was introduced in Keras 2.2.0.

2. Use a custom Model to build a model

  1. Customization process

  • Inherit from the Model class;

  • Define a constructor that receives the custom attributes;

  • Override the call method to produce the model output;

  1. Custom Model sample code

# Author: Louis Young
# Note: 4-8-4-1 fully connected neural network built using the Iris dataset
import tensorflow as tf
import tensorflow.keras as keras
import tensorflow.keras.layers as layers
import sklearn.datasets as datasets
 
 
class ModelException(Exception):
    def __init__(self, msg='model construction exception'):
        super().__init__(msg)
        self.message = msg
 
# 1. Build the model
class IrisModel(keras.Model):
    # constructor
    def __init__(self, networks):
        super(IrisModel, self).__init__(name='iris_model')
        # Check that the parameter is a list
        if not isinstance(networks, list):
            raise ModelException('networks specifies the network structure and must be a list')
        # Store the networks attribute
        self.networks = networks
        # Build the layers. The list is named dense_layers rather than _layers
        # to avoid clashing with an attribute that some Keras versions use internally.
        self.dense_layers = []
        for _net in networks[:-1]: # hidden layers only; the input layer is not listed
            layer = layers.Dense(units=_net, activation='relu')
            self.dense_layers.append(layer)
 
        # The last layer uses the sigmoid function
        layer = layers.Dense(units=networks[-1], activation='sigmoid')
        self.dense_layers.append(layer)
 
    # forward pass: build the network output
    def call(self, inputs, **kwargs):
        # Build the model output layer by layer
        # The input of the first layer comes from the parameter inputs
        x = self.dense_layers[0](inputs)
        for _layer in self.dense_layers[1:]:
            # The output of the previous layer is the input of the next layer
            x = _layer(x)
 
        # return the output of the last layer
        return x
 
 
# 2. Create a model instance
model = IrisModel([8, 4, 1]) # the 4-feature input layer is not listed
# ----------------------------------------<the above is a typical model construction method.
 
# 3. Define training parameters
model.compile(
    optimizer='adam', # specify the optimizer
    loss='binary_crossentropy', # specify the loss function
    metrics=['accuracy']
)
 
# 4. Data loading
data, target = datasets.load_iris(return_X_y=True)
data = data[:100, :] # Take the first 100 samples (the first and second categories)
target = target[:100]
# 5. Training
model.fit(x=data, y=target, batch_size=10, epochs=100, verbose=0)
 
# 6. Prediction
# pre_result = model.predict(data)
pre_result = model(data) # Calling the model object gives the same predictions as predict, returned as a tensor
category = [0 if item <= 0.5 else 1 for item in pre_result]
accuracy = (target == category).mean()
print(f'Classification accuracy: {accuracy * 100.0:5.2f}%')
Classification accuracy: 100.00%

3. Customized training of the model

  • Training with the compile and fit methods provided by Model is convenient and easy to read. However, finer control over training requires understanding what compile and fit do internally.

  • Internally, they wrap lower-level TensorFlow operations. The rest of this section looks at model training from that lower level.

  • A worked example explains the training details.

1. Define training parameters

1.1. Optimizer object

  1. Adam constructor description

    __init__(
        learning_rate=0.001, # learning rate
        beta_1=0.9,
        beta_2=0.999,
        epsilon=1e-07,
        amsgrad=False,
        name='Adam',
        **kwargs
    )

  1. The optimizer's update call

 apply_gradients( # apply the gradients to update the weights
        grads_and_vars,
        name=None
    )

  1. Optimizer creation code

import tensorflow.keras.optimizers as optimizers
optimizer = optimizers.Adam(learning_rate=0.0001)
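To show what the constructor parameters control, here is a minimal sketch of one textbook Adam update for a single scalar weight. It uses the default values from the constructor description above. The function adam_step is hypothetical, and the Keras implementation may differ in details such as where epsilon is applied, so treat this as an illustration of the update rule rather than a copy of the library code.

```python
import math


def adam_step(weight, gradient, m, v, step,
              learning_rate=0.001, beta_1=0.9, beta_2=0.999, epsilon=1e-07):
    """One textbook Adam update for a scalar weight (step counts from 1)."""
    # Exponential moving averages of the gradient and of its square.
    m = beta_1 * m + (1.0 - beta_1) * gradient
    v = beta_2 * v + (1.0 - beta_2) * gradient ** 2
    # Bias correction compensates for m and v starting at zero.
    m_hat = m / (1.0 - beta_1 ** step)
    v_hat = v / (1.0 - beta_2 ** step)
    new_weight = weight - learning_rate * m_hat / (math.sqrt(v_hat) + epsilon)
    return new_weight, m, v


new_weight, m, v = adam_step(weight=1.0, gradient=2.0, m=0.0, v=0.0, step=1)
print(new_weight)
```

On the first step the bias-corrected ratio is close to the sign of the gradient, so the weight moves by roughly learning_rate regardless of the gradient's magnitude.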

1.2. Loss object

- This object is a callable object used to implement the loss calculation. 
  1. Constructor description of the CategoricalCrossentropy loss class:

 __init__(
        from_logits=False,
        label_smoothing=0,
        reduction=losses_utils.ReductionV2.AUTO,
        name='categorical_crossentropy'
    )

  1. The callable operator

 __call__(
        y_true, # label value
        y_pred, # predicted value
        sample_weight=None
    )

The call returns the loss value as a tensor.

  1. Loss object construction

import tensorflow.keras.losses as losses
loss = losses.CategoricalCrossentropy()
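As an illustration of what such a loss object computes, the sketch below implements categorical cross-entropy in plain Python. It assumes y_pred already holds probabilities (the from_logits=False case), uses no label smoothing, and averages over the samples. The function name and the toy data are assumptions made for this example, and the clipping that a library implementation would apply to avoid log(0) is left out.

```python
import math


def categorical_crossentropy(y_true, y_pred):
    """Mean over samples of -sum(y_true * log(y_pred)) for one-hot labels."""
    per_sample = []
    for true_row, pred_row in zip(y_true, y_pred):
        per_sample.append(-sum(t * math.log(p) for t, p in zip(true_row, pred_row)))
    return sum(per_sample) / len(per_sample)


y_true = [[1.0, 0.0], [0.0, 1.0]]  # one-hot labels
y_pred = [[0.5, 0.5], [0.5, 0.5]]  # an undecided classifier
loss_value = categorical_crossentropy(y_true, y_pred)
print(loss_value)  # log(2), about 0.693
```

A perfectly confident and correct prediction would give a loss of 0, and the loss grows as probability mass moves away from the true class.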

1.3. Gradient calculation

- Use `tf.GradientTape` to calculate the gradient
  1. Context manager methods

    __enter__()

    __exit__(
        type,
        value,
        traceback
    )

  1. Calculate the gradient
    Returns the gradients in a structure that matches sources.

 gradient(
        target, # tensor to differentiate, usually the loss
        sources,
        output_gradients=None,
        unconnected_gradients=tf.UnconnectedGradients.NONE
    )

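  1. Gradient object and gradient calculation code

# x and y are placeholders for a batch of inputs and the matching labels
with tf.GradientTape() as tape:
    predictions = model(x) # forward pass, recorded by the tape
    loss_value = loss(y, predictions) # loss tensor (not the loss object)
gradients = tape.gradient(
    loss_value,
    model.trainable_variables) # trainable weights that need to be updated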
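The shape of the result, one gradient per source, can be illustrated without TensorFlow. The sketch below estimates gradients numerically with central differences. This is a stand-in technique chosen for illustration only: tf.GradientTape uses automatic differentiation, not finite differences. The names numerical_gradient and toy_loss are hypothetical.

```python
def numerical_gradient(target_fn, sources, delta=1e-6):
    """Estimate d(target)/d(source) for each source with central differences."""
    gradients = []
    for index in range(len(sources)):
        bumped_up = list(sources)
        bumped_down = list(sources)
        bumped_up[index] += delta
        bumped_down[index] -= delta
        slope = (target_fn(bumped_up) - target_fn(bumped_down)) / (2.0 * delta)
        gradients.append(slope)
    # Same length and order as sources, like the structure returned by gradient().
    return gradients


def toy_loss(weights):
    # Analytic gradient is [2 * w0, 3].
    return weights[0] ** 2 + 3.0 * weights[1]


gradients = numerical_gradient(toy_loss, [2.0, 5.0])
print(gradients)  # close to [4.0, 3.0]
```

Each entry can then be paired with its source, which is what zip(gradients, model.trainable_variables) does before the optimizer applies the update.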

1.4. Evaluation metrics

- Evaluation metrics calculate accuracy, precision, recall, and so on.
- The following describes the Accuracy class.
  1. Accuracy constructor

 __init__(
        name='accuracy',
        dtype=None
    )

  1. Callable operator

The inheritance structure of the Accuracy class is:

 Accuracy
        |- MeanMetricWrapper
            |- Mean
                |- Reduce
                    |- Metric
                        |- tensorflow.python.keras.engine.base_layer.Layer

where the callable operator is defined as:

 def __call__(self, *args, **kwargs):

After calling, you can use the result function to get the result.

import tensorflow.keras.metrics as metrics
metric = metrics.Accuracy()
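The call-then-result pattern can be sketched with a small stateful class. ToyAccuracy below is a hypothetical stand-in for a Keras metric: it accumulates counts in update_state, reports the running value in result, and its callable operator does both. It compares labels for exact equality, so the predictions must already be class labels rather than probabilities.

```python
class ToyAccuracy:
    """Hypothetical stand-in for a stateful accuracy metric."""

    def __init__(self):
        self.correct = 0
        self.total = 0

    def update_state(self, y_true, y_pred):
        # Accumulate counts across batches.
        for true_label, predicted_label in zip(y_true, y_pred):
            self.correct += int(true_label == predicted_label)
            self.total += 1

    def result(self):
        # Running accuracy over everything seen so far.
        return self.correct / self.total if self.total else 0.0

    def __call__(self, y_true, y_pred):
        self.update_state(y_true, y_pred)
        return self.result()


metric = ToyAccuracy()
first = metric([0, 1, 1, 0], [0, 1, 0, 0])  # 3 of 4 correct
second = metric([1, 1], [1, 1])             # running total: 5 of 6 correct
print(first, second)
```

Because the state persists between calls, the second value reflects both batches, not just the last one.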

2. Implementation of training process

Implementation steps:

  1. Compute the model output

  1. Compute the loss value (the loss tensor, not the loss object)

  1. Compute the gradients of the loss value with respect to the trainable weights

  1. Update the weights with the optimizer
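The four steps can be followed end to end in a from-scratch sketch that needs no TensorFlow. It trains a one-weight logistic regression on four made-up points. Two substitutions are made for brevity and are assumptions of this example: the gradients are written out analytically instead of being recorded by a tape, and plain gradient descent replaces Adam.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def binary_crossentropy(targets, predictions):
    return -sum(t * math.log(p) + (1.0 - t) * math.log(1.0 - p)
                for t, p in zip(targets, predictions)) / len(targets)


# Toy data: negative inputs belong to class 0, positive inputs to class 1.
inputs = [-2.0, -1.0, 1.0, 2.0]
targets = [0.0, 0.0, 1.0, 1.0]
weight, bias, learning_rate = 0.0, 0.0, 0.5
losses = []

for step in range(50):
    # 1. Compute the model output.
    predictions = [sigmoid(weight * x + bias) for x in inputs]
    # 2. Compute the loss value.
    losses.append(binary_crossentropy(targets, predictions))
    # 3. Compute the gradients of the loss with respect to weight and bias.
    grad_weight = sum((p - t) * x for p, t, x in zip(predictions, targets, inputs)) / len(inputs)
    grad_bias = sum(p - t for p, t in zip(predictions, targets)) / len(inputs)
    # 4. Update the weights (plain gradient descent here, not Adam).
    weight -= learning_rate * grad_weight
    bias -= learning_rate * grad_bias

print(losses[0], losses[-1])
```

The loss starts at log(2), the value for an undecided classifier, and falls as the weight grows in the direction that separates the two classes.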

2.1. Training preparation: creating the model

This reuses the model code from above:

# Author: Louis Young
# Note: 4-8-4-1 fully connected neural network built using the Iris dataset
import tensorflow as tf
import tensorflow.keras as keras
import tensorflow.keras.layers as layers
import tensorflow.keras.optimizers as optimizers
import tensorflow.keras.losses as losses
import tensorflow.keras.metrics as metrics
import sklearn.datasets as datasets
 
 
class ModelException(Exception):
    def __init__(self, msg='model construction exception'):
        super().__init__(msg)
        self.message = msg
 
# 1. Build the model
class IrisModel(keras.Model):
    # constructor
    def __init__(self, networks):
        super(IrisModel, self).__init__(name='iris_model')
        # Check that the parameter is a list
        if not isinstance(networks, list):
            raise ModelException('networks specifies the network structure and must be a list')
        # Store the networks attribute
        self.networks = networks
        # Build the layers. The list is named dense_layers rather than _layers
        # to avoid clashing with an attribute that some Keras versions use internally.
        self.dense_layers = []
        for _net in networks[:-1]: # hidden layers only; the input layer is not listed
            layer = layers.Dense(units=_net, activation='relu')
            self.dense_layers.append(layer)
 
        # The last layer uses the sigmoid function
        layer = layers.Dense(units=networks[-1], activation='sigmoid')
        self.dense_layers.append(layer)
 
    # forward pass: build the network output
    def call(self, inputs, **kwargs):
        # Build the model output layer by layer
        # The input of the first layer comes from the parameter inputs
        x = self.dense_layers[0](inputs)
        for _layer in self.dense_layers[1:]:
            # The output of the previous layer is the input of the next layer
            x = _layer(x)
 
        # return the output of the last layer
        return x
 
 
# 2. Create a model instance
model = IrisModel([8, 4, 1]) # the 4-feature input layer is not listed
# ----------------------------------------<the above is a typical model construction method. 

2.2. Define the objects needed for training

Objects required for general training

  • Optimizer: optimizers package

  • Loss: losses package

  • Gradient calculation: tf.GradientTape class

Optional object, useful if you need evaluation data during training:

  • Evaluation metrics: the metrics package

# 1. Optimizer object
optimizer = optimizers.Adam()
# 2. Loss calculation object
losser = losses.BinaryCrossentropy()
# 3. Accuracy calculation object
accuracier = metrics.Accuracy()
# 4. Gradient calculation object # by default, gradient() can be called only once per tape, so create a new tape in a with block for every step
# tape = tf.GradientTape()
 

2.3. Training Implementation

  1. prepare data

# prepare data
data, target = datasets.load_iris(return_X_y=True)
data = data[:100, :] # Take the first 100 samples (the first and second categories)
target = target[:100]

  1. train

import math

epochs = 100
batch_size = 10
batch_num = int(math.ceil(len(data) / batch_size))
for epoch in range(epochs):
    # iterate over the batches, not over the individual samples
    for i in range(batch_num):
 
        with tf.GradientTape() as tape:
            start_ = i * batch_size
            end_ = (i + 1) * batch_size
            # calculate the output
            predictions = model(data[start_: end_])
            # calculate loss
            loss_value = losser(target[start_: end_], predictions)
        # Calculate the gradient after leaving the with block
        gradients = tape.gradient(loss_value, model.trainable_variables)
        # update weights
        optimizer.apply_gradients(zip(gradients, model.trainable_variables))
 
pre_result = model(data)
category = [0 if item <= 0.5 else 1 for item in pre_result]
accuracy = (target == category).mean()
print(f'Classification accuracy: {accuracy * 100.0:5.2f}%')
print(pre_result)
Classification accuracy: 100.00%
tf.Tensor(
[[0.05429551]
 [0.06767229]
 [0.06427257]
 [0.06895139]
 
...omitted
 
 [0.99639302]
 [0.95713528]
 [0.99546698]], shape=(100, 1), dtype=float64)

Training runs for 100 epochs with a batch size of 10, so each epoch has 10 batches and training performs 100 * 10 = 1000 weight updates in total. Judging by the predicted output values, the training result is good (the Iris dataset is a simple one :D).
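The batch arithmetic can be checked with a small helper. The function batch_slices is hypothetical and only mirrors the start and end index calculation described above. It also clamps the last slice, which matters when the sample count is not a multiple of the batch size.

```python
import math


def batch_slices(sample_count, batch_size):
    """Return the (start, end) index pair of every batch in one epoch."""
    batch_num = math.ceil(sample_count / batch_size)
    return [(i * batch_size, min((i + 1) * batch_size, sample_count))
            for i in range(batch_num)]


slices = batch_slices(100, 10)
total_updates = 100 * len(slices)  # 100 epochs times the batches per epoch
print(len(slices), slices[0], slices[-1], total_updates)  # 10 (0, 10) (90, 100) 1000
```

With 105 samples the same helper would yield 11 batches, the last one covering only indices 100 to 105.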
