Checkpointing Deep Learning Models in Keras
Learn how to save deep learning models using checkpoints and how to reload them
Different methods to save and load the deep learning model are using
In this article, you will learn how to checkpoint a deep learning model built using Keras, and then either restore the model architecture and trained weights into a new model or resume training from where you left off.
Usage of Checkpoints
- Allow us to use a pre-trained model for inference without having to retrain the model
- Resume the training process from where we left off in case it was interrupted or for fine-tuning the model
It acts like an autosave for your model in case training is interrupted for any reason.
Steps for saving and loading model and weights using checkpoint
- Create the model
- Specify the path where we want to save the checkpoint files
- Create the callback function to save the model
- Apply the callback function during the training
- Evaluate the model on test data
- Load the pre-trained weights into a new model using load_weights(), or restore the weights from the latest checkpoint
Create the base model architecture with the loss function, metrics, and optimizer
We create a multi-class classification model for the Fashion MNIST dataset.
import tensorflow as tf

# Define the model architecture
def create_model():
    model = tf.keras.Sequential()
    # Must define the input shape in the first layer of the neural network
    model.add(tf.keras.layers.Conv2D(filters=64, kernel_size=2, padding='same', activation='relu', input_shape=(28,28,1)))
    model.add(tf.keras.layers.MaxPooling2D(pool_size=2))
    model.add(tf.keras.layers.Dropout(0.3))
    model.add(tf.keras.layers.Conv2D(filters=32, kernel_size=2, padding='same', activation='relu'))
    model.add(tf.keras.layers.MaxPooling2D(pool_size=2))
    model.add(tf.keras.layers.Dropout(0.3))
    model.add(tf.keras.layers.Flatten())
    model.add(tf.keras.layers.Dense(256, activation='relu'))
    model.add(tf.keras.layers.Dropout(0.5))
    model.add(tf.keras.layers.Dense(10, activation='softmax'))
    # Compile the model
    model.compile(loss='sparse_categorical_crossentropy',
                  optimizer='adam',
                  metrics=['accuracy'])
    return model

# Create the model
model_ckpt = create_model()
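The arrays used later in this article (train_images, train_labels, test_images, test_labels) are not defined in the snippets shown. A minimal sketch of how they might be prepared follows, assuming the copy of Fashion MNIST that ships with tf.keras.datasets. The helper names preprocess and load_fashion_mnist are hypothetical, and scaling the pixels to the range [0, 1] is an assumption rather than something the article states.

```python
import numpy as np

def preprocess(images):
    """Scale uint8 pixels to [0, 1] and add the channel axis Conv2D expects."""
    images = images.astype("float32") / 255.0
    # (N, 28, 28) -> (N, 28, 28, 1), matching input_shape=(28, 28, 1)
    return images.reshape((-1, 28, 28, 1))

def load_fashion_mnist():
    """Load and preprocess Fashion MNIST (downloads the data on first use)."""
    import tensorflow as tf  # imported here so preprocess() works without TensorFlow
    (train_x, train_y), (test_x, test_y) = tf.keras.datasets.fashion_mnist.load_data()
    return (preprocess(train_x), train_y), (preprocess(test_x), test_y)

# Usage (hypothetical helper names, not from the original article):
# (train_images, train_labels), (test_images, test_labels) = load_fashion_mnist()
```

The labels are left as integers, which is what the sparse_categorical_crossentropy loss expects.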
Specify the path where the checkpoint files will be stored
checkpoint_path = "train_ckpt/cp.ckpt"
Create the callback function to save the model.
Callback functions are applied at different stages of training to give a view on the internal training states.
We create a callback function to save the model weights using ModelCheckpoint.
If we set save_weights_only to True, then only the weights are saved; the model architecture, loss, and optimizer are not saved.
We can also specify if we want to save the model at every epoch or every n number of epochs.
# Create a callback that saves the model's weights
cp_callback = tf.keras.callbacks.ModelCheckpoint(filepath=checkpoint_path,save_best_only=True, save_weights_only=True, verbose=1)
The ModelCheckpoint callback class has the following arguments:
- filepath : specify the path or filename where we want to save the model
- monitor : the metric that we want to monitor, such as val_loss or val_accuracy; the default is val_loss
- verbose : 0 for silent mode, 1 to print a message each time the callback saves a checkpoint
- save_weights_only : If set to True, then only model weights will be saved else the full model is saved, including the model architecture, weights, loss function, and optimizer.
- save_best_only : If set to True, then only the best model is saved, based on the quantity we are monitoring. If we are monitoring accuracy and save_best_only is set to True, then the model is saved every time the accuracy exceeds the best accuracy seen so far.
- mode : It has three options: auto, min, or max. If we are monitoring accuracy, set it to max; if we are monitoring loss, set it to min. If we set the mode to auto, the direction is inferred automatically from the name of the quantity being monitored.
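- save_freq : set it to 'epoch' or an integer. With 'epoch', the model is saved after each epoch. With an integer N, current TensorFlow releases save the model at the end of every N batches, not every N epochs. To save every five epochs, multiply 5 by the number of batches per epoch, as shown in the code below. The older period argument counted epochs but is deprecated.
import math

# Create a callback that saves the model's weights every 5 epochs.
# An integer save_freq counts batches, so convert epochs to batches first.
batch_size = 64
steps_per_epoch = math.ceil(len(train_images) / batch_size)
cp_callback = tf.keras.callbacks.ModelCheckpoint(
    filepath=checkpoint_path,
    verbose=1,
    save_weights_only=True,
    save_freq=5 * steps_per_epoch)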
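The way save_best_only and mode interact, as described in the argument list above, can be sketched in plain Python. This is a simplified illustration of the comparison and not the Keras implementation; the function names and the val_loss values are made up for the example.

```python
def is_improvement(current, best, mode):
    """Return True if `current` beats `best` under the given mode."""
    if best is None:          # nothing saved yet, so the first value always wins
        return True
    if mode == "min":
        return current < best
    if mode == "max":
        return current > best
    raise ValueError("mode must be 'min' or 'max'")

def epochs_that_save(values, mode):
    """Simulate save_best_only=True over one monitored value per epoch."""
    best = None
    saved = []
    for epoch, value in enumerate(values, start=1):
        if is_improvement(value, best, mode):
            best = value
            saved.append(epoch)
    return saved

# Made-up val_loss values, one per epoch (illustrative only)
val_loss = [0.52, 0.41, 0.43, 0.39, 0.40]
print(epochs_that_save(val_loss, mode="min"))   # [1, 2, 4]
```

Epochs 3 and 5 do not trigger a save because their loss is not lower than the best value recorded so far.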
Apply the callback during the training process
# Train the model, passing the checkpoint callback to fit()
model_ckpt.fit(train_images,
               train_labels,
               batch_size=64,
               epochs=10,
               validation_data=(test_images, test_labels),
               callbacks=[cp_callback])
With save_best_only set to True, the weights are not saved when val_loss does not improve. Whenever val_loss drops below its best value so far, those weights are saved to the checkpoint file.
Evaluating the model on test images
loss,acc = model_ckpt.evaluate(test_images, test_labels, verbose=2)
Checkpoint files
A checkpoint stores the trained weights in a collection of checkpoint-formatted files in a binary format.
The TensorFlow save() saves three kinds of files: a checkpoint file, an index file, and one or more data files. It stores the graph structure separately from the variable values.
Checkpoint file: contains prefixes for the index file as well as for one or more data files.
Index file: indicates which weights are stored in which shard. I trained the model on one machine, and we see the shards cp.ckpt.data-00000-of-00002 and cp.ckpt.data-00001-of-00002.
Data file: stores the values of all the variables, without the structure. There can be one or more data files.
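To make the three roles concrete, here is a small sketch that labels the file names mentioned above. The classify helper is hypothetical, and the shard pattern is inferred only from the names shown in this article, so treat it as an illustration rather than a specification of the format.

```python
import re

# Data shards look like "<prefix>.data-00000-of-00002"
_SHARD = re.compile(r"^(?P<prefix>.+)\.data-(?P<index>\d{5})-of-(?P<total>\d{5})$")

def classify(filename):
    """Label a file from a checkpoint directory by the role it plays."""
    if filename == "checkpoint":
        return "checkpoint file"
    if filename.endswith(".index"):
        return "index file"
    if _SHARD.match(filename):
        return "data file"
    return "other"

files = ["checkpoint", "cp.ckpt.index",
         "cp.ckpt.data-00000-of-00002", "cp.ckpt.data-00001-of-00002"]
for name in files:
    print(f"{name}: {classify(name)}")
```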
Loading the pre-trained weights
Reasons for loading the pre-trained weights
- Continue from where we left off or
- Resume after an interruption or
- Load the pre-trained weight for inference
We create a new model to load the pre-trained weights.
When loading a new model with the pre-trained weights, the new model should have the same architecture as the original model.
# Create a basic model instance
model_ckpt2 = create_model()
We load the pre-trained weights into our new model using load_weights().
model_ckpt2.load_weights(checkpoint_path)
We can make inferences using the new model on the test images
loss,acc = model_ckpt2.evaluate(test_images, test_labels, verbose=2)
print("Restored model, accuracy: {:5.2f}%".format(100*acc))
For comparison, an untrained model performs at chance level (~10% accuracy, since there are 10 classes).
To resume the training where we left off
model_ckpt2.fit(train_images,
                train_labels,
                batch_size=64,
                epochs=10,
                validation_data=(test_images, test_labels),
                callbacks=[cp_callback])
We see that the accuracy has changed now
loss,acc = model_ckpt2.evaluate(test_images, test_labels, verbose=2)
print("Restored model, accuracy: {:5.2f}%".format(100*acc))
Loading weights from the latest checkpoints
tf.train.latest_checkpoint() finds the filename of the most recently saved checkpoint in a directory
import os

# Get the latest checkpoint file
checkpoint_dir = os.path.dirname(checkpoint_path)
latest = tf.train.latest_checkpoint(checkpoint_dir)
We create a new model, load the weights from the latest checkpoint and make inferences
# Create a new model instance
model_latest_checkpoint = create_model()
# Load the previously saved weights
model_latest_checkpoint.load_weights(latest)
# Re-evaluate the model
loss, acc = model_latest_checkpoint.evaluate(test_images, test_labels, verbose=2)
print("Restored model, accuracy: {:5.2f}%".format(100*acc))
Including epoch number in the filename
# Include the epoch in the file name (uses `str.format`)
checkpoint_path = "training2/cp-{epoch:04d}.ckpt"
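The {epoch:04d} placeholder is ordinary str.format syntax, so its effect can be checked without training anything. The sketch below fills the template and also goes the other way, recovering the epoch number from a file name. Both helper names are hypothetical and exist only for this illustration.

```python
import re

checkpoint_template = "training2/cp-{epoch:04d}.ckpt"

def checkpoint_name(epoch):
    """Fill the template with str.format; the epoch is zero-padded to 4 digits."""
    return checkpoint_template.format(epoch=epoch)

def epoch_from_name(path):
    """Recover the epoch number from a name produced by checkpoint_name()."""
    match = re.search(r"cp-(\d+)\.ckpt", path)
    if match is None:
        raise ValueError(f"no epoch number found in {path!r}")
    return int(match.group(1))

print(checkpoint_name(5))                          # training2/cp-0005.ckpt
print(epoch_from_name("training2/cp-0010.ckpt"))   # 10
```

Zero-padding keeps the file names in epoch order when sorted alphabetically. When resuming training, the recovered number could be passed to fit() as initial_epoch so that epoch counting continues from the checkpoint instead of restarting at 1.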
code for saving the model and reloading model using Fashion MNIST
Conclusion:
We now understand how to create a callback function using the ModelCheckpoint class, which checkpoint files get created, and how to restore the pre-trained weights.