Checkpointing Deep Learning Models in Keras

Category: IT Technology · Published: 4 years ago

Learn how to save deep learning models using checkpoints and how to reload them

There are different methods to save and load a deep learning model; this article uses checkpoints.

In this article, you will learn how to checkpoint a deep learning model built using Keras and then restore the model architecture and trained weights to a new model, or resume training from where you left off.

Usage of Checkpoints

  • Allow us to use a pre-trained model for inference without having to retrain the model
  • Resume the training process from where we left off in case it was interrupted or for fine-tuning the model

It acts like an autosave for your model in case training is interrupted for any reason.

Steps for saving and loading model and weights using checkpoint

  • Create the model
  • Specify the path where we want to save the checkpoint files
  • Create the callback function to save the model
  • Apply the callback function during the training
  • Evaluate the model on test data
  • Load the pre-trained weights on a new model using load_weights(), or restore the weights from the latest checkpoint

Create the base model architecture with the loss function, metrics, and optimizer

We create a multi-class classification model for the Fashion MNIST dataset.

import tensorflow as tf

# Define the model architecture
def create_model():
    model = tf.keras.Sequential()
    # Must define the input shape in the first layer of the neural network
    model.add(tf.keras.layers.Conv2D(filters=64, kernel_size=2, padding='same', activation='relu', input_shape=(28, 28, 1)))
    model.add(tf.keras.layers.MaxPooling2D(pool_size=2))
    model.add(tf.keras.layers.Dropout(0.3))
    model.add(tf.keras.layers.Conv2D(filters=32, kernel_size=2, padding='same', activation='relu'))
    model.add(tf.keras.layers.MaxPooling2D(pool_size=2))
    model.add(tf.keras.layers.Dropout(0.3))
    model.add(tf.keras.layers.Flatten())
    model.add(tf.keras.layers.Dense(256, activation='relu'))
    model.add(tf.keras.layers.Dropout(0.5))
    model.add(tf.keras.layers.Dense(10, activation='softmax'))

    # Compile the model
    model.compile(loss='sparse_categorical_crossentropy',
                  optimizer='adam',
                  metrics=['accuracy'])

    return model

# Create the model
model_ckpt = create_model()
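
The training code later in the article uses train_images, train_labels, test_images, and test_labels, but it does not show how they are prepared. The following is a minimal sketch of one way to do it, assuming the data comes from tf.keras.datasets.fashion_mnist and that pixels are scaled to [0, 1]. The helper names prepare_images and load_fashion_mnist are hypothetical and the original preprocessing may differ.

```python
import numpy as np


def prepare_images(images):
    """Scale uint8 pixels to [0, 1] and add the channel axis expected by Conv2D."""
    images = images.astype("float32") / 255.0
    return images[..., np.newaxis]  # (N, 28, 28) -> (N, 28, 28, 1)


def load_fashion_mnist():
    """Download Fashion MNIST through Keras and return the preprocessed splits."""
    # Imported here so prepare_images can be used without TensorFlow installed.
    import tensorflow as tf

    (train_x, train_y), (test_x, test_y) = tf.keras.datasets.fashion_mnist.load_data()
    return prepare_images(train_x), train_y, prepare_images(test_x), test_y


# Shape check on a synthetic batch; this part needs no download.
fake_batch = np.zeros((2, 28, 28), dtype="uint8")
print(prepare_images(fake_batch).shape)
```

With this in place, the four arrays could be obtained with train_images, train_labels, test_images, test_labels = load_fashion_mnist().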

Specify the path where the checkpoint files will be stored

checkpoint_path = "train_ckpt/cp.ckpt"

Create the callback function to save the model.

Callback functions are applied at different stages of training and give a view of the internal state of the model while it trains.

We create a callback function to save the model weights using ModelCheckpoint.

If we set save_weights_only to True, then only the weights will be saved. The model architecture, loss, and optimizer will not be saved.

We can also specify whether we want to save the model at every epoch or every n epochs.

# Create a callback that saves the model's weights
cp_callback = tf.keras.callbacks.ModelCheckpoint(
    filepath=checkpoint_path,
    save_best_only=True,
    save_weights_only=True,
    verbose=1)

The ModelCheckpoint callback class has the following arguments:

  • filepath : specify the path or filename where we want to save the model
  • monitor : the metric that we want to monitor, such as loss or accuracy. The default is val_loss
  • verbose : 0 for silent mode, 1 to print a message each time the callback saves a checkpoint
  • save_weights_only : If set to True, then only the model weights will be saved; otherwise the full model is saved, including the model architecture, weights, loss function, and optimizer.
  • save_best_only : If set to True, then only the best model will be saved based on the quantity we are monitoring. If we are monitoring accuracy and save_best_only is set to True, then the model will be saved every time we get a higher accuracy than the previous best.
  • mode : It has three options: auto, min, or max. If we are monitoring accuracy, then set it to max, and if we are monitoring loss, then set it to min. If we set the mode to auto, then the direction is inferred automatically based on the quantity being monitored.
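  • save_freq : set it to 'epoch' or an integer. When it is set to 'epoch', the model is saved after each epoch. When it is an integer, the model is saved after that many batches, not epochs, so to save every five epochs we pass five times the number of batches per epoch, as shown in the code below. The older period argument counted epochs but is deprecated.

import math

# Create a callback that saves the model's weights every 5 epochs.
# An integer save_freq counts batches, so convert epochs to batches.
# train_images is the training set that is later passed to fit().
batch_size = 64
steps_per_epoch = math.ceil(len(train_images) / batch_size)
cp_callback = tf.keras.callbacks.ModelCheckpoint(
    filepath=checkpoint_path,
    verbose=1,
    save_weights_only=True,
    save_freq=5 * steps_per_epoch)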

Apply the callback during the training process

# Train the model and pass the callback to training
model_ckpt.fit(train_images,
               train_labels,
               batch_size=64,
               epochs=10,
               validation_data=(test_images, test_labels),
               callbacks=[cp_callback])

With save_best_only=True, the weights are not saved when val_loss does not improve. Whenever val_loss drops below its previous best value, the weights are saved to the checkpoint file.
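
The save decision can be illustrated without training anything. The sketch below is not the Keras implementation; it only mimics the comparison described for save_best_only and mode, using a hypothetical helper epochs_that_save and made-up val_loss values.

```python
def epochs_that_save(values, mode="min"):
    """Return the 1-based epochs at which a save_best_only checkpoint would be written."""
    best = float("inf") if mode == "min" else float("-inf")
    saved = []
    for epoch, value in enumerate(values, start=1):
        # A save happens only on a strict improvement over the best value so far.
        improved = value < best if mode == "min" else value > best
        if improved:
            best = value
            saved.append(epoch)
    return saved


val_losses = [0.52, 0.41, 0.43, 0.38, 0.39]  # hypothetical per-epoch val_loss
print(epochs_that_save(val_losses, mode="min"))  # [1, 2, 4]
```

Epochs 3 and 5 are skipped because their val_loss is higher than the best value seen so far, so the checkpoint file keeps the weights from epoch 4.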

Evaluating the model on test images

loss,acc = model_ckpt.evaluate(test_images, test_labels, verbose=2)

Checkpoint files

A checkpoint stores the trained weights in a collection of checkpoint-formatted files in a binary format.

The TensorFlow save() saves three kinds of files: a checkpoint file, an index file, and data files. It stores the graph structure separately from the variable values.

checkpoint file: contains prefixes for both an index file and one or more data files

index file: indicates which weights are stored in which shard. As I trained the model on one machine, we see cp.ckpt.data-00000-of-00002 and cp.ckpt.data-00001-of-00002

data file: saves values for all the variables, without the structure. There can be one or more data files
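
To make the three kinds of files easier to recognize in a checkpoint directory, here is a small sketch that labels file names by their pattern. The helper classify_checkpoint_file and the example listing are hypothetical; the listing only reuses the file names mentioned above.

```python
import re


def classify_checkpoint_file(name):
    """Label a file name as 'checkpoint', 'index', 'data', or 'other'."""
    if name == "checkpoint":
        return "checkpoint"
    if name.endswith(".index"):
        return "index"
    # Data shards are named <prefix>.data-<shard>-of-<total>.
    if re.search(r"\.data-\d{5}-of-\d{5}$", name):
        return "data"
    return "other"


# Hypothetical directory listing for the prefix cp.ckpt.
names = [
    "checkpoint",
    "cp.ckpt.index",
    "cp.ckpt.data-00000-of-00002",
    "cp.ckpt.data-00001-of-00002",
]
for name in names:
    print(f"{name}: {classify_checkpoint_file(name)}")
```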

Loading the pre-trained weights

Reasons for loading the pre-trained weights

  • Continue from where we left off or
  • Resume after an interruption or
  • Load the pre-trained weight for inference

We create a new model to load the pre-trained weights.

When loading pre-trained weights into a new model, the new model must have the same architecture as the original model.

# Create a basic model instance
model_ckpt2 = create_model()

We load the pre-trained weights into our new model using load_weights().

model_ckpt2.load_weights(checkpoint_path)

We can make inferences using the new model on the test images

loss,acc = model_ckpt2.evaluate(test_images, test_labels, verbose=2)
print("Restored model, accuracy: {:5.2f}%".format(100*acc))
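
One way to confirm that a restore worked is to check that the original and the restored model give the same predictions. The sketch below does this on a tiny stand-in model with random inputs so that it runs in seconds; build_tiny_model is a hypothetical substitute for create_model(). It also assumes a file name ending in .weights.h5, which recent Keras releases require for weights-only files, instead of the .ckpt path used in this article.

```python
import os
import tempfile

import numpy as np
import tensorflow as tf


def build_tiny_model():
    """Hypothetical stand-in for create_model(), kept small so the example runs quickly."""
    model = tf.keras.Sequential([
        tf.keras.Input(shape=(4,)),
        tf.keras.layers.Dense(8, activation="relu"),
        tf.keras.layers.Dense(3, activation="softmax"),
    ])
    model.compile(loss="sparse_categorical_crossentropy",
                  optimizer="adam",
                  metrics=["accuracy"])
    return model


# Random inputs, used only to compare the two models' outputs.
inputs = np.random.default_rng(0).random((5, 4)).astype("float32")

# Save the weights of one model instance to a temporary directory.
source_model = build_tiny_model()
weights_path = os.path.join(tempfile.mkdtemp(), "demo.weights.h5")
source_model.save_weights(weights_path)

# A new instance starts with different random weights until they are loaded.
restored_model = build_tiny_model()
restored_model.load_weights(weights_path)

same_predictions = np.allclose(
    source_model.predict(inputs, verbose=0),
    restored_model.predict(inputs, verbose=0),
)
print("Predictions match after load_weights:", same_predictions)
```

The same comparison could be applied to model_ckpt and model_ckpt2 using a few test images.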

For comparison, an untrained model would perform at chance level (~10% accuracy for ten classes).

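To resume training where we left off, we call fit() on the restored model. Because only the weights were saved, the optimizer starts from a fresh state.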

model_ckpt2.fit(train_images,
                train_labels,
                batch_size=64,
                epochs=10,
                validation_data=(test_images, test_labels),
                callbacks=[cp_callback])

After the additional training, the accuracy has changed:

loss,acc = model_ckpt2.evaluate(test_images, test_labels, verbose=2)
print("Restored model, accuracy: {:5.2f}%".format(100*acc))

Loading weights from the latest checkpoints

tf.train.latest_checkpoint() finds the filename of the most recently saved checkpoint file.

import os

# Get the latest checkpoint file
checkpoint_dir = os.path.dirname(checkpoint_path)
latest = tf.train.latest_checkpoint(checkpoint_dir)

We create a new model, load the weights from the latest checkpoint and make inferences

# Create a new model instance
model_latest_checkpoint = create_model()
# Load the previously saved weights
model_latest_checkpoint.load_weights(latest)
# Re-evaluate the model
loss, acc = model_latest_checkpoint.evaluate(test_images, test_labels, verbose=2)
print("Restored model, accuracy: {:5.2f}%".format(100*acc))

Including epoch number in the filename

# Include the epoch in the file name (uses `str.format`)
checkpoint_path = "training2/cp-{epoch:04d}.ckpt"
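
The placeholder in the path is an ordinary str.format field that Keras fills in with the epoch number when it writes each checkpoint. The sketch below shows the resulting file names in plain Python; the second path, which also embeds val_loss, is a hypothetical variant and works only when that metric is being logged.

```python
checkpoint_path = "training2/cp-{epoch:04d}.ckpt"

# The epoch number is zero-padded to four digits.
for epoch in (1, 5, 10):
    print(checkpoint_path.format(epoch=epoch))
# training2/cp-0001.ckpt
# training2/cp-0005.ckpt
# training2/cp-0010.ckpt

# Hypothetical variant that also embeds a logged metric in the name.
detailed_path = "training2/cp-{epoch:04d}-{val_loss:.2f}.ckpt"
print(detailed_path.format(epoch=3, val_loss=0.4127))
# training2/cp-0003-0.41.ckpt
```

Because each epoch gets its own file name, earlier checkpoints are kept instead of being overwritten.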

Conclusion:

We now understand how to create a callback function using the ModelCheckpoint class, which checkpoint files get created, and how to restore the pre-trained weights.

References:

https://www.tensorflow.org/tutorials/keras/save_and_load

