Why Use Pytorch Lightning
Reduce Boilerplate
You no longer write the training routine yourself unless you really have to. You can define your training as

from pytorch_lightning import Trainer

trainer = Trainer(gpus=1, logger=[logger], max_epochs=5)
trainer.fit(model)

The job of a Trainer is to run your training routine.
- No more writing loops. As you can see, there is no for loop of the kind usually seen in Pytorch tutorials.
- No converting your model to GPU. You do not have to worry about forgetting to move your model to cuda.
- No custom printing of your loss. See the logger variable there? You can use Tensorboard to manage your logs, and I recommend that you do. Run pip install tensorboard before you use it locally.
In this example, I defined the logger variable as

from pytorch_lightning.loggers import TensorBoardLogger

logger = TensorBoardLogger('tb_logs', name='my_model')

Pytorch Lightning will create a log directory named tb_logs, and you can point Tensorboard at that directory (if you are running Tensorboard separately from a Jupyter notebook).

tensorboard --logdir tb_logs/
Organize Code
Besides the constructor and forward, you will be able to define more functions:
- configure_optimizers. Expected to return a Pytorch optimizer from the torch.optim package.

def configure_optimizers(self):
    return Adam(self.parameters(), lr=0.01)
- training_step. Given a batch and its batch index, define how the input is fed to the model.

def training_step(self, batch, batch_idx):
    x, y = batch.text[0].T, batch.label
    y_hat = self(x)
    loss = self.loss_function(y_hat, y)
    return dict(
        loss=loss,
        log=dict(
            train_loss=loss
        )
    )
In this example, notice that I do a small transformation using transpose. You can do all kinds of transformations before feeding the model, but I suggest doing the heavy ones outside this function to keep it clean.
I have also defined loss_function as part of the model and “hardcoded” it as cross entropy. If you do not want that, you can import torch.nn.functional as F and call a functional loss such as F.cross_entropy() (or F.nll_loss() on top of F.log_softmax()). Another option is to let the model constructor accept the loss function as a parameter.
- train_dataloader. Define how you want to load your training data. The Pytorch DataLoader is an API that helps you batch the input. To my knowledge, though, Pytorch Lightning runs something like for batch_idx, batch in enumerate(train_dataloader) (not exactly like this, but similar). This means you are free to return anything iterable here.
- test_step. Given a batch and its batch index, define how the input is fed to the model for testing. Note that we do not have to compute the loss in this step, because we are running without gradients.
- test_dataloader. Define how you want to load your test data.
- test_epoch_end. Given all test outputs, define what you want to do with them. You can leave this undefined, but a warning is shown when test_step and test_dataloader are defined, because then you are basically doing nothing with your test data.
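To make this division of labour concrete, here is a rough pure-Python sketch of the kind of loop a trainer runs over such hooks. It is not Pytorch Lightning's actual implementation: ToyModule, ToyTrainer, the scalar "model", and the learning rate are all hypothetical names and values invented for illustration, and the real Trainer also handles devices, optimizers, logging, and much more.

```python
class ToyModule:
    """Hypothetical stand-in for a LightningModule; it fits y = w * x."""

    def __init__(self):
        self.w = 0.0  # single trainable weight

    def train_dataloader(self):
        # Anything iterable works here; each item is one (x, y) batch.
        return [(1.0, 2.0), (2.0, 4.0), (3.0, 6.0)]

    def training_step(self, batch, batch_idx):
        x, y = batch
        error = self.w * x - y
        # Return the squared-error loss and its gradient with respect to w.
        return {"loss": error ** 2, "grad": 2 * error * x}


class ToyTrainer:
    """Hypothetical trainer: it owns the loops so the module does not."""

    def __init__(self, max_epochs, lr=0.05):
        self.max_epochs = max_epochs
        self.lr = lr
        self.logged_losses = []  # plays the role of a logger

    def fit(self, module):
        for _ in range(self.max_epochs):
            for batch_idx, batch in enumerate(module.train_dataloader()):
                out = module.training_step(batch, batch_idx)
                module.w -= self.lr * out["grad"]  # plain SGD update
                self.logged_losses.append(out["loss"])


module = ToyModule()
trainer = ToyTrainer(max_epochs=20)
trainer.fit(module)
print(round(module.w, 3))
```

The module only describes one step and where the data comes from; the epoch loop, the batch loop, the update, and the bookkeeping all live in the trainer.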
Using Pytorch Lightning with Torchtext
Previously, I described my exploration of torchtext [4]. Now I want to improve my productivity even further on the experiment part, which includes training, testing, validating, and metric logging. All of these can be achieved with Pytorch Lightning.
I will use the IMDB sentiment classification dataset, which is available in the Torchtext package.
Loading Dataset
IMDB sentiment classification is a text classification task: given a review text, predict whether it is a positive or a negative review. There is a short official tutorial from torchtext [5]; however, that tutorial does not cover the training part. I will use some of the tutorial code and connect it with training using Pytorch Lightning.
Once the label vocabulary is built, this dataset has 3 classes: unknown, positive (labeled as “pos”), and negative (labeled as “neg”). So we need to define an output that can predict 3 classes. Since it is a classification task, I will use cross entropy loss.
Now, to load the data, you can do
from torchtext.data import Field
from torchtext.datasets import IMDB

text_field = Field(sequential=True, include_lengths=True, fix_length=200)
label_field = Field(sequential=False)

train, test = IMDB.splits(text_field, label_field)
Since IMDB reviews are not of uniform length, the fix_length parameter pads or trims every sequence to the same length.
You can access sample data with train.examples[i] to peek at what is inside the train and test variables.
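The effect of a fixed length can be pictured with a small pure-Python sketch. This only illustrates padding and trimming and is not torchtext's code: the helper name pad_or_trim is made up here, and the '<pad>' token and the choice to pad and trim at the end are assumptions for the example.

```python
def pad_or_trim(tokens, fix_length, pad_token="<pad>"):
    """Return a list of exactly fix_length tokens."""
    trimmed = tokens[:fix_length]                       # drop anything past the limit
    padding = [pad_token] * (fix_length - len(trimmed))  # empty if already long enough
    return trimmed + padding


short_review = ["great", "movie"]
long_review = ["this", "film", "was", "far", "too", "long"]

print(pad_or_trim(short_review, 4))
print(pad_or_trim(long_review, 4))
```

Both reviews come out with four tokens, which is what lets them be stacked into one rectangular batch later.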
Building Vocabulary
Pre-trained word embeddings are usually trained on different data from ours, so they use a different token-to-integer “encoding” from the one we currently have. build_vocab reconciles the integer encoding that comes from the current dataset, in this case the IMDB dataset, with the pre-trained encoding. For example, if token 2 in our vocabulary is eat, but eat is token number 15 in the pre-trained word embedding, then it is automatically mapped to the correct pre-trained entry.
from torchtext.vocab import FastText

text_field.build_vocab(train, vectors=FastText('simple'))
label_field.build_vocab(train)
The label field in the IMDB dataset takes the values pos, neg, and <unk>, so it still needs its own vocab, but without word embeddings.
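A toy version of this alignment in plain Python may help. It assumes a made-up pre-trained table with two-dimensional vectors, and the helpers toy_build_vocab and align_vectors are hypothetical; torchtext's real build_vocab does more (frequency counts, special tokens, and so on), and giving missing tokens a zero vector is simply the choice made in this sketch.

```python
def toy_build_vocab(tokens):
    """Assign an integer id to every distinct token, in order of first appearance."""
    stoi = {}
    for token in tokens:
        if token not in stoi:
            stoi[token] = len(stoi)
    return stoi


def align_vectors(stoi, pretrained_stoi, pretrained_vectors, dim):
    """Build one vector per dataset token id, copied from the pre-trained table.

    Tokens missing from the pre-trained table keep a zero vector.
    """
    vectors = [[0.0] * dim for _ in stoi]
    for token, index in stoi.items():
        if token in pretrained_stoi:
            vectors[index] = pretrained_vectors[pretrained_stoi[token]]
    return vectors


# Made-up pre-trained embedding: "eat" has id 1 there.
pretrained_stoi = {"drink": 0, "eat": 1}
pretrained_vectors = [[0.1, 0.2], [0.3, 0.4]]

# In our dataset, "eat" ends up with id 2 instead.
stoi = toy_build_vocab(["i", "like", "eat", "like"])
vectors = align_vectors(stoi, pretrained_stoi, pretrained_vectors, dim=2)

print(stoi["eat"], vectors[stoi["eat"]])
```

After alignment, row 2 of vectors holds the pre-trained vector for eat, so the model can index embeddings with the dataset's own ids.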
Splitting and Making Iterator
An Iterator works a bit like a DataLoader: it helps with batching and iterating over the data in one epoch. We can use BucketIterator to iterate with a specific batch size and move all of those tensors to a device, where the device can be cpu or cuda.
import torch
from torchtext.data import BucketIterator

device = 'cuda' if torch.cuda.is_available() else 'cpu'
batch_size = 32

train_iter, test_iter = BucketIterator.splits(
    (train, test),
    batch_size=batch_size,
    device=device
)
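BucketIterator groups examples of similar length into the same batch. The sketch below shows that idea in plain Python under simplifying assumptions: bucket_batches is a hypothetical helper, the reviews are invented, and details such as shuffling and moving data to a device are left out.

```python
def bucket_batches(examples, batch_size):
    """Yield batches whose examples have similar lengths."""
    ordered = sorted(examples, key=len)  # shortest examples first
    for start in range(0, len(ordered), batch_size):
        yield ordered[start:start + batch_size]


# Five fake tokenized reviews with lengths 5, 1, 4, 2, 3.
reviews = [["a"] * 5, ["b"] * 1, ["c"] * 4, ["d"] * 2, ["e"] * 3]

batches = list(bucket_batches(reviews, batch_size=2))
lengths = [[len(review) for review in batch] for batch in batches]
print(lengths)
```

Grouping by length mainly matters when sequences are padded per batch; with a fixed length such as the one used above, every example already has the same size.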
Now we are ready to define our model.
Model Definition
Defining the model with Pytorch Lightning is as easy as William has explained [2].
LightningModule
It is better to make sure that your model accepts the input correctly before doing the full training, like this.

sample_batch = next(iter(train_iter))
model(sample_batch.text[0].T)
Let me explain why I did the transformations.
Each batch object from an iterator has text and label fields. The text field is actually a tuple of the word-index tensor and a tensor holding the actual length of each review. The word-index tensor has size fix_length x batch_size, while the length tensor has size batch_size. To feed the model with the word-index tensor, I take the first element of the tuple and transpose it so that it has size batch_size x fix_length.
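The shape change is easier to see on a tiny example. The sketch below uses nested Python lists instead of tensors, with an invented fix_length of 3 and batch_size of 2; on a real 2-D tensor, .T performs the same swap of rows and columns.

```python
# text[0] laid out length-first: fix_length rows, batch_size columns.
# Column 0 is review A (ids 11, 12, 13); column 1 is review B (ids 21, 22, 23).
length_first = [
    [11, 21],
    [12, 22],
    [13, 23],
]

# Swap rows and columns: batch_size rows, fix_length columns.
batch_first = [list(row) for row in zip(*length_first)]

print(batch_first)  # one row per review
```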
We are now ready to train our model!
from pytorch_lightning import Trainer
from pytorch_lightning.loggers import TensorBoardLogger

model = MyModel(text_field.vocab.vectors)
logger = TensorBoardLogger('tb_logs', name='my_model')
trainer = Trainer(gpus=1, logger=logger, max_epochs=3)
trainer.fit(model)
And it’s done! A progress bar is shown automatically, so you no longer have to wrap your loop in tqdm like this:
for batch_idx, batch in tqdm(enumerate(train_loader)):
After training, you can run testing in one line

trainer.test()

If you are wondering why this test method only returns one object, you are probably thinking of scikit-learn’s train and test split. In Pytorch, the “test” part is usually defined as “validation”. So you might want to define validation_step and val_dataloader instead of test_*.
Conclusion
In my opinion, using Pytorch Lightning and Torchtext does improve my productivity when experimenting with NLP deep learning models. The aspects that I think make this library very compelling are its backward compatibility with Pytorch, its friendliness to Torchtext, and its use of Tensorboard.
Backward Compatibility with Pytorch
If you are hesitant because you think a new library will add overhead, do not worry! You can install it first, use LightningModule instead of nn.Module, and write the usual Pytorch code. It will still work, because this library does not cause any additional headaches.
Torchtext Friendly
It was fairly easy to use Torchtext along with Pytorch Lightning. Both libraries run on Pytorch and are highly compatible with native Pytorch. Their additional features do not overlap but complement each other. For example, Torchtext has easy interfaces for loading datasets like IMDB or YelpReview. Then you can use Pytorch Lightning to train whatever model you want to define and log to Tensorboard or MLFlow.
Leverage Tensorboard Usage
Using Tensorboard instead of manually printing losses and other metrics helps me avoid unnecessary errors when printing losses in the training loop. It also removes the need to plot loss vs. epoch by hand at the end of training.
References
[1] Pytorch Lightning Documentation. https://pytorch-lightning.readthedocs.io/en/stable/introduction_guide.html
[2] Falcon, W. From PyTorch to PyTorch Lightning — A gentle introduction. https://towardsdatascience.com/from-pytorch-to-pytorch-lightning-a-gentle-introduction-b371b7caaf09
[3] Falcon, W. Pytorch Lightning vs PyTorch Ignite vs Fast.ai. https://towardsdatascience.com/pytorch-lightning-vs-pytorch-ignite-vs-fast-ai-61dc7480ad8a
[4] Sutiono, Arie P. Deep Learning For NLP with PyTorch and Torchtext. https://towardsdatascience.com/deep-learning-for-nlp-with-pytorch-and-torchtext-4f92d69052f
[5] Torchtext Datasets Documentation. https://pytorch.org/text/datasets.html