Learn AI Today: 01 — Getting started with Pytorch



Learn AI Today


Defining and training a Pytorch model and visualizing the results dynamically

Jul 19 · 11 min read


Photo by Jukan Tateisi on Unsplash.

This is the first story in the Learn AI Today series I'm creating! These stories, or at least the first few, are based on a series of Jupyter notebooks I've created while studying/learning PyTorch and Deep Learning. I hope you find them as useful as I did!

What you will learn in this story:

  • How to Create a PyTorch Model
  • How to Train Your Model
  • Visualize the Training Progress Dynamically
  • How the Learning Rate Affects the Training

1. Linear Regression in PyTorch

Linear regression is a problem that you are probably familiar with. In its most basic form, it is no more than fitting a line to a set of points.

1.1 Introducing the Concepts

Consider the mathematical expression of a line:
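y = wx + b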

w and b are the two parameters, or weights, of this linear model. In machine learning, it is common to use w to refer to the weights and b to refer to the bias parameter.

In machine learning, when we are training a model, we are basically finding the optimal parameters w and b for a given set of input/target (x, y) pairs. After the model is trained we can compute the model estimates. The expression will now look like this:
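ye = wx + b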

where I changed the name of y to ye (y estimate) because the solution will not be exact.

The Mean Square Error (MSE) is simply mean((ye-y)²), the mean of the squared deviations between targets and estimates. For a regression problem, you can minimize the MSE in order to find the best w and b.
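In PyTorch tensor code, for a target tensor y and an estimate tensor ye, that expression is simply:

```python
mse = ((ye - y) ** 2).mean()  # mean of the squared deviations
```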

The idea of linear regression can be generalized using matrix algebra notation to allow for multiple inputs and targets. If you want to learn more about the exact mathematical solution to the regression problem, you can search for the Normal Equation.
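For reference, the Normal Equation gives the least-squares weights in closed form as w = (XᵀX)⁻¹Xᵀy, where X is the matrix of inputs and y the vector of targets.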

1.2 Defining the Model

PyTorch's nn.Linear class is all that you need to define a linear model with any number of inputs and outputs. For our basic example of fitting a line to a set of points, consider the following model:
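A minimal sketch of such a model, written here with plain nn.Module (the attribute name linear and the argument names are my assumptions):

```python
import torch.nn as nn

class LinearRegression(nn.Module):
    def __init__(self, n_inputs, n_outputs):
        super().__init__()  # required with nn.Module; fastai's Module does this for you
        self.linear = nn.Linear(n_inputs, n_outputs)  # a single linear layer

    def forward(self, x):
        return self.linear(x)  # run the input through the linear layer
```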

Note: I'm using Module from the fastai library as it makes the code cleaner. If you want to use pure PyTorch you should use nn.Module instead, and you need to add super().__init__() in the __init__ method. fastai's Module does that for you.

If you are familiar with Python classes, the code is self-explanatory. If not, consider doing some study before diving into PyTorch. There are many online tutorials and lessons covering the topic.

Back to the code. In the __init__ method, you define the layers of the model. In this case, it is just one linear layer. The forward method is the one that is called when you call the model, similar to the __call__ method in normal Python classes.

Now you can define an instance of your LinearRegression model as model = LinearRegression(1, 1), indicating the number of inputs and outputs.

Maybe you are now asking why I don't simply do model = nn.Linear(1, 1), and you are absolutely right. The reason I go to the trouble of defining a LinearRegression class is to have a template for future improvements, as you will see later.

1.3 How to Train Your Model

The training process is based on a sequence of 4 steps that repeat iteratively:

  • Forward pass: The input data is given to the model and the model outputs are obtained — outputs = model(inputs)
  • The loss function is computed: For the purpose of the linear regression problem, the loss function we are using is the mean squared error (MSE). We often refer to this function as the criterion — loss = criterion(outputs, targets)
  • Backward pass: The gradients of the loss function with respect to each learnable parameter are computed. Remember that we want to reduce the loss function to make the outputs close to the targets. The gradients tell how the loss changes if you increase or decrease each parameter — loss.backward()
  • Update parameters: Update the value of the parameters by a small amount in the direction that reduces the loss. The method to update the parameters can be as simple as subtracting the value of the gradient multiplied by a small number. This number is referred to as the learning rate, and the optimizer I just described is Stochastic Gradient Descent (SGD) — optimizer.step()

I haven't defined the criterion and optimizer exactly yet, but I will in a minute. This is just to give you a general overview and understanding of the steps in a training iteration, or as it is usually called, a training epoch.

Let’s define our fit function that will do all the required steps.
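A minimal sketch of such a fit function, consistent with the four steps above (the exact signature and variable names are my assumptions):

```python
def fit(model, inputs, targets, criterion, optimizer, epochs=100):
    losses = []
    for epoch in range(epochs):
        optimizer.zero_grad()               # reset the gradients from the previous epoch
        outputs = model(inputs)             # forward pass
        loss = criterion(outputs, targets)  # compute the loss
        loss.backward()                     # backward pass: compute the gradients
        optimizer.step()                    # update the parameters
        losses.append(loss.item())          # save the loss value at each epoch
    return losses
```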

Notice that there's an extra step I didn't mention before: optimizer.zero_grad(). This is because, by default, PyTorch accumulates the gradients each time you call loss.backward(). If you don't set them to zero at each epoch they will keep adding up, and that's not desirable, unless you are doing gradient accumulation, but that's a more advanced topic. Besides that, as you can see in the code above, I'm saving the value of the loss at each epoch. We should expect it to drop steadily, meaning that the model is getting better at predicting the targets.

As I mentioned above, for linear regression the criterion usually used is the MSE. As for the optimizer, nowadays I always use Adam as my first choice. It's fast and it should work well for most problems. I won't go into detail about how Adam works for now, but the idea is always to find the best solution in the least amount of time.

Let's now move on to creating an instance of our LinearRegression model and defining our criterion and our optimizer:
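A sketch of this setup (the learning rate value is my assumption):

```python
import torch.nn as nn
import torch.optim as optim

model = LinearRegression(1, 1)                      # 1 input, 1 output
criterion = nn.MSELoss()                            # mean squared error loss
optimizer = optim.Adam(model.parameters(), lr=0.1)  # lr=0.1 is an assumption; try other values
```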

model.parameters() is the way to give the optimizer the list of trainable parameters and lr is the learning rate.

Now let’s create some data and train the model!
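A sketch of the data generation matching the description below (the 0.3 noise scale and the [0, 1) range of x are my assumptions):

```python
import torch

x = torch.rand(10000)                 # 10000 random points in [0, 1)
noise = 0.3 * x * torch.randn(10000)  # noise that grows with x; the 0.3 scale is an assumption
y = 2 * x + 1 + noise                 # true model: y = 2x + 1 + noise
x_train = x.unsqueeze(-1)             # shape [10000] -> [10000, 1]
y_train = y.unsqueeze(-1)
```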

The data is simply a set of points following the model y = 2x + 1 + noise. To make it a little more interesting, I make the noise larger for larger values of x. The unsqueeze(-1) calls are just to add an extra dimension at the end of the tensors (from [10000] to [10000, 1]). The data is the same, but the tensor needs to have this shape, meaning that we have 10000 samples and 1 feature per sample.

Plotting the data, the result is the image below, where you can see the true model and the input data + noise.


Input data for the linear regression model. Image by the author.

And now to train the model we just run our fit function!
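With the sketches above, that call would look like:

```python
losses = fit(model, x_train, y_train, criterion, optimizer, epochs=100)
```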

After training, we can plot the evolution of the loss over the 100 epochs. As you can see in the image below, the loss starts at about 2.0 and then drops steeply down to nearly zero. This is to be expected: when we start, the model parameters are randomly initialized, and as training progresses they converge to the solution.


Evolution of the loss (MSE) for the 100 epochs of training. Image by the author.

Note: Try playing with the learning rate value to see how it affects the training!

To check the parameters of the trained model, you can run list(model.parameters()) after training the model. You will see that they are very close to 2.0 and 1.0 for this example since the true model is y = 2x + 1 .

You can now compute the model estimates with ye = model(x_train). (Notice that before computing the estimates you should always run model.eval() to set the model to evaluation mode. It won't make a difference for this simple model, but it will later, when we start using Batch Normalization and Dropout.)
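A sketch of that inference step (wrapping it in torch.no_grad() is my addition, to avoid tracking gradients during inference):

```python
import torch

model.eval()             # set the model to evaluation mode
with torch.no_grad():    # my addition: gradients are not needed for inference
    ye = model(x_train)  # compute the model estimates
```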

Plotting the prediction, you can see that it matches the true data almost perfectly, despite the fact that the model could only see the noisy data.


Visualizing the model estimates. Image by the author.
