We will use the lower back pain symptoms dataset available on Kaggle. This dataset has 13 columns, where the first 12 are the features and the last column is the target column. The dataset has 310 rows.
Import Libraries
import numpy as np
import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt

import torch
import torch.nn as nn
import torch.optim as optim
from torch.utils.data import Dataset, DataLoader

from sklearn.preprocessing import StandardScaler
from sklearn.model_selection import train_test_split
from sklearn.metrics import confusion_matrix, classification_report
Read Data
df = pd.read_csv("data/tabular/classification/spine_dataset.csv")
df.head()
EDA and Preprocessing
Class Distribution
There is a class imbalance here. While there's a lot that can be done to combat class imbalance, it is outside the scope of this blog post.
sns.countplot(x = 'Class_att', data=df)
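To see the exact counts behind the plot, we can also print them directly (a small addition, not part of the original code):

print(df['Class_att'].value_counts())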
Encode Output Class
PyTorch expects labels starting from 0, that is, in the range [0, n-1] for n classes. We need to remap our labels to start from 0.
df['Class_att'] = df['Class_att'].astype('category')

encode_map = {
    'Abnormal': 1,
    'Normal': 0
}

df['Class_att'].replace(encode_map, inplace=True)
Create Input and Output Data
The last column is our output. The input is all the columns but the last one. Here we use the .iloc method from the Pandas library to select our input and output columns.
X = df.iloc[:, 0:-1]
y = df.iloc[:, -1]
Train Test Split
We now split our data into train and test sets. We've selected 33% of our data for the test set.
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.33, random_state=69)
Standardize Input
For neural networks to train properly, we need to standardize the input values. We standardize features by removing the mean and scaling to unit variance. The standard score of a sample x, where the mean is u and the standard deviation is s, is calculated as:

z = (x - u) / s
You can find more about standardization/normalization in neural nets here.
scaler = StandardScaler()
# Fit on the training data only, then apply the same transformation to the test data.
X_train = scaler.fit_transform(X_train)
X_test = scaler.transform(X_test)
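As a quick sanity check (a small sketch, not part of the original pipeline), we can confirm that StandardScaler computes exactly the z-score formula above:

# Verify that StandardScaler matches z = (x - u) / s on a toy column
x = np.array([[1.0], [2.0], [3.0], [4.0]])
z_manual = (x - x.mean()) / x.std()            # numpy's default std is the population std, same as StandardScaler
z_sklearn = StandardScaler().fit_transform(x)
print(np.allclose(z_manual, z_sklearn))        # True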
Model Parameters
To train our models, we need to set some hyper-parameters. Note that this is a very simple neural network; as a result, we do not tune a lot of hyper-parameters. The goal is to get to know how PyTorch works.
EPOCHS = 50
BATCH_SIZE = 64
LEARNING_RATE = 0.001
Define Custom Dataloaders
Here we define a Dataloader. If this is new to you, I suggest you read the following blog post on Dataloaders and come back.
## train data
class trainData(Dataset):

    def __init__(self, X_data, y_data):
        self.X_data = X_data
        self.y_data = y_data

    def __getitem__(self, index):
        return self.X_data[index], self.y_data[index]

    def __len__(self):
        return len(self.X_data)


train_data = trainData(torch.FloatTensor(X_train), torch.FloatTensor(y_train))


## test data
class testData(Dataset):

    def __init__(self, X_data):
        self.X_data = X_data

    def __getitem__(self, index):
        return self.X_data[index]

    def __len__(self):
        return len(self.X_data)


test_data = testData(torch.FloatTensor(X_test))
Let's initialize our dataloaders. We'll use a batch_size = 1 for our test dataloader.
train_loader = DataLoader(dataset=train_data, batch_size=BATCH_SIZE, shuffle=True)
test_loader = DataLoader(dataset=test_data, batch_size=1)
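As a quick sanity check (optional, not in the original post), we can pull one batch from train_loader and inspect its shape:

X_sample, y_sample = next(iter(train_loader))
print(X_sample.shape, y_sample.shape)   # torch.Size([64, 12]) torch.Size([64])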
Define Neural Net Architecture
Here, we define a feed-forward network with two hidden layers and an output layer, with BatchNorm and Dropout.
In our __init__() function, we define the layers we want to use, while in the forward() function we call the defined layers.
Since the number of input features in our dataset is 12, the input to our first nn.Linear layer is 12. The output could be any number you want. The only thing you need to ensure is that the number of output features of one layer is equal to the number of input features of the next layer. Read more about nn.Linear in the docs.
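As a small illustration of that shape constraint (a standalone sketch, not the model we build below), chaining two nn.Linear layers only works when the output features of the first match the input features of the second:

layer_a = nn.Linear(12, 64)   # 12 input features -> 64 output features
layer_b = nn.Linear(64, 1)    # must accept 64 features, the output size of layer_a
dummy = torch.randn(8, 12)    # a fake batch of 8 samples with 12 features
print(layer_b(layer_a(dummy)).shape)   # torch.Size([8, 1])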
Similarly, we define ReLU, Dropout, and BatchNorm layers.
Once we've defined all these layers, it's time to use them. In the forward() function, we take inputs as our input and pass it through the layers we initialized.
The first line of the forward() function takes the input, passes it through our first linear layer, and then applies the ReLU activation on it. Then we apply BatchNorm on the output. Look at the following code to understand it better.
Note that we did not use a Sigmoid activation during training. That's because we use the nn.BCEWithLogitsLoss() loss function, which automatically applies the Sigmoid activation. We do, however, need to apply Sigmoid manually during inference.
class binaryClassification(nn.Module):
    def __init__(self):
        super(binaryClassification, self).__init__()
        # Number of input features is 12.
        self.layer_1 = nn.Linear(12, 64)
        self.layer_2 = nn.Linear(64, 64)
        self.layer_out = nn.Linear(64, 1)

        self.relu = nn.ReLU()
        self.dropout = nn.Dropout(p=0.1)
        self.batchnorm1 = nn.BatchNorm1d(64)
        self.batchnorm2 = nn.BatchNorm1d(64)

    def forward(self, inputs):
        x = self.relu(self.layer_1(inputs))
        x = self.batchnorm1(x)
        x = self.relu(self.layer_2(x))
        x = self.batchnorm2(x)
        x = self.dropout(x)
        x = self.layer_out(x)
        return x
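The following is a small check (not from the original post) of why the sigmoid can be left out of the model during training: nn.BCEWithLogitsLoss on raw logits gives the same value as nn.BCELoss applied after a sigmoid.

logits = torch.tensor([0.8, -1.2, 2.5])
targets = torch.tensor([1.0, 0.0, 1.0])

loss_with_logits = nn.BCEWithLogitsLoss()(logits, targets)
loss_manual = nn.BCELoss()(torch.sigmoid(logits), targets)
print(torch.isclose(loss_with_logits, loss_manual))   # tensor(True)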
Once we've defined our architecture, we check whether a GPU is available. The amazing thing about PyTorch is that it's super easy to use the GPU.
The variable device will say cuda:0 if we have a GPU; if not, it'll say cpu. You can follow along with this tutorial without any change in code even if you do not have a GPU.
device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")
print(device)

###################### OUTPUT ######################

cuda:0
Next, we need to initialize our model. After initializing it, we move it to device. This device is a GPU if you have one, or a CPU if you don't. The network we've used is fairly small, so it will not take a lot of time to train.
After this, we initialize our optimizer and decide on which loss function to use.
model = binaryClassification()
model.to(device)
print(model)

criterion = nn.BCEWithLogitsLoss()
optimizer = optim.Adam(model.parameters(), lr=LEARNING_RATE)

###################### OUTPUT ######################

binaryClassification(
  (layer_1): Linear(in_features=12, out_features=64, bias=True)
  (layer_2): Linear(in_features=64, out_features=64, bias=True)
  (layer_out): Linear(in_features=64, out_features=1, bias=True)
  (relu): ReLU()
  (dropout): Dropout(p=0.1, inplace=False)
  (batchnorm1): BatchNorm1d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
  (batchnorm2): BatchNorm1d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
)
Train the model
Before we start the actual training, let’s define a function to calculate accuracy during training.
In the function below, we take the predicted and actual outputs as input. The raw prediction (a logit) is passed through a sigmoid and then rounded off to convert it into either a 0 or a 1.
Once that is done, we simply compare the predicted 1s/0s to the 1s/0s actually present and calculate the accuracy.
Note that the inputs y_pred and y_test are for a batch. Our batch_size was 64. So, this accuracy is being calculated for 64 predictions.
def binary_acc(y_pred, y_test):
    y_pred_tag = torch.round(torch.sigmoid(y_pred))

    correct_results_sum = (y_pred_tag == y_test).sum().float()
    acc = correct_results_sum / y_test.shape[0]
    acc = torch.round(acc * 100)

    return acc
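Here is a quick illustration of the helper on a made-up mini-batch (the values are hypothetical, just for demonstration):

# 4 raw logits and their targets; sigmoid turns the logits into probabilities before rounding
y_pred_demo = torch.tensor([[2.0], [-1.0], [0.5], [-3.0]])
y_true_demo = torch.tensor([[1.0], [0.0], [0.0], [0.0]])
print(binary_acc(y_pred_demo, y_true_demo))   # tensor(75.) -- 3 of 4 predictions are correct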
The moment we’ve been waiting for has arrived. Let’s train our model.
You can see we've put a model.train() call before the loop. model.train() tells PyTorch that you're in training mode.
Well, why do we need to do that? If you're using layers such as Dropout or BatchNorm, which behave differently during training and evaluation, you need to tell PyTorch to act accordingly. The default mode in PyTorch is train mode, so you don't have to write it explicitly, but it's good practice.
Similarly, we'll call model.eval() when we test our model. We'll see that below.
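A tiny demonstration (separate from our model) of how the mode changes a layer's behaviour: in train mode Dropout zeroes activations at random, while in eval mode it does nothing.

drop = nn.Dropout(p=0.5)
demo = torch.ones(1, 10)

drop.train()
print(drop(demo))   # roughly half the entries are zeroed, the survivors are scaled by 2

drop.eval()
print(drop(demo))   # identity: all ones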
Back to training: we start a for loop. At the top of this for loop, we initialize our loss and accuracy per epoch to 0. After every epoch, we'll print out the loss/accuracy and reset them back to 0.
Then we have another for loop. This for loop is used to get our data in batches from train_loader.
We do optimizer.zero_grad() before we make any predictions. Since the backward() function accumulates gradients, we need to set them to 0 manually for every mini-batch.
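A minimal sketch (unrelated to our model) showing that gradients do accumulate across backward() calls unless they are cleared:

w = torch.ones(1, requires_grad=True)
(w * 2).backward()
(w * 2).backward()
print(w.grad)    # tensor([4.]) -- the two gradients of 2 were summed
w.grad.zero_()   # this is what optimizer.zero_grad() does for every model parameter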
From our defined model, we then obtain a prediction, get the loss (and accuracy) for that mini-batch, and perform backpropagation using loss.backward() and optimizer.step(). Finally, we add up all the mini-batch losses (and accuracies) and divide by the number of batches to obtain the average loss (and accuracy) for the epoch.
This loss and accuracy is printed out in the outer for loop.
model.train()
for e in range(1, EPOCHS+1):
    epoch_loss = 0
    epoch_acc = 0
    for X_batch, y_batch in train_loader:
        X_batch, y_batch = X_batch.to(device), y_batch.to(device)
        optimizer.zero_grad()

        y_pred = model(X_batch)

        loss = criterion(y_pred, y_batch.unsqueeze(1))
        acc = binary_acc(y_pred, y_batch.unsqueeze(1))

        loss.backward()
        optimizer.step()

        epoch_loss += loss.item()
        epoch_acc += acc.item()

    print(f'Epoch {e+0:03}: | Loss: {epoch_loss/len(train_loader):.5f} | Acc: {epoch_acc/len(train_loader):.3f}')

###################### OUTPUT ######################

Epoch 001: | Loss: 0.04027 | Acc: 98.250
Epoch 002: | Loss: 0.12023 | Acc: 96.750
Epoch 003: | Loss: 0.02067 | Acc: 99.500
Epoch 004: | Loss: 0.07329 | Acc: 96.250
Epoch 005: | Loss: 0.04676 | Acc: 99.250
Epoch 006: | Loss: 0.03005 | Acc: 99.500
Epoch 007: | Loss: 0.05777 | Acc: 98.250
Epoch 008: | Loss: 0.03446 | Acc: 99.500
Epoch 009: | Loss: 0.03443 | Acc: 100.000
Epoch 010: | Loss: 0.03368 | Acc: 100.000
Epoch 011: | Loss: 0.02395 | Acc: 100.000
Epoch 012: | Loss: 0.05094 | Acc: 98.250
Epoch 013: | Loss: 0.03618 | Acc: 98.250
Epoch 014: | Loss: 0.02143 | Acc: 100.000
Epoch 015: | Loss: 0.02730 | Acc: 99.500
Epoch 016: | Loss: 0.02323 | Acc: 100.000
Epoch 017: | Loss: 0.03395 | Acc: 98.250
Epoch 018: | Loss: 0.08600 | Acc: 96.750
Epoch 019: | Loss: 0.02394 | Acc: 100.000
Epoch 020: | Loss: 0.02363 | Acc: 100.000
Epoch 021: | Loss: 0.01660 | Acc: 100.000
Epoch 022: | Loss: 0.05766 | Acc: 96.750
Epoch 023: | Loss: 0.02115 | Acc: 100.000
Epoch 024: | Loss: 0.01331 | Acc: 100.000
Epoch 025: | Loss: 0.01504 | Acc: 100.000
Epoch 026: | Loss: 0.01727 | Acc: 100.000
Epoch 027: | Loss: 0.02128 | Acc: 100.000
Epoch 028: | Loss: 0.01106 | Acc: 100.000
Epoch 029: | Loss: 0.05802 | Acc: 98.250
Epoch 030: | Loss: 0.01275 | Acc: 100.000
Epoch 031: | Loss: 0.01272 | Acc: 100.000
Epoch 032: | Loss: 0.01949 | Acc: 100.000
Epoch 033: | Loss: 0.02848 | Acc: 100.000
Epoch 034: | Loss: 0.01514 | Acc: 100.000
Epoch 035: | Loss: 0.02949 | Acc: 100.000
Epoch 036: | Loss: 0.00895 | Acc: 100.000
Epoch 037: | Loss: 0.01692 | Acc: 100.000
Epoch 038: | Loss: 0.01678 | Acc: 100.000
Epoch 039: | Loss: 0.02755 | Acc: 100.000
Epoch 040: | Loss: 0.02021 | Acc: 100.000
Epoch 041: | Loss: 0.07972 | Acc: 98.250
Epoch 042: | Loss: 0.01421 | Acc: 100.000
Epoch 043: | Loss: 0.01558 | Acc: 100.000
Epoch 044: | Loss: 0.01185 | Acc: 100.000
Epoch 045: | Loss: 0.01830 | Acc: 100.000
Epoch 046: | Loss: 0.01367 | Acc: 100.000
Epoch 047: | Loss: 0.00880 | Acc: 100.000
Epoch 048: | Loss: 0.01046 | Acc: 100.000
Epoch 049: | Loss: 0.00933 | Acc: 100.000
Epoch 050: | Loss: 0.11034 | Acc: 98.250
Test the model
After training is done, we need to test how our model fared. See that we've used model.eval() before we run our testing code. To tell PyTorch that we do not need gradients during inference, we use torch.no_grad(), which reduces memory usage and speeds up computation.
We start by defining a list that will hold our predictions. Then we loop through our batches using test_loader. For each batch,
- We make the predictions using our trained model.
- Round off the probabilities to 1 or 0.
- Move the batch to the CPU from the GPU.
- Convert the tensor to a numpy object and append it to our list.
- Flatten out the list so that we can use it as an input to confusion_matrix and classification_report.
y_pred_list = []

model.eval()
with torch.no_grad():
    for X_batch in test_loader:
        X_batch = X_batch.to(device)
        y_test_pred = model(X_batch)
        y_test_pred = torch.sigmoid(y_test_pred)
        y_pred_tag = torch.round(y_test_pred)
        y_pred_list.append(y_pred_tag.cpu().numpy())

y_pred_list = [a.squeeze().tolist() for a in y_pred_list]
Confusion Matrix
Once we have all our predictions, we use the confusion_matrix() function from scikit-learn to calculate the confusion matrix.
confusion_matrix(y_test, y_pred_list)

###################### OUTPUT ######################

array([[23,  8],
       [12, 60]])
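Optionally (a small addition, not in the original post), the confusion matrix can be visualized with the seaborn library we already imported:

cm = confusion_matrix(y_test, y_pred_list)
sns.heatmap(cm, annot=True, fmt='d')   # rows: actual class, columns: predicted class
plt.xlabel('Predicted')
plt.ylabel('Actual')
plt.show()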
Classification Report
To obtain the classification report, which has precision, recall, and F1-score, we use the function classification_report.
print(classification_report(y_test, y_pred_list))

###################### OUTPUT ######################

              precision    recall  f1-score   support

           0       0.66      0.74      0.70        31
           1       0.88      0.83      0.86        72

    accuracy                           0.81       103
   macro avg       0.77      0.79      0.78       103
weighted avg       0.81      0.81      0.81       103