A Simple Linear Regression using PyTorch

Jeril Kuriakose
Sep 15, 2022
Image by rawpixel.com on Freepik: https://www.freepik.com/free-photo/ai-technology-brain-background-digital-transformation-concept_17122619.htm

In this article, we will build a simple Linear Regression model using PyTorch. We will cover the following steps:

  • Step 1: Generate and split the data
  • Step 2: Processing generated data
  • Step 3: Build Linear Regression model
  • Step 4: Training the Linear Regression model
  • Step 5: Saving the trained model
  • Step 6: Loading the saved model
  • Step 7: Testing the trained model

Dependencies

  • PyTorch
  • Scikit-learn
  • Numpy

Step 1: Generate and split the data

Let's generate our regression dataset using Scikit-learn:

from sklearn import datasets

X, y = datasets.make_regression(
    n_samples=1000, n_features=10, noise=5, random_state=4)

In the dataset we have 1000 samples and 10 features.

We need to reshape the target variable y to make it work with MinMaxScaler.

y = y.reshape(-1, 1)
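
MinMaxScaler expects a 2-D array, which is why the reshape above turns the 1-D target into a column vector. A quick sanity check:

print(y.shape)  # (1000, 1); it was (1000,) before the reshape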

Transform the data by scaling each feature to the range (0, 1):

from sklearn.preprocessing import MinMaxScaler

X_scaler = MinMaxScaler()
X_scaled = X_scaler.fit_transform(X)
y_scaler = MinMaxScaler()
y_scaled = y_scaler.fit_transform(y)

Next, let's split the data into training and testing sets; 33% of the data is used for testing.

from sklearn.model_selection import train_test_split

X_train, X_test, y_train, y_test = train_test_split(
    X_scaled, y_scaled, test_size=0.33, random_state=42)
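
With a 0.33 test split of 1000 samples, a quick shape check (expected values shown in the comments) gives:

print(X_train.shape, X_test.shape)  # (670, 10) (330, 10)
print(y_train.shape, y_test.shape)  # (670, 1) (330, 1)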

Step 2: Processing generated data

After obtaining the training and testing datasets, we process the data using PyTorch's Dataset and DataLoader. Dataset stores the samples and their corresponding labels, and DataLoader wraps an iterable around the Dataset to enable easy access to the samples.

import numpy as np
import torch
from torch.utils.data import Dataset, DataLoader

class Data(Dataset):
    def __init__(self, X: np.ndarray, y: np.ndarray) -> None:
        # need to convert float64 to float32, else
        # we will get the following error:
        # RuntimeError: expected scalar type Double but found Float
        self.X = torch.from_numpy(X.astype(np.float32))
        self.y = torch.from_numpy(y.astype(np.float32))
        self.len = self.X.shape[0]

    def __getitem__(self, index: int) -> tuple:
        return self.X[index], self.y[index]

    def __len__(self) -> int:
        return self.len

We created a class inheriting from torch.utils.data.Dataset. The training data is then created as follows:

traindata = Data(X_train, y_train)

Now the training data can be easily accessed using an index:

traindata[34]
'''
# Output:
(tensor([0.5437, 0.4400, 0.4302, 0.6022, 0.5663, 0.4369, 0.7114, 1.0000, 0.5277, 0.6294]), tensor([0.9945]))
'''

We can also slice the training data as follows:

traindata[34:36]
'''
# Output:
(tensor([[0.5437, 0.4400, 0.4302, 0.6022, 0.5663, 0.4369, 0.7114, 1.0000, 0.5277, 0.6294], [0.5033, 0.5693, 0.4204, 0.6245, 0.3367, 0.4202, 0.6300, 0.4162, 0.2972, 0.4697]]), tensor([[0.9945], [0.4330]]))
'''

Next, we load the training data using the DataLoader; we set batch_size to 64 and num_workers to 2.

The num_workers argument tells the data loader how many sub-processes to use for data loading. If num_workers is zero (the default), the GPU has to wait for the CPU to load the data. In theory, the greater num_workers is, the more efficiently the CPU loads data and the less the GPU has to wait.

batch_size = 64
num_workers = 2
trainloader = DataLoader(traindata,
                         batch_size=batch_size,
                         shuffle=True,
                         num_workers=num_workers)
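
To confirm the loader works, we can pull a single batch and inspect its shape. With 670 training samples and a batch size of 64, each epoch has 11 batches, which matches the training log shown later:

inputs, labels = next(iter(trainloader))
print(inputs.shape, labels.shape)
# torch.Size([64, 10]) torch.Size([64, 1])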

Step 3: Build Linear Regression model

Now let's build our Linear Regression model:

from torch import nn

class LinearRegression(nn.Module):
    def __init__(self, input_dim: int,
                 hidden_dim: int, output_dim: int) -> None:
        super(LinearRegression, self).__init__()
        self.input_to_hidden = nn.Linear(input_dim, hidden_dim)
        self.hidden_layer_1 = nn.Linear(hidden_dim, hidden_dim)
        self.hidden_layer_2 = nn.Linear(hidden_dim, hidden_dim)
        self.hidden_to_output = nn.Linear(hidden_dim, output_dim)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        x = self.input_to_hidden(x)
        x = self.hidden_layer_1(x)
        x = self.hidden_layer_2(x)
        x = self.hidden_to_output(x)
        return x

The Linear Regression model has 4 layers, as follows:

  • Input Layer
  • Hidden Layer 1
  • Hidden Layer 2
  • Output Layer

Since it is a linear regression model, we do not need activation functions after each layer, and no activation function is required at the final output layer either. In fact, without nonlinearities the stacked linear layers compose into a single affine map, so the deeper network is still mathematically a linear regression.
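
A minimal sketch verifying this composition property (the weight tensors below are illustrative, not the model's actual parameters; shapes mirror input_dim=10, hidden_dim=50, output_dim=1):

W1, b1 = torch.randn(50, 10), torch.randn(50)
W2, b2 = torch.randn(1, 50), torch.randn(1)
x = torch.randn(10)

# two stacked linear maps...
two_layer = W2 @ (W1 @ x + b1) + b2
# ...equal one collapsed linear map
collapsed = (W2 @ W1) @ x + (W2 @ b1 + b2)
print(torch.allclose(two_layer, collapsed, atol=1e-4))  # True (up to float32 rounding)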

We can initialize the model by instantiating it:

# number of features (len of X cols)
input_dim = X_train.shape[1]
# number of hidden units in each hidden layer
hidden_layers = 50
# output dimension is 1 because of linear regression
output_dim = 1
# initialize the model
model = LinearRegression(input_dim, hidden_layers, output_dim)
print(model)
'''
# Output:
LinearRegression(
  (input_to_hidden): Linear(in_features=10, out_features=50, bias=True)
  (hidden_layer_1): Linear(in_features=50, out_features=50, bias=True)
  (hidden_layer_2): Linear(in_features=50, out_features=50, bias=True)
  (hidden_to_output): Linear(in_features=50, out_features=1, bias=True)
)
'''

Next, let's define our loss function and the optimizer:

# criterion that computes the loss between input and target
criterion = nn.MSELoss()
# optimizer that will be used to update weights and biases
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)

Step 4: Training the Linear Regression model

Now we are all set for training; let's code our training loop:

epochs = 1000
for epoch in range(epochs):
    running_loss = 0.0
    for i, (inputs, labels) in enumerate(trainloader):
        # forward propagation
        outputs = model(inputs)
        loss = criterion(outputs, labels)
        # set optimizer to zero grad
        # to remove previous epoch gradients
        optimizer.zero_grad()
        # backward propagation
        loss.backward()
        # optimize
        optimizer.step()
        running_loss += loss.item()
    # display statistics
    if not ((epoch + 1) % (epochs // 10)):
        print(f'Epochs:{epoch + 1:5d} | '
              f'Batches per epoch: {i + 1:3d} | '
              f'Loss: {running_loss / (i + 1):.10f}')

We train our Linear Regression model for 1000 epochs and print the loss every 100 epochs. The following is the output:

Epochs:  100 | Batches per epoch:  11 | Loss: 0.0058035171
Epochs:  200 | Batches per epoch:  11 | Loss: 0.0000493603
Epochs:  300 | Batches per epoch:  11 | Loss: 0.0000644553
Epochs:  400 | Batches per epoch:  11 | Loss: 0.0000536770
Epochs:  500 | Batches per epoch:  11 | Loss: 0.0000425602
Epochs:  600 | Batches per epoch:  11 | Loss: 0.0000533999
Epochs:  700 | Batches per epoch:  11 | Loss: 0.0000640242
Epochs:  800 | Batches per epoch:  11 | Loss: 0.0000378003
Epochs:  900 | Batches per epoch:  11 | Loss: 0.0000374429
Epochs: 1000 | Batches per epoch:  11 | Loss: 0.0000628052

Step 5: Saving the trained model

Now let's save our trained model:

# save the trained model
PATH = './mymodel.pth'
torch.save(model.state_dict(), PATH)
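
One caveat: inverse-transforming predictions in Step 7 needs the fitted y_scaler, so if inference happens in a fresh session the scalers should be persisted as well. A minimal sketch using joblib (the file names are illustrative):

import joblib

joblib.dump(X_scaler, './X_scaler.joblib')
joblib.dump(y_scaler, './y_scaler.joblib')
# later, in the inference session:
# y_scaler = joblib.load('./y_scaler.joblib')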

Step 6: Loading the saved model

The locally saved model can then be loaded for inference, using the following:

model = LinearRegression(input_dim, hidden_layers, output_dim)
model.load_state_dict(torch.load(PATH))
'''
# Output
<All keys matched successfully>
'''
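
This particular model has no dropout or batch-norm layers, so evaluation mode changes nothing here, but switching it on before inference is a good habit:

model.eval()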

Step 7: Testing the trained model

Once the model is loaded, we can test our trained model. Let's test on a single mini-batch.

testdata = Data(X_test, y_test)
testloader = DataLoader(testdata, batch_size=batch_size,
                        shuffle=True, num_workers=num_workers)

Get a single mini-batch from the DataLoader:

dataiter = iter(testloader)
# use the built-in next(); DataLoader iterators
# no longer expose a .next() method
inputs, labels = next(dataiter)

Now let's run the inference:

predictions = model(inputs)
predictions_np = predictions.cpu().detach().numpy()
# inverse transform of the predictions
predictions = y_scaler.inverse_transform(predictions_np).reshape(-1)
print(predictions)
'''
# Output:
[ -58.641613 -43.134 45.21187 -207.97401 262.29315 112.10317 129.15402 38.720352 63.152897 -129.16345 95.52067 -69.0283 ... ]
'''

Looks like our code is working as expected; let's run the inference for the entire test dataset.

import torch.nn.functional as F

with torch.no_grad():
    loss = 0
    for i, (inputs, labels) in enumerate(testloader):
        # calculate output by running through the network
        predictions = model(inputs)
        labels = torch.from_numpy(
            y_scaler.inverse_transform(labels))
        predictions = torch.from_numpy(
            y_scaler.inverse_transform(predictions))
        loss += F.mse_loss(predictions, labels)
print(f'MSE Loss: {loss / (i + 1):.5f}')
'''
# Output:
MSE Loss: 30.67127
'''

An MSE of about 30.7 corresponds to an RMSE of roughly 5.5, which is already close to the noise standard deviation of 5 we used when generating the data; still, the model and training setup can be tuned further to improve accuracy.
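
As one possible tweak (a sketch, not the article's method), the SGD optimizer could be swapped for Adam, which often needs less learning-rate tuning:

# hypothetical variant: Adam instead of SGD
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)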

Entire Code:

The following is the link to the entire code:

Happy Coding!!!
