Updated April 5, 2023

Introduction to PyTorch Load Model

In PyTorch, a model is a Python class that subclasses torch.nn.Module and holds its learnable parameters. Three functions are central to saving and loading a model: torch.save, torch.load, and torch.nn.Module.load_state_dict. torch.save serializes an object to disk with Python's pickle module, and torch.load uses pickle to deserialize it back into memory. torch.load can also remap the loaded tensors to a chosen device through its map_location argument. In this topic, we are going to learn about PyTorch Load Model.

What is the PyTorch Load Model?

A PyTorch model is a torch.nn.Module whose parameters are tensors stored on the CPU or on a CUDA device. To reuse a trained model, it must be saved to disk and loaded back later. Saving the whole model object takes the least code, but the pickle module does not store the model class itself; it stores only a path to the file containing the class, which is used at load time. The code can therefore break if the class is moved or refactored.
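
The learnable parameters that load_state_dict restores live in a dictionary called the state_dict. As a minimal sketch, the example below prints the parameter names and shapes of a small model. TinyNet is a hypothetical class used only for illustration; it does not come from the original article.

```python
import torch
from torch import nn

# TinyNet is a hypothetical model used only for illustration.
class TinyNet(nn.Module):
    def __init__(self):
        super().__init__()
        self.fc = nn.Linear(4, 2)

    def forward(self, x):
        return self.fc(x)

model = TinyNet()

# state_dict maps each parameter name to its tensor.
shapes = {name: tuple(t.shape) for name, t in model.state_dict().items()}
print(shapes)
```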

How to save and load models in PyTorch?

torch.save(model.state_dict(), PATH)
model = TheModelClass(*args, **kwargs)
model.load_state_dict(torch.load(PATH))
model.eval()

This code saves the model's state_dict to PATH and then loads it into a new instance of the model class.

save: torch.save writes a serialized object to disk using the pickle module. Models, tensors, and dictionaries of all kinds of objects can be saved this way.
load: torch.load uses pickle's unpickling facilities to deserialize the saved file back into memory. When a whole object was pickled, its class must be importable at load time.

Call model.eval() before running inference so that dropout and batch normalization layers are set to evaluation mode.
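
The save and load calls above can be sketched as a complete round trip. This is an illustrative example under stated assumptions: TinyNet and the temporary file path are hypothetical stand-ins for TheModelClass and PATH.

```python
import os
import tempfile

import torch
from torch import nn

# TinyNet stands in for TheModelClass (hypothetical, for illustration).
class TinyNet(nn.Module):
    def __init__(self):
        super().__init__()
        self.fc = nn.Linear(4, 2)

    def forward(self, x):
        return self.fc(x)

model = TinyNet()
model.eval()

# Save only the parameters, not the whole pickled object.
path = os.path.join(tempfile.mkdtemp(), "tiny_net.pt")
torch.save(model.state_dict(), path)

# Re-create the architecture, then load the saved parameters into it.
restored = TinyNet()
restored.load_state_dict(torch.load(path))
restored.eval()

# Both models should now give identical outputs for the same input.
x = torch.randn(3, 4)
with torch.no_grad():
    same = torch.equal(model(x), restored(x))
print(same)
```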

Saving multiple models

We can save different models using this code.

torch.save({
            'model01_state_dict': model01.state_dict(),
            'model02_state_dict': model02.state_dict(),
            'optimizer01_state_dict': optimizer01.state_dict(),
            'optimizer02_state_dict': optimizer02.state_dict(),
            ...
            }, PATH)

Now we can load the same models using the below code.

model01 = TheModel01Class(*args, **kwargs)
model02 = TheModel02Class(*args, **kwargs)
optimizer01 = TheOptimizer01Class(*args, **kwargs)
optimizer02 = TheOptimizer02Class(*args, **kwargs)

checkpoint = torch.load(PATH)
model01.load_state_dict(checkpoint['model01_state_dict'])
model02.load_state_dict(checkpoint['model02_state_dict'])
optimizer01.load_state_dict(checkpoint['optimizer01_state_dict'])
optimizer02.load_state_dict(checkpoint['optimizer02_state_dict'])

# for inference:
model01.eval()
model02.eval()
# - or - to resume training:
model01.train()
model02.train()

When saving multiple models, save each model's state_dict together with its optimizer's state_dict. Further models or other items can be stored by adding them to the same dictionary. A common convention is to save such a general checkpoint with the .tar file extension. To load, first initialize the models and optimizers, then load the dictionary with torch.load and pass each entry to the matching load_state_dict. Call model.eval() before inference to set dropout and batch normalization layers to evaluation mode; otherwise the results will be inconsistent. To resume training instead, call model.train() so that these layers are in training mode.
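
A runnable sketch of the checkpoint pattern described above follows. The make_pair helper, the nn.Linear models, the SGD settings, and the temporary path are assumptions made for illustration; only the dictionary keys mirror the article's code.

```python
import os
import tempfile

import torch
from torch import nn

def make_pair():
    # Hypothetical helper: builds one small model and its optimizer.
    model = nn.Linear(4, 2)
    optimizer = torch.optim.SGD(model.parameters(), lr=0.1, momentum=0.9)
    return model, optimizer

model01, optimizer01 = make_pair()
model02, optimizer02 = make_pair()

# Save all four state_dicts in one checkpoint dictionary.
path = os.path.join(tempfile.mkdtemp(), "checkpoint.tar")
torch.save({
    'model01_state_dict': model01.state_dict(),
    'model02_state_dict': model02.state_dict(),
    'optimizer01_state_dict': optimizer01.state_dict(),
    'optimizer02_state_dict': optimizer02.state_dict(),
}, path)

# Initialize fresh objects first, then load each entry into its match.
new_model01, new_optimizer01 = make_pair()
new_model02, new_optimizer02 = make_pair()

checkpoint = torch.load(path)
new_model01.load_state_dict(checkpoint['model01_state_dict'])
new_model02.load_state_dict(checkpoint['model02_state_dict'])
new_optimizer01.load_state_dict(checkpoint['optimizer01_state_dict'])
new_optimizer02.load_state_dict(checkpoint['optimizer02_state_dict'])

# Resume training; use .eval() instead when running inference.
new_model01.train()
new_model02.train()
```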

Saving Model Across Devices

The model should be saved first using the below code.

torch.save(model.state_dict(), PATH)

The next step is to load the model.

device_model = torch.device('cpu')
model = TheModelClass(*args, **kwargs)
model.load_state_dict(torch.load(PATH, map_location=device_model))

Pass torch.device('cpu') as map_location to torch.load when the model was trained on a GPU and is being loaded on a CPU. The map_location argument makes torch.load remap the tensors' storages to the CPU as they are loaded.
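
The map_location call can be tried as follows. This is only a sketch: the nn.Linear model and the temporary path are assumptions, and on a machine without a GPU the checkpoint is also written from the CPU, so the example illustrates the call rather than an actual device transfer.

```python
import os
import tempfile

import torch
from torch import nn

model = nn.Linear(4, 2)  # stand-in for TheModelClass (assumption)
path = os.path.join(tempfile.mkdtemp(), "model.pt")
torch.save(model.state_dict(), path)

# Force every loaded tensor onto the CPU, wherever it was saved from.
device_model = torch.device('cpu')
state = torch.load(path, map_location=device_model)

# Collect the device types of all loaded tensors.
devices = {t.device.type for t in state.values()}
print(devices)

restored = nn.Linear(4, 2)
restored.load_state_dict(state)
```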

If both the devices are GPU, we can use the below code.

torch.save(model.state_dict(), PATH)
device_model = torch.device("cuda")
model = TheModelClass(*args, **kwargs)
model.load_state_dict(torch.load(PATH))
model.to(device_model)

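Call model.to(torch.device('cuda')) to convert the model's parameter tensors to CUDA tensors. The inputs must be prepared for the model as well by calling .to(torch.device('cuda')) on every input tensor. Note that my_tensor.to(device) returns a new copy of my_tensor on the GPU and does not overwrite my_tensor, so the tensor must be overwritten manually: my_tensor = my_tensor.to(torch.device('cuda')).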
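
The device handling for the model and its inputs can be sketched like this. The fallback to the CPU when no GPU is present is an assumption added so the example runs anywhere; the nn.Linear model is a stand-in.

```python
import torch
from torch import nn

# Use the GPU when one is available, otherwise fall back to the CPU
# (the fallback is an assumption added for portability).
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

model = nn.Linear(4, 2)
model.to(device)  # for modules, .to() moves the parameters in place

inputs = torch.randn(3, 4)
# For tensors, .to() returns the result, so reassign the variable.
inputs = inputs.to(device)

with torch.no_grad():
    outputs = model(inputs)
print(outputs.device.type)
```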

To save the model on a CPU and load it on a GPU, use this code.

torch.save(model.state_dict(), PATH)
device_model = torch.device("cuda")
model = TheModelClass(*args, **kwargs)
model.load_state_dict(torch.load(PATH, map_location="cuda:0"))
model.to(device_model)

To save a model wrapped in torch.nn.DataParallel, save the state_dict of the underlying module.

torch.save(model.module.state_dict(), PATH)
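
The reason for going through .module can be seen by comparing parameter names. In this sketch the nn.Linear model and the temporary path are assumptions; the wrapper is never run, so no GPU is needed.

```python
import os
import tempfile

import torch
from torch import nn

model = nn.Linear(4, 2)
wrapped = nn.DataParallel(model)

# The wrapper prefixes every parameter name with "module.".
wrapped_keys = list(wrapped.state_dict().keys())
# The underlying module keeps the plain names.
plain_keys = list(wrapped.module.state_dict().keys())
print(wrapped_keys, plain_keys)

# Saving the inner state_dict lets an unwrapped model load it later.
path = os.path.join(tempfile.mkdtemp(), "parallel.pt")
torch.save(wrapped.module.state_dict(), path)

fresh = nn.Linear(4, 2)
fresh.load_state_dict(torch.load(path, map_location="cpu"))
```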

Loading function networks

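The example below loads MNIST through a DataLoader subclass that returns NumPy arrays instead of tensors. It requires torch and torchvision, which can be installed with pip install torch torchvision.
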
import numpy
from torch.utils import data
from torchvision.datasets import MNIST

def numpy_collate(batch01):
  if isinstance(batch01[0], numpy.ndarray):
    return numpy.stack(batch01)
  elif isinstance(batch01[0], (tuple,list)):
    transposed = zip(*batch01)
    return [numpy_collate(samples) for samples in transposed]
  else:
    return numpy.array(batch01)

class NumpyLoader(data.DataLoader):
  # DataLoader that collates batches into NumPy arrays instead of tensors.
  def __init__(self, dataset, batch_size=1,
                shuffle=False, sampler=None,
                batch_sampler=None, num_workers=0,
                pin_memory=False, drop_last=False,
                timeout=0, worker_init_fn=None):
    super().__init__(dataset,
        batch_size=batch_size,
        shuffle=shuffle,
        sampler=sampler,
        batch_sampler=batch_sampler,
        num_workers=num_workers,
        collate_fn=numpy_collate,
        pin_memory=pin_memory,
        drop_last=drop_last,
        timeout=timeout,
        worker_init_fn=worker_init_fn)

class FlattenAndCast(object):
  # Transform that flattens an image into a 1-D float32 NumPy array.
  def __call__(self, pic):
    return numpy.ravel(numpy.array(pic, dtype=numpy.float32))

# batch_size is assumed to be defined earlier; it is not shown here.
mnist_dataset = MNIST('/tmp/mnist/', download=True, transform=FlattenAndCast())
training_generator = NumpyLoader(mnist_dataset, batch_size=batch_size, num_workers=0)

import time

# one_hot, update, accuracy, parameters, number_epochs, n_targets and the
# train/test arrays are assumed to be defined elsewhere; they are not shown here.
for epoch in range(number_epochs):
  start_time = time.time()
  for x, y in training_generator:
    y = one_hot(y, n_targets)
    parameters = update(parameters, x, y)
  epoch_time = time.time() - start_time

  train_acc = accuracy(parameters, train_images, train_labels)
  test_acc = accuracy(parameters, test_images, test_labels)
  print("Epoch {} in {:0.2f} sec".format(epoch, epoch_time))
  print("Training set accuracy {}".format(train_acc))
  print("Test set accuracy {}".format(test_acc))
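
To see what numpy_collate does with a batch, the sketch below applies it to two hand-made (image, label) samples. The sample values are invented for illustration; the function body matches the one shown earlier in this section.

```python
import numpy

def numpy_collate(batch):
    # Stack arrays, recurse into tuples/lists, and wrap scalars in an array.
    if isinstance(batch[0], numpy.ndarray):
        return numpy.stack(batch)
    elif isinstance(batch[0], (tuple, list)):
        transposed = zip(*batch)
        return [numpy_collate(samples) for samples in transposed]
    else:
        return numpy.array(batch)

# Two fake samples: a flattened 4-pixel "image" and an integer label.
batch = [
    (numpy.zeros(4, dtype=numpy.float32), 3),
    (numpy.ones(4, dtype=numpy.float32), 7),
]

images, labels = numpy_collate(batch)
print(images.shape, labels)
```

The images are stacked into one array with a leading batch dimension, and the labels are gathered into a single array, which is the layout the training loop iterates over.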

Creating a Model

# Install from a shell first: pip install torch torchvision
import torch
print(torch.__version__)

# Skeleton of a custom dataset. Dataset, DataLoader and random_split
# come from torch.utils.data.
class CSVNewDataset(Dataset):
    def __init__(self, path):
        # load the inputs (A) and targets (B) here
        self.A = ...
        self.B = ...

    def __len__(self):
        return len(self.A)

    def __getitem__(self, index):
        return [self.A[index], self.B[index]]

dataset01 = CSVNewDataset(...)
training, testing = random_split(dataset01, [[...], [...]])
training_dl = DataLoader(training, batch_size=16, shuffle=True)
testing_dl = DataLoader(testing, batch_size=256, shuffle=False)
for k, (inputs, targets) in enumerate(training_dl):
    ...

class Machine(Module):
    def __init__(self, num_inputs):
        super(Machine, self).__init__()
        self.layer = Linear(num_inputs, 1)
        # initialize the layer weights
        xavier_uniform_(self.layer.weight)
        self.activation = Sigmoid()

    def forward(self, A):
        A = self.layer(A)
        A = self.activation(A)
        return A

model = Machine(...)
criterion = MSELoss()
optimizer = SGD(model.parameters(), lr=0.02, momentum=0.8)

# one training step
optimizer.zero_grad()
bhat = model(inputs)
loss = criterion(bhat, targets)
loss.backward()
optimizer.step()

# evaluate on the test set
for k, (inputs, targets) in enumerate(testing_dl):
    bhat = model(inputs)

# predict for a single row of data
row = Tensor([row])
bhat = model(row)
bhat = bhat.detach().numpy()

from numpy import vstack
from pandas import read_csv
from sklearn.metrics import accuracy_score
from sklearn.preprocessing import LabelEncoder
from torch.utils.data import Dataset
from torch.utils.data import DataLoader
from torch.utils.data import random_split
from torch import Tensor
from torch.nn import Linear
from torch.nn import ReLU
from torch.nn import Sigmoid
from torch.nn import Module
from torch.optim import SGD
from torch.nn import BCELoss
from torch.nn.init import xavier_uniform_

class CSVNewDataset(Dataset):
    def __init__(self, path):
        df = read_csv(path, header=None)
        self.A = df.values[:, :-2]
        self.b = df.values[:, -2]
        self.A = self.A.astype('float32')
        self.b = LabelEncoder().fit_transform(self.b)
        self.b = self.b.astype('float32')
        self.b = self.b.reshape((len(self.b), 1))

    def __len__(self):
        return len(self.A)

    def __getitem__(self, index):
        return [self.A[index], self.b[index]]

    def get_splits(self, num_test=0.37):
        testing_size = round(num_test * len(self.A))
        training_size = len(self.A) - testing_size
        return random_split(self, [training_size, testing_size])
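
The pieces of this section can be combined into one small end-to-end run. The sketch below is illustrative only: it uses synthetic data in place of the CSV file, BCELoss as the criterion, and five epochs, all of which are assumptions rather than settings taken from the article.

```python
import torch
from torch import nn
from torch.utils.data import DataLoader, TensorDataset, random_split

torch.manual_seed(0)

# Synthetic binary-classification data (assumption; the article reads a CSV).
features = torch.randn(100, 3)
labels = (features.sum(dim=1, keepdim=True) > 0).float()
dataset = TensorDataset(features, labels)

# Same split arithmetic as get_splits with num_test=0.37.
testing_size = round(0.37 * len(dataset))
training_size = len(dataset) - testing_size
training, testing = random_split(dataset, [training_size, testing_size])
training_dl = DataLoader(training, batch_size=16, shuffle=True)
testing_dl = DataLoader(testing, batch_size=256, shuffle=False)

class Machine(nn.Module):
    def __init__(self, num_inputs):
        super().__init__()
        self.layer = nn.Linear(num_inputs, 1)
        nn.init.xavier_uniform_(self.layer.weight)
        self.activation = nn.Sigmoid()

    def forward(self, x):
        return self.activation(self.layer(x))

model = Machine(3)
criterion = nn.BCELoss()
optimizer = torch.optim.SGD(model.parameters(), lr=0.02, momentum=0.8)

# Training: one optimizer step per mini-batch.
model.train()
for epoch in range(5):
    for inputs, targets in training_dl:
        optimizer.zero_grad()
        loss = criterion(model(inputs), targets)
        loss.backward()
        optimizer.step()

# Evaluation: threshold the sigmoid output at 0.5 and count matches.
model.eval()
correct = 0
with torch.no_grad():
    for inputs, targets in testing_dl:
        predictions = (model(inputs) > 0.5).float()
        correct += (predictions == targets).sum().item()
accuracy = correct / len(testing)
print(len(training), len(testing), accuracy)
```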

Conclusion

DataParallel is a model wrapper that enables parallel GPU utilization. To save a DataParallel model generically, save model.module.state_dict(). This gives the flexibility to load the model later in whatever way is needed, on any device.
