Updated April 6, 2023
Introduction to PyTorch Dropout
PyTorch Dropout is a machine learning technique in which randomly selected units are removed, or dropped out, during training, so that the effect of training a large number of different networks is simulated without the model overfitting or underfitting. Because units are dropped at random, result accuracy during training can fluctuate, but the model is far less likely to overfit. PyTorch provides a dropout layer that sets input units to 0 with a given frequency (the dropout rate), and this is how overfitting is prevented.
What is PyTorch Dropout?
PyTorch Dropout is a regularization method in machine learning in which randomly selected neurons are dropped from the neural network to avoid overfitting. This is done with a dropout layer, which decides which neurons to drop by zeroing them at a chosen frequency (the dropout probability). Once the model is switched into evaluation mode, the dropout layer is shut down, all neurons are used, and evaluation of the dataset can begin; dropout therefore only affects the training phase.
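As a quick illustration of this behavior (a minimal sketch, not taken from the article), nn.Dropout zeroes elements with probability p while the module is in training mode and becomes a pass-through once eval() is called:

import torch
from torch import nn

drop = nn.Dropout(p=0.5)   # each element is zeroed with probability 0.5
x = torch.ones(1, 8)

drop.train()               # training mode: dropout is active
print(drop(x))             # roughly half the values become 0, the rest are scaled by 1/(1-p) = 2

drop.eval()                # evaluation mode: dropout is disabled
print(drop(x))             # the input passes through unchanged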
Using PyTorch Dropout
We should first import the required dependencies: the os module for system interfaces, the neural network library torch.nn, a dataset, the DataLoader, and the transforms module so that images can be converted to tensors; an MLP class should then be defined in Python. The PyTorch model definition subclasses nn.Module, and the layers through which the input data is passed are declared in the constructor.
The MLP, the loss function, and the optimizer should be initialized while the dataset is being loaded, and a random seed should be fixed at this point for reproducibility. Now we can start training the model. In each step a forward pass is performed, the resulting loss is propagated back through the network, and an optimization step is carried out.
The input is initially three-dimensional, consisting of channels, height, and width, and must be flattened to a one-dimensional vector before a linear layer and then a dropout layer are applied. If needed, this pattern can be repeated, and a final layer produces the prediction, as the sketch below shows.
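A minimal sketch of such a layer stack (the layer sizes here are illustrative, not taken from the article):

import torch
from torch import nn

# Illustrative sizes: a 1 x 28 x 28 image flattened to 784 features.
layers = nn.Sequential(
    nn.Flatten(),              # channels x height x width -> one-dimensional vector
    nn.Linear(28 * 28, 64),    # linear layer
    nn.Dropout(p=0.25),        # dropout applied after the linear layer
    nn.ReLU(),
    nn.Linear(64, 10)          # final prediction layer
)

out = layers(torch.randn(4, 1, 28, 28))   # batch of 4 images -> output of shape (4, 10)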
A forward function is mandatory: it is part of every PyTorch model definition, since the model subclasses nn.Module.
PyTorch offers both nn.Dropout and nn.functional.dropout, and nn.Dropout is generally preferred because it is turned off automatically while the model is in evaluation mode, whereas the functional form does not know about the evaluation stage unless it is told. In addition, the dropout rate is stored inside the module itself, so with nn.Dropout the user does not need to save it in a separate variable. When eval() is called on a model that uses nn.Dropout, the dropout process is stopped; no such distinction exists for nn.functional.dropout unless the training flag is passed explicitly.
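The difference can be illustrated with a small sketch (illustrative code, not part of any particular library): the module form reacts to eval(), while the functional form has to be told explicitly whether the model is training.

import torch
from torch import nn
import torch.nn.functional as F

class ModuleDrop(nn.Module):
    def __init__(self):
        super().__init__()
        self.drop = nn.Dropout(p=0.5)   # the rate is stored inside the module
    def forward(self, x):
        return self.drop(x)             # automatically disabled after model.eval()

class FunctionalDrop(nn.Module):
    def __init__(self):
        super().__init__()
        self.p = 0.5                    # the rate must be kept in a separate variable
    def forward(self, x):
        # training=self.training has to be passed by hand; otherwise
        # F.dropout stays active even in evaluation mode
        return F.dropout(x, p=self.p, training=self.training)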
How does PyTorch Dropout work?
Let us look at an example of a network that uses dropout (here T is an alias for the torch package, i.e. import torch as T).
class Net1(T.nn.Module):
    def __init__(self):
        super(Net1, self).__init__()
        self.hid1 = T.nn.Linear(4, 8)
        self.drop1 = T.nn.Dropout(0.50)
        self.hid2 = T.nn.Linear(8, 8)
        self.drop2 = T.nn.Dropout(0.25)
        self.oupt = T.nn.Linear(8, 1)
        T.nn.init.xavier_uniform_(self.hid1.weight)
        T.nn.init.zeros_(self.hid1.bias)
        T.nn.init.xavier_uniform_(self.hid2.weight)
        T.nn.init.zeros_(self.hid2.bias)
        T.nn.init.xavier_uniform_(self.oupt.weight)
        T.nn.init.zeros_(self.oupt.bias)

    def forward(self, x):
        z = T.tanh(self.hid1(x))
        z = self.drop1(z)
        z = T.tanh(self.hid2(z))
        z = self.drop2(z)
        z = T.sigmoid(self.oupt(z))
        return z
Now, the same network without dropout looks like this.
class Net2(T.nn.Module):
    def __init__(self):
        super(Net2, self).__init__()
        self.hid1 = T.nn.Linear(4, 8)   # 4-(8-8)-1 architecture
        self.hid2 = T.nn.Linear(8, 8)
        self.oupt = T.nn.Linear(8, 1)

    def forward(self, x):
        z = T.tanh(self.hid1(x))
        z = T.tanh(self.hid2(z))
        z = T.sigmoid(self.oupt(z))
        return z
We can copy the weights and biases from net1 to net2 as given below.
print("Creating weights and biases ")
net2 = Net2().to(device)
net2.hid1.weight = net1.hid1.weight
net2.hid1.bias = net1.hid1.bias
net2.hid2.weight = net1.hid2.weight
net2.hid2.bias = net1.hid2.bias
net2.oupt.weight = net1.oupt.weight
net2.oupt.bias = net1.oupt.bias
With the weights and biases shared between network 1 and network 2, the two models can now be compared directly. This also shows where eval() comes in: calling eval() stops dropout during evaluation, so a network trained with dropout behaves like its dropout-free counterpart at inference time. This is a good starting point for working with Dropout in PyTorch, whether nn.Dropout or nn.functional.dropout is used.
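A small sketch of how this can be verified, assuming net1 has been created and moved to the same device as net2 in the listings above: once net1.eval() is called, its dropout layers are disabled, so it produces exactly the same output as net2, which has no dropout layers but shares the same weights.

x = T.randn(1, 4).to(device)   # one input with 4 features, matching the 4-(8-8)-1 network

net1.eval()                    # dropout layers are turned off
net2.eval()
with T.no_grad():
    print(net1(x))             # identical outputs, since the weights are shared
    print(net2(x))

net1.train()                   # back in training mode, dropout is active again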
PyTorch Dropout Examples
import os
import torch
from torch import nn
from torchvision.datasets import MNIST
from torch.utils.data import DataLoader
from torchvision import transforms

class Neural(nn.Module):
    '''
    Multilayer perceptron with dropout.
    '''
    def __init__(self):
        super().__init__()
        self.layers = nn.Sequential(
            nn.Flatten(),
            nn.Linear(28 * 28 * 1, 32),
            nn.Dropout(p=0.25),
            nn.ReLU(),
            nn.Linear(32, 16),
            nn.Dropout(p=0.25),
            nn.ReLU(),
            nn.Linear(16, 10)
        )

    def forward(self, a):
        return self.layers(a)

if __name__ == '__main__':
    torch.manual_seed(17)
    dataset = MNIST(os.getcwd(), download=True, transform=transforms.ToTensor())
    trainloader = DataLoader(dataset, batch_size=10, shuffle=True, num_workers=1)
    machine_learning = Neural()
    function = nn.CrossEntropyLoss()
    optimize = torch.optim.Adam(machine_learning.parameters(), lr=1e-4)
    for epoch in range(0, 3):
        print(f'epoch {epoch+1}')
        current_loss = 0.0
        for k, data in enumerate(trainloader, 0):
            inputs, targets = data
            optimize.zero_grad()
            outputs = machine_learning(inputs)
            loss = function(outputs, targets)
            loss.backward()
            optimize.step()
            current_loss += loss.item()
            if k % 200 == 199:
                print('Loss after mini-batch %5d: %.3f' %
                      (k + 1, current_loss / 200))
                current_loss = 0.0
    print('Training Completed')
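After training, the model should be switched to evaluation mode so that the dropout layers are disabled before measuring accuracy. A minimal sketch using the names from the listing above (ideally a separate test loader would be used instead of trainloader):

machine_learning.eval()                        # disables the nn.Dropout layers
correct, total = 0, 0
with torch.no_grad():
    for inputs, targets in trainloader:
        outputs = machine_learning(inputs)
        predicted = outputs.argmax(dim=1)
        correct += (predicted == targets).sum().item()
        total += targets.size(0)
print('Accuracy: %.2f %%' % (100.0 * correct / total))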
Now, let us look at another simple example of Dropout in PyTorch.
class MPNEncoder(nn.Module):
    def __init__(self, rnn_model, input_dime, node_fundim, shortsize, width, drpot):
        super(MPNEncoder, self).__init__()
        self.shortsize = shortsize
        self.input_dime = input_dime
        self.width = width
        self.Wo = nn.Sequential(
            nn.Linear(node_fundim + shortsize, shortsize),
            nn.ReLU(),
            nn.Dropout(drpot)
        )
        # GRU and LSTM are recurrent encoder classes assumed to be defined
        # elsewhere in the fingerprinting program
        if rnn_model == 'GRU':
            self.rnn = GRU(input_dime, shortsize, width)
        elif rnn_model == 'LSTM':
            self.rnn = LSTM(input_dime, shortsize, width)
        else:
            raise ValueError('rnn value is not supported: ' + rnn_model)
This example shows an encoder taken from a neural fingerprinting program.
class Wid_Net(nn.Module):
    def __init__(self):
        super(Wid_Net, self).__init__()
        self.conv1 = nn.Conv2d(2, 12, 2)
        self.bnm1 = nn.BatchNorm2d(12, momentum=0.2)
        self.conv2 = nn.Conv2d(12, 24, 2)
        self.bnm2 = nn.BatchNorm2d(24, momentum=0.2)
        self.conv3 = nn.Conv2d(24, 48, 2)
        self.bnm3 = nn.BatchNorm2d(48, momentum=0.2)
        self.conv4 = nn.Conv2d(48, 48, 2)
        self.bnm4 = nn.BatchNorm2d(48, momentum=0.2)
        self.fc1 = nn.Linear(1256, 500)
        self.bnm5 = nn.BatchNorm1d(500, momentum=0.2)
        self.fc2 = nn.Linear(500, 500)
        self.bnm6 = nn.BatchNorm1d(500, momentum=0.2)
        self.fc3 = nn.Linear(500, 5)
This example shows the model architecture of the neural network.
def sequentialmodel(weights=True):
    # Lambda, ReCodeAlphabet, ConcatenateRC, and AverageRC are helper modules
    # assumed to be defined elsewhere in the original program
    loader_cpu = nn.Sequential(
        nn.Conv2d(3, 180, (1, 6), (1, 1)),
        nn.Threshold(0, 1e-04),
        nn.MaxPool2d((1, 3), (1, 3)),
        nn.Dropout(0.15),
        nn.Conv2d(180, 360, (1, 6), (1, 1)),
        nn.Threshold(0, 1e-04),
        nn.MaxPool2d((1, 3), (1, 3)),
        nn.Dropout(0.15),
        nn.Conv2d(360, 720, (1, 6), (1, 1)),
        nn.Threshold(0, 1e-04),
        nn.Dropout(0.37),
        Lambda(lambda x: x.view(x.size(0), -1)),
        nn.Sequential(Lambda(lambda x: x.view(1, -1) if 1 == len(x.size()) else x), nn.Linear(65423, 730)),
        nn.Threshold(0, 1e-04),
        nn.Sequential(Lambda(lambda x: x.view(1, -1) if 1 == len(x.size()) else x), nn.Linear(730, 574)),
        nn.Sigmoid(),
    )
    if weights:
        loader_cpu.load_state_dict(torch.load('neural_files/loader_cpu.pth'))
    return nn.Sequential(ReCodeAlphabet(), ConcatenateRC(), loader_cpu, AverageRC())
Conclusion
Dropout makes it easy to improve the generalization of a model: with a dropout probability of, say, 25%, the chance of overfitting is greatly reduced. It acts as a regularizer for an otherwise unregularized network and can be applied to almost any kind of network. Because the model cannot simply memorize the training data, validation accuracy typically increases when Dropout is used.
Recommended Articles
We hope that this EDUCBA information on “PyTorch Dropout” was beneficial to you. You can view EDUCBA’s recommended articles for more information.