Deep Learning is the branch of Machine Learning that deals with building models based on Artificial Neural Networks (ANNs). ANNs are used for both supervised and unsupervised learning tasks, and Deep Learning is extensively used in tasks like object detection, language translation, speech recognition, and face detection and recognition. Let’s create our first neural network with PyTorch-
In this article, I am going to explain how to create a simple neural network (deep learning model) using the PyTorch framework from scratch. If you are not familiar with PyTorch, you can read my article here that throws light on the fundamental building blocks of PyTorch.
Contents-
- PyTorch: Autograd
- PyTorch: Neural Networks
- PyTorch: Loss functions
- PyTorch: Optimizers
- PyTorch: Monitoring with TensorBoard
- Conclusion
1. PyTorch: Autograd
Computing gradients manually is a very painful and time-consuming process. Even for a small neural network, you would need to calculate the derivatives of all the functions involved, apply the chain rule, and combine the results. The chances of making mistakes are huge if you try to do all this by hand.
Thankfully, PyTorch supports automatic differentiation, which is the automatic computation of the backward pass through a given neural network. This functionality is implemented in the autograd package of the PyTorch library.
Computational Graph
Deep Learning models in PyTorch form a computational graph in which the nodes are Tensors and the edges are the mathematical functions producing an output Tensor from the given input Tensor. A backward pass through such a graph allows easy computation of the gradients.
Suppose x is a tensor in a computational graph with x.requires_grad=True. PyTorch will then allocate one more tensor to store the gradient of some scalar value with respect to x, and that tensor is accessible through x.grad.
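Here is a minimal sketch of this behavior (the tensor values are arbitrary); we compute a scalar y from x and ask autograd for the gradient dy/dx-
import torch

x = torch.tensor([1.0, 2.0, 3.0], requires_grad=True)
y = (x ** 2).sum()   # a scalar computed from x
y.backward()         # backward pass through the computational graph
print(x.grad)        # tensor([2., 4., 6.]) -- the gradient dy/dx = 2x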
Custom Autograd Function
You can easily define your own custom autograd functions in PyTorch. All you need to do is define a subclass of the torch.autograd.Function class and implement the forward() and backward() static methods inside it.
Here is an example skeleton-
import torch

class MyCustomClass(torch.autograd.Function):
    @staticmethod
    def forward(ctx, input):
        # Add your forward-pass code here; use ctx.save_for_backward()
        # to stash tensors needed during the backward pass
        return input

    @staticmethod
    def backward(ctx, grad_output):
        # Add your backward-pass code here; return the gradient of the
        # loss with respect to each input of forward()
        return grad_output
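Note that a custom function is invoked through its apply() method rather than by instantiating the class. As a quick sketch (the skeleton above simply passes its input through unchanged)-
x = torch.randn(3, requires_grad=True)
y = MyCustomClass.apply(x)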
2. PyTorch: Neural Networks
While building neural networks, we usually define the layers in a row, where the first layer is called the input layer and receives the input data directly, while the last layer returns the final result after performing the required computations. As per standard neural network concepts, there are multiple kinds of layers that can be chosen for a deep learning model.
Popular deep learning frameworks (Keras, TensorFlow) already keep such layers implemented inside the package. Similarly, PyTorch gives you all these pre-implemented layers ready to be imported into your Python workbook. All such implementations reside under the torch.nn package.
The torch.nn package gives you all the pre-implemented layers such as Linear, Convolutional, and Recurrent layers, along with activation functions and regularization layers. For example-
from torch.nn import Linear
from torch.nn import Conv1d, Conv2d, Conv3d, ConvTranspose2d
from torch.nn import RNN, GRU, LSTM
from torch.nn import ReLU, ELU, Sigmoid, Softmax
from torch.nn import Dropout, BatchNorm1d, BatchNorm2d
Sequential Model
The Sequential class makes it very easy to write simple neural networks with PyTorch. All you need to do is place your layers in order inside it. Here is an example sequential model-
from torch.nn import Sequential
from torch.nn import Linear
from torch.nn import ReLU, Softmax
network = Sequential(
Linear(5, 10),
ReLU(),
Linear(10, 3),
ReLU(),
Softmax(dim=1)
)
network
Output looks like this-
Sequential(
(0): Linear(in_features=5, out_features=10, bias=True)
(1): ReLU()
(2): Linear(in_features=10, out_features=3, bias=True)
(3): ReLU()
(4): Softmax(dim=1)
)
Let’s pass a batch of two random input vectors to our network; we should get output from the softmax layer-
import torch

inputs = torch.randn(2, 5)
network(inputs)
The softmax layer gives three class probabilities for each input (your exact values will differ, since the inputs are random)-
tensor([[0.3071, 0.3071, 0.3858],
[0.3071, 0.3071, 0.3858]], grad_fn=<SoftmaxBackward>)
Custom Layers
Defining custom layers is super easy with PyTorch. We just need to create a subclass of the torch.nn.Module class. In the subclass, define the custom layer inside the constructor and also define the forward pass function. Here is an example of custom layer creation with PyTorch-
import torch
from torch.nn import Linear, ReLU, Softmax

class MyCustomLayer(torch.nn.Module):
    def __init__(self, inputs, classes):
        super(MyCustomLayer, self).__init__()
        # Build the layer from the constructor arguments
        self.layer = torch.nn.Sequential(
            Linear(inputs, inputs),
            ReLU(),
            Linear(inputs, classes),
            Softmax(dim=1)
        )

    def forward(self, x):
        return self.layer(x)

if __name__ == "__main__":
    network = MyCustomLayer(inputs=2, classes=4)
    t = torch.randn(2, 2)
    print(network)
    output = network(t)
    print(output)
Here is the output-
MyCustomLayer(
(layer): Sequential(
(0): Linear(in_features=2, out_features=2, bias=True)
(1): ReLU()
(2): Linear(in_features=2, out_features=4, bias=True)
(3): Softmax(dim=1)
)
)
tensor([[0.2591, 0.2457, 0.1406, 0.3545],
[0.1726, 0.3410, 0.2001, 0.2863]], grad_fn=<SoftmaxBackward>)
We have a model now; let’s train it.
3. PyTorch: Loss functions
Our model’s computational graph is ready; the next step is to train the model on given training data of input-output pairs. But in order to train any ML model, we need a loss function: a function that tells you how well or badly the model is doing at each step of the training process.
PyTorch has implementations of most of the common loss functions, like MSELoss, BCELoss, CrossEntropyLoss, etc. All such loss functions reside in the torch.nn package.
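Here is a minimal sketch of computing a loss value; note that CrossEntropyLoss expects raw (unnormalized) scores and integer class targets, and the tensors here are arbitrary-
import torch
from torch.nn import CrossEntropyLoss

loss_function = CrossEntropyLoss()
predictions = torch.randn(2, 3)    # raw scores for 2 samples over 3 classes
targets = torch.tensor([0, 2])     # true class index for each sample
loss_value = loss_function(predictions, targets)
print(loss_value)
Once you have chosen the appropriate loss function for your problem, the next step is to define an optimizer. Let’s learn more about optimizers-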
4. PyTorch: Optimizers
The optimizer does the most important work for us. Optimizers are responsible for examining the gradients of the model parameters and modifying the parameters in such a way that the final (overall) loss decreases. In this way, the optimizer tries to reduce the overall loss by changing the network parameters at each step of the training process.
These parameters can be changed in multiple different ways at each step, so there are a lot of different optimizers out there, each one trying to reduce the loss in its own unique way. The following common optimizers are already implemented inside the torch.optim package.
- SGD (Stochastic Gradient Descent)
- RMSprop
- Adagrad (Adaptive Gradient)
- Adam (A combination of RMSprop and Adagrad): Popular choice
Optimizers also give us the flexibility to set a few optimization hyperparameters, such as the learning rate.
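Creating an optimizer is a single call; here is a quick sketch, assuming the network defined earlier-
import torch

optimizer = torch.optim.Adam(network.parameters(), lr=0.001)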
This is how an optimizer works in the neural network training loop; the snippet below shows a single training step, which is repeated for every batch-
Example training loop:
# Forward pass on a batch of input data
output = neural_network(data_batch_x)
# Compute the loss value
loss_value = loss_function(output, data_batch_y)
# Backward pass: calculate the gradients
loss_value.backward()
# Update the parameters
optimizer.step()
# Reset the gradients to zero so they do not accumulate across steps
optimizer.zero_grad()
5. PyTorch: Monitoring with TensorBoard
By now you should be ready to write your first deep learning model (neural network) using PyTorch and put it into training. Any experienced deep learning practitioner can tell you how uncertain model training can be. Even after following the best practices, you may not get good results in the first run.
With continuous experiments, you will eventually arrive at the best hyperparameters for your model. But think about it: how hard would it be to keep track of all the experiments you are going to do? How hard would it be to compare the loss patterns of all the experiments?
Don’t worry, there is a tool called TensorBoard specifically designed to overcome these problems. TensorBoard gives you a nice interface for visualizing your model training and comparing various statistical measures.
Here is a snapshot of the TensorBoard interface-
[Snapshot of the TensorBoard interface]
TensorBoard was originally developed by Google to support TensorFlow and shipped as part of the TensorFlow package only, but it now comes as a separate package. Since it uses TensorFlow data formats, you may need to install both the tensorflow and tensorboard packages on your machine to visualize your PyTorch model stats on TensorBoard. Trust me, it’s totally worth it.
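PyTorch also ships a built-in bridge to TensorBoard in torch.utils.tensorboard. Here is a minimal sketch of logging a (placeholder) training loss; the log directory name and the tag are arbitrary choices-
from torch.utils.tensorboard import SummaryWriter

writer = SummaryWriter("runs/experiment_1")   # logs are written to this directory
for step in range(100):
    loss_value = 1.0 / (step + 1)             # placeholder loss for illustration
    writer.add_scalar("Loss/train", loss_value, step)
writer.close()
You can then launch the dashboard with tensorboard --logdir=runs and open it in your browser.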
6. Conclusion
I hope that after reading this article everyone is able to write their first neural network with PyTorch. Apart from creating the neural network, we got to know about TensorBoard and how it can help us with our research. I suppose everyone agrees on how easy it is to write deep neural networks with PyTorch, and also to define custom components with this very flexible toolkit.
Thanks for reading, I hope you enjoyed the article. Kindly share your feedback through the comments below.