Generative Learning and Its Differences from Discriminative Learning

Generative learning refers to a class of statistical models that can generate content that is very hard to distinguish from real data (in other words, fake content that looks real). The generated content could be poems, images, music, songs, videos, 3D objects or content from any other domain we can imagine. Here, a domain is simply a collection of examples that follow some common pattern. Interestingly, the generated content is sometimes not just realistic but also completely new, that is, unseen in the training examples. Most readers will have seen or heard about modern technologies that can generate very realistic-looking faces of people who do not exist. Face-aging apps, virtual try-on, photo-to-painting conversion and many similar applications are examples of generative models at work.

Now the question is: is every machine learning (ML) model generative in nature? No. ML models can be broadly classified into the following two categories:

Discriminative Models
Generative Models

Let’s understand these two categories in more detail.

1. Discriminative Models

As the name suggests, discriminative models are used for discriminative tasks, such as predicting whether an image contains a dog or a cat. Discriminative models are quite popular in ML applications and are heavily used for classification tasks such as sentiment classification, spam detection in emails and image classification. The following paragraph describes how discriminative models learn.

Discriminative models are presented with a large number of training pairs of the form (𝒙, 𝒚), where x represents the observation and y represents the corresponding outcome, also known as the label. The objective of the ML model is to learn a mapping function from x to y so that, when presented with new observations in the future, it can automatically predict the most likely outcome (or label). A sufficiently deep neural network (NN), provided with a sufficient number of labelled observations, can learn this mapping between observations and labels through backpropagation, using any stochastic gradient descent-based optimisation algorithm.
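To make the idea of learning a mapping from x to y concrete, here is a minimal sketch that trains a logistic regression (a much simpler stand-in for the deep neural network mentioned above) with stochastic gradient descent. The toy data, the function names and the hyperparameters are all hypothetical and chosen only for illustration.

```python
import math
import random

def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def train_logistic_regression(pairs, lr=0.1, epochs=500, seed=0):
    """Learn (w, b) so that sigmoid(w * x + b) approximates p(y=1 | x)."""
    rng = random.Random(seed)
    data = list(pairs)
    w, b = 0.0, 0.0
    for _ in range(epochs):
        rng.shuffle(data)              # visit examples in random order (SGD)
        for x, y in data:
            p = sigmoid(w * x + b)
            grad = p - y               # gradient of the log-loss w.r.t. (w * x + b)
            w -= lr * grad * x
            b -= lr * grad
    return w, b

def predict(w, b, x):
    # Return the most likely label for a new observation.
    return 1 if sigmoid(w * x + b) >= 0.5 else 0

# Hypothetical labelled pairs (x, y): small x has label 0, large x has label 1.
train_pairs = [(0.5, 0), (1.0, 0), (1.5, 0), (2.0, 0),
               (3.0, 1), (3.5, 1), (4.0, 1), (4.5, 1)]

w, b = train_logistic_regression(train_pairs)
predictions = [predict(w, b, x) for x in (0.8, 4.2)]
```

Note that the model never tries to describe how the x values themselves are distributed; it only learns where the labels change.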

To learn this mapping function, discriminative models rely on labelled datasets. In many real-world applications, it can be difficult to gather a sufficient amount of labelled data. Generative models, however, do not always require labelled datasets, because they optimise a completely different kind of objective function. Let’s get a quick understanding of generative models next.

2. Generative Models

As discussed earlier, generative models are a special type of ML model capable of generating realistic content. An ML model, any other technology, or even a human mind can only generate realistic content when it knows almost every important detail about the target content; this knowledge can be termed domain understanding. To achieve this goal, a generative learning approach aims to learn the distribution of the target domain (that is, the domain of the content we want to generate). Once our model knows the true distribution of the data, we can keep sampling from it to generate an unlimited amount of content that follows the same distribution.
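As a deliberately simple sketch of "learn the distribution, then sample from it", the snippet below assumes the target domain is one-dimensional and well described by a single Gaussian. That assumption, the toy observations and the helper name `sample` are hypothetical; real generative models learn far richer distributions.

```python
import random
import statistics

# Hypothetical 1-D domain: heights (in cm) observed in the training set.
observations = [158.0, 162.5, 165.0, 167.5, 170.0,
                171.5, 174.0, 178.0, 181.5, 185.0]

# "Learning the distribution" here means estimating the parameters of p(x).
mu = statistics.mean(observations)
sigma = statistics.stdev(observations)

def sample(n, seed=0):
    """Draw n new values from the learned Gaussian approximation of p(x)."""
    rng = random.Random(seed)
    return [rng.gauss(mu, sigma) for _ in range(n)]

# New values that follow the learned distribution; they are generated, not copied.
new_samples = sample(5)
```

Calling `sample` with a larger `n` yields as many new values as needed, all following the same learned distribution.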

If you are interested in learning more about generative learning and Generative Adversarial Networks, do check out my book:

https://www.amazon.com/GAN-Book-Generative-Adversarial-TensorFlow2-ebook/dp/B0CR8C725C?ref_=ast_author_dp

It may sound easy, but learning a distribution is not a trivial task. We will soon talk about the challenges of learning a data distribution, but before that it’s important to properly understand the differences between the generative and the discriminative approach, because the forthcoming content is mostly about generative models. Let’s look at both approaches to get a better understanding.

Generative vs. Discriminative Learning

The generative approach to statistical modelling (or ML) aims at learning the joint probability distribution 𝒑(𝒙,𝒚) over the given pairs of observations and corresponding labels (𝒙, 𝒚), or just 𝒑(𝒙) when labels are not present (as discussed earlier, generative models don’t always require labelled data). Because 𝒑(𝒙) represents the data distribution of the input samples x, each draw from 𝒑(𝒙) produces a new sample.

Apart from generating data, generative models can also be used to estimate the conditional probability 𝒑(𝒚|𝒙) using Bayes’ rule (with the help of the learned joint distribution 𝒑(𝒙,𝒚)), to make predictions and to choose the most likely label y for a given input observation x. Here is how the conditional probability 𝒑(𝒚|𝒙) can be estimated:

𝒑(𝒚|𝒙) = 𝒑(𝒙,𝒚) / 𝒑(𝒙)

where 𝒑(𝒙) is obtained by summing 𝒑(𝒙,𝒚) over all possible labels y.
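A small worked sketch of this use of Bayes’ rule, dividing the joint probability of an observation and a label by the marginal probability of the observation, might look as follows. It estimates the joint distribution from counts over a tiny, made-up dataset; the data and the function names are hypothetical.

```python
from collections import Counter

# Hypothetical labelled observations: x = weather, y = what people did.
pairs = [("sunny", "play"), ("sunny", "play"), ("sunny", "stay"),
         ("rainy", "stay"), ("rainy", "stay"), ("rainy", "play"),
         ("sunny", "play"), ("rainy", "stay")]

n = len(pairs)
# Estimated joint distribution p(x, y) from relative frequencies.
joint = {xy: count / n for xy, count in Counter(pairs).items()}

def marginal_x(x):
    # p(x) = sum over all labels y of p(x, y)
    return sum(p for (xi, _), p in joint.items() if xi == x)

def posterior(y, x):
    # Bayes' rule: p(y | x) = p(x, y) / p(x)
    return joint.get((x, y), 0.0) / marginal_x(x)

def most_likely_label(x):
    # Choose the label with the highest posterior probability.
    labels = {y for _, y in joint}
    return max(labels, key=lambda y: posterior(y, x))
```

For example, `posterior("play", "sunny")` divides the joint probability 3/8 by the marginal 4/8, giving 0.75.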

A discriminative approach, on the other hand, estimates the conditional probability (or posterior) 𝒑(𝒚|𝒙) directly from the observations x and the corresponding labels y, without modelling the underlying data distribution. This makes the task of a discriminative approach fairly straightforward, as the objective is just to learn a mapping function (also known as a classifier) between x and y.

In simpler words, a generative model learns the distribution first and then decides the most likely output, while a discriminative model directly learns the mapping between the inputs and the class labels (based on similarities or dissimilarities).
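The "learn the distribution first, then decide" behaviour can be sketched with a tiny generative classifier. The sketch assumes each class is described by a single Gaussian over one feature; this assumption, the toy data and all names are hypothetical.

```python
import math
import random
import statistics

# Hypothetical labelled data: one feature x (body weight in kg) per class label y.
data = {"cat": [3.5, 4.0, 4.2, 4.8, 5.1],
        "dog": [18.0, 22.5, 25.0, 27.5, 31.0]}

total = sum(len(xs) for xs in data.values())

# Generative step: estimate p(y) and p(x | y), which together define p(x, y).
model = {y: {"prior": len(xs) / total,
             "mu": statistics.mean(xs),
             "sigma": statistics.stdev(xs)}
         for y, xs in data.items()}

def gaussian_pdf(x, mu, sigma):
    return math.exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * math.sqrt(2 * math.pi))

def joint(x, y):
    # p(x, y) = p(y) * p(x | y)
    m = model[y]
    return m["prior"] * gaussian_pdf(x, m["mu"], m["sigma"])

def classify(x):
    # argmax over y of p(x, y) equals argmax over y of p(y | x),
    # because p(x) does not depend on y.
    return max(model, key=lambda y: joint(x, y))

def sample(y, rng):
    # Since p(x | y) was learned, the same model can also generate new observations.
    m = model[y]
    return rng.gauss(m["mu"], m["sigma"])

generated_dog_weight = sample("dog", random.Random(0))
```

The same fitted model serves both purposes here: `classify` picks the most likely label, and `sample` generates a new observation for a given class, which a purely discriminative classifier cannot do.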

The discriminative approach is usually preferred when the task is classification, which is the comparatively easier problem. A generative model, on the other hand, takes on the harder problem of learning a data distribution. For many prediction tasks, learning the full data distribution is not necessary, so a discriminative approach makes sense because it keeps things simpler. Let’s look at the following example.

Example-1

In the case of binary classification, all we need to do is learn a decision boundary that separates the two classes with minimum error. With this boundary, the model can decide whether a new data point belongs to class A or class B without modelling the data distributions (see the following figure).
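As a minimal sketch of learning such a boundary, the snippet below searches for the single threshold on a one-dimensional feature that misclassifies the fewest training points. The toy points and function names are hypothetical, and the brute-force search stands in for whatever learning algorithm is actually used.

```python
# Hypothetical 1-D training set: (feature value, class label).
points = [(1.0, "A"), (1.8, "A"), (2.4, "A"), (3.1, "A"),
          (4.2, "B"), (4.9, "B"), (5.5, "B"), (6.3, "B")]

def errors(threshold):
    # Rule under test: predict "A" below the threshold and "B" at or above it.
    return sum(("A" if x < threshold else "B") != y for x, y in points)

def fit_boundary():
    xs = sorted(x for x, _ in points)
    # Candidate boundaries: midpoints between neighbouring feature values.
    candidates = [(a + b) / 2 for a, b in zip(xs, xs[1:])]
    return min(candidates, key=errors)

boundary = fit_boundary()

def predict(x):
    # Classify a new point using only the learned boundary.
    return "A" if x < boundary else "B"
```

Once `boundary` is known, the training points are no longer needed for prediction, and nothing about how the points are distributed within each class has been learned.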

Example-2

Both approaches have their own ways of solving problems. Let’s look at one more example to understand the difference between the generative and discriminative learning approaches. In this example, we start by defining a task and then see how each of the two approaches goes about solving it.

Task: Identify the animal in a given photograph.

Generative Approach

Study all the animals (and their characteristics) in the world and then determine which animal is present in the given picture. This approach looks at the low-level attributes such as eyes, face, legs, tail, color, height and so on, to decide the final outcome.

Check out my articles: How does a Generative Learning Model Work?
and Building Blocks of Deep Generative Models

Discriminative Approach

There is no need to learn about any of the animals in detail. Simply look at the structural (or shape) differences and similarities and decide which animal it is. This approach usually looks at high-level features such as structure and shape to draw a decision boundary between different animals.

Note: Based on the above definitions, one might think that ML models are always probabilistic in nature (as we discussed estimating the joint and posterior distributions in terms of probability), but a generative or discriminative model does not always need to output probabilities to be considered a valid model. For example, a decision tree-based classifier can directly output a class without estimating any probability value and is still a valid discriminative approach. This is because the predicted labels follow the distribution of the real labels provided as training data.

Now that we have a good background on the generative approach to solving ML problems, let’s look at some common generative approaches that have been frequently used.

Check out the following list of Generative Approaches (source: Wikipedia).

Gaussian mixture model
Hidden Markov model
Probabilistic context-free grammar
Bayesian network (e.g. Naive Bayes, Autoregressive model)
Averaged one-dependence estimators
Latent Dirichlet allocation
Boltzmann machine
Flow-based generative model
Energy-based model
Variational autoencoder
Generative adversarial network

Discriminative approaches, on the other hand, are very frequently used for solving real-world business problems because of their simplicity.

The following is a list of discriminative approaches commonly applied over the past few decades (source: Wikipedia).

k-nearest neighbours algorithm
Logistic regression
Support Vector Machines
Decision Trees
Random Forest
Maximum-entropy Markov models
Conditional random fields
Neural networks

Conclusion

In this article, we built an intuition for the generative learning approach to developing ML models. We discussed the generative approach side by side with the discriminative approach and looked at their differences. The discriminative approach is widely used for comparatively simpler problems such as classification and sentiment analysis. The generative learning approach takes on the harder task of learning a data distribution and is used to solve complex real-world problems.

Thanks for reading! I hope this article was helpful and cleared up some of your doubts. If you found it useful, kindly share it. If you find any mistakes, please let me know by commenting below. Until then, see you in the next article!
