Understanding Maximum Likelihood Estimation (MLE)

Introduction

Maximum Likelihood Estimation (MLE) is a statistical method for parameter estimation: given observed data and an assumed family of distributions, it finds the parameter values under which the observed data are most probable. For example, if the observed data closely follow the shape of a Gaussian bell curve, one might ask: what is the best-fitting Gaussian distribution for this data? MLE lets us find the mean and variance of the Gaussian distribution that best represents the data. In other words, it finds the center (mean) and spread (variance) of this distribution.

In this article, we will explore this concept through the following visual steps:

An Observed Gaussian Distributed Histogram

Here's the histogram of the original data, generated from a Gaussian distribution with a mean (μ) of 5 and a variance (σ^2) of 4.
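As a rough sketch of how such data could be produced, the snippet below draws samples from a Gaussian with mean 5 and variance 4 and bins them into a text histogram. The sample size, random seed, bin count, and the `histogram` helper are assumptions made for illustration; the article does not state how its data were generated, so the numbers will not match its figures exactly.

```python
import math
import random

# Assumed for illustration: sample size and seed are not given in the article.
random.seed(0)
TRUE_MEAN = 5.0
TRUE_VARIANCE = 4.0
N_SAMPLES = 1000

# random.gauss expects the standard deviation, so pass sqrt(variance) = 2.
data = [random.gauss(TRUE_MEAN, math.sqrt(TRUE_VARIANCE)) for _ in range(N_SAMPLES)]


def histogram(values, n_bins=20):
    """Return (bin_edges, counts) for equal-width bins spanning the data."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / n_bins
    counts = [0] * n_bins
    for v in values:
        # Clamp the index so the maximum value lands in the last bin.
        idx = min(int((v - lo) / width), n_bins - 1)
        counts[idx] += 1
    edges = [lo + i * width for i in range(n_bins + 1)]
    return edges, counts


edges, counts = histogram(data)

# Text rendering of the histogram: one '#' per 5 observations in the bin.
for left_edge, count in zip(edges, counts):
    print(f"{left_edge:6.2f} | {'#' * (count // 5)}")
```

The printed bars should show the familiar bell shape centered near 5.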

Find the Likelihood Function for the Mean of a Gaussian Distribution:

We'll plot this function for a range of μ values, keeping σ^2 fixed (using some initial guess). The choice of σ^2 changes the height of the curve but not the μ at which it peaks.
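For reference, assuming the observations are independent draws from the same Gaussian, the likelihood and its logarithm can be written as follows. The symbols n (sample size) and x_i (the i-th observation) are notation introduced here, not taken from the article.

```latex
L(\mu, \sigma^2) = \prod_{i=1}^{n} \frac{1}{\sqrt{2\pi\sigma^2}}
    \exp\!\left(-\frac{(x_i - \mu)^2}{2\sigma^2}\right)

\log L(\mu, \sigma^2) = -\frac{n}{2}\log\!\left(2\pi\sigma^2\right)
    - \frac{1}{2\sigma^2}\sum_{i=1}^{n}(x_i - \mu)^2
```

Because the logarithm is strictly increasing, both expressions are maximized by the same parameter values.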

The vertical red dashed line represents the Maximum Likelihood Estimate (MLE) for the mean, which is approximately μ≈4.91.
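A minimal sketch of this step is a grid search over candidate μ values. It works with the log-likelihood rather than the raw likelihood, a deliberate substitution: the product of many small densities underflows in floating point, while the logarithm peaks at the same μ. The data, seed, grid range, and variance guess below are assumptions, so the result will differ slightly from the 4.91 quoted above.

```python
import math
import random

random.seed(0)  # assumed seed, for reproducibility only
# Assumed sample: 1000 draws with mean 5 and standard deviation 2 (variance 4).
data = [random.gauss(5.0, 2.0) for _ in range(1000)]


def log_likelihood(values, mu, sigma2):
    """Gaussian log-likelihood of `values` for mean `mu` and variance `sigma2`."""
    n = len(values)
    sq_err = sum((x - mu) ** 2 for x in values)
    return -0.5 * n * math.log(2 * math.pi * sigma2) - sq_err / (2 * sigma2)


SIGMA2_GUESS = 4.0  # fixed initial guess for the variance (assumption)
STEP = 0.01
mu_grid = [3.0 + i * STEP for i in range(401)]  # candidates from 3.00 to 7.00

# The grid point with the highest log-likelihood approximates the MLE.
mu_mle = max(mu_grid, key=lambda mu: log_likelihood(data, mu, SIGMA2_GUESS))
sample_mean = sum(data) / len(data)

print(f"grid MLE of the mean: {mu_mle:.2f}")
print(f"sample mean:          {sample_mean:.2f}")
```

The two printed values should agree to within the grid spacing, since for a Gaussian the MLE of the mean is the sample mean.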

Next, Plot the Likelihood Function for the Variance:

We'll plot this function for a range of σ^2 values, keeping μ fixed at its MLE value. 

The vertical red dashed line represents the Maximum Likelihood Estimate (MLE) for the variance, which is approximately σ^2≈3.90.
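The variance step can be sketched the same way: fix μ at its MLE (the sample mean) and search a grid of σ^2 values for the one with the highest log-likelihood. As before, the data, seed, and grid range are assumptions, and the log-likelihood stands in for the raw likelihood to avoid numerical underflow.

```python
import math
import random

random.seed(0)  # assumed seed, for reproducibility only
# Assumed sample: 1000 draws with mean 5 and standard deviation 2 (variance 4).
data = [random.gauss(5.0, 2.0) for _ in range(1000)]

n = len(data)
mu_hat = sum(data) / n  # MLE of the mean, held fixed in this step
# The squared-error sum does not depend on sigma2, so compute it once.
sq_err = sum((x - mu_hat) ** 2 for x in data)


def log_likelihood_var(sigma2):
    """Gaussian log-likelihood as a function of the variance, with mu = mu_hat."""
    return -0.5 * n * math.log(2 * math.pi * sigma2) - sq_err / (2 * sigma2)


STEP = 0.01
sigma2_grid = [1.0 + i * STEP for i in range(801)]  # candidates from 1.00 to 9.00

sigma2_mle = max(sigma2_grid, key=log_likelihood_var)
closed_form = sq_err / n  # mean squared deviation (divides by n, not n - 1)

print(f"grid MLE of the variance: {sigma2_mle:.2f}")
print(f"mean squared deviation:   {closed_form:.2f}")
```

The grid result should match the mean squared deviation to within the grid spacing. Note that this estimator divides by n, so it is slightly smaller than the usual unbiased sample variance, which divides by n - 1.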

Finally, Plot the Gaussian Distribution with the Estimated Mean and Variance:

The formula for the Gaussian distribution is given below (note that x is the function argument, while mu_hat and sigma_hat are the parameters):
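In standard notation, writing the estimated mean as \hat{\mu} and the estimated variance as \hat{\sigma}^2, the Gaussian probability density function is:

```latex
f(x \mid \hat{\mu}, \hat{\sigma}^2) = \frac{1}{\sqrt{2\pi\hat{\sigma}^2}}
    \exp\!\left(-\frac{(x - \hat{\mu})^2}{2\hat{\sigma}^2}\right)
```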

Here's the plot showing the Maximum Likelihood Estimates (MLE) for the Gaussian distribution overlaying the original data. The red line represents the Gaussian distribution using the MLE for the mean (μ≈4.91) and the variance (σ^2≈3.90). As you can see, the estimated Gaussian distribution fits quite well with the histogram of the original data.
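To make the final step concrete, the sketch below computes both estimates in closed form and evaluates the fitted density at a few points. This is what the red curve would trace if plotted. The data and seed are assumptions, and `gaussian_pdf` is a helper name introduced here for illustration.

```python
import math
import random
import statistics

random.seed(0)  # assumed seed, for reproducibility only
# Assumed sample: 1000 draws with mean 5 and standard deviation 2 (variance 4).
data = [random.gauss(5.0, 2.0) for _ in range(1000)]

# Closed-form Gaussian MLEs: the sample mean and the population variance.
mu_hat = statistics.fmean(data)
sigma2_hat = statistics.pvariance(data, mu=mu_hat)  # divides by n, as the MLE does


def gaussian_pdf(x, mu, sigma2):
    """Gaussian density at x for mean `mu` and variance `sigma2`."""
    return math.exp(-((x - mu) ** 2) / (2 * sigma2)) / math.sqrt(2 * math.pi * sigma2)


print(f"mu_hat = {mu_hat:.2f}, sigma2_hat = {sigma2_hat:.2f}")

# Evaluate the fitted curve on a coarse grid of x values.
for x in range(0, 11, 2):
    print(f"f({x:2d}) = {gaussian_pdf(x, mu_hat, sigma2_hat):.4f}")
```

Multiplying each density value by the sample size and the histogram bin width gives the expected count per bin, which is one way to overlay the curve on a count-based histogram.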

Other Important Distributions and Their Parameters:

The same process can be used to estimate the parameters of other distributions, such as: