Dear Ransingh
A very good question and the link shared by VK will help you visualize how CLT works.
I want to highlight a common misconception about the Central Limit Theorem (CLT). It is probably one of the most misunderstood concepts in Lean Six Sigma.
Most people assume that if they have a large sample size (read: greater than 30), then the data set follows a normal distribution. This is far from the truth. Irrespective of the sample size, a random sample follows the distribution of the original data set, and a larger sample only mirrors that distribution more closely. So if the original data set is Not Normal, then the sample (be it size 1 or 2 or 10 or 30 or 100 or however big) will also be Not Normal.
Then where does CLT apply?
CLT applies to the distribution of the sample means or sample sums, i.e. if I draw many samples of the same size from the Not Normal data set, calculate either the sum or the mean of each sample, and plot those values on a histogram, the histogram will be approximately Normal. The approximation improves as the size of each sample grows.
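If you want to check this for yourself, here is a rough Python sketch. It is only an illustration under my own assumptions: I use an exponential distribution as the "Not Normal" data set, a sample size of 30, and skewness as a crude measure of how lopsided a histogram is (a normal distribution has skewness 0). The function names `skewness` and `sample_means` are just names I made up for this example.

```python
import random
import statistics

random.seed(0)  # arbitrary fixed seed so the run is repeatable

def skewness(values):
    """Population skewness: 0 for a symmetric shape, about 2 for an exponential."""
    mean = statistics.fmean(values)
    sd = statistics.pstdev(values)
    return sum(((v - mean) / sd) ** 3 for v in values) / len(values)

def sample_means(n_samples, sample_size):
    """Draw n_samples samples from the skewed population; return the mean of each."""
    means = []
    for _ in range(n_samples):
        sample = [random.expovariate(1.0) for _ in range(sample_size)]
        means.append(statistics.fmean(sample))
    return means

# Individual data points: still as skewed as the original population.
raw = [random.expovariate(1.0) for _ in range(6000)]

# Means of 2000 samples of size 30: much closer to symmetric.
means = sample_means(2000, 30)

print("skewness of individual values:", round(skewness(raw), 2))
print("skewness of sample means:     ", round(skewness(means), 2))
```

The individual values stay heavily skewed no matter how many you collect, while the sample means come out far more symmetric. Increasing `sample_size` should push the second number closer to 0.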
For example, consider the roll of a single die. Possible values are 1, 2, 3, 4, 5, 6, each having the same probability (1/6). A common misconception is that if I roll the die many times (say 6000 times), I will get a normal distribution. This is not true. The roll of a die follows a Uniform distribution, so if you roll it 6000 times, each of 1 through 6 is expected to occur roughly 1000 times.
However, what happens if 2 dice are rolled and the sum of each roll is noted? The possible values are 2, 3, 4, 5, 6, 7, 8, 9, 10, 11 and 12. Here, however, the probabilities are not the same.
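A quick simulation makes the single-die case concrete. This is a minimal sketch using Python's standard `random` module; the helper name `roll_die_counts` and the seed value are my own choices, not anything standard.

```python
import random
from collections import Counter

random.seed(42)  # arbitrary fixed seed so the run is repeatable

def roll_die_counts(n_rolls):
    """Roll one fair six-sided die n_rolls times and count how often each face shows."""
    return Counter(random.randint(1, 6) for _ in range(n_rolls))

counts = roll_die_counts(6000)

# Crude text histogram: one '#' per 50 occurrences.
for face in range(1, 7):
    print(face, counts[face], "#" * (counts[face] // 50))
```

The six bars come out at roughly the same height (about 1000 each, give or take random variation), which is a flat Uniform shape and not a bell curve, however many times you roll.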
Prob. of getting 2 = 1/36 (only 1 combination will give 2)
Prob. of getting 3 = 2/36 (2 combinations will give us 3)
Prob. of getting 4 = 3/36 (3 combinations will give us 4) and so on......
7 has the maximum probability (6/36) of occurrence while 2 and 12 have the least (1/36).
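The probabilities listed above can be verified by simply enumerating all 36 combinations of two dice. A small sketch (the variable name `ways` is my own):

```python
from collections import Counter
from itertools import product

# Count how many of the 36 equally likely (die1, die2) pairs give each sum.
ways = Counter(a + b for a, b in product(range(1, 7), repeat=2))

for total in sorted(ways):
    print(f"sum {total:2d}: {ways[total]}/36 = {ways[total] / 36:.4f}")
```

The counts rise from 1 combination for a sum of 2 up to 6 combinations for a sum of 7, then fall back to 1 combination for a sum of 12.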
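Now, if I roll the 2 dice 6000 times and plot the sum of each roll on a histogram, the plot will peak at 7 and taper off symmetrically towards 2 and 12, because the sums do not all have the same probability. With only 2 dice the shape is triangular rather than a true bell curve, but it already starts to resemble a normal distribution, and the resemblance improves as more dice are summed per roll.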
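Here is a rough sketch of that experiment in Python, for anyone who wants to try it. The helper `simulate_sums` and its parameters are hypothetical names of my own; the number of dice is a parameter so you can try larger values as well.

```python
import random
from collections import Counter

random.seed(7)  # arbitrary fixed seed so the run is repeatable

def simulate_sums(n_dice, n_rolls):
    """Roll n_dice fair dice n_rolls times; count how often each sum occurs."""
    return Counter(
        sum(random.randint(1, 6) for _ in range(n_dice))
        for _ in range(n_rolls)
    )

sum_counts = simulate_sums(n_dice=2, n_rolls=6000)

# Crude text histogram: one '#' per 25 occurrences.
for total in sorted(sum_counts):
    print(f"{total:2d} {sum_counts[total]:5d} " + "#" * (sum_counts[total] // 25))
```

The bars are short at 2 and 12 and tallest around 7. Calling `simulate_sums(n_dice=10, n_rolls=6000)` should give a visibly smoother bell shape.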
Here if you notice closely,
1. The original distribution is Not Normal
2. Taking 2 data points from the original data set gives me a sample of size 2 (equivalent to rolling 2 dice). Then, for each sample, the sum is calculated and plotted
3. CLT is being applied to the sums and not to the individual data points
The same is evident in the animation link shared by VK.
So let's be aware of the misuse of this theorem and apply it correctly.
P.S. There are multiple online sources where you can also find the mathematical proof of the Central Limit Theorem.