Solved by mohanpb0

Normal Distribution

 

Normal Distribution: also known as the Gaussian distribution, this is a continuous probability distribution that is symmetric about the mean, so that values near the mean occur more frequently than values far from it. It is represented graphically by a bell curve. It is also sometimes called the 'Natural Distribution' because it is found to occur naturally in many situations, e.g. the heights of people in a city or the scores in a test.

 

 

An application-oriented question on the topic, along with the responses, can be seen below. The best answer was provided by Mohan PB on 15th January 2018.

 

 

Question

Q. 65. In which situations would you consider non-normal data as abnormal? (Abnormal here means unusual or unacceptable)

 

Note for website visitors - Two questions are asked every week on this platform, one on Tuesday and the other on Friday.

 


5 answers to this question

  • Solution

The “Normal” in “Normal Distribution” or “Normal Data” means “Natural” rather than other dictionary meanings such as “Ordinary”, “Typical”, “Regular”, “Usual” or “Standard”. This becomes clearer when the origin of the so-called “Normal” distribution is traced.

 

Sometime in the 18th century C.E., a group of mathematicians and scientists in France had been trying for a long time to make sense of a peculiar data distribution they had come across. They realized that one value occurred most often, and that values lower or higher than this most frequent value occurred progressively less often. In other words, the frequency fell away on both sides as values moved further from the most frequently occurring value. After a few days of research and discussion, they had still not reached a conclusion and decided to take a walk in the fresh air to clear their heads.

 

They came to an orange orchard full of trees bearing ripe oranges. Unable to resist the temptation, they plucked a few oranges and began to enjoy nature’s bounty. One member of the group, who was still thinking about the data, started to keep a tally of the number of seeds in each orange he ate, using a twig for a pencil and the mud as a notebook. To his surprise, he found that the seeds in the oranges followed a distribution similar to the one they had been racking their brains over in their lab for the last few days. He quickly brought this to the notice of the others, who soon confirmed the similarity of the two distributions. It struck them that perhaps this distribution was something that occurred naturally.

 

They tested this theory on certain other naturally occurring parameters and concluded that it was indeed correct. For reasons best known to themselves, they chose to name this distribution “Normal”, meaning that such a distribution occurred naturally. Or perhaps the original French name was translated as “Normal”. Whatever the reason, it is now accepted that the “Normal” distribution occurs naturally in many physical, social and biological processes. Therefore, if such measurements are made in a truly random manner, the data collected is expected to be naturally normal. Often, an apparently non-normal distribution, when investigated, reveals a man-made cause such as two distinct groups blended into one, or skewed, non-random sampling.
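The point about skewed, non-random sampling can be illustrated with a small simulation. This is only a rough sketch: the population (5,000 heights with mean 170 cm and standard deviation 8 cm), the seed and the cut-off rule are all made-up assumptions, and skewness is used here as a simple symmetry indicator rather than a formal normality test.

```python
import random
from statistics import fmean, pstdev


def skewness(values):
    """Population skewness: close to 0 for symmetric data such as a normal sample."""
    m = fmean(values)
    s = pstdev(values)
    return fmean([((v - m) / s) ** 3 for v in values])


rng = random.Random(65)  # fixed seed so the sketch is repeatable

# Hypothetical population: heights in cm, normal with mean 170 and sd 8.
population = [rng.gauss(170, 8) for _ in range(5000)]

# Random sampling keeps the symmetric shape of the population.
random_sample = rng.sample(population, 1000)

# Non-random sampling: only people taller than 170 cm get measured.
biased_sample = [h for h in population if h > 170]

print(f"skewness, random sample: {skewness(random_sample):+.2f}")
print(f"skewness, biased sample: {skewness(biased_sample):+.2f}")
```

The random sample should show skewness near zero, while the one-sided sample is clearly right-skewed even though the underlying population is normal.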

 

Apart from the usually quoted examples of normality, such as the heights and weights of people in a randomly constituted group, product characteristics from a machine or cell with untampered settings and no technological restrictions are also expected to be normal. This applies not just to physical characteristics such as length and diameter, but also to functional characteristics such as strength, power and torque. Additionally, medical parameters such as blood pressure and blood sugar are also expected to be normally distributed.

 

In the above situations, any distinctly non-normal distribution would need to be treated as unusual.

2 hours ago, Vishwadeep Khatri said:

Q. 65. In which situations would you consider non-normal data as abnormal? 

 

This question is a part of Excellence Ambassador initiative and is open for 3 days. There is a reward for best answer in first 24 hours and another for best in 3 days. All rewards are mentioned here - https://www.benchmarksixsigma.com/forum/excellence-ambassador-rewards/

 

All questions so far can be seen here - https://www.benchmarksixsigma.com/forum/lean-six-sigma-business-excellence-questions/

Hi,

Non-normal data is considered abnormal when the practitioner reaches a point where they want to use a statistical tool that requires normally distributed data.

Quick feedback - you may want to rename the "Quote" button as "Reply". So far I was replying to posts separately, discovered this by accident.


There are multiple legitimate “non-normal” distributions, such as Weibull and Exponential. But in a few instances, data may fail to meet normality requirements (symmetry, unimodality, p-value checks etc.) because of errors such as:

1)      Distribution-related errors: a. Two perfectly normal distributions may be viewed as one, leading to bimodality. b. The presence of multiple extreme outliers can significantly skew the symmetry.

2)      Visualization issues: If the axes are not chosen appropriately, e.g. starting from zero for distributions with no low values at all (such as weights of people), the plot can give incorrect visual cues.

3)      Inadequate data: If the collected data is not random or representative enough, the distribution can look completely different from a normal one.
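The two distribution-related errors in point 1 can be sketched with simulated data. Everything here is an illustrative assumption: the two groups (means 160 and 180, standard deviation 5), the 20 outliers at 260 and the seed are invented, and skewness and excess kurtosis are used as rough shape indicators (both are about 0 for normal data), not as a formal normality test.

```python
import random
from statistics import fmean, pstdev


def standardized_moment(values, k):
    """k-th standardized moment, using the population standard deviation."""
    m = fmean(values)
    s = pstdev(values)
    return fmean([((v - m) / s) ** k for v in values])


def skewness(values):
    return standardized_moment(values, 3)


def excess_kurtosis(values):
    return standardized_moment(values, 4) - 3.0


rng = random.Random(2018)  # fixed seed so the sketch is repeatable

# 1a. Two perfectly normal groups viewed as one (hypothetical values).
group_a = [rng.gauss(160, 5) for _ in range(2000)]
group_b = [rng.gauss(180, 5) for _ in range(2000)]
blended = group_a + group_b

# 1b. One normal group plus a handful of extreme outliers (hypothetical values).
with_outliers = group_a + [260.0] * 20

print(f"group A alone : skew {skewness(group_a):+.2f}, "
      f"excess kurtosis {excess_kurtosis(group_a):+.2f}")
print(f"blended groups: excess kurtosis {excess_kurtosis(blended):+.2f}")
print(f"with outliers : skew {skewness(with_outliers):+.2f}")
```

Group A on its own stays close to zero on both measures. The blended data shows strongly negative excess kurtosis, a typical sign of two well-separated humps, and the outliers push the skewness far above zero.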


The most common example that comes to my mind is the “Homogenization” process adopted as part of the exercise of setting the control limits for a control chart. Non-normality could be one reason for many points falling outside the limits; we then remove those points and recalculate the limits. However, if this process ends up removing more points than the allowed threshold, we have to discard the data as unacceptable and collect fresh data.
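A minimal sketch of how such a homogenization loop might look is given below. It assumes an individuals chart with limits at the mean plus or minus 3 sigma, with sigma estimated as the average moving range divided by the usual constant d2 = 1.128. The function names, the readings and the 20% removal threshold are hypothetical; the post does not state which chart or threshold was meant.

```python
from statistics import fmean

D2_FOR_N2 = 1.128  # standard control-chart constant d2 for moving ranges of 2 points


def individuals_limits(values):
    """Return (lcl, ucl) for an individuals chart from the average moving range."""
    moving_ranges = [abs(b - a) for a, b in zip(values, values[1:])]
    sigma_hat = fmean(moving_ranges) / D2_FOR_N2
    centre = fmean(values)
    return centre - 3 * sigma_hat, centre + 3 * sigma_hat


def homogenize(values, max_removed_fraction=0.2):
    """Drop out-of-limit points and recompute limits until none remain outside.

    max_removed_fraction is a hypothetical threshold, not a standard value.
    Returns (kept_points, removed_fraction, acceptable).
    """
    if len(values) < 2:
        raise ValueError("need at least two points to compute moving ranges")
    kept = list(values)
    while len(kept) >= 2:
        lcl, ucl = individuals_limits(kept)
        inside = [v for v in kept if lcl <= v <= ucl]
        if len(inside) == len(kept):
            break  # every remaining point is within the limits
        kept = inside
    removed_fraction = 1 - len(kept) / len(values)
    return kept, removed_fraction, removed_fraction <= max_removed_fraction


# Made-up readings: a stable process around 10 with one stray value of 25.
readings = [10.1, 9.9, 10.0, 10.2, 9.8, 10.1, 9.9, 10.0, 25.0, 10.1,
            9.9, 10.2, 9.8, 10.0, 10.1, 9.9, 10.0, 10.2, 9.8, 10.0]

kept, removed_fraction, acceptable = homogenize(readings)
print(f"kept {len(kept)} of {len(readings)} points, "
      f"removed {removed_fraction:.0%}, acceptable: {acceptable}")
```

With a stricter threshold, for example homogenize(readings, max_removed_fraction=0.01), the same data would be flagged as unacceptable, which corresponds to the "collect fresh data" outcome described above.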

 

As per the Central Limit Theorem (CLT), the means of samples taken even from a non-normal distribution should be approximately normal when the sample size is reasonably large, and so should pass a normality test. If such sample means are found to fail the normality test, the data is unreliable.
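The CLT effect can be seen in a short simulation. This is a sketch under assumed settings: the raw data is drawn from an exponential distribution, the subgroup size of 30 and the 2,000 subgroups are arbitrary choices, and skewness is used as a simple stand-in for a formal normality test.

```python
import random
from statistics import fmean, pstdev


def skewness(values):
    """Population skewness: close to 0 for symmetric data such as a normal sample."""
    m = fmean(values)
    s = pstdev(values)
    return fmean([((v - m) / s) ** 3 for v in values])


rng = random.Random(120)  # fixed seed so the sketch is repeatable

SUBGROUP_SIZE = 30   # assumed subgroup size
N_SUBGROUPS = 2000   # assumed number of subgroups

# Strongly right-skewed raw data (exponential with rate 1).
raw = [rng.expovariate(1.0) for _ in range(SUBGROUP_SIZE * N_SUBGROUPS)]

# Mean of each consecutive subgroup of 30 readings.
subgroup_means = [fmean(raw[i:i + SUBGROUP_SIZE])
                  for i in range(0, len(raw), SUBGROUP_SIZE)]

print(f"skewness of raw data      : {skewness(raw):+.2f}")
print(f"skewness of subgroup means: {skewness(subgroup_means):+.2f}")
```

The raw values are heavily skewed, while the subgroup means are much closer to symmetric. Some skew remains at this subgroup size, which is why the normality of the means is only approximate and improves as the subgroups get larger.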

 

Depending on their nature, certain data types are expected to follow particular non-normal distributions, e.g. the Exponential distribution for failure data. If we do not see such distributions in data from such known events, we may need to treat the data as unacceptable.
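One rough way to check whether failure data looks exponential is to compare its empirical distribution with an exponential curve fitted from the sample mean. The sketch below computes the Kolmogorov-Smirnov distance between the two. The failure times are simulated, the function name is hypothetical, and no critical value is applied because the usual tables do not hold when the mean is estimated from the same data, so the distance is only a relative indicator here.

```python
import math
import random


def ks_distance_to_exponential(times):
    """Largest gap between the empirical CDF and an exponential CDF
    whose mean is set to the sample mean."""
    xs = sorted(times)
    n = len(xs)
    scale = sum(xs) / n
    distance = 0.0
    for i, x in enumerate(xs):
        cdf = 1.0 - math.exp(-x / scale)
        # The empirical CDF jumps from i/n to (i+1)/n at x; check both sides.
        distance = max(distance, cdf - i / n, (i + 1) / n - cdf)
    return distance


rng = random.Random(124)  # fixed seed so the sketch is repeatable

# Simulated failure times in hours with a mean of 100 (exponential).
exponential_times = [rng.expovariate(1 / 100) for _ in range(1000)]

# Contrasting made-up data: times spread evenly between 50 and 150 hours.
uniform_times = [rng.uniform(50, 150) for _ in range(1000)]

print(f"KS distance, exponential data: {ks_distance_to_exponential(exponential_times):.3f}")
print(f"KS distance, uniform data    : {ks_distance_to_exponential(uniform_times):.3f}")
```

A small distance suggests the data is consistent with an exponential shape, while a large one, as with the evenly spread times, indicates that data expected to be exponential deserves a closer look.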

This topic is now closed to further replies.