
Vishwadeep Khatri

Message added by Mayank Gupta,

Mean of a data set is the sum of the observations divided by the number of observations. It is the simple arithmetic average of the numbers in the data set. It is the most commonly used measure of central tendency and is usually preferred when the data distribution is symmetric.

 

Median denotes the middle value of a given data set i.e. 50% of the values will be above the median while 50% will be below it. It is one of the measures of Central Tendency which is preferred when the data is skewed or has extreme values.
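As a minimal sketch of the two definitions above, the snippet below compares the mean and median on a symmetric data set and on one with a single extreme value. Both lists are made-up illustrative numbers, not data from this thread.

```python
from statistics import mean, median

# Hypothetical data: the second list is the first with one extreme value.
symmetric = [8, 9, 10, 11, 12]
skewed = [8, 9, 10, 11, 62]

# On the symmetric data the two measures agree.
print("symmetric:", mean(symmetric), median(symmetric))

# The extreme value pulls the mean up but leaves the median unchanged.
print("skewed:   ", mean(skewed), median(skewed))
```

The single extreme value doubles the mean (10 to 20) while the median stays at 10, which is why the median is usually preferred for skewed data.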

 

An application-oriented question on the topic, along with responses, can be seen below. The best answers were provided by Suresh Kumar Gupta and Himanshu Sharma.

 

Applause for the joint winners. 

Question

Q 528. Mean and Median are the two most commonly used Measures of Central Tendency. Which of them is used more often and why? Support your answers with examples.

 

Note for website visitors - Two questions are asked every week on this platform. One on Tuesday and the other on Friday.


3 answers to this question


  • 0

Lean Six Sigma experts usually suggest setting the improvement target on the mean rather than the median, for the following reasons:

·        Population Data is used for analysis

·        Approach is decision based

·        Focus is on achieving project goals

·        The mean has a direct relationship with the total (total = mean x N), but the median does not, which is why it is difficult to move the needle with the median

·        Outliers in the data can be handled by understanding their cause; if an outlier is the result of a measurement error, it can be excluded from the analysis so that it has no impact on the mean

 

Let’s consider the following examples involving skewed data

 

Example 1: Lotteries

Which of the two games below would you choose to play?

Game A:

·         A 1/3 chance of winning $1

·         A 1/3 chance of winning $2

·         A 1/3 chance of winning $3

Game B:

·         A 1/3 chance of winning $1

·         A 1/3 chance of winning $1.90

·         A 1/3 chance of winning $1,000,000

It was found that most people preferred Game B, even though it has the lower median ($1.90 vs. $2.00).

Although the distribution is skewed, we should prefer the mean over the median here, since the $1,000,000 prize in Game B is an extreme value that is genuinely part of the distribution and is relevant for decision-making.

 

One shouldn’t take the decision based on the 1 in 3 chance of becoming a millionaire with Game B.
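The comparison in this example can be checked with a short sketch. Because each prize has an equal 1/3 chance, the expected winnings of a game equal the arithmetic mean of its three prizes. The variable names are illustrative only.

```python
from statistics import mean, median

# Prizes from the example; each has probability 1/3.
game_a = [1.00, 2.00, 3.00]
game_b = [1.00, 1.90, 1_000_000.00]

for name, prizes in (("Game A", game_a), ("Game B", game_b)):
    # Equal probabilities, so expected value = arithmetic mean of the prizes.
    expected = mean(prizes)
    print(f"{name}: expected winnings = ${expected:,.2f}, "
          f"median = ${median(prizes):,.2f}")
```

Game B has the lower median ($1.90 vs. $2.00) but expected winnings of about $333,334.30 against $2.00 for Game A, so only the mean reflects why people prefer it.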

 

Example 2: A company wants to increase its headcount

 

A company plans to hire 300 new employees and wants to estimate the total cost of the initiative.

 

Salary data tends to be right-skewed, so we can expect the mean salary of the new jobs to be larger than the median.

 

Because the salary distribution is skewed, the manager decides to use the median, and estimates the total cost of hiring the new employees by multiplying the median salary by 300.

 

The actual cost turned out to be much higher than the estimate, because the manager had used the median and not the mean.

 

In this example the company was interested in a total: the total cost of hiring the new employees. The manager should have used the mean, since the mean has a direct relationship with the total (total = mean x N), but the median does not.
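To illustrate the relationship total = mean x N, here is a small sketch with invented salary figures. The salary values and their split across the 300 hires are assumptions made up for this illustration; the original post gives no salary data.

```python
from statistics import mean, median

# Hypothetical right-skewed salaries for the 300 hires (assumed values):
# most hires are at the lower levels, a few are highly paid.
salaries = [30_000] * 180 + [40_000] * 90 + [150_000] * 30
n = len(salaries)

actual_total = sum(salaries)
mean_estimate = mean(salaries) * n      # reproduces the true total
median_estimate = median(salaries) * n  # ignores the highly paid hires

print(f"actual total:    {actual_total:,.0f}")
print(f"mean x N:        {mean_estimate:,.0f}")
print(f"median x N:      {median_estimate:,.0f}")
print(f"underestimation: {actual_total - median_estimate:,.0f}")
```

With these assumed figures the mean-based estimate matches the actual total of 13,500,000, while the median-based estimate comes to 9,000,000 and understates the cost by a third.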

 

So we can conclude that the mean should be preferred over the median whenever the analysis is driven by goals and the business decision depends on a total (total revenue or total sales), since the mean has a direct relationship with the total. Means are also sensitive to large values, so care should be taken to investigate and deal with any outliers.


  • 1

Mean & Median: The mean and the median are both used to measure the central tendency of data.

The arithmetic mean is found by adding up the values and dividing by the number of values in the data set. The median is the centre value of the data when it is arranged in ascending or descending order.

It is true that Lean Six Sigma experts mostly use the mean rather than the median. One reason is that targets are set on metrics such as quality rejection % or PPM, productivity (parts/man-hour or parts/shift), or similar business metrics. These are presented to management on a monthly basis and compared against the YTD average, so the mean is easily understood by management. As the project moves forward and we analyse the data at a comparatively lower level, we use the median if required, for example when the data has outliers or is non-normal.

One more aspect: if there is an outlier in the data and its reason is known, you can decide whether to remove or keep the outlier and still use the mean.

 

Example: Consider the energy consumption trend below.

 

[Figure: specific energy consumption trend (kWh/part) over one year]

 

The graph above shows the specific energy consumption trend over one year. In this data set the mean is 11.3 kWh/part and the median is 10.8 kWh/part. Consumption is high in one particular month (15.6 kWh/part) because production was low. Although this is specific (per-part) consumption, it still rises when production is low because some energy consumption is fixed: even with zero production, the plant would consume some energy.

There are two options for target setting. One is to take the data as it is and use the mean of 11.3. The other is to remove that month as an outlier and recalculate the mean, which gives 10.9 and is closer to the median, or to use the median of 10.8. Removing the outlier implies that its cause will not recur next year. If instead it is a seasonal trend and production is usually low in that month, you would not want to remove the outlier and may take the mean of 11.3 as the target. Interestingly, even after removing the outlier we may still use the mean rather than the median: many people do not understand the difference between the two, so Lean Six Sigma experts use the mean for target setting because it is easier to explain.
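The recalculated mean of 10.9 can be reproduced from the summary figures alone, using the relationship total = mean x N. This is a rough check that assumes the graph has 12 monthly data points (one per month of the year) and works from the rounded mean of 11.3, since the individual monthly values are not listed in the post.

```python
# Summary figures quoted in the post (kWh/part).
mean_all = 11.3   # mean over the full year
outlier = 15.6    # the low-production month

# Assumption: one data point per month, so 12 points in total.
n_months = 12

# total = mean x N, then drop the outlier and average the remaining months.
total = mean_all * n_months
mean_without_outlier = (total - outlier) / (n_months - 1)

print(round(mean_without_outlier, 1))
```

Under these assumptions the result rounds to 10.9, consistent with the figure quoted in the post and close to the median of 10.8.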

 

Another line of thought: if the mean and median of a data set differ, the data has some outlier or a specific reason behind a particular performance. Management will expect that some such reasons will always be present, and that there will be some inconsistency in the current VUCA business environment. So instead of using the median, use the mean and show the reason for the outlier or inconsistency. This thought process can be more useful in the current business scenario.

For the above reasons, the belief is that it is difficult to move the needle with the median, and Lean Six Sigma experts use the mean for target setting.

 

 
