
Sampling is choosing a smaller number of items / objects / individuals from a population to make inferences about the same population. A sample is only a part of the population. It should be unbiased and representative of the population.


Applause for all the respondents - Kishan Raval, Mukti Garg


Also review the answer provided by Mr Venugopal R, Benchmark Six Sigma's in-house expert.


Q 268. 'The results are only as good as the sample' and hence it is imperative to select a good sample. What are the key considerations while sampling in order to get a good sample?



Note for website visitors - Two questions are asked every week on this platform. One on Tuesday and the other on Friday.


5 answers to this question


Benchmark Six Sigma Expert View by Venugopal R

Statistical sampling is a method that has long been used to assess the characteristics of a population. Though the best option would be to assess the entire population, this may not be practical, hence the dependence on sampling to make decisions.


Sampling Risks:

While every method of sampling carries a risk of error, it is possible to understand and even quantify these risks and thus make an informed decision. Most of us will be aware of sampling errors, but a quick recap follows:

1.       Risk of declaring a good population as bad (alpha risk)

2.       Risk of declaring a bad population as good (beta risk)

Any sampling plan is governed by its operating characteristic curve (OC curve) that depicts and quantifies these risks. The OC curves and the sampling plans based on them have been very widely used in business for the purpose of deciding the appropriate acceptance sampling. However, I am not elaborating on this topic further here since there are many other aspects of sampling to be covered.
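To make the two risks concrete, here is a minimal sketch of how points on an OC curve could be computed for a single sampling plan. It assumes a binomial model (a large lot, with n items inspected and the lot accepted if at most c defectives are found). The plan values (n = 50, c = 1) and the "good" and "bad" defect rates are purely hypothetical and are not taken from any standard.

```python
from math import comb


def acceptance_probability(n: int, c: int, p: float) -> float:
    """Probability of accepting a lot whose true defect rate is p,
    when n items are inspected and at most c defectives are allowed.
    Assumes a binomial model (lot much larger than the sample)."""
    return sum(comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(c + 1))


# Hypothetical plan and quality levels, chosen only for illustration.
n, c = 50, 1
good_quality = 0.01   # defect rate we would consider "good"
bad_quality = 0.10    # defect rate we would consider "bad"

# Alpha risk: rejecting a lot that is actually good.
alpha = 1 - acceptance_probability(n, c, good_quality)
# Beta risk: accepting a lot that is actually bad.
beta = acceptance_probability(n, c, bad_quality)

print(f"alpha risk = {alpha:.3f}, beta risk = {beta:.3f}")

# A few points on the OC curve: acceptance probability vs. defect rate.
for p in (0.01, 0.02, 0.05, 0.10):
    print(f"p = {p:.2f}  P(accept) = {acceptance_probability(n, c, p):.3f}")
```

Plotting the acceptance probability against the defect rate gives the OC curve for the plan; changing n or c shifts the curve and hence both risks.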


Sampling Frame:

To obtain a representative sample from a population, it is important to define the ‘sampling frame’. The sampling frame is the set of units that exhaustively represents the universe from which we take a sample. For instance, if we need to pick a sample for assessing customer satisfaction for a certain product and we pick the sample customers based on the credit card details, the sample will not cover the set of customers who paid through other means, and it is possible that their levels of satisfaction could be markedly different.  Hence, in this case, the ‘sampling frame’ should incorporate inclusion of customers from all modes of payment.


A sampling frame should be defined in such a manner that it considers and represents all possible stratifications of the population. The number of units in the population not covered by the frame is known as the ‘gap’. If the units in the gap are distributed like the units in the frame, then the sample will be a good representation of the population. Samples taken without using a frame are called ‘non-probability’ samples, whereas samples taken using frames are called ‘probability’ samples. It is recommended to use probability sampling whenever possible, so that valid statistical inferences can be derived.


Let us discuss various types of probability samples that could be used for different situations:


Simple Random Sample:

This is one of the most basic sampling methods. In this method, every item in a population of N items has an equal chance of being picked. The lot of N items represents the frame. One may use random numbers to pick the samples.
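As an illustration, a simple random sample can be drawn with a random number generator. The sketch below assumes the frame is simply a list of item numbers. The frame size (500), sample size (20) and seed are hypothetical values chosen for the example.

```python
import random


def simple_random_sample(frame, n, seed=None):
    """Draw n distinct items from the frame, each with equal chance."""
    rng = random.Random(seed)          # seed only makes the draw repeatable
    return rng.sample(list(frame), n)  # sampling without replacement


# Hypothetical frame: items numbered 1 to 500.
frame = range(1, 501)
sample = simple_random_sample(frame, 20, seed=42)
print(sorted(sample))
```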


Stratified Sampling:

Here the N items in a population are divided into sub-groups or strata, based on a characteristic of relevance. A simple random sample is selected from each stratum and the combined result is obtained. For instance, if we need to pick a sample to perform a medical test from a population of the state, we can sub-classify them into districts and pick random samples from each district. Stratified sampling technique can help to reduce the overall sample size to obtain the same level of confidence on inferences. Further, it will also help to understand if any heterogeneity is present between the strata.
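A minimal sketch of stratified sampling follows. It assumes proportional allocation, i.e. the same sampling fraction in every stratum, which is only one of several possible allocation rules. The district names, population sizes and the 5% fraction are hypothetical.

```python
import random
from collections import defaultdict


def stratified_sample(units, stratum_of, fraction, seed=None):
    """Group units into strata, then draw a simple random sample of the
    given fraction from each stratum (at least one unit per stratum)."""
    rng = random.Random(seed)
    strata = defaultdict(list)
    for unit in units:
        strata[stratum_of(unit)].append(unit)

    sample = {}
    for name, members in strata.items():
        k = max(1, round(len(members) * fraction))
        sample[name] = rng.sample(members, k)
    return sample


# Hypothetical population: (person_id, district) pairs.
population = (
    [(i, "North") for i in range(600)]
    + [(i, "South") for i in range(600, 900)]
    + [(i, "East") for i in range(900, 1000)]
)

picked = stratified_sample(population, stratum_of=lambda u: u[1],
                           fraction=0.05, seed=1)
for district, people in picked.items():
    print(district, len(people))
```

Keeping the results per stratum, as the dictionary does here, also makes it easy to compare strata and spot heterogeneity between them.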


Systematic Sampling:

In systematic sampling, we divide the items in the frame into groups of k items each, where k is the total number of items divided by the sample size. We then pick one item at random from the first k items and every kth item thereafter. A very simple example of systematic sampling is to pick every kth item from a production line for inspection. While this sampling method gives uniform coverage across the frame, one has to be cautious of certain disadvantages. For instance, imagine this is used for assessing the travel experience of people who got off a flight, and the method followed was to pick every 12th passenger who exits. There is a possibility that you might pick more passengers who were occupying a particular seat location, say, window seats, which is likely to introduce bias in the sampling.
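The procedure can be sketched as below, assuming the frame is an ordered list and the interval is the frame size divided by the sample size (rounded down). The passenger numbers are hypothetical.

```python
import random


def systematic_sample(frame, n, seed=None):
    """Pick a random start within the first k items, then every kth item,
    where k = (number of items in the frame) // (sample size)."""
    items = list(frame)
    k = len(items) // n                 # sampling interval
    if k < 1:
        raise ValueError("sample size cannot exceed the frame size")
    start = random.Random(seed).randrange(k)
    return items[start::k][:n]


# Hypothetical frame: 120 passengers in order of exit, sample of 10.
passengers = list(range(1, 121))
sample = systematic_sample(passengers, 10, seed=7)
print(sample)
```

Because the picks are evenly spaced, any pattern in the frame that repeats with the same period as the interval (such as seat position) will be over-represented, which is the bias described above.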


Cluster Sampling:

All the items in the frame are divided into clusters. Clusters are naturally occurring sub-categories of the frame, for example districts within a state or colleges within a region. Out of the total number of clusters, a few are selected at random and all the items in the selected clusters are studied. Note that cluster sampling differs from stratified sampling: in stratified sampling a sample is drawn from every stratum, whereas in cluster sampling only the selected clusters are studied. Cluster sampling could result in an increased sample size, but it may be more convenient and reduce the need to travel.
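A minimal sketch of single-stage cluster sampling is shown below, assuming the clusters are already known and are selected with equal probability. The college names and student identifiers are hypothetical.

```python
import random


def cluster_sample(clusters, n_clusters, seed=None):
    """Select n_clusters clusters at random and return every item
    belonging to the selected clusters."""
    rng = random.Random(seed)
    chosen = rng.sample(sorted(clusters), n_clusters)
    return {name: list(clusters[name]) for name in chosen}


# Hypothetical frame: students grouped by college.
colleges = {
    "College A": ["A1", "A2", "A3"],
    "College B": ["B1", "B2"],
    "College C": ["C1", "C2", "C3", "C4"],
    "College D": ["D1", "D2", "D3"],
}

selected = cluster_sample(colleges, 2, seed=3)
for college, students in selected.items():
    print(college, students)
```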


Keeping the objective in mind, the sampling strategy and method will have to be decided so that the inferences based on the sample are meaningful and reliably representative.


Why Use Data Sampling?
Sometimes gathering information on a complete population is simply cost-prohibitive. Think about CNN's coverage of an election cycle in the USA. It is not possible to ask every voter how they voted or whom they are going to vote for. Even if it were, not all would answer. Instead, exit polls are used to derive statistical conclusions about the population as a whole.

Concerns About Data Sampling
When you take a sample from a larger population, you must make sure that the sample is of an appropriate size and is drawn without bias. You should address these concerns while collecting data.

For example, it is very helpful if the sample size is large enough for the sample mean to be approximately normally distributed, as this opens the door to an array of statistical tools.


How Large Should Your Data Sample Be?

The calculation of how large a sample should be depends on:

Type of data (continuous or discrete) being measured.
How precise you want your statistical inferences to be.
Estimate of the standard deviation, or the historical standard deviation, of the entire population.
Confidence level desired.
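The factors listed above come together in the usual normal-approximation formulas: n = (z * sigma / E)^2 for a mean (continuous data) and n = z^2 * p * (1 - p) / E^2 for a proportion (discrete data), where E is the desired margin of error. The sketch below assumes these formulas apply (large population, normal approximation). The function names and the example inputs are hypothetical.

```python
from math import ceil
from statistics import NormalDist


def z_value(confidence: float) -> float:
    """Two-sided standard normal critical value for a confidence level."""
    return NormalDist().inv_cdf(1 - (1 - confidence) / 2)


def sample_size_for_mean(sigma, margin, confidence=0.95):
    """Continuous data: n = (z * sigma / margin)^2, rounded up."""
    return ceil((z_value(confidence) * sigma / margin) ** 2)


def sample_size_for_proportion(p, margin, confidence=0.95):
    """Discrete data: n = z^2 * p * (1 - p) / margin^2, rounded up."""
    z = z_value(confidence)
    return ceil(z**2 * p * (1 - p) / margin**2)


# Hypothetical inputs: historical sigma of 10 units, precision of +/- 2 units.
print(sample_size_for_mean(sigma=10, margin=2))          # 97
# Hypothetical inputs: unknown proportion (worst case 0.5), +/- 5 points.
print(sample_size_for_proportion(p=0.5, margin=0.05))    # 385
```

Note how asking for tighter precision or higher confidence raises the required sample size, while a smaller standard deviation lowers it.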

The points below apply to hypothesis tests.

The sample size needed for a hypothesis test depends on:

Desired risk (both alpha and beta).
Minimum difference to be detected between the population means.
Variation in the characteristic being measured (s or sigma), which reflects the population variance.
Sensitivity to the parameter shift to be detected.
(Population size does NOT come into the determination of how big the sample should be.)
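As a rough sketch of how these inputs combine, the common approximation for comparing two means is n per group = 2 * ((z_alpha/2 + z_beta) * sigma / delta)^2. This assumes a two-sided, two-sample test with a known, equal standard deviation in both groups and the normal approximation; other tests need other formulas. The function name and the example values are hypothetical.

```python
from math import ceil
from statistics import NormalDist


def sample_size_two_means(sigma, delta, alpha=0.05, beta=0.10):
    """Approximate sample size per group to detect a difference `delta`
    between two means with alpha risk `alpha` (two-sided) and beta risk
    `beta`, assuming a common known standard deviation `sigma`."""
    std_normal = NormalDist()
    z_alpha = std_normal.inv_cdf(1 - alpha / 2)
    z_beta = std_normal.inv_cdf(1 - beta)
    return ceil(2 * ((z_alpha + z_beta) * sigma / delta) ** 2)


# Hypothetical inputs: sigma = 10, smallest difference of interest = 5.
print(sample_size_two_means(sigma=10, delta=5, alpha=0.05, beta=0.10))  # 85
print(sample_size_two_means(sigma=10, delta=5, alpha=0.05, beta=0.20))  # 63
```

The population size does not appear anywhere in the formula, which is consistent with the last point above. Halving the difference to be detected roughly quadruples the required sample size.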


Key considerations in order to get a good sample are:


- Clarity on the end result one wants to achieve.

- Alignment of the sample selection to the organisation’s business & expectations. 

- Processes & tools needed to select the right sample.

- Follow up approach to sustain the sampling results. 


Sampling Strategy




Within the scope of each data collection, samples save time and effort:

·        when it is impractical, impossible or too expensive to collect all data

·        when the data collection is a cumbersome process

The aim is to derive a sampling strategy that provides the most accurate level of information about the population being measured. The objective is to meet the goals of data collection while optimizing the effort and cost.


The sampling strategy comprises the methodology for selecting samples as well as planning the sample size. This basic procedure can be divided into four phases:

1. The selection of samples should be entirely random

2. Choose a selection principle and a selection type

·        Different types of selection and selection principles will be driven by cost and effort criteria

·        They vary depending on the question being asked

3. Determine a selection technique in case of random selection



Non-random Selection:

·        Quota Procedure: Guideline of quotas, e.g. accident repair. Application: If only targeted information is needed

·        Cut-off Procedure: Only a part of the population is observed, e.g. accident damage. Application: If only one aspect is to be examined

·        Haphazard Selection: Only the information which can be obtained easily is collected. Application: If only a first impression is to be gained, e.g. for estimation of proportion or standard deviation for more precise sample size calculation

Random Selection:

·        Simple Sample: All units have the same chance of being drawn. Advantage: No knowledge about population necessary. Disadvantage: High effort

·        Cluster Sample: The population is clustered in a logical way and one cluster is selected, e.g. sites. Advantage: Lower costs. Disadvantage: Information can get lost

·        Stratified Sample: The population is stratified according to relevant criteria, e.g. spray-painting type, machine, location etc. Then a representative sample is removed from each stratum. Advantage: Smaller sample. Disadvantage: Information on the population must be available to start with



4. Determine the sample size

The bigger the sample, the greater the validity, i.e. the quality of the statistical conclusion about the population.

One should therefore revert to available data (e.g. from IT systems). The data is treated like a sample since the process to be improved has not yet been stopped.

When new data is collected (e.g. manual counting, surveys), the cost of collection must be weighed against the desired level of confidence and precision.

All in all, three factors play a role when the sample size is determined:


·        The required Confidence Level, which indicates the likelihood that the population mean lies within the given Confidence Interval. This value is normally a given for any organization, e.g. 95%

·        The granularity, which indicates how precise we want to be and is usually half the width of the Confidence Interval

·        The costs and the duration of the data measurement, which increase with the sample size. When the sample sizes are calculated, it is important to consider whether the requested precision is worth the costs incurred
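The trade-off between these three factors can be illustrated by computing the granularity (half the width of the Confidence Interval for a mean) for different sample sizes. The sketch assumes the normal approximation with a known standard deviation; the function name and the values used (sigma = 10, sample sizes 25 to 400) are hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def margin_of_error(sigma, n, confidence=0.95):
    """Half-width of the confidence interval for a mean:
    z * sigma / sqrt(n), assuming a known standard deviation."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    return z * sigma / sqrt(n)


# Hypothetical standard deviation of 10 units.
for n in (25, 100, 400):
    print(f"n = {n:3d}  granularity = +/- {margin_of_error(10, n):.2f}")
```

Since the half-width shrinks with the square root of n, each halving of the granularity requires about four times as many measurements, which is why the cost of added precision should be checked before fixing the sample size.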


Rules of thumb for sample size


·  Discrete data (e.g. OK / Not OK): 100, with at least 5 per category

·  Continuous data: 30



There is no best answer for this question.


Sampling is done to derive meaningful inferences about the population. Some of the key considerations for sampling are:

1. Purpose of the study

2. Cost and time available for the study

3. Permissible errors in the study (alpha and beta)

4. Population Information (sampling frame) available

5. Sampling method (to ensure a non-biased and representative sample)

6. Sample size 


