
# Continuous data

Continuous data represents measured data. Any product specification that is measured and has a unit gives us continuous data. In principle, such data can be subdivided indefinitely. Examples: the length of a product, the weight of a person, time.

An application oriented question on the topic along with responses can be seen below. The best answer was provided by Kavitha Sundar on 31st October 2017.

## Question

While continuous data is generally preferred over discrete data, please indicate circumstances where discrete is the preferred data type although continuous data is available for the same characteristic.

Note for website visitors - Two questions are asked every week on this platform. One on Tuesday and the other on Friday.

## Recommended Posts

• 1

Question: While continuous data is generally preferred over discrete data, please indicate circumstances where discrete is the preferred data type although continuous data is available for the same characteristic.

Given a choice of data type, it is generally more useful to analyze continuous data than discrete data. Discrete data, even when a large sample has been studied, cannot be broken down into finer information. Continuous data can be broken down into smaller pieces, which makes it more informative for decision making. With continuous data, we can estimate how close the process mean is to the target, and whether we are within or outside the spec limits.

Example 1: (Continuous data over discrete data) The diameter of a pipe is measured in mm as it is produced and collected for analysis. Let's say the target is 10 mm and up to 1 mm over is acceptable. At a very high level, the readings can be classified into <10 mm, between 10 mm and 11 mm, and >11 mm. This is a discrete representation: the boundaries are defined, and the pieces falling outside them are counted as defects. On its own, this contributes little to decision making. But when the data is plotted as continuous data in an I-MR chart, the pieces that are out of spec limits can be identified, and the root cause can be found and arrested. This is not possible with discrete data. Hence continuous data is preferred in this case.
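To make the contrast in Example 1 concrete, here is a minimal Python sketch. The diameter readings are made up for illustration and are not data from the post, the function names are hypothetical, and treating the 10 mm to 11 mm band as "in spec" is an assumption based on the example's wording. It puts the three-band count next to what the individual readings additionally allow: the offset of the mean from target and the limits of an individuals (I) chart, computed with the usual constant 2.66 applied to the average moving range.

```python
from statistics import mean

TARGET_MM = 10.0   # target diameter from the example
UPPER_MM = 11.0    # 1 mm over target is acceptable

# Hypothetical readings in mm, for illustration only.
diameters = [10.2, 10.4, 9.9, 10.6, 11.2, 10.3, 10.5, 10.1]


def classify(d):
    """Discrete view: which of the three bands a reading falls in."""
    if d < TARGET_MM:
        return "<10 mm"
    if d <= UPPER_MM:
        return "10-11 mm"
    return ">11 mm"


def band_counts(readings):
    """Count how many readings fall in each band."""
    counts = {"<10 mm": 0, "10-11 mm": 0, ">11 mm": 0}
    for d in readings:
        counts[classify(d)] += 1
    return counts


def i_chart_limits(readings):
    """Continuous view: lower limit, centre line, upper limit of an I chart."""
    centre = mean(readings)
    moving_ranges = [abs(b - a) for a, b in zip(readings, readings[1:])]
    mr_bar = mean(moving_ranges)
    return centre - 2.66 * mr_bar, centre, centre + 2.66 * mr_bar


counts = band_counts(diameters)
lcl, centre, ucl = i_chart_limits(diameters)
# Assumption: anything outside the 10-11 mm band counts as out of spec.
out_of_spec = [(i, d) for i, d in enumerate(diameters) if classify(d) != "10-11 mm"]

print(counts)
print(f"mean offset from target: {centre - TARGET_MM:+.2f} mm")
print(f"I chart limits: {lcl:.2f} to {ucl:.2f} mm")
print("out-of-spec pieces (index, mm):", out_of_spec)
```

The band counts say only how many pieces fell in each category, while the continuous view also says which pieces they were and how far the process sits from target.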

Example 2: (Discrete over Continuous data)

Let's say 20 employees working in the ABC process are monitored for shift adherence. Their login times are collected against the target shift start time and plotted on a time series chart as continuous data to find the defect %. That view is useful for RCA, but it is not meaningful when we simply have to report how many were late and how many were on time. Hence, although this data is collected from a real-time scenario and has the characteristics of continuous data, it is more meaningful to present the number of late logins and the shift adherence % to management as discrete data. Measures such as average delivery time, processing time and login time are continuous data, but for reporting purposes it is useful to represent them as discrete data.
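The conversion described in Example 2 can be sketched in a few lines of Python. The shift start time, the employee IDs and the login times below are all assumptions for illustration, and "on time" is taken to mean logging in at or before shift start.

```python
from datetime import time

SHIFT_START = time(9, 0)  # assumed shift start; not given in the post

# Hypothetical login times for a handful of employees.
logins = {
    "emp01": time(8, 52),
    "emp02": time(9, 0),
    "emp03": time(9, 7),
    "emp04": time(8, 58),
    "emp05": time(9, 15),
}


def minutes_late(login, start=SHIFT_START):
    """Continuous view: signed minutes relative to shift start."""
    return (login.hour * 60 + login.minute) - (start.hour * 60 + start.minute)


def status(login, start=SHIFT_START):
    """Discrete view: 'late' if logged in after shift start, else 'on time'."""
    return "late" if minutes_late(login, start) > 0 else "on time"


late_count = sum(1 for t in logins.values() if status(t) == "late")
adherence_pct = 100 * (len(logins) - late_count) / len(logins)

print(f"late logins: {late_count} of {len(logins)}")
print(f"shift adherence: {adherence_pct:.0f}%")
```

The `minutes_late` values remain available for root cause analysis; only the report to management is reduced to a count and a percentage.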

Examples where discrete data is preferred over continuous data:

| Examples | Continuous data | Discrete data |
| --- | --- | --- |
| Shift adherence | Login time is noted for all employees. | For reporting purposes, continuous is converted to discrete (late, early, on time) and presented for meaningful decisions. |
| Minimum balance of 1000 in bank account | Balance range is collected for all account holders. | Classified as "Maintained / Not maintained" and reported out as discrete. |
| Car fuel gauge | How many litres remain in the car fuel tank. | Gauge indicates "Full, Half, Empty". |
| Height of the child in school records | Height is noted for each child and compared against the growth chart with respect to the age of the child. | How many are underweight and overweight? Counted from the collected data and presented at a high level. |

Conclusion:

Does this mean only attribute data is good enough? Of course not. Both play different roles. For decision making and root cause analysis, continuous data is more meaningful, but for reporting at a high level, discrete data would be better. So the answer is that it depends on the underlying characteristic that we want to measure, collect and represent. If the data is continuous, you have the choice of reporting it as continuous, discrete or both.

Thanks

Kavitha

##### Share on other sites
• 1

It is true we have learned that “Continuous Data”, when available, is always preferable to “Discrete” data.

Continuous data allows more precise statistical analysis. For process analysis, for identifying improvements, and for comparing and measuring improvements, continuous data is more amenable.

However, it is quite surprising that there are many situations where we actually deliberately present a discrete representation of continuous data.

Let us examine a few assorted situations as below:

1. One of the most common examples that comes to our minds is the usage of “Go / No go” gauges, where a variable parameter is converted as attribute for quick decision purposes.
2. A control chart uses continuous data, but when it comes to a decision for action, it is based on a set of discrete rules like “whether the point has fallen outside limits or not”?
3. Hypothesis tests for variable data, such as ‘t’ tests, finally rely upon a ‘Yes / No’ decision of whether the P-value is greater than 0.05 or not.
4. There may be occasions where we prefer to pay for certain commodities say apples, oranges, on a count basis, even though it is possible to weigh them.
5. In supermarkets all items are packed and barcoded, so the count method is used for billing rather than the exact weights.
6. We say that “I am 30 years old” and do not say that “I am 15,768,213.34 minutes old” Although time is a continuous data, when we talk about our age, we are actually ‘counting’ the number of years, and not using a continuous scale!
7. Schools prefer to use a grade system rather than the actual marks to denote a student’s performance.
8. When someone asks a question “How punctual is he?” we would not answer by providing a frequency distribution of his arrival times, but rather provide data on ‘how many days did he arrive on time during a month’.
9. Turn Around Time (TAT) can be measured as a continuous characteristic; however, most customers who outsource a data capture process specify TAT requirements on a discrete basis. For example, “96% of production should meet TAT of 24 hours and 99% of production should meet TAT of 48 hours”.
10. Although volume of fuel is measurable in the tank of a vehicle, a discrete method of a warning light coming up when it reaches a certain level is highly preferred.
11. A ‘dip stick’ with the high and low indications marked on it is commonly used to check the engine oil levels (and not a volume meter!)
12. Pulmonologists use a device to test lung capacity in which 3 colored balls are displaced by blowing, and the displacement of the balls is observed. This discrete method replaces the otherwise continuous data on the rate of air displacement.
13. Interestingly, a histogram is a tool used to represent continuous data in a discrete fashion. The observations in each class interval are counted, and each count is represented by a vertical bar of the histogram. This method allows easy representation and interpretation.
14. Ready-made dresses are classified for sizes as S, L, XL, XXL etc. which actually represents different ranges of measurable dimensions.
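Item 9 in the list above lends itself to a short illustration. The sketch below, in Python, reduces continuous turnaround times to the two percentage checks quoted there. The turnaround values are hypothetical, the function names are made up for this example, and "meeting" a TAT is assumed to mean finishing at or under the limit.

```python
# Hypothetical turnaround times in hours for completed work items.
tat_hours = [5.5, 12.0, 23.9, 26.0, 8.2, 47.5, 19.0, 3.1, 22.4, 50.3]

# (limit in hours, required share in %) taken from the example in item 9.
REQUIREMENTS = [(24, 96.0), (48, 99.0)]


def share_within(values, limit_hours):
    """Percentage of items whose TAT is at or below the limit."""
    return 100 * sum(1 for v in values if v <= limit_hours) / len(values)


def sla_report(values, requirements=REQUIREMENTS):
    """Return (limit, achieved %, required %, met?) for each requirement."""
    report = []
    for limit, required in requirements:
        achieved = share_within(values, limit)
        report.append((limit, achieved, required, achieved >= required))
    return report


for limit, achieved, required, met in sla_report(tat_hours):
    verdict = "met" if met else "not met"
    print(f"TAT <= {limit} h: {achieved:.0f}% (required {required:.0f}%) -> {verdict}")
```

The customer sees only "met" or "not met" per threshold, while the underlying hours remain available to the process owner for analysis.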

The above examples illustrate the fact that:

Even though we use continuous data for various purposes, when it comes to the final decision, we have to go discrete.

For certain objective decision making in our day to day activities, discrete data would be easier to measure and interpret.

##### Share on other sites
• 1

Discrete data are very much preferred when the event that is being measured is itself discrete in effect. Discrete data could be preferred if the information required from the measurement is qualitative. There could also be a continuous measure possible and being done, but the objective of the exercise would be only whether a transaction has met the requirements or not. By how much has a transaction failed to meet a requirement or by how much it has exceeded the requirement may not be of immediate concern.

One recent example that has made it to the newspaper headlines, at least on the sports pages, is the use of the Danish football physiologist Jens Bangsbo’s Yo-Yo test as a criterion to decide whether a cricketer, already found to be adequately skilled in the different departments of the game, is also physically fit enough to be considered for national selection. If a player fails to meet the required score of (say) 19, even by a very small margin, he is out of the team, and by how much he missed is irrelevant. Similarly, if a player passes the test, he is in the team; whether he exceeded the requirement by a high margin or just scraped through does not matter, as both players are equally in the team.

Another situation is the grading of students in an educational institution, be it a school, a college or any other qualifying examination. Typically, it could be felt that there is not much difference between students whose marks differ by a very small number. In other words, the difference between two students with scores of 95 and 96 could be considered as due only to chance causes. Therefore the examining authority could create multiple slabs of marks and map these slabs to grades. All students whose marks fall within a slab would be awarded the corresponding grade, and the actual marks would not even be reported, as they are not considered relevant once the slab, and thereby the grade, has been decided. Of course, there could always be a cut-off for passing the exam.

In the same educational institution, the faculty could be rated by the proportion of students who have passed the exam in the class handled by the faculty member. Here, the average marks scored by the students, although calculable, is not used as a measure of performance, because helping students pass an exam is considered a very basic requirement and more critical than helping more students get higher marks.
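The slab-to-grade mapping described above can be sketched with a sorted list of slab edges. The edges and grade labels below are assumptions chosen for illustration, since the post does not specify any, and the Python function name is hypothetical.

```python
from bisect import bisect_right

# Assumed lower edges of each slab and the matching grade labels.
LOWER_EDGES = [35, 50, 60, 70, 80, 90]
GRADES = ["Fail", "E", "D", "C", "B", "A", "A+"]


def grade(marks):
    """Map a mark out of 100 to the grade of the slab it falls in."""
    return GRADES[bisect_right(LOWER_EDGES, marks)]


for marks in (34, 35, 72, 95, 96):
    print(marks, grade(marks))
```

With these assumed slabs, scores of 95 and 96 receive the same grade, which matches the point that such a small difference is not treated as meaningful.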

Such situations are also common in tracking the effectiveness of controls, especially manual controls. These controls could be just process controls or security related controls. The objective is to understand the effectiveness of controls for which the pre-requisite is to measure the proportion of instances in which the control has been implemented out of the total opportunities. The control can be just an entry in a register or a screen or a phone call or mail to communicate something. As the implementation of the control itself is a discrete action, that is, either it is implemented or it is not, the measure of the control itself is preferred in discrete form. The time interval within which the control is implemented or the quality of the mail or phone conversation is irrelevant and the KPI tracked here is the proportion of transactional opportunities in which the control was applied.

##### Share on other sites
• 0

The categorical class of discrete data is more useful for understanding the overall characteristic of the target under study, e.g. underweight, normal, obese. This is usually just another kind of binning. When performing a qualitative study such as customer satisfaction, discrete data is more useful for analysis and for drawing conclusions than continuous data. In scenarios where ranking has to be done based on relative performance, the ordinal class of discrete data is handier than continuous data. Population census data analysis and clinical trial analysis of drugs are other areas where discrete data is handier than continuous data.

##### Share on other sites
• 0

While continuous data is generally preferred over discrete data, in some circumstances we prefer discrete even though continuous data is available for the same characteristic. For example, age is continuous data, but for simple calculations we sometimes treat it as discrete and consider only the whole number, like 10, 20, etc. Another example is regression analysis, where continuous data is used together with discrete data: the weight of a box of jujubes is correlated with the number of jujubes in the box; here weight is continuous, but the number of jujubes is discrete. We use chi-square analysis to find out whether there is any statistically significant difference in the amount of color in the box.

We can convert continuous data into discrete data, but not vice versa. For example, if we want to know how much water there is in our office for the day, we can simply say 50 litres, but we can also say 50 bottles if each bottle holds one litre. So it is easier to count the number of bottles than to weigh every bottle to find the quantity of water.
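The one-way nature of this conversion can be illustrated with a small Python sketch. The two samples and the range limits are hypothetical, and the function name is made up for this example. Different sets of readings can collapse to the same category counts, so the readings cannot be recovered from the counts.

```python
from collections import Counter


def to_categories(values, low, high):
    """Bin each reading as 'below', 'within' or 'above' the [low, high] range."""
    def label(v):
        if v < low:
            return "below"
        if v > high:
            return "above"
        return "within"
    return Counter(label(v) for v in values)


# Two hypothetical samples with clearly different readings.
sample_a = [9.8, 10.1, 10.2, 10.9, 11.3]
sample_b = [9.1, 10.5, 10.6, 10.7, 14.0]

summary_a = to_categories(sample_a, 10.0, 11.0)
summary_b = to_categories(sample_b, 10.0, 11.0)

print(summary_a == summary_b)  # the discrete summaries coincide
print(sample_a == sample_b)    # the underlying readings do not
```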

##### Share on other sites
• 0

The discrete data type would be preferred over the continuous data type in processes where counting, classification, ranking or categorising may be required, instead of just using the raw process data values.

A few examples where, I believe, discrete is the preferred data type although continuous data is available are:

1.       Calculating Number of defects or DPMO. Discrete data makes sense when we are talking about the number of defects, or the number of defects per million opportunities (DPMO). We might have continuous data showing dimensions of a product, but it might be just sufficient to know if the item is defective or not.

2.       Apparel / Shoes Manufacturing. Though heights and foot sizes are continuous, manufacturers create categories (discrete) for production. Apparel: XS, S, M, L, XL, XXL; Shoes: 3, 4, 5, 6, 7, 8, 9, 10, 11, 12.

3.       Kids / Senior Citizen Discounts. Though age is available as continuous data (in years, months, days and so on), processes involving age-based discounts use categories (discrete): Kids – below 8 years, Senior Citizens – above 65 years.

4.       Speeding Ticket Penalties. Speed is continuous data, but when determining whether a driver was speeding and deciding the quantum of penalty, the data is treated as discrete, i.e. 5 kmph above the speed limit, 10 kmph above the speed limit, 20 kmph above the speed limit and so on.
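For the first example in the list above, a minimal Python sketch shows how measured dimensions reduce to a defect count and a DPMO figure. The lengths and spec limits are hypothetical, and one defect opportunity per unit is assumed; the formula used is defects divided by total opportunities, scaled to one million.

```python
def dpmo(defects, units, opportunities_per_unit):
    """Defects per million opportunities."""
    return defects * 1_000_000 / (units * opportunities_per_unit)


# Hypothetical measured lengths in mm and assumed spec limits.
LSL, USL = 49.5, 50.5
lengths = [50.1, 49.8, 50.6, 50.0, 49.4, 50.2, 50.3, 49.9, 50.0, 50.4]

# Discrete view: a unit is either inside the limits or defective.
defective = [x for x in lengths if not (LSL <= x <= USL)]

print(f"defective units: {len(defective)} of {len(lengths)}")
print(f"DPMO: {dpmo(len(defective), len(lengths), 1):.0f}")
```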

##### Share on other sites
• 0

Discrete Data:

Discrete data is information that is categorised into classes or counted, and it can take only a finite number of values.

Eg 1: Pass/Fail, Go/No-go, accept/reject, Male/Female..
Eg 2: Rating: 1–10
Eg 3: no. of late deliveries

Let us look at an example where discrete data is preferred even though continuous data could be used.

1.      The correspondent of a school is interested to know how many students passed and failed in each section/class of 12th Standard. He is interested only in knowing the pass status. There are 4 sections of the 12th Standard class. Section 1 is for Maths, Physics, Chemistry and Biology, Section 2 is for Pure Science (Physics, Chemistry and Biology), Section 3 is for computer science and Section 4 is for commerce. Each section has 30 students.

| S.No | Section | Total Students | Passed | Fail |
| --- | --- | --- | --- | --- |
| 1 | Section 1 | 30 | 30 | 0 |
| 2 | Section 2 | 30 | 20 | 10 |
| 3 | Section 3 | 30 | 25 | 5 |
| 4 | Section 4 | 30 | 26 | 4 |

In total, 101 out of 120 passed. This is discrete data, as it talks only about pass/fail.
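The total quoted above can be reproduced from the per-section counts. The Python sketch below copies the figures from the table in this post; the variable names are arbitrary.

```python
# Pass counts per section, copied from the table in the post.
sections = {
    "Section 1": {"total": 30, "passed": 30},
    "Section 2": {"total": 30, "passed": 20},
    "Section 3": {"total": 30, "passed": 25},
    "Section 4": {"total": 30, "passed": 26},
}

total_students = sum(s["total"] for s in sections.values())
total_passed = sum(s["passed"] for s in sections.values())
total_failed = total_students - total_passed
pass_pct = 100 * total_passed / total_students

print(f"{total_passed} of {total_students} passed ({pass_pct:.1f}%), {total_failed} failed")
```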

Now if the correspondent wants to know how many students passed with distinction, how many passed with first class marks, how many with second class marks and how many just crossed the pass mark (35/100), then we can show the marks in ranges. This data can then be treated as continuous data.

Number of students in each pass range (marks):

| S.No | Section | # Students | 35-40 | 40-50 | 50-60 | 60-70 | 70-80 | 80-90 | 90-100 |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| 1 | Section 1 | 30 | 2 | 2 | 5 | 6 | 10 | 3 | 2 |
| 2 | Section 2 | 30 | 4 | 2 | 1 | 1 | 6 | 5 | 1 |
| 3 | Section 3 | 30 | 1 | 2 | 3 | 2 | 9 | 4 | 4 |
| 4 | Section 4 | 30 | 5 | 2 | 2 | 6 | 5 | 3 | 2 |

Conclusion

Even though we know that, in general, continuous data is better for accurate results, the cases above show that discrete data can be needed and useful to achieve the result, even when continuous data could also be used.

##### Share on other sites
• 0

When there are multiple variables with continuous data, it becomes difficult to map the data. The impact on the visualisation of such data is drastic, and the charts and graphs become very cluttered and confusing. The numbers and labels in such graphs often do not make sense, and there is a need to assign meaning to these numbers. Hence, it is advisable to collect the numbers in groups or categories and represent the categories as discrete data, which makes the mapping more structured and presentable.

##### Share on other sites
• 0

Advantages of Discrete data:

1. Real data always comes in discrete form; real data points can be calibrated from discrete data.
2. Discrete data can be used by a computer, as it works in discrete units.
3. Discrete data is therefore more general, as continuous time is also a special case of discrete time (in the limit as the length of each period approaches zero).

##### Share on other sites
• 0

This question again challenged us to think beyond what we always stress upon: collecting continuous data wherever possible. Kudos to everyone who made an attempt. Great examples were provided by Venugopal, Mohan PB, Kavitha and R Rajesh.

The chosen best answer is Kavitha's, as she provides a direct comparison of examples where we have continuous data but still choose to collect discrete data.
