Question: While continuous data is generally preferred over discrete data, please indicate circumstances where discrete is the preferred data type although continuous data is available for the same characteristic.
Answer:
Given a choice of data type, it is generally more useful to analyze continuous data than discrete data. Discrete data, even with large samples, only tells us how many items fall into each category; it cannot be broken down further into meaningful information. Continuous data can be broken down into finer detail, which makes it more informative for decision making. With continuous data we can estimate how close the process mean is to the target and whether the process is within or outside the spec limits.
Example 1: (Continuous data over discrete data) The diameter of a pipe is measured in mm as it is produced and collected for analysis. Let's say the target is 10 mm and up to 1 mm over is acceptable. At a very high level, the measurements could be classified as <10 mm, between 10 mm and 11 mm, and >11 mm. That is a discrete representation: the categories are defined by the boundaries, and the pieces falling outside them are counted as defects. Counts alone give little basis for decision making. When the same data is plotted as continuous data on an I-MR chart, the individual pieces that are out of spec limits can be identified, and the root cause can be found and arrested. This is not possible with discrete data. Hence continuous data is preferred in this case.
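To make Example 1 concrete, here is a minimal Python sketch. The diameter values are made up for illustration, and the function names are hypothetical. It compares what the discrete view keeps (counts per category) with what the continuous view allows (finding the individual out-of-spec pieces and computing I-MR limits with the usual 2.66 constant for an individuals chart).

```python
from statistics import mean

# Hypothetical pipe diameters in mm, in production order (illustrative only).
diameters = [10.2, 10.4, 10.1, 10.9, 11.3, 10.6, 9.8, 10.5, 10.7, 10.3]

# Spec limits taken from the example: target 10 mm, up to 1 mm over is acceptable.
LSL, USL = 10.0, 11.0


def discrete_summary(values, lsl=LSL, usl=USL):
    """Count pieces per category; this is all the discrete view retains."""
    counts = {"below_10": 0, "within_spec": 0, "above_11": 0}
    for v in values:
        if v < lsl:
            counts["below_10"] += 1
        elif v <= usl:
            counts["within_spec"] += 1
        else:
            counts["above_11"] += 1
    return counts


def out_of_spec(values, lsl=LSL, usl=USL):
    """Return (position, value) for each piece outside the spec limits."""
    return [(i, v) for i, v in enumerate(values) if v < lsl or v > usl]


def imr_limits(values):
    """Lower limit, centre line and upper limit for an individuals (I) chart."""
    x_bar = mean(values)
    moving_ranges = [abs(b - a) for a, b in zip(values, values[1:])]
    mr_bar = mean(moving_ranges)
    # 2.66 = 3 / d2, with d2 = 1.128 for moving ranges of size 2.
    return x_bar - 2.66 * mr_bar, x_bar, x_bar + 2.66 * mr_bar


print(discrete_summary(diameters))
print(out_of_spec(diameters))
print(imr_limits(diameters))
```

The discrete summary only says that two pieces were defective. The continuous data also shows which pieces they were, how far off they were, and where the process mean sits relative to the 10 mm target.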
Example 2: (Discrete over Continuous data)
Let's say 20 employees working in the ABC process are monitored for shift adherence. Their login times are collected against the target shift start time and plotted on a time series chart as continuous data to find the defect %. That view is useful for RCA, but it is not meaningful when all we need is to count the defects and report how many were late and how many were on time. For this type of data, although it is collected from a real-time scenario and has continuous characteristics, it is more meaningful to present the number of late logins and the shift adherence % to management as discrete data. Measures such as average delivery time, processing time and login time are continuous data, but for reporting purposes it is useful to represent them as discrete data.
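The conversion described in Example 2 can be sketched in Python as follows. The 09:00 shift start, the employee IDs and the login times are all assumptions made for illustration, and so is the choice to count early logins as adherent; none of these come from a real process.

```python
from datetime import time

# Assumed shift start time; the example does not state one.
SHIFT_START = time(9, 0)

# Hypothetical login times (continuous data) for a few employees.
logins = {
    "emp01": time(8, 52),
    "emp02": time(9, 0),
    "emp03": time(9, 7),
    "emp04": time(8, 45),
    "emp05": time(9, 15),
}


def classify(login, shift_start=SHIFT_START):
    """Convert a continuous login time into a discrete category."""
    if login < shift_start:
        return "Early"
    if login == shift_start:
        return "On time"
    return "Late"


def adherence_report(login_times, shift_start=SHIFT_START):
    """Return category counts and shift adherence % for management reporting."""
    counts = {"Early": 0, "On time": 0, "Late": 0}
    for login in login_times.values():
        counts[classify(login, shift_start)] += 1
    # Assumption: early and on-time logins both count as adherent.
    adherent = counts["Early"] + counts["On time"]
    return counts, 100.0 * adherent / len(login_times)


counts, adherence_pct = adherence_report(logins)
print(counts, adherence_pct)
```

The raw login times stay available for RCA, while the report to management carries only the late count and the adherence %.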
Examples where discrete data is preferred over continuous data:
| Example | Continuous data | Discrete data |
| --- | --- | --- |
| Shift adherence | Login time is noted for all employees. | For reporting purposes, the continuous data is converted to discrete categories (late, early, on time) and presented for meaningful decisions. |
| Minimum balance of 1000 in bank account | The balance is collected for all account holders. | Classified as "Maintained / Not maintained" and reported as discrete. |
| Car fuel gauge | How many litres remain in the fuel tank. | Gauge indicates "Full, Half, Empty". |
| Height of the child in school records | Height is noted for every child and compared against the growth chart for the child's age. | How many are below or above the expected height for their age? Counted from the collected data and presented at a high level. |
Conclusion:
Does this mean attribute data alone is good enough? Of course not. Both play different roles. For decision making and root cause analysis, continuous data is more meaningful, but for high-level reporting, discrete data is often better. So the answer is that it depends on the underlying characteristic we want to measure, collect and represent. If the data is continuous, you have the choice of reporting it as continuous, discrete, or both.
Thanks
Kavitha