  2. The chosen best answer is that of Natwar Lal, for listing multiple scenarios in which causality exists but X and Y may still not show strong correlation. Amlan Dutt's answer is a must-read for good insight into the topic. The Benchmark expert view is provided by Venugopal R.
  Q. 169 How can you check if you have taken enough samples for carrying out a Regression Analysis?
  5. Benchmark Six Sigma Expert View by Venugopal R

One of the important tasks that most of us encounter while working on improvement projects is establishing controls to sustain our gains. In this context, it is important not only to identify the cause-effect relationship relevant to our problem, but also to prove it and implement sustenance measures. Once a cause-effect relationship is established and we have proven the relationship between two variables, we would certainly like to express the association in the best possible manner. To examine whether an established cause-effect relationship must necessarily exhibit strong correlation, let’s look at some examples.

Correlations that remain valid within a range: Let’s take the example of a compression-moulded component. It was proven that the poor hardness of the moulded component was caused by a low temperature setting. Once the temperature setting was increased, with other parameters maintained, the required hardness was attained. Both the dependent and independent variables are continuous. If a study measures the hardness levels against various temperature settings, we can certainly expect to see a positive correlation. However, this correlation may not continue beyond a certain temperature. The correlation between the cause and the effect is valid only within a certain range of the cause variable, and there would be an optimal value of the cause variable.

Discrete causal variable: Let’s take the example of vehicle fuel mileage. Based on studies, it was established that the type of spark plug used was an important cause of the vehicle's mileage. In this case we have 3 different types of spark plug to choose from, which makes the causal variable discrete. In a strict sense, we may not be able to establish a correlation between the proven cause and the effect, since we do not have paired sets of variable data from which to derive one. However, those interested in deeper research may identify a variable factor within the spark plug that causes the difference and try to establish its correlation with the effect.

Discrete variables for both cause and effect: Let us take another example, where a login account is not opening and the cause is identified as the use of a wrong passcode. Once the right passcode is used, the login works. The cause and the effect are both discrete variables. Is there a way to establish a ‘correlation’?

Continuous causal variable and discrete effect: Let us consider a case where the input (causal) variable is continuous and the output (effect) variable is discrete. Consider a drop test for a packed piece of hardware equipment, where the input variable is the drop height and the output variable is “whether the equipment is damaged or not”. It may not be possible to derive a correlation directly. However, if we can perform multiple tests at each drop height, then the proportion of products damaged at different drop heights, within a certain range, could show a correlation. Given the destructive nature of such tests, this may be expensive in practice.

To sum up, a proven cause-effect relationship establishes an association between the two variables, dependent and independent. Correlation is one of the tools for depicting this association, but it may not be the best tool in every situation. Other tools such as tests of hypothesis, ANOVA and logistic regression may be more appropriate, depending on the types of data.
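
Two of the scenarios above, the range-limited correlation and the drop test, can be illustrated with a small numerical sketch. Everything in it is an assumption made for illustration: the `hardness` response curve, the 180 degree optimum, and the drop-test counts are invented, not measured data, and `pearson_r` is just a plain implementation of the usual Pearson coefficient.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient for two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


# Case 1: correlation that holds only within a range.
# ASSUMPTION: made-up hardness response that peaks at 180 degrees C.
def hardness(temp_c):
    return 80 - 0.02 * (temp_c - 180) ** 2


temps_up_to_optimum = list(range(140, 181, 5))  # 140 .. 180
temps_past_optimum = list(range(140, 221, 5))   # 140 .. 220

r_within = pearson_r(temps_up_to_optimum,
                     [hardness(t) for t in temps_up_to_optimum])
r_beyond = pearson_r(temps_past_optimum,
                     [hardness(t) for t in temps_past_optimum])

# Case 2: continuous cause (drop height), discrete effect (damaged or not).
# ASSUMPTION: invented counts, 10 units dropped at each height.
drop_heights_m = [0.5, 1.0, 1.5, 2.0, 2.5]
units_tested = [10, 10, 10, 10, 10]
units_damaged = [0, 1, 3, 6, 9]

# Repeating the test at each height turns the yes/no outcome into a proportion.
damage_proportion = [d / n for d, n in zip(units_damaged, units_tested)]
r_drop = pearson_r(drop_heights_m, damage_proportion)

print(f"hardness vs temperature, up to optimum: r = {r_within:.2f}")
print(f"hardness vs temperature, past optimum:  r = {r_beyond:.2f}")
print(f"damage proportion vs drop height:       r = {r_drop:.2f}")
```

With these made-up numbers, the correlation is strongly positive up to the optimum and close to zero over the wider range, even though temperature drives hardness throughout. For the drop test, the correlation appears only after the damaged/not-damaged results are converted into a proportion per height. With real pass/fail data, logistic regression would usually be the more appropriate tool.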
