June 19, 2019
What are the key differences between Multiple Regression using historical data and Multiple Regression based on Experimental Data (DOE)? What are the advantages of one over the other, if at all?
June 19, 2019
Regression using historical data will show correlation in the data; it does not necessarily control for each variable independently. Regression via DOE will show causation in the data: parameters are analyzed to show their impact on the final result, both independently and in interaction with other variables.
June 19, 2019
When using historical data, the inputs are not controlled but merely observed. This allows the identification of correlation and the ability to make inferences and predictions, but lacks the ability to identify true cause-and-effect relationships. When using experimental data, the inputs are controlled. This allows the identification of true cause-and-effect relationships and optimization of the output(s) of interest.
June 19, 2019
What are the key differences between Multiple Regression using historical data and Multiple Regression based on Experimental Data (DOE)?

Using historical data:
- you can see how the actual data fits the regression line
- you can see outliers and unusual observations to investigate, or collect more data
- you can confirm whether the residuals are random and follow a normal distribution

DOE:
1. models a regression line based upon experimental data, so it may not reflect all the influences (noise, environmental, and control factors) present in real data
2. models are only as good as the SMEs/teams that provide insight into the scope, factors, boundaries, and interactions

What are the advantages of one over the other, if at all?

Using DOE, you can model and understand interactions with fewer runs/replications. You can screen for critical X's, then evaluate interactions, and obtain a model equation that covers more factors and levels more efficiently and effectively.
June 19, 2019
Differences between multiple regression using historical data vs. multiple regression using experimental data (DOE):

Data capture: Analysis using historical data may be more representative of the environment (no Hawthorne effect), but such data may be captured in a less controlled and structured manner (levels of factors may not be meticulously set and recorded accurately).

Modeling & analysis: No guarantee that historical data includes all the factors and combinations of their levels to allow statistically valid conclusions to be drawn. DOEs can be designed for specific purposes (e.g., screening, optimization) that allow models to be constructed and analyses to be performed in the most economical manner (i.e., in the fewest runs) and yield models
June 19, 2019
The key difference between DOE and historical data regression is that DOE involves control of the factors that generate changes in the responses. DOE allows you to change elements to maximize the output and helps you understand causation, whereas historical data provides a much less powerful solution set because it only shows correlations.
June 19, 2019
Multiple regression using historical data is useful for:
A1) Assessing correlation
A2) Creating a model to predict an output (Y) from input variables (X's)

DOE is useful for:
B1) Optimizing designs and the specific experiments necessary and sufficient to validate designs
B2) Screening/downselecting design options
B3) Provides a process to design experiments including the resolution level and number of runs for a set of factors
B4) Includes terms based on cross-products of the factors that can be assessed for significance for modeling
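To illustrate B3 and B4, here is a minimal Python sketch of a two-level, two-factor full factorial in coded units (-1, +1), including the A*B cross-product term. The factor names `A` and `B`, the helper names, and the noise-free `response` function are all hypothetical and chosen only for illustration. It relies on the fact that the model columns of such a design are mutually orthogonal, so each least-squares coefficient reduces to sum(x*y)/N.

```python
from itertools import product


def full_factorial(k):
    """Return all 2**k runs of a two-level design in coded units (-1, +1)."""
    return [tuple(run) for run in product((-1, 1), repeat=k)]


def model_columns(run):
    """Intercept, main effects A and B, and the A*B cross-product term."""
    a, b = run
    return {"1": 1, "A": a, "B": b, "AB": a * b}


def fit_orthogonal(runs, y):
    """Least-squares coefficients for an orthogonal +/-1 design.

    Because the columns are mutually orthogonal and each has N entries
    of magnitude 1, every coefficient is simply sum(x * y) / N.
    """
    n = len(runs)
    cols = [model_columns(r) for r in runs]
    return {
        name: sum(c[name] * yi for c, yi in zip(cols, y)) / n
        for name in cols[0]
    }


def response(a, b):
    """Hypothetical noise-free process: y = 10 + 2A - 3B + 1.5AB."""
    return 10 + 2 * a - 3 * b + 1.5 * a * b


runs = full_factorial(2)            # 4 runs cover every level combination
y = [response(*r) for r in runs]    # "run the experiment"
coef = fit_orthogonal(runs, y)
print(coef)
```

With only four runs, the fit recovers the intercept, both main effects, and the interaction separately. With real, noisy data you would add replicates and test each term for significance rather than read the coefficients directly.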
June 19, 2019
When using historical data for regression analysis, the variables cannot be controlled. Regression with historical data merely tells you the mathematical relationship between variables. Bias could be unknowingly introduced by using data that was simply available. For example, two variables may always have been changed together, so their individual effects cannot be told apart.

When using DOE data to run a regression, the variables are controlled and the factors are kept independent of one another. Interactions can be observed and understood. Parameters can be changed and tests can be run to verify the optimization of the model. Noise can also be accounted for, which historical data does not allow.

Overall, DOE is the preferred method, but it may not always be practical to perform due to cost, time, or issues with data collection. The benefits include understanding how the variables interact with each other and how sensitive the response is to those interactions. However, if the goal is to understand the correlation between variables (and not the interactions), historical data may be acceptable.
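The point about two variables always being changed together can be sketched in a few lines of Python. The temperature and pressure values below are made up for illustration, and `pearson` is a small hand-written helper, not a library call. In the hypothetical historical log the two inputs are perfectly correlated, so a regression cannot attribute an effect to one rather than the other; in a 2x2 full factorial over the same ranges they are uncorrelated.

```python
from itertools import product
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


# Hypothetical historical log: pressure was always raised with temperature.
hist_temp = [150, 160, 170, 180, 190, 200]
hist_pres = [30, 32, 34, 36, 38, 40]

# Designed experiment: every combination of the low and high settings.
doe_runs = list(product((150, 200), (30, 40)))
doe_temp = [t for t, _ in doe_runs]
doe_pres = [p for _, p in doe_runs]

r_hist = pearson(hist_temp, hist_pres)   # inputs move in lockstep
r_doe = pearson(doe_temp, doe_pres)      # inputs varied independently

print(f"historical: r = {r_hist:.2f}")
print(f"designed:   r = {r_doe:.2f}")
```

Checking the correlation between inputs (or variance inflation factors) is a reasonable first step before trusting coefficients from historical data.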