Benchmark Six Sigma Forum

Message added by Mayank Gupta,

Autocorrelation (or serial correlation) is the phenomenon where adjacent observations are correlated, i.e. the previous value has an impact on the next value. This causes problems in statistical analyses such as regression and time series analysis, because the assumption of independent values is violated.

 

An application-oriented question on the topic along with responses can be seen below. The best answer was provided by Rahul Arora on 20th Aug 2022.

 

Applause for all the respondents - Rahul Arora, Chandra Shekhar Chauhan, Hirak Raval, Rohit Chaudhary, Soji Sam.

Featured Replies

Q 497. What is autocorrelation in regression analysis? Why is it a problem and how can a project leader deal with it? 

 

Note for website visitors - Two questions are asked every week on this platform. One on Tuesday and the other on Friday.

Solved by Rahul.Arora2

  • Solution

 

Regression analysis is one of the most common forecasting methods. One of its most critical assumptions is that the error terms are independent or random, i.e. they are not correlated. However, in most business scenarios these error terms tend to be correlated. This correlation among the error terms of a regression forecasting model is termed autocorrelation or serial correlation.

 

[Image: screenshot of error-term plots]

 
The visual above shows that correlated error terms form an underlying pattern, which indicates autocorrelation.
 
There are two common scenarios of autocorrelation: positive autocorrelation and negative autocorrelation.
Positive autocorrelation exists when positive errors are associated with positive errors of comparable magnitude, and negative errors are associated with negative errors of comparable magnitude.
Negative autocorrelation exists when positive errors are associated with negative errors of comparable magnitude, and negative errors are associated with positive errors of comparable magnitude.
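
To make the two scenarios concrete, here is a minimal simulation sketch. It assumes the errors follow a first-order autoregressive process with a coefficient `rho`; the function names, the value of `rho`, and the sample size are illustrative choices, not part of the original answer. It measures how often two consecutive errors share the same sign.

```python
import random

def simulate_ar1_errors(rho, n, seed=0):
    """Generate n errors where each error is rho times the previous one plus noise."""
    rng = random.Random(seed)
    errors = [rng.gauss(0.0, 1.0)]
    for _ in range(n - 1):
        errors.append(rho * errors[-1] + rng.gauss(0.0, 1.0))
    return errors

def same_sign_share(errors):
    """Fraction of consecutive error pairs that have the same sign."""
    pairs = list(zip(errors[:-1], errors[1:]))
    same = sum(1 for previous, current in pairs if previous * current > 0)
    return same / len(pairs)

# Hypothetical settings: rho = 0.8 for positive, rho = -0.8 for negative autocorrelation
share_pos = same_sign_share(simulate_ar1_errors(rho=0.8, n=2000))
share_neg = same_sign_share(simulate_ar1_errors(rho=-0.8, n=2000))
print(f"same-sign share, positive autocorrelation: {share_pos:.2f}")
print(f"same-sign share, negative autocorrelation: {share_neg:.2f}")
```

With independent errors this share would sit near 0.5; positively autocorrelated errors push it well above that, and negatively autocorrelated errors push it well below.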
 
There are several possible problems that can arise due to autocorrelation:-
 
  • The estimates of the regression coefficients will become inefficient as they will no longer have the minimum variance property.
  • The variance of the error terms will be underestimated by the mean square error value.
  • The true standard deviation of the estimated regression coefficient will also be underestimated.
  • The confidence intervals & the tests using the t & F distribution will no longer be strictly applicable.
 
One of the most common ways to test whether autocorrelation is present in a regression model is the Durbin-Watson test, whose statistic is calculated using the equation below:-
 
$$ DW = \frac{\sum_{t=2}^{n} (e_t - e_{t-1})^2}{\sum_{t=1}^{n} e_t^2} $$
where e_t is the error (residual) for time period t and n is the number of observations.
 
The Durbin-Watson test involves finding the difference between successive values of the error, i.e. (e_t - e_{t-1}), and it formulates the hypotheses below:-
 
H0 : ρ = 0 (There is no autocorrelation)
Ha : ρ ≠ 0 (There is autocorrelation)
 
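The Durbin-Watson statistic ranges from 0 to 4 and is compared against two critical values, dL (lower) and dU (upper), taken from Durbin-Watson tables for the given sample size and number of predictors. When testing for positive autocorrelation: if DW > dU, we fail to reject H0 and conclude there is no evidence of autocorrelation; if DW < dL, we reject H0 and conclude that autocorrelation is present; if DW lies between dL and dU, the test is inconclusive. To test for negative autocorrelation, the same comparison is applied to 4 - DW.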
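
As a rough illustration of the statistic itself, the sketch below computes it directly from a list of residuals as the sum of squared successive differences divided by the sum of squared residuals. The function name and both residual lists are hypothetical; the critical values dL and dU still have to be looked up in a Durbin-Watson table and are not computed here.

```python
def durbin_watson(residuals):
    """Sum of squared successive differences divided by the sum of squared residuals."""
    if len(residuals) < 2:
        raise ValueError("need at least two residuals")
    numerator = sum(
        (residuals[t] - residuals[t - 1]) ** 2 for t in range(1, len(residuals))
    )
    denominator = sum(e ** 2 for e in residuals)
    if denominator == 0:
        raise ValueError("residuals are all zero")
    return numerator / denominator

# Hypothetical residuals that flip sign every period (negative autocorrelation)
alternating = [1.0, -1.0, 1.0, -1.0, 1.0, -1.0]
# Hypothetical residuals that stay on one side for several periods (positive autocorrelation)
runs = [1.0, 1.0, 1.0, -1.0, -1.0, -1.0]

dw_alternating = durbin_watson(alternating)
dw_runs = durbin_watson(runs)
print(round(dw_alternating, 3), round(dw_runs, 3))
```

The alternating residuals give a value above 2 and the residuals with long runs give a value below 2, matching the direction of the test described above.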
 
Several approaches are leveraged in order to overcome the autocorrelation problem. Some of these are:-
 
  • Adding independent variables. One of the most common reasons autocorrelation exists in a regression forecasting model is that one or more important independent (predictor) variables have not been included in the analysis. For example, a model that predicts the sales of new homes might contain autocorrelation, and the exclusion of the variable “mortgage interest rate” might be a contributing factor; adding this variable to the model might reduce the autocorrelation significantly.
  • Transforming the variables can also significantly reduce autocorrelation. One such method is the first differences approach, which involves subtracting each value of the independent variable X from the value of X in the succeeding time period. These differences become the transformed X variable, and the same process is used to obtain the transformed Y. The regression analysis is then conducted on the transformed X & Y variables to compute a revised model in which the autocorrelation is reduced or removed.
  • Another way is to compute the percentage changes from period to period and regress these new variables.
  • Another important approach is to use autoregression models, which exploit the relationship of the values Yt to previous period values, i.e. Yt-1, Yt-2 etc. Here the independent variables are time-lagged versions of the dependent variable, and the model is represented as Ŷt = b0 + b1·Yt-1 + b2·Yt-2 + …
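
The first differences approach from the list above can be sketched in a few lines. This is only an illustration under simple assumptions: one predictor, ordinary least squares written out by hand, and made-up data in which Y is exactly 2X + 1. The helper names `first_differences` and `fit_line` are hypothetical.

```python
def first_differences(values):
    """Subtract each value from the value in the succeeding time period."""
    return [values[t] - values[t - 1] for t in range(1, len(values))]

def fit_line(x, y):
    """Ordinary least squares for y = intercept + slope * x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope

# Hypothetical time-ordered data (y = 2x + 1, no noise, for illustration only)
x = [1.0, 2.0, 4.0, 7.0, 8.0, 12.0]
y = [3.0, 5.0, 9.0, 15.0, 17.0, 25.0]

dx = first_differences(x)  # transformed X
dy = first_differences(y)  # transformed Y
intercept, slope = fit_line(dx, dy)
print(dx, dy, round(slope, 3), round(intercept, 3))
```

Differencing removes the constant term, so the intercept of the transformed model is expected to be close to zero while the slope keeps its meaning. One observation is lost because the first period has no predecessor.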
 
 
 
 

Autocorrelation in regression analysis:

 

Autocorrelation mostly refers to the degree of correlation of the same variable or observation points between two successive time intervals. It measures how the lagged version of a variable's value is related to its original version in a given time period.

Autocorrelation is also known as serial correlation.

The value of autocorrelation ranges from -1 to +1. A value between -1 and 0 represents negative autocorrelation, and a value between 0 and 1 represents positive autocorrelation.

Autocorrelation gives information about the trend of a set of historical data, so it can be useful in technical analysis for the stock market.


 

Causes of Autocorrelation

  • Time to adjust: this often occurs in macro and time series data
  • Prolonged influences: this is again a macro, time series issue dealing with economic changes
  • Data verification and manipulation: using functions to smooth data will bring autocorrelation into the disturbance terms
  • Misspecification of the model

 

Why is it a problem? 

When autocorrelation is detected in the residuals from a model, it suggests that the model is misspecified. A common cause is that some key variables are missing from the model.

How to deal with autocorrelation? 

Autocorrelation functions are used for model criticism: they test whether there is structure left in the residuals. An important prerequisite is that the data are correctly ordered before running the regression model. If there is structure in the residuals of a model, an AR1 model can be included to reduce the effects of this autocorrelation.

There are mainly two methods to reduce autocorrelation, of which the first one is most commonly used:

  1. Improve model fit: try to capture the structure in the data in the model.
  2. If no more predictors can be added, include an AR1 model. By including an AR1 model, the model takes into account the structure in the residuals and reduces the confidence in the predictors accordingly.
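
For readers who want the AR1 idea written out, one common way to express a regression with first-order autoregressive errors is shown below. The symbols (a single predictor x_t, coefficients β0 and β1, autocorrelation parameter ρ, and an independent noise term u_t) are assumptions introduced here for illustration and do not come from the answer above.

```latex
\begin{aligned}
y_t &= \beta_0 + \beta_1 x_t + \varepsilon_t \\
\varepsilon_t &= \rho\,\varepsilon_{t-1} + u_t, \qquad |\rho| < 1
\end{aligned}
```

Here the error in period t carries over a fraction ρ of the previous period's error, and only the remaining part u_t is assumed to be independent from one period to the next.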

Autocorrelation in regression is the degree of correlation of the same variable between two successive time intervals. It is also known as serial correlation. It helps to find repeating periodic patterns, such as temperature on different days, suction pressure at different tank levels, or process parameters at different environmental conditions. Generally the Durbin-Watson statistic is used to test for autocorrelation. It ranges from 0 to 4: a value near 2 indicates little or no autocorrelation, values toward 0 indicate positive autocorrelation, and values toward 4 indicate negative autocorrelation. The autocorrelation coefficient itself ranges from -1 to 1, where 1 means strong positive correlation, -1 means strong negative correlation and 0 means no autocorrelation. It helps the project leader to understand whether the data collected is random or not. It also helps to assess model fit and whether data collection requires more detail and effort.

Autocorrelation is the degree of correlation of the same variable measured at two different time intervals. While it is useful in scenarios such as predicting weather or stock market prices, it also poses an issue for regression analysis.

 

It is an issue in regression analysis because the presence of autocorrelation in the residuals means that they are not independent over time. Hence we cannot rely on the standard errors, and therefore cannot rely on the p-values.

 

To help remove autocorrelation from a regression analysis, we could first use the Durbin-Watson test to identify the presence of autocorrelation and then look for any key variable missing from the analysis. Usually, autocorrelation occurs because a key variable is missing.

 

If this doesn’t fix the problem, there are a few transformation methods that could help, such as the Cochrane-Orcutt procedure, the Hildreth-Lu procedure and the first differences procedure.
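
As a hedged illustration of the Cochrane-Orcutt idea, the sketch below performs a single pass: fit the regression, estimate the lag-1 coefficient `rho` from the residuals, transform X and Y by subtracting `rho` times the previous value, and refit. It assumes one predictor and simulated data; all function names and numeric settings are hypothetical, and statistical packages typically iterate these steps until `rho` stabilises.

```python
import random

def fit_line(x, y):
    """Ordinary least squares for y = intercept + slope * x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope

def estimate_rho(residuals):
    """Slope of a no-intercept regression of each residual on the previous one."""
    num = sum(residuals[t] * residuals[t - 1] for t in range(1, len(residuals)))
    den = sum(residuals[t - 1] ** 2 for t in range(1, len(residuals)))
    return num / den

def quasi_difference(values, rho):
    """Transform a series to v_t - rho * v_(t-1); the first observation is dropped."""
    return [values[t] - rho * values[t - 1] for t in range(1, len(values))]

def cochrane_orcutt_step(x, y):
    """One pass: fit, estimate rho from residuals, transform, refit."""
    b0, b1 = fit_line(x, y)
    residuals = [yi - (b0 + b1 * xi) for xi, yi in zip(x, y)]
    rho = estimate_rho(residuals)
    a0, slope = fit_line(quasi_difference(x, rho), quasi_difference(y, rho))
    intercept = a0 / (1.0 - rho)  # transformed intercept equals b0 * (1 - rho)
    return rho, intercept, slope

# Hypothetical data: y = 1 + 2x plus errors that carry over 70% of the previous error
rng = random.Random(1)
x = [float(t) for t in range(200)]
errors = [0.0]
for _ in range(199):
    errors.append(0.7 * errors[-1] + rng.gauss(0.0, 1.0))
y = [1.0 + 2.0 * xi + e for xi, e in zip(x, errors)]

rho_hat, intercept_hat, slope_hat = cochrane_orcutt_step(x, y)
print(f"rho={rho_hat:.2f} intercept={intercept_hat:.2f} slope={slope_hat:.3f}")
```

Setting `rho` to exactly 1 in `quasi_difference` gives the first differences transformation. The Hildreth-Lu procedure uses the same transformation but searches over candidate values of `rho` instead of estimating it from the residuals.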

What is Autocorrelation?
The degree of correlation of the same variables between two successive time intervals is referred to as autocorrelation. It assesses how the lagged version of a variable's value compares to the original version in a time series.

Autocorrelation analysis aids in the discovery of repeating periodic patterns that can be used as a tool for technical analysis.

 

How does it work?
In many cases, the value of a variable at one point in time is related to its value at another. Autocorrelation analysis looks for patterns or trends in time series by measuring the relationship between observations at different points in time. Temperatures on different days of the month, for example, are autocorrelated.

 

Autocorrelation, like correlation, can be positive or negative. It ranges from -1 (perfect negative autocorrelation) to 1 (perfect positive autocorrelation). Positive autocorrelation indicates that an increase in one time interval tends to be accompanied by a proportionate increase in the lagged time interval.

 

The temperature example discussed above shows positive autocorrelation. The temperature on the following day tends to rise when it has been rising over the previous days and tends to fall when it has been falling.

The observations with positive autocorrelation can be drawn into a smooth curve. A regression line can be used to show that a positive error is followed by another positive error, and a negative error is followed by another negative error.

 

Negative autocorrelation, on the other hand, denotes that an increase observed in one time interval tends to be accompanied by a proportionate decrease in the lagged time interval. When the observations are plotted with a regression line, a positive error tends to be followed by a negative one and vice versa.

 

Autocorrelation can be applied to varying time gaps, which is known as lag. A lag 1 autocorrelation measures the correlation between observations separated by one time interval. A lag 30 autocorrelation, for example, should be used to learn the correlation between one day's temperatures and the corresponding day the following month (assuming 30 days in that month).
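
A lag-k autocorrelation can be computed with a few lines of code. The sketch below uses the usual sample formula (products of deviations from the overall mean, k periods apart, divided by the total sum of squared deviations). The function name and the toy series, which repeats every four periods, are illustrative assumptions rather than real temperature data.

```python
def autocorrelation(series, lag):
    """Sample autocorrelation of a series with itself shifted by `lag` periods."""
    n = len(series)
    if not 0 < lag < n:
        raise ValueError("lag must be between 1 and len(series) - 1")
    mean = sum(series) / n
    denominator = sum((value - mean) ** 2 for value in series)
    numerator = sum(
        (series[t] - mean) * (series[t - lag] - mean) for t in range(lag, n)
    )
    return numerator / denominator

# Hypothetical series with a cycle length of 4 periods
series = [0.0, 1.0, 0.0, -1.0] * 6

r1 = autocorrelation(series, lag=1)
r2 = autocorrelation(series, lag=2)
r4 = autocorrelation(series, lag=4)
print(round(r1, 3), round(r2, 3), round(r4, 3))
```

For this toy series the lag equal to the cycle length gives a strongly positive value and half the cycle length gives a strongly negative one, which is the kind of pattern a lag 30 check on daily temperatures would look for at a monthly scale.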

 


 

To test for autocorrelation, the Durbin-Watson statistic is commonly used. Statistical software can apply it to a data set. The Durbin-Watson test yields a score ranging from 0 to 4. A result close to 2 indicates a very low level of autocorrelation. A result closer to 0 indicates a stronger positive autocorrelation, while a result closer to 4 indicates a stronger negative autocorrelation.

 

When analyzing a set of historical data, it is necessary to test for autocorrelation. In the equity market, for example, stock prices on one day can be highly correlated with prices on another day. However, this correlation by itself provides little information for statistical data analysis and does not reveal the stock's actual performance.

 

As a result, testing for autocorrelation of historical prices is required to determine whether the price change is merely a pattern or caused by other factors. In finance, one common method for removing the impact of autocorrelation is to use percentage changes in asset prices rather than historical prices themselves.
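
The percentage change transformation mentioned above is simple to sketch. The function name and the price list are hypothetical and only show the arithmetic.

```python
def percent_changes(prices):
    """Period-to-period percentage change; the first period has no predecessor."""
    return [
        100.0 * (prices[t] - prices[t - 1]) / prices[t - 1]
        for t in range(1, len(prices))
    ]

# Hypothetical closing prices for four consecutive days
prices = [100.0, 110.0, 99.0, 99.0]
changes = percent_changes(prices)
print(changes)
```

The transformed series has one observation fewer than the price series, and it is this series of changes, rather than the raw prices, that would then be analysed or used in a regression.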

 

Although autocorrelation should be avoided in order to apply more accurate data analysis, it can still be useful in technical analysis because it searches for patterns in historical data. 

 

Through autocorrelation, a technical analyst can learn how the stock price of a given day is affected by the price of previous days. As a result, he can forecast how the price will move in the future.

 

If the price of a stock with strong positive autocorrelation has been rising for several days, the analyst can reasonably predict that the price will rise further in the coming days. To profit from the upward price movement, the analyst may buy and hold the stock for a short period of time.

 

The autocorrelation analysis only provides information about short-term trends and says little about a company's fundamentals. As a result, it can only be used to support trades with short holding periods.

 

The problem of autocorrelation in time series regression analysis is overcome by the addition of independent variables and data transformation.


Addition of Independent Variables: Autocorrelation is frequently caused by the exclusion of one or more significant predictor variables when performing a regression analysis. By including this variable in the regression model, the autocorrelation can be greatly reduced.

 

Data transformation: When adding extra variables is ineffective at reducing autocorrelation, data transformation may be used to address the issue. 

 

Autocorrelation violates the assumption of independence of residuals in regression and it adversely affects the regression analysis. It can be checked graphically by looking at the 'Residuals vs Data Order' graph or by using the Durbin-Watson statistic.

From all the published answers, there are 2 that stand out - Rahul Arora and Soji Sam. 

Rahul's answer has been selected as the best answer because he addressed how Autocorrelation affects Regression. Soji's answer just misses out as it does not answer how autocorrelation affects regression. Otherwise, it is a must read answer.
