A covariate is a confusing term in statistics. (References 1 & 2) For example, SPSS describes a continuous independent variable as a Covariate in its General Linear Models and as an Independent Variable in Regression. Minitab describes a continuous independent variable as a Covariate in its General Linear Model.
In research, one is looking for relationships between explanatory variables and the response variable.
Covariates are the variables that the researcher is not interested in but it affects the response variable.
Example 1
Which training methodology out of the three methodologies available is most effective in improving exam scores for the ICSE Exam?
The Explanatory Variable is the training methodologies and the exam score is the response variable.
Covariates, in this case, could be the current grade of the students, education level of parents, the income of parents, or studying abilities of the students within the three groups of the training methodologies. The variability on account of these covariates needs to be taken into consideration in order to see the correct relationship between the training methodology and the exam score.
For example, in this case, we could use the current grade of the students as a covariate since the current grade would most likely be highly correlated with the exam score.
Example 2.
What is the relationship between the Square Foot Area of a house and its price?
The Explanatory Variable is the Square Foot Area of the house and the price is the response variable.
Covariates, in this case, could be the age of the house, the location of the house, or even the distance from the bus stop. The variability on account of these covariates needs to be taken into consideration in order to see the correct relationship between the Square Foot Area of the house and the price.
For example, in this case, we could use the age of the house as a covariate since older houses are likely to be cheaper.
The most common settings for Covariates are ANOVA and regression.
ANOVA. Analysis of Variance is used to find if there is any statistical difference between three or more independent groups. However, in Example 1 above, we use the current grade as a covariate and would perform an ANCOVA (Analysis of Covariance). In this case, the students’ current grade, a continuous variable, is included as a covariate
Regression. In regression, we are attempting to quantify a relationship between one or more independent explanatory variables and the response variable. We run a Multiple Linear Regression by including Square Footage (Variable of Interest) and House Age (the covariate) as an explanatory variable and House Price as the response variable. The regression coefficient of the Square Footage would give the change in House Price per unit change in Square Footage after taking into account the House Age.
Thus, by adding House Age in the regression model in Example 2, the noise (variation on account of the covariate) is reduced. References
1. https://www.theanalysisfactor.com/confusing-statistical-terms-5-covariate/
2. https://www.theanalysisfactor.com/series-on-confusing-statistical-terms/
3. https://www.statology.org/covariate/