Correlation and Regression

When we deal with two variables, it may happen that one is related to the other: as the first variable changes from one value to another, the second changes correspondingly. The two variables are then said to be correlated. The correlation may arise either from a direct relationship between the two variables or from some underlying factor common to both. The degree of correlation is measured by the correlation coefficient.

So far we have studied characteristics of one variable only, for example, the mean of the distribution of height, the standard deviation of weight, or the skewness of the distribution of income. But many situations arise in which we have to study two variables simultaneously, say x and y. For example, the variables may be (i) the amount of rainfall and the yield of a certain crop, (ii) the height and weight of a group of children, (iii) the income and expenditure of several families, (iv) the ages of husband and wife, etc. There are two main problems involved in such studies:

Firstly, the data may reveal some association between x and y, and we may wish to measure numerically the strength of this association. Such a measure indicates how well a linear or other equation explains the relationship between the variables. This is the problem of Correlation.
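As a minimal sketch of what "measuring the strength of association" can look like in practice, the snippet below computes the product-moment (Pearson) correlation coefficient, which is the usual measure for a linear relationship. The function name pearson_r and the height and weight figures are assumptions made up for illustration; they do not come from any real data set.

```python
import math

def pearson_r(xs, ys):
    """Product-moment correlation coefficient of two equal-length samples."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two samples of equal length, at least 2 points")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sums of products and squares of deviations from the means
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined when a variable is constant")
    return sxy / math.sqrt(sxx * syy)

# Hypothetical data: heights (cm) and weights (kg) of five children
heights_cm = [110, 115, 120, 125, 130]
weights_kg = [19, 21, 22, 25, 27]
r = pearson_r(heights_cm, weights_kg)
print(round(r, 3))
```

The coefficient always lies between -1 and +1. For the made-up figures above it comes out close to +1, reflecting a strong positive linear association; a value near 0 would indicate little or no linear association.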

Secondly, there may be one variable of particular interest, while the other, regarded as an auxiliary variable, is studied for whatever light it can throw on the former. In such a case, we are interested in using a mathematical equation to make estimates or predictions of the principal variable. This equation is known as a Regression Equation, and the problem of making predictions on the basis of the equation is called the problem of Regression.
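To illustrate the idea of a regression equation, here is a rough sketch that fits a straight line y = a + bx by the method of least squares and then uses it to predict y for a new value of x. Least squares is assumed here as the fitting method, since the text above does not name one; the function name fit_line and the rainfall and yield figures are likewise hypothetical.

```python
def fit_line(xs, ys):
    """Least-squares straight line y = intercept + slope * x."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two samples of equal length, at least 2 points")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("cannot fit a line when x is constant")
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x  # the line passes through the means
    return slope, intercept

# Hypothetical data: rainfall (mm) as the auxiliary variable,
# crop yield (tonnes per hectare) as the variable of interest
rainfall_mm = [50, 60, 70, 80, 90]
crop_yield = [2.0, 2.4, 2.9, 3.1, 3.6]

slope, intercept = fit_line(rainfall_mm, crop_yield)
predicted_yield = intercept + slope * 75  # estimate for 75 mm of rainfall
print(slope, intercept, predicted_yield)
```

Here rainfall plays the role of the auxiliary variable and yield the principal one; swapping the roles would in general give a different line, which is why the choice of the variable to be predicted matters in regression.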

In short, correlation is concerned with the measurement of the ‘strength of association’ between variables; while regression is concerned with the ‘prediction’ of the most likely value of one variable when the value of the other variable is known.

In simple correlation and simple regression (also called linear correlation and linear regression), we consider the simplest kind of relationship, namely a linear one, so that the regression equation is the equation of a straight line. Simple correlation is, therefore, concerned with the strength of the linear relationship between the variables.
