All you need to know about the Correlation Coefficient
Correlation is a measure of the strength of association between two numerical variables. Specifically, it describes the strength of the linear association between the two variables.
We denote the correlation coefficient as R.
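The article does not spell out a formula, so here is a minimal sketch, assuming R refers to the Pearson sample correlation coefficient (the usual choice for linear association). The helper name `pearson_r` and the data are made up for illustration.

```python
from math import sqrt

def pearson_r(x, y):
    """Pearson sample correlation of two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sum of products of deviations from the means (how x and y vary together).
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    # Sums of squared deviations (how much each variable varies on its own).
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

# Made-up illustrative data.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
r = pearson_r(x, y)
print(round(r, 4))  # 0.7746
```

Dividing by the square root of the two sums of squares is what keeps the result between -1 and 1.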
Some properties of the correlation coefficient:
1. The magnitude, or absolute value, of the correlation coefficient measures the strength of the linear association between two variables
In these scatterplots, the strength of the association goes from fairly strong to very weak. The higher the magnitude, the stronger the association.
2. The sign of the correlation coefficient indicates the direction of association
In this figure, R is positive in the first plot, which means that one variable tends to increase as the other increases. In the other plot R is negative, which means that one variable tends to decrease as the other increases.
3. The correlation coefficient is always between -1 (perfect negative linear association) and 1 (perfect positive linear association). Also, R = 0 indicates that there is no linear relationship: a change in y is not accompanied by any consistent linear change in x.
4. The correlation coefficient is unitless and is not affected by changes in the center or scale of either variable (such as unit conversions)
In the above plot, the R value remains the same whether the body weight is measured in lbs or kg.
5. The correlation of X with Y is the same as that of Y with X
Swapping the axes does not affect the correlation coefficient.
6. The correlation coefficient is sensitive to outliers
As you can see, even a single outlier in the second graph greatly affects the correlation coefficient.
I hope this article simplifies the correlation coefficient and gives a good understanding of what it is and how it is affected by the relationship between the variables.