The correlation coefficient is a statistical measure of the strength and direction of the relationship between the relative movements of two variables.
The values of the correlation coefficient range from -1.0 to 1.0.
A correlation of -1.0 shows a perfect negative correlation, while a correlation of 1.0 shows a perfect positive correlation.
A correlation of 0.0 shows no linear relationship between the movement of the two variables.
A calculated number greater than 1.0 or less than -1.0 indicates an error in the correlation measurement.
Types of Correlation Coefficients
There are several types of correlation coefficients, the most common of which are:
Pearson Correlation Coefficient:
This is the most widely used measure of correlation. It assesses the linear relationship between two continuous variables, like checking if two things increase or decrease together in a consistent way. It is sensitive to outliers, which can skew the results.
Spearman’s Rank Correlation Coefficient:
This non-parametric measure assesses how well the relationship between two variables can be described using a monotonic function. In simpler terms, it looks at whether things that are ranked high in one list are also ranked high in another. It is less sensitive to outliers compared to Pearson’s correlation.
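The rank-based idea can be sketched in a few lines: replace each value with its rank, then compute Pearson's correlation on the ranks. The helper names (`pearson`, `average_ranks`, `spearman`) and the sample data are illustrative assumptions, not part of any particular library.

```python
from math import sqrt

def pearson(x, y):
    """Pearson r computed from deviations about the means."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

def average_ranks(values):
    """Rank values from 1 to n; tied values share the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1  # sorted positions i..j share one rank
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks

def spearman(x, y):
    """Spearman's rho: Pearson r applied to the ranks of x and y."""
    return pearson(average_ranks(x), average_ranks(y))

# Made-up data: y always rises with x, but not along a straight line.
x = [1, 2, 3, 4, 5, 6]
y = [v ** 3 for v in x]

print("Spearman:", spearman(x, y))
print("Pearson: ", pearson(x, y))
```

Because the ranks of `x` and `y` agree exactly, Spearman's coefficient reaches 1 here, while Pearson's stays below 1 since the relationship is not linear.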
Kendall’s Tau:
This is another non-parametric measure used to assess the strength of association between two ranked variables. It’s like comparing two friends’ rankings of their favorite movies to see if they agree on which movies are better. Kendall’s Tau tends to be more reliable with small sample sizes and more robust when the data contain ties.
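One way to picture the movie comparison is to count pairs of movies that both friends order the same way (concordant) and pairs they order differently (discordant). The sketch below implements the simplest variant, usually called tau-a, which applies no correction for ties. The function name and the rankings are made up for illustration.

```python
from itertools import combinations

def kendall_tau_a(x, y):
    """Kendall's tau-a: (concordant - discordant) / number of pairs.

    This simple variant does not correct for ties.
    """
    concordant = discordant = 0
    for (x1, y1), (x2, y2) in combinations(zip(x, y), 2):
        product = (x1 - x2) * (y1 - y2)
        if product > 0:
            concordant += 1   # both lists order this pair the same way
        elif product < 0:
            discordant += 1   # the lists disagree on this pair
    n = len(x)
    return (concordant - discordant) / (n * (n - 1) / 2)

# Hypothetical rankings of the same five movies by two friends.
alice = [1, 2, 3, 4, 5]
bob = [2, 1, 3, 5, 4]

tau = kendall_tau_a(alice, bob)
print(tau)
```

Of the ten possible pairs, eight are concordant and two are discordant, so tau is (8 - 2) / 10 = 0.6.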
Point-Biserial Correlation Coefficient:
This is a special case of the Pearson correlation used when one variable is continuous (like height) and the other is dichotomous (binary, like yes/no). It checks if there’s a connection between the two, such as seeing if being tall is associated with liking basketball.
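To illustrate the "special case of Pearson" point, the sketch below computes the point-biserial coefficient from the two group means and checks that it matches Pearson's r when the binary variable is coded as 0 and 1. The heights and yes/no answers are invented, and the function names are assumptions for this example.

```python
from math import sqrt

def pearson(x, y):
    """Pearson r computed from deviations about the means."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

def point_biserial(values, flags):
    """(M1 - M0) / s * sqrt(p * q), where s is the population
    standard deviation of all values and p, q are the group shares."""
    group1 = [v for v, f in zip(values, flags) if f == 1]
    group0 = [v for v, f in zip(values, flags) if f == 0]
    n = len(values)
    mean = sum(values) / n
    s = sqrt(sum((v - mean) ** 2 for v in values) / n)
    m1 = sum(group1) / len(group1)
    m0 = sum(group0) / len(group0)
    return (m1 - m0) / s * sqrt(len(group1) * len(group0) / n ** 2)

# Made-up sample: height in cm, and whether the person likes basketball.
heights_cm = [160, 165, 170, 175, 180, 185, 190, 195]
likes_basketball = [0, 0, 1, 0, 1, 1, 0, 1]

r_pb = point_biserial(heights_cm, likes_basketball)
r = pearson(heights_cm, likes_basketball)
print(r_pb, r)  # the two values agree up to floating-point rounding
```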
Mathematical Formula
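For Pearson’s correlation coefficient, the formula is given by:
$$r = \frac{n\sum xy - \sum x \sum y}{\sqrt{\left[n\sum x^2 - \left(\sum x\right)^2\right]\left[n\sum y^2 - \left(\sum y\right)^2\right]}}$$
Where: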
- r: This represents the Pearson correlation coefficient, which quantifies the strength and direction of the linear relationship between two variables.
- n: The number of data pairs or observations.
- Σxy: This is the sum of the products of paired scores from two variables (x and y). You multiply each pair of x and y values together, then sum all those products.
- Σx: This is the sum of all x-values in the dataset.
- Σy: This is the sum of all y-values in the dataset.
- Σx²: This is the sum of the squares of each x-value. To compute it, square each x-value individually and then sum all those squares.
- Σy²: This is the sum of the squares of each y-value. To compute it, square each y-value individually and then sum all those squares.
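Using the sums defined above, Pearson's r can be computed directly. The sketch below uses the standard computational form, r = (nΣxy − ΣxΣy) / √([nΣx² − (Σx)²][nΣy² − (Σy)²]). The function name and the five data pairs are illustrative assumptions.

```python
from math import sqrt

def pearson_from_sums(x, y):
    """Pearson r from n, Σx, Σy, Σxy, Σx² and Σy²."""
    n = len(x)
    sum_x, sum_y = sum(x), sum(y)
    sum_xy = sum(a * b for a, b in zip(x, y))
    sum_x2 = sum(a * a for a in x)
    sum_y2 = sum(b * b for b in y)
    numerator = n * sum_xy - sum_x * sum_y
    denominator = sqrt((n * sum_x2 - sum_x ** 2) * (n * sum_y2 - sum_y ** 2))
    return numerator / denominator

# Made-up data pairs.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

r = pearson_from_sums(x, y)
print(round(r, 4))  # 0.7746
```

For this data, n = 5, Σx = 15, Σy = 20, Σxy = 66, Σx² = 55 and Σy² = 86, so r = 30 / √(50 × 30) ≈ 0.7746.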
Interpretation
- Strong positive correlation (0.7 ≤ r ≤ 1): As one variable increases, the other variable also increases.
- Moderate positive correlation (0.3 ≤ r < 0.7): As one variable increases, the other variable tends to increase.
- Weak positive correlation (0 < r < 0.3): An increase in one variable is only loosely associated with an increase in the other variable.
- No correlation (r ≈ 0): No linear relationship between the variables.
- Weak negative correlation (-0.3 < r < 0): An increase in one variable is only loosely associated with a decrease in the other variable.
- Moderate negative correlation (-0.7 < r ≤ -0.3): As one variable increases, the other variable tends to decrease.
- Strong negative correlation (-1 ≤ r ≤ -0.7): As one variable increases, the other variable decreases.
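The bands above can be turned into a small lookup, sketched below. The 0.3 and 0.7 cutoffs come from the list and are rules of thumb rather than fixed standards. Treating only an exact 0 as "no correlation" is a simplification made for this example, and the function name is hypothetical.

```python
def describe_correlation(r):
    """Map a correlation coefficient to the verbal bands listed above."""
    if not -1.0 <= r <= 1.0:
        raise ValueError("r must lie between -1 and 1")
    if r == 0:
        return "no correlation"  # simplification: exact zero only
    magnitude = abs(r)
    if magnitude >= 0.7:
        strength = "strong"
    elif magnitude >= 0.3:
        strength = "moderate"
    else:
        strength = "weak"
    direction = "positive" if r > 0 else "negative"
    return f"{strength} {direction} correlation"

for value in (0.85, 0.45, 0.1, 0, -0.3, -0.9):
    print(value, "->", describe_correlation(value))
```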
Applications of Correlation Coefficient
Correlation coefficients are widely used in various fields such as economics, finance, psychology, and the physical sciences.
In finance, for example, they are used to measure the correlation between the returns of different assets, which helps in portfolio diversification strategies.
In forex trading, correlation coefficients can be used to analyze the relationship between currency pairs, helping traders understand whether two pairs move together or in opposite directions.
Limitations of Correlation Coefficient
- Linear Relationships: The Pearson correlation coefficient only measures linear relationships, so it might not provide meaningful information about non-linear relationships.
- Sensitivity to Outliers: Pearson’s correlation coefficient is sensitive to outliers, which can distort the results.
- Causation: Correlation does not imply causation. Even if two variables are highly correlated, it does not mean that one variable causes the other to change.
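The first two limitations are easy to reproduce with small, made-up datasets. In the sketch below, a perfectly predictable but non-linear relationship produces r = 0, and a single extreme point flips a perfect negative correlation into a strongly positive one. The `pearson` helper is written out here only to keep the example self-contained.

```python
from math import sqrt

def pearson(x, y):
    """Pearson r computed from deviations about the means."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

# Non-linear relationship: y is fully determined by x, yet r is 0.
x = [-3, -2, -1, 0, 1, 2, 3]
y = [v ** 2 for v in x]
r_nonlinear = pearson(x, y)

# Outlier sensitivity: six points on a falling line, then one extreme point.
a = [1, 2, 3, 4, 5, 6]
b = [6, 5, 4, 3, 2, 1]
r_clean = pearson(a, b)
r_outlier = pearson(a + [30], b + [30])

print("non-linear:  ", r_nonlinear)
print("clean:       ", r_clean)
print("with outlier:", round(r_outlier, 3))
```

Plotting the data before relying on r is a common way to catch both situations.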
Correlation Coefficient Cheat Sheet
Here’s a cheat sheet that provides an overview of the different types of correlation coefficients:
Type of Correlation Coefficient | What It Measures | Example in Plain English | Sensitivity |
---|---|---|---|
Pearson Correlation Coefficient | The strength and direction of a straight-line (linear) relationship between two continuous variables. | Checking if two things increase or decrease together consistently. | Sensitive to outliers. |
Spearman’s Rank Correlation Coefficient | The consistency of the order (rank) of data points between two variables (non-parametric). | Seeing if things ranked high in one list are also ranked high in another. | Less sensitive to outliers than Pearson. |
Kendall’s Tau | The strength of association between two ranked variables, based on how consistently pairs of observations are ordered (non-parametric). | Comparing two friends’ rankings of their favorite movies to see if they agree. | Copes well with small samples and ties. |
Point-Biserial Correlation Coefficient | The relationship between a continuous variable and a binary (dichotomous) variable. | Checking if being tall is associated with liking basketball (yes/no). | Same sensitivity as Pearson. |
Phi Coefficient | The association between two binary variables. | Seeing if answering “yes” to liking pizza means also liking ice cream. | Outliers are not a concern, since both variables are binary. |
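The Phi coefficient in the last row can be computed from a 2×2 table of counts. The sketch below uses the usual formula, (ad − bc) / √((a+b)(c+d)(a+c)(b+d)). The function name and the survey counts are invented for illustration.

```python
from math import sqrt

def phi_coefficient(a, b, c, d):
    """Phi for the 2x2 table of counts [[a, b], [c, d]]."""
    return (a * d - b * c) / sqrt((a + b) * (c + d) * (a + c) * (b + d))

# Hypothetical survey counts.
# Rows: likes pizza (yes, no). Columns: likes ice cream (yes, no).
phi = phi_coefficient(30, 10, 10, 30)
print(phi)  # 0.5
```

With these counts, phi = (900 − 100) / √(40 × 40 × 40 × 40) = 800 / 1600 = 0.5, a moderate positive association.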