Published on May 13, 2022 by Shaun Turney. Revised on September 6, 2022.

The Pearson correlation coefficient (r) is the most common way of measuring a linear correlation. It is a number between –1 and 1 that measures the strength and direction of the relationship between two variables.

Between 0 and 1: positive correlation. When one variable changes, the other variable changes in the same direction. Example: The longer the baby, the heavier their weight.
0: no correlation. There is no relationship between the variables. Example: The price of a car is not related to the width of its windshield wipers.
Between 0 and –1: negative correlation. When one variable changes, the other variable changes in the opposite direction. Example: The higher the elevation, the lower the air pressure.

The Pearson correlation coefficient (r) is the most widely used correlation coefficient and is known by several names, including Pearson’s r and the Pearson product-moment correlation coefficient.

The Pearson correlation coefficient is a descriptive statistic, meaning that it summarizes the characteristics of a dataset. Specifically, it describes the strength and direction of the linear relationship between two quantitative variables.

Although interpretations of the relationship strength (also known as effect size) vary between disciplines, these are general rules of thumb:

Greater than .5: strong, positive
Between .3 and .5: moderate, positive
Between 0 and .3: weak, positive
0: none
Between 0 and –.3: weak, negative
Between –.3 and –.5: moderate, negative
Less than –.5: strong, negative

The Pearson correlation coefficient is also an inferential statistic, meaning that it can be used to test statistical hypotheses. Specifically, we can test whether there is a significant relationship between two variables.

Another way to think of the Pearson correlation coefficient (r) is as a measure of how close the observations are to a line of best fit.

The Pearson correlation coefficient also tells you whether the slope of the line of best fit is negative or positive. When the slope is negative, r is negative. When the slope is positive, r is positive.

When r is 1 or –1, all the points fall exactly on the line of best fit.
When r is greater than .5 or less than –.5, the points are close to the line of best fit.
When r is between 0 and .3 or between 0 and –.3, the points are far from the line of best fit.
When r is 0, a line of best fit is not helpful in describing the relationship between the variables.

When to use the Pearson correlation coefficient
The Pearson correlation coefficient (r) is one of several correlation coefficients that you need to choose between when you want to measure a correlation. The Pearson correlation coefficient is a good choice when all of the following are true:

Both variables are quantitative.
The variables are normally distributed.
The data have no outliers.
The relationship is linear.
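Pearson vs. Spearman’s rank correlation coefficients
Spearman’s rank correlation coefficient is another widely used correlation coefficient. It’s a better choice than the Pearson correlation coefficient when one or more of the following is true:

The variables are ordinal.
The variables aren’t normally distributed.
The data include outliers.
The relationship between the variables is non-linear and monotonic.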
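
One way to see the difference between the two coefficients: Spearman’s coefficient is the Pearson coefficient computed on the ranks of the data instead of the raw values. The Python sketch below illustrates this with made-up data (y = x cubed), a relationship that is monotonic but not linear. The function names `pearson_r`, `average_ranks`, and `spearman_rho` are chosen here for illustration and do not come from any particular library.

```python
from math import sqrt

def pearson_r(x, y):
    """Pearson correlation computed from raw sums."""
    n = len(x)
    sum_x, sum_y = sum(x), sum(y)
    sum_x2 = sum(v * v for v in x)
    sum_y2 = sum(v * v for v in y)
    sum_xy = sum(a * b for a, b in zip(x, y))
    numerator = n * sum_xy - sum_x * sum_y
    denominator = sqrt((n * sum_x2 - sum_x ** 2) * (n * sum_y2 - sum_y ** 2))
    return numerator / denominator

def average_ranks(values):
    """Rank values from 1 to n; tied values share the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1  # positions i..j are 0-based
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks

def spearman_rho(x, y):
    """Spearman's rank correlation: Pearson's r applied to the ranks."""
    return pearson_r(average_ranks(x), average_ranks(y))

# Hypothetical data: monotonic but clearly non-linear
x = [1, 2, 3, 4, 5]
y = [v ** 3 for v in x]

print("Pearson: ", round(pearson_r(x, y), 3))
print("Spearman:", spearman_rho(x, y))
```

On this data Spearman’s coefficient is 1 because the ranks agree perfectly, while Pearson’s r falls short of 1 because the points do not lie on a straight line.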
Calculating the Pearson correlation coefficient
Below is a formula for calculating the Pearson correlation coefficient (r):
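
For reference, a standard computational form of the formula is sketched here in LaTeX. It uses exactly the quantities built up in Steps 1 to 3 (the sums of x, y, x², y², and xy), with n assumed to be the number of paired observations.

```latex
r = \frac{n\sum xy - \left(\sum x\right)\left(\sum y\right)}
         {\sqrt{\left[\,n\sum x^{2} - \left(\sum x\right)^{2}\right]
                \left[\,n\sum y^{2} - \left(\sum y\right)^{2}\right]}}
```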
The formula is easy to use when you follow the step-by-step guide below. You can also use software such as R or Excel to calculate the Pearson correlation coefficient for you.

Example: Dataset
Imagine that you’re studying the relationship between newborns’ weight and length. You have the weights and lengths of the 10 babies born last month at your local hospital. After you convert the imperial measurements to metric, you enter the data in a table:

Weight (kg)    Length (cm)
3.63           53.1
3.02           49.7
3.82           48.4
3.42           54.2
3.59           54.9
2.87           43.7
3.03           47.2
3.46           45.2
3.36           54.4
3.30           50.4
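
As an alternative to R or Excel, the calculation can be sketched in a few lines of Python using only the standard library. The sketch uses the ten weight and length values from this example (units assumed to be kilograms and centimetres) and the standard sum-based form of the formula. It computes the same intermediate sums that the step-by-step guide works through by hand.

```python
from math import sqrt

# Weights and lengths of the 10 newborns in the example
weight = [3.63, 3.02, 3.82, 3.42, 3.59, 2.87, 3.03, 3.46, 3.36, 3.30]  # x
length = [53.1, 49.7, 48.4, 54.2, 54.9, 43.7, 47.2, 45.2, 54.4, 50.4]  # y

n = len(weight)

# Step 1: sums of x and y
sum_x = sum(weight)
sum_y = sum(length)

# Step 2: sums of the squares
sum_x2 = sum(x * x for x in weight)
sum_y2 = sum(y * y for y in length)

# Step 3: sum of the cross products
sum_xy = sum(x * y for x, y in zip(weight, length))

# Step 4: combine the sums into r
numerator = n * sum_xy - sum_x * sum_y
denominator = sqrt((n * sum_x2 - sum_x ** 2) * (n * sum_y2 - sum_y ** 2))
r = numerator / denominator

print(f"sum of x = {sum_x:.1f}, sum of y = {sum_y:.1f}")  # 33.5 and 501.2
print(f"r = {r:.2f}")                                     # r = 0.47
```

Because the code keeps the squares and cross products unrounded, its intermediate sums can differ in the last digit from sums of values that were rounded before adding.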
Step 1: Calculate the sums of x and y
Start by renaming the variables to “x” and “y.” It doesn’t matter which variable is called x and which is called y; the formula will give the same answer either way.

Next, add up the values of x and y. (In the formula, this step is indicated by the Σ symbol, which means “take the sum of.”)

Weight = x
Length = y

Σx = 3.63 + 3.02 + 3.82 + 3.42 + 3.59 + 2.87 + 3.03 + 3.46 + 3.36 + 3.30
Σx = 33.5

Σy = 53.1 + 49.7 + 48.4 + 54.2 + 54.9 + 43.7 + 47.2 + 45.2 + 54.4 + 50.4
Σy = 501.2

Step 2: Calculate x² and y² and their sums
Create two new columns that contain the squares of x and y. Take the sums of the new columns.

Example: Calculating x² and y² and their sums
Σx² = 13.18 + 9.12 + 14.59 + 11.70 + 12.89 + 8.24 + 9.18 + 11.97 + 11.29 + 10.89
Σx² = 113.05

Σy² = 2 819.6 + 2 470.1 + 2 342.6 + 2 937.6 + 3 014.0 + 1 909.7 + 2 227.8 + 2 043.0 + 2 959.4 + 2 540.2
Σy² = 25 264

Step 3: Calculate the cross product and its sum
In a final column, multiply together x and y (this is called the cross product). Take the sum of the new column.

Example: Calculating the cross product and its sum
Σxy = 192.8 + 150.1 + 184.9 + 185.4 + 197.1 + 125.4 + 143.0 + 156.4 + 182.8 + 166.3
Σxy = 1 684.2

Step 4: Calculate r
Use the formula and the numbers you calculated in the previous steps to find r.

Example: Calculating r
With n = 10 and the sums from Steps 1 to 3, the formula gives r = .47 when the unrounded squares and cross products are carried through the calculation.

Testing for the significance of the Pearson correlation coefficient
The Pearson correlation coefficient can also be used to test whether the relationship between two variables is significant.

The Pearson correlation of the sample is r. It is an estimate of rho (ρ), the Pearson correlation of the population. Knowing r and n (the sample size), we can infer whether ρ is significantly different from 0.

Null hypothesis (H0): ρ = 0
Alternative hypothesis (Ha): ρ ≠ 0
To test the hypotheses, you can either use software like R or Stata or you can follow the four steps below.

Step 1: Calculate the t value
Calculate the t value (a test statistic) using this formula:

t = r√(n – 2) / √(1 – r²)

Example: Calculating the t value
The weight and length of 10 newborns have a Pearson correlation coefficient of .47. Since we know that n = 10 and r = .47, we can calculate the t value:

t = .47 × √(10 – 2) / √(1 – .47²) = 1.506
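
The same calculation can be sketched in Python. It assumes the standard test statistic for a Pearson correlation, t = r√(n – 2) / √(1 – r²), which reproduces the t value of 1.506 used in Step 3. The function name `t_statistic` is chosen here for illustration.

```python
from math import sqrt

def t_statistic(r, n):
    """t value for testing whether a Pearson correlation differs from 0."""
    df = n - 2  # degrees of freedom for a correlation test
    return r * sqrt(df) / sqrt(1 - r ** 2)

t = t_statistic(r=0.47, n=10)
print(f"t = {t:.3f}")  # t = 1.506
```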
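Step 2: Find the critical value of t
You can find the critical value of t (t*) in a t table. To use the table, you need to know three things:

The degrees of freedom (df): for a Pearson correlation test, df = n – 2.
The significance level (α): by convention, usually .05.
Whether the test is one-tailed or two-tailed.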
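Step 3: Compare the t value to the critical value
Determine if the absolute t value is greater than the critical value of t. “Absolute” means that if the t value is negative you should ignore the minus sign.

Example: Comparing the t value to the critical value of t (t*)
t = 1.506
t* = 1.86
The t value is less than the critical value of t. (For df = 8, 1.86 is the tabled value at α = .05 for a one-tailed test; the two-tailed value is 2.306, which leads to the same conclusion.)

Step 4: Decide whether to reject the null hypothesis
If the absolute t value is greater than the critical value, reject the null hypothesis. Otherwise, do not reject it. In the example, the t value is less than the critical value, so the null hypothesis is not rejected: the relationship between newborns’ weight and length is not statistically significant.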
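
The decision rule can be written as a short Python sketch. It uses the t value and critical value from the example; the function name `is_significant` is hypothetical, chosen only for this illustration.

```python
def is_significant(t_value, critical_value):
    """Reject the null hypothesis only if |t| exceeds the critical value."""
    return abs(t_value) > critical_value

t_value = 1.506    # from Step 1
t_critical = 1.86  # t* from the example (df = 8)

if is_significant(t_value, t_critical):
    decision = "reject the null hypothesis"
else:
    decision = "fail to reject the null hypothesis"

print(decision)  # fail to reject the null hypothesis
```

Taking the absolute value means a strongly negative t value is treated the same way as a strongly positive one.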
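(Note that a sample size of 10 is very small. It’s possible that you would find a significant relationship if you increased the sample size.)

Reporting the Pearson correlation coefficient
If you decide to include a Pearson correlation (r) in your paper or thesis, you should report it in your results section. You can follow these rules if you want to report statistics in APA Style:

Italicize r when reporting its value.
Don’t include a leading zero (a zero before the decimal point), since the Pearson correlation coefficient can’t be greater than 1 or less than –1.
Report two digits after the decimal point.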
When Pearson’s correlation coefficient is used as an inferential statistic (to test whether the relationship is significant), r is reported alongside its degrees of freedom and p value. The degrees of freedom are reported in parentheses beside r.

Example: Reporting the Pearson correlation coefficient in APA Style
Newborns’ weight and length were moderately correlated, although the relationship was not statistically significant, r(8) = .47, p > .17.

Frequently asked questions about the Pearson correlation coefficient
What is Karl Pearson’s coefficient of correlation? What is its range?
Karl Pearson’s coefficient of correlation is a linear correlation coefficient that ranges from –1 to +1.

What is the value of the product
The Pearson product-moment correlation coefficient is a measure of the linear relationship between two questions, measures, or variables, X and Y. The correlation value can range from –1 to +1. A positive correlation (e.g., +0.32) means there is a positive relationship between X and Y.

Is 0.4 a strong Pearson correlation?
For this kind of data, we generally consider correlations above 0.4 to be relatively strong; correlations between 0.2 and 0.4 are moderate, and those below 0.2 are considered weak.

What is a good value for Pearson correlation?
High degree: If the coefficient value lies between ±0.50 and ±1, it is said to be a strong correlation.
Moderate degree: If the value lies between ±0.30 and ±0.49, it is said to be a medium correlation.
Low degree: If the value lies below ±0.29, it is said to be a small correlation.