How to Correct for Random Measurement Error in Correlations

I have a forthcoming article in Communication Research Reports (see my research page for the full citation), which examines the effects of receiver characteristics (including issue involvement, value-relevant involvement, and elaboration) and message features (including language intensity, author credibility, message quality, and message effectiveness) on perceptions of organizational credibility in public health advocacy messages.

This article stems from my thesis; at the time I wrote it, I understood the concept of measurement error but did not know how to correct for it in my analyses. This post is designed to (1) briefly explain random measurement error, (2) provide an example of how to correct for random measurement error by disattenuating the correlation matrix and confidence intervals from this article, and (3) address potential objections to these corrections for measurement error.

Note: The procedures to correct for measurement error in this post only work for random measurement error. If you have some sort of systematic error in your measures, these procedures are inappropriate.

Random Measurement Error

Suppose our goal is to estimate the true relationship between two latent variables, $X$ and $Y$, depicted below in the ovals. For example, one of the relationships I examined in my article mentioned above is the relationship between perceptions of message effectiveness and perceptions of organizational credibility. In order to examine these relationships, I used two scales, one for message effectiveness and one for organizational credibility, to measure and observe participant perceptions of these variables. These observed (i.e., manifest) variables are depicted below in the rectangles, as $\mathbf{x}$ and $\mathbf{y}$.

We are never able to perfectly measure a latent variable, because our measures will always contain random measurement error. This stems from classical test theory, which states that an observed score ($\mathbf{x}$) is a function of the true score ($T$) plus random error ($\epsilon$): $\mathbf{x} = T + \epsilon$. The latent $X$ and the measured $\mathbf{x}$ therefore do not correlate perfectly ($\rho_{X, \mathbf{x}} \neq 1$). Likewise, the latent $Y$ and the measured $\mathbf{y}$ do not correlate perfectly ($\rho_{Y, \mathbf{y}} \neq 1$). Note that $\rho$, the Greek letter rho, is used throughout to denote a population correlation.

So, to continue with the above example, my measures of perceived message effectiveness ($\mathbf{x}$) and organizational credibility ($\mathbf{y}$) do not perfectly measure, and therefore do not perfectly correlate, with the latent variables message effectiveness ($X$) and organizational credibility ($Y$).

The observed correlation between $\mathbf{x}$ and $\mathbf{y}$ ($\hat\rho_{\mathbf{x}, \mathbf{y}}$) will, on average, be smaller in magnitude than the true correlation ($\rho_{X, Y}$) because of measurement error. Note that a "hat" is used to denote an estimated value: $\rho_{X, Y}$ refers to the true correlation between latent $X$ and $Y$, and $\hat\rho_{\mathbf{x}, \mathbf{y}}$ is the estimated (observed) correlation given our measurements $\mathbf{x}$ and $\mathbf{y}$. Observed correlations are therefore described as "attenuated," because a relationship is attenuated when it is artificially diminished.

How to Correct for Random Measurement Error

In my article on organizational credibility, the observed correlation between message effectiveness and organizational credibility is $\hat\rho_{\mathbf{x}, \mathbf{y}} = .63$. The 95% confidence interval for $\hat\rho_{\mathbf{x}, \mathbf{y}}$ is calculated with the following formula:

$\hat\rho_{\mathbf{x}, \mathbf{y}} \pm 1.96(\text{se})$, where $\hat\rho_{\mathbf{x}, \mathbf{y}}$ is the estimated correlation, and "se" is the standard error (i.e., the standard deviation of the sampling distribution for $\hat\rho_{\mathbf{x}, \mathbf{y}}$)

The formula for the standard error is as follows:

$$\text{se}(\hat\rho_{\mathbf{x}, \mathbf{y}}) = \frac{1 - \hat\rho_{\mathbf{x}, \mathbf{y}}^2}{\sqrt{N - 1}}$$

So, for $\hat\rho_{\mathbf{x}, \mathbf{y}} = .63$ and $N = 218$ (the sample size from my experiment), the standard error is $\text{se}(\hat\rho_{\mathbf{x}, \mathbf{y}}) = .041$. The 95% confidence interval is then calculated as follows:

$$[.63 - 1.96(.041),\; .63 + 1.96(.041)] = [.55, .71]$$

Confidence intervals constructed in this way tell us that if we could repeatedly sample from the population of interest, and re-run the experiment each time, the interval would cover the true correlation, $\rho_{X, Y}$, 95% of the time.
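As a quick check, the confidence-interval arithmetic above can be reproduced in a few lines of Python (the `corr_ci` helper is mine, written for this post, not from any package):

```python
import math

def corr_ci(r, n, z=1.96):
    """95% CI for an observed correlation, using the large-sample
    standard error se = (1 - r^2) / sqrt(n - 1)."""
    se = (1 - r**2) / math.sqrt(n - 1)
    return r - z * se, r + z * se

lo, hi = corr_ci(0.63, 218)
print(round(lo, 2), round(hi, 2))  # 0.55 0.71
```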

As mentioned, this observed correlation ($\hat\rho_{\mathbf{x}, \mathbf{y}} = .63$) is attenuated because of measurement error. The following formula corrects for measurement error through disattenuation:

$$\hat\rho_{\mathbf{x}, \mathbf{y}}^{*} = \frac{\hat\rho_{\mathbf{x}, \mathbf{y}}}{\sqrt{\hat\rho_{\mathbf{x}, \mathbf{x}} \hat\rho_{\mathbf{y}, \mathbf{y}}}}$$

where $\hat\rho_{\mathbf{x}, \mathbf{y}}^{*}$ is the disattenuated estimate of the correlation between the latent variables, $\hat\rho_{\mathbf{x}, \mathbf{y}}$ is the observed correlation, $\hat\rho_{\mathbf{x}, \mathbf{x}}$ is the reliability of the scale for $\mathbf{x}$, and $\hat\rho_{\mathbf{y}, \mathbf{y}}$ is the reliability of the scale for $\mathbf{y}$.

So, for an observed correlation $\hat\rho_{\mathbf{x}, \mathbf{y}} = .63$, with scale reliabilities $\hat\rho_{\mathbf{x}, \mathbf{x}} = .89$ and $\hat\rho_{\mathbf{y}, \mathbf{y}} = .80$ (i.e., the Cronbach’s $\alpha$ for the message effectiveness scale was $.89$, and the Cronbach’s $\alpha$ for the organizational credibility scale was $.80$), the disattenuated correlation is:

$$\hat\rho_{\mathbf{x}, \mathbf{y}}^{*} = \frac{.63}{\sqrt{.89(.80)}} = .75$$

The observed correlation, scale reliabilities, and disattenuated correlation can be depicted as follows:

The confidence interval can also be corrected for measurement error by dividing the upper and lower bounds by the square root of the product of the scale reliabilities. Compare the uncorrected bounds, $[.55, .71]$, with the corrected bounds: $\left[\frac{.55}{\sqrt{.89(.80)}}, \frac{.71}{\sqrt{.89(.80)}}\right] = [.65, .84]$.
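The disattenuation of both the point estimate and the interval bounds can be sketched in Python as well (again, `disattenuate` is just an illustrative helper name):

```python
import math

def disattenuate(r, rel_x, rel_y):
    """Correct an observed correlation (or a CI bound) for random
    measurement error by dividing by the attenuation factor."""
    return r / math.sqrt(rel_x * rel_y)

# Observed correlation .63, scale reliabilities .89 and .80
r_corrected = disattenuate(0.63, 0.89, 0.80)
lo_c = disattenuate(0.55, 0.89, 0.80)
hi_c = disattenuate(0.71, 0.89, 0.80)
print(round(r_corrected, 2), round(lo_c, 2), round(hi_c, 2))  # 0.75 0.65 0.84
```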

As you can see, the corrected confidence interval is wider (a width of $.19$) than the uncorrected confidence interval (a width of $.16$). The corrected point estimate is a more accurate (less biased) estimate of the true correlation, but the correction also propagates the uncertainty in the measures, so we become less certain about that point estimate (i.e., the confidence interval becomes wider).

The following correlation matrix is the one that I published in my article on organizational credibility; however, this correlation matrix is not corrected for measurement error.

I created a new correlation matrix that displays the observed, uncorrected correlations in the lower left (under the diagonal) and the disattenuated, corrected correlations in the upper right (above the diagonal).

I also created a matrix of the 95% confidence intervals surrounding the correlations, both uncorrected and corrected. The observed, uncorrected 95% confidence intervals are in the lower left (under the diagonal), and the corrected 95% confidence intervals are in the upper right (above the diagonal).
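One way to build such a combined matrix is sketched below. The third variable, its reliability, and the two correlations involving it are hypothetical placeholders (only the $.63$ correlation and the $.89$ and $.80$ reliabilities come from the article); placing reliabilities on the diagonal is one common convention.

```python
import math

# Reliabilities for each scale (third value is hypothetical)
rels = [0.89, 0.80, 0.85]
# Observed correlation matrix (.63 is from the article; .40 and .50 are hypothetical)
r_obs = [[1.00, 0.63, 0.40],
         [0.63, 1.00, 0.50],
         [0.40, 0.50, 1.00]]

k = len(rels)
combined = [[0.0] * k for _ in range(k)]
for i in range(k):
    for j in range(k):
        if i == j:
            combined[i][j] = rels[i]          # reliabilities on the diagonal
        elif i > j:
            combined[i][j] = r_obs[i][j]      # observed correlations below
        else:
            # disattenuated correlations above the diagonal
            combined[i][j] = r_obs[i][j] / math.sqrt(rels[i] * rels[j])

for row in combined:
    print(["%.2f" % v for v in row])
```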

Note: As the reliability of the measures increases, there is less difference between the attenuated and disattenuated correlation and confidence intervals.
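A quick numerical illustration of this note, assuming for simplicity that both scales share the same reliability (so the attenuation factor $\sqrt{\hat\rho_{\mathbf{x}, \mathbf{x}} \hat\rho_{\mathbf{y}, \mathbf{y}}}$ reduces to that common reliability): as reliability approaches $1$, the corrected correlation approaches the observed $.63$.

```python
r_obs = 0.63  # observed correlation from the article
# Corrected correlation at several (equal) reliability levels
corrected = {rel: r_obs / rel for rel in (0.70, 0.80, 0.90, 0.95, 1.00)}
for rel, r_c in corrected.items():
    print(f"reliability {rel:.2f}: corrected r = {r_c:.3f}")
```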

Potential Counterarguments (and Rebuttals)

Some editors, reviewers, or colleagues may object to correcting for measurement error. The most common objection is that disattenuated correlations can sometimes exceed $1$, which is impossible for a true correlation. Below I walk through an example showing why this happens, followed by a rebuttal supporting the idea that we should correct for measurement error anyway.

Take the following example to illustrate this point: Assume we have a true, population correlation of $\rho_{X, Y} = .30$, with many samples of $n = 20$, and perfect measures of $\rho_{\mathbf{x}, \mathbf{x}} = 1$ and $\rho_{\mathbf{y}, \mathbf{y}} = 1$. The average observed correlation will be $\hat\rho_{\mathbf{x}, \mathbf{y}} = .30$, but because of sampling error, 50% of the observed correlations will be above the true value of $.30$, and 50% of the observed correlations will be below the true value $.30$:

Now assume we have a true, population correlation of $\rho_{X, Y} = .30$, with many samples of $n = 20$, and imperfect measures of $\rho_{\mathbf{x}, \mathbf{x}} = .75$ and $\rho_{\mathbf{y}, \mathbf{y}} = .75$.

$$\hat\rho_{\mathbf{x}, \mathbf{y}} = \rho_{X, Y} \sqrt{\rho_{\mathbf{x}, \mathbf{x}} \, \rho_{\mathbf{y}, \mathbf{y}}} = (.30)\sqrt{(.75)(.75)} = (.30)(.75) = .23$$

This formula demonstrates that the true correlation, $\rho_{X, Y}$, is attenuated in the observed correlation when our measures have a reliability of $.75$.

The average observed correlation with fallible measures will be $\hat\rho_{\mathbf{x}, \mathbf{y}} = .23$, but, again, because of sampling error, 50% of the observed correlations will be above .23, and 50% of the observed correlations will be below .23:

Now assume we have a true, population correlation of $\rho_{X, Y} = 1$, with many samples of $n = 20$, and imperfect measures of $\rho_{\mathbf{x}, \mathbf{x}} = .75$ and $\rho_{\mathbf{y}, \mathbf{y}} = .75$.

$$\hat\rho_{\mathbf{x}, \mathbf{y}} = \rho_{X, Y} \sqrt{\rho_{\mathbf{x}, \mathbf{x}} \, \rho_{\mathbf{y}, \mathbf{y}}} = (1)\sqrt{(.75)(.75)} = (1)(.75) = .75$$

The average observed correlation will be $\hat\rho_{\mathbf{x}, \mathbf{y}} = .75$, but, again, because of sampling error, 50% of the observed correlations will be above $.75$, and 50% of the observed correlations will be below $.75$:

Now, if you correct the observed correlations in this instance, 50% of the corrected correlations will be above $1$, and 50% of the corrected correlations will be below $1$:

So, to reiterate, the objection is that corrected correlations can exceed $1$, while true correlations cannot. According to sampling theory, though, corrected correlations can and should occasionally exceed $1$ simply because of sampling error. When this happens, it is still important to correct the observed correlation; the corrected value can simply be truncated to $1$.
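This claim is easy to verify with a small Monte Carlo simulation. The sketch below (standard library only; the `simulate_observed_r` helper is mine) draws repeated samples of $n = 20$ from a population with $\rho_{X, Y} = 1$ measured with reliability $.75$, disattenuates each sample correlation, and counts how often the corrected value exceeds $1$; it happens in roughly half of the samples.

```python
import math
import random
import statistics

random.seed(1)

def simulate_observed_r(rho, n, reliability):
    """Draw one sample correlation between two error-laden measures.
    Latent X and Y are bivariate normal with correlation rho; each
    observed score adds independent noise scaled so the measure has
    the given reliability (true variance / total variance)."""
    noise_sd = math.sqrt(1 / reliability - 1)
    xs, ys = [], []
    for _ in range(n):
        x = random.gauss(0, 1)
        y = rho * x + math.sqrt(1 - rho**2) * random.gauss(0, 1)
        xs.append(x + noise_sd * random.gauss(0, 1))
        ys.append(y + noise_sd * random.gauss(0, 1))
    mx, my = statistics.mean(xs), statistics.mean(ys)
    sxy = sum((a - mx) * (b - my) for a, b in zip(xs, ys))
    sxx = sum((a - mx) ** 2 for a in xs)
    syy = sum((b - my) ** 2 for b in ys)
    return sxy / math.sqrt(sxx * syy)

rel = 0.75
corrected = [simulate_observed_r(1.0, 20, rel) / rel for _ in range(2000)]
over_one = sum(c > 1 for c in corrected) / len(corrected)
print(f"share of corrected correlations above 1: {over_one:.2f}")
```

With equal reliabilities, dividing by `rel` is equivalent to dividing by $\sqrt{(.75)(.75)}$, the attenuation factor used throughout the post.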


Acknowledgements: The information in this post is based on material I learned in Prof. Jim Dillard’s measurement course in Spring 2015 at Penn State University.