To calculate a correlation coefficient, we need to obtain pairs of observations from each subject. Thus, each individual has two scores, one on each of the variables. Table 12.2 shows fictitious data for 10 students measured on the variables of classroom seating pattern and exam grade. Students in the first row receive a seating score of 1, those in the second row receive a 2, and so on. Once we have made our observations, we can see whether the two variables are related. Do the variables go together in a systematic fashion?
TABLE 12.2 Pairs of scores for 10 participants on seating pattern and exam scores (fictitious data)
The Pearson r provides two types of information about the relationship between the variables. The first is the strength of the relationship; the second is the direction of the relationship. As noted previously, the values of r can range from 0.00 to ±1.00. The absolute size of r indicates the strength of the relationship. A value of 0.00 indicates that there is no relationship. The nearer r is to 1.00 (plus or minus), the stronger the relationship. The plus and minus signs indicate whether there is a positive linear or negative linear relationship between the two variables. It is important to remember that it is the size of the correlation coefficient, not the sign, that indicates the strength of the relationship. Thus, a correlation coefficient of −.54 indicates a stronger relationship than does a coefficient of +.45.
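As a minimal sketch of how these two pieces of information can be read from a coefficient, the Python below computes r with the standard deviation-score formula and then separates the absolute size from the sign. The text has not given a formula at this point, so that choice is an assumption. The function names pearson_r and describe and the five pairs of scores are hypothetical, chosen only for illustration.

```python
import math


def pearson_r(xs, ys):
    """Pearson r for paired scores, using deviations from each mean."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length lists with at least 2 pairs")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Sum of cross-products of deviations: how the two variables vary together.
    sum_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    # Sums of squared deviations: how much each variable varies on its own.
    ss_x = sum((x - mean_x) ** 2 for x in xs)
    ss_y = sum((y - mean_y) ** 2 for y in ys)
    if ss_x == 0 or ss_y == 0:
        raise ValueError("r is undefined when a variable has no variability")
    return sum_xy / math.sqrt(ss_x * ss_y)


def describe(r):
    """Split r into strength (absolute size) and direction (sign)."""
    if r > 0:
        direction = "positive"
    elif r < 0:
        direction = "negative"
    else:
        direction = "none"
    return abs(r), direction


# Hypothetical scores for five individuals (not data from the text).
x_scores = [1, 2, 3, 4, 5]
y_scores = [2, 1, 4, 3, 5]
r = pearson_r(x_scores, y_scores)
print(round(r, 2), describe(r))  # 0.8 with a positive direction

# The comparison made in the text: -.54 is stronger than +.45.
strength_a, _ = describe(-0.54)
strength_b, _ = describe(0.45)
print(strength_a > strength_b)  # True
```

Comparing the absolute values rather than the signed values is what keeps a negative coefficient from being mistaken for a weak one.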
Scatterplots
The data in Table 12.2 can be visualized in a scatterplot in which each pair of scores is plotted as a single point in a diagram. Figure 12.7 shows two scatterplots. The values of the first variable are depicted on the x axis, and the values of the second variable are shown on the y axis. These scatterplots show a perfect positive relationship (+1.00) and a perfect negative relationship (−1.00). You can easily see why these are perfect relationships: The scores on the two variables fall on a straight line that is on the diagonal of the diagram. Each person’s score on one variable correlates precisely with his or her score on the other variable. If we know an individual’s score on one of the variables, we can predict exactly what his or her score will be on the other variable. Such “perfect” relationships are rarely observed in reality.
FIGURE 12.7 Scatterplots of perfect (±1.00) relationships
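To illustrate why points on a straight diagonal give a coefficient of exactly +1.00 or −1.00, here is a small Python sketch. The scores are hypothetical and are not the values plotted in the figure; the helper name pearson_r and its deviation-score formula are assumptions made for the example.

```python
import math


def pearson_r(xs, ys):
    """Pearson r for paired scores, using deviations from each mean."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sum_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    ss_x = sum((x - mean_x) ** 2 for x in xs)
    ss_y = sum((y - mean_y) ** 2 for y in ys)
    return sum_xy / math.sqrt(ss_x * ss_y)


# Hypothetical scores on the first variable for five individuals.
x_scores = [1, 2, 3, 4, 5]

# Every point lies on a rising diagonal: a perfect positive relationship.
rising = [1, 2, 3, 4, 5]
# Every point lies on a falling diagonal: a perfect negative relationship.
falling = [5, 4, 3, 2, 1]
# One point moved off the rising diagonal: no longer perfect.
nearly_rising = [1, 2, 3, 4, 4]

print(pearson_r(x_scores, rising))                   # 1.0
print(pearson_r(x_scores, falling))                  # -1.0
print(round(pearson_r(x_scores, nearly_rising), 2))  # about 0.97
```

Moving a single point off the line is enough to pull r below 1.00, which is consistent with the remark that perfect relationships are rarely observed.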
The scatterplots in Figure 12.8 show patterns of correlation you are more likely to encounter in exploring research findings. The first diagram shows pairs of scores with a positive correlation of +.65; the second diagram shows a negative relationship, −.77. The data points in these two scatterplots reveal a general pattern of either a positive or negative relationship, but the relationships are not perfect. You can make a general prediction in the first diagram, for instance, that the higher the score on one variable, the higher the score on the second variable. However, even if you know a person’s score on the first variable, you cannot perfectly predict what that person’s score will be on the second variable. To confirm this, take a look at value 1 on variable x (the horizontal axis) in the positive scatterplot. Two individuals had a score of 1 on variable x. Looking along the vertical y axis, you will see that one of them had a score of 1 on variable y and the other had a score of 3. The data points do not fall on the perfect diagonal shown in Figure 12.7. Instead, there is variation (scatter) from the perfect diagonal line.
FIGURE 12.8 Scatterplots depicting patterns of correlation
The third diagram shows a scatterplot in which there is absolutely no correlation (r = 0.00). The points fall all over the diagram in a completely random pattern. Thus, scores on variable x are not related to scores on variable y.
The fourth diagram has been left blank so that you can plot the scores from the data in Table 12.2. The x (horizontal) axis has been labeled for the seating pattern variable, and the y (vertical) axis for the exam score variable. To complete the scatterplot, you will need to plot the 10 pairs of scores. For each individual in the sample, find the score on the seating pattern variable; then go up from that point until you are level with that person’s exam score on the y axis. A point placed there will describe the score on both variables. There will be 10 points on the finished scatterplot.
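The plotting procedure just described can be sketched in Python as a simple text grid. The 10 pairs of scores below are hypothetical stand-ins, not the values in Table 12.2, and the function name text_scatterplot and the axis ranges are assumptions made for the example.

```python
def text_scatterplot(pairs, x_values, y_values):
    """Return a scatterplot as lines of text, with the highest y value on top.

    Each (x, y) pair is marked with "*" where its column (x) meets its row (y).
    Identical pairs would share a single mark; the sample data has none.
    """
    points = set(pairs)
    lines = []
    for y in sorted(y_values, reverse=True):
        cells = ["*" if (x, y) in points else "." for x in x_values]
        lines.append(f"{y:>3} | " + " ".join(cells))
    # Horizontal axis and its labels, aligned under the columns.
    lines.append("    +-" + "-" * (2 * len(x_values) - 1))
    lines.append("      " + " ".join(str(x) for x in x_values))
    return lines


# Hypothetical (seating row, exam score) pairs for 10 students.
students = [
    (1, 95), (1, 90), (2, 90), (2, 85), (3, 85),
    (3, 75), (4, 80), (4, 70), (5, 70), (5, 60),
]
seats = range(1, 6)         # x axis: seating rows 1 to 5
scores = range(60, 100, 5)  # y axis: exam scores 60 to 95 in steps of 5

plot_lines = text_scatterplot(students, seats, scores)
print("\n".join(plot_lines))
```

With these made-up scores the marks drift from the upper left to the lower right of the grid, the visual signature of a negative relationship.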
The correlation coefficient calculated from these data shows a negative relationship between the variables (r = −.88). In other words, as the seating distance from the front of the class increases, the exam score decreases. Although these data are fictitious, a negative relationship has been reported in research on this topic (Benedict & Hoag, 2004; Brooks & Rebata, 1991).