Most research projects involve sampling participants from a population of interest. The population is composed of all individuals of interest to the researcher. One population of interest in a large public opinion poll, for instance, might be all eligible voters in the United States. This implies that the population of interest does not include people under the age of 18, most people incarcerated for a felony conviction, visitors from other countries, and anyone else not eligible to vote. You might conduct a survey in which your population consists of all students at your college or university. With enough time and money, a survey researcher could conceivably contact everyone in the population. The United States attempts to do this every 10 years with an official census of the entire population. With a relatively small population, you might find it easy to study the entire population.
In most cases, however, studying the entire population would be a massive undertaking. Fortunately, it can be avoided by selecting a sample from the population of interest. With proper sampling, we can use information obtained from the participants (or “respondents”) who were sampled to estimate characteristics of the population as a whole. Statistical theory allows us to infer what the population is like, based on data obtained from a sample (the logic underlying what is called statistical significance will be addressed in Chapter 13).
When researchers make inferences about populations, they do so with a certain degree of confidence. Here is a statement that you might see when you read the results of a survey: “The results from the survey are accurate within ±3 percentage points, using a 95% level of confidence.” What does this tell you? Suppose you asked students to tell you whether they prefer to study at home or at school, and the survey results indicate that 61% prefer to study at home. Using the same degree of confidence, you would now know that the actual population value is probably between 58% and 64%. This is called a confidence interval—you can have 95% confidence that the true population value lies within this interval around the obtained sample result. Your best estimate of the population value is the sample value. However, because you have only a sample and not the entire population, your result may be in error. The confidence interval gives you information about the likely amount of the error. The formal term for this error is sampling error, although you are probably more familiar with the term margin of error. Recall the concept of measurement error discussed in Chapter 5. When you measure a single individual on a variable, the obtained score may deviate from the true score because of measurement error. Similarly, when you study one sample, the obtained result may deviate from the true population value because of sampling error.
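To see where a figure such as ±3 percentage points might come from, the short Python sketch below computes a 95% confidence interval for the study-at-home example. It is only an illustration under stated assumptions: the sample size of 1,000 is hypothetical (the example above does not give one), the function name is invented for this sketch, and the calculation uses the common normal approximation for a proportion, which presumes a simple random sample.

```python
import math
from statistics import NormalDist


def proportion_confidence_interval(p_hat, n, confidence=0.95):
    """Normal-approximation confidence interval for a sample proportion.

    p_hat: proportion observed in the sample (0.61 for 61%)
    n: number of respondents (hypothetical in this example)
    """
    # z value that leaves (1 - confidence) / 2 in each tail (about 1.96 for 95%)
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    # Standard error of a proportion under simple random sampling
    standard_error = math.sqrt(p_hat * (1 - p_hat) / n)
    margin = z * standard_error
    return p_hat - margin, p_hat + margin, margin


# 61% prefer to study at home; n = 1000 is an assumed sample size
low, high, margin = proportion_confidence_interval(0.61, 1000)
print(f"61% +/- {margin * 100:.1f} points -> {low * 100:.1f}% to {high * 100:.1f}%")
# prints: 61% +/- 3.0 points -> 58.0% to 64.0%
```

With the assumed 1,000 respondents, the margin works out to roughly 3 percentage points, which matches the interval of 58% to 64% described above. A smaller sample would produce a wider interval.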
The surveys you often read about in newspapers and the previous example deal with percentages. What about questions that ask for more quantitative information? The logic in this instance is very much the same. For example, if you also ask students to report how many hours and minutes they studied during the previous day, you might find that the average amount of time was 76 minutes. A confidence interval could then be calculated based on the size of the sample; for example, the 95% confidence interval is 76 minutes plus or minus 10 minutes. It is highly likely that the true population value lies within the interval of 66 to 86 minutes. The topic of confidence intervals, including how to calculate them, is discussed again in Chapter 13.
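The same logic can be sketched for an average. The Python example below is a rough illustration only: the standard deviation of 51 minutes and the sample size of 100 are hypothetical values chosen so that the margin comes out near 10 minutes, and the function name is invented for this sketch. It uses a normal approximation; with small samples, researchers typically use the t distribution instead.

```python
import math
from statistics import NormalDist


def mean_confidence_interval(sample_mean, sample_sd, n, confidence=0.95):
    """Normal-approximation confidence interval for a sample mean.

    sample_sd and n are assumptions here; the text gives only the mean
    (76 minutes) and the margin (about 10 minutes).
    """
    # z value for the requested confidence level (about 1.96 for 95%)
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    # Standard error of the mean shrinks as the sample gets larger
    standard_error = sample_sd / math.sqrt(n)
    margin = z * standard_error
    return sample_mean - margin, sample_mean + margin, margin


# Mean of 76 minutes; SD = 51 and n = 100 are hypothetical values
low, high, margin = mean_confidence_interval(76, 51, 100)
print(f"76 minutes +/- {margin:.0f} -> {low:.0f} to {high:.0f} minutes")
# prints: 76 minutes +/- 10 -> 66 to 86 minutes
```

Because the standard error divides by the square root of the sample size, quadrupling the assumed sample from 100 to 400 would cut the margin in half, to about 5 minutes.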