Population and Sampling


This entry is part 3 of 9 in the series Statistics

Population and Sample

The population includes all objects of interest, whereas the sample is only a portion of the population. Parameters are associated with populations and statistics with samples. Parameters are usually denoted using Greek letters (mu, sigma) while statistics are usually denoted using Roman letters (x̄, s). Mu is written as μ (in HTML, &mu;) and the sample mean as x̄.

There are several reasons why we don’t work with populations. They are usually large, and it is often impossible to get data for every object we’re studying. Sampling is rarely free, and the more items surveyed, the larger the cost.

Parameter and Statistic

A parameter is a characteristic of a population. A statistic is a characteristic of a sample. The average weight of all of the world’s hippos is a parameter and the average weight of a random sample of 50 hippos is a statistic.
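The distinction can be made concrete with a small Python sketch. The population below is simulated, and the weights (mean 1500 kg, standard deviation 200 kg) are made-up values chosen only for illustration, not real hippo data.

```python
import random
import statistics

# Hypothetical population: simulated weights (kg) of 10,000 hippos.
# These numbers are invented purely for illustration.
rng = random.Random(42)
population = [rng.gauss(1500, 200) for _ in range(10_000)]

# Parameter: computed from every element of the population.
mu = statistics.mean(population)

# Statistic: computed from a random sample of 50 hippos.
hippo_sample = rng.sample(population, 50)
x_bar = statistics.mean(hippo_sample)

print(f"population mean (parameter): {mu:.1f}")
print(f"sample mean (statistic):     {x_bar:.1f}")
```

The two values will usually be close but not equal. The parameter is fixed for a given population, while the statistic changes from one sample to the next.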

Types of Sampling

There are five types of sampling: Random, Systematic, Convenience, Cluster, and Stratified. Random sampling is analogous to putting everyone’s name into a hat and drawing out several names. Each element in the population has an equal chance of being selected. While this is the preferred way of sampling, it is often difficult to do because it requires a complete list of every element in the population. Computer-generated lists are often used with random sampling.
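As a minimal sketch of the names-in-a-hat idea, Python's standard `random.sample` draws without replacement so that every element is equally likely to be chosen. The function name `simple_random_sample` and the list of names are hypothetical, introduced here only for illustration.

```python
import random

def simple_random_sample(population, n, seed=None):
    """Draw n distinct elements, each equally likely to be chosen."""
    rng = random.Random(seed)  # a seed makes the draw repeatable
    return rng.sample(population, n)

# Hypothetical complete list of the population (the "names in the hat").
names = [f"person_{i}" for i in range(1, 101)]

drawn = simple_random_sample(names, 10, seed=1)
print(drawn)
```

Note that the complete list `names` has to exist before any drawing can happen, which is exactly the practical difficulty mentioned above.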

Systematic sampling is easier to do than random sampling. In systematic sampling, the list of elements is “counted off”. That is, every kth element is taken. This is similar to lining everyone up and numbering off “1, 2, 3, 4, 5; 1, 2, 3, 4, 5; and so on”. When you are done numbering, everyone numbered 5, for example, would be included in the sample.
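The counting-off procedure maps directly onto list slicing. The sketch below assumes the elements are already in a list; the function name `systematic_sample` and its `start` parameter (the zero-based position of the first element taken) are illustrative choices, not part of any standard library.

```python
def systematic_sample(elements, k, start=0):
    """Take every kth element, beginning at zero-based index `start`."""
    if k < 1:
        raise ValueError("k must be at least 1")
    if not 0 <= start < k:
        raise ValueError("start must satisfy 0 <= start < k")
    return elements[start::k]

# 25 people lined up and numbered off 1..5 repeatedly.
people = list(range(1, 26))

# Everyone who called out "5" sits at zero-based index 4, 9, 14, ...
fives = systematic_sample(people, k=5, start=4)
print(fives)  # → [5, 10, 15, 20, 25]
```

In practice the starting position is often chosen at random between 0 and k - 1 so that no position in the list is favored. That is a common refinement rather than something the description above requires.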

Convenience sampling is very easy to do, but it’s probably the worst technique to use. It probably shouldn’t even be mentioned here. In convenience sampling, readily available data is used. That is, the first people the surveyor runs into are included in the study.

Cluster sampling is accomplished by dividing the population into groups, usually geographically. These groups are called clusters or blocks. The clusters are randomly selected, and every element in the selected clusters is used.

Stratified sampling also divides the population into groups, called strata. However, this time it is by some characteristic, not geographically. For instance, the population might be separated into males and females. A sample is taken from each of these strata using either random, systematic, or convenience sampling.
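Cluster sampling, as described above, can be sketched in a few lines of Python: pick whole clusters at random, then keep every element inside the chosen clusters. The region names and members below are hypothetical, as is the function name `cluster_sample`.

```python
import random

def cluster_sample(clusters, n_clusters, seed=None):
    """Randomly choose whole clusters and keep every element in them."""
    rng = random.Random(seed)
    chosen = rng.sample(sorted(clusters), n_clusters)
    members = [element for name in chosen for element in clusters[name]]
    return chosen, members

# Hypothetical population grouped geographically.
regions = {
    "north": ["n1", "n2", "n3"],
    "south": ["s1", "s2"],
    "east": ["e1", "e2", "e3", "e4"],
    "west": ["w1", "w2"],
}

chosen_regions, surveyed = cluster_sample(regions, n_clusters=2, seed=0)
print(chosen_regions)
print(surveyed)
```

The randomness applies to the clusters, not to the individuals: once a cluster is selected, all of its elements are surveyed.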

Stratified sampling starts by putting the population into groups (strata) based on a characteristic. Suppose we have blue, red, and green strata. We can then randomly select elements from each stratum in proportion to its size, so that the sample reflects the makeup of the population.
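Here is a minimal sketch of proportional stratified sampling using the blue, red, and green strata. The stratum sizes (50, 30, and 20) and the function name `stratified_sample` are assumptions made for the example.

```python
import random

def stratified_sample(strata, total_n, seed=None):
    """Randomly sample each stratum in proportion to its size."""
    rng = random.Random(seed)
    population_size = sum(len(members) for members in strata.values())
    picked = {}
    for name, members in strata.items():
        # Proportional allocation; rounding means the total can differ
        # slightly from total_n when the shares are not whole numbers.
        n = round(total_n * len(members) / population_size)
        picked[name] = rng.sample(members, n)
    return picked

# Hypothetical strata with made-up sizes: 50 blue, 30 red, 20 green.
strata = {
    "blue": [f"b{i}" for i in range(50)],
    "red": [f"r{i}" for i in range(30)],
    "green": [f"g{i}" for i in range(20)],
}

picked = stratified_sample(strata, total_n=10, seed=0)
print({name: len(chosen) for name, chosen in picked.items()})
# → {'blue': 5, 'red': 3, 'green': 2}
```

With a sample of 10 drawn from a population of 100, each stratum contributes one tenth of its members, so the sample keeps the 50/30/20 proportions of the population.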

