Area probability sampling, also called cluster sampling, is a sampling
technique in which the population of interest is divided into groups, or
clusters, and then a random sample of clusters is drawn to represent the
population of interest. Cluster units can be geographic, temporal, or spatial
in nature(1). Once the clusters are drawn, one can either
measure all of the elementary units within the sampled clusters or take
a sample of smaller clusters or elementary units from within the sampled
clusters.
Purpose
The purpose of cluster sampling is to sample economically while retaining
the characteristics of a probability sample. Because the primary sampling
unit (PSU) is a cluster of elements located close to one another, rather
than an individual element of the population, cluster sampling offers a
time- and cost-efficient way to sample a population that is spread across
a large geographic area(1).
Example
Suppose you need to obtain a sample of 2,000 American families. If a
listing of every family in the United States were available, a sample of
2,000 families could be drawn by simple random sampling, but the selected
households would be spread throughout most counties in the United States.
Cluster sampling would instead allow you to draw a sample of 50 counties
from across the United States and then take a sample of 2,000 households
from within those counties(2). The advantage of this technique is that
the households are spread across only these 50 counties.
Types of Area Probability Sampling
Clusters can be selected by a variety of sampling techniques, and there
can be multiple stages of cluster sampling.
Single stage cluster sampling
Sampling is done only at the first stage: once the sample of clusters
is selected, every listing unit within each of the selected clusters is
included in the sample(1). For example, the clusters
might be city blocks. In single stage cluster sampling, once the sample
of city blocks is selected, all of the apartments in all of the houses on
the selected blocks would be included in the study.
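The single stage procedure can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the frame `blocks`, the household labels, and the function name `single_stage_cluster_sample` are hypothetical, and the clusters are assumed to be drawn by simple random sampling without replacement, which is one of several possible selection methods.

```python
import random


def single_stage_cluster_sample(clusters, n_clusters, seed=None):
    """Select n_clusters clusters at random and keep every listing unit in each.

    clusters maps a cluster name to the list of its listing units.
    """
    rng = random.Random(seed)
    # Stage 1 (the only stage): simple random sample of cluster names.
    chosen = rng.sample(sorted(clusters), n_clusters)
    # Every listing unit in a selected cluster enters the sample.
    return {name: list(clusters[name]) for name in chosen}


# Hypothetical frame: 6 city blocks with 4 households each.
blocks = {
    f"block_{b}": [f"block_{b}/household_{h}" for h in range(1, 5)]
    for b in range(1, 7)
}

sample = single_stage_cluster_sample(blocks, n_clusters=2, seed=0)
for block, households in sample.items():
    print(block, households)
```

Note that no sampling happens inside a selected block: the sample size in listing units is determined entirely by which blocks are drawn.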
Multi-stage sampling
In multi-stage sampling, further sampling is done within the previously
selected clusters. For example, after the city blocks are sampled, one would
then draw a sample of houses on the selected city blocks.
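A two-stage version of the city block example might look like the following sketch. The names `two_stage_sample`, `blocks`, and the house labels are hypothetical, and simple random sampling without replacement is assumed at both stages; real designs often use other selection methods at one or both stages.

```python
import random


def two_stage_sample(clusters, n_clusters, n_per_cluster, seed=None):
    """Sample clusters first, then sample listing units within each one.

    clusters maps a cluster name to the list of its listing units.
    """
    rng = random.Random(seed)
    # Stage 1: simple random sample of primary sampling units (city blocks).
    selected = rng.sample(sorted(clusters), n_clusters)
    result = {}
    for name in selected:
        units = clusters[name]
        # Stage 2: simple random sample of units within the selected block.
        k = min(n_per_cluster, len(units))
        result[name] = rng.sample(units, k)
    return result


# Hypothetical frame: 20 city blocks with 10 houses each.
blocks = {
    f"block_{b}": [f"block_{b}/house_{h}" for h in range(1, 11)]
    for b in range(1, 21)
}

sample = two_stage_sample(blocks, n_clusters=5, n_per_cluster=3, seed=1)
total_houses = sum(len(houses) for houses in sample.values())
print(total_houses)  # 5 blocks x 3 houses = 15
```

Further stages could be added in the same way, for example by sampling apartments within each selected house.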
Primary sampling units (PSUs)
Primary sampling units refer to the sampling units, or clusters, used at
the first stage of sampling when there are two or more stages in the sample
design(1).
Enumeration units or listing units
When there is more than one stage of sampling in the design, the sampling
units for the final stage are called enumeration units or listing units(1).
Problems
The standard errors of estimates obtained from cluster sampling
designs are typically higher than those obtained from samples of the same
number of listing units chosen by other sampling designs(1).
This is because listing units within the same cluster tend to be
homogeneous (similar in socioeconomic status, ethnicity, and other factors),
which results in redundancy when more than one household within the
same cluster is selected(1). Increasing the sample size
is an effective way to offset this higher standard error relative
to other sampling designs.
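To give a rough sense of the size of this penalty, the sketch below uses a common approximation for the design effect, deff = 1 + (m - 1) * rho, where m is the number of listing units sampled per cluster and rho is the intraclass correlation. This formula is not taken from the text above and assumes equal-sized clusters. The function names and the values m = 40 and rho = 0.05 are hypothetical, chosen only for illustration.

```python
import math


def design_effect(m, rho):
    """Approximate variance inflation relative to simple random sampling.

    m: listing units sampled per cluster (assumed equal across clusters).
    rho: intraclass correlation (similarity of units within a cluster).
    """
    return 1.0 + (m - 1) * rho


def effective_sample_size(n, m, rho):
    """Size of a simple random sample with roughly the same precision."""
    return n / design_effect(m, rho)


# Hypothetical design: 2,000 households, 40 per cluster, rho assumed 0.05.
deff = design_effect(m=40, rho=0.05)
n_eff = effective_sample_size(n=2000, m=40, rho=0.05)
se_inflation = math.sqrt(deff)  # standard errors scale with sqrt(deff)

print(round(deff, 2), round(n_eff), round(se_inflation, 2))
```

Under these assumed values the design effect is about 2.95, so the 2,000 clustered households carry roughly the information of 678 households drawn by simple random sampling. When rho is 0 (no within-cluster similarity), the design effect is 1 and there is no penalty.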
References
1. Levy P, Lemeshow S.
Sampling of Populations: Methods and Applications. New York: Wiley, 1991.
2. Hansen MH. Sample Survey
Methods and Theory. New York: Wiley, 1953.