Sample Generation

 
Ideally, the samples to be measured in a study are selected at random, and a random process must be used to generate the random sample.
Random numbers can be generated in several ways, including tables of random numbers and random number generators. Regardless of which procedure is used, one must assign numbers to the objects, samples, or areas to be sampled, and the selection is then carried out by matching the random numbers generated against the numbers assigned to the population(1).
The Lehmer Generator formula is given as an example of how random numbers are generated.
The importance of proper randomization procedures cannot be overemphasized. Random sample generation protects a study against selection bias.




Example of using a table of Random Numbers
Steps:
  • Enter the table by a random process, for example by having two people each pick a number and using one number as the row and the other as the column of the starting place(1)
  • Once a starting number has been selected, proceed in a systematic direction that was decided beforehand (up, down, left, or right)
  • Continue until you have the required number of selections(1)
Example:
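If you needed a random sample of 10 people out of a population of 50, you would number the people 01 to 50 and begin at a random spot in the table. Suppose we began at 20 (row 5, column 5) after deciding to move down the columns. Reading down, we skip any number outside 01 to 50 (here 83, 92, 91, and 56) and any repeat, and continue at the top of the next column when a column runs out, until we have 10 numbers: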
20, 43, 23, 06, 11, 40, 35, 46, 12, 39 
84 81 91 06 12 11 83 06 10 34
23 29 00 64 02 40 03 45 86 26
38 48 95 32 05 35 84 07 39 35
10 38 07 28 77 46 12 64 45 16
83 53 60 92 20 91 90 97 63 45
60 89 58 63 83 56 80 54 84 46
74 23 60 65 43 12 37 49 95 08
49 10 47 94 92 39 30 03 47 02
76 15 84 79 23 21 86 11 60 28
03 52 46 32 06 27 04 03 08 25
*Please note that this is NOT an entire table of random numbers.
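The table walk above can be sketched in Python. This is only an illustration under stated assumptions: the function name sample_from_table and its parameters are hypothetical, rows and columns are counted from 0, and the rules for skipping out-of-range or repeated numbers and for wrapping to the top of the next column are inferred from the worked example rather than taken from reference (1).

```python
# The partial table of random numbers shown above (10 rows by 10 columns).
TABLE = [
    [84, 81, 91, 6, 12, 11, 83, 6, 10, 34],
    [23, 29, 0, 64, 2, 40, 3, 45, 86, 26],
    [38, 48, 95, 32, 5, 35, 84, 7, 39, 35],
    [10, 38, 7, 28, 77, 46, 12, 64, 45, 16],
    [83, 53, 60, 92, 20, 91, 90, 97, 63, 45],
    [60, 89, 58, 63, 83, 56, 80, 54, 84, 46],
    [74, 23, 60, 65, 43, 12, 37, 49, 95, 8],
    [49, 10, 47, 94, 92, 39, 30, 3, 47, 2],
    [76, 15, 84, 79, 23, 21, 86, 11, 60, 28],
    [3, 52, 46, 32, 6, 27, 4, 3, 8, 25],
]


def sample_from_table(table, start_row, start_col, population_size, sample_size):
    """Walk down the columns from the starting cell and collect a sample.

    Numbers outside 1..population_size and numbers already chosen are skipped.
    When a column runs out, the walk continues at the top of the next column.
    """
    n_rows, n_cols = len(table), len(table[0])
    chosen = []
    row, col = start_row, start_col
    for _ in range(n_rows * n_cols):  # visit each cell at most once
        value = table[row][col]
        if 1 <= value <= population_size and value not in chosen:
            chosen.append(value)
            if len(chosen) == sample_size:
                return chosen
        row += 1
        if row == n_rows:  # bottom of the column: move to the next one
            row = 0
            col = (col + 1) % n_cols
    raise ValueError("table exhausted before the sample was complete")


# Start at the 20 in row 5, column 5 (indices 4, 4) and move down.
sample = sample_from_table(TABLE, 4, 4, population_size=50, sample_size=10)
print(sample)  # [20, 43, 23, 6, 11, 40, 35, 46, 12, 39]
```

The printed list matches the ten numbers selected by hand in the example.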


 
 
 
 
 
 
 
 
 
Random Number Generators
The goal of random sample generation is to generate sequences of random numbers. Random number generators (RNGs) are deterministic algorithms that produce numbers with certain distribution properties(2). Roughly speaking, these numbers should behave like independent, identically distributed random variables(2).
A hardware (true) random number generator is an electronic device that plugs into a computer and produces genuinely random numbers, as opposed to the pseudo-random numbers produced by many computer programs(3).
Because random number generator programs use deterministic algorithms, they are more correctly called pseudo-random number generators: the sequences of numbers they produce are purely deterministic and can only approximate a true random sequence(4).
Pseudo-random number generators require the user to specify an initial value, or seed. Initializing the generator with the same seed gives the same sequence of random numbers; to get a different sequence, initialize it with a different seed(4).
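The effect of the seed can be sketched with Python's standard random module. This is an illustration only: the seed values 42 and 43 and the function name draw_sample are arbitrary choices made for this example, not values from the references.

```python
import random


def draw_sample(seed, population_size=50, sample_size=10):
    """Draw a sample of distinct numbers from 1..population_size.

    A separate generator object is created for each call, so the result
    depends only on the seed passed in.
    """
    rng = random.Random(seed)
    return rng.sample(range(1, population_size + 1), sample_size)


first = draw_sample(seed=42)
repeat = draw_sample(seed=42)   # same seed: the same sample is reproduced
other = draw_sample(seed=43)    # a different seed starts a different sequence

print(first == repeat)  # True
print(first)
print(other)
```

Recording the seed alongside the sample makes the selection reproducible for anyone who later needs to check it.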

Links
http://random.mat.sbg.ac.at/generators/
http://random.mat.sbg.ac.at/links/rando.html
http://webnz.com/robert/true_rng.html
http://www.npac.syr.edu/projects/random/brief.html 
 


 
 
 
 
 
 

The Lehmer Generator
There are many methods for generating sequences of random numbers. One of the most popular is called the linear congruential method, which was invented by D. H. Lehmer in 1949(3).
 
The basic formula is simply:
x(i+1) = (a * x(i) + c) mod m
To get the next value, take the current value, multiply it by a, add c to it, divide by m, and take the remainder. Choosing good values for a, c, and m is not simple, and poorly chosen values quickly produce sequences that degenerate into non-random patterns(3).
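The formula can be sketched in a few lines of Python. The parameter values below are toy values chosen for this illustration, not values from reference (3): with m = 16, the choice a = 5, c = 3 visits all 16 residues before repeating, while a = 4, c = 0 collapses to zero after two steps.

```python
from itertools import islice


def lcg(seed, a, c, m):
    """Yield successive values of x(i+1) = (a * x(i) + c) mod m."""
    x = seed
    while True:
        x = (a * x + c) % m
        yield x


# A toy choice that cycles through every value 0..15 exactly once.
good = list(islice(lcg(seed=7, a=5, c=3, m=16), 16))
print(good)  # [6, 1, 8, 11, 10, 5, 12, 15, 14, 9, 0, 3, 2, 13, 4, 7]

# A poor choice: the sequence gets stuck at 0 almost immediately.
bad = list(islice(lcg(seed=7, a=4, c=0, m=16), 5))
print(bad)  # [12, 0, 0, 0, 0]
```

A modulus of 16 is far too small for real use; it is used here only so that the whole cycle fits on one line.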

 
 
 
 
 
 
 
 
 
 

Importance of Proper Random Sample Generation
     In a study examining whether reports of clinical trials in obstetrics and gynecology used proper randomization techniques when enrolling participants, investigators found that only 32% of the reports described an adequate method for generating a random sequence(5). Proper randomization eliminates selection bias and is required to generate unbiased comparison groups in controlled trials. The authors considered the following approaches to generating an allocation sequence adequate: computer, random number table, shuffled cards or tossed coins, and minimization. Their estimate of adequate sequence generation may be generous because it includes processes that are subject to human perturbation and cannot be reproduced. A computer random number generator was the most frequently specified method (18%), followed by a random number table (11%).
     Randomized controlled trials provide the most valid basis for the comparison of interventions in health care. If improperly conducted, however, trials purporting to be "randomized" can yield biased results(5). The investigators in this study recommend tables and computers not only because of reproducibility but also because of ease and speed(5).
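As a rough illustration of why a computer-generated sequence is reproducible, the sketch below builds an allocation sequence from a seeded generator. It assumes simple (unrestricted) randomization between two arms; the function name allocation_sequence, the arm labels, and the seed 1994 are hypothetical choices for this example and are not the procedure used in the study cited above.

```python
import random


def allocation_sequence(n_participants, seed, arms=("treatment", "control")):
    """Assign each participant to an arm with a seeded generator.

    Each assignment is an independent draw, so group sizes are not
    forced to be equal.
    """
    rng = random.Random(seed)
    return [rng.choice(arms) for _ in range(n_participants)]


first = allocation_sequence(20, seed=1994)
second = allocation_sequence(20, seed=1994)

print(first == second)  # True: the sequence can be regenerated and audited
```

Shuffled cards or tossed coins cannot be replayed in this way, which is the reproducibility concern raised above.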

References

1.     Taylor JK. Statistical Techniques for Data Analysis. Michigan: Lewis Publishers, 1990.
2.     Hellekalek P. University of Salzburg's Mathematics Department. http://random.mat.sbg.ac.at/generators/
3.     Holtzman J. Generating Random Numbers. Electronics Now 1998;69:22-25.
4.     http://www.npac.syr.edu/projects/random/brief.html
5.     Schulz K. Assessing the Quality of Randomization from Reports of Controlled Trials Published in Obstetrics and Gynecology Journals. The Journal of the American Medical Association 1994;272:125-9.
