Galileo's role in the story of data distributions is that he was the first to suggest that "measurement errors are deserving of a systematic and scientific treatment". He arrived at this through his observations of the distances between stars and of the distance of a star from the center of the earth. His reasoning was that all observations are burdened with errors, and that observations are distributed symmetrically about the true value; equivalently, errors are distributed symmetrically about zero.
The bell-shaped curve
Abraham de Moivre (1667-1754) proved that what we now call the central limit theorem holds for simple collections of numbers from games of chance. He is often called the father of the normal distribution, and the bell-shaped curve first appeared in his book The Doctrine of Chances, although the curve is commonly attributed to Carl Friedrich Gauss (1777-1855).
A normal distribution simply means that observations of a certain variable have a continuous probability distribution characterized by a mean and a standard deviation (the dispersion of the data around the mean, calculated as the square root of the average squared deviation from the mean). If the mean equals zero and the standard deviation equals one, we have what is known as a standard normal distribution. The important thing to know about a normal distribution is that about 68% of the area under the curve lies within 1 standard deviation of the mean, about 95% lies within 2 standard deviations, and about 99.7% lies within 3 standard deviations. You may also hear that a normal distribution is symmetric about its mean. In a perfect world, you would see plotted continuous values distributed with complete symmetry around the mean. However, we live in a messy world, and you will rarely if ever see perfect symmetry in real data (maybe only in the stars).
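The areas quoted above can be checked numerically. The short sketch below is only an illustration: it relies on the identity that the area of a normal curve within k standard deviations of the mean equals erf(k / sqrt(2)), and the function name area_within is made up for this example.

```python
import math

def area_within(k):
    """Fraction of the area under a normal curve that lies
    within k standard deviations of the mean."""
    return math.erf(k / math.sqrt(2))

# Print the area for 1, 2 and 3 standard deviations.
for k in (1, 2, 3):
    print(f"within {k} SD: {area_within(k):.4f}")
```

Rounded, the three values come out to roughly 68%, 95% and 99.7%. The result does not depend on the particular mean or standard deviation, which is why the rule can be stated for every normal distribution at once.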
The normal distribution is famous because of the central limit theorem, which is a fascinating phenomenon. It is always refreshing to be able to explain the existence of one phenomenon by another, and that is the relationship between the central limit theorem and the normal distribution. The central limit theorem says, roughly, that if you take repeated samples from a random variable with almost any distribution (it could even be discrete), calculate the average of each sample, and plot those averages, the plot will approximate a normal distribution. The larger each sample is, the closer the distribution of the means is to normal; the more samples you take, the more clearly your plot will show it.
Example:
Imagine Gauss, Fisher, Pearson, Cox, and Student (the pen name of William Sealy Gosset, the statistician who developed Student's t-test) were all participating in a show called the World Idol of Statistics (disregarding time and space here). Voters choose their idol by dialing in and pressing 1 for Gauss, 2 for Fisher, 3 for Pearson, 4 for Cox, and 5 for Student (yes, somewhat similar to the singing Idol talent shows). Plotting the results of these votes would reveal a discrete probability distribution. Now take 50 samples, each of n=10 votes, and plot the frequency of the means of these samples; you will start to see a normal distribution pattern.
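For readers who want to try this, here is a minimal simulation sketch of the voting example. The vote shares in weights are invented purely for illustration (the post does not give any), the seed is fixed only so the run is repeatable, and sample_means is a hypothetical helper name.

```python
import random
from collections import Counter

random.seed(42)  # assumption: fixed seed so the run is repeatable

# 1 = Gauss, 2 = Fisher, 3 = Pearson, 4 = Cox, 5 = Student
candidates = [1, 2, 3, 4, 5]
# Hypothetical vote shares, made up for this illustration.
weights = [0.35, 0.25, 0.20, 0.12, 0.08]

def sample_means(num_samples, n):
    """Draw num_samples samples of n votes each and return each sample's mean."""
    means = []
    for _ in range(num_samples):
        votes = random.choices(candidates, weights=weights, k=n)
        means.append(sum(votes) / n)
    return means

means = sample_means(50, 10)

# Crude text histogram of the sample means (one '#' per sample).
counts = Counter(means)
for value in sorted(counts):
    print(f"{value:.1f} {'#' * counts[value]}")
```

With only 50 samples of 10 votes the histogram will look lumpy, but the means should already pile up around the middle and thin out toward both ends. Increasing n and the number of samples should make the bell shape progressively clearer.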
_____________________________________________________________________________
References and Further Reading:
G. Galilei, Dialogue Concerning the Two Chief World Systems—Ptolemaic & Copernican (S. Drake translator), 2nd ed., Berkeley, Univ. California Press, 1967.
Hald, Anders (1990), "De Moivre and the Doctrine of Chances, 1718, 1738, and 1756", History of Probability and Statistics and Their Applications before 1750, Wiley Series in Probability and Statistics.
Stahl S. The evolution of the normal distribution. http://mathdl.maa.org/images/upload_library/22/Allendoerfer/stahl96.pdf. last accessed on July 5, 2013.
Salsburg D. The Lady Tasting Tea: How Statistics Revolutionized Science in the Twentieth Century. Holt Paperbacks. First published in hardcover in 2001 by WH Freeman and Company.
http://www.stat.uchicago.edu/events/normal/fatherND.html
http://www.robertnowlan.com/pdfs/de%20Moivre,%20Abraham.pdf