Tuesday, July 30, 2013

The lady tasting vodka: The Null Hypothesis

This past Saturday night, in the company of two good friends (Dana and Zahi), I contemplated designing an experiment similar to the one Sir Ronald Fisher described in his 1935 book "The Design of Experiments", known as the lady tasting tea. However (and I am only blaming it on the time of day), iced chamomile tea and vodka were used. I am sharing the thought just as a fun analogy, because the overpowering taste of vodka makes it challenging to tell what was poured first, chamomile tea or vodka. The lady tasting tea was one of the first experiments designed with randomization. To my knowledge (derived from readings), it is a true story illustrating the null hypothesis and randomization.

Fisher only worked with null hypotheses; there is no alternative hypothesis in his experiments (alternatives were the work of Jerzy Neyman and Egon Pearson). The null here, and in every scenario, is the "default position": no difference between two methods, treatment groups, measurements, etc. Fisher used his P-value as a rough guide to the strength of evidence against the null.

The Lady Tasting Tea
I first read about the lady tasting tea in David Salsburg's book, which tells the stories of how statistics revolutionized science in the twentieth century. You can easily find the book because "The Lady Tasting Tea" is in the title (I personally received it as a gift from Dr. Wallace Chamon, an ophthalmologist and professor in Brazil). I found several references to the story online and even a full lecture on the topic by Deborah Nolan at UC Berkeley.

The story, as described, goes back to a summer afternoon in Cambridge, England, in the 1920s. A group of university scholars and their spouses were gathered for afternoon tea. A lady, Dr. Muriel Bristol, an algologist, was being served tea and said: "No thank you, I prefer my tea poured with milk first." Fisher responded: "Nonsense, it all tastes the same." William Roach, in the background (who probably had his eye on Bristol, as he later married her), called out: "Let's test her," and so the preliminary preparations began.

The Experiment
The null hypothesis was that the lady would have no ability to differentiate between cups with milk poured first and cups with tea poured first. The design considered: (i) the number of cups (more than 2, because with only 2 the lady would have a 50/50 chance of getting it right); (ii) whether they should be paired; (iii) in what order they should be presented; (iv) who prepares them, the portions, the right temperature, etc.

Example of randomly ordered cups (T: tea poured first; M: milk poured first)
The lady was then presented with 8 cups of tea in random order: 4 prepared by adding milk first and 4 prepared by adding tea first. Any statistical software can generate the random ordering; for example, the RAND() function in Excel can assign random numbers to an ordered list of cups, which can then be sorted on those numbers. I took a snapshot of how Excel does this (table on the right), where I assigned M for milk poured first and T for tea poured first.
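If you prefer code to spreadsheets, here is a minimal sketch in Python that does the same shuffling (the M/T coding is the same one I used above):

```python
# A minimal sketch of the randomization step (not Fisher's original procedure):
# shuffle 4 milk-first (M) and 4 tea-first (T) cups into a random serving order.
import random

cups = ["M"] * 4 + ["T"] * 4   # 4 cups of each preparation
random.shuffle(cups)           # random serving order
print(cups)                    # e.g. ['T', 'M', 'M', 'T', 'T', 'M', 'T', 'M']
```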

The lady was then asked to identify the cups. Fisher was only willing to reject the null hypothesis if she categorized all the cups correctly, recognizing her ability at a 1.4% significance level. Here is where the 1.4% came from: there are C(8,4) = 70 ways of choosing which 4 of the 8 cups had milk poured first, and only one of those choices is completely correct, so the probability of identifying every cup correctly by pure guessing is 1/70 ≈ 0.014, or about 1.4%.
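You can check that number in a line of Python (a quick sketch using only the standard library):

```python
# Probability of labelling all 8 cups correctly by chance:
# one correct choice out of C(8, 4) equally likely ways to pick the 4 milk-first cups.
from math import comb

p_all_correct = 1 / comb(8, 4)
print(f"{p_all_correct:.3f}")   # 0.014, i.e. about 1.4%
```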


I am not sure whether the results of the experiment were ever formally presented, but the conclusion in the end was that Dr. Bristol was indeed able to differentiate between the cups. While my attempt to design an experiment with chamomile tea and vodka was fun, alcohol has an overpowering taste and should be tested on its own. I should probably run this experiment comparing two types of vodka (for example, British and Russian vodka) and see if anyone can really tell the difference, similar to how Deborah Nolan ran the experiment comparing Mexican and American Coca-Cola. (Note: this is only a tasting experiment; the subject will not drink the 8 cups.)

The main reason I decided to write about this topic is to lay the foundation for future discussions of the null hypothesis, the alternative hypothesis, and significance tests, and how they are inconveniently married. When I discussed this story with Dr. Sandeep Jain (the Director of the Corneal Neurobiology Laboratory at the University of Illinois at Chicago), his first reaction was that "the greatest discoveries are observational and they come about by a fair degree of luck and chance. They come about by ways you don't expect them to." This one came about from a lady wanting to drink her tea poured with milk first.

_____________________________________________________________________________
References 

David Salsburg. The Lady Tasting Tea: How Statistics Revolutionized Science in the Twentieth Century. New York: W.H. Freeman and Company, 2001.

Deborah Nolan. Lecture 10: Fisher's "Lady Tasting Tea" Experiment. Lecture given at UC Berkeley. CosmoLearning. http://www.cosmolearning.com/video-lectures/lecture-10-fishers-lady-tasting-tea-experiment-10081/

Joan Fisher Box (1978). R.A. Fisher, the Life of a Scientist. New York: Wiley. p. 134.

Charlie Gibbons. Fisher's Exact Test and its Extensions. University of California, Berkeley. Fall 2012. http://cgibbons.us/courses/are210/NPTestsNotes.pdf. Last accessed July 31, 2013.







Sunday, July 14, 2013

Ockham's razor, frequentist-Bayesian, and parsimonious regression models


With all else being equal, the simpler explanation is more likely to be the correct one: this is what we understand as Occam's razor (William of Ockham). In other words, simpler models are favored until the data can justify, with as few assumptions as possible, a more complex one. This philosophical notion is applied across several scientific disciplines.

Given Occam's principle, how does one go about reaching a simple, parsimonious model to predict disease risk? What is the difference between adopting a probabilistic frequentist approach and a subjective (Bayesian) one; in other words, should there be a Bayesian viewpoint in the analysis of epidemiological research data?

Frequentist and subjective probability
Frequentist methods are the Fisherian P-values (R.A. Fisher) and the confidence intervals that remain the norm in biostatistics and epidemiological data analysis (what we see in published clinical and epidemiological studies). They are based on notions of objectivity and on likelihood functions that help in reaching conclusions. Frequentist techniques are highly effective in randomized trials. In observational studies, however, a frequentist model may become more questionable (potentially misleading), as we are more likely to be confronted with confounding, selection bias, and measurement error. That is when Bayesian methods may be worth looking into. Even though Bayesian methods have been criticized for their imprecision, their total reliance on prior parameter distributions, and for being largely based on subjective and arbitrary elements, it has been suggested that they may be useful, since prior estimates can be generated by applying the same formulas that frequentists use. An article by Sander Greenland in 2006 provides a clear explanation of this topic with clear examples.

What are Bayesian methods (subjective probability)?
Subjective probability can be defined as the degree of belief that x is true. Probability in this context does not represent a feature of the external world but rather of personal, subjective interpretation. In subjective probability, we are not interested in any kind of long-run frequency behavior.

For example, what is the probability that your flight to Hawaii on January 28, 2014 will be cancelled?
In this case you are not interested in long-run frequency behavior; you are interested in predicting whether your flight will be cancelled on this specific day (one single occasion). There is a certain degree of belief in whether this event will occur, a subjective attitude toward the proposition that the flight will be cancelled. Knowing that flights are more likely to be cancelled in the middle of "storm season", you may determine that the probability is high. Another example: what is the probability of you getting a heart attack on your 80th birthday? Betting games are widely known to have developed from subjective probability. In larger contexts and data sets, Bayesian methods are applied by subjectively determining prior distributions and combining them with current models.

The parallel between Bayesian and frequentist methods is the conditional model: frequentists work with the likelihood, P(data | parameters), while Bayes' rule inverts the conditioning to give the posterior, P(parameters | data).

Conditional probability (Bayes' rule): P(A | B) = P(A) × P(B | A) / P(B),

where the probability that A is true given that B is true, P(A | B), is known as the posterior probability of A.
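As a toy illustration of the rule with the flight example, here is a short sketch; the numbers are entirely hypothetical and chosen only to show the mechanics:

```python
# Bayes' rule with made-up numbers for the flight example:
# A = "flight is cancelled", B = "departure falls in storm season".
p_A = 0.02          # prior belief: 2% of flights are cancelled overall (hypothetical)
p_B_given_A = 0.80  # most cancellations happen during storm season (hypothetical)
p_B = 0.25          # a quarter of flights depart during storm season (hypothetical)

p_A_given_B = p_A * p_B_given_A / p_B   # posterior probability of cancellation
print(f"P(cancelled | storm season) = {p_A_given_B:.3f}")   # 0.064
```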

If you had observational data and wanted to model an outcome Y, say breast cancer, given several covariates x, you would need to build a logistic model using either automated (mechanical) selection methods or assumptions about confounding and interaction. In certain cases the model ends up resting on statistical cut-off points that may conflict with contextual information, and most such models are criticized as biased and as carrying too many assumptions. In this case, can one consider developing a model using Bayesian priors that is no more arbitrary than the frequentist data model? Can one replace arbitrary variable selection with prior distributions? The articles by Greenland suggest yes: the concept is to pool a hypothetical prior study with the current study (adding the results from the hypothetical study of priors as a new stratum of data). A rough flavor of how a prior can stand in for variable selection is sketched below.
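Here is a minimal sketch of that flavor; it is not Greenland's actual data-augmentation procedure, but an L2 (ridge) penalty on a logistic model's coefficients corresponds to a normal prior centered at zero, so comparing an essentially unpenalized fit with a penalized one shows how a prior can do the shrinking that arbitrary variable selection tries to do by hand. The data are simulated and purely illustrative.

```python
# Sketch: a normal(0) prior, expressed as an L2 penalty, versus a near-flat prior.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
n, p = 500, 10                                        # 500 subjects, 10 candidate risk factors
X = rng.normal(size=(n, p))
true_beta = np.array([1.0, -0.5] + [0.0] * (p - 2))   # only 2 factors actually matter
y = rng.binomial(1, 1 / (1 + np.exp(-(X @ true_beta))))

flat_prior = LogisticRegression(C=1e6, max_iter=5000).fit(X, y)    # ~ no prior information
normal_prior = LogisticRegression(C=1.0, max_iter=5000).fit(X, y)  # normal prior centered at 0

print("near-flat prior :", np.round(flat_prior.coef_[0], 2))
print("normal(0) prior :", np.round(normal_prior.coef_[0], 2))
# The prior-informed coefficients of the irrelevant factors shrink toward zero,
# which is the behavior that arbitrary variable selection tries to approximate.
```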

In conclusion, would the recipe for reaching a parsimonious model (making Ockham happy) using observational data include:
1) Common sense and ingenuity
2) A frequency model with few assumptions (frequentist approach)
3) A priors model (Bayesian perspective)

Should this become common practice?
_____________________________________________________________________
References and good reads:

Savage LJ. Subjective Probability and Statistical Practice. The Foundations of Statistical Inference. 1962

Greenland S. Bayesian perspectives for epidemiological research: I. Foundations and basic methods. Int J Epidemiol. 2006 Jun;35(3):765-75. 

Greenland S. Bayesian perspectives for epidemiological research. II. Regression analysis. Int J Epidemiol. 2007 Feb;36(1):195-202. 





Friday, July 5, 2013

The Normal Distribution-Bell-Shaped Curve-Central Limit Theorem


Galileo's role in the history of data distributions is that he was the first to suggest that measurement errors deserve a systematic and scientific treatment. He did so through his observations of the distances between stars and of a star's distance from the center of the earth. The observations we make are burdened with errors, and those observations are distributed symmetrically about the true value; equivalently, the errors are distributed symmetrically about zero.

The bell-shaped curve

Abraham de Moivre (1667-1754) proved that the central limit theorem holds for simple collections of numbers arising from games of chance. He is known as the father of the normal distribution, and the first appearance of a bell-shaped curve was in his book The Doctrine of Chances, although the curve has often been attributed to Carl Friedrich Gauss (1777-1855).

A normal distribution simply means that observations of a certain variable follow a continuous probability distribution characterized by a mean and a standard deviation (the dispersion of the data around the mean, calculated as the square root of the average of the squared deviations from the mean). If the mean equals zero and the standard deviation equals one, we have what is known as a standard normal distribution. The important thing to know about a normal distribution is that about 68% of the area under the curve lies within 1 standard deviation of the mean, about 95% lies within 2 standard deviations, and about 99.7% lies within 3 standard deviations. You may also hear that a normal distribution is symmetric about its mean. In a perfect world you might see plotted continuous values distributed with complete symmetry around the mean; however, we live in a messy world, and you will rarely ever see perfect symmetry (maybe only in the stars).
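Those percentages are easy to verify numerically; here is a quick sketch using SciPy's standard normal distribution:

```python
# Area under the standard normal curve within k standard deviations of the mean.
from scipy.stats import norm

for k in (1, 2, 3):
    area = norm.cdf(k) - norm.cdf(-k)
    print(f"within {k} SD: {area:.3f}")   # 0.683, 0.954, 0.997
```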

The normal distribution is famous because of the central limit theorem, which is a fascinating phenomenon. It is always refreshing to be able to attribute the existence of one phenomenon to another, and that is the relationship between the central limit theorem and the normal distribution. The central limit theorem simply means that if we take samples from an original, arbitrarily distributed variable (which could be discrete), calculate the average of each sample, and plot those means, we get a distribution that is approximately normal. The larger the samples and the more samples you take, the more normally distributed your plot will be.

Example:
Imagine Gauss, Fisher, Pearson, Cox, and Student (the pen name of William Sealy Gosset, the statistician who developed Student's t-test) were all participating in a show called the World Idol of Statistics (disregarding time and space here). Voters choose their idol by dialing in their votes and pressing 1 for Gauss, 2 for Fisher, 3 for Pearson, 4 for Cox, and 5 for Student (yes, somewhat similar to the singing Idol talent shows). Plotting the results of these votes would reveal a discrete probability distribution. Now take 50 samples of, say, n = 10 each and plot the frequency of the means of these samples; you will start to see a normal distribution pattern.
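Here is a minimal simulation of that example (the vote shares are made up purely for illustration):

```python
# Central limit theorem demo: means of samples drawn from a discrete vote distribution.
import numpy as np

rng = np.random.default_rng(42)
idols = [1, 2, 3, 4, 5]                      # Gauss, Fisher, Pearson, Cox, Student
vote_probs = [0.30, 0.25, 0.20, 0.15, 0.10]  # hypothetical vote shares

sample_means = [rng.choice(idols, size=10, p=vote_probs).mean() for _ in range(50)]

# A crude text histogram of the 50 sample means: roughly bell-shaped around 2.5.
counts, edges = np.histogram(sample_means, bins=8)
for c, lo, hi in zip(counts, edges[:-1], edges[1:]):
    print(f"{lo:4.2f}-{hi:4.2f} | {'*' * c}")
```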

_____________________________________________________________________________
References and Further Reading:

G. Galilei, Dialogue Concerning the Two Chief World Systems—Ptolemaic & Copernican (S. Drake translator), 2nd ed., Berkeley, Univ. California Press, 1967.

Hald, Anders (1990), "De Moivre and the Doctrine of Chances, 1718, 1738, and 1756", History of Probability and Statistics and Their Applications before 1750, Wiley Series in Probability and Statistics.

Stahl S. The evolution of the normal distribution. http://mathdl.maa.org/images/upload_library/22/Allendoerfer/stahl96.pdf. Last accessed July 5, 2013.

Salsburg D. The Lady Tasting Tea: How Statistics Revolutionized Science in the Twentieth Century. Holt Paperbacks. First published in hardcover in 2001 by WH Freeman and Company. 

http://www.stat.uchicago.edu/events/normal/fatherND.html

http://www.robertnowlan.com/pdfs/de%20Moivre,%20Abraham.pdf