Anyone with some knowledge of statistics knows the “law of large numbers”. It states that as the number of observations increases, the sample mean approaches the population mean. It is arguably the foundation of modern statistical inference. So what the heck is the “law of small numbers”?
Let’s take an example. Considering a random sequence of tosses of a fair coin, many people think that the sequence “H-T-T-H-T-H” is more likely to occur than “H-H-H-H-H-H” because the former looks more random than the latter (in fact, both have the same probability, (1/2)^6 = 1/64). Likewise, even some well-read students believe that after a series of heads, the next toss is more likely to be a tail. That is, several heads in a row make a tail on the next toss seem almost inevitable. Unfortunately, this is the famous “gambler’s fallacy”.
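To make the point concrete, here is a minimal sketch that enumerates all 64 equally likely outcomes of six tosses. It assumes a fair coin and independent tosses; the variable names are only illustrative.

```python
from itertools import product

# Enumerate every possible outcome of six tosses of a fair coin.
sequences = ["".join(s) for s in product("HT", repeat=6)]

# With a fair coin and independent tosses, each specific sequence is equally likely.
p_each = 1 / len(sequences)
print("P(HTTHTH) =", p_each, " P(HHHHHH) =", p_each)

# Among the sequences that start with five heads, see how the sixth toss falls.
streak_sequences = [s for s in sequences if s.startswith("HHHHH")]
p_tail_after_streak = sum(s.endswith("T") for s in streak_sequences) / len(streak_sequences)
print("P(tail | five heads so far) =", p_tail_after_streak)
```

Both sequences come out at 1/64, and the chance of a tail after five heads is still one half.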
People tend to think that “randomness” means something close to “fairness”. In other words, chance is seen as a “self-correcting” mechanism, and there seems to be an “equilibrium” over time. In the long run, randomness does look like “fairness” at the level of probabilities, but that is an overall assessment. For a random process with no inherent dependency, such as coin tossing, there is no such thing as “fairness” or “equilibrium” from one toss to the next. An abnormal pattern, once observed, is not corrected (and often cannot be replicated); it is instead diluted over time.
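The difference between “corrected” and “diluted” can be sketched with a little arithmetic. The hypothetical helper below assumes a run of six heads followed by further tosses of a fair coin, and works with expected values rather than a simulation.

```python
def after_streak(extra_tosses, streak=6):
    """Expected state after `streak` heads followed by `extra_tosses` fair tosses."""
    expected_heads = streak + extra_tosses / 2  # later tosses split evenly on average
    total = streak + extra_tosses
    proportion = expected_heads / total
    excess_heads = expected_heads - (total - expected_heads)  # heads minus tails
    return proportion, excess_heads

for n in (0, 10, 100, 10_000):
    proportion, excess = after_streak(n)
    print(f"{n:>6} more tosses: expected share of heads = {proportion:.4f}, "
          f"expected heads minus tails = {excess:.0f}")
```

The expected share of heads drifts toward 0.5, yet the expected surplus of six heads never shrinks: nothing pays the streak back, it just matters less and less.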
Some may resort to Bayesian inference in the coin-tossing case. The root of Bayesian inference is conditional probability. However, for a fair coin, the probability of a tail given all previous tosses is the same as the probability of a tail without any previous tosses. The independence of the tosses renders this type of Bayesian inference useless here (the correct way to apply the Bayesian method is discussed in the postscript).
Furthermore, if you treat a sequence of Bernoulli trials such as coin tosses as a binomial experiment (as many people have indeed argued), you unintentionally impose a condition on the experiment: it has an end. That is, there are only n tosses. Then the sample size problem kicks in (further flaws in treating a random process as a binomial distribution are detailed in the postscript).
People have the misconception that a small random sample should be representative of the whole population, that is, that the two should share the same characteristics. Or, in the coin-tossing example, that the local behavior of a random process should be much like its overall behavior. This is the fallacy of the “law of small numbers”.
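How unrepresentative can a small sample be? As a rough illustration, the sketch below computes exactly how often the share of heads in n fair tosses falls outside the 40% to 60% band. The band and the function name are arbitrary choices made for this example.

```python
from math import comb

def prob_far_from_half(n):
    """P(share of heads is outside 40%..60%) for n fair tosses, computed exactly."""
    # |k/n - 1/2| <= 1/10 is equivalent to |10k - 5n| <= n in integer arithmetic.
    close = sum(comb(n, k) for k in range(n + 1) if abs(10 * k - 5 * n) <= n)
    return 1 - close / 2**n

for n in (10, 100, 1000):
    print(f"n = {n:>4}: P(share of heads outside 40%..60%) = {prob_far_from_half(n):.6f}")
```

With ten tosses, roughly a third of all samples (0.34375) land outside the band even though the coin is perfectly fair; with a thousand tosses this practically never happens.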
Ignoring the role of sample size yields the “gambler’s fallacy”, which may seem trivial. But misunderstanding the assumptions behind these basic statistical concepts has serious consequences in research, and unfortunately even trained researchers are subject to this bias. For example, in meta-analysis reports it is very common to see that small studies have the most varied results, while large studies are more consistent (and are therefore weighted more heavily in the pooled analysis). Yet they all get published in pretty good journals.
The conclusion is that we should always remain skeptical of studies with small sample sizes. Well then, are large studies always good?
=============
Postscript:
Some readers left many insightful comments. Wasguru pointed out that the flaw in treating a random process such as coin tossing as a binomial distribution is that the binomial considers only the number of heads in N trials, not the order of the sequence. It loses a great deal of information.
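A small sketch of what “losing the order” means, assuming a fair coin and six tosses: the binomial lumps together every ordering with the same head count, so it cannot tell one specific sequence from another.

```python
from math import comb

n = 6
# Any single ordered sequence, e.g. H-T-T-H-T-H or H-H-H-H-H-H.
p_specific_sequence = 0.5**n

# Binomial probability of each head count: number of orderings times 1/64.
p_by_head_count = {k: comb(n, k) * p_specific_sequence for k in range(n + 1)}

for k, p in p_by_head_count.items():
    print(f"{k} heads: {comb(n, k):>2} orderings, probability {p:.6f}")
```

“Three heads in six tosses” covers 20 different orderings, while “six heads” covers only one. That may be part of why a mixed sequence feels more likely: it is easily confused with the whole class of sequences that share its head count.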
Furthermore, Wasguru also pointed out that the whole argument is based on a basic assumption: the coin is fair (or p=0.5). “If that assumption is subject to test,
Given six heads in a row, when you view it retrospectively as a binomial outcome, it has a small probability. “If you observed something with small probability, it won’t be corrected later,…you go back and modify your assumption, that’s what should happen” (008). This is similar to the argument by Wasguru.
Enlighten gave another interesting observation: “Although the difference between the observed probability and the expected probability will get closer to 0 if you have more trials, the difference between the NUMBERS of heads and tails actually tends to increase”. The abso
On the other hand, “a gambler will more likely to lose all his money the more he plays, even if it’s a fair coin” (Enlighten). Or “in gambling, the random walk has an absorbing barrier” (Wasguru). This is certainly true due to the finite nature of a gambler’s resources (time and money). There is a stopping point for any game player. If the gambler lost all of his money, or stops at an un
For an ideal coin-tossing game (fair and independent), any previous loss is a sunk cost. Once it’s gone, it is lost forever. Or maybe, as 008 asserted, “there is no “fair” game”. Then one should always apply Bayes’ rule to test the fairness of the history and to predict the future.
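One way to sketch this Bayesian updating is with a Beta prior on the unknown probability of heads. The uniform Beta(1, 1) prior and the function name below are assumptions made for illustration, not something the commenters specified.

```python
from fractions import Fraction

def prob_next_head(heads, tails, prior_heads=1, prior_tails=1):
    """Posterior predictive P(next toss is a head) under a Beta(prior_heads, prior_tails) prior."""
    # Observing the data turns the prior into Beta(prior_heads + heads, prior_tails + tails);
    # the predictive probability of a head is the mean of that posterior.
    return Fraction(prior_heads + heads, prior_heads + prior_tails + heads + tails)

# No strong opinion about the coin beforehand (uniform prior), then six heads in a row.
print(prob_next_head(6, 0))          # 7/8

# Strong prior belief that the coin is close to fair (Beta(50, 50)), same six heads.
print(prob_next_head(6, 0, 50, 50))  # 28/53
```

Either way, a run of heads pushes the estimate toward more heads, never toward a “compensating” tail. Only someone who is certain that p = 0.5 stays at exactly one half, and for them the history carries no information at all.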
Unfortunately, human beings tend to resort to intuitive thinking instead of careful reasoning, which leads to a psychological misconception: they expect the process to “correct” itself, but are unwilling to change their assumptions.