The experimental (or empirical) probability pertains to data taken from a number of trials. It is a probability calculated from experience, not from theory. If a sample of
Experimental probability contrasts with theoretical probability, which describes what we would expect to happen. For example, if we flip a coin
In statistical terms, the empirical probability is an estimate of a probability. In simple cases, where the result of a trial only determines whether or not the specified event has occurred, modeling using a binomial distribution might be appropriate. A binomial distribution is the discrete probability distribution of the number of successes in a sequence of
If a trial yields more information, the empirical probability can be improved on by adopting further assumptions in the form of a statistical model: if such a model is fitted, it can be used to estimate the probability of the specified event. For example, one can easily assign a probability to each possible value in many discrete cases: when throwing a die, each of the six values
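As a minimal sketch of the basic idea, the snippet below computes an empirical probability as a relative frequency and attaches the approximate standard error that a binomial model would give. The function names, the simulated die rolls, the seed and the sample size of 600 are illustrative assumptions, not part of the original text.

```python
import random
from math import sqrt


def empirical_probability(outcomes, event):
    """Fraction of observed outcomes for which event(outcome) is true."""
    outcomes = list(outcomes)
    if not outcomes:
        raise ValueError("need at least one trial")
    hits = sum(1 for x in outcomes if event(x))
    return hits / len(outcomes)


def binomial_standard_error(p_hat, n):
    """Approximate standard error of p_hat, assuming independent trials."""
    return sqrt(p_hat * (1 - p_hat) / n)


# Simulated rolls of a fair die (hypothetical data; the seed is arbitrary).
rng = random.Random(0)
rolls = [rng.randint(1, 6) for _ in range(600)]

p_hat = empirical_probability(rolls, lambda r: r == 6)
se = binomial_standard_error(p_hat, len(rolls))
print(f"empirical P(six) = {p_hat:.3f} +/- {se:.3f}; theoretical = {1 / 6:.3f}")
```

The empirical value will generally differ from the theoretical one; the standard error gives a rough idea of how large that difference can plausibly be for a given number of trials.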
Advantages
An advantage of estimating probabilities empirically is that the procedure requires few assumptions. For example, consider estimating, for a population of men, the probability of satisfying two conditions:
- They are over six feet in height.
- They prefer strawberry jam to raspberry jam.
A direct estimate could be found by counting the number of men who satisfy both conditions to give the empirical probability of the combined condition.
An alternative estimate could be found by multiplying the proportion of men who are over six feet in height by the proportion of men who prefer strawberry jam to raspberry jam, but this estimate relies on the assumption that the two conditions are statistically independent.
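The two estimates can be compared on a small made-up sample. The sketch below is illustrative only: the sample of ten men and the function names are hypothetical, and the data were chosen so that the two conditions are not independent.

```python
def direct_estimate(sample):
    """Empirical probability that both conditions hold, counted directly."""
    both = sum(1 for tall, strawberry in sample if tall and strawberry)
    return both / len(sample)


def independence_estimate(sample):
    """Product of the two marginal proportions (assumes independence)."""
    n = len(sample)
    p_tall = sum(1 for tall, _ in sample if tall) / n
    p_strawberry = sum(1 for _, strawberry in sample if strawberry) / n
    return p_tall * p_strawberry


# Hypothetical sample of (over_six_feet, prefers_strawberry) pairs.
sample = (
    [(True, True)] * 3
    + [(True, False)] * 1
    + [(False, True)] * 2
    + [(False, False)] * 4
)

print(f"direct estimate:       {direct_estimate(sample):.2f}")        # 0.30
print(f"independence estimate: {independence_estimate(sample):.2f}")  # 0.20
```

Here the direct count gives 3/10, while the product of the marginal proportions gives 4/10 × 5/10 = 2/10. The gap arises because, in this invented sample, height and jam preference are associated.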
Disadvantages
A disadvantage of using empirical probabilities is that, without theory to "make sense" of them, it is easy to draw incorrect conclusions. When rolling a six-sided die one hundred times, it is entirely possible that well over
This shortcoming becomes particularly problematic when estimating probabilities that are either very close to zero or very close to one. For example, the probability of drawing a number from between
In these cases, very large sample sizes would be needed in order to estimate such probabilities to a good standard of relative accuracy. Here statistical models can help, depending on the context.
For example, consider estimating the probability that the lowest of the maximum daily temperatures at a site in February in any one year is less than zero degrees Celsius. A record of such temperatures in past years could be used to estimate this probability. A model-based alternative would be to select a family of probability distributions and fit it to the data set containing the values from past years. The fitted distribution would provide an alternative estimate of the desired probability. This alternative method can provide an estimate of the probability even if all values in the record are greater than zero.
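The temperature example can be sketched as follows. The record of ten yearly values is invented, and the choice of a normal distribution is an assumption made purely for illustration; in practice another family (for instance an extreme-value distribution) may suit yearly minima better.

```python
from statistics import NormalDist

# Hypothetical record: for each past year, the lowest of the daily maximum
# temperatures in February, in degrees Celsius. All values are above zero.
record = [1.2, 3.5, 0.8, 2.9, 4.1, 1.7, 2.3, 0.5, 3.0, 2.2]

# Empirical estimate: the fraction of years in which the value fell below zero.
empirical = sum(1 for t in record if t < 0) / len(record)

# Model-based estimate: fit a normal distribution (an assumed family) to the
# record and read off the probability of a value below zero.
fitted = NormalDist.from_samples(record)
model_based = fitted.cdf(0.0)

print(f"empirical estimate:   {empirical:.4f}")
print(f"model-based estimate: {model_based:.4f}")
```

Because no year in this record fell below zero, the empirical estimate is exactly zero, whereas the fitted distribution assigns the event a small positive probability. How trustworthy that figure is depends on how well the assumed family describes the lower tail of the data.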