The importance of the central limit theorem

Article reviewed and approved by our editorial team, following YuBrain's writing and editing criteria.

The central limit theorem is a fundamental theorem of probability theory. The term “central” means fundamental, or of central importance; it was coined by George Pólya in 1920 to signal the theorem's relevance within probability theory. The theorem has several versions, proposed by different mathematicians. In essence, it says that, under certain hypotheses, the distribution of the sum of a very large number of random variables approximates a normal distribution.

The Central Limit Theorem

The statement of the central limit theorem is abstract, but it can be understood step by step. Suppose we have a simple random sample of n items from a population of interest. From this sample we can calculate the sample mean, which serves as an estimate of the population mean. A distribution of the sample mean can be generated by repeatedly drawing simple random samples of the same size from the same population and calculating the mean of each one. Each of the simple random samples must be independent of the others.
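The procedure just described can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the population is a hypothetical, artificially generated skewed data set (exponential values), and the helper name `sample_means`, the sample size of 50, and the 2,000 repetitions are arbitrary choices, not values from the article.

```python
import random
import statistics


def sample_means(population, n, repeats, rng):
    """Draw `repeats` independent simple random samples of size n
    (each without replacement) and return the mean of each sample."""
    return [statistics.fmean(rng.sample(population, n)) for _ in range(repeats)]


rng = random.Random(0)  # fixed seed so the run is reproducible

# Hypothetical skewed population: 20,000 exponential values (mean close to 1).
population = [rng.expovariate(1.0) for _ in range(20_000)]

n = 50
means = sample_means(population, n=n, repeats=2_000, rng=rng)

pop_mean = statistics.fmean(population)
pop_std = statistics.pstdev(population)

print(f"population mean:           {pop_mean:.3f}")
print(f"mean of the sample means:  {statistics.fmean(means):.3f}")
print(f"std of the sample means:   {statistics.stdev(means):.3f}")
print(f"population std / sqrt(n):  {pop_std / n ** 0.5:.3f}")
```

The list `means` is an empirical version of the distribution of the sample mean. Its center should sit close to the population mean, and its spread close to the population standard deviation divided by the square root of n.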

The central limit theorem concerns the distribution of sample means, and says that this distribution approximates a normal distribution. The larger the simple random samples, the better the approximation. Note that the theorem establishes that, under these conditions and provided the population has a finite variance, the distribution of the sample mean is approximately normal regardless of the distribution of the population itself. Even if the population has a skewed distribution, a frequent situation when studying quantities such as people’s income or weight, the distribution of the sample mean will be approximately normal if the sample size is large enough.
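One way to see the approximation improve with sample size is to track the skewness of the sample means, which is zero for a perfectly symmetric distribution such as the normal. The sketch below assumes a hypothetical exponential population (strongly right-skewed) and draws values directly from it; the `skewness` helper and the sample sizes tried are illustrative choices only.

```python
import random
import statistics


def skewness(values):
    """Skewness as the mean cubed z-score; 0 for a symmetric distribution."""
    m = statistics.fmean(values)
    s = statistics.pstdev(values)
    return statistics.fmean([((v - m) / s) ** 3 for v in values])


rng = random.Random(1)  # fixed seed so the run is reproducible
repeats = 5_000

skew_by_n = {}
for n in (1, 5, 30, 200):
    # Each entry is the mean of one independent sample of size n
    # drawn from an exponential distribution (an assumed, skewed population).
    means = [
        statistics.fmean([rng.expovariate(1.0) for _ in range(n)])
        for _ in range(repeats)
    ]
    skew_by_n[n] = skewness(means)

for n, skew in skew_by_n.items():
    print(f"n = {n:>3}: skewness of sample means = {skew:.2f}")
```

With n = 1 the "sample means" are just the raw, heavily skewed observations. As n grows, the skewness should shrink toward zero, which is what the theorem leads us to expect.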

This is where the importance of the central limit theorem lies: it lets us simplify statistical problems by working with a distribution that can be treated as normal. Many highly relevant procedures, such as hypothesis tests and the construction of confidence intervals, depend on being able to treat the distribution of the sample mean as normal.
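As an example of such an application, the following sketch builds a confidence interval for a population mean using the normal approximation. It is a minimal illustration, not a definitive method: the data are hypothetical lognormal values standing in for skewed, income-like measurements, and the function name `mean_confidence_interval` is invented for this example.

```python
import random
import statistics


def mean_confidence_interval(sample, confidence=0.95):
    """Normal-approximation confidence interval for the population mean.

    Relies on the sample mean being approximately normal, which is
    reasonable only when the sample is large enough.
    """
    n = len(sample)
    mean = statistics.fmean(sample)
    std_err = statistics.stdev(sample) / n ** 0.5
    # Two-sided critical value of the standard normal (about 1.96 for 95%).
    z = statistics.NormalDist().inv_cdf(0.5 + confidence / 2)
    return mean - z * std_err, mean + z * std_err


rng = random.Random(2)  # fixed seed so the run is reproducible

# Hypothetical skewed sample: 400 lognormal values.
sample = [rng.lognormvariate(10, 0.5) for _ in range(400)]

low, high = mean_confidence_interval(sample)
print(f"sample mean: {statistics.fmean(sample):.0f}")
print(f"95% confidence interval for the mean: ({low:.0f}, {high:.0f})")
```

The interval never assumes that the individual observations are normal; it uses only the approximate normality of their mean.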

It is not difficult to find real-world data sets that show outliers, skewed distributions, or multiple peaks. By applying the central limit theorem with an appropriate sample size, however, problems in which the population is not normally distributed can still be addressed. Therefore, even if the distribution of the population under study is unknown, the central limit theorem ensures that, if we take large enough samples, the distribution of the sample mean can be approximated by a normal distribution. In specific situations, an exploratory analysis of the data can help to determine how large the sample must be for the approximation to be adequate.
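One possible form of such an exploratory check is to simulate how often a nominal 95% interval actually contains the true mean for different sample sizes. The sketch below assumes a known, hypothetical lognormal population so that the true mean is available; in practice one would resample from pilot data instead. The sample sizes and the helper name `covers` are illustrative.

```python
import math
import random
import statistics

# Two-sided 95% critical value of the standard normal.
Z_95 = statistics.NormalDist().inv_cdf(0.975)


def covers(sample, true_mean):
    """True if the normal-approximation 95% interval contains true_mean."""
    n = len(sample)
    mean = statistics.fmean(sample)
    half_width = Z_95 * statistics.stdev(sample) / math.sqrt(n)
    return mean - half_width <= true_mean <= mean + half_width


rng = random.Random(3)  # fixed seed so the run is reproducible
true_mean = math.exp(0.5)  # exact mean of a lognormal with mu=0, sigma=1
repeats = 2_000

coverage = {}
for n in (10, 50, 200):
    hits = sum(
        covers([rng.lognormvariate(0, 1) for _ in range(n)], true_mean)
        for _ in range(repeats)
    )
    coverage[n] = hits / repeats

for n, rate in coverage.items():
    print(f"n = {n:>3}: empirical coverage = {rate:.3f} (nominal 0.950)")
```

If the empirical coverage falls well short of the nominal 95% for a given n, that sample size is probably too small for the normal approximation to be trusted with data this skewed.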

Source

Jimena Blaiotta, Pablo Delieutraz. Central Limit Theorem. Faculty of Exact and Natural Sciences, University of Buenos Aires, Argentina, 2004.

Sergio Ribeiro Guevara (Ph.D.)
(Doctor of Engineering) - CONTRIBUTOR. Science communicator. Nuclear physics engineer.
