Difference between the standard deviations of a sample and a population

Article reviewed and approved by our editorial team, following YuBrain's writing and editing criteria.


In calculating the standard deviation, two situations must be considered: the standard deviation of a population or a set of values, and the standard deviation of a sample.

Before going into the two definitions, let us recall that the standard deviation σ measures the dispersion of a set of values: it evaluates how far the values in the set lie from their average. The average of a set of n values is the sum of all of them divided by n. The standard deviation of a sample is calculated as follows: from each value in the set, denoted with the subscript i, we subtract the average of all the values; we square each of these differences and add them up; we divide the result by the number of values in the set minus 1; and we take the square root of this quotient.
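The steps just described can be written compactly as follows. This is a reconstruction from the verbal description, not the article's original figure; the symbols x_i (the i-th value) and x̄ (the average) are notation assumed here.

```latex
\bar{x} = \frac{1}{n}\sum_{i=1}^{n} x_i,
\qquad
\sigma = \sqrt{\frac{1}{n-1}\sum_{i=1}^{n}\left(x_i - \bar{x}\right)^2}
```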

Standard deviation σ of a sample.

Although both definitions of standard deviation assess variability, there are conceptual differences between calculating it on a population and on a sample. The difference comes down to the distinction between a parameter and a statistic. If data is collected from every member of a population, or a fully defined data set is studied, the calculation is the standard deviation of a population. If the data being analyzed is a sample drawn from a larger population, the calculation is the standard deviation of a sample. The figure below graphically illustrates the difference. The standard deviation of a population is a parameter with a definite value. The standard deviation of a sample is a statistic: it is computed from a subset of the data and its result is projected onto the larger set. Its value depends on which sample is drawn, so it is not a definite value as it is in the case of a population.

Population and sample.
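To make the dependence on the sample concrete, here is a minimal sketch in Python. The population (the integers 1 to 100), the sample size of 10 and the random seed are all assumptions chosen for illustration; none of them come from the article.

```python
import random
import statistics

# Hypothetical population: the integers 1..100 (an assumption for illustration).
population = list(range(1, 101))

# Population standard deviation: one fixed value (divides by n).
population_sd = statistics.pstdev(population)

rng = random.Random(0)  # seeded so that repeated runs draw the same samples
sample_sds = []
for _ in range(3):
    sample = rng.sample(population, 10)          # 10 values, without replacement
    sample_sds.append(statistics.stdev(sample))  # sample formula (divides by n - 1)

print("population:", round(population_sd, 3))
print("samples:   ", [round(s, 3) for s in sample_sds])
```

Each sample yields its own estimate, while the population value does not change from run to run.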

Quantitatively, the difference in definition implies a slightly different calculation. For the standard deviation of a sample, the sum of the squared differences between each value and the average is divided by the number of values minus 1 (n – 1), as in the formula described above. For the standard deviation of a population, that sum is divided by n.
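In formula form, the population version and its relation to the sample version can be sketched as below. The subscripts "pop" and "sample" are labels introduced here for clarity, since the article uses σ for both; x_i and x̄ are again assumed notation for the i-th value and the average.

```latex
\sigma_{\mathrm{pop}} = \sqrt{\frac{1}{n}\sum_{i=1}^{n}\left(x_i - \bar{x}\right)^2},
\qquad
\sigma_{\mathrm{sample}} = \sigma_{\mathrm{pop}}\,\sqrt{\frac{n}{n-1}}
```

Because n/(n – 1) is greater than 1, the sample value is always the larger of the two, and the gap shrinks as n grows.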

Example

Let's work through an example. We take a set of values and calculate the standard deviation according to both definitions. The set contains 5 values (n = 5):

1, 2, 4, 5, 8

The average of these values is

(1 + 2 + 4 + 5 + 8)/5 = 20/5 = 4

The squared differences between each value and the average are

(1 – 4)² = 9

(2 – 4)² = 4

(4 – 4)² = 0

(5 – 4)² = 1

(8 – 4)² = 16

The sum of these five squared differences is 30.

For the standard deviation of the population, this sum is divided by n, which is 5 in this example, and the result is 6. For the standard deviation of the sample, it is divided by n – 1, which is 4 in this case, and the result is 7.5. To complete the calculation we take the square root: approximately 2.4495 if the set is a population, and approximately 2.7386 if it is a sample.
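The same arithmetic can be checked with a short Python sketch. The variable names are chosen here for illustration; the cross-check at the end uses `pstdev` and `stdev` from Python's standard `statistics` module.

```python
import math
import statistics

values = [1, 2, 4, 5, 8]
n = len(values)

mean = sum(values) / n                             # 20 / 5 = 4.0
squared_diffs = [(x - mean) ** 2 for x in values]  # 9, 4, 0, 1, 16
sum_sq = sum(squared_diffs)                        # 30.0

population_sd = math.sqrt(sum_sq / n)        # divide by n:     sqrt(6)
sample_sd = math.sqrt(sum_sq / (n - 1))      # divide by n - 1: sqrt(7.5)

print(round(population_sd, 4))  # 2.4495
print(round(sample_sd, 4))      # 2.7386

# Cross-check against the standard library implementations.
assert math.isclose(population_sd, statistics.pstdev(values))
assert math.isclose(sample_sd, statistics.stdev(values))
```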

Source

Yadolah Dodge. The Concise Encyclopaedia of Statistics. New York: Springer, 2010.

Sergio Ribeiro Guevara (Ph.D.)
(Doctor of Engineering) - CONTRIBUTOR. Science communicator. Nuclear physics engineer.
