In calculating the standard deviation, two situations must be considered: the standard deviation of a population or a set of values, and the standard deviation of a sample.
Before going into the two definitions, recall that the standard deviation σ is a parameter that measures the dispersion of a set of values. Once the average of a set of values has been calculated, the standard deviation evaluates how far the values in the set lie from that average. The average of a set of n values is defined as the sum of all the values divided by n. The general formula used to calculate the standard deviation σ consists of the following steps: from each value in the set, which we denote with the subscript i, we subtract the average of all the values; we square each of these differences and add them up; we divide the result by the number of values in the set minus 1; and we take the square root of this value.
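Written out in symbols, the verbal description above corresponds to the expression below. The names x_i (each value), x̄ (the average) and n (the number of values) are notation assumed here for illustration, following the description in the text:

```latex
\bar{x} = \frac{1}{n} \sum_{i=1}^{n} x_i ,
\qquad
\sigma = \sqrt{ \frac{ \sum_{i=1}^{n} \left( x_i - \bar{x} \right)^2 }{ n - 1 } }
```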
Although both definitions of standard deviation assess variability, there are conceptual differences between calculating it for a population and for a sample. The difference has to do with the distinction between a statistic and a parameter. If data is collected from all members of a population, or a fully defined data set is studied, the calculation is the standard deviation of a population. If the data being analyzed represents a sample from a larger population, the calculation is the standard deviation of a sample. The figure below graphically illustrates the difference. The standard deviation of a population is a parameter with a definite value; the standard deviation of a sample is a statistic computed from a set of data, whose result is projected onto a larger set. Its value depends on the sample drawn, so it is not a definite value as it is in the case of a population.
In practice, the difference in definition implies a slightly different calculation. In the case of the standard deviation of a sample, the sum of the squared differences between each value and the average is divided by the number of values minus 1 (n – 1), as shown in the previous formula. In the case of the standard deviation of a population, it is divided by n.
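The two calculations can be sketched as a single Python function in which only the divisor changes. The function name `std_dev` and its `sample` flag are hypothetical names chosen for this illustration, not part of any library:

```python
import math


def std_dev(values, sample=True):
    """Return the standard deviation of `values`.

    Illustrative helper (hypothetical name): divides the sum of squared
    differences by n - 1 for a sample, or by n for a population.
    """
    n = len(values)
    if n == 0 or (sample and n < 2):
        raise ValueError("not enough values to compute the standard deviation")

    mean = sum(values) / n
    # Sum of the squared differences between each value and the average.
    sum_squared_diffs = sum((x - mean) ** 2 for x in values)

    divisor = n - 1 if sample else n
    return math.sqrt(sum_squared_diffs / divisor)
```

Calling `std_dev(data)` treats the data as a sample, while `std_dev(data, sample=False)` treats it as a complete population.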
Example
Let’s work through an example to fix ideas. We take a set of values and calculate the standard deviation according to the two definitions. The set contains 5 values (n = 5):
1, 2, 4, 5, 8
The average of these values is calculated as follows:
(1 + 2 + 4 + 5 + 8)/5 = 20/5 = 4
The squared differences between each value and the average are as follows:
(1 – 4)² = 9
(2 – 4)² = 4
(4 – 4)² = 0
(5 – 4)² = 1
(8 – 4)² = 16
The sum of these five squared differences is 30.
To calculate the standard deviation of the population, this sum must be divided by n, which is 5 in this example, and the result is 6. To calculate the standard deviation of the sample, it must be divided by n – 1, which is 4 in this case, and the result is 7.5. To complete the calculation we take the square root: approximately 2.4495 if the set is a population, and approximately 2.7386 if it is a sample.
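As a quick check, the same figures can be reproduced with Python's standard `statistics` module, where `pvariance` and `pstdev` divide by n (population) and `variance` and `stdev` divide by n – 1 (sample). This is only a verification sketch; the variable names are chosen for illustration:

```python
import statistics

values = [1, 2, 4, 5, 8]

# Population versions divide the sum of squared differences by n.
population_variance = statistics.pvariance(values)
population_sd = statistics.pstdev(values)

# Sample versions divide the sum of squared differences by n - 1.
sample_variance = statistics.variance(values)
sample_sd = statistics.stdev(values)

print(population_variance, round(population_sd, 4))  # 6 2.4495
print(sample_variance, round(sample_sd, 4))          # 7.5 2.7386
```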
Source
Yadolah Dodge. The Concise Encyclopaedia of Statistics. New York: Springer, 2010.