The standard deviation of a data set or of a sample from a certain population is a descriptive statistical parameter that measures the spread of values in that set. If the average of a set of values is calculated, the standard deviation evaluates the difference of the values in the set from the average.
The standard deviation is a non-negative real number. Since zero is a non-negative real number, it is worth asking when the standard deviation equals zero and what that means. This happens only in one very particular case: when all the values in the data set are exactly the same.
Standard deviation
When you have a data set, be it a sample from a certain population or a set of values produced by a certain system, two questions immediately arise: what single representative value can we associate with the data set, and how dispersed are its values?
In descriptive statistics, different parameters seek to answer these two questions. To evaluate the value with which we can associate the data set, we can calculate the average or arithmetic mean, the geometric mean, the harmonic mean, the mode, the mid-range or the median. In this case we will use the average or arithmetic mean: the average of a set of n values is the sum of all of them divided by the number of values n.
The spread of values in a set can be assessed by calculating the standard deviation, the range, or the interquartile range. The figure below shows the general formula used to calculate the standard deviation σ. Expressed in words: from each value of the set, denoted with the subscript i, we subtract the average of all the values; we square each of these differences and add them up; we divide the result by the number of values in the set minus 1; and we take the square root of this quotient.
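The verbal description above can be written compactly as an equation. This is a sketch of the usual notation: σ and the subscript i come from the text, while the symbol x̄ for the average is a notational assumption introduced here.

```latex
\bar{x} = \frac{1}{n} \sum_{i=1}^{n} x_i ,
\qquad
\sigma = \sqrt{\frac{1}{n-1} \sum_{i=1}^{n} \left( x_i - \bar{x} \right)^2 }
```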
The standard deviation has two different definitions, depending on the type of data we are analyzing. This difference implies a slightly different calculation. The standard deviation can be calculated on a population or on a sample.
If data is collected from all members of a population or a set, the standard deviation of a population must be used. If you are analyzing data that represents a sample from a larger population, you must use the standard deviation of a sample. The difference in the calculation is that, for the standard deviation of a sample, the sum of the squared differences between each value and the average is divided by the number of values minus 1 (n – 1), as shown in the figure. For the standard deviation of a population, the sum is divided by n.
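The two definitions can be compared side by side in a short Python sketch. The function name `std_dev`, its `sample` flag, and the data values are illustrative assumptions, not part of the original text; the standard library functions `statistics.stdev` (sample) and `statistics.pstdev` (population) are used only as a cross-check.

```python
import math
import statistics


def std_dev(values, sample=True):
    """Standard deviation as described in the text (hypothetical helper).

    sample=True divides by n - 1; sample=False divides by n (population).
    """
    n = len(values)
    mean = sum(values) / n
    # Sum of the squared differences between each value and the average.
    squared_diffs = sum((x - mean) ** 2 for x in values)
    divisor = n - 1 if sample else n
    return math.sqrt(squared_diffs / divisor)


data = [2, 4, 4, 4, 5, 5, 7, 9]  # illustrative values, not from the text

sample_sd = std_dev(data, sample=True)
population_sd = std_dev(data, sample=False)

print("sample:", sample_sd, "library:", statistics.stdev(data))
print("population:", population_sd, "library:", statistics.pstdev(data))
```

For the same data, the sample value is always at least as large as the population value, because the same sum is divided by the smaller number n – 1.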
The standard deviation equals zero.
The standard deviation σ calculated in this way evaluates the spread of the values in the set: the larger its value, the greater the spread. It is always a non-negative number, since it is the square root of a sum of squared values, none of which can be negative. So intuitively, if the value of the standard deviation is zero, the spread should be zero. And this occurs when all the values of the set coincide: there is no dispersion.
In turn, if all the values in the set match, the average also matches that value. According to the previous definition of the average, if the n values of the set are equal, their sum is that value multiplied by n; when this is divided by n to calculate the average, the two factors of n cancel and the average is equal to the single value of the set. Written as an equation, if there are n equal values, each denoted x, the average is calculated as
( x + x + x + x + x +…+ x )/ n = nx / n = x
Let’s see what happens when the standard deviation is calculated with the formula described previously. In that formula, each value x_i is equal to x, which in turn is equal to the average. Therefore, when the average is subtracted from each value x_i, the result is zero. Since every addend of the sum is zero, the sum is also zero, and so the standard deviation is zero.
We have seen that when all the values in a set are equal, the average is equal to that value and the standard deviation is zero. Now consider the reverse situation: is the standard deviation zero only if all the values in the set are equal?
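A quick numerical check of this argument, as a minimal sketch: the value 7.5 and the number of repetitions are arbitrary choices made for illustration, and the calculation relies on Python's standard `statistics` module.

```python
import statistics

constant_data = [7.5] * 6  # six copies of one illustrative value

mean = statistics.mean(constant_data)

# Every value equals the average, so every difference is zero.
deviations = [x - mean for x in constant_data]

sample_sd = statistics.stdev(constant_data)       # divides by n - 1
population_sd = statistics.pstdev(constant_data)  # divides by n

print(mean, deviations, sample_sd, population_sd)
```

Both the sample and the population standard deviation are zero here, because the divisor (n or n – 1) is irrelevant when the sum being divided is zero.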
To check this, let’s see what happens if just one value is different. The average can then no longer be equal to every value in the set, so at least one of the addends in the standard deviation calculation is non-zero. Since the summation is over squared values, no addend can be negative, and the addends cannot cancel each other out. The only way for a sum of non-negative numbers to be zero is for all the addends to be zero; therefore, the only way for the standard deviation to be zero is for all the values in the set to be equal to the mean, and therefore equal to each other.
Together, the two arguments show that the condition is both necessary and sufficient: the standard deviation of a set of values is zero if and only if all the values in the set are equal.
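The converse can also be illustrated numerically. In this sketch the helper `all_equal` and the data values are hypothetical, chosen only to show that changing a single value is enough to make the standard deviation positive.

```python
import statistics


def all_equal(values):
    """Return True when every value in the list is the same (hypothetical helper)."""
    return len(set(values)) == 1


almost_constant = [7.5, 7.5, 7.5, 7.5, 7.5, 8.0]  # one value differs

mean = statistics.mean(almost_constant)

# Squared differences can never be negative, so they cannot cancel out.
squared_deviations = [(x - mean) ** 2 for x in almost_constant]

sd = statistics.stdev(almost_constant)

print(all_equal(almost_constant), mean, sd)
```

Note that the single different value shifts the average away from 7.5, so in this example every squared difference is positive, not only the one belonging to the changed value.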
Source
Yadolah Dodge. The Concise Encyclopaedia of Statistics. New York: Springer, 2010.