The population standard deviation is one of the most important population parameters for measuring the variability, or dispersion, of the data within a population. Like other population parameters, it is represented by a Greek letter, in this case σ (sigma). This distinguishes it from the sample standard deviation (s), which is similar but is neither the same quantity nor calculated with the same formulas.
Next, we will see, by means of an example, different ways to calculate the standard deviation of a population. Note that calculating the population standard deviation requires knowing every data value in the population. This rarely happens in real contexts, but it is still important to understand how the calculation works, as it helps to understand some of the mathematical characteristics of this parameter.
Population Standard Deviation Formulas
Depending on the data available, the population standard deviation can be determined using three different formulas.
Mathematical definition of the population standard deviation
The standard deviation is defined as the square root of the variance, σ². That is, if we know the variance of the population, we can calculate the standard deviation using the following equation:
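Written in symbols, the definition above reads as follows. The label (1) is an assumption, inferred from the later references to equations 2 to 5.

```latex
\sigma = \sqrt{\sigma^{2}} \qquad \text{(1)}
```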
This case rarely occurs, but it is good to keep in mind.
Other population standard deviation formulas
If instead of knowing the variance of a population, we know all the N data items that comprise it, then we can calculate the population standard deviation as the square root of the average of the squared deviations from the mean. That is to say:
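Transcribing the verbal definition (the square root of the average of the squared deviations from the mean) into symbols gives the expression below. The summation index i running from 1 to N is a notational assumption.

```latex
\sigma = \sqrt{\frac{\sum_{i=1}^{N} (x_i - \mu)^{2}}{N}} \qquad \text{(2)}
```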
In this equation, xᵢ represents the value of each data item in the population, N represents the number of data items in the population (that is, the size of the population), and μ is the population mean. Note that the population mean is also represented by a Greek letter because it is another population parameter, and that the size of the population is written N (capital letter) to distinguish it from n, which is usually associated with the size of a sample.
The population mean, μ, is given by:
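In symbols, the mean is the sum of all N data values divided by N. The label (3) matches the way the worked example later refers to this formula.

```latex
\mu = \frac{\sum_{i=1}^{N} x_i}{N} \qquad \text{(3)}
```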
Equation 2 can be expanded, rearranged, and simplified to obtain:
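A sketch of that expansion is shown below. It assumes the rearranged formula is written in terms of the sum of the data and the sum of their squares, which is consistent with the quantities used in Method 2 of the worked example. Other algebraically equivalent layouts are possible.

```latex
\begin{aligned}
\sum_{i=1}^{N} (x_i - \mu)^{2}
  &= \sum x_i^{2} - 2\mu \sum x_i + N\mu^{2} \\
  &= \sum x_i^{2} - \frac{\left(\sum x_i\right)^{2}}{N}
  \qquad \text{since } \mu = \frac{\sum x_i}{N} \\[4pt]
\sigma
  &= \sqrt{\frac{\sum x_i^{2} - \dfrac{\left(\sum x_i\right)^{2}}{N}}{N}}
  \qquad \text{(4)}
\end{aligned}
```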
If, instead of the individual data of the population, we have data grouped in a frequency table, the previous formulas are modified slightly to give:
In the above equations, the quantity under the square root is simply the population variance. Equation 4 has the advantage of being expressed exclusively in terms of the population data, rather than in terms of a population parameter (μ) as in equations 2 and 5.
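For grouped data, a common way to write the two formulas is sketched below. The symbols are assumptions, not taken from the original: k is the number of distinct values or classes, xᵢ is the value (or class midpoint) of class i, fᵢ is its frequency, and N is the sum of the frequencies. The label (6) is also an assumption.

```latex
\mu = \frac{\sum_{i=1}^{k} f_i x_i}{N}, \qquad N = \sum_{i=1}^{k} f_i

\sigma = \sqrt{\frac{\sum_{i=1}^{k} f_i (x_i - \mu)^{2}}{N}} \qquad \text{(5)}

\sigma = \sqrt{\frac{\sum_{i=1}^{k} f_i x_i^{2}
         - \dfrac{\left(\sum_{i=1}^{k} f_i x_i\right)^{2}}{N}}{N}} \qquad \text{(6)}
```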
Example of calculating the population standard deviation
Suppose we want to determine the variability in the weight of a particular car model of which only 20 examples are known to exist worldwide. The data of the weights in kilograms of these 20 cars are presented in the following table:
410 | 408 | 408 | 405 | 391 | 390 | 402 | 397 | 397 | 395 |
390 | 404 | 397 | 394 | 399 | 397 | 405 | 408 | 410 | 400 |
Since we know that there are only 20 cars of this model, these represent the entire population, so we have all the data needed to determine the population standard deviation. Let’s look at three different ways to determine this standard deviation.
Method 1: Calculation based on the definition of variance
This method is based on the use of equation 2 presented above. As we can see, the equation requires the use of the population mean and another series of calculations that are detailed below:
Step 1: Determine the population mean
The population mean or μ is calculated by means of equation 3, adding all the data and dividing by the total number of data, which is, in this case, 20.
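Carrying out this sum with the 20 weights in the table gives:

```latex
\mu = \frac{410 + 408 + 408 + \dots + 400}{20}
    = \frac{8007\ \text{kg}}{20}
    = 400.35\ \text{kg}
```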
Step 2: Calculate the deviations from the mean
This step involves calculating the differences (xᵢ – μ). For example:
x₁ – μ = 410 kg – 400.35 kg = 9.65 kg
x₂ – μ = 408 kg – 400.35 kg = 7.65 kg
x₃ – μ = 408 kg – 400.35 kg = 7.65 kg
…
x₂₀ – μ = 400 kg – 400.35 kg = -0.35 kg
The results are presented in the following table:
xᵢ / kg | (xᵢ – μ) / kg |
410 | 9.65 |
408 | 7.65 |
408 | 7.65 |
405 | 4.65 |
391 | -9.35 |
390 | -10.35 |
402 | 1.65 |
397 | -3.35 |
397 | -3.35 |
395 | -5.35 |
390 | -10.35 |
404 | 3.65 |
397 | -3.35 |
394 | -6.35 |
399 | -1.35 |
397 | -3.35 |
405 | 4.65 |
408 | 7.65 |
410 | 9.65 |
400 | -0.35 |
Step 3: Square all deviations from the mean
(x₁ – μ)² = (9.65 kg)² = 93.1225 kg²
(x₂ – μ)² = (7.65 kg)² = 58.5225 kg²
(x₃ – μ)² = (7.65 kg)² = 58.5225 kg²
…
(x₂₀ – μ)² = (-0.35 kg)² = 0.1225 kg²
The results are presented in the following table:
xᵢ / kg | (xᵢ – μ) / kg | (xᵢ – μ)² / kg² |
410 | 9.65 | 93.1225 |
408 | 7.65 | 58.5225 |
408 | 7.65 | 58.5225 |
405 | 4.65 | 21.6225 |
391 | -9.35 | 87.4225 |
390 | -10.35 | 107.1225 |
402 | 1.65 | 2.7225 |
397 | -3.35 | 11.2225 |
397 | -3.35 | 11.2225 |
395 | -5.35 | 28.6225 |
390 | -10.35 | 107.1225 |
404 | 3.65 | 13.3225 |
397 | -3.35 | 11.2225 |
394 | -6.35 | 40.3225 |
399 | -1.35 | 1.8225 |
397 | -3.35 | 11.2225 |
405 | 4.65 | 21.6225 |
408 | 7.65 | 58.5225 |
410 | 9.65 | 93.1225 |
400 | -0.35 | 0.1225 |
Step 4: Add up all the squared deviations
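Adding the 20 values in the third column of the table gives:

```latex
\sum_{i=1}^{20} (x_i - \mu)^{2}
  = 93.1225 + 58.5225 + 58.5225 + \dots + 0.1225
  = 838.55\ \text{kg}^{2}
```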
Step 5: Apply the formula of equation 2
Now that we have this sum, all that remains is to substitute it, together with the number of data items (20), into equation 2:
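With a sum of squared deviations of 838.55 kg² and N = 20, the substitution works out as:

```latex
\sigma = \sqrt{\frac{838.55\ \text{kg}^{2}}{20}}
       = \sqrt{41.9275\ \text{kg}^{2}}
       \approx 6.48\ \text{kg}
```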
Thus, the standard deviation of the weight of the population of 20 cars is approximately 6.5 kg.
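The five steps of Method 1 can also be checked with a short script. The following is a minimal Python sketch, not part of the original procedure; the variable names (`weights_kg`, `sum_sq_dev`, and so on) are illustrative.

```python
from math import sqrt

# Weights in kilograms of the 20 cars from the example table.
weights_kg = [410, 408, 408, 405, 391, 390, 402, 397, 397, 395,
              390, 404, 397, 394, 399, 397, 405, 408, 410, 400]

n = len(weights_kg)                          # N, the population size
mu = sum(weights_kg) / n                     # Step 1: population mean
deviations = [x - mu for x in weights_kg]    # Step 2: deviations from the mean
squared = [d ** 2 for d in deviations]       # Step 3: squared deviations
sum_sq_dev = sum(squared)                    # Step 4: sum of squared deviations
sigma = sqrt(sum_sq_dev / n)                 # Step 5: divide by N, take the root

print(f"mu = {mu:.2f} kg")
print(f"sum of squared deviations = {sum_sq_dev:.2f} kg^2")
print(f"sigma = {sigma:.2f} kg")
```

Each intermediate list corresponds to one column of the tables above, so individual rows can be compared against the hand calculation.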
Method 2: Using the rearranged equation
Now we will carry out the same calculation using equation 4, which is equivalent to the equation we just used but is more practical, especially when working with a larger number of data items. The main benefit is that there is no need to first calculate an additional parameter (the population mean) in order to obtain the deviations; everything is calculated directly from the original individual data. In addition, at no point do you need to work with negative numbers, which are a major source of error among students.
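As a rough illustration of this data-only approach, here is a minimal Python sketch. It assumes equation 4 has the form σ = √[(Σxᵢ² – (Σxᵢ)²/N) / N]; the variable names are illustrative.

```python
from math import sqrt

# Weights in kilograms of the 20 cars from the example table.
weights_kg = [410, 408, 408, 405, 391, 390, 402, 397, 397, 395,
              390, 404, 397, 394, 399, 397, 405, 408, 410, 400]

n = len(weights_kg)                        # N, the population size
sum_x = sum(weights_kg)                    # sum of the individual data
sum_x2 = sum(x * x for x in weights_kg)    # sum of the squared data

# Assumed form of equation 4: no mean and no deviations are needed.
variance = (sum_x2 - sum_x ** 2 / n) / n
sigma = sqrt(variance)

print(f"sum of data = {sum_x} kg")
print(f"sum of squares = {sum_x2} kg^2")
print(f"sigma = {sigma:.2f} kg")
```

Because the weights are whole numbers, both sums are exact integers here. With floating-point data of very different magnitudes, subtracting two large, nearly equal quantities can lose precision, so this form is best suited to hand calculation or small data sets.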
Step 1: Calculate the square of each individual data value
That is, the following calculations are carried out:
(x₁)² = (410 kg)² = 168,100 kg²
(x₂)² = (408 kg)² = 166,464 kg²
(x₃)² = (408 kg)² = 166,464 kg²
…
(x₂₀)² = (400 kg)² = 160,000 kg²
The results are presented in the following table:
xᵢ / kg | xᵢ² / kg² |
410 | 168,100 |
408 | 166,464 |
408 | 166,464 |
405 | 164,025 |
391 | 152,881 |
390 | 152,100 |
402 | 161,604 |
397 | 157,609 |
397 | 157,609 |
395 | 156,025 |
390 | 152,100 |
404 | 163,216 |
397 | 157,609 |
394 | 155,236 |
399 | 159,201 |
397 | 157,609 |
405 | 164,025 |
408 | 166,464 |
410 | 168,100 |
400 | 160,000 |
Step 2: Add up all the individual data
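Adding the 20 values in the first column of the table gives:

```latex
\sum_{i=1}^{20} x_i = 410 + 408 + 408 + \dots + 400 = 8007\ \text{kg}
```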
Step 3: Add all the squares
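Adding the 20 values in the second column of the table gives:

```latex
\sum_{i=1}^{20} x_i^{2}
  = 168{,}100 + 166{,}464 + 166{,}464 + \dots + 160{,}000
  = 3{,}206{,}441\ \text{kg}^{2}
```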
Step 4: Apply the formula of equation 4
The last step is to introduce these two values and the number of data in equation 4 to obtain the population standard deviation:
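Using a sum of the data of 8007 kg, a sum of squares of 3,206,441 kg² and N = 20, and assuming equation 4 is written with the sum of squares minus the squared sum over N, the substitution works out as:

```latex
\begin{aligned}
\sigma &= \sqrt{\frac{3{,}206{,}441 - \dfrac{(8007)^{2}}{20}}{20}}
        = \sqrt{\frac{3{,}206{,}441 - 3{,}205{,}602.45}{20}} \\[4pt]
       &= \sqrt{\frac{838.55}{20}}
        = \sqrt{41.9275}
        \approx 6.48\ \text{kg}
\end{aligned}
```

The numerator, 838.55 kg², is the same sum of squared deviations obtained in Method 1, which is why both methods give the same result.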
Method 3: Using spreadsheets
Spreadsheets such as Microsoft Excel, Apple Numbers or Google Sheets include among their basic functions the direct calculation of the standard deviation (both sample and population). These functions take a data set as an argument and carry out all the calculations shown in the previous method to directly return the standard deviation in the cell where the formula is entered.
The procedure is as follows:
Step 1: Enter the data in the spreadsheet
We can enter the data in the form of a column, row or matrix anywhere in the spreadsheet. The following screenshot shows what the data for this problem looks like in Excel 2016.
Step 2: Use the formula to calculate the standard deviation
Once the data has been added, we use the standard deviation function, placing the cells where the data is found as arguments.
To call a function in a spreadsheet, we usually start by typing the equals sign (=) followed by the name of the function we want to use. The names change slightly from one application to another and in some cases also change depending on the language in which you are working.
In the case of Excel (English version), the function to calculate the population standard deviation is called STDEV.P, while in Google Sheets it is STDEVP (without the period). You must then enter the argument(s) of the function between parentheses. In our example, we pass as an argument the range of cells in which the data is located (from cell A3 to cell J4), so the full formula in Excel is =STDEV.P(A3:J4).
By pressing ENTER, the program runs the function and calculates the standard deviation of the population, presenting the result in the respective cell, as shown below:
As we can see, all three methods practiced here produce the same result. They are simply different ways of doing the same thing.
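The same one-call approach exists outside spreadsheets. As an aside that is not part of the original procedure, the sketch below uses `statistics.pstdev` from Python's standard library, which plays the same role as the spreadsheet's population standard deviation function. The list name `weights_kg` is illustrative.

```python
import statistics

# Weights in kilograms of the 20 cars from the example table.
weights_kg = [410, 408, 408, 405, 391, 390, 402, 397, 397, 395,
              390, 404, 397, 394, 399, 397, 405, 408, 410, 400]

# pstdev treats the data as the entire population (it divides by N).
sigma = statistics.pstdev(weights_kg)

print(f"sigma = {sigma:.2f} kg")
```

The printed value can be compared directly with the result of the manual methods (about 6.48 kg).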
Other methods
In addition to the three methods mentioned above, scientific and financial calculators also often have a function to determine the standard deviation of a data set, be it sample or population. The way in which data is entered and results obtained varies from manufacturer to manufacturer, and even from one calculator model to another, so it is impractical to show the specific steps for doing so here.
Instead, we will outline the most important general steps without going into detail. Anyone wishing to use this function on their scientific calculator should refer to the user manual that came with the calculator, or search for it online, to find the specific key combination in each case.
Step 1: Clear memory
On many calculators, previously stored data is not visible. If we unknowingly enter new data on top of values that were already stored, the calculator will give a wrong result. To ensure that this does not happen, it is advisable to clear all of the calculator’s memory (or at least the memory of the statistics mode) before beginning to enter new data.
Step 2: Access statistics mode
The functions to calculate the standard deviation are part of the “Statistics” or simply “S” mode on most calculators, so we must start by entering this mode of operation.
Step 3: Enter the data
This varies from one calculator to another. In some cases data can be entered in table form, while in others each value is entered one at a time, followed by pressing the DT (or DAT) key. It is important to check the number of data items entered at the end of this step to ensure that none is missing.
Step 4: Calculate the population standard deviation
Once the data is entered, all that remains is to ask the calculator for the result we are looking for. On many calculators, both the sample and population standard deviations are represented by the symbol σ (even though this is incorrect for the sample deviation, whose symbol is s). However, we can tell them apart because the sample deviation is accompanied by n-1 (that is, it appears as σₙ₋₁) while the population deviation appears as σₙ. This reflects the fact that the sample standard deviation is calculated by dividing by n-1 instead of by n, as is done for the population.
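The effect of the two divisors can be seen in a minimal Python sketch. Treating the 20 car weights as a sample is done here purely for illustration (in the example they form the whole population), and the variable names are hypothetical.

```python
from math import sqrt
import statistics

# Weights in kilograms of the 20 cars from the example table.
weights_kg = [410, 408, 408, 405, 391, 390, 402, 397, 397, 395,
              390, 404, 397, 394, 399, 397, 405, 408, 410, 400]

n = len(weights_kg)
mean = sum(weights_kg) / n
sum_sq_dev = sum((x - mean) ** 2 for x in weights_kg)

sigma_n = sqrt(sum_sq_dev / n)                 # population: divide by n
sigma_n_minus_1 = sqrt(sum_sq_dev / (n - 1))   # sample: divide by n - 1

print(f"dividing by n:     {sigma_n:.2f} kg")
print(f"dividing by n - 1: {sigma_n_minus_1:.2f} kg")

# Cross-check against the standard library's two functions.
print(f"pstdev: {statistics.pstdev(weights_kg):.2f} kg")
print(f"stdev:  {statistics.stdev(weights_kg):.2f} kg")
```

Dividing by n - 1 always gives a slightly larger value than dividing by n, so picking the wrong key on the calculator produces a result that is close to, but not the same as, the one intended.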