Learn to calculate the standard deviation


The standard deviation, represented either by the Greek letter σ (sigma) or by the letter S, is a measure of the variability of a data series. More precisely, it measures the typical deviation of the data in a sample or a population from their mean, and so indicates how dispersed the data are around that measure of central tendency.

A high standard deviation indicates that, on average, the data is far from the mean in both directions (the data is very spread out), while a small standard deviation indicates the opposite.
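As a quick illustration of this idea, the sketch below compares two made-up data sets (the values are assumptions chosen for the example, not data from this article) that share the same mean but are spread out very differently. It uses the `pstdev` function from Python's standard `statistics` module.

```python
import statistics

# Two hypothetical data sets with the same mean (50) but different spread.
tight = [49, 50, 50, 51]
wide = [20, 40, 60, 80]

for name, values in (("tight", tight), ("wide", wide)):
    mean = statistics.mean(values)
    sd = statistics.pstdev(values)  # population formula, used here for simplicity
    print(f"{name}: mean = {mean}, standard deviation = {sd:.2f}")
```

The data set whose values lie far from 50 produces the larger standard deviation, even though both means are identical.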

The standard deviation is always calculated as the square root of another measure of variability, called the variance. There are several ways to calculate the variance depending on the type of data available (sample or population), which results in more than one way to calculate the standard deviation.
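The relationship between variance and standard deviation can be sketched in a few lines of Python. The data values are hypothetical, and the functions come from the standard `statistics` module; this is only an illustration, not part of the original procedure.

```python
import math
import statistics

data = [2, 4, 4, 4, 5, 5, 7, 9]  # hypothetical values, for illustration only

# Population case: compute the variance, then take its square root.
population_variance = statistics.pvariance(data)
population_sd = math.sqrt(population_variance)

# Sample case: same idea, but the variance is computed with n - 1.
sample_variance = statistics.variance(data)
sample_sd = math.sqrt(sample_variance)

print(population_variance, population_sd)
print(sample_variance, sample_sd)
```

The library's own `pstdev` and `stdev` functions return the same two standard deviations directly.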

Slightly different formulas are used in each case; they are described in the next section. The sections after that explain how to calculate each standard deviation step by step and “by hand”, and how to use calculators with statistical functions and spreadsheets such as Excel or Google Sheets to calculate this important statistical measure.

There are two types of standard deviation

In statistics there are two kinds of descriptive measures of a data series, depending on whether all the data of a population or only those of a sample are available. Measures that describe a population are called population parameters and are usually represented with Greek letters. Measures that describe a sample are called statistics and are usually represented with Latin letters.

In view of this, there are two types of standard deviation:

  • The population standard deviation, which is a population parameter represented by the Greek letter σ (lowercase sigma).
  • The sample standard deviation, which is a statistic represented by the letter S.

Below are the formulas for calculating both types of standard deviation.

Formulas to calculate the population standard deviation σ

\[
\sigma = \sqrt{\frac{\sum_{i=1}^{n} (x_i - \mu)^2}{n}} = \sqrt{\frac{\sum_{i=1}^{n} x_i^2 - \frac{\left(\sum_{i=1}^{n} x_i\right)^2}{n}}{n}}
\]

In these equations, xᵢ represents the value of each individual data item, μ is the population mean, and n is the total number of data items in the population.

Formulas to calculate the sample standard deviation S

\[
S = \sqrt{\frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n - 1}} = \sqrt{\frac{\sum_{i=1}^{n} x_i^2 - \frac{\left(\sum_{i=1}^{n} x_i\right)^2}{n}}{n - 1}}
\]

In these equations, xᵢ represents the value of each individual data item in the sample, x̄ is the sample mean, and n is the total number of data items in the sample.

The only real difference between the two calculations is the divisor: n in one case and n – 1 in the other. Dividing by n – 1 compensates for the fact that the deviations are measured from the sample mean rather than from the population mean, which are usually not the same; deviations measured from the sample mean tend to come out slightly too small.
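The role of the divisor can be sketched as a single Python function with a switch between the two cases. The function name `standard_deviation`, its `is_sample` parameter and the data values are assumptions made for this illustration.

```python
import math


def standard_deviation(values, is_sample=True):
    """Square root of the averaged squared deviations from the mean.

    Divides by n - 1 for a sample and by n for a whole population.
    """
    n = len(values)
    if n == 0 or (is_sample and n < 2):
        raise ValueError("not enough data items")
    mean = sum(values) / n
    sum_squared_deviations = sum((x - mean) ** 2 for x in values)
    divisor = n - 1 if is_sample else n
    return math.sqrt(sum_squared_deviations / divisor)


values = [10, 12, 23, 23, 16, 23, 21, 16]  # hypothetical data
print(standard_deviation(values, is_sample=False))  # divides by n
print(standard_deviation(values, is_sample=True))   # divides by n - 1
```

For the same data, the sample result is always somewhat larger than the population result, and the gap shrinks as n grows.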

What formula should be used?

The only thing to consider in deciding which of the formulas to use is whether the data for which the standard deviation is to be calculated represents all the data in a population or represents only a sample. This is usually evident from the statement (in case a statistical problem is being solved) or from the way the data was obtained.

TIP: When in doubt, it is safest to assume that this is a sample, since you rarely have all the data for a population.

As for choosing between the first form of each formula (built from the deviations of each data item from the mean) and the second form (built from the sum of the data and the sum of their squares), both give the same result for σ and for S. However, it is more practical to use the second form, even though it may seem more complicated. The reason is simple: it requires fewer steps, because the mean and the individual deviations never have to be calculated.
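The reason both forms agree can be seen by expanding the sum of squared deviations. The short derivation below is written for the sample case, using x̄ = (∑xᵢ)/n; the population case is identical with μ in place of x̄.

```latex
\begin{aligned}
\sum_{i=1}^{n} (x_i - \bar{x})^2
  &= \sum_{i=1}^{n} x_i^2 - 2\bar{x}\sum_{i=1}^{n} x_i + n\bar{x}^2 \\
  &= \sum_{i=1}^{n} x_i^2 - \frac{2\left(\sum_{i=1}^{n} x_i\right)^2}{n}
     + \frac{\left(\sum_{i=1}^{n} x_i\right)^2}{n} \\
  &= \sum_{i=1}^{n} x_i^2 - \frac{\left(\sum_{i=1}^{n} x_i\right)^2}{n}
\end{aligned}
```

Dividing either side by n – 1 (or by n for a population) and taking the square root gives the two equivalent forms.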

How to calculate the standard deviation “by hand”

Below we present the steps that must be carried out to calculate the standard deviation, using an example to illustrate the process.

Problem

The time that a sample of 15 cars took to fill the fuel tank at a service station was determined. The data, measured in seconds, is presented below:

71 65 48 76 80
64 42 55 80 66
53 49 70 67 42

Determine the standard deviation.

Solution: in this case, the statement specifies that the data correspond to a sample, so we will use the equation for the sample standard deviation:

\[
S = \sqrt{\frac{\sum_{i=1}^{n} x_i^2 - \frac{\left(\sum_{i=1}^{n} x_i\right)^2}{n}}{n - 1}}
\]

To apply this formula, we only need to calculate the sum of the data (∑xᵢ), the sum of the squares of the data (∑xᵢ²) and the total number of data items (n). This is easily accomplished through the following steps:

Step 1: Organize the data vertically

Calculating the standard deviation is easier if the data are arranged in a vertical list, as this simplifies the next steps. It is not strictly necessary, but it also helps to number each data item, since the last number gives the total number of data items (n) that the formula requires. The data do not need to be sorted in any way.

#    xᵢ    xᵢ²
1    71
2    65
3    48
4    76
5    80
6    64
7    42
8    55
9    80
10   66
11   53
12   49
13   70
14   67
15   42

Step 2: Calculate the square of each data item

The next step is to square each individual data item and then write the result in a column next to it.

#    xᵢ    xᵢ²
1    71    5041
2    65    4225
3    48    2304
4    76    5776
5    80    6400
6    64    4096
7    42    1764
8    55    3025
9    80    6400
10   66    4356
11   53    2809
12   49    2401
13   70    4900
14   67    4489
15   42    1764

Step 3: Sum all the original data

We add all the values that appear in the column labeled xᵢ and write the result at the bottom of that column.

Step 4: Add all the squares of the data and write the result at the bottom of the column

We add all the values that appear in the column labeled xᵢ² and write the result at the bottom of that column. After performing steps 3 and 4, the table will look like this:

#    xᵢ    xᵢ²
1    71    5041
2    65    4225
3    48    2304
4    76    5776
5    80    6400
6    64    4096
7    42    1764
8    55    3025
9    80    6400
10   66    4356
11   53    2809
12   49    2401
13   70    4900
14   67    4489
15   42    1764

Number of data (n)    Sum of data (∑xᵢ)    Sum of squares (∑xᵢ²)
15                    928                  59750
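Steps 1 to 4 can be double-checked with a short Python sketch that rebuilds the table from the 15 fill-up times in the problem statement. The variable names are arbitrary choices for this illustration.

```python
# Fill-up times in seconds, taken from the problem statement.
times = [71, 65, 48, 76, 80, 64, 42, 55, 80, 66, 53, 49, 70, 67, 42]

# Step 2: square each data item.
squares = [x ** 2 for x in times]

# Steps 3 and 4: total each column.
n = len(times)
sum_x = sum(times)
sum_x2 = sum(squares)

# Print the numbered table followed by the totals.
for index, (x, x2) in enumerate(zip(times, squares), start=1):
    print(f"{index:>2} {x:>4} {x2:>6}")
print(f"n = {n}, sum = {sum_x}, sum of squares = {sum_x2}")
# Last line printed: n = 15, sum = 928, sum of squares = 59750
```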

Step 5: Apply the standard deviation formula

The last step is simply to substitute the totals from the bottom of the table into the formula:

\[
S = \sqrt{\frac{59750 - \frac{928^2}{15}}{15 - 1}} = \sqrt{\frac{2337.73}{14}} \approx \sqrt{166.98} \approx 12.92 \text{ seconds}
\]
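The substitution in Step 5 can be verified numerically. The sketch below applies the sum-of-squares form of the sample formula to the totals from the table, and then compares the result with the `stdev` function from Python's standard `statistics` module as an independent check.

```python
import math
import statistics

# Totals taken from the bottom of the table.
n, sum_x, sum_x2 = 15, 928, 59750

# Sample variance from the totals, then its square root.
sample_variance = (sum_x2 - sum_x ** 2 / n) / (n - 1)
sample_sd = math.sqrt(sample_variance)
print(round(sample_sd, 2))  # 12.92

# Independent check against the library function on the raw data.
times = [71, 65, 48, 76, 80, 64, 42, 55, 80, 66, 53, 49, 70, 67, 42]
library_sd = statistics.stdev(times)
print(math.isclose(sample_sd, library_sd))  # True
```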

How to Calculate Standard Deviation with Statistical Calculator

Most scientific and financial calculators have special functions to facilitate the calculation of the measures of central tendency and dispersion used in statistics. The key names differ between models, but the overall procedure is broadly the same:

Step 1 – Enter Statistics Mode

Calculators usually have a special mode for statistical functions. It is typically accessed by pressing the MODE button followed by the number shown on the screen next to STAT, SD (for standard deviation) or something similar.

Step 2 – Clean Up Memory

Older calculators do not show on the screen whether data are already stored in memory, so it is always a good idea to clear the memory before beginning. To do this, press the CLR or MCL key and then select the MODE option (this erases only the data stored in statistics mode). In many cases it is necessary to re-enter statistics mode after this step.

Step 3: enter all the data

The data are entered one by one, pressing the DT or DATA key (or its equivalent) after each value.

Step 4: get the result

The last step is simply to ask the calculator for the standard deviation. Where the results are located varies greatly between models and brands of calculators. On some you press the SHIFT key followed by the key that has S-VAR printed above it; on others the procedure is different. It is advisable to consult the calculator's manual.

Once we get the right menu, we must select which of the two standard deviations we need. If it is population data, we select the option that says σ or σ(n). If it is sample data, we select the option that says σ(n-1) or S.
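One way to picture the calculator workflow is as a small memory that keeps running totals while data are entered. The Python sketch below is only a hypothetical model: the class `StatMemory` and its methods are invented for this illustration and do not describe the internals of any particular calculator.

```python
import math


class StatMemory:
    """Hypothetical stand-in for a calculator's statistics memory."""

    def __init__(self):
        self.clear()

    def clear(self):
        # Step 2: erase any data left over from a previous calculation.
        self.n = 0
        self.sum_x = 0
        self.sum_x2 = 0

    def enter(self, value):
        # Step 3: each entry updates the three running totals.
        self.n += 1
        self.sum_x += value
        self.sum_x2 += value * value

    def _sum_squared_deviations(self):
        return self.sum_x2 - self.sum_x ** 2 / self.n

    def population_sd(self):
        # Corresponds to the option labeled σ or σ(n).
        return math.sqrt(self._sum_squared_deviations() / self.n)

    def sample_sd(self):
        # Corresponds to the option labeled σ(n-1) or S.
        return math.sqrt(self._sum_squared_deviations() / (self.n - 1))


memory = StatMemory()
memory.clear()
for value in [71, 65, 48, 76, 80, 64, 42, 55, 80, 66, 53, 49, 70, 67, 42]:
    memory.enter(value)

# Step 4: read off whichever standard deviation is needed.
print(round(memory.sample_sd(), 2))      # 12.92
print(round(memory.population_sd(), 2))  # 12.48
```

The model also shows why clearing the memory matters: any leftover values would remain in the running totals and distort both results.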

How to Calculate Standard Deviation in Microsoft® Excel™

The easiest way to calculate the standard deviation is with a spreadsheet such as Excel or Google Sheets. These programs have built-in functions for the statistical measures that we may need. The calculation takes two simple steps:

Step 1: paste or add the data

This is as simple as entering the data into separate cells, one value per cell (in columns, rows or a rectangular block; the layout does not matter). In the case of our example, the 15 values are placed in cells B1 to F3.

Step 2: Write the formula for the standard deviation we need

This depends on the spreadsheet being used and the language it is set to. In Microsoft® Excel™, the formulas for the standard deviation are:

Sample Standard Deviation (S): =STDEV.S(data 1, data 2, …, data n)
Population Standard Deviation (σ): =STDEV.P(data 1, data 2, …, data n)

In the Spanish version of Excel the same functions are named DESVEST.M and DESVEST.P, and the arguments are typically separated by semicolons instead of commas.

You do not need to enter the individual data; just select the cells into which the data have already been pasted. In our example, the data are in the range from cell B1 to cell F3, which is written as B1:F3, so the sample standard deviation is obtained with =STDEV.S(B1:F3).

Finally, press the ENTER key and the standard deviation is displayed.
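For readers who prefer a script to a spreadsheet, the same two results can be sketched in Python. Here the nested list `grid` is an assumed stand-in for the 3-row by 5-column block of cells, and `stdev` and `pstdev` from the standard `statistics` module play the roles of the sample and population functions.

```python
import statistics

# The example data laid out as a 3 x 5 block, as in the spreadsheet range.
grid = [
    [71, 65, 48, 76, 80],
    [64, 42, 55, 80, 66],
    [53, 49, 70, 67, 42],
]

# The layout does not matter: flatten the block into one list of values.
cells = [value for row in grid for value in row]

sample_sd = statistics.stdev(cells)       # sample standard deviation (S)
population_sd = statistics.pstdev(cells)  # population standard deviation (σ)

print(f"S = {sample_sd:.2f}")      # S = 12.92
print(f"σ = {population_sd:.2f}")  # σ = 12.48
```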


Israel Parada (Licentiate, Professor ULA)
(Licentiate in Chemistry) - AUTHOR. University professor of Chemistry. Science communicator.
