 Home Excel Access | Links | Alphabetical Index | Weighted Averages | # Using Excel for Statistics - Averages

Excel is a useful tool for statistical calculations. It is vital to realise however that the correct interpretation and understanding of the results is very important. Excel will display the results of calculations but will not explain their meaning or significance.

Statistics typically summarise a collection of data and may be able to forecast other values based on a limited sample. The simplest tools are the three types of averages.

 - A B C D E F G 2 Journeys A Journeys B Type A B 3 23 23 Mean 22.85 22.85 [E3] =AVERAGE(A3:A9) 4 24 22 Median 23 23 [E4] =MEDIAN(A3:A9) 5 27 23 Mode 19 23 [E5] =MODE(A3:A9) 6 28 23 Count 7 7 [E6] =COUNT(A3:A9) 7 19 24 8 20 23 9 19 22

The Mean is the common average - the sum of the values divided by the number (count) of values. The Median is the middle value if the numbers are sorted from lowest to highest. The Mode is the most commonly occurring value within the range.

#### Standard Deviation

Both of the sets of journeys have the same Mean and Median and yet a glance at the data immediately reveals that they are different. There is much more variation in the values of Journeys A. A measure called a Standard Deviation can describe how tightly the values are clustered around the Mean in each set of data. The higher the Standard Deviation, the greater the amount of scatter.

 - G H I 13 Type A B 14 Standard Deviation 3.71 0.69 15 [H14] =STDEV(A3:A9) 16 One Std Deviation 17 22.86 ± 3.72 (i.e. 19.14 to 26.58) 18 22.86 ± 0.69 (i.e. 22.17 to 23.55) These statistical techniques assume that the data is normally distributed - i.e. a single Mean value with most values close to either side of the Mean and the number of values tailing away the further you get away from the Mean - often described as a bell shaped distribution. In a Normal Distribution 68% of randomly selected values will be within 1 Standard Deviation from the Mean (i.e. 34% on either side of the mean). Approximately 95% of the values will be within 2 Standard Deviations.

In the example above, we would expect that 68% of journeys in group A will be within the range of 19.1 to 26.6 miles.

Note that Excel contains two functions for calculating the Standard Deviation. There is STDEV() and STDEVP(). The first one is for the more common situation in which there is a sample of data, while the second is for when you have all data for the entire population. In many circumstances the whole population is not known and a sample is taken in order to estimate the population characteristics (e.g. the likely number of journeys on any given day).

#### Confidence interval

Having calculated the mean journey length it is reasonable to assume that this value is a good indicator of the average of all related journeys - i.e. the whole population. It is very unlikely however that the result would be exactly 22.86 again. We can use a confidence interval to define the range of values within which the real mean value is likely to be found.  Get this information as a documentaccompanied by Excel worksheets Click here for details about obtaining thisfile. It has been rewritten for Excel 2010.