How to determine the correct average value of data

By MUNGAI KIHANYA

The Sunday Nation

Nairobi,

17 December 2006

 

Reacting to last week’s article, Peter Ngash says, “… the mean is misleading due to the fact that a few extremely large values can inflate [it]… The median works better … [It] is the value in the middle when the data is arranged in ascending order…”

You are entirely correct, Peter. There are several ways of finding the average of a given set of data, namely, the mean, the median and the mode. Last week I used the mean to demonstrate that the opinion of a few people can accurately represent that of the whole population. I started off by looking at the “average” height of an adult male.

In that example, I was right to use the mean (strictly speaking, arithmetic mean) as the average because the data is unlikely to be widely scattered – the height of adult males is fairly restricted within a narrow range. However, if I were looking for the average height of a human being of any age, the story would be different.

The data would vary from very many one-foot babies to very few basketball-playing giants. In that case, the mean would be a misleading average and the median would be preferred.
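To see the effect in numbers, here is a minimal sketch in Python. The heights are invented purely for illustration (they are not measurements), and the variable names are my own. A few tall adults pull the mean well above the height of most people in the group, while the median stays close to the typical value.

```python
from statistics import mean, median

# Hypothetical heights in centimetres: mostly small children, a few tall adults.
heights_cm = [50, 52, 55, 58, 60, 62, 65, 170, 175, 210]

mean_height = mean(heights_cm)      # sum of the values divided by how many there are
median_height = median(heights_cm)  # middle value once the data is sorted

print("mean:  ", mean_height)
print("median:", median_height)
```

With these made-up figures, seven of the ten people are 65 cm or shorter, yet the mean comes out above 90 cm. The median, at just over 60 cm, is a fairer description of the group.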

The mode is another average that can be used. It is the most frequently occurring value in a set of data. For the height of adult males (measured to the nearest centimetre), it is unlikely that any value would appear more than once in a group of ten men.

Nevertheless, the mode is the preferred average when dealing with discrete values, that is, counts of objects that do not come in fractions. For example, what is the average number of seats in a public service vehicle in Kenya?

Now, if you counted the total number of seats in a group of buses, matatus and taxis and then calculated the mean, you would almost certainly get an answer with a decimal part, but what is the meaning of, say, 0.672 of a seat? Nonsense? Maybe!

It would be better to list the numbers and then determine the most frequently occurring one. The answer would be a discrete quantity without a decimal part – as required. My guess is 14: there seem to be more 14-seaters than any other kind.
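The same comparison can be sketched in Python. The seat counts below are hypothetical, chosen only to mimic a mix of taxis, matatus and buses; they are not survey data. The mean lands on a fractional number of seats, while the mode returns a value that actually occurs.

```python
from statistics import mean, mode

# Hypothetical seat counts for ten vehicles (taxis, matatus and buses).
seat_counts = [14, 14, 33, 14, 4, 51, 14, 33, 4, 14]

mean_seats = mean(seat_counts)  # may include a fraction of a seat
most_common_seats = mode(seat_counts)  # the most frequently occurring count

print("mean seats:", mean_seats)
print("mode seats:", most_common_seats)
```

For this invented list the mean is 19.5 seats, a vehicle that does not exist, whereas the mode is 14.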

Many people don’t know that there are two types of mean – arithmetic and geometric. The most commonly used is the first kind, where you add all the values and divide by the number of values. To get the geometric mean, the numbers are multiplied and then the “n-th” root of the product is determined.

For example, if there are two values, then find the square root of the product; if there are three numbers, get the cube root of the product; four values, the fourth root and so on. But why would one need to do things in such a complicated way? Well, that is a story for another day.
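For readers who want to try the recipe, here is a minimal sketch in Python. The function name and the sample numbers are my own choices for illustration, and the sketch assumes all the values are positive.

```python
import math

def geometric_mean(values):
    """Multiply the values together, then take the n-th root of the product."""
    n = len(values)
    product = math.prod(values)
    return product ** (1 / n)  # raising to the power 1/n gives the n-th root

# Two values: square root of the product (4 x 9 = 36).
gm_two = geometric_mean([4, 9])

# Three values: cube root of the product (2 x 4 x 8 = 64).
gm_three = geometric_mean([2, 4, 8])

print(gm_two, gm_three)
```

The results are approximately 6 and 4; a computer's cube root may be off by a tiny rounding error. Python 3.8 and later also provide a ready-made `statistics.geometric_mean` function that does the same job.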

 
     