How to determine the correct average value of data
By MUNGAI KIHANYA
The Sunday Nation
Nairobi,
17 December 2006
Reacting to last week’s article, Peter Ngash says, “…
the mean is misleading due to the fact that a few extremely large values
can inflate [it]… The median works better … [It] is the value in the
middle when the data is arranged in ascending order…”
You are entirely correct, Peter. There are several
ways of finding the average of a given set of data, namely, the mean,
the median and the mode. Last week I used the mean to demonstrate that
the opinion of a few people can accurately represent that of the whole
population. I started off by looking at the “average” height of an adult
male.
In that example, I was right to use the mean
(strictly speaking, arithmetic
mean) as the average because the data is unlikely to be widely
scattered – the height of adult males is fairly restricted within a
narrow range. However, if I were looking for the height of a human being,
the story would be different.
The data would vary from very many one-foot babies to
very few basketball-playing giants. In that case, the mean would be a
misleading average and the median would be preferred.
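To see how a few extreme values pull the mean away from the middle of the data, here is a minimal sketch in Python. The heights are invented purely for illustration (they are not measurements), and the variable names are assumptions of this example.

```python
from statistics import mean, median

# Hypothetical heights in centimetres, invented for illustration:
# many small babies and a few very tall adults.
heights_cm = [50, 50, 52, 55, 55, 60, 60, 170, 175, 215]

mean_height = mean(heights_cm)      # sum of the values divided by how many there are
median_height = median(heights_cm)  # middle value once the data is sorted

print("mean:", mean_height)
print("median:", median_height)
```

With these made-up numbers the mean lands well above what most of the group measures, while the median stays close to the typical value.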
The mode is another average that can be used. It
is the most frequently occurring value in a set of data. For the height
of adult males (measured to the nearest centimetre), it is unlikely that
any value would appear more than once in a group of ten men.
Nevertheless, the mode is
the preferred average when dealing with discrete values, that is,
numbers of objects that do not come in fractions. For example, what
is the average number of seats in a public service vehicle in
Kenya?
Now, if you counted the total number of seats in a
group of buses, matatus and taxis and then calculated the mean, you would
almost certainly get an answer with a decimal part, but what is the meaning
of, say, 0.672 of a seat? Nonsense? Maybe!
It would be better to list the numbers and then
determine the most frequently occurring one. The answer would be a
discrete quantity without a decimal part – as required. My guess is 14:
there seem to be more 14-seaters than any other kind.
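The same comparison can be sketched in Python. The seat counts below are invented for illustration only (they are not survey data), and the variable names are assumptions of this example.

```python
from statistics import mean, mode

# Hypothetical seat counts for ten vehicles, invented for illustration:
# mostly 14-seaters, plus a couple of taxis and a few buses.
seats = [14, 14, 14, 14, 4, 4, 25, 33, 51, 14]

mean_seats = mean(seats)  # may include a fraction of a seat
mode_seats = mode(seats)  # the most frequently occurring seat count

print("mean:", mean_seats)
print("mode:", mode_seats)
```

For this made-up sample the mean includes a fraction of a seat, whereas the mode is a whole number that actually occurs in the list.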
Many people don’t know that there is more than one type of
mean; two of them are the arithmetic and the geometric. The most commonly used is the first
kind, where you add all the data and divide by the number of values. To
get the geometric mean, the numbers are multiplied and then the “n-th”
root of the product is determined.
For example, if there are two values, then find the
square root of the product; if there are three numbers, get the cube
root of the product; four values, the fourth root and so on. But why
would one need to do things in such a complicated way? Well, that is a
story for another day.
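As a rough sketch of the recipe just described (multiply the numbers, then take the n-th root), here is one way it could be written in Python. The function name and the sample numbers are assumptions of this example, and the check for positive values is an added safeguard that the recipe above does not mention.

```python
import math

def geometric_mean(values):
    """Multiply the values together, then take the n-th root of the product."""
    # Assumption of this sketch: only non-empty lists of positive numbers are accepted.
    if not values or any(v <= 0 for v in values):
        raise ValueError("needs a non-empty list of positive numbers")
    product = math.prod(values)
    return product ** (1 / len(values))

two_values = geometric_mean([2, 8])       # square root of 2 x 8
three_values = geometric_mean([1, 3, 9])  # cube root of 1 x 3 x 9

print(round(two_values, 6), round(three_values, 6))
```

Python 3.8 and later also ship a ready-made `statistics.geometric_mean`, which could be used instead of this hand-written version.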