Wednesday, 23 January 2013

Statistics| Analysis of Data – Confidence intervals



Most of the time the population mean differs from the mean of each individual sample taken from the population.
Consider the following example, in which the absorbance of a solution containing a known concentration of substance A was determined with a U.V./Visible spectrometer. The absorbance of the solution was measured three times during each experiment, and the average value, the standard deviation s and 2s were calculated (see Table 1).

Table 1: Absorbance values measured for a solution A using a U.V./Visible spectrometer. Three consecutive measurements were recorded during each experiment.

The mean values from Table 1 are plotted in Fig. 1 below, with 2s (2 * standard deviation s) shown for each mean. As you can see, the mean absorbance of A calculated in each experiment (a sample of the population) differs from the mean of the population (red line). Note also that the interval around each mean, drawn in Fig. 1 as a solid line above and below the mean and representing the range of values the mean can take with a certain probability, does not always contain the population mean (experiments 5, 9, 10 and 13).

Fig. 1: Absorbance values obtained by measuring a solution of substance A with known concentration C using a U.V./Visible spectrometer. Each point shown on the graph is the mean of three consecutive measurements of the solution of substance A with concentration C.

From the above discussion the following question arises:

How can we assess how accurately a sample mean estimates the population mean? Within which boundaries does the true value of the population mean lie?

Such boundaries are called confidence limits, and the range between them is the confidence interval. A confidence interval gives the range of values that the population mean can take with a stated degree of confidence (the confidence level), usually 90%, 95% or 99%.

Most of the time we look at 95% confidence intervals, but all of them have a similar interpretation: they are limits constructed such that, a certain percentage of the time (95% in this case), the value of the population mean will fall within them.
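This repeated-sampling interpretation can be checked with a small simulation. The sketch below is only an illustration under assumptions that are not taken from the experiment above: a normal population with a hypothetical mean of 0.5 and standard deviation of 0.02, both known in advance, and samples of three measurements. It builds an interval around each sample mean using the 95% multiplier 1.96 and the standard deviation of the mean (population standard deviation divided by the square root of n), then counts how often the interval contains the population mean.

```python
import random
import statistics


def coverage(pop_mean, pop_sd, n, trials, z=1.96, seed=0):
    """Return the fraction of simulated intervals that contain pop_mean."""
    rng = random.Random(seed)
    se = pop_sd / n ** 0.5  # standard deviation of the mean of n measurements
    hits = 0
    for _ in range(trials):
        # Draw one sample of n measurements from the assumed population.
        sample = [rng.gauss(pop_mean, pop_sd) for _ in range(n)]
        mean = statistics.fmean(sample)
        # Check whether the interval around this sample mean covers pop_mean.
        if mean - z * se <= pop_mean <= mean + z * se:
            hits += 1
    return hits / trials


# Hypothetical values chosen only for illustration.
fraction = coverage(pop_mean=0.5, pop_sd=0.02, n=3, trials=20000)
print(f"{fraction:.3f}")  # should come out close to 0.95
```

A minority of the simulated intervals miss the population mean, which is the same behaviour seen for a few of the experiments in Fig. 1.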


How can we calculate confidence intervals? 

In order to calculate the confidence interval, we need to know the limits within which 95% of means will fall. If we assume a normal distribution with mean = 0 and s = 1, we can use the z-scores -1.96 and +1.96 (remember that 95% of z-scores fall between these two values). Remember also that we can convert values to z-scores using the formula:

z = (x - x̅ )/ s                      (1)
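As a quick check of formula (1) and of the 95% figure, here is a minimal Python sketch. The function name z_score and the numbers passed to it are hypothetical, chosen only for illustration; NormalDist comes from Python's standard statistics module.

```python
from statistics import NormalDist


def z_score(x, mean, s):
    """Convert a value x to a z-score, as in formula (1)."""
    return (x - mean) / s


# Hypothetical numbers: a value of 0.52 from a distribution
# with mean 0.50 and standard deviation 0.01.
z = z_score(0.52, 0.50, 0.01)
print(round(z, 2))  # 2.0

# Fraction of a standard normal distribution (mean 0, s 1)
# that lies between z = -1.96 and z = +1.96.
std_normal = NormalDist(mu=0.0, sigma=1.0)
inside = std_normal.cdf(1.96) - std_normal.cdf(-1.96)
print(f"{inside:.4f}")  # 0.9500
```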

If we know that the upper limit will be z = +1.96 then from (1) we get:

(x - x̅) / s = 1.96, so x = x̅ + 1.96 * s  (this is the upper boundary, or limit)

and

(x - x̅) / s = -1.96, so x = x̅ - 1.96 * s  (this is the lower boundary, or limit)

Therefore, the confidence interval can easily be calculated once the mean and the standard deviation s of the mean are known. The general form of the confidence interval is given below:

x = x̅ ± z_critical * s               (2)

where x is the upper or lower limit that the population mean can take with a certain degree of confidence, x̅ is the mean of the measurements, z_critical is the critical value of z from statistical tables (see Table 2) at the chosen confidence level (usually 95%), and s is the standard deviation of the mean (for n measurements, the standard deviation of the measurements divided by √n).
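Formula (2) can be sketched in a few lines of Python. The function name confidence_limits and the three absorbance readings are hypothetical and are not the values from Table 1. The sketch also assumes that s in formula (2) is the standard deviation of the mean, computed here as the sample standard deviation divided by the square root of the number of measurements.

```python
import math
import statistics


def confidence_limits(mean, s, z_critical=1.96):
    """Return (lower, upper) limits from formula (2): mean ± z_critical * s."""
    margin = z_critical * s
    return mean - margin, mean + margin


# Hypothetical absorbance readings, for illustration only.
readings = [0.50, 0.52, 0.51]
mean = statistics.fmean(readings)

# Assumption: s is the standard deviation of the mean, i.e. the sample
# standard deviation divided by sqrt(n).
s_mean = statistics.stdev(readings) / math.sqrt(len(readings))

lower, upper = confidence_limits(mean, s_mean, z_critical=1.96)
print(f"{lower:.4f} to {upper:.4f}")
```

Passing a different z_critical value from Table 2 (for example 2.58 for 99%) widens or narrows the interval accordingly.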



Confidence Level (%)    z-critical value
99                      2.58
95                      1.96
90                      1.645
50                      0.675

Table 2: Critical values of z at different confidence levels
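The values in Table 2 do not have to be looked up; they can be recomputed from the standard normal distribution. The sketch below uses NormalDist.inv_cdf from Python's standard statistics module. The helper name z_critical is hypothetical, and the sketch assumes a two-sided interval, so the probability outside the interval is split equally between the two tails.

```python
from statistics import NormalDist


def z_critical(confidence_percent):
    """Two-sided z critical value for a confidence level given in percent."""
    # Probability left in each tail outside the interval.
    tail = (1 - confidence_percent / 100) / 2
    # z value below which a fraction (1 - tail) of the distribution lies.
    return NormalDist().inv_cdf(1 - tail)


for level in (99, 95, 90, 50):
    print(level, round(z_critical(level), 3))
```

The computed values agree with Table 2 to the precision shown there; the 99% and 50% entries in the table are rounded from roughly 2.576 and 0.6745.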

