
Application of Statistical Concepts in the Determination of Weight

Variation in Samples

Robles, Jeffrey
Department of Chemical Engineering, College of Engineering
University of the Philippines, Diliman, Quezon City
Date Due: July 4, 2012
Date Submitted: July 4, 2012
______________________________________________________________________________
Keywords: statistics, standard deviation, uncertainty, accuracy, precision


METHODOLOGY

Ten 1-peso coins were used as samples in the experiment. Forceps were used in handling the samples to prevent the accumulation of moisture, which can affect the measured weight. The weight of each coin was measured on an analytical balance and subjected to further calculations. The weights of the first six samples constitute Data Set 1 (DS1), while all ten masses constitute Data Set 2 (DS2).

RESULTS AND DISCUSSION

One of the objectives of this experiment is to observe that errors in an experiment are inevitable; it is therefore important to be aware of them. There are basically three types of errors: determinate (systematic), indeterminate (random), and gross errors. Determinate errors can be attributed to known causes and are often reproducible. They can be methodic, operative, or instrumental in origin. An example of this type is a possible failure of the analytical balance used in the experiment. Gross errors, meanwhile, occur only occasionally and are mostly due to human error. They affect the results so strongly that the process should be repeated to obtain accurate results. Handling the samples with bare hands rather than with forceps is a possible source of gross error in the experiment. Lastly, random errors cannot be ascribed to any definite cause and most often arise from the limitations of measurement. One possible source of random error in the experiment is the fluctuation of temperature in the balance room brought about by people who come and go. Random errors are always present in an experiment and cannot be eliminated.

Another objective of the experiment is to apply statistical concepts in treating random errors. One such concept is the normal distribution, or normal error curve, and its parametric properties. The normal error curve, or Gaussian curve, is a mound-shaped curve representing the normal probability distribution of data around the mean, defined by the equation

y = e^(-(x - μ)² / 2σ²) / (σ√(2π))    (1)

where y is the relative frequency, μ is the population mean, σ is the population standard deviation, and e and π are mathematical constants. This describes the distribution of any population of data, provided that determinate errors are absent. In a normal error curve, the mean occurs at the central point of maximum frequency, resulting in a symmetrical distribution of positive and negative deviations around the maximum. Other properties of a normal distribution include statistical parameters such as the sample mean, range, and standard deviation.
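
For illustration, a minimal Python sketch of equation (1); the mean and standard deviation below are hypothetical values chosen only to mimic the scale of the coin weights:

    import math

    def gaussian(x, mu, sigma):
        # Relative frequency y from equation (1), the normal error curve
        return math.exp(-(x - mu) ** 2 / (2 * sigma ** 2)) / (sigma * math.sqrt(2 * math.pi))

    # Hypothetical population parameters, for illustration only
    mu, sigma = 5.39, 0.054
    for x in (5.28, 5.33, 5.39, 5.44, 5.50):
        print(f"x = {x:.2f} g -> y = {gaussian(x, mu, sigma):.3f}")

The relative frequency peaks at x = μ and falls off symmetrically on either side, as the curve's properties above describe.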

The results tabulated in Tables A of the attached result sheet reveal apparent weights that deviate markedly from the others (outliers). For Data Set 1, the suspected outliers were 5.3090 and 5.4680, while those for Data Set 2 were 5.3090 and 5.9581. These values were then subjected to a test concerning their rejection or retention, called the Q-test. The Q-test is a statistical test that decides whether suspected outliers should be retained or rejected at a certain confidence level, say 95%.

The experimental Q value, Q_exp, given by the equation

Q_exp = |x_q - x_n| / R    (2)

where x_q is the suspected value, x_n is its nearest-neighbour value, and R is the sample range (x_highest - x_lowest), should be less than the tabulated value of Q, Q_tab, in order for the value to be accepted. Otherwise the value is rejected and therefore should not be included in further calculations.
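
As a numerical check, a minimal Python sketch of this decision rule, using the suspect value, nearest neighbour, range, and Q_tab reported for Data Set 2:

    def q_test(suspect, nearest, data_range, q_tab):
        # Compute Q_exp from equation (2) and compare it against the tabulated Q
        q_exp = abs(suspect - nearest) / data_range
        return q_exp, q_exp > q_tab

    # Data Set 2: suspect 5.9581, nearest neighbour 5.5129,
    # range 5.9581 - 5.3090, Q_tab = 0.466 at 95% confidence
    q_exp, rejected = q_test(5.9581, 5.5129, 5.9581 - 5.3090, 0.466)
    print(f"Q_exp = {q_exp:.4f}, rejected: {rejected}")   # Q_exp = 0.6859, rejected: True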

The only rejected result among the ten is the last entry in Data Set 2. Its Q_exp, 0.6859, is greater than the Q_tab at the 95% confidence level, 0.466. Rejection by the Q-test indicates that the suspect value was likely subject to some error. One possible source of error can be traced not to the experimenter but to the sample itself. All other samples were minted in 2004 or later, while the rejected one was minted in 2001. According to the Bangko Sentral ng Pilipinas (BSP), 1-peso coins minted in the late 1990s and early 2000s (6.1 g) are heavier than the present coins (5.35 g).

The sample standard deviation s, given by the equation

s = √[ Σᵢ₌₁ⁿ (xᵢ - x̄)² / (n - 1) ]    (3)

where x̄ is the sample mean (Σᵢ₌₁ⁿ xᵢ / n) and n is the number of samples, expresses the degree of dispersion of values from the mean. The term n - 1 is the number of degrees of freedom, i.e., the number of individual results that are free to vary while x̄ is held constant. With n - 1 in the equation instead of n, s becomes an unbiased estimator of the population standard deviation σ. The standard deviation thus provides a measure of the dispersion or spread of results and hence gauges the precision of the analysis: the higher the s, the more widely the results are spread, while the lower the s, the more tightly the results cluster around the mean. The sample standard deviation, as a measure of precision, is related to the sample range R: as R gets narrower, s becomes smaller. The R and s values of Data Set 1 are 0.1590 and 0.05403, respectively, while those of Data Set 2 are 0.2039 and 0.05991, respectively. With a lower s and a narrower R, DS1 is more precise than DS2, since the values in DS1 lie closer to the sample mean x̄. The standard deviation can also be expressed as the ratio of s to x̄, known as the Relative Standard Deviation (RSD), usually given in parts per thousand (ppt).
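
These statistics can be reproduced with a short Python sketch; the six DS1 weights below are taken from the sample calculations in the appendix:

    import statistics

    ds1 = [5.3708, 5.4263, 5.3090, 5.4680, 5.3794, 5.3778]   # DS1 weights in g

    mean = statistics.mean(ds1)        # sample mean, x-bar
    s = statistics.stdev(ds1)          # sample standard deviation, n - 1 denominator
    r = max(ds1) - min(ds1)            # sample range R
    rsd = s / mean * 1000              # relative standard deviation in ppt

    print(f"x-bar = {mean:.4f} g, s = {s:.4f}, R = {r:.4f}, RSD = {rsd:.1f} ppt")
    # approximately: x-bar = 5.3886 g, s = 0.0540, R = 0.1590, RSD = 10.0 ppt
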
The confidence interval (CI), given by

CI = x̄ ± ts/√n    (4)

where t (Student's t) is the tabulated value for n - 1 measurements at a certain level of probability, say 95%, is the interval within which the population mean μ is expected to lie. At 95% probability, it can be inferred that if the experiment were repeated 100 times, thus yielding 100 confidence intervals, 95 of them would contain μ. The narrower the interval and the higher the level of probability, the more accurately x̄ estimates μ. In the experiment, the confidence intervals were 5.3319 to 5.4453 (for DS1) and 5.3588 to 5.4510 (for DS2). With such narrow intervals, it can be concluded that the results are of high accuracy. Furthermore, the small differences between the corresponding confidence limits of the two sets (0.0269 and 0.0057) indicate that the results are precise.
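
A short sketch of equation (4) for Data Set 1, using the t value that appears in the appendix calculations:

    import math

    def confidence_interval(mean, s, n, t):
        # Equation (4): x-bar +/- t*s/sqrt(n)
        half = t * s / math.sqrt(n)
        return mean - half, mean + half

    # DS1: n = 6, so 5 degrees of freedom; t = 2.57 at 95% probability
    low, high = confidence_interval(5.3886, 0.05403, 6, 2.57)
    print(f"95% CI: {low:.4f} to {high:.4f} g")   # about 5.3319 to 5.4453 g
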
From (4), the importance of a reliable s is evident: the less reliable s is, the less precise the confidence interval and, consequently, the analysis. It is thus important to obtain a precise value of s. In the experiment, one way of producing a more reliable standard deviation is to increase the number of coins weighed, provided that no further errors are committed. Another is to pool the standard deviations of the two data sets, assuming that the two sets have the same source(s) of error(s). By pooling s, a better estimate of the true standard deviation σ is obtained. In general,
s_pooled = √[ ( Σᵢ₌₁^n₁ (xᵢ - x̄₁)² + Σᵢ₌₁^n₂ (xᵢ - x̄₂)² ) / (n₁ + n₂ - n_s) ]    (5)
where n_s is the number of data sets pooled, which equals the number of degrees of freedom lost. Using (5), the s_pooled of the experiment is 0.05772. This is taken as the standard deviation of the analysis.
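
Equation (5) can also be evaluated from each set's s and n alone, since Σ(xᵢ - x̄)² = (n - 1)s². A minimal Python sketch, assuming n = 9 for Data Set 2 after the rejection:

    import math

    def pooled_std(groups):
        # Equation (5): (n - 1) * s**2 recovers each set's sum of squared
        # deviations; len(groups) is n_s, the degrees of freedom lost
        num = sum((n - 1) * s ** 2 for s, n in groups)
        dof = sum(n for _, n in groups) - len(groups)
        return math.sqrt(num / dof)

    # DS1: s = 0.05403, n = 6; DS2 after the rejection: s = 0.05991, n = 9
    print(f"s_pooled = {pooled_std([(0.05403, 6), (0.05991, 9)]):.5f}")   # ~0.05772
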
The sample mean x̄ is the average value of a finite number of samples. Assuming that only a few errors were present, x̄ is taken to be equal to the true weight of a 1-peso coin. The sample means of Data Sets 1 and 2 are 5.3886 ± 0.0002 and 5.4049 ± 0.0003, respectively. Using the t-test (6) and F-test (7) given below, these means are compared to determine whether or not they are significantly different.
t = ( |x̄₁ - x̄₂| / s_pooled ) √( n₁n₂ / (n₁ + n₂) )    (6)

F = s₁² / s₂²,  where s₁ > s₂    (7)
The t-test and F-test are applied in much the same way as the Q-test, only with different tabulated values. If the s values of DS1 and DS2 were equal, there would be no need for the F-test. If the s values are found by the F-test to be significantly different, the t-test cannot be applied. Since the s values of the two sets are different, the F-test is applied first.
Table 1. t-test and F-test results at 95% probability

      Tabulated value   Experimental value   Remark      Conclusion
  F   4.77              1.23                 exp < tab   Not significantly different
  t   1.77              0.572                exp < tab   Not significantly different

From Table 1, it is concluded that the two means are not significantly different. Thus, the null hypothesis that the means are identical is accepted, and the obtained results are reproducible and precise, except of course for the rejected value. The mean of the analysis, either 5.3886 g or 5.4049 g (since they are not significantly different), deviates by only 0.72% and 1.03%, respectively, from the true mass of a 1-peso coin, 5.35 g. This shows the accuracy of the analysis.
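
The experimental values in Table 1 can be approximated with a short sketch of equations (6) and (7); the sample sizes n₁ = 6 and n₂ = 9 (after the rejection) are assumed here, with s_pooled taken from equation (5):

    import math

    def f_stat(s1, s2):
        # Equation (7): the larger s is placed in the numerator
        hi, lo = max(s1, s2), min(s1, s2)
        return hi ** 2 / lo ** 2

    def t_stat(x1, x2, s_pooled, n1, n2):
        # Equation (6)
        return abs(x1 - x2) / s_pooled * math.sqrt(n1 * n2 / (n1 + n2))

    print(f"F = {f_stat(0.05403, 0.05991):.2f}")                # ~1.23, as in Table 1
    print(f"t = {t_stat(5.3886, 5.4049, 0.05772, 6, 9):.3f}")   # near the reported 0.572;
    # small discrepancies reflect rounding of the input values
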
CONCLUSION
Statistical tools are indeed efficient in analyzing and interpreting results. Parameters such as the mean, standard deviation, and confidence interval help determine the degree of precision, reproducibility, and accuracy of the analysis, as well as treat random errors, the errors that remain in the process no matter what. With such parameters, the results of this experiment were analyzed and found to be precise and accurate.

Generally, the precision of the results of an experiment increases as the number of samples examined increases. However, in this experiment the precision of Data Set 1 is higher than that of Data Set 2 despite its fewer samples, perhaps because of the significant variation in the samples themselves. It is therefore recommended to use more samples in the analysis, provided that they are of the same quality (dirt-free, same year of production, etc.) before weighing.
REFERENCES

Skoog, D., et al. Fundamentals of Analytical Chemistry, 8th ed.; Raffles Corporate Center: Pasig City, 2010; pp 91-146.

Day, R. A., Jr.; Underwood, A. L. Quantitative Analysis, 6th ed.; Prentice-Hall: New Jersey, 1991; pp 7-25.

Mendenhall, W.; Beaver, R.; Beaver, B. Introduction to Probability and Statistics, 11th ed.; Brooks/Cole: CA, 2003; pp 206-208, 363-366, 401-407, Appendix Tables 1 & 6.

Bangko Sentral ng Pilipinas, http://www.bsp.gov.ph/ (accessed July 3, 2012).


APPENDIX

SAMPLE CALCULATIONS:

For Data Set 1
x̄ = Σᵢ₌₁ⁿ xᵢ / n

  = [ (5.3708 ± 0.0001) + (5.4263 ± 0.0001) + (5.3090 ± 0.0001)
      + (5.4680 ± 0.0001) + (5.3794 ± 0.0001) + (5.3778 ± 0.0001) ] / 6

  = ( 32.3313 ± √(6 (0.0001)²) ) / 6

  = 5.3886 ± 5.3886 √[ ( √(6 (0.0001)²) / 32.3313 )² ]

  = 5.3886 ± 0.0002
s = √[ Σᵢ₌₁ⁿ (xᵢ - x̄)² / (n - 1) ]

  = √[ ( (5.3708 - 5.3886)² + (5.4263 - 5.3886)² + (5.3090 - 5.3886)²
        + (5.4680 - 5.3886)² + (5.3794 - 5.3886)² + (5.3778 - 5.3886)² ) / (6 - 1) ]

  = 0.05403

RSD = (s / x̄) × 1000 ppt = (0.05403 / 5.3886) × 1000 ppt = 10.03 ppt

R = x_highest - x_lowest
  = (5.4680 ± 0.0001) - (5.3090 ± 0.0001)
  = 0.1590 ± √(2 (0.0001)²)
  = 0.1590 ± 0.0001

RR = (R / x̄) × 1000 ppt
   = ( (0.1590 ± 0.0001) / (5.3886 ± 0.0002) ) × 1000 ppt
   = 29.51 ± 29.51 √[ (0.0001 / 0.1590)² + (0.0002 / 5.3886)² ]
   = 29.51 ± 0.02

CI = x̄ ± ts/√n
   = 5.3886 ± (2.57)(0.05403)/√6
   = 5.3886 ± 0.0567
Q-Test
For Data Set 2

Suspect value: 5.9581
Q_tab: 0.466

Q_exp = |x_q - x_n| / R
      = |5.9581 - 5.5129| / (5.9581 - 5.3090)
      = 0.6859

Since Q_exp > Q_tab, the suspect value is rejected.
