Statistical Treatment of Data

• Statistical treatment of data is essential in
order to make use of the data in the right
form. Raw data collection is only one
aspect of any experiment; the
organization of data is equally important
so that appropriate conclusions can be
drawn. This is what statistical treatment
of data is all about.
• There are many techniques involved in statistics
that treat data in the required manner.
Statistical treatment of data is essential in all
experiments, whether social, scientific or any
other form. Statistical treatment of data greatly
depends on the kind of experiment and the
desired result from the experiment.
• For example, in a survey regarding the
election of a Mayor, parameters like age,
gender, occupation, etc. would be
important in influencing the person's
decision to vote for a particular
candidate. Therefore the data needs to be
treated in these reference frames.
• An important aspect of statistical treatment of data
is the handling of errors. All experiments invariably
produce errors and noise.
Both systematic and random errors need to be taken
into consideration.
• Depending on the type of experiment being
performed, Type I and Type II errors also need to
be handled. A Type I error is a false positive and a
Type II error is a false negative; both need to be
understood and kept as small as practical in order
to make sense of the result of the experiment.
Treatment of Data and Distribution
• Trying to classify data into commonly known patterns is
a tremendous help and is intricately related to statistical
treatment of data. This is because distributions such as
the normal probability distribution occur so commonly
in nature that they are the underlying distributions in
many medical, social and physical experiments.
• Therefore, if a given sample is known to be normally
distributed, the statistical treatment of data is made
easier for the researcher, who already has a large body of
supporting theory to draw on. Care should always be
taken, however, not to assume that all data are normally
distributed; normality should always be confirmed with
appropriate testing.
• Statistical treatment of data also involves describing
the data. The usual way to do this is through
measures of central tendency such as the mean, median
and mode. These help the researcher explain in short
where the data are concentrated. Range, uncertainty and
standard deviation describe the spread of the
data. For example, two distributions with the same
mean can have wildly different standard deviations,
which show how tightly the data points are
concentrated around the mean.
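As a minimal sketch of the point above, the following Python snippet compares two made-up samples that share the same mean but differ in spread. The sample values and the helper name `describe` are illustrative assumptions, not data from this lecture.

```python
import statistics

# Two hypothetical samples with the same mean but different spread.
tight = [9.8, 9.9, 10.0, 10.1, 10.2]
wide = [6.0, 8.0, 10.0, 12.0, 14.0]

def describe(sample):
    """Return the mean, median, range, and sample standard deviation."""
    return {
        "mean": statistics.mean(sample),
        "median": statistics.median(sample),
        "range": max(sample) - min(sample),
        "stdev": statistics.stdev(sample),  # N - 1 in the denominator
    }

tight_summary = describe(tight)
wide_summary = describe(wide)

print(tight_summary)
print(wide_summary)
```

Both summaries report a mean close to 10, yet the standard deviation of `wide` is far larger, which is exactly the information the mean alone hides.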
• Statistical treatment of data is an important
aspect of all experimentation today and a
thorough understanding is necessary to conduct
the right experiments with the right inferences
from the data obtained.
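• Significant figures: the number of digits
known with certainty plus the first digit in doubt.
• Rounding off: a result should carry no more
precision than the least precise input.
  • Addition and subtraction (keep the decimal places of the least precise term): 13.4 + 1478.224 = 1491.624 ≈ 1491.6
  • Multiplication and division (keep the significant figures of the least precise factor): 31 × 350.1 = 10,853.1 ≈ 11,000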
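The two rounding rules can be sketched in Python as follows. The helper `round_sig` is a hypothetical name introduced here for illustration; it is one simple way to round to a number of significant figures, not a standard library function.

```python
import math

def round_sig(value, sig_figs):
    """Round value to the given number of significant figures."""
    if value == 0:
        return 0.0
    # Power of ten of the leading digit, e.g. 4 for 10853.1.
    exponent = math.floor(math.log10(abs(value)))
    decimals = sig_figs - 1 - exponent
    return round(value, decimals)

# Addition: 13.4 has one decimal place, so the sum keeps one decimal place.
total = round(13.4 + 1478.224, 1)

# Multiplication: 31 has two significant figures, so the product keeps two.
product = round_sig(31 * 350.1, 2)

print(total, product)
```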
• Kinds of errors:
  • Systematic: caused by the instrument or the measuring
  technique.
  • Random: caused by the judgement of the observer or by
  fluctuations in conditions (temperature, voltage,
  pressure, etc.).
• Absolute error: $E = x_i - x_{true}$
• Relative error: $E_r = \dfrac{x_i - x_{true}}{x_{true}} \times 100\%$
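A short Python sketch of the two error definitions, using a made-up reading of 9.7 against an assumed true value of 10.0 (both numbers are illustrative only):

```python
def absolute_error(measured, true_value):
    """Absolute error: measured value minus true value."""
    return measured - true_value

def relative_error_percent(measured, true_value):
    """Relative error as a percentage of the true value."""
    return (measured - true_value) / true_value * 100.0

# Hypothetical measurement and accepted value.
abs_err = absolute_error(9.7, 10.0)
rel_err = relative_error_percent(9.7, 10.0)

print(abs_err, rel_err)
```

The sign is kept on purpose: a negative error means the reading is below the true value.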
• Mean, arithmetic mean, and average
are synonyms:
$$\bar{x} = \frac{\sum_{i=1}^{N} x_i}{N}$$
• Median: the middle result when
replicate data are arranged in order of
size.
• Accuracy: indicates the closeness of the
measurement to its true value or
accepted value. It is expressed by the
error.
• Precision: describes the reproducibility
of measurements. That is: the closeness
of results that have been obtained in
exactly the same way.
[Figure: four panels titled "Low accuracy, low precision", "High accuracy, low precision", "Low accuracy, high precision", and "High accuracy, high precision".]

Distribution of Experimental Data
• Precision: describes the reproducibility of
the measurements.
• It can be represented by:
  • The deviation from the mean: $d_i = x_i - \bar{x}$
  • The average deviation: $\bar{d} = \dfrac{\sum_{i=1}^{N} |x_i - \bar{x}|}{N}$
  • The standard deviation: $s = \sqrt{\dfrac{\sum_{i=1}^{N} (x_i - \bar{x})^2}{N - 1}}$
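The three measures of precision can be computed directly from their definitions. The sketch below uses a made-up set of five readings; the function name `deviation_measures` is an assumption for illustration.

```python
import math

def deviation_measures(sample):
    """Return (mean, average deviation, sample standard deviation)."""
    n = len(sample)
    mean = sum(sample) / n
    deviations = [x - mean for x in sample]            # d_i = x_i - mean
    average_deviation = sum(abs(d) for d in deviations) / n
    std_dev = math.sqrt(sum(d * d for d in deviations) / (n - 1))
    return mean, average_deviation, std_dev

# Hypothetical replicate readings.
readings = [10.1, 9.9, 10.3, 9.7, 10.0]
mean, avg_dev, std_dev = deviation_measures(readings)

print(mean, avg_dev, std_dev)
```

The absolute value in the average deviation matters: the signed deviations from the mean always sum to zero.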
Standard Error of the Mean
$$s_m = \frac{s}{\sqrt{N}}$$
Provides limits within which there is a
certain probability of finding the true
value.
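As a sketch, the standard error of the mean is the sample standard deviation divided by the square root of the number of readings. The readings below are made up for illustration.

```python
import math
import statistics

def standard_error(sample):
    """Standard error of the mean: s / sqrt(N)."""
    return statistics.stdev(sample) / math.sqrt(len(sample))

# Hypothetical replicate readings.
readings = [10.1, 9.9, 10.3, 9.7, 10.0]
s_m = standard_error(readings)

print(s_m)
```

Because of the square root, quadrupling the number of readings only halves the standard error.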
The t-Distribution
• For a small sample size (N < 30) a correction
is needed.
• The Student t-distribution is similar to the
normal (Gaussian) distribution but is
spread out more.
• Bias in the data collection, or systematic
error, will not be detected by this type of
statistical analysis.
• This type of statistical analysis concerns
itself only with the precision of values,
NOT the accuracy.
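One common use of the small-sample correction is a confidence interval for the mean. The sketch below assumes a 95% two-sided interval and hard-codes a few critical values of Student's t taken from standard tables; the table `T_95`, the function name, and the readings are illustrative assumptions.

```python
import math
import statistics

# Two-sided 95% critical values of Student's t, keyed by degrees of freedom.
# Only a few entries from standard tables are included in this sketch.
T_95 = {2: 4.303, 3: 3.182, 4: 2.776, 5: 2.571, 9: 2.262}

def confidence_interval_95(sample):
    """Return (low, high) limits of the 95% confidence interval of the mean."""
    n = len(sample)
    mean = statistics.mean(sample)
    s_m = statistics.stdev(sample) / math.sqrt(n)  # standard error of the mean
    t = T_95[n - 1]  # raises KeyError if the table lacks this sample size
    return mean - t * s_m, mean + t * s_m

# Hypothetical replicate readings.
readings = [10.1, 9.9, 10.3, 9.7, 10.0]
low, high = confidence_interval_95(readings)

print(low, high)
```

With five readings the multiplier is 2.776 rather than the 1.96 a normal distribution would give, which reflects the wider spread of the t-distribution. As the slide notes, the interval says nothing about systematic error.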
Propagation of Error
Uncertainty Analysis
$p = 100\ \text{kPa} \pm 1\ \text{kPa}$

UNCERTAINTY
When the plus or minus notation is used to
designate the uncertainty, the person making
this designation is stating the degree of
accuracy with which he or she believes the
measurement was made.
1. Suppose a set of measurements is made and
the uncertainty in each measurement is
assessed with the same probability.
2. These measurements are used to calculate
some desired result of the experiments.
What is the uncertainty in the calculated result?

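$$R = R(x_1, x_2, \ldots, x_n)$$
$w_R$ is the uncertainty in the result.
$w_1, w_2, \ldots, w_n$ are the uncertainties in the
independent variables.
$$w_R = \left[ \left( \frac{\partial R}{\partial x_1} w_1 \right)^2 + \left( \frac{\partial R}{\partial x_2} w_2 \right)^2 + \cdots + \left( \frac{\partial R}{\partial x_n} w_n \right)^2 \right]^{1/2}$$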
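The root-sum-square formula can be sketched numerically. The example below is hypothetical: it assumes electrical power computed as voltage times current, with made-up values of 100 V ± 2 V and 10 A ± 0.2 A, and it estimates the partial derivatives by central differences rather than analytically.

```python
import math

def propagate_uncertainty(func, values, uncertainties, rel_step=1e-6):
    """Root-sum-square uncertainty of func(*values).

    Partial derivatives are estimated with central differences.
    """
    total = 0.0
    for i, (x, w) in enumerate(zip(values, uncertainties)):
        h = rel_step * (abs(x) if x != 0 else 1.0)
        up = list(values)
        down = list(values)
        up[i] = x + h
        down[i] = x - h
        partial = (func(*up) - func(*down)) / (2 * h)  # dR/dx_i
        total += (partial * w) ** 2
    return math.sqrt(total)

def power(voltage, current):
    """Hypothetical result function: P = E * I."""
    return voltage * current

w_power = propagate_uncertainty(power, [100.0, 10.0], [2.0, 0.2])

print(w_power)  # about 28.3 W on a nominal 1000 W
```

Here each variable contributes a term of 20 W, so the combined uncertainty is $\sqrt{20^2 + 20^2} \approx 28.3$ W, not the 40 W a simple sum would suggest.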
Correlations and Correlation Coefficient
Given data: $(x_i, y_i)$
Linear regression using least squares analysis
attempts to minimize the sum of the squared vertical
distances from the points to the straight line $y = mx + b$,
that is:
$$\min S = \sum_{i=1}^{N} (y_i - m x_i - b)^2$$
In particular,
$$m = \frac{N \sum x_i y_i - \sum x_i \sum y_i}{N \sum x_i^2 - \left( \sum x_i \right)^2}$$
But we might equally well have written instead
$$x = m' y + b'$$
Then,
$$m' = \frac{N \sum x_i y_i - \sum x_i \sum y_i}{N \sum y_i^2 - \left( \sum y_i \right)^2}$$
Correlation coefficient: $r = \sqrt{m\,m'}$, taken with the sign of $m$
$r = \pm 1$: perfect correlation
$r = 0$: no correlation
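The slope formulas and the correlation coefficient can be sketched in Python as follows. The data points are made up, the function name is an assumption, and the intercept uses the standard least-squares result $b = (\sum y_i - m \sum x_i)/N$, which the slides do not state.

```python
import math

def fit_and_correlate(xs, ys):
    """Return (m, b, r) for the least-squares line y = m*x + b."""
    n = len(xs)
    sx, sy = sum(xs), sum(ys)
    sxy = sum(x * y for x, y in zip(xs, ys))
    sxx = sum(x * x for x in xs)
    syy = sum(y * y for y in ys)

    numerator = n * sxy - sx * sy
    m = numerator / (n * sxx - sx ** 2)        # slope of y on x
    b = (sy - m * sx) / n                      # intercept (standard result)
    m_prime = numerator / (n * syy - sy ** 2)  # slope of x on y

    # r is the geometric mean of the two slopes, signed like the slope.
    r = math.copysign(math.sqrt(m * m_prime), numerator)
    return m, b, r

# Hypothetical data lying close to a straight line.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.1, 3.9, 6.2, 7.8, 10.1]
m, b, r = fit_and_correlate(xs, ys)

print(m, b, r)
```

For data that fall exactly on a line the two slopes are reciprocals of each other, so their product is 1 and $r = \pm 1$.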
