
Stat 305, Fall 2014

Name

Chapter 1: Introduction to Engineering Statistics


What is statistics?
Statistics is the scientific application of mathematical principles to the collection,
analysis, and presentation of data . . . at the foundation of all of statistics is data.
Engineers and scientists are constantly exposed to data that they are expected to
make sense of and statistics is a tool that, if used properly, allows us to gain knowledge
about how physical systems work.
Statistics vs. Mathematics:
In mathematics there is usually a set equation and solution for problems.
Statistics involves much more uncertainty. Although we still base our work on
equations, when dealing with small sets of data we are never 100% certain of the
answer.
Engineering statistics is the study of how best to . . .

What is Data?
Collection of facts
e.g. measurements, traits, outcomes
Start by collecting data (Ch. 2)

Basic Terminology
Observational Study:
Investigator's role is
A process or phenomenon is watched and data are recorded, but there is no intervention on the part of the person conducting the study
Examples:
survey studies, economic studies, many social science studies, sports statistics
A researcher keeps track of how many cars drive on a certain stretch of road over
a one-hour period to study why so many accidents occur there

Experimental Study (a.k.a. an experiment):


Investigator's role is
Process variables are manipulated by the investigator, and the study environment is
regulated
Examples:
Chemistry and physics experiments
A researcher tests the fracture strength of bricks by subjecting them to different
temperatures and measuring the fracture point
Experimental studies are more common with engineering data
Easier and safer to infer causality from an experiment
In experimental studies the researcher can control other variables that may affect the
outcome but are not of interest (often called lurking variables)
Example: A researcher realizes that he occasionally gets bad results in an electroplating
operation. He knows that certain factors (temperature, voltage, current, raw material lot, etc.)
vary in the experiment and could be affecting the results, and he wants to
know if he can determine what combination of these factors, if any, causes the problem.
Approach 1: Wait for a bad result and then try to see what combinations of factors were
involved and look for patterns.
Approach 2: Systematically vary patterns of combinations of factors and see what happens.

Population
The entire group of objects about which one wishes to gather information in a statistical study.
e.g.
Sample
Group of objects of which one actually gathers data.
e.g.
Example: Of interest is the overall satisfaction of ISU students with the bus system. It
may be costly or impossible to survey all 33,241 students at ISU, so instead, a group of 100
students is randomly chosen to participate in the study.
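
The sampling step in this example can be sketched in Python. This is only a minimal illustration: it assumes students are identified by hypothetical ID numbers 1 through 33,241, and the seed is an arbitrary choice that makes the run repeatable.

```python
import random

POPULATION_SIZE = 33_241   # all ISU students (from the example above)
SAMPLE_SIZE = 100          # students actually surveyed

# Hypothetical student IDs standing in for the real enrollment list.
population = range(1, POPULATION_SIZE + 1)

rng = random.Random(305)   # fixed seed, an arbitrary choice for repeatability
sample = rng.sample(population, SAMPLE_SIZE)  # simple random sample, no repeats

print(len(sample))                              # sample size
print(f"{SAMPLE_SIZE / POPULATION_SIZE:.2%}")   # fraction of the population surveyed
```

Every student has the same chance of being chosen, and no student is chosen twice.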

Sample Size
Number of objects, people, etc. in the sample.
In a perfect world we would always have access to the entire population of data, but that
is almost never the case.
Census: a study using the entire population
There is always uncertainty involved in statistics. We make guesses about the entire
population based on only a sample.
The larger the sample, the better the guess. (Usually, due to constraints such as money,
time, etc., we cannot have as large a sample as we would like.)
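
The idea that a larger sample gives a better guess can be checked with a small simulation. The sketch below assumes a hypothetical population of 33,241 satisfaction scores (made-up values on a 1 to 10 scale) and looks at how much the sample mean changes from one random sample to the next at several sample sizes.

```python
import random
import statistics

rng = random.Random(2014)  # fixed seed so the run is repeatable

# Hypothetical population: satisfaction scores on a 1-10 scale (made-up data).
population = [rng.randint(1, 10) for _ in range(33_241)]

def spread_of_sample_means(n, repeats=200):
    """Standard deviation of the sample mean across repeated samples of size n."""
    means = [statistics.mean(rng.sample(population, n)) for _ in range(repeats)]
    return statistics.stdev(means)

spreads = {n: spread_of_sample_means(n) for n in (10, 100, 1000)}
for n, s in spreads.items():
    print(f"n = {n:4d}: spread of sample means = {s:.3f}")
```

A smaller spread means that guesses based on different samples agree more closely with each other, which is what "better guess" means here.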
Enumerative Study
A study (experiment) for which there is a particular, well-defined, finite group of
objects under study.
Data are collected on some or all of these objects, and conclusions are intended to
apply only to these objects.
e.g. Gas mileage of all 2015 Ford Taurus automobiles; Strength of 200 2 x 4 boards
to be used to build a specific house.
Analytical Study
A study (experiment) in which a process or phenomenon is investigated at one point in
space and time with the hope that the data collected will be representative of system
behavior at other places and times under similar conditions.
There is rarely, if ever, a particular well-defined group of objects to which conclusions
are thought to be limited.
Most engineering studies are of this type.
e.g. Gas mileage of all Ford mid-size vehicles; Smoothness of all 2 x 4 boards cut
by the primary supplier of Lowe's.
Categorical Data
Non-numerical characteristics associated with items in a sample.
Must be aggregated and counted to produce numerical values.
e.g. Eye color (blue, brown, green, etc); Engine status (working, not working &
fixable, not working & not fixable)
Can't average eye color.
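
As a rough sketch of "aggregated and counted", the Python below tallies a hypothetical list of eye colors (invented for illustration). The counts and proportions are the numerical values that categorical data can produce.

```python
from collections import Counter

# Hypothetical eye colors recorded for eight sampled people.
eye_colors = ["blue", "brown", "brown", "green", "blue", "brown", "hazel", "brown"]

counts = Counter(eye_colors)                 # aggregate: category -> count
total = len(eye_colors)
proportions = {color: n / total for color, n in counts.items()}

print(counts.most_common())   # categories ordered by frequency
print(proportions["brown"])   # share of the sample with brown eyes
```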
Quantitative Data (numerical)
Numerical characteristics associated with items in a sample.
Typically counts of occurrences of a phenomenon of interest or measurements of some
physical property.
Can be further broken down into discrete (countable) and continuous (uncountable)
Discrete values can be enumerated in a set such as {. . . , -1, 0, 1, . . . }
Continuous values must be described by an interval such as (-1, 1), [-1, 1), or [-1, 1]
Examples:
1. # of heads in 10 flips of a coin.

2. Distance a car travels until needing service

3. Diameter of a bolt machined by employee A

4. Total number of bolts machined by employee A that did not meet tolerance specifications

Univariate Data
Arise when only a single characteristic of each sampled item is observed.
e.g. measure height of students in Stat 305.
Multivariate Data
Arise when observations are made on more than one characteristic of each sampled
item.
e.g. measure height, weight, and observe eye color of students in Stat 305.
When 2 characteristics are measured we call it Bivariate Data.
e.g. measure height and weight of students in Stat 305.
Paired
Bivariate data where both variables are attempting to quantify the same thing
e.g. Before and After studies: Metal specimen hardness before and after treating;
Pharmaceutical study on a new drug (pain level with/without drug)
Measurements of the same quantity made with different instruments/systems
e.g. Measure the weight of students in Stat 305 using 2 different scales
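
The four kinds of data above differ mainly in how many characteristics are recorded per item. The sketch below shows one possible way to lay them out in Python. All values and field names are hypothetical, and the per-student difference at the end is just one common way to look at paired data, not something these notes prescribe.

```python
# Hypothetical records for three students (all values invented for illustration).
students = [
    {"height_cm": 170, "weight_kg": 65, "eye_color": "brown"},
    {"height_cm": 182, "weight_kg": 80, "eye_color": "blue"},
    {"height_cm": 165, "weight_kg": 58, "eye_color": "green"},
]

# Univariate: one characteristic per student.
heights = [s["height_cm"] for s in students]

# Bivariate: two characteristics per student.
height_weight = [(s["height_cm"], s["weight_kg"]) for s in students]

# Multivariate: more than one characteristic per student (here, three).
multivariate = [(s["height_cm"], s["weight_kg"], s["eye_color"]) for s in students]

# Paired: the same quantity (weight) measured on two hypothetical scales.
scale_a = [65.2, 80.1, 57.9]
scale_b = [64.8, 80.4, 58.3]
paired = list(zip(scale_a, scale_b))
differences = [round(a - b, 1) for a, b in paired]  # per-student disagreement

print(heights, height_weight, multivariate, paired, differences, sep="\n")
```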

Types of Data Structures


Statistical engineering studies are often conducted to compare process performance at different sets of conditions (new vs. standard). Several samples are involved to include several
combinations of conditions. For organizational purposes, standard notions of structure have
been adopted. (Complete factorial study, Fractional factorial study, Hierarchical studies)
Vocab:
Factor: process variables
Level: settings of each variable; levels of the factor
Brick Example:
Factors: Temperature, Humidity
Levels: Temperature - high, low; Humidity - high, low
Complete Factorial Study
All combinations of all levels of all factors of interest are represented in a data set.
Example: In an experiment to investigate the compressive strength properties of
cement-soil mixtures, two different aging periods were used in combination with two
different aging temps and two different soils.

If A has a levels, B has b levels, etc., then we talk about a full a x b x ... factorial
Example: A with 2 levels, B with 3 levels; 2x3 = 6 combinations
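
A complete factorial is every combination of levels, so it can be listed mechanically. The sketch below uses Python's itertools.product. The level names are placeholders, and the labels for the cement-soil factors are hypothetical because the notes do not give the actual settings.

```python
from itertools import product

# Factor A with a = 2 levels and factor B with b = 3 levels (placeholder names).
levels_a = ["A1", "A2"]
levels_b = ["B1", "B2", "B3"]

full_2x3 = list(product(levels_a, levels_b))
for run in full_2x3:
    print(run)
print(len(full_2x3))  # 2 x 3 = 6 combinations

# Cement-soil example: three factors at two levels each (labels are hypothetical).
cement_soil = list(product(["short aging", "long aging"],
                           ["low temp", "high temp"],
                           ["soil 1", "soil 2"]))
print(len(cement_soil))  # 2 x 2 x 2 = 8 combinations
```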

Fractional Factorial Study


Motivation: a x b x c x ... grows very fast and sometimes one can't afford a full factorial
data set
Example: 4 factors each with 2 levels (16 combinations)
[Table: 8 runs, each listing the + or - level of factors A, B, C, and D]

This is a clever half of those 16 combinations.


(You don't need to know how to create a fractional factorial, but you should be able
to identify them.)
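
For recognizing a half fraction, it may help to see one generated. The sketch below lets A, B, and C run through all 8 combinations and sets D from their product. The rule D = A*B*C is an assumption here: it is a common textbook choice, not necessarily the one behind the table in these notes. +1 and -1 stand for the + and - levels.

```python
from itertools import product

# Full 2^4 factorial: every combination of +1 / -1 for factors A, B, C, D.
full = list(product([+1, -1], repeat=4))

# Half fraction under the ASSUMED rule D = A*B*C (not taken from the notes).
half = [(a, b, c, a * b * c) for a, b, c in product([+1, -1], repeat=3)]

def sign(x):
    """Show a +1 / -1 level as the usual + / - symbol."""
    return "+" if x > 0 else "-"

print(" A B C D")
for run in half:
    print(" " + " ".join(sign(x) for x in run))

print(len(full), len(half))  # 16 combinations in full, 8 in the half
```

Every run in the half also appears in the full factorial, and each factor still appears at + in four runs and at - in four runs.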
Hierarchical Studies
Situations where samples aggregate into groups; groups into groups of groups; etc.
Example: Production line producing a machined metal part - for two 5-day work
weeks, 2 of the parts produced each day are selected and a critical diameter is measured

The big question: Where is observed variation coming from? Within a day? Between days? Between weeks?
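
One way to start on the big question is to store the measurements in the same nested shape as the study (weeks, days within weeks, parts within days) and summarize the variation at each level. The sketch below uses made-up diameters. The summaries chosen (range within a day, standard deviation of daily means, difference of weekly means) are simple illustrative choices, not a formal analysis.

```python
import statistics

# Hypothetical diameters (mm): 2 weeks x 5 days x 2 parts per day. Made-up values.
diameters = {
    "week 1": {"Mon": [10.02, 10.04], "Tue": [10.01, 10.03], "Wed": [10.05, 10.06],
               "Thu": [10.00, 10.02], "Fri": [10.03, 10.03]},
    "week 2": {"Mon": [10.08, 10.10], "Tue": [10.07, 10.09], "Wed": [10.11, 10.10],
               "Thu": [10.06, 10.09], "Fri": [10.08, 10.12]},
}

# Level 1: parts within a day (range of the two parts measured that day).
within_day = {(w, d): max(parts) - min(parts)
              for w, days in diameters.items() for d, parts in days.items()}

# Level 2: days within a week (spread of the daily means).
day_means = {w: [statistics.mean(parts) for parts in days.values()]
             for w, days in diameters.items()}
between_days = {w: statistics.stdev(means) for w, means in day_means.items()}

# Level 3: weeks (difference between the two weekly means).
week_means = {w: statistics.mean(means) for w, means in day_means.items()}
between_weeks = abs(week_means["week 1"] - week_means["week 2"])

print(max(within_day.values()), between_days, between_weeks)
```

In this invented data set the week-to-week difference is the largest of the three, which would point attention at whatever changed between the weeks.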

Measurement
Validity: Faithfully representing the aspect of interest; i.e. usefully or appropriately represents the feature of an object or system.
Precision: Small variation in repeat measurements.
Accuracy (unbiasedness): Producing the true value on average

These three issues must be addressed in this order.


1. Validity must be addressed based on the setup of the study. The objects being tested
should be representative of the population that results will be applied to.
2. Calibration: an activity aimed at improving measurement accuracy. If I have a standard item and my system measures 2 units high on it, I might calibrate by subtracting
2 units from whatever the instrument reads.
3. Averages: a device used to improve precision of a measurement system; hoping to
partially cancel errors of measurement.
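
Items 2 and 3 can be sketched together in Python. The 2-unit bias comes from the calibration example above. The repeat readings and the true value of 50.0 are hypothetical numbers chosen for illustration.

```python
import statistics

KNOWN_BIAS = 2.0  # instrument reads 2 units high on the standard item (from the text)

def calibrate(reading, bias=KNOWN_BIAS):
    """Correct a raw reading by subtracting the known bias."""
    return reading - bias

# Hypothetical repeat readings of one item whose true value is taken to be 50.0.
raw_readings = [52.3, 51.8, 52.1, 51.7, 52.2, 51.9]

calibrated = [calibrate(r) for r in raw_readings]   # calibration targets accuracy
averaged = statistics.mean(calibrated)              # averaging targets precision

print(calibrated)
print(averaged)
```

Calibration shifts every reading by the same amount, so it cannot reduce the scatter between repeat readings. Averaging reduces the scatter but would leave the 2-unit bias in place if used on its own.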
