Section 1

SECTION 1
PRINCIPLES, CONCEPTS & DEFINITIONS OF RELIABILITY

1.1 Introduction to Reliability and Maintainability
Reliability and Maintainability (R&M) are vital characteristics of products and of manufacturing
machinery and equipment that enable U.S. manufacturers to be world-class competitors.
Reliability considerations play an increasing role in virtually all engineering disciplines. As the
demand for systems that perform better and cost less increases, there is a great need to minimize
the probability of failures, whether the failures simply increase costs and inconvenience, or
gravely threaten public safety.
In the broad sense, reliability is associated with dependability, with successful operation, and
with the absence of break-downs or failures.
From the product point of view, the customer relies on a product that performs its intended function
with no failure. From the manufacturing point of view, efficient production planning depends on
a process that yields high quality parts at a specific rate without interruption. Predictable
reliability and maintainability of the manufacturing machinery and equipment is a key ingredient
in maintaining production efficiency and the effective deployment of Just-in-Time principles. In
both cases, improved reliability and maintainability of a product/equipment lead to lower total
life cycle costs, which are necessary to maintain customer satisfaction and a competitive edge.
This document provides the methodology to achieve these objectives by providing R&M
techniques and guidance on where to apply them in the up-front design and development of a
new product/equipment. Furthermore, the methodology for continuous improvement of a design
and/or machinery after installation is provided. Implementation of the R&M concepts described
in this notebook will help increase product reliability and/or equipment availability, and
reduce overall operational and maintenance costs.
1.2 Basic Definitions of Reliability & Maintainability
Reliability is the probability that a product (equipment) can perform
continuously, without failure, for a specified interval of time when operating
under stated conditions. Increased reliability implies fewer equipment failures
and consequently less downtime and loss of production.
Maintainability is a characteristic of design, installation and operation, usually
expressed as the probability that a machine can be restored to specified operable
condition (returned to a serviceable state) within a specified interval of time when
maintenance action is performed in accordance with prescribed procedures and
resources.
Azim Houshyar, January 2011
Benefits of R&M
Highly reliable and maintainable production machinery offers the means for producing
consistently high quality products at lower costs and at higher output levels. Successful
application of R&M techniques has a very positive effect on employee morale and pride since the
reduction in downtime also results in significant reduction in employee stress and frustration.
Table 1-1. R&M User/Supplier Benefits
User Benefits                                        | Supplier Benefits
Higher machinery & equipment availability            | Reduced warranty costs
Unscheduled downtime reduced/eliminated              | Reduced build costs
Reduced maintenance costs                            | Reduced design costs
Stabilized work schedule                             | Improved customer relations
Improved J-I-T performance capability                | Higher customer satisfaction
Improved profitability                               | Increased understanding of productions
Increased employee satisfaction                      | Increased sales volume
Lower overall cost of production                     | Increased employee satisfaction
Higher quality parts and product                     | Improved status in the marketplace
Less need for in-process inventory to cover downtime | A competitive edge in the marketplace
Reduced Life Cycle Cost

Life Cycle Cost (LCC) refers to the total cost of a system during its operational life. LCC
is the sum of non-recurring costs plus operation and support costs. Operation and support
costs typically consume about 50% of the total LCC.
Figure 1-1. Total Life Cycle Cost (conception stage 3%, development 12%, machine build 35%,
operation and support 50%)
Emphasizing R&M practices during the conception and development stages can lower the total
LCC. By using R&M to minimize stress (electrical, mechanical, etc.), the equipment will be less
prone to failure during operation. This results in a decrease of the operation and support costs
that account for the bulk of total LCC.
A slight increase in spending to incorporate R&M practices during the conception and design
stages can dramatically lower the operation and support costs.
It is important to consider R&M at the early stages of a program. Studies have shown that as much
as 95% of LCC is determined during the conceptual and development stages. Once a new product
(equipment) has reached the build stage, therefore, only 5% of the opportunity remains to
effectively improve the reliability or maintainability of the product (equipment).
Examples of LCC Improvement: Intel Corporation is engaged in the design and manufacture of
solid-state devices. Intel has developed and is implementing a corporate strategy that addresses
the subject of reliability and maintainability in an aggressive, committed manner.
In portions of its assembly operation, Intel has improved the Mean Time Between Adjustments
(MTBA) from 5 minutes to 16 minutes. This improvement makes it possible for one operator to
run eight machines rather than four, a doubling of operator productivity. In addition, process
yields have been improved due to the elimination of scrap that resulted from the more frequent
shutdowns.
Intel's R&M program was also responsible for improving the Mean Time Between Failures
(MTBF) from 10 hours to 250 hours on its solid-state component wire bonding machines. This
improvement had the same effect as adding 30% capacity to the existing machine base. Another
benefit of this improved reliability lies in the fact that Intel was able to reassign the three line
technicians who previously served as "baby-sitters" to more productive work.
Your Example of LCC Improvement: Choose a product (equipment) that you are familiar
with. State your approach to improving the life cycle cost for the product (equipment). What
additional resources (time, money, technology, labor,...) are needed, and what are the foreseeable
benefits?
Example of Life Cycle Cost
Equipment Name:
Estimated Initial Cost:
Estimated Life:
Estimated Annual Operational Cost:
Current Status:
Recommended Modifications:
What do we mean when we say we have a Reliable Product?

Well, we may think of a dependable, trustworthy product, but can these descriptions be
quantified?
Can you predict the exact time when a given product will fail?
Well, even though you probably can't say the exact time of the failure of a product, you can
estimate the percentage of products that will fail by a given time.
Reliability can be stated in different forms. For instance:
1) The reliability that a product (equipment) will be performing its intended function after
1,000 hours of use is 0.80; or
2) The reliability at 1,000 hours is 0.80, or the reliability is 80%.
3) Another way to look at it is that if we place 100 units of this product (equipment) in use,
on average 80 of them will still be operating (with no failure) at 1,000 hours.
4) The reliability at any later time (say 1,500 hours) is lower.
Remember that the reliability of a product (equipment) should not be stated as simply 0.8, since
no time is specified. It is equally ambiguous for a product (equipment) to have a 1,000-hour life
without indicating a reliability for that time. Instead it should be stated that the 1,000-hour
reliability of the product (equipment) is 0.8.
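[Figure: Reliability Functions. Two curves, labeled 1 and 2, show reliability Rel [%] (vertical
axis, 0 to 100) against time to failure (horizontal axis, 0 to 10,000).]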
Question: Looking at the Figure, state your findings regarding the relationship between
reliability and time. Which of the two curves represents a more reliable system? Why?
Response:
In the definition of reliability, three phrases were used. Those phrases were:
1) Perform intended functions satisfactorily;
2) For the specified period of time; and
3) Under specified conditions.
What do we mean by "Perform Intended Function Satisfactorily"?
To understand this phrase better, let's define Failure.
FAILURE: An event in which machinery/equipment is not available to produce parts at specified
conditions when scheduled, or is not capable of producing parts or performing scheduled operations
to specification. For every failure, an action is required.
Unsatisfactory performance is subject to interpretation. Therefore it must be clearly defined at the
time of the contract. There will be various levels of failure based on the customer's level of
severity for incidents on the manufacturing equipment.
What do we mean by "Specified Time Period"?
Products deteriorate with use, and even with age when dormant. This is especially true for wood
products. Longer periods of usage imply a higher chance of failure and hence lower reliability.
For design purposes, target usage periods must be identified. Typically identified usage periods
are:
The warranty period;
Durability life that is a measure of useful life, defining the number of operating
hours until overhaul is expected or required.
What do we mean by "Specified Conditions"?
Products react to the environment in which they are placed. Different environments
promote different failure modes and different failure rates for a product. Therefore the
environmental factors which the product will encounter must be clearly defined.
Environmental factors such as temperature, humidity, vibration, mechanical shock,
immersion/splash, pressure/vacuum, contamination, electrical noise, electromagnetic fields,
and corrosive materials must be addressed during the design stages of the equipment. These
environmental conditions must be thoroughly documented.
1.3 Association between Quality and Reliability
Lamberson lists quality characteristics as:
Psychological (taste, beauty, style, status);
Technological (hardness, vibration, noise, materials);
Time oriented (reliability and maintainability);
Contractual (warranty); and
Ethical (honesty of repairman, experience of sales force).
Quality is referred to as fitness for use. This comprises all phases of the life cycle of the
product including engineering, manufacturing, marketing and maintenance. This must be
addressed from the customers' standpoint. Company-wide quality control is a philosophy that
focuses on meeting customer needs and expectations throughout the life cycle of the product
while continuously improving the production process.
Quality Defects are defined as those which can be located by conventional inspection
techniques.
Reliability Defects are defined as those which require some stress applied over time to develop
into detectable defects.
Performance and Reliability: Engineering is concerned with designing and building products
for improved performance. This requires the designs to incorporate features that may tend to be
less reliable than the older systems with lower performances.
The trade-offs between performance and reliability are often subtle. Thus any product with both
improved performance and improved reliability is a significant advance.
We usually improve performance through increased loading, for example:
Decreasing the weight of an aircraft increases the stress level in the structure;
Increasing the temperature to gain thermodynamic efficiency causes more rapid corrosion of materials.
Operating this close to the physical limits of a system increases the number of failures.
Specifications for purer materials, tighter dimensional tolerances, etc., are required to reduce
uncertainty in the performance limits, and thereby permit one to operate close to these limits
without increasing the probability of exceeding them.
The performance of a system is often increased at the expense of increased complexity; this again
decreases reliability, unless compensating measures are taken.
Probably the greatest improvements in performance come from the introduction of new materials or
devices to achieve a particular goal:
Replacement of wood by metal,
Replacement of piston with jet aircraft engine,
Replacement of vacuum tubes with solid-state electronics.

Notes: Even with major advances in technology, reliability may be a severe problem, particularly
during the early stages of introducing a new technological advance.
At any stage of technological development, trade-offs must be made between:
Reliability and performance,
Reliability and cost.

Ex: Race car: Performance is improving, but reliability remains below 50%. Here performance
is everything, and one must tolerate a high probability of break-down if there is to be any chance
of winning the race.
Ex: Military aircraft: An intermediate example in which reliability and performance are
balanced.
Ex: Commercial airliner: In this case, reliability is the overriding design consideration. Thus
degraded speed, payload, and fuel economy are accepted to maintain a very small probability of
catastrophic failure.
1.4 Definition of Reliability Measures
In this section, we will define:
Repairable and non-repairable units;
Mean Time Between Failures (MTBF);
Mean Time To Failure (MTTF);
Failure rate;
Mean Time To Repair (MTTR);
Reliability, Maintainability, and Availability.

Items/components/subsystems/systems can be classified as repairable or non-repairable.
Whenever we use MTBF, we are referring to repairable entities, whereas MTTF is used for
non-repairable entities.
What are some indicators used to Quantify Product Reliability?
Mean Time Between Failures (MTBF): The average time between failure occurrences, calculated as the
sum of the operating time of a machine divided by the total number of failures.
Mean Cycles Between Failures (MCBF): The average number of cycles between failure occurrences,
calculated as the sum of the operating cycles of a machine divided by the total number of failures.
Failure Rate: Number of failures per unit of gross operating period in terms of time, events,
cycles, or number of parts.
Reliability: R(t) indicates reliability at time t, where t is the duration of failure-free operation of
the equipment.
MTBF = (Operating time)/(Total number of equipment failures)
Failure Rate = (Total number of equipment failures)/(Operating time)
1. MTBF = 1,000 hours means that, on average, a failure will occur every 1,000 hours
of usage.
2. A failure rate of 1 failure per 1,000 hours (λ = 0.001/hr) means that, on average, one
failure will occur every 1,000 hours of usage.
3. R(t = 1,000 hr) = 0.8 means that the probability of 1,000 hours of failure-free performance is
80%.
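The two ratios above can be sketched in a few lines of Python. This is only an illustration: the function names and the operating log (5 failures in 5,000 operating hours) are assumptions made for the example, not data from this notebook.

```python
def mtbf(operating_time_hr: float, failures: int) -> float:
    """MTBF = (operating time) / (total number of equipment failures)."""
    if failures <= 0:
        raise ValueError("at least one failure is needed to estimate MTBF")
    return operating_time_hr / failures


def failure_rate(operating_time_hr: float, failures: int) -> float:
    """Failure rate = (total number of equipment failures) / (operating time)."""
    if operating_time_hr <= 0:
        raise ValueError("operating time must be positive")
    return failures / operating_time_hr


# Hypothetical log: 5 failures observed over 5,000 operating hours.
print(mtbf(5000, 5))          # 1000.0 hours between failures
print(failure_rate(5000, 5))  # 0.001 failures per hour
```

The two results are reciprocals of each other, which is the relationship used throughout the rest of this section.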
What is the relationship between Reliability Numbers?

The relationship between the Failure Rate and MTBF is:
MTBF = 1/Failure Rate
Therefore a failure rate of 0.001/hr implies a MTBF of 1000 hours.
Assuming that the time to failure of the equipment is exponentially distributed, we can
use the following equation to calculate the reliability of a product or machinery at a specified
time t:
R(t) = e^(-t/MTBF),   t > 0
where t = time over which the machine is to be operated without failure, and
      e = the base of the natural logarithm (approximately 2.718).
For example, the one-shift (8-hour) reliability of a machine with an MTBF of 1,000 hours is:
R(t = 8 hr) = e^(-8/1000) = 0.992   =>   R8 = 99.2%
There is a 99.2% chance of running the equipment for 8 hours without encountering a failure. The
same equipment has only a 90.5% chance of running for 100 hours without encountering a failure.
I recommend selection of an agreeable time frame over which reliability is to be sustained. An
example might be the 8-hour Reliability, denoted by R8, which represents the probability that
the machine will not fail during an 8-hr shift.
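As a minimal sketch of the exponential reliability model, the calculation can be written in Python as follows. The function name `reliability` and the 24-hour (three-shift) case are illustrative assumptions; the 8-hour case repeats the one-shift example for a machine with an MTBF of 1,000 hours.

```python
import math


def reliability(t_hr: float, mtbf_hr: float) -> float:
    """R(t) = exp(-t / MTBF), valid only under the exponential assumption."""
    if t_hr < 0 or mtbf_hr <= 0:
        raise ValueError("t must be non-negative and MTBF must be positive")
    return math.exp(-t_hr / mtbf_hr)


r8 = reliability(8, 1000)    # one 8-hour shift
r24 = reliability(24, 1000)  # three consecutive shifts (illustrative)
print(f"R8  = {r8:.3f}")     # R8  = 0.992
print(f"R24 = {r24:.3f}")    # R24 = 0.976
```

Reliability falls as the required failure-free interval grows, which is why a reliability figure should always be quoted together with its time frame.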
Example 1: The failure rate of a component is 0.001 hr^-1.
a) Find the MTBF.
b) Find the R8.
c) What is the probability that the component will not fail in one month of continuous
operation?
Example 2: Given the reliability function R(t) = e^(-t/1000), where t is time to failure in hrs.
a) Find the 100 hour reliability.
b) Find the 1,000 hour reliability.
c) If 1,000 devices are placed in operation, how many will still be operating at 100 hrs?
Example 3: A machine has an MTBF of 50 hours.
a) Find the failure rate.
b) Find the one-shift reliability.
c) Find the three-shift reliability.
d) In 100 hours of operational time, how many failures would you expect?
What is the Relationship Between MTBF of a System and MTBFs of its Components?
Most systems consist of several subsystems. Occasionally we need to combine MTBFs from
different subsystems to calculate the MTBF for the main system. An example is to analyze a
design in which we may have data on the MTBFs of the different subsystems used in the new
design.
Example 4: Consider a system in which one subsystem has an MTBF of 25 hours. On the
average, in 100 hours of uptime, there will be 4 failures. Using the relationship between MTBF,
number of failures, and operating time it is seen that:
MTBF = (Uptime)/(Total Number of Failures)
MTBF1 = 100/4 = 25 hrs.
Now consider adding a second subsystem with a MTBF of 20 hours to the previous system. This
subsystem is expected to have 5 failures in 100 hours of uptime.
MTBF2 = 100/5 = 20 hrs.
How do we combine the MTBFs to obtain the MTBF for the main system?
We can expect 4 + 5 = 9 failures in 100 hours of operation; therefore:
MTBFS = 100/9 = 11.1 hours
that is, the system fails more often than either of its subsystems.
Can you figure out the rule? The rule is to add the failure rates:
λS = λ1 + λ2
or equivalently:
1/MTBFS = 1/MTBF1 + 1/MTBF2
that is:
1/MTBFS = 1/25 + 1/20 = 0.09  =>  MTBFS = 11.1 hrs.
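The rule of adding failure rates can be sketched in Python. The function name `system_mtbf` is an assumption made for illustration, and the sketch assumes, as in Example 4, that every subsystem failure counts as a system failure.

```python
def system_mtbf(subsystem_mtbfs):
    """Combine subsystem MTBFs by summing their failure rates (1/MTBF)."""
    if not subsystem_mtbfs or any(m <= 0 for m in subsystem_mtbfs):
        raise ValueError("provide at least one positive MTBF")
    system_failure_rate = sum(1.0 / m for m in subsystem_mtbfs)
    return 1.0 / system_failure_rate


# Example 4: subsystems with MTBFs of 25 and 20 hours.
print(round(system_mtbf([25, 20]), 1))  # 11.1
```

The same function accepts any number of subsystems, so it applies equally to systems with more than two subsystems.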
Example 5: Consider a press which consists of the following five subsystems:

Subsystem        | MTBF        | Failure rate (λ)
Crown Assemblies | 50,000 hrs  | 2 x 10^-5 hr^-1
Slide Assemblies | 20,000 hrs  | 5 x 10^-5 hr^-1
Gibing           | 200,000 hrs | 5 x 10^-6 hr^-1 = 0.5 x 10^-5 hr^-1
Columns          | 10,000 hrs  | 1 x 10^-4 hr^-1 = 10 x 10^-5 hr^-1
Beds             | 10,000 hrs  | 1 x 10^-4 hr^-1 = 10 x 10^-5 hr^-1

a) Calculate the MTBF for the press.
b) What is the 8-hour reliability of the press?
Example 6: Consider a work-station for which the subsystem failure rates are:

Subsystem             | λ (1/hr) | MTBF (hrs)
Load/unload mechanism | 0.00003  | 33,333
Mechanical actuator   | 0.00001  | 100,000
Electronics           | 0.000005 | 200,000
Hydraulics            | 0.0004   | 2,500

a) Calculate the MTBF for the work-station.
b) What is the 4-hour reliability of the work-station?
MTBF and Failure Rate are two related measures of the Reliability of the equipment or product.
The next question is: How do we measure the Maintainability?
Maintainability is a characteristic of design, installation and operation, usually expressed as the
probability that a machine can be retained, or restored to, specified operable condition (returned
to a serviceable state) within a specified interval of time when maintenance is performed in
accordance with prescribed procedures.
In what follows, some Maintainability Improvement Strategies are discussed.
1.5 Maintainability Improvement Strategies

Safety: Safety engineering must be introduced at the design stage, not after the equipment is
built. Safety personnel must be consulted up front to fully utilize the best technology available in
a safe and ergonomically efficient manner. Properly designed, the operator's environment will not
only reduce the risk of injury, it will also avoid exposure to health risks or activities likely to
cause repetitive motion disorders. Pinch-point guarding, safety labels, personnel guards,
warning devices, lock-outs and other appropriate safety measures must be integrated into the
design. Safety requirements must be included in the specifications. Applicable safety standards
must be adhered to.
Accessibility: Accessibility means having sufficient working space around a component to
diagnose, troubleshoot and complete maintenance activities safely and effectively. Provision
must be made for movement of necessary tools and equipment with consideration for human
ergonomic limitations.
Operators and maintenance and service personnel know best how the repair job will be done and
can identify the likely problems; therefore, they should be involved in evaluating the
design for accessibility.
Common Building Practices: Due to the relevance of the connections, piping runs, wiring and
plumbing to the performance of the equipment, there should be a more rigorous approach to the
practice. At a very minimum, machine suppliers should develop a well-documented practices
manual and advise their assemblers to adhere to its content.
Diagnostics: Diagnostic devices indicating the status of equipment should be built into
manufacturing machinery to aid maintainability support processes. Use of light-emitting
diodes (LEDs) to indicate fault status can be helpful. The diagnostics can be as simple as a
visual display indicating the equipment's status as a go/no-go condition, or as sophisticated as a
knowledge-based expert system with the capability of analyzing a problem and recommending
the most likely solution.
Diagnostic systems should have the capability of storing equipment performance data as
permanent records for reliability analysis and supplier feedback supporting the reliability growth
management process. Output from diagnostic systems should be in a format compatible with
commercially available database management software.
When component assemblies and subsystems are used to create a manufacturing system,
hardware and software "hooks" should be put in place in the concept and design phase to
facilitate integration of the diagnostics system in the build phase. Diagnostic systems should
indicate the specific component to replace or repair.
Captive Hardware and Quick Attach/Detach: Captive and quick attach/detach hardware
provides for rapid and easy replacement of components, panels, brackets and chassis. The
environment in which these devices are used may restrict the type of device used. Spare parts and
replaceable subassemblies should also be configured with these devices preassembled. Examples
are:
plate, anchor and caged nuts
push and snap-in fasteners
clinch and self-clinching nuts
quarter-turn fasteners
Modularity: Modularity requires that designs be divided into physically and functionally distinct
units to facilitate removal and replacement. It allows design of components as removable and
replaceable units for an enhanced design with minimum downtime. Modular design concepts
typically are thought of in terms of electrical black boxes, printed circuit boards and other quick
attach/detach electrical components. These concepts are also applicable to the mechanical
elements of production equipment.
Advantages of modularity are:
New designs can be simplified and design time can be shortened by making use of
standard, previously developed building blocks.
The need for specialized technical skill is reduced.
Training of plant maintenance personnel is easier.
Engineering changes can be made quickly with fewer side effects.

Maintenance Procedures: Maintenance procedures must describe in detail the adjustment,
replacement and repair of machine systems, subsystems and component parts. The original
equipment manufacturer will provide recommended preventive maintenance procedures at
intervals based on time and/or machine cycle count. Maintenance requirements should be
ranked by the criticality of the activity so that the equipment user can schedule
maintenance accordingly.
The maintenance procedures should be contained in service manuals or a computerized data-base
reflecting the specific content and configuration of the equipment being supported. Exploded
view illustrations, photographs, simplified assembly drawings and/or parts lists relating to the
required maintenance activities and procedures should be included wherever applicable.
Technical information such as pressure settings, operational sequences and moving part
clearances should be included as appropriate.
Visual Management Techniques: Visual Management techniques differ for varying types of
equipment. A team effort must exist between supplier and user to deliver the best techniques to
the user. All the team members should review these on a continual basis at concept/design,
machine build and on the manufacturing floor. Visual management techniques are used on
machinery and equipment to bring the workplace awareness to a level that allows problems and
abnormal conditions to be quickly recognized at a single glance. Through visual management, a
system is created that enhances the equipment inspection process by allowing quick identification
of safety, quality, environmental, equipment and process abnormalities.
Typical visual management techniques include:
Match marking of all fasteners (nuts, bolts, screws), whether fixed, adjustable or critical
Match marking of all control adjustments (pressure, flow, temperature, speed, level, voltage, current, etc.)
The identification of normal operating ranges and levels
Direction of flow and product color coding on piping and hoses
Direction of rotation on drives, belts, chains, motors, etc.
Function labels on switches, valves, buttons, lights, etc.
Identification labels on cabinets, panels, boxes, etc.
Filters (lube, hydraulic and air) that indicate when dirty
Filters labeled with replacement filter element number
Belt and chain drives with guarding that permits quick visual inspection and access
Replacement belt or chain number labels on guarding
Each lube point labeled with product number and color code
Temperature sensitive labels on all critical components (motors, drives, controls, hydraulic units, etc.)
Equipment layout with all electrical control panel safety lockout points indicated (affixed to the main electrical control panel)
Equipment layout with all lubrication fill points, frequencies and product codes indicated (affixed to the main electrical control panel)
The identification of all control drawing numbers on the main electrical control panel
Signals or alarms that indicate a major abnormality, safety interlock tripped, process out of control, etc.
Equipment and process operator inspection list (affixed to the main electrical control panel)
Spare Parts Management: Maintenance of manufacturing equipment and machinery requires a
readily available supply of spare parts and supporting materials to operate, maintain and service
the equipment. Spare parts management will identify and make available the required quantities
of spare parts at an optimum inventory cost to the equipment user.
Plans for equipment support through spare parts management should begin during the equipment
design phase and continue through the life cycle of the equipment. Consideration should be given
to the lead time required to requisition, manufacture and receive into inventory the required parts
and/or materials to avoid excessive costs to procure replacement parts on an emergency basis.
The machinery and equipment manufacturer should make a recommended spare parts list
available to the equipment user. Parts may be provided from previously purchased inventory
(commercial parts and supplies) or purchased specifically for the subject equipment and
maintained in inventory for its use. Sourcing of spare and/or replacement parts, including
consumable materials, should be managed to ensure that the performance and capability of the
manufacturing machinery and equipment is maintained at or above the original manufacturer's
specifications.
Other strategies for maintainability improvement include Standardization, that is, the use of
component parts that are commercially standard, readily available, and common from machine to
machine; and Color Coding, which can help to speed up maintenance procedures.
Now that we have reviewed a few of the procedures for maintainability improvement, it seems
natural to ask the following question:
How can we measure the Maintainability Performance of a machine?

The response is by measuring its Mean Time To Repair (MTTR), which can be used as an
indication of the ease of maintaining the equipment.
MTTR is the average time to restore machinery or equipment to specified conditions.
What procedure can be used to measure MTTR?
When very little data exist to calculate MTTR:
1. Determine the different Modes of Failure of the equipment, using your judgement and
maintenance experience with similar equipment;
2. For each Mode of Failure, estimate its Frequency of Occurrence (failure rate);
3. Based on the way the equipment is to be designed, estimate the Time to Repair for each mode;
4. By multiplying the Failure Rate (item 2 above) by the Time to Repair (item 3 above),
calculate the Maintenance Load for each mode of failure;
5. Calculate MTTR as:
MTTR = (Total Maintenance Load)/(System Failure Rate)
in which the system failure rate is the sum of the failure rates for the different modes;
6. Compare the Maintenance Load for the different Modes of Failure, and initiate design action
for failure modes that create a high load on the maintenance function.
Example 7: During the equipment design and development phase (using the Failure Mode
Analysis), the following three failure modes were identified, and the corresponding failure rates
and times to repair were estimated. Use the information to estimate MTTR and to rank those
failure modes.

Failure Mode    | Failure rate per 1000 hrs (λ) | Time to repair (t) (hrs) | Maintenance Load (λ x t)
Hydraulic leak  | 10                            | 1                        | 10
Torn part       | 2                             | 10                       | 20
Conveyor jammed | 4                             | 0.5                      | 2

MTTR = (Maintenance Load)/(System Failure Rate) = (10+20+2)/(10+2+4) = 32/16 = 2 hrs

The procedure can be summarized as:
MTTR = Σ (component failure rate x time to repair component) / (Total system failure rate)
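The summary formula can be sketched in Python using the Example 7 data. The names `failure_modes`, `maintenance_loads` and `mttr` are assumptions made for illustration. The failure rates (10, 2 and 4 per 1,000 hours) and the maintenance loads (10, 20 and 2) are taken from the worked MTTR calculation; the 1-hour repair time for the hydraulic leak is inferred from its load of 10.

```python
# Failure modes: name -> (failure rate per 1,000 hr, time to repair in hr).
# The hydraulic-leak repair time of 1 hr is inferred from its load of 10.
failure_modes = {
    "Hydraulic leak": (10, 1.0),
    "Torn part": (2, 10.0),
    "Conveyor jammed": (4, 0.5),
}


def maintenance_loads(modes):
    """Maintenance load of each mode = failure rate x time to repair."""
    return {name: rate * ttr for name, (rate, ttr) in modes.items()}


def mttr(modes):
    """MTTR = sum of maintenance loads / sum of failure rates."""
    total_rate = sum(rate for rate, _ in modes.values())
    total_load = sum(maintenance_loads(modes).values())
    return total_load / total_rate


# Rank the failure modes from highest to lowest maintenance load.
loads = maintenance_loads(failure_modes)
for name, load in sorted(loads.items(), key=lambda item: item[1], reverse=True):
    print(f"{name}: maintenance load = {load:g}")
print(f"MTTR = {mttr(failure_modes):g} hr")  # MTTR = 2 hr
```

Sorting by maintenance load puts the torn part first, which marks it as the failure mode most in need of design action even though it is the least frequent.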
Example 8: Consider the following situation in which the MTBF and TTR for five different
modes of failure are listed:

Subsystem number | MTBF (hour) | Time to Repair (hour)
1                | 1,000       | 1.5
2                | 5,000       | 4.0
3                | 10,000      | 1.0
4                | 2,500       | 2.5
5                | 500         | 0.5

Calculate MTBF and MTTR for the system and rank the significance of different modes of
failure from the maintenance load point of view.

Subsystem | Failure Rate (λ) | Time to Repair | Maintenance Load
1         |                  |                |
2         |                  |                |
3         |                  |                |
4         |                  |                |
5         |                  |                |

MTBF =
MTTR =
Example 9: Consider the following situation in which the MTBF and TTR for six different
modes of failure are listed:

Subsystem number | MTBF (hour) | Time to Repair (hour)
1                | 120         | 4.0
2                | 100         | 5.5
3                | 600         | 3.0
4                | 1,000       | 1.0
5                | 1,500       | 0.5
6                | 750         | 1.5

Calculate MTBF and MTTR for the system and rank the significance of different modes of
failure from the maintenance load point of view.

Subsystem | Failure Rate (λ) | Time to Repair | Maintenance Load
1         |                  |                |
2         |                  |                |
3         |                  |                |
4         |                  |                |
5         |                  |                |
6         |                  |                |
Total     |                  |                |

MTBF =
MTTR =
1.6 The Relationship Between R&M and Availability
What is Availability and how is it related to R&M?
Availability (A) is the probability that, at any time, the system is either operating satisfactorily
or is ready to be operated on demand, when used under stated conditions.
The goal of availability engineering and management is to determine and achieve the availability
performance necessary to support the manufacturer's corporate, operating, company, and plant-level
business performance and leadership.
Remember that the plant does not have to be shut down to experience reduced availability. When
many plant items fail, they do not shut down the plant. Nor do they always reduce its production
level. However, the plant's characteristic availability has been reduced.
A simple example is a plant with two pieces of equipment. One is a spare. When one fails, the
other is placed in service. Thus, the plant's real-time production level is not reduced. However,
the probability of maintaining that level over a period of time is substantially less.
Availability can be viewed as the ability of a piece of equipment (under the combined aspects of
its reliability, maintainability and maintenance support) to perform its required function at a
stated instant of time.
Availability includes the built-in equipment features (R&M) as well as in-plant maintenance
support function (M).
RELIABILITY + MAINTAINABILITY => AVAILABILITY
How do we measure Availability?

Depending on the stage of the life cycle, we can use one of the following two models:
a) During the design and development phase, the availability is calculated from the design data
   using:
       A = MTBF/(MTBF + MTTR)
b) During the later phases of the life of the product, the availability is calculated using the
   actual data on operating time and downtime; that is:
       A = Operating Time/Net Available Time
   in which:
       Net Available Time = Operating Time + Unplanned Downtime
We will see that the two equations result in the same value for A.
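A short derivation shows why the two models agree. It is a sketch that assumes MTBF and MTTR are estimated from the same observation period as Operating Time divided by the number of failures and Unplanned Downtime divided by the number of failures. The symbols n (number of failures), T_op (operating time) and T_down (unplanned downtime) are introduced here only for illustration.

```latex
\begin{aligned}
\text{MTBF} &= \frac{T_{op}}{n}, \qquad \text{MTTR} = \frac{T_{down}}{n} \\[4pt]
A &= \frac{\text{MTBF}}{\text{MTBF} + \text{MTTR}}
   = \frac{T_{op}/n}{T_{op}/n + T_{down}/n}
   = \frac{T_{op}}{T_{op} + T_{down}}
   = \frac{\text{Operating Time}}{\text{Net Available Time}}
\end{aligned}
```

The number of failures cancels, so both expressions give the same A whenever the design values of MTBF and MTTR match what is observed in service.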
Example 10: Calculate the availability for the following welder machine:
Subsystem    MTBF (hr)    Failure rate    Ave. TTR (hr)    Maint. Load
   1.          2,400                          1.25
   2.          4,000                          1.0
   3.            200                          2.25
   4.          1,500                          0.5
   5.          7,000                          3.0
TOTAL
MTBF =
MTTR =
A =
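The Example 10 table can be worked through in a few lines. This is a minimal sketch, assuming the five subsystems are in series, that each failure rate equals 1/MTBF, and that maintenance load is failure rate multiplied by average time to repair. The variable names are hypothetical.

```python
# Example 10 data: (MTBF in hours, average time to repair in hours) per subsystem
welder = [(2400, 1.25), (4000, 1.0), (200, 2.25), (1500, 0.5), (7000, 3.0)]

failure_rates = [1.0 / mtbf for mtbf, _ in welder]     # failures per hour
maint_loads = [ttr / mtbf for mtbf, ttr in welder]     # failure rate x repair time

total_rate = sum(failure_rates)
total_load = sum(maint_loads)

system_mtbf = 1.0 / total_rate          # failure rates add for a series system
system_mttr = total_load / total_rate   # failure-rate-weighted mean repair time
availability = system_mtbf / (system_mtbf + system_mttr)

print(f"MTBF = {system_mtbf:.1f} h")
print(f"MTTR = {system_mttr:.2f} h")
print(f"A    = {availability:.4f}")
```

Under these assumptions the welder comes out at about 154 hours MTBF, just under 2 hours MTTR, and an availability of about 0.987. Subsystem 3, with its 200-hour MTBF, accounts for most of the total maintenance load.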
One useful tool for measuring the performance of a piece of equipment is OEE.
Overall Equipment Effectiveness (OEE) is a comprehensive measure of equipment
effectiveness. The measurement encompasses:
1) What percentage of time the machinery is available (availability).
2) How fast the machinery is running relative to its design cycle time (speed ratio or
performance efficiency).
3) What percentage of the resulting product is within quality specifications (yield).
OEE = Availability x Performance Efficiency x Yield

The above formula applies not only to the overall system but also to each individual machine
that makes up the system.
R&M is an excellent means of increasing both individual and system OEE percentages, because
emphasizing R&M practices has a profound effect on all three factors in the OEE equation.
Availability. R&M increases uptime because the equipment is more reliable and, when it does
require maintenance, the service can be accomplished in a shorter time.
Performance Efficiency. Due to proper R&M practices, the equipment has fewer failures and
less maintenance time. This means the equipment can operate for longer periods at its designed
cycle time.
Yield. When components of the equipment are designed according to R&M practices, the
equipment is less susceptible to variations that could result in unacceptable parts.
Improved Uptime. Uptime and its counterpart, downtime, as functions of OEE are more clearly
defined in Figure 1-5. The importance of R&M is realized when it is considered that the goal of
R&M practices is to reduce the time required for preventive and corrective maintenance. By
reducing the time required for these major contributors to downtime, the uptime increases
correspondingly.
Example 11: A machine that was designed to produce 360 parts per hour was put under a
continuous production test over a 5-day period. During that interval the machine broke down 6
times for a total of 14 hours. In addition the machine was on scheduled repair for 4 hours. The
parts produced by the machine were inspected. 32,000 of the parts passed the inspection, but
1,400 of them were rejected. Based on the given information, try to answer the following
questions:
a) Calculate the MTBF.
b) Calculate the MTTR.
c) Calculate Availability.
d) Calculate Performance Efficiency.
e) Calculate the Yield.
f) Calculate the machine OEE.
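One way to organize the Example 11 calculation is sketched below. It rests on several assumptions that the problem statement does not spell out: "continuous" is read as 24 hours per day for 5 days, the 4 hours of scheduled repair are excluded from Net Available Time, the 1,400 rejected parts are in addition to the 32,000 good parts, and Performance Efficiency is taken as actual output divided by design output over the operating time. All variable names are hypothetical.

```python
design_rate = 360             # parts per hour at the design cycle time
total_hours = 5 * 24          # assumption: continuous test, 24 h/day for 5 days
scheduled_hours = 4           # scheduled repair, treated as planned downtime
failures = 6
unplanned_down_hours = 14
good_parts = 32_000
rejected_parts = 1_400        # assumption: rejects are in addition to the good parts

net_available = total_hours - scheduled_hours        # hours the machine was scheduled to run
operating = net_available - unplanned_down_hours     # hours the machine actually ran

mtbf = operating / failures                          # a) hours between failures
mttr = unplanned_down_hours / failures               # b) hours per repair
availability = operating / net_available             # c)
total_parts = good_parts + rejected_parts
performance = total_parts / (design_rate * operating)   # d) actual vs. design output
quality_yield = good_parts / total_parts             # e)
oee = availability * performance * quality_yield     # f)

print(f"MTBF = {mtbf:.1f} h, MTTR = {mttr:.2f} h")
print(f"Availability = {availability:.3f}")
print(f"Performance efficiency = {performance:.3f}")
print(f"Yield = {quality_yield:.3f}")
print(f"OEE = {oee:.3f}")
```

With these readings the availability from operating data (102/116) matches MTBF/(MTBF + MTTR), and the OEE reduces to good parts divided by the parts the machine could have made in its net available time. A different reading of the test duration or the reject count would change the numbers.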
Figure 1-5. Relationship of Typical Time Elements
A Few Notes:
1) Reliability was defined as the probability that a component, device, or system will
   perform satisfactorily for at least a given period of time when used under stated conditions.
2) To measure reliability we need to specify:
   a) A precise definition of satisfactory performance;
   b) The time base over which the performance must be maintained;
   c) The environmental conditions that will be encountered.
3) The concept of reliability tells us that any given product has a built-in reliability function
   that relates its reliability to time and decreases as time progresses.
   Note also that:
      Reliability is a probability concept.
      Reliability theory is a subset of quality control, but Q.C. function deals primarily
        with new products under inspection, whereas reliability deals with products in
        service.
      We shall consider the system as a set of interacting components working together
        as an integrated whole.
      A system is said to fail when it ceases to perform its intended function.
4) When there is a total cessation of function, the system has clearly failed, but often it is
   necessary to define failure quantitatively to consider failure through deterioration or
   instability of function. Examples:
      A motor that is no longer capable of delivering a specified torque,
      A machine that no longer processes parts at its designed capacity.
5) The way in which time is specified in the definition of reliability varies considerably,
   depending on the nature of the system under consideration. Therefore in any
   intermittently operated system (say a switch), we must specify whether calendar time or
   the hours of operation is to be used.
6) Specifying the conditions under which a system is to operate is important. It may be
   divided into:
      The principal design loads: the weight that a structure must support, the electrical
        load on a generator, the rate of information transfer on a telecommunication system; and
      The environmental effects: temperature extremes, dust, salt, humidity, ...
7) Several quantities can be used to characterize the reliability of a system including mean
   time to failure and failure rate, and in the case of repairable systems mean time to repair
   and availability.
8) Reliability is defined positively, in terms of a system performing its intended function,
   and no distinction is made between failures. In reality, there is concern not only with the
   probability of failure, but also with the potential consequences of different modes of
   failure. In particular, failures that present severe safety problems are important. Home
   appliances need reliability to avoid frequent failures that result in customer
   dissatisfaction and/or create a safety hazard such as electric shock.
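Notes 3 and 7 can be illustrated with a small numerical sketch. It assumes a constant failure rate, so that reliability follows the exponential model R(t) = exp(-t/MTBF). The notes above do not require this model, and the 50-hour MTBF and 8-hour shift are hypothetical values chosen only for illustration.

```python
import math


def reliability(t_hours, mtbf_hours):
    """Probability of operating t_hours without failure, assuming a constant failure rate."""
    return math.exp(-t_hours / mtbf_hours)


mtbf = 50.0          # hypothetical MTBF in hours
shift = 8.0          # hypothetical shift length in hours

one_shift = reliability(shift, mtbf)
three_shifts = reliability(3 * shift, mtbf)

print(f"One-shift reliability   R(8)  = {one_shift:.3f}")
print(f"Three-shift reliability R(24) = {three_shifts:.3f}")
```

The longer the required mission time, the lower the reliability, which is the decreasing reliability function described in note 3. Under this model the three-shift value is simply the one-shift value cubed.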
1.7 System Life Cycle and Reliability

Successful implementation of R&M is dependent upon thorough communication between
marketing and design engineers, and/or the user and supplier. This communication must begin at
project conception and continue through the entire life of the product (equipment). This ensures
that equipment problems will be identified, root causes determined, and corrective action
implemented.
This section discusses a five-phase program management process that describes how R&M
applies to the various phases in the life cycle of a product (equipment). Techniques to
assist the R&M implementation are also suggested. For more detail refer to the "Reliability and
Maintainability Guideline" published by SAE.
Five-Phase Program Management Process
Machinery and equipment development programs can be managed using a five-phase program
management process. The process starts in Phase 1 with concept and proceeds through
decommissioning and/or conversion in Phase 5. This process is appropriate for any hardware
development program for machinery and equipment.
Figure 1-6. Five Phases of Manufacturing Machinery and Equipment Life Cycle
The reliability activities taking place during each of these phases of the product life may be quite
different. For instance, in the project definition, the objectives of the system are set forth in the
form of one or more functional requirements. For an ergonomically designed chair, the exact
requirements are specified; for a computer desk, the exact dimensions and specifications. In
addition, the environment in which the system is to function must be determined (e.g., the range
of temperature and humidity, and the concentrations of dust or other contaminants). Finally, the
service life to which the system is to be designed must be specified.
From such requirements, a conceptual design is formulated that in broad form outlines how the
system is to function, and provides the general plan for its construction. From the functional
requirements comes the definition of failure, and thus of reliability. Reliability requirements may
then be set, and the trade-offs between reliability, cost and functional requirements may be
examined as the design proceeds into the detailed phase.
The conceptual design must be converted into a detailed set of drawings and specifications from
which the system can be built. During this phase, maintenance requirements and procedures are
also likely to be developed. As the design proceeds, experiments, testing, and analysis are required
to choose between alternatives, to solve problems, and to predict the performance of subsystems
or components.
Reliability considerations should permeate this stage of design in setting safety factors and design
margins, eliminating unnecessary complexities, translating system reliability criteria into
reliability requirements for subsystems, and setting time intervals for inspection, maintenance
and replacement of parts subject to wear.
Note that in this stage, the detailed examination of potential failure mechanisms and modes is
most beneficial, for often they may be eliminated or mitigated without too much expense. In the
later stages of the design process, prototypes are built and the first reliability tests may be
performed.
Historically, reliability considerations during the manufacturing of a system are related to the
practices of quality control. Reliability in manufacture is monitored and controlled, and the use of
statistical Q.C. techniques for reliability testing on manufactured items is exceedingly important.
Verification of end product reliability by testing to failure is not possible in large one-of-a-kind
systems. Thus very stringent acceptance criteria on components, careful supervision and control of
the construction process, and an elaborate set of proof or acceptance tests are necessary in such
situations.
Reliability and Phase 1 - Concept Phase
The first phase is research and limited development or design usually resulting in a proposal.
During this phase both the user and the supplier must work together to establish system
requirements. It is recommended that the user team include machine operators, maintenance
personnel and product engineers. The supplier team should include MM&E suppliers.
Machinery mission and environmental requirements are defined during this phase. Also
identified are safety issues, desired goals for reliability and maintainability and life cycle cost.
Simultaneous (concurrent) engineering can be introduced at either Phase 1 or Phase 2 depending
on the particular situation and MM&E.
Reliability and Phase 2 - Development/Design Phase
The development/design phase determines the majority of the life cycle cost. The issues from the
concept phase are incorporated. Safety, ergonomics, accessibility and other maintainability issues
are designed into the system. R&M allocation requirements are formalized.
Components and component suppliers should be selected based on the predictive R&M statistics
they provide. It is recommended that MM&E suppliers utilize methods highlighted in the SAE
guideline to assure that R&M goals will be met.
The design review is a procedure for assuring that the planned design is likely to, or does in fact,
meet all requirements in the most cost-effective way, considering all variables and constraints.
1) Maintainability is a major consideration in the design review.
2) A preliminary review is held prior to commitment to a final design approach.
3) It is followed by an intermediate design review when more details of the design become
   available.
4) The status of design actions resulting from the preliminary design review may be
   reviewed at this time.
5) A design review is conducted to review overall readiness for production prior to release
   of drawings to the manufacturing function.
6) Regular design review sessions are recommended to ensure that communication is clear
   between the user and MM&E suppliers.
7) It is also recommended to include operators, maintenance personnel and product
   engineers in the design review. This will give all concerned an understanding of the
   design intent.
8) At this phase, considerations must be given in the design for demonstration of compliance
   to requirements through testing.
9) Suitable test plans must be developed.
Reliability and Phase 3 - Build & Install Phase

During the manufacturing and assembly of the machine, the achievement of reliability
requirements should be monitored. Issues that could affect R&M must be communicated back to
the design engineers to assure any redesign includes reliability improvements. Manufacturing
process variables affecting R&M should be identified and targeted for control. MM&E suppliers
and the user must negotiate meaningful R&M goals and requirements for future monitoring and
divide responsibility for collecting, analyzing and reporting of data.
Several events occur during Phase 3:
Problems encountered when runoff tests are conducted should be documented for
elimination.
Maintenance procedures are developed. A customer representative should be involved in
this process.
Training starts here and continues to the next phases.
Machine acceptance testing should be agreed to and performed prior to teardown and
installation.
R&M data base collection begins during machine acceptance testing. Problems
encountered during this phase should be documented for future reference/use.
The machine will be transferred from the builder's location to the customer's plant.
Critical assembly processes should be identified during teardown.
Installation is a very critical step: The machine has to be reassembled to the build
requirements. Special attention should be given to the critical assembly processes
identified during teardown.
Reliability and Phase 4 - Operation and Maintenance Phase
In this phase the equipment is at the customer location and fully operational. Data collection and
feedback are very important at this phase. Data collection mechanisms should be in place and
agreed upon by both parties. Information collected during this phase often leads to R&M growth
and continuous improvement.
During this phase maintenance should be performed regularly. For an R&M initiative to be
successful, the MM&E and component suppliers must have access to maintenance records and
R&M data bases.
Reliability and Phase 5 - Decommissioning and/or Conversion Phase
This phase is the end of the expected life of the machine. During this phase the machine may
require decommissioning due to an increasing failure rate that has resulted in increasingly
expensive maintenance, or it may be rebuilt to a good-as-new state.
In another possible situation, the machine may still be in good condition but the production needs
have changed requiring the machine to go through major conversion to be used for production of
other products.
When either the decommissioning or conversion action is taken, the feedback from the user plant
should be recorded and all the information should be used for R&M growth and continuous
improvement in future generations of machinery.
________________________
Notes:
Azim Houshyar, January 2011
