Isograph makes no representations or warranties of any kind whatsoever with respect to this document and its associated software.
Isograph disclaims all liabilities for loss or damage arising out of the possession, sale, or use of this document or its associated software.
Fault Tree Analysis
An Introduction
Isograph
Founded in 1986
Nuclear industry
Off-the-shelf PRA tool
Products
Fault Trees, simulation, optimization,
prediction
Me
Joined Isograph in 2003
Background in Math/Comp Sci
Support, training, development
This Presentation
Overview of Fault Tree
methods
Includes examples from RWB
Not in-depth look at Isograph’s FT
Sept 15-16, Alpine, UT
Oct 6-7, Detroit, MI
Introduction
Chapter 1
Hazard Fire
FTA
Deductive
What is Fault Tree Analysis?
Deductive analysis
Determine causes of TOP event
TOP event = hazard
Logic gates
Basic events
Qualitative
Quantitative
[Figure: example fault tree. Labels: "No power" (AND); "No power from mains" (MAINS FAILURE); "Generator doesn't start up" (OR); "Generator failure" (EVENT1); "Mains failure not detected" (EVENT2)]
TOP Events
Determine the scope of the
analysis
Chosen by Hazard
Identification
TOP events: want info on
Bottom events: already have info on
Typical Basic Events
Pump failure
Temperature controller failure
Switch fails closed
Operator does not respond
Crash or unexpected failure of
Software routine
Failure vs Success Logic
Normally failure events instead of
success
Some trees have both
Failure easier to define
Failure space is smaller, simpler
Easier to analyze; probabilities tend to be
lower
Some events neither failure nor success
TOP event can be success state
(dual tree)
Harder to analyze
Harder to conceptualize
©2015 Isograph Inc. Reliability Workbench 1–13
Quantification Parameters
Probabilistic System
Parameters:
Unavailability
Unreliability
Failure Frequency
Risk Reduction Factor
Component Parameters:
Unavailability
Failure Frequency
Failure rate and Repair rate
Inspection Interval and Time at Risk
Failure Rate
Component failure rate (probability
per unit time)
Failure rate
Non-constant failure rate
Aging model requires
numerical solution
Can’t be reduced to analytical
expression
Monte Carlo simulation
Availability Workbench
Exponential, Normal, Lognormal,
Weibull, etc.
Strong dependencies
Maintenance costs
Optimization
Unavailability Q(t)
Unavailability: not operating at
time t
Continuously operating systems
Unavailability: does not work
on demand
Safety/standby system
PFD
Unavailability per flight hour:
Q(T)/T
Used in aerospace/ISO 26262
Unreliability F(t)
Probability of failure over time
Prob. that system fails between time
0 and time t
Prob. that system fails over given
time period
Non-repairable systems
Probability of catastrophic
event
Warranty costs
Q&F
In general
Q(t) ≤ F(t)
Non repairable
Q(t) = F(t)
Unavailability = Unreliability
Risk
Quantifiable with ETA
Coupled with Fault Trees (or just
using ETA)
Risk
Categories and policy
Safety
E.g. deaths per million operating hours
Environmental
Tons of toxic release over lifetime
Operational
Threat to completion of mission
Economic
Financial loss
Risk policy (acceptable risk)
Aerospace
deaths per flight hour
Automotive
controllability of vehicle
Railway
deaths per train miles
Space
operational risk
Pharmaceutical
human risk
End of Chapter 1
Summary
FT is deductive hazard analysis
Graphically shows logical relationship
between TOP and Basic events
Qualitative/quantitative
Constant rates
Unavailability/Unreliability/Frequency
Risk
Fault Tree Construction
Chapter 2
Other Symbols
Symbol Name Meaning
OR Gate Example
No output from
High Pressure
Valve 1
HPV1
AND Gate Examples
[Figure: AND gates FPROP "Fire Propagates" and PUMPSYS "Both Pumps Unavailable"]
[Figure: gates marked "2": HIGHTEMP with inputs "Temperature Sensor 1 Fails", "Temperature Sensor 2 Fails", "Temperature Sensor 3 Fails"; BRAKEFAIL with inputs "Brake 1 Fails", "Brake 2 Fails", "Reverse Thrust Not Engaged"]
Priority AND Gate Example
System
Unavailable
SYS
GATEA GATEB
Transfer Symbols
Transfer Symbols
Loss of supply
TP1
Leg 1 Leg 2
GT1 GT2
SEN1 SEN2
Gate Types
Other Gate Types
Inhibit
NOT
Exclusive OR
Special Cases
Not normally used
Not covered
Primary Event Types
Symbol Name Meaning
SYSFAIL
Sub-System X Sub-System Y
Unavailable Unavailable
X Y
SX HX SY HY
House Event Example
System
Unavailable
SYSFAIL
Sub-System X Sub-System Y
Unavailable Unavailable
X Y
SX HX SY HY
False False
SYSFAIL
Sub-System X Sub-System Y
Unavailable Unavailable
X Y
SX HX SY HY
True False
System & Component Events
System Events
Failures not directly associated with a
single component
Component Events
Failures entirely associated with a
given component
Component Events
COMPONENT
UNAVAILABLE
PRIMARY COMMAND
FAILURE FAULT
Construction Guidelines
Define system bounds
Identify TOP event(s)
Identify immediate causes
using top-down approach
Continue to identify immediate
causes through intermediate
levels of complexity
Example 1: Electrical System Fault Tree
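[Figure: electrical system schematic. GRID supplies BOARD A (PUMPS) through T1 and C1; DGEN supplies BOARD A through T2 and C2; BOARD A supplies BOARD B (VALVES) through T3/C3 and T4/C4]
[Figure: top of tree. ELECB "Loss of supply to Board B" with inputs "No supply from contact breaker 3" and "No supply from contact breaker 4"]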
Board B Fault Tree
[Figure: GATE1 "No supply from contact breaker 3" with inputs C3 "Contact breaker 3 failure" and GATE3 "No supply from transformer 3"; GATE3 with inputs T3 "Transformer 3 failure" and ELECA "Loss of supply to Board A"]
Board B Fault Tree
[Figure: complete Board B tree. ELECB "Loss of supply to Board B" with inputs GATE1 "No supply from contact breaker 3" (C3, GATE3) and GATE2 "No supply from contact breaker 4" (C4, GATE4); GATE3 with inputs T3 "Transformer 3 failure" and ELECA "Loss of supply to Board A"; GATE4 with inputs T4 "Transformer 4 failure" and ELECA]
Board A Fault Tree
[Figure: ELECA "Loss of supply to Board A" with inputs "No supply from contact breaker 1" (GATE6) and "No supply from contact breaker 2"; GATE6 with inputs C1 "Contact breaker 1 failure" and GATE8 "No supply from transformer 1"]
Board A Fault Tree
[Figure: GATE6 "No supply from contact breaker 1" with inputs C1 and GATE8 "No supply from transformer 1"; GATE8 with inputs T1 "Transformer 1 failure" and GRID "Grid unavailable"]
Board A Fault Tree
[Figure: complete Board A tree. ELECA "Loss of supply to Board A" with inputs GATE6 "No supply from contact breaker 1" (C1, GATE8) and GATE7 "No supply from contact breaker 2" (C2, GATE9); GATE8 with inputs T1 and GRID; GATE9 with inputs T2 and DGEN]
Reducing Fault Trees
Linked OR gates can become
single OR gate
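[Figure: TOP1 with inputs EVENT1 and GATE1, GATE1 with inputs EVENT2 and GATE2, GATE2 with inputs EVENT3 and EVENT4, shown equal to a single gate TOP1 with inputs EVENT1, EVENT2, EVENT3, EVENT4]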
Reducing Electrical Fault Tree
ELECA brought to top of tree
It causes route from A to B to be lost
Component events combined
Transformer and contact breaker
failures are linked OR gates
[Figure: reduced tree. ELECB with inputs ELECA "Loss of Board A supply" and GATE3 "Route from Board A to Board B lost"; GATE3 with inputs GATE4 "T3 or C3 failed" (C3, T3) and GATE5 "T4 or C4 failed" (C4, T4)]
Reduced Board A Fault Tree
[Figure: ELECA "Loss of Board A supply" with inputs GATE1 "No supply from grid" (C1, GRID, T1) and GATE2 "No supply from diesel" (C2, DGEN, T2)]
Rocket Propulsion Example
Define System Bounds:
Items shown in schematic
Both mechanical and electric circuits to
be included
Identify TOP events
3 Possible system failures:
Failure to provide propulsion on demand
Inadvertent firing of the system when not
required
Continued firing after system has been
commanded off
Examine third possibility
THRUST
Rocket Propulsion Fault Tree
Continue identifying immediate
causes through intermediate levels
Isolation valve
IV3 remains
open after
cutoff
IV3 OPEN
IV3 OPEN
K5 POWER K5
Rocket Propulsion Fault Tree
Isolation valve
IV3 remains
open after
cutoff
IV3 OPEN
K5 POWER K5
K3 POWER K3
IV3 OPEN
K5 POWER K5
K3 POWER K3
S3 CLOSED K6 CLOSED
IV2 Leg
Isolation valve
IV2 remains
open after
cutoff
IV2 OPEN
IV2 OPEN
S3 CLOSED K6 CLOSED
Rocket Propulsion Fault Tree
Isolation valve
IV2 remains
open after
cutoff
IV2 OPEN
S3 CLOSED K6 CLOSED
S3 S3 OP K6 K6 TIMER
THRUST
Rocket Propulsion Fault Tree
Isolation valve
IV3 remains
open after
cutoff
IV3 OPEN
K5 POWER K5
K3 POWER K3
K3 POWER
S3 CLOSED K6 CLOSED
S3 S3 OP K6 K6 TIMER
Rocket Propulsion Fault Tree
Isolation valve
IV2 remains
open after
cutoff
IV2 OPEN
S3 CLOSED K6 CLOSED
S3 S3 OP K6 K6 TIMER
Reduced Rocket Fault Tree
Thruster
supplied with
propellant after
thrust cutoff
THRUST
Q=0.0002715
ARMING IVS
S3: Primary failure of S3 to open when commanded
S3 OP: Operational failure of S3 to open when commanded
K6: Primary failure of K6 to open after timing out
K6 TIMER: Primary failure of K6 timer to time out
IV3: Primary failure of IV3 to close after cutoff
K5: Primary failure of K5 to open after cutoff
K3: Primary failure of K3 to open after cutoff
Disadvantages
May be more difficult to
understand
Errors may be made in
construction process
Workshop 2.1: Chemical Reactor vessel
CON
MV1 MV2
Input 1 Input 2
EV1 EV2
TS
NRV
Pressure relief OP
PS ALARM
By-product
Product
Workshop 2.1
TOP event – Fails to stop
rupture
Base events:
Name Description Name Description
EV1 Electrical valve 1 failure TS1 Temperature sensor failure
EV2 Electrical valve 2 failure PS1 Pressure sensor failure
MV1 Manual valve 1 stuck open ALARM Alarm unit failure
MV2 Manual valve 2 stuck open NRV Pressure relief valve failure
CON Controller failure GRID No electrical supply from the grid
OP Operator Unavailable
Workshop 2.1
CON
TS
NRV
Pressure relief OP
PS
ALARM
By-product
Product
G0
G1 NRV
G2 G3
Workshop 2.1 Solution (cont.)
[Figure: G2 "Input 1 not shut down" with inputs G4 "Manual valve 1 not shut" (G8, MV1) and G5 "Electrical valve 1 not shut" (G9, EV1, GRID); G3 with inputs G6 "Manual valve 2 not shut" (G8, MV2) and G7 "Electrical valve 2 not shut" (G9, EV2, GRID). Lower-level labels in both branches: G10, ALARM, PS1, TS1]
End of Chapter 2
Summary
Gate symbols
Event symbols
Construction guidelines
Minimal Cut Sets
Chapter 3
Boolean Algebra Techniques
Represent gates with
equivalent Boolean expression
Variables represent inputs
EventX·EventY
· symbol represents AND logic
EventX + EventY
+ symbol represents OR logic
AND gate
TOP1 = A · B
3 inputs: TOP1 = A · B · C
TOP1
A B
OR gate
TOP1 = A + B
3 inputs: TOP1 = A + B + C
TOP1
A B
VOTE gate
TOP1 = A·B + A·C + B·C
3oo4 (failures):
TOP1 = A·B·C + A·B·D + A·C·D + B·C·D
2
TOP1
A B C
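The vote gate expansions above are every AND-combination of k failed inputs out of n. A minimal Python sketch, assuming inputs are plain strings; the function names are illustrative and not taken from any FT tool:

```python
from itertools import combinations


def vote_gate_terms(inputs, k):
    """Return the AND-terms of a k-out-of-n (failures) vote gate."""
    return [" · ".join(combo) for combo in combinations(inputs, k)]


def vote_gate_expression(inputs, k):
    """Join the AND-terms with + (OR) to form the Boolean expression."""
    return " + ".join(vote_gate_terms(inputs, k))


print(vote_gate_expression(["A", "B", "C"], 2))
print(vote_gate_expression(["A", "B", "C", "D"], 3))
```

The number of terms is "n choose k", so the expression grows quickly for large vote gates.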
Boolean Algebra Example
G1 = A + B
G2 = A·C + A·D + C·D
TOP = G1 · G2 TOP
2
G1 G2
A B A C D
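Multiplying out TOP = G1 · G2 and applying the idempotent (A·A = A) and absorption (A + A·B = A) rules gives the minimal cut sets. A short Python sketch of that procedure, assuming each cut set is held as a frozenset of event names (the helper names are hypothetical):

```python
from itertools import product


def and_sets(left, right):
    """AND two sum-of-products expressions; set union handles A·A = A."""
    return [a | b for a, b in product(left, right)]


def minimize(cut_sets):
    """Drop duplicates and any cut set that contains a smaller one (absorption)."""
    unique = set(cut_sets)
    return sorted(
        (cs for cs in unique if not any(other < cs for other in unique)),
        key=lambda cs: (len(cs), sorted(cs)),
    )


G1 = [frozenset("A"), frozenset("B")]                      # G1 = A + B
G2 = [frozenset("AC"), frozenset("AD"), frozenset("CD")]   # G2 = A·C + A·D + C·D
TOP = minimize(and_sets(G1, G2))
print([" · ".join(sorted(cs)) for cs in TOP])
```

For this example the six product terms collapse to three minimal cut sets: A·C, A·D and B·C·D.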
Workshop 3.1
HEX
NRV1
EP1 EV1
Cooling
NRV2
FS1 EP2 EV2
CON1
Workshop 3.1
TOP event: Total Loss of
Cooling
Mechanical failures only
Ignore electrical failures
Ignore failure of FS1 and CON
Assume negligible probabilities
Build tree & calculate cut sets
by hand
Workshop 3.1
HEX
NRV1
EP1 EV1
Cooling
NRV2
EP2 EV2
FS1
CON1
[Figure: COOLING tree with inputs SYS1 "Loss of cooling to HEX" and HEX "Heat exchanger failure"; SYS1 with inputs SYS2 "Loss of cooling leg 1" and SYS3 "Loss of cooling leg 2"]
Workshop 3.1 Solution
Minimal Cut sets:
HEX
EV1.EV2
EV1.EP2
EV1.NRV2
EP1.EV2
EP1.EP2
EP1.NRV2
NRV1.EV2
NRV1.EP2
NRV1.NRV2
Workshop 3.2
Determine by hand the minimal
cut sets for ‘Total Loss of
Cooling’ fault tree from
Workshop 3.1
Consider the full fault tree
including electrical faults
Cooling System
TOTAL LOSS
OF COOLING
COOLING
LOSS OF HEAT
COOLING TO EXCHANGER
HEX FAILURE
SYS1 HEX
LOSS OF LOSS OF
COOLING LEG COOLING LEG
1 2
SYS2 SYS3
Cooling System
LOSS OF
COOLING LEG
1
SYS2
Cooling System
LOSS OF
COOLING LEG
2
SYS3
Electric System
[Figure: ELECB "Loss of supply to Board B" with inputs ELECA "Loss of Board A supply" and A TO B "Route from Board A to Board B lost"; A TO B with inputs LEG 3 "T3 or C3 failed" (C3, T3) and LEG 4 "T4 or C4 failed" (C4, T4)]
Electric System
[Figure: ELECA "Loss of Board A supply" with inputs NSGRID "No supply from grid" (C1, GRID, T1) and NSUD "No supply from diesel" (C2, DGEN, T2)]
Cooling
TOTAL LOSS
LOSS OF HEAT
COOLING TO EXCHANGER
HEX FAILURE
SYS1 HEX
LOSS OF LOSS OF
COOLING LEG COOLING LEG
1 2
SYS2 SYS3
SYS2 – Loss of Cooling Leg 1
SYS2 = PUMP1 + VALVE1 + NRV1
LOSS OF
PUMP1 = ELECA + EP1 COOLING LEG
1
ELECB – Loss of Supply to Board B
LEG3 = C3 + T3
LEG4 = C4 + T4
[Figure: ELECB tree with inputs ELECA "Loss of Board A supply" and A TO B "Route from Board A to Board B lost"; A TO B with inputs LEG 3 "T3 or C3 failed" (C3, T3) and LEG 4 "T4 or C4 failed" (C4, T4); ELECA with inputs "No supply from grid" (C1, GRID, T1) and "No supply from diesel" (C2, DGEN, T2)]
Cooling
COOLING = SYS1 + HEX
SYS1 = SYS2 · SYS3 TOTAL LOSS
OF COOLING
LOSS OF HEAT
COOLING TO EXCHANGER
HEX FAILURE
SYS1 HEX
LOSS OF LOSS OF
COOLING LEG COOLING LEG
1 2
SYS2 SYS3
Workshop 3.1 Solution (cont.)
COOLING =
(PUMP1 + VALVE1 + NRV1) ·
(PUMP2 + VALVE2 + NRV2)
+ HEX
Workshop 3.2 Solution (cont.)
COOLING =
ELECA +
ELECB +
(EP1 + EV1 + NRV1) · (EP2 + EV2 + NRV2)
+ HEX
Workshop 3.2 Solution (cont.)
COOLING =
ELECA +
A TO B +
(EP1 + EV1 + NRV1) · (EP2 + EV2 + NRV2)
+ HEX
Workshop 3.2 Solution (cont.)
COOLING =
(C1 + GRID + T1) · (C2 + DGEN +T2) +
(C3 + T3) · (C4 + T4) +
(EP1 + EV1 + NRV1) · (EP2 + EV2 + NRV2)
+ HEX
Program Demonstration
Using a Fault Tree program to
obtain cut sets
End of Chapter 3
Summary
Boolean operators
Boolean gate expressions
Boolean algebra rules
Evaluating cut sets in a computer
program
Basic Probability Theory
Chapter 4
Independent Events
Independent events:
unaffected by other’s
occurrence
Rolling a die, flipping a coin
Generally Assumed in FTA
Simplifies calculations
Not necessarily the case
Increased stress, etc.
CCFs, discussed later
Exclusivity
Mutually exclusive events:
cannot occur together
Ex: Failed and working states
Non-exclusive events
Ex: failure of two independent
components
Die showing 6, coin landing heads
Multiplication Law
P ( A ⋅ B ) = P ( A) ⋅ P ( B )
Where:
P(A·B) = probability of A and B occurring
together
P(A) = probability of A occurring
P(B) = probability of B occurring
A, B independent, non-exclusive
Multiplication Law
P( A ⋅ B ⋅ C ) = P( A) ⋅ P( B) ⋅ P(C )
For three events
\[ P(A_1 \cdot A_2 \cdots A_n) = \prod_{i=1}^{n} P(A_i) \]
For n events
Addition Law
P( A + B) = P( A) + P ( B ) − P ( A) ⋅ P ( B )
Where:
P(A+B) = probability of A or B (or both)
occurring
P(A) = probability of A occurring
P(B) = probability of B occurring
A, B independent, non-exclusive
Addition Law
Illustrated with Venn diagram
P( A + B) = P( A) + P ( B ) − P ( A) ⋅ P ( B )
Addition Law for 3 Events
\[ P(A + B + C) = P(A) + P(B) + P(C) - P(A) \cdot P(B) - P(A) \cdot P(C) - P(B) \cdot P(C) + P(A) \cdot P(B) \cdot P(C) \]
[Figure: Venn diagram with regions labeled P(A), P(B), P(C), P(B)·P(C) and P(A)·P(B)·P(C)]
Addition Law
General form:
\[ P(A_1 + A_2 + \dots + A_n) = \sum_{i=1}^{n} P(A_i) - \sum_{i=1}^{n-1} \sum_{j=i+1}^{n} P(A_i) P(A_j) + \dots + (-1)^{n+1} P(A_1) P(A_2) \cdots P(A_n) \]
Very complex
Approximation methods
Success states
Addition Law
Success states (\bar{A} = A does not occur):
\[ P(A + B) = 1 - P(\bar{A} \cdot \bar{B}) \]
Addition Law
Using Multiplication Law
\[ P(A + B) = 1 - P(\bar{A}) \cdot P(\bar{B}) = 1 - (1 - P(A)) \cdot (1 - P(B)) \]
For n events
\[ P(A_1 + A_2 + \dots + A_n) = 1 - \prod_{i=1}^{n} (1 - P(A_i)) \]
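The success-state form and the general alternating-sum form of the addition law give the same result for independent events. A minimal Python check, with hypothetical probabilities chosen only for illustration:

```python
from itertools import combinations
from math import prod


def p_or_complement(probs):
    """P(A1 + ... + An) = 1 - prod(1 - P(Ai)) for independent events."""
    return 1.0 - prod(1.0 - p for p in probs)


def p_or_inclusion_exclusion(probs):
    """Same probability via the general addition law (alternating sums)."""
    total = 0.0
    for k in range(1, len(probs) + 1):
        sign = 1.0 if k % 2 == 1 else -1.0
        total += sign * sum(prod(c) for c in combinations(probs, k))
    return total


probs = [0.1, 0.2, 0.3]  # hypothetical event probabilities
print(p_or_complement(probs), p_or_inclusion_exclusion(probs))
```

The complement form needs n multiplications, while the general form needs 2^n - 1 product terms, which is why the general form is described as very complex.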
Example 4.1
Two-sided coin and a twenty-
sided die are thrown
Probability of the coin landing heads
AND the die showing 20?
Example 4.2
Spin 3 coins
Probability of AT LEAST ONE landing
heads?
Example 4.3
3 sensor system
99.9% uptime
Probability of all sensors being
unavailable at the same time?
Probability of AT LEAST ONE
sensor being failed?
Lower/Upper bounds
Q=0.001
Q + Q + Q = 0.003
3Q·Q = 0.000003
Q·Q·Q = 0.000000001
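The bounds above for the three-sensor example can be checked directly. A short Python sketch, assuming the sensors are independent and Q = 0.001 follows from 99.9% uptime:

```python
Q = 0.001  # unavailability of one sensor (99.9% uptime)
n = 3      # number of sensors

all_failed = Q ** n               # multiplication law: all three unavailable
at_least_one = 1 - (1 - Q) ** n   # addition law via success states
upper = n * Q                     # first term of the addition law only
lower = n * Q - 3 * Q * Q         # first two terms (3 pairs of sensors)

print(all_failed, lower, at_least_one, upper)
```

The exact value sits between the two truncations, which is the sense in which they act as lower and upper bounds.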
Example 4.4
Weather forecaster predicts
40% chance of rain for five
days
Probability that it rains at least
one day?
Example 4.4 Solution
P(Rain) = 0.4
5·P(Rain) = 2
10·P(Rain)^2 = 1.6 (5 choose 2 = 10)
10·P(Rain)^3 = 0.64 (5 choose 3 = 10)
5·P(Rain)^4 = 0.128 (5 choose 4 = 5)
P(Rain)^5 = 0.01024
[Figure: chart of the running total as each term of 5·P − 10·P^2 + 10·P^3 − 5·P^4 + P^5 is added; labeled values include 2, 0.4 and 0.912]
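The running totals for Example 4.4 can be reproduced term by term and compared with the success-state shortcut 1 − (1 − P)^5. A minimal Python sketch, assuming the five days are independent:

```python
from math import comb

p, n = 0.4, 5  # daily chance of rain, number of days

# Alternating terms of the addition law: +5P, -10P^2, +10P^3, -5P^4, +P^5
terms = [(-1) ** (k + 1) * comb(n, k) * p ** k for k in range(1, n + 1)]

partial_sums = []
running = 0.0
for term in terms:
    running += term
    partial_sums.append(running)

exact = 1 - (1 - p) ** n  # probability of at least one rainy day
print(partial_sums, exact)
```

Each truncation alternates between over- and under-estimating, and with P as large as 0.4 the first term alone (2) is not even a valid probability.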
End of Chapter 4
Summary
Independence
Exclusivity
Multiplication Law
Addition Law
De Morgan’s Theorem
Quantitative Data
Chapter 5
Quantitative Data
Fault Trees are both:
Qualitative
Quantitative
Qualitative
Cut set analysis
Quantitative
Multiplication/Addition laws
Need input values
Input Data
Entered for all events
Required for quantitative analysis
Function to calculate Q and ω
Equation depends on event
characteristics
Options will differ between FT
tools
Common Parameters
Unavailability
Failure Frequency
Mean Time To Failure (MTTF)
Failure Rate (1/MTTF)
Inspection (Test) Interval
Mean Time to Repair (MTTR)
Repair Rate (1/MTTR)
Time at Risk/Lifetime
Common Event Models
Fixed Failure Probability
Failures on demand, operator errors,
software bugs, conditional events
Fixed probability of failure
Constant Rate
Repairable or non-repairable
components with a constant failure
rate and repair rate
Weibull
Failure rate varies with time
Fixed Probability
Constant Q and ω
Useful for
Operator errors
Failure on demand
Software bugs
Conditional events
Probability of failure on
demand = Q
Input Q and ω directly
Fixed Probability
Initiators and Enablers
Constant Rate
Failures immediately revealed
Constant Failure and repair
rates
Component does not age
Preventative maintenance before
wear out
Exponentially distributed
Both failures and repairs
Constant Rate
Inputs
Failure rate or MTTF
Repair rate or MTTR
\[ \lambda = \frac{1}{MTTF}, \qquad \mu = \frac{1}{MTTR} \]
Constant Rate
\[ Q(t) = \frac{\lambda}{\lambda + \mu} \left( 1 - e^{-(\lambda + \mu) t} \right) \]
\[ \omega(t) = \lambda [1 - Q(t)] \]
λ = failure rate, µ = repair rate
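The constant rate model is straightforward to evaluate. A minimal Python sketch of Q(t) and ω(t); the MTTF and MTTR values are hypothetical and in hours:

```python
import math


def unavailability(lam, mu, t):
    """Q(t) = lam/(lam+mu) * (1 - exp(-(lam+mu) t)); mu = 0 gives 1 - exp(-lam t)."""
    return lam / (lam + mu) * (1.0 - math.exp(-(lam + mu) * t))


def failure_frequency(lam, mu, t):
    """omega(t) = lam * (1 - Q(t))."""
    return lam * (1.0 - unavailability(lam, mu, t))


mttf, mttr = 10_000.0, 10.0        # hypothetical values, hours
lam, mu = 1.0 / mttf, 1.0 / mttr
steady_state = lam / (lam + mu)    # limit of Q(t) for large t

for t in (1.0, 10.0, 100.0, 1000.0):
    print(t, unavailability(lam, mu, t), failure_frequency(lam, mu, t))
```

Q(t) rises during the transient region and flattens at λ/(λ+µ) once t is several multiples of the MTTR.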
Constant Rate
Steady-state Region
Q(t)
Transient Region
Constant Rate
Transient Region
Constant Rate
Steady-state Region
Non-Repairable Events
Non-repairable components
Repair rate = 0
Substitution yields:
\[ Q(t) = \frac{\lambda}{\lambda + 0} \left( 1 - e^{-(\lambda + 0) t} \right) \]
\[ Q(t) = 1 - e^{-\lambda t} \]
Non-Repairable Events
[Figure: plot of Q(t) for a non-repairable event]
Exposure Time
Determined by FT goals
Lifetime of the system
Time between overhauls
Mission time
Maintenance budgeting interval
Global
All components in the fault tree
Event-specific
Each event has independent time at
risk
Dormant Failures
Failures not immediately
revealed
Non-repairable between inspections
Ex: Protection/standby system
Failures only revealed on
inspection (test)
Fixed test interval
Repair if test reveals failure
Dormant Failures
Three methods for calculating
Q
Mean
Max
IEC 61508
Must calculate single Q
Multiplication and addition laws don’t
work on functional inputs
Dormant Failures
Q(t)
τ 2τ 3τ 4τ
τ << MTTF
Mean Unavailability
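\[ Q_{mean} = \frac{\lambda\tau - (1 - e^{-\lambda\tau}) + \lambda \cdot MTTR \, (1 - e^{-\lambda\tau})}{\lambda\tau + \lambda \cdot MTTR \, (1 - e^{-\lambda\tau})} \]
\[ \omega = \lambda (1 - Q_{mean}) \]
Simplifies to:
\[ Q_{mean} = \frac{\lambda\tau}{2} + \lambda \cdot MTTR \]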
where τ , MTTR << MTTF
Mean Unavailability
Qmean
τ 2τ 3τ 4τ
Maximum Unavailability
\[ Q_{max} = 1 - e^{-\lambda\tau} \]
\[ \omega = \lambda (1 - Q_{max}) \]
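The mean and maximum unavailability formulas for dormant failures can be compared numerically. A Python sketch under assumed values (λ, τ and MTTR below are hypothetical, in hours); the function names are illustrative:

```python
import math


def q_mean_exact(lam, tau, mttr):
    """Full mean-unavailability expression for a dormant failure."""
    decay = 1.0 - math.exp(-lam * tau)
    return (lam * tau - decay + lam * mttr * decay) / (lam * tau + lam * mttr * decay)


def q_mean_simplified(lam, tau, mttr):
    """lam*tau/2 + lam*MTTR, valid when tau and MTTR are much smaller than MTTF."""
    return lam * tau / 2.0 + lam * mttr


def q_max(lam, tau):
    """Worst case: unavailability just before the next test."""
    return 1.0 - math.exp(-lam * tau)


lam, tau, mttr = 1e-5, 8760.0, 8.0  # hypothetical: yearly test, 8 h repair
print(q_mean_exact(lam, tau, mttr), q_mean_simplified(lam, tau, mttr), q_max(lam, tau))
```

With these inputs the simplified form is within a few percent of the full expression, and the maximum is roughly twice the mean, as expected for a sawtooth Q(t).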
Maximum Unavailability
Qmax
τ 2τ 3τ 4τ
IEC 61508 Averaging
From the standard
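Q for 1oo2 voted configuration:
\[ PFD_{avg} = 2 \left[ (1 - \beta_D) \lambda_{DD} + (1 - \beta) \lambda_{DU} \right]^2 t_{CE} \, t_{GE} + \beta_D \lambda_{DD} \, MTTR + \beta \lambda_{DU} \left( \frac{\tau}{2} + MTTR \right) \]
where
\[ t_{GE} = \frac{\lambda_{DU}}{\lambda_D} \left( \frac{\tau}{3} + MTTR \right) + \frac{\lambda_{DD}}{\lambda_D} MTTR \]
\[ t_{CE} = \frac{\lambda_{DU}}{\lambda_D} \left( \frac{\tau}{2} + MTTR \right) + \frac{\lambda_{DD}}{\lambda_D} MTTR \]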
IEC 61508 Averaging
Reason for the discrepancy
For a given function f(x):
Approximating in FT
Apply Markov to cut sets with two or
more dormant failure events
Which Method?
Max method – worst case
Ex: safety-critical system
IEC 61508 – multiple dormant
events
Ex: Protection system with many
overlapping dormant faults
Mean method otherwise
Weibull Distribution
Failure rate varies with time
Requires 3 parameters:
η – Characteristic Lifetime
β – Shape Parameter
γ – Location Parameter
Weibull Distribution
Rate, Unreliability given by:
\[ r(t) = \frac{\beta (t - \gamma)^{\beta - 1}}{\eta^{\beta}}, \qquad F(t) = 1 - e^{-\left( \frac{t - \gamma}{\eta} \right)^{\beta}} \]
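The Weibull rate and unreliability are easy to evaluate directly. A minimal Python sketch; the function names and the sample parameter values are illustrative only:

```python
import math


def weibull_rate(t, eta, beta, gamma=0.0):
    """r(t) = beta * (t - gamma)^(beta - 1) / eta^beta, for t > gamma."""
    return beta * (t - gamma) ** (beta - 1) / eta ** beta


def weibull_unreliability(t, eta, beta, gamma=0.0):
    """F(t) = 1 - exp(-((t - gamma) / eta)^beta), for t >= gamma."""
    return 1.0 - math.exp(-(((t - gamma) / eta) ** beta))


# beta < 1: decreasing rate; beta = 1: constant rate 1/eta; beta > 1: wear-out
for beta in (0.5, 1.0, 2.5):
    print(beta, weibull_rate(500.0, 1000.0, beta), weibull_unreliability(500.0, 1000.0, beta))
```

At t = γ + η the unreliability is 1 − 1/e (about 63.2%) for any shape parameter, which is why η is called the characteristic lifetime.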
Other Cases
Phases
Failure Rate, Q change with respect to phase
E.g., rocket launch (on pad, launch, in space
flight)
Steady State
Component already in use
Normal, Lognormal
Other statistical distributions
Sequences
Failures can only occur in sequence
Limited replacement spares
Limited repair crews
Standby failure rate
Imperfect Proof Testing
Failure Rates
Historical Data
CMMS tracking/Work order history
Weibull analysis
Libraries
NPRD 2011, IAEA
Integrated with RWB
Exida
Linked via External App
SIS-Tech
Failure Data Sources
Prediction Standards
Electronic
MIL-HDBK-217F
RIAC 217+
Telcordia SR-332 Issue 3
IEC TR 62380
Siemens SN 29500
GJB/z 299
Mechanical
NSWC
End of Chapter 5
Summary
Common model parameters
Common event failure characteristics
System Quantification
Chapter 6
System Quantification
Determine cut sets
Solve Q and ω
For basic events
For cut sets (multiplication law)
For TOP events (addition law)
Use TOP event Q and ω to
solve:
TDT, W, F, CFI
Calculation Methods
Cross Product
Esary-Proschan
Rare
Lower Bound
Example
A.B + A.C.D + A.C.E
Q=0.01
w=2
TP1
A B A C D A C E
Minimal Cut Set Q and ω
Multiplication law
\[ Q_{cut}(t) = \prod_{i=1}^{n} Q_i(t) \]
\[ \omega_{cut} = \sum_{j=1}^{n} \omega_j \prod_{i=1, i \ne j}^{n} Q_i \]
Example
Cut Set Q and ω
ωACD = ωA QC QD + ωC QA QD + ωD QA QC
= 2 × 0.01 × 0.01 + 2 × 0.01 × 0.01 + 2 × 0.01 × 0.01 = 0.0006
ωACE = ωA QC QE + ωC QA QE + ωE QA QC
= 2 × 0.01 × 0.01 + 2 × 0.01 × 0.01 + 2 × 0.01 × 0.01 = 0.0006
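The cut set formulas translate directly into code. A minimal Python sketch that reproduces the ω value above for cut set A·C·D, using Q = 0.01 and ω = 2 for every event as in the example (function names are illustrative):

```python
from math import isclose, prod


def cut_set_q(qs):
    """Q_cut = product of the event unavailabilities."""
    return prod(qs)


def cut_set_w(qs, ws):
    """omega_cut = sum over j of omega_j times the product of the other events' Q."""
    return sum(
        w_j * prod(q for i, q in enumerate(qs) if i != j)
        for j, w_j in enumerate(ws)
    )


q = [0.01, 0.01, 0.01]  # Q_A, Q_C, Q_D
w = [2.0, 2.0, 2.0]     # omega_A, omega_C, omega_D
print(cut_set_q(q), cut_set_w(q, w))
```

Each term in the sum treats one event as the last to fail while the others are already failed.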
Cross-Product Method
Exact method
Slow to solve for large trees
Limit product terms
Upper bound
\[ Q_{SYS} = \sum_{i=1}^{n} Q_{cut\,i}(t) - \sum_{i=1}^{n-1} \sum_{j=i+1}^{n} Q_{ij}(t) + \sum_{i=1}^{n-2} \sum_{j=i+1}^{n-1} \sum_{k=j+1}^{n} Q_{ijk}(t) - \dots + (-1)^{n+1} Q_{1,2,3,\dots,n}(t) \]
Example
Cross-Product
Esary-Proschan Method
Multiplication law
Odds that no cut set occurs
Upper-bound
Faster, still accurate
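\[ Q_{sys}(t) = \prod_{i=1}^{m} q_i \left( 1 - \prod_{j=1}^{n} \left[ 1 - Q_{cut\,j}(t) \right] \right) \]
\[ \omega_{sys}(t) = \sum_{i=1}^{n} \omega_{cut\,i}(t) \prod_{j=1, j \ne i}^{n} \left[ 1 - Q_{cut\,j}(t) \right] \]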
Example
Esary-Proschan Approximation
Rare Approximation
Cross Product — First iteration
Upper bound
Fastest
Less accurate for Q > 0.2
\[ Q_{SYS}(t) = \sum_{i=1}^{n} Q_{cut\,i}(t) \]
\[ \omega_{SYS}(t) = \sum_{i=1}^{n} \omega_{cut\,i}(t) \]
Example
Rare Approximation
Lower Bound for Q
Cross Product
First two iterations
\[ Q_{lower}(t) = \sum_{i=1}^{n} Q_{cut\,i}(t) - \sum_{i=1}^{n-1} \sum_{j=i+1}^{n} Q_{ij}(t) \]
Example
Lower Bound
Errors Due to Approximations
A + B·C + B·D
% Difference from exact result
Event Q | Cross Product | Esary-Proschan | Rare | Lower Bound
0.5 | 0% | 4.5% | 45% | 9.1%
0.1 | 0% | 0.69% | 2.5% | 0.085%
0.01 | 0% | 0.0096% | 0.029% | 0.000098%
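The percentage differences in the table can be reproduced for A + B·C + B·D. The Python sketch below is an illustration, not the method used by any particular tool: it gets the exact value by enumerating all event states (equivalent to the full cross product) and assumes independent events with equal Q. No event is common to all three cut sets, so the leading product in the Esary-Proschan formula is empty.

```python
from itertools import combinations, product
from math import prod

CUT_SETS = [frozenset("A"), frozenset("BC"), frozenset("BD")]
EVENTS = sorted(set().union(*CUT_SETS))


def q_of(events, q):
    """Probability that every event in the set is failed."""
    return prod(q[e] for e in events)


def exact(q):
    """Sum the probability of every state in which some cut set is fully failed."""
    total = 0.0
    for state in product([False, True], repeat=len(EVENTS)):
        failed = {e for e, s in zip(EVENTS, state) if s}
        if any(cs <= failed for cs in CUT_SETS):
            total += prod(q[e] if e in failed else 1 - q[e] for e in EVENTS)
    return total


def rare(q):
    """First cross-product iteration: sum of cut set unavailabilities."""
    return sum(q_of(cs, q) for cs in CUT_SETS)


def lower_bound(q):
    """First two cross-product iterations."""
    pairs = sum(q_of(a | b, q) for a, b in combinations(CUT_SETS, 2))
    return rare(q) - pairs


def esary_proschan(q):
    """1 - product of (1 - Q_cut), treating cut sets as independent."""
    return 1 - prod(1 - q_of(cs, q) for cs in CUT_SETS)


def pct_error(approx, ref):
    return 100 * abs(approx - ref) / ref


for value in (0.5, 0.1, 0.01):
    q = dict.fromkeys(EVENTS, value)
    ref = exact(q)
    print(value, [round(pct_error(f(q), ref), 6) for f in (esary_proschan, rare, lower_bound)])
```

The errors shrink rapidly as event Q falls, which is why the faster approximations are usually acceptable for high-reliability components.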
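\[ W_{SYS} = \int_0^T \omega_{SYS}(t) \, dt \]
\[ MTBF_{SYS} = \frac{1}{\omega(\infty)} \]
\[ MTTR_{SYS} = \frac{Q(\infty)}{\omega(\infty)} \]
\[ \lambda_{SYS} = \frac{\omega_{SYS}}{1 - Q_{SYS}} \]
\[ Q_{SYS} = \frac{TDT_{SYS}}{T} \]
\[ F_{SYS} = 1 - e^{-\int_0^T \lambda_{SYS}(t) \, dt} \]
\[ RRF = \frac{1}{Q_{SYS}} \]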
Modularizing Fault Trees
Goal: Reduce analysis time
Reduce number of cut sets
Replace isolated sections of
tree with super-events
Analyze sections independently
Modularization Example
Cut sets:
TOP1 = GATE1 · GATE2
GATE1 = A + B
GATE2 = C + D
Unmodularized:
TOP1 = A·C + A·D + B·C + B·D
QTOP1 = QAC + QAD + QBC + QBD – QACD – QABC
– QABCD – QABCD – QABD – QBCD + QABCD +
QABCD + QABCD + QABCD – QABCD
15 product terms
Modularization Example
Modularized:
QGATE1 = QA + QB – QAB
QGATE2 = QC + QD – QCD
QTOP1 = QGATE1 · QGATE2
7 product terms
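The modularized and unmodularized calculations give the same Q for TOP1, but with far fewer product terms. A minimal Python check using hypothetical event unavailabilities:

```python
from itertools import combinations
from math import isclose, prod

Q = {"A": 0.1, "B": 0.2, "C": 0.3, "D": 0.4}  # hypothetical values

# Modularized: solve each independent gate, then multiply (7 product terms)
q_gate1 = Q["A"] + Q["B"] - Q["A"] * Q["B"]
q_gate2 = Q["C"] + Q["D"] - Q["C"] * Q["D"]
q_top_mod = q_gate1 * q_gate2

# Unmodularized: full cross product over the four minimal cut sets
cut_sets = [frozenset("AC"), frozenset("AD"), frozenset("BC"), frozenset("BD")]
terms = []
for k in range(1, len(cut_sets) + 1):
    for group in combinations(cut_sets, k):
        union = frozenset().union(*group)  # repeated events count once
        terms.append((-1) ** (k + 1) * prod(Q[e] for e in union))
q_top_full = sum(terms)

print(len(terms), q_top_full, q_top_mod)
```

The gap widens quickly: n cut sets produce 2^n − 1 cross-product terms, so isolating independent sections pays off on large trees.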
Program Demonstration
Using a FT tool to analyze a
tree
End of Chapter 6
Summary
Approximation methods
Cross Product, Esary-Proschan, Rare,
Lower Bound
Differences
Other parameters
Modularization
Importance Analysis
Chapter 7
Importance Analysis
Helps determine:
Event contribution to TOP event
TOP event sensitivity to event
changes
Weak areas in the system
Where to cut corners
Useful during the design stage
Importance Measures
Fussell-Vesely Importance
Birnbaum Importance
Barlow-Proschan Importance
Sequential Importance
Risk Reduction Worth
Risk Achievement Worth
Fussell-Vesely Importance
Contribution to system Q
High F-V Importance — worst
actor
Decreasing Q on these events =
biggest decrease to system Q
Percentage of failures
involving the event
\[ I_i^{FV} = \frac{Q_{SYS} - Q_{SYS}(q_i = 0)}{Q_{SYS}} \]
Birnbaum Importance
Sensitivity of system Q
High Birnbaum — highly
sensitive
Increasing Q on these events =
biggest increase in system Q
\[ I_i^{BB} \approx \frac{\sum_{j=1}^{n} Q_{cut\,j}}{q_i} \]
Where n = number of cut sets containing event i
Barlow-Proschan Importance
Contribution to ω as initiator
Last to fail
Probability system fails because
event failed last
Sum of frequency terms with event
as initiator ÷ system ω
\[ I_i^{BP} = \frac{\sum_{j=1}^{n} \omega_i \, Q_{cut\,j}}{\omega_{SYS}} \]
Qcutj = product of events in j-th cut set, excluding event i
Example
Barlow-Proschan
A·B + A·C·D
Frequency terms: ωA·QB, ωB·QA,
ωA·QC·QD, ωC·QA·QD, ωD·QA·QC
\[ I_A^{BP} = \frac{\omega_A \times Q_B + \omega_A \times Q_C \times Q_D}{\omega_{SYS}} \]
Sequential Importance
Contribution to ω as enabler
Not last to fail
Probability system fails
because event was failed when
failure event occurred
Sum frequency terms with
event as enabler ÷ system ω
Example
Sequential
A·B + A·C·D
Frequency terms: ωA·QB, ωB·QA,
ωA·QC·QD, ωC·QA·QD, ωD·QA·QC
\[ I_A^{S} = \frac{\omega_B \times Q_A + \omega_C \times Q_A \times Q_D + \omega_D \times Q_A \times Q_C}{\omega_{SYS}} \]
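Both the Barlow-Proschan and Sequential measures come from the same list of frequency terms, split by whether the event is the initiator or an enabler. A Python sketch for A·B + A·C·D, assuming Q = 0.01 and ω = 2 for every event (hypothetical values) and taking system ω as the sum of all frequency terms (rare approximation):

```python
from math import isclose, prod

CUT_SETS = [("A", "B"), ("A", "C", "D")]


def frequency_terms(q, w):
    """One term per (cut set, initiator): omega_initiator times the enablers' Q."""
    terms = []
    for cs in CUT_SETS:
        for initiator in cs:
            enablers = [e for e in cs if e != initiator]
            terms.append((initiator, enablers, w[initiator] * prod(q[e] for e in enablers)))
    return terms


def barlow_proschan(event, q, w):
    """Share of system omega from terms where the event fails last."""
    terms = frequency_terms(q, w)
    w_sys = sum(value for _, _, value in terms)
    return sum(v for init, _, v in terms if init == event) / w_sys


def sequential(event, q, w):
    """Share of system omega from terms where the event is already failed."""
    terms = frequency_terms(q, w)
    w_sys = sum(value for _, _, value in terms)
    return sum(v for _, enablers, v in terms if event in enablers) / w_sys


q = dict.fromkeys("ABCD", 0.01)
w = dict.fromkeys("ABCD", 2.0)
print(barlow_proschan("A", q, w), sequential("A", q, w))
```

Because A appears in every cut set, every frequency term has A as either initiator or enabler, so its two measures add up to 1.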
Risk Reduction Worth
\[ I_i^{RRW} = \frac{Q_{SYS}}{Q_{SYS}(q_i = 0)} \]
Risk Achievement Worth
Contribution to risk
Worth of component to current
risk level
Importance of maintaining
reliability of component
\[ I_i^{RAW} = \frac{Q_{SYS}(q_i = 1)}{Q_{SYS}} \]
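The measures defined by re-evaluating system Q with q_i forced to 0 or 1 can be computed generically. A Python sketch for A·B + A·C·D with hypothetical Q = 0.01 per event; system Q is found by exact state enumeration, and the Birnbaum value uses the common definition Q_SYS(q_i = 1) − Q_SYS(q_i = 0) rather than the cut set approximation:

```python
from itertools import product
from math import prod

CUT_SETS = [frozenset("AB"), frozenset("ACD")]
EVENTS = sorted(set().union(*CUT_SETS))


def q_sys(q):
    """Exact system unavailability by enumerating all event states."""
    total = 0.0
    for state in product([False, True], repeat=len(EVENTS)):
        failed = {e for e, s in zip(EVENTS, state) if s}
        if any(cs <= failed for cs in CUT_SETS):
            total += prod(q[e] if e in failed else 1 - q[e] for e in EVENTS)
    return total


def importance(q, event):
    base = q_sys(q)
    q_zero = q_sys({**q, event: 0.0})  # event made perfectly reliable
    q_one = q_sys({**q, event: 1.0})   # event assumed failed
    return {
        "fussell_vesely": (base - q_zero) / base,
        "risk_reduction_worth": base / q_zero if q_zero > 0 else float("inf"),
        "risk_achievement_worth": q_one / base,
        "birnbaum": q_one - q_zero,
    }


q = dict.fromkeys(EVENTS, 0.01)
for event in EVENTS:
    print(event, importance(q, event))
```

Event A is in every cut set, so removing it eliminates all system failures: its Fussell-Vesely importance is 1 and its risk reduction worth is unbounded.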
Program Demonstration
Using a FT program to
calculate importance
End of Chapter 7
Summary
Importance analysis
Fussell-Vesely, Birnbaum, Barlow-
Proschan, Sequential, Risk Reduction,
Risk Achievement
Common Cause Failures
Chapter 8
CCF Model Types
Beta Factor Model
Multiple Greek Letter (MGL)
Model
Alpha Factor Model
Beta Binomial Failure Rate
(BFR) Model
Pump Example
Two pumps
Independent power supplies
Attached to same structure
Vibration, high temperature,
humidity, impact, stress
May be identical pumps
Incorrect maintenance
Manufacturing defects
Two Pump System
Both pumps
unavailable
TP1
P1 P2
TP2
Pump 1 Pump 2
unavailable unavailable
PUMP1 PUMP2
P1 CCF P2 CCF
Beta Factor Model
QI = (1 − β ) ⋅ QT
QCCF = β ⋅ QT
β = beta factor
QI = Q due to independent
failures
QCCF = Q due to CCF
QT = Total Q
QT = 0.001, β = 0.1
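With QT = 0.001 and β = 0.1 the split is simple to compute, and applying it to the two pump system shows how strongly the CCF term dominates. A minimal Python sketch; the system-level sum uses the rare approximation, and the function name is illustrative:

```python
from math import isclose


def beta_factor_split(q_total, beta):
    """Return (Q_I, Q_CCF) = ((1 - beta) * Q_T, beta * Q_T)."""
    return (1 - beta) * q_total, beta * q_total


q_total, beta = 0.001, 0.1
q_i, q_ccf = beta_factor_split(q_total, beta)

# Both pumps unavailable: both fail independently, or one common cause fails both
q_both_with_ccf = q_i * q_i + q_ccf   # rare approximation
q_both_without_ccf = q_total ** 2     # redundancy credit if CCF is ignored

print(q_i, q_ccf, q_both_with_ccf, q_both_without_ccf)
```

Ignoring common cause failures here would understate the system unavailability by roughly two orders of magnitude.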
IEC Beta Factor Model
What if I don’t know what Beta
factor to use?
IEC 61508-6 Annex D
Provides method for determining beta
factor
Table D.1: questionnaire about
components
Beta assigned based on score
Separation/segregation
Are all signal cables for the channels routed separately at all positions?
Are the logic subsystem channels on separate printed-circuit boards?
Are the logic subsystem channels in separate cabinets?
If the sensors/final elements have dedicated control electronics, is the
electronics for each channel on separate printed-circuit boards?
If the sensors/final elements have dedicated control electronics, is the
electronics for each channel indoors and in separate cabinets?
CCF Models
Beta factor: “All or nothing”
CCFs affect either all components in
group, or none
[Figure: TP2 "All sensors failed"; for each of sensors 1 to 3, "Sensor n failure" is combined with "All sensors fail due to common causes"]
Beta Factor Adjustment
Calculation of β for systems with levels of redundancy
greater than 1oo2 (IEC 61508, 2010)
m oo n (success) | n = 2 | n = 3 | n = 4 | n = 5
m = 1 | β | 0.5β | 0.3β | 0.2β
m = 2 | – | 1.5β | 0.6β | 0.4β
m = 3 | – | – | 1.75β | 0.8β
m = 4 | – | – | – | 2β
CCF Models
Alternate method: other CCF
models
Replace a single event with
multiple events representing
possible combos
Beta factor replaces event with two
events (independent and CCF)
Other models replace with multiple
events (combinations of CCF events)
CCF Models
Example: CCF Group A, B, C, D
Event A replaced in cut sets with:
A + [AB] + [AC] + [AD] + [ABC] +
[ABD] + [ACD] + [ABCD]
A represents independent failure
[] represent CCF event affecting
those components
[ACD] represents CCF of A, C, and D
CCF Models
Example: 3 sensors
All sensors failed
TP1
S1 S2 S3
CCF Models
TP2 = S1.S2.S3 + S12.S3 +
S13.S2 + S23.S1 + S123
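[Figure: SENSORS "All sensors failed"; each sensor branch lists its independent failure plus every CCF event that includes it. Sensor 1: "Sensor 1 failed", "Sensors 1 and 2 failed", "Sensors 1 and 3 failed", "Sensors 1, 2, and 3 failed". Sensor 2: "Sensor 2 failed", "Sensors 1 and 2 failed", "Sensors 2 and 3 failed", "Sensors 1, 2, and 3 failed". Sensor 3: "Sensor 3 failed", "Sensors 1 and 3 failed", "Sensors 2 and 3 failed", "Sensors 1, 2, and 3 failed"]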
MGL Model
Expansion of Beta Factor model
Three parameters: β, γ, δ
β — conditional probability that
component failure is CCF shared by 1 or
more other components
γ — conditional probability that CCF
shared by 1 or more other components
is shared by 2 or more other
components
δ — conditional probability that CCF
shared by 2 or more other components
is shared by 3 other components
MGL Model
CCF Event Probability
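\[ Q_k = \frac{1}{\binom{m-1}{k-1}} \left( \prod_{i=1}^{k} \rho_i \right) (1 - \rho_{k+1}) \, Q_T \]
Where Q_k = unavailability of kth order CCF failure
ρ_1 = 1, ρ_2 = β, ρ_3 = γ, ρ_4 = δ, ρ_{m+1} = 0
Q_T = total unavailability
m = CCF group size
\[ \binom{m-1}{k-1} = \frac{(m-1)!}{(m-k)! \, (k-1)!} \]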
MGL Model
Q1 = Independent probability
\[ Q_1 = \frac{1}{\frac{(m-1)!}{(m-1)! \, (1-1)!}} \cdot 1 \cdot (1 - \beta) \, Q_T = (1 - \beta) \, Q_T \]
MGL Model
Sensor Example
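\[ Q_2 = \frac{1}{\frac{(3-1)!}{(3-2)! \, (2-1)!}} \cdot 1 \cdot \beta (1 - \gamma) \, Q_T = \frac{1}{2} \beta (1 - \gamma) \, Q_T = 4.0 \times 10^{-5} \]
\[ Q_3 = \frac{1}{\frac{(3-1)!}{(3-3)! \, (3-1)!}} \cdot 1 \cdot \beta \cdot \gamma \, (1 - 0) \, Q_T = \beta \gamma Q_T = 2.0 \times 10^{-5} \]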
MGL Model
Example
TP2 = 0.0009∙0.0009∙0.0009 +
0.00004∙0.0009 + 0.00004∙0.0009 +
0.00004∙0.0009 + 0.00002 =2.011E-5
All sensors
failed
TP1
Q=2.011E-05
S1 S2 S3
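The MGL numbers for the three-sensor example can be reproduced with a few lines of Python. This is a sketch under stated assumptions: Q_T = 0.001 as in the beta factor example, and β = 0.1, γ = 0.2, which are inferred here from the quoted results Q_2 = 4.0 × 10^-5 and Q_3 = 2.0 × 10^-5 rather than stated in the example. The function name is illustrative.

```python
from math import comb, isclose, prod


def mgl_q(k, m, q_total, rho):
    """Q_k = 1/C(m-1, k-1) * (rho_1 ... rho_k) * (1 - rho_{k+1}) * Q_T.

    rho holds [rho_1, ..., rho_m]; rho_{m+1} is taken as 0.
    """
    rho_next = rho[k] if k < m else 0.0
    return prod(rho[:k]) * (1 - rho_next) * q_total / comb(m - 1, k - 1)


m, q_total = 3, 0.001
rho = [1.0, 0.1, 0.2]  # rho_1 = 1, beta = 0.1 (assumed), gamma = 0.2 (assumed)
q1, q2, q3 = (mgl_q(k, m, q_total, rho) for k in (1, 2, 3))

# All sensors failed: S1.S2.S3 + S12.S3 + S13.S2 + S23.S1 + S123
q_top = q1 ** 3 + 3 * q2 * q1 + q3
print(q1, q2, q3, q_top)
```

Nearly all of the result comes from the single third-order event S123, which is why the MGL answer is about five times lower than the beta factor answer that assigns the whole CCF share to "all fail".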
Comparison
Beta factor model, β = 0.1
All sensors
failed
SENSORS3
Q=0.0001
S1 S2 S3
Alpha Factor Model
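CCF Event Probability
\[ Q_k = \frac{k}{\binom{m-1}{k-1}} \, \frac{\alpha_k}{\alpha_T} \, Q_T \]
Where Q_k = unavailability of kth order CCF failure
Q_T = total unavailability
m = CCF group size
\[ \alpha_T = \sum_{i=1}^{m} i \, \alpha_i \]
\[ \binom{m-1}{k-1} = \frac{(m-1)!}{(m-k)! \, (k-1)!} \]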
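\[ Q_1 = \frac{1}{1} \cdot \frac{0.9507}{1.056} \cdot 0.001 = 0.0009 \]
\[ Q_2 = \frac{2}{2} \cdot \frac{0.04225}{1.056} \cdot 0.001 = 4.0 \times 10^{-5} \]
\[ Q_3 = \frac{3}{1} \cdot \frac{0.007042}{1.056} \cdot 0.001 = 2.0 \times 10^{-5} \]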
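The alpha factor calculation above can be checked in a few lines. A minimal Python sketch using the α values from the worked example (0.9507, 0.04225, 0.007042), with group size m = 3 and Q_T = 0.001; the function name is illustrative:

```python
from math import comb, isclose


def alpha_factor_q(k, m, q_total, alpha):
    """Q_k = k / C(m-1, k-1) * alpha_k / alpha_T * Q_T, with alpha_T = sum(i * alpha_i)."""
    alpha_t = sum(i * a for i, a in enumerate(alpha, start=1))
    return k / comb(m - 1, k - 1) * alpha[k - 1] / alpha_t * q_total


alpha = [0.9507, 0.04225, 0.007042]  # alpha_1, alpha_2, alpha_3
m, q_total = 3, 0.001
alpha_t = sum(i * a for i, a in enumerate(alpha, start=1))
q1, q2, q3 = (alpha_factor_q(k, m, q_total, alpha) for k in (1, 2, 3))
print(alpha_t, q1, q2, q3)
```

These α values were evidently chosen so that the alpha factor model reproduces the same Q_1, Q_2 and Q_3 as the MGL example.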
Program Demonstration
CCF Model
Include CCFs without another event
Not recommended for system,
component and operator failures
Cut sets/Importance
End of Chapter 8
Summary
Model types
Beta factor model
MGL, Alpha factor models
Including CCFs in a FT
Confidence Analysis
Chapter 9
Confidence Analysis
Assuming failure rates exactly
known
Not necessarily true
Sparse data
Introduces uncertainty in component
Q
Confidence Analysis
Example
Confidence Analysis
Uncertainty expressed as
range, distribution
10^-5 ± 0.5×10^-5 normal distribution
10^-6 to 10^-4 lognormal distribution
Modeled using Monte Carlo
sampling
Pick failure rates from distribution
Run analysis
Repeat
Sampling procedure
Sample failure rates
from distribution
For n = 1 to number
of simulations
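The sampling loop can be sketched in Python with the standard library. Everything system-specific below is an assumption for illustration: the "system" is two non-repairable components under an AND gate, the normal uncertainty is truncated at zero, and the lognormal range 10^-6 to 10^-4 is read as a 5% to 95% interval around a median of 10^-5.

```python
import math
import random


def system_q(lam_a, lam_b, t):
    """Hypothetical system: two non-repairable components under an AND gate."""
    qa = 1 - math.exp(-lam_a * t)
    qb = 1 - math.exp(-lam_b * t)
    return qa * qb


def confidence_bounds(n_sims=10_000, t=1000.0, seed=1):
    rng = random.Random(seed)
    sigma = math.log(10) / 1.645  # puts 1e-6 and 1e-4 at the 5% and 95% points
    results = []
    for _ in range(n_sims):
        # Pick failure rates from their uncertainty distributions
        lam_a = max(rng.gauss(1e-5, 0.5e-5), 0.0)
        lam_b = rng.lognormvariate(math.log(1e-5), sigma)
        # Run the analysis with the sampled rates
        results.append(system_q(lam_a, lam_b, t))
    results.sort()

    def pick(p):
        return results[min(int(p * n_sims), n_sims - 1)]

    return pick(0.05), pick(0.50), pick(0.95)


print(confidence_bounds())
```

The 5% and 95% percentiles of the sorted results serve as the confidence bounds on system Q; more simulations tighten the percentile estimates at the cost of run time.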
Program Demonstration
Using a FT program to find
confidence bounds
End of Chapter 9
Initiators, Enablers, and Sequencing
Chapter 10
Initiator Example
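SPARK is initiator
INFLAM is enabler
SPARK → INFLAM: safe
INFLAM → SPARK: fire
[Figure: TOP1 "Explosion" with inputs "Fire Starts" and "PROTECTION SYSTEM"; "Fire Starts" with inputs INFLAM (marked E, Q=0.1) and SPARK (marked I, w=2). Remaining label: "Present"]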
Sequencing
More precisely specify order of
failures
First, second, third, fourth, fifth, etc.
Priority AND gate
Applied to cut sets
Markov used to solve
TP1
All working
1 2 3
A B C
λ1
λ1 λ2 λ3 λ2 λ3
A B C
λ2 λ1 λ1
λ3 λ3 λ2
λ3 λ2 λ3 λ1 λ2 λ1
Modularizing Priority AND
Example
TOP1
GATE1 D
1 2 3
A B C
Modularizing Priority AND
Non-modularized cut sets
TOP1 = A · B · C · D
Allowed failure sequences
A→B→C→D
Program Demonstration
Event sequence status
Sequencing options
Auto-sequence Priority AND
Verification
Exactly 1 initiator under AND
Results
End of Chapter 10
Event Trees
Chapter 11
Pipe Break Event Tree
Nuclear safety example
Examines effectiveness of protective
system
Initiating event - Pipe break
Enablers - Protective systems
All possible outcomes examined
Each branch examines failure or
success
Failure branches: failure of basic event
or the minimal cut sets of a gate
Success branches: success state of basic
event or minimal path sets of a gate
[Figure: full pipe break event tree with a Success and a Failure branch at each protective system. End-branch consequences, top to bottom: No Release, No Release, No Release, Very Small Release, Small Release, Small Release, Small Release, Medium Release, Medium Release, Large Release, Medium Release, Large Release, Large Release, Large Release, Large Release, Very Large Release]
Pipe Break Event Tree
Simplify by
Removing impossible sequences
Removing sequences leading to ‘No Release’
Combining neighbouring end-branches with the same consequences
[Figure: the full pipe break event tree, before simplification]
Simplifying – “No Release”
Pipe Break | Electric Power | Emergency Cooling | Fission Product Removal | Containment Integrity | Consequence
[Figure: the full pipe break event tree under these column headings]
Simplified Pipe Break Event Tree
Pipe Break (ω=0.01) | Electric Power (Q=0.00016) | Emergency Cooling (Q=0.0016) | Fission Product Removal (Q=0.02) | Containment Integrity (Q=0.01) | Consequence | Frequency
[Figure: simplified event tree. End branches, top to bottom: Null; Large Release, 1.5e-6; Null; Large Release, 3.1e-8; Very Large Release, 3.2e-10]
Spark Event Tree
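[Figure: fault tree TOP1 (Explosion) with inputs FIRE and PROTECT; FIRE has inputs INFLAM (Q=0.1) and SPARK (w=2)]
First branch | Second branch | Consequence | Frequency
Success | Success | None | 1.77
Success | Failure | None | 0.0306
Failure | Success | None | 0.197
Failure | Failure | Explosion | 0.0034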
Results
Per Consequence
Frequency
Importance
Cut sets
Per category
Risk
F-N Curve
Correlates weight with
frequency
X-axis: weight
Y-axis: cumulative frequency of all
consequences with that weight
In a given category
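A rough sketch of how the points of an F-N curve can be built from a list of consequences. It assumes the usual F-N convention, in which the cumulative frequency at a weight sums every consequence with that weight or greater; that is one reading of the slide, not a statement of how any specific program computes it. The frequencies are the ones from the simplified pipe break tree; the weights (1 and 10) are hypothetical and chosen only for illustration.

```python
def fn_curve(consequences):
    """Return (weight, cumulative frequency) points sorted by weight.

    consequences: list of (weight, frequency) pairs in one category.
    The cumulative frequency at weight N sums the frequencies of all
    consequences whose weight is at least N (assumed convention).
    """
    weights = sorted({w for w, _ in consequences})
    return [
        (w, sum(f for cw, f in consequences if cw >= w))
        for w in weights
    ]


# Frequencies from the simplified pipe break tree; weights are hypothetical.
pipe_break = [
    (1.0, 1.5e-6),    # Large Release
    (1.0, 3.1e-8),    # Large Release
    (10.0, 3.2e-10),  # Very Large Release
]

for weight, cumulative in fn_curve(pipe_break):
    print(weight, cumulative)
```

Both axes are normally drawn on logarithmic scales, so each point would be plotted at (log weight, log cumulative frequency).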
Pipe Break F-N Curve
[Figure: Safety F-N Curve; x-axis: Weight, 0.1 to 10; y-axis: Cumulative frequency, 1E-13 to 0.0001]
Modularization
Consider:
Tank Overfill | Shutoff | Emergency Relief | Consequence
Success | Success | No effect
Success | Failure | No effect
Failure | Success | No effect
Failure | Failure | Chemical spill
Modularization
Where:
SHUTOFF (Q=0.0199): Shut off does not engage
Shut-off valve fails open
Level sensor fails to detect high level
RELIEF (Q=0.0199): Emergency relief system fails to open
Pressure relief valve fails closed
Level sensor fails to detect high level
Modularization
If SHUTOFF and RELIEF considered
separately:
Tank Overfill (ω=2) | Shutoff (Q=0.0199) | Emergency Relief (Q=0.0199) | Consequence | Frequency
Success | Success | No effect | 1.921
Success | Failure | No effect | 0.03901
Failure | Success | No effect | 0.03901
Failure | Failure | Chemical spill | 0.000792
Modularization
SHUTOFF
= VALVE + SENSOR
= 0.0199
RELIEF
= PVALVE + SENSOR
= 0.0199
Chemical Spill
= OVERFILL · SHUTOFF ∙ RELIEF
= 2 · 0.0199 · 0.0199
= 7.92E-4
Modularization
However, SENSOR is a common event
SHUTOFF and RELIEF are not independent
Chemical Spill ≠ OVERFILL · SHUTOFF · RELIEF
Accurate calculation must resolve consequences to minimal cut sets
Modularization
Chemical Spill:
SHUTOFF · RELIEF
= (VALVE + SENSOR) · (PVALVE + SENSOR)
= SENSOR + VALVE · PVALVE
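The reduction above can be reproduced mechanically. The sketch below expands an AND of OR gates and then applies the two Boolean rules used in the slide: idempotence (X · X = X) and absorption (X + X · Y = X). The function name `and_of_ors` is hypothetical, and the approach only covers this simple gate shape, not a general cut set algorithm.

```python
from itertools import product


def and_of_ors(*gates):
    """Expand an AND of OR gates into its minimal cut sets."""
    # Each candidate cut set picks one event from every OR gate.
    # Using a frozenset collapses repeats, which is idempotence: X · X = X.
    candidates = {frozenset(choice) for choice in product(*gates)}
    # Absorption: drop any cut set that strictly contains another one.
    minimal = [c for c in candidates if not any(other < c for other in candidates)]
    return sorted(minimal, key=lambda c: (len(c), sorted(c)))


shutoff = ["VALVE", "SENSOR"]   # SHUTOFF = VALVE + SENSOR
relief = ["PVALVE", "SENSOR"]   # RELIEF = PVALVE + SENSOR

cut_sets = and_of_ors(shutoff, relief)
for cut_set in cut_sets:
    print(" · ".join(sorted(cut_set)))
```

The expansion produces four candidates (VALVE · PVALVE, VALVE · SENSOR, SENSOR · PVALVE, SENSOR); the two that contain SENSOR alongside another event are absorbed by the single-event cut set SENSOR.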
Modularization
If SHUTOFF and RELIEF resolved to
minimal cut sets:
Tank Overfill (ω=2) | Shutoff | Emergency Relief | Consequence | Frequency
Success | Success | No effect | 1.941
Success | Failure | No effect | 0.0196
Failure | Success | No effect | 0.0196
Failure | Failure | Chemical spill | 0.0202
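The difference between the two tables can be checked numerically. The sketch below assumes each basic event (VALVE, PVALVE, SENSOR) has Q = 0.01; that value is not stated on the slides but is chosen because 1 - (1 - 0.01)² = 0.0199 matches the gate values shown. It uses brute-force enumeration of component states in place of a minimal cut set calculation, which is practical only for very small trees.

```python
from itertools import product

OMEGA = 2.0  # Tank Overfill frequency, from the slides
Q = {"VALVE": 0.01, "PVALVE": 0.01, "SENSOR": 0.01}  # assumed basic event values


def shutoff_fails(state):
    return state["VALVE"] or state["SENSOR"]


def relief_fails(state):
    return state["PVALVE"] or state["SENSOR"]


def exact_frequencies():
    """Frequency of each (shutoff failed, relief failed) outcome, SENSOR shared."""
    freq = {}
    for flags in product([False, True], repeat=len(Q)):
        state = dict(zip(Q, flags))
        prob = 1.0
        for name, failed in state.items():
            prob *= Q[name] if failed else 1.0 - Q[name]
        key = (shutoff_fails(state), relief_fails(state))
        freq[key] = freq.get(key, 0.0) + OMEGA * prob
    return freq


def independent_frequencies():
    """Same outcomes, wrongly treating SHUTOFF and RELIEF as independent."""
    q_gate = 1.0 - (1.0 - 0.01) ** 2  # 0.0199 for each gate
    return {
        (s, r): OMEGA * (q_gate if s else 1.0 - q_gate) * (q_gate if r else 1.0 - q_gate)
        for s, r in product([False, True], repeat=2)
    }


exact = exact_frequencies()
naive = independent_frequencies()
for key in sorted(exact):
    print(key, round(naive[key], 6), round(exact[key], 6))
```

Under these assumptions the spill frequency is about 0.000792 when the gates are treated as independent and about 0.0202 when the shared SENSOR event is accounted for, a factor of roughly 25.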
Partial Failure Branches
Success/Failure logic
Gives two and only two outcomes
Partial failure
More than two possible outcomes
Gives a gradation of possibilities
Not necessarily mutually exclusive
Each branch associated with a
different gate or event failure
E.g., partial capacity
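[Figure: event tree excerpt with partial-failure branches by passenger count (one branch labeled "True"). End branches, top to bottom:]
0-10 passengers | 2 fatalities | 1.031E-5
21-30 passengers | 8 fatalities | 1.378E-4
0-10 passengers | 8 fatalities | 9.277E-7
21-30 passengers | 24 fatalities | 1.392E-6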
Program Demonstration
Evaluating an Event Tree in a
computer program
End of Chapter 11