
Reliability Engineering and System Safety 72 (2001) 293–302

www.elsevier.com/locate/ress

An analysis of maintenance failures at a nuclear power plant


Pekka Pyy*
VTT Automation, PL 1301, 02044 VTT, Finland
Received 17 August 2000; accepted 23 February 2001

Abstract
In this paper, a study of faults caused by maintenance activities is presented. The objective of the study was to draw conclusions on the
unplanned effects of maintenance on nuclear power plant (NPP) safety and system availability. More than 4400 maintenance history reports
from the years 1992–1994 of Olkiluoto BWR NPP were analysed together with the maintenance personnel. The human action induced faults
were classified, e.g. according to their multiplicity and effects. This paper presents and discusses the results of a statistical analysis of the data.
Instrumentation and electrical components appeared to be especially prone to human failures. Many human failures were found in safety
related systems. Several failures also remained latent from outages to power operation. However, the safety significance of failures was
generally small. Modifications were an important source of multiple human failures. Plant maintenance data is a good source of human
reliability data and it should be used more in the future. © 2001 Elsevier Science Ltd. All rights reserved.
Keywords: Human failures; Human reliability; Data analysis; Statistical analysis; Maintenance; Nuclear power plants

1. Introduction
In human reliability research, the main attention has
usually been focused upon the control room crew performance in post initiating event conditions. The control room
operators have an essential role in disturbance management.
On the other hand also maintenance may have an impact on
the severity of an incident by recovering lost systems or by
erroneously disabling safety related equipment.
The chances of operators successfully managing a disturbance are worsened if there are latent equipment faults in
the safety related systems. In particular, common cause failures (CCFs), affecting several trains of a safety system, may
have a significant contribution to the reactor core damage
risk [1,2]. Often, CCFs are caused by human maintenance
actions. In some cases, even single human actions may
affect safety by influencing several components through
latent system interactions [3].
In probabilistic safety assessment (PSA), human failures¹ have been divided into three categories [4,5]: (A) pre-initiator events that cause equipment/systems unavailability, (B) actions leading to PSA initiating events, i.e. human induced initiators, and (C) post-initiator human actions. Generally, maintenance actions are included in the PSA models for class A. Type A activities have not been one of the key areas of development in human reliability analysis (HRA), although some effort has been made, e.g. to assess probabilities of human failures in maintenance [2,6,7]. Furthermore, exact modelling and quantification of repeated human failures in PSA has been formulated in Ref. [8], and maintenance data from several installations has been used to draw generic conclusions in Ref. [9]. Even though some models go as far as to simulate maintenance activities [10], the efforts to develop, validate and verify HRA models with plant data have been few. An example of such studies is that of Reiman [2], and references will be made to it in several parts of this paper.

* Fax: +358-9456-6752. E-mail address: pekka.pyy@vtt.fi (P. Pyy).
¹ The term human failure is used in this paper instead of human error to emphasise that the reasons for a failing human action may be many. Sometimes even a correct human action transmits a fault mechanism into the equipment, e.g. due to a faulty instruction or tools, and thus causes an equipment fault. In the following, the term human failure is used as a synonym for a fault caused by human action.

A study on maintenance induced faults was performed for
more than 4400 maintenance history reports from the years
1992–1994 of Olkiluoto boiling water reactor (BWR)
nuclear power plant (NPP) in Finland [11]. This paper
presents a follow-up analysis of the data collected in that
study. The objective of the follow-up analysis was to allow
statistical inference of unplanned effects of maintenance
actions on plant safety. Further, the idea was to generate a
database available for potential future probabilistic studies.
0951-8320/01/$ - see front matter © 2001 Elsevier Science Ltd. All rights reserved.
PII: S0951-8320(01)00026-6

2. Methods and material
In this section, the main characteristics of data collection and the statistical analysis are presented. The collection
work of the original data, including the detailed classifications and a description of the analysis flow, is described in
detail in Ref. [11].
In the study, the human failures were classified into single
ones, into multiple ones (Human related Common Cause
Failures (HCCFs) and Human related Common Cause
Non-critical failures (HCCNs)) and into Human related
Shared Equipment Failures (HSEFs). HCCFs were common
cause failures in redundant components caused by repeated
human actions and HCCNs were corresponding non-critical
common cause failures, i.e. they do not directly cause faults
in a component but may reduce its operability characteristics. HSEFs were multiple faults caused by a single human
failure, which is possible especially through latent electrical
and instrumentation system dependencies. Other faults
taking place in more than one (non-redundant) component
or system were classified as single failures, although some
of them may have included dependent features.
Thorough interviews with the utility personnel revealed
206 single human failures leading to equipment faults, as
shown in Fig. 1, which demonstrates the screening of data. Apart
from this amount, 126 fault history records could be
grouped into 37 dependent failures and 11 HSEFs. This
reduction in number was due to the fact that several fault
records in the database referred to the same dependency
mechanism. Altogether eight HCCFs and six HCCNs
were identified in the data, whereas a closer analysis
revealed that 23 of the dependent cases were actually not due to
human actions.
A typical fault record in the plant maintenance database
included the following information: component identification number, component type, room, fault detection time,
repair initiation time, repair finishing time, number of workers, hours spent, urgency class, fault report number, codes
and a short free text description. The codes referred to fault
causes, consequences and repair actions, usually accompanied by a text description of 1–2 sentences. No information allowing deeper analyses of human behaviour was
available in the database because it is primarily
intended for purposes other than HRA.
The data was verified with the maintenance foremen and
against other plant documentation, and more information
was collected to complete the picture. This part of the analysis was very resource intensive, and it was later
supplemented with input from the regulatory body personnel and the plant PSA
group. Finally, all single human failures were classified
according to the phenotype of the human failure, type of
equipment involved, time of failure origin, time of failure
detection and type of action that revealed the failure.
Further, the mechanisms of the dependent failures were
studied in more detail. The statistical analysis
also included studies of the safety and economic significance of
the failures. This means studying the number of repair hours
required, the repair urgency class and the PSA significance of
the fault.
The single human failure database in particular was large
enough to allow proper statistical testing and inference.
Here, non-parametric tests such as χ², Fisher's exact probability test and the Mann–Whitney rank sum test [12,13] were
used. Generally, α = 0.05 was used as the significance criterion. SIGMA STAT [14] software was used, with verification
calculations by other tools. The ability to carry out statistical
testing is the reason for devoting a longer discussion to
single human failures in the following. Statistical tests
were also performed for HCCFs, HCCNs and HSEFs, which
are more safety significant categories of human
failures, but these sample sizes were considerably smaller.

Fig. 1. Flow chart of the screening of human failures [11].


Table 1
Distribution of all faults and human failures in cause categories as reported by the plant maintenance personnel (for dependent failures, see Table 6)

Reported main cause category (given by utility) | Number of all fault records (unit 1 + unit 2) | Number of single human failure records | Percentage of single human failures of fault records (%)
A Failure in installation or earlier | 279 + 221 = 500 | 29 + 21 = 50 | 10.0
B Operating or maintenance personnel | 113 + 101 = 214 | 44 + 27 = 71 | 33.2
C Consequence of operation | 1250 + 1491 = 2741 | 10 + 3 = 13 | 0.5
D Miscellaneous causes | 505 + 447 = 952 | 23 + 47 = 70 | 7.4
Total | 4407 | 204ᵃ | 4.6

ᵃ Two cases came from utility event reports (206 single human error cases, see Fig. 1).
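
The percentage column of Table 1 follows directly from the two count columns. The following is a minimal sketch that recomputes it, with the counts copied from Table 1; the dictionary layout and the function name are illustrative assumptions, not part of the original study.

```python
# Counts copied from Table 1: (all fault records, single human failure records).
# The data layout and names are illustrative only.
table1 = {
    "A Failure in installation or earlier": (500, 50),
    "B Operating or maintenance personnel": (214, 71),
    "C Consequence of operation": (2741, 13),
    "D Miscellaneous causes": (952, 70),
}


def share_percent(human_records, all_records):
    """Percentage of fault records attributed to single human failures."""
    return 100.0 * human_records / all_records


shares = {cause: round(share_percent(h, n), 1) for cause, (n, h) in table1.items()}

total_records = sum(n for n, _ in table1.values())
total_human = sum(h for _, h in table1.values())
total_share = round(share_percent(total_human, total_records), 1)

for cause, pct in shares.items():
    print(f"{cause}: {pct} %")
print(f"Total: {total_human} of {total_records} records = {total_share} %")
```

The sketch makes the contrast between categories explicit: category B, the one intended for human failures, has by far the highest share, yet it holds only about a third of the single human failure records.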

Table 1. The division of single human failures among different equipment types is presented in Table 2. As seen,
control and instrumentation (I&C, 84 cases) and electrical equipment (40 cases) are often affected by human
actions. Together they account for about 60% of the total.
A closer study of the whole database revealed that the
high number of human originated instrument faults is
analogous to the share of I&C in all the faults
(≈40%).
Nearly three fourths of all single human failures (152
cases) were found in process systems, whereas only 41
cases were found in so-called electrical or instrumentation
systems, e.g. in bus bars or in plant protection. Many
electrical and I&C faults were also found in process systems,
which is due to the amount of I&C equipment in all kinds
of systems. Consequently, more emphasis should be put in
PSA on studying complex equipment such as instrumentation,
control, protection, electrical power supply and drives in all
systems.
The next step was to study which kinds of human failures
take place. Swain [6] divides human failures into errors of
commission and omission. An error of omission (EoO) is a
failure to perform an action at all, i.e. one omits it. An
error of commission (EoC) is an incorrect performance of
an action, or performance of some additional action. HRA
studies mostly concentrate upon errors of omission. Table 2
shows how Swain's taxonomy [6] was expanded in this
study so that wrong set point failures and wrong direction
failures (e.g. an electric motor rotates in wrong direction due
The results obtained for them are compared to the ones of single human failures in Section 3.2.
3. Results
In the following sections, the results of the statistical
analysis will be discussed separately for single human failures, dependent ones (HCCFs and HCCNs) and HSEFs.
3.1. Single human failures
The plant maintenance personnel had classified the
human failures, i.e. human actions that led to a fault, in all
four available fault cause categories A–D, as seen in Table
1, although there was a specific category B in the reporting
form for human failures. This was mostly due to the fact that
the categories, e.g. ‘operating and maintenance personnel’
and ‘failure in installation or earlier’, are not mutually exclusive. Similarly, some of their sub-categories, e.g. ‘foreign
labour’ and ‘installation error’, may apply to the same fault,
and slightly different reporting principles have obviously
been applied in the two plant units. Based on this result,
the search for human failures should not be restricted to the
corresponding cause categories in the maintenance history;
free text descriptions also need to be used.
3.1.1. Equipment type and human failures
Together 206 (4.6%) maintenance history records
were due to single human failures, as was shown in

Table 2
Single human failure types and their distribution among different equipment categories

Human failure type | I&C components | Mechanical components | Electrical components | Valvesᵃ | Instr. line valvesᵃ | Total
Omission | 13 | 7 | 14 | 7 | 8 | 49
Commission, wrong set points | 11 | 0 | 5 | 2 | 0 | 18
Commission, wrong direction | 11 | 3 | 7 | 6 | 0 | 27
Commission, other | 49 | 26 | 14 | 19 | 4 | 112
Total | 84 | 36 | 40 | 34 | 12 | 206

ᵃ Instrument valves are shown separately due to their proneness to omission errors, but all valves were considered as one class in χ²-tests to avoid zero frequency categories.


to misplaced wires) were separated as failure classes of their own.
This was due to the fact that they were saliently present in
the data and that the results thereby became comparable with those of
Reiman's study [2]. The type of database did not allow for
any deeper classification of human failure mechanisms.
The distribution of failures shown in Table 2 is not homogeneous, which is confirmed by the χ²-test (p = 0.02). EoCs
dominate the results with a share of 76%, and especially
the category ‘other commission’ is salient (54%), deserving
further analysis. The share of EoCs in mechanical and I&C
equipment is exceptionally high. In contrast, many EoOs
had taken place in instrument line block valves (67% of
all instrument valve failure modes). For valves, the result
could be expected: failure types such as wrong direction
and wrong settings do not very often take place in process
components. The amount of wrong settings in I&C and
electrical equipment was about 12–13% of the total amount
of failures, which cannot be seen as exceptionally low.
However, the wrong settings are the category of EoCs that
is sometimes included in PSAs, whereas the other types of
EoCs are normally neglected.
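
The homogeneity test quoted above can be retraced from the counts in Table 2. The following is a minimal sketch, assuming a plain Pearson χ²-test with the two valve columns merged as the table footnote describes; the helper functions are illustrative and are not the SIGMA STAT routines used in the study. Several expected counts fall below five, so the resulting p-value should be read as approximate.

```python
import math

# Table 2 counts; rows = failure type, columns = I&C, mechanical, electrical, valves.
# The two valve columns are merged, as the table footnote describes.
observed = [
    [13, 7, 14, 7 + 8],    # omission
    [11, 0, 5, 2 + 0],     # commission, wrong set point
    [11, 3, 7, 6 + 0],     # commission, wrong direction
    [49, 26, 14, 19 + 4],  # commission, other
]


def chi2_statistic(table):
    """Pearson chi-square statistic and degrees of freedom for an r x c table."""
    row_sums = [sum(row) for row in table]
    col_sums = [sum(col) for col in zip(*table)]
    n = sum(row_sums)
    stat = 0.0
    for i, row in enumerate(table):
        for j, obs in enumerate(row):
            expected = row_sums[i] * col_sums[j] / n
            stat += (obs - expected) ** 2 / expected
    dof = (len(row_sums) - 1) * (len(col_sums) - 1)
    return stat, dof


def chi2_sf(x, dof):
    """Upper tail probability of the chi-square distribution.

    Uses the power series of the regularized lower incomplete gamma
    function, which is adequate for the moderate values met here.
    """
    a, z = dof / 2.0, x / 2.0
    if z <= 0.0:
        return 1.0
    term = 1.0 / a
    total = term
    k = 0
    while term > 1e-15 * total:
        k += 1
        term *= z / (a + k)
        total += term
    lower = total * math.exp(-z + a * math.log(z) - math.lgamma(a))
    return 1.0 - lower


stat, dof = chi2_statistic(observed)
p_value = chi2_sf(stat, dof)
print(f"chi2 = {stat:.2f}, dof = {dof}, p = {p_value:.3f}")
```

With nine degrees of freedom the statistic lies above the 5% critical value of 16.92, which agrees with the reported rejection of homogeneity.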
Table 3 gives more detailed information about the maintenance related human failures based on a closer investigation. The events could be decomposed, for example, into:
lack of attention causing a short circuit, confusion in cables
(put in wrong order), use of excessive force causing crushed
instrumentation tubes, and use of too little force causing bad
connections, untight bolts, broken pieces etc. In particular, the
analysis of ‘other commission failures’ is further expanded.
The finding of this part of the study was that the first four
columns in Table 3, referring more or less to carelessness,
correspond to more than 52% of the total.
The distribution of detailed human failure type frequencies in Table 3 is not homogeneous, as confirmed by the χ²-test (p < 0.001). The ‘lack of attention’ type failures seem
to be common in mechanical components (53% of faults).
‘Forgetting’ is the most frequent mode in valves and electrical equipment, which is usually taken into account in PSA
studies. Excessive force was in many cases used to adjust
valves. Work planning and system design and layout (ergonomics) appeared to contribute to many of these failures.
Control and instrumentation equipment is prone to wrong
direction failures, choosing a wrong object and using too
little force (including loose connections).
The data of this study showed somewhat higher yearly
frequencies for selected human failure classes when
compared to the data collected earlier by Reiman [2] from
the same power plant. Reiman discovered ≈9.6 omissions
and 3.8 wrong direction commissions per year over
1981–1991, whereas the findings of this study are 16.3
and 9.0, respectively. The difference is mostly due to
the more extensive effort put into this study. The search for
human failures in Ref. [11] covered all the maintenance
record classes, and not only those pre-classified as human
failures (Class B, Table 1) as in Ref. [2]. No statistically
significant increasing or decreasing trends were noticed in the
yearly frequencies of those failure classes.
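
The yearly frequencies quoted for this study can be retraced from the Table 2 totals and the three-year observation period. The following is a small sketch; the variable names are illustrative, and the ratio to the earlier rates is an added illustration rather than a figure reported in the paper.

```python
# Counts from Table 2 of this study (1992-1994) and the yearly rates quoted from Reiman [2].
YEARS_COVERED = 3  # 1992, 1993 and 1994
counts = {"omission": 49, "wrong direction": 27}
reiman_per_year = {"omission": 9.6, "wrong direction": 3.8}

# Yearly frequency in this study and its ratio to the earlier rate.
per_year = {kind: n / YEARS_COVERED for kind, n in counts.items()}
ratio_to_reiman = {kind: per_year[kind] / reiman_per_year[kind] for kind in counts}

for kind in counts:
    print(f"{kind}: {per_year[kind]:.1f} per year, "
          f"{ratio_to_reiman[kind]:.1f} times the earlier rate")
```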
Finally, the working hours spent were also studied as an
indicator of the unavailability time and related costs,
although the unavailability time cannot be directly
based on them. Data directly from the maintenance history
could be used here. There were no statistically significant
differences in the distributions of working hours due to
single human failures and due to other faults according to
the Mann–Whitney rank sum test (p = 0.30). The average
was 22.5 h, but the data clearly formed two groups with very
short and long repairs. This result cannot be seen as surprising, since simple faults are fixed at once regardless of
whether they are caused by human actions or by other
factors. The mechanical components had significantly
longer repair times than the other component groups (average 32.1 h) and they were the only group deviating saliently
from other data.
3.1.2. Fault origination and detection
In NPPs, many preventive maintenance, modification
and testing activities take place during the annual refuelling
outage. This fact partly explains why 127 (≈62%) single
human failures stem from that period, in comparison
to the 78 taking place in power operation. The origination and detection of human failures is important information for HRA purposes.
About 94% of the failures born during the power operation
were also detected during the power operation, as shown in
Table 3
A detailed human failure classification and its distribution in different equipment classes

Equipment | Lack of attention | Too much force | Too little force | Wrongᵃ object | Wrongᵃ set point | Wrong direction, sequence | Forgetting a phase | Total
I&C | 17 | 4 | 10 | 17 | 11 | 11 | 14 | 84
Mechanical | 19 | 2 | 3 | 2 | 0 | 3 | 7 | 36
Electrical | 10 | 1 | 2 | 1 | 4 | 8 | 14 | 40
Valveᵃ | 9 | 8 | 2 | 2 | 1 | 6 | 6 | 34
Instr. valveᵃ | 1 | 0 | 0 | 3 | 0 | 0 | 8 | 12
Total | 56 | 15 | 17 | 25 | 16 | 28 | 49 | 206

ᵃ Valves and ‘wrong’ failure modes were combined for χ²-test to avoid zero frequencies.


Fig. 2. Plant operating mode at the time of detection of 127 single human failures (faults) stemming from outages (left) and 78 stemming from the operating
period (right). For one case, the timing remained unclear. The detection took place by a preventive (prev.) or by other type of action.

Fig. 2. In contrast, 49% of the human failures born
during outages remained latent until the plant start-up or
even until the power operation. This was somewhat unexpected,
since several preventive actions, such as tests and inspections, take place at the end of an outage in order to detect
remaining faults and human failures. The finding led to a
closer investigation of the birth and detection of human
failures.
Fig. 3 shows the results of a more detailed study of
the failures born during the outages. Detection percentages
are shown as a function of (1) the different equipment classes
and (2) the plant operating mode at the time of the detection.
More faults are detected in start-up or during the power
operation for all the component classes except for the
mechanical component faults, which are mostly detected
during the outages (67%). The obvious explanation is that
mechanical damage is highly visible. However, the
equipment type is not a statistically significant explanatory
factor (χ², p ≈ 0.42) for differences in detection frequencies.
3.1.3. Safety significance
The indicators of safety significance were based on NPP
Safety Technical Specifications (TechSpecs), giving deterministic safety related rules, and PSA. To some extent,
different systems are listed in TechSpecs and modelled in
PSA fault trees (FTs). A PSA study includes important
systems from the severe core damage point of view. Often,
those systems are also in stand-by state during the power
operation. The safety system concept in TechSpecs is
different, since e.g. fire and mechanical fuel integrity risks
are taken into account. In the following, more emphasis is
put on systems modelled in PSA, which required some
further investigation of the data and co-operation with the
Fig. 3. Detection frequencies of the outage born faults as a function of different equipment classes and the plant operating mode at the time of the detection.


Table 4
Distribution of single human failures in different equipment between safety related and non-safety related systems

Equipment | Number in TechSpecs systems | Number in other systems | Total | Number in PSA systems | Number in other systems | Total
IC | 40 | 44 | 84 | 33 | 51 | 84
EL | 22 | 18 | 40 | 23 | 17 | 40
MEC | 9 | 27 | 36 | 9 | 27 | 36
VAL | 8 | 26 | 34 | 14 | 20 | 34
IVAL | 4 | 8 | 12 | 8 | 4 | 12
Total | 83 | 123 | 206 | 87 | 119 | 206

PSA-team. The assessment of the safety significance was
carried out in three phases.
The first phase of the safety significance review was to
study the frequencies of single human failures in safety
related and other systems. The number of them in systems
with a FT in PSA was 87 (compared to 119 in other
systems). To find out if the frequencies of technical faults
are distributed in a similar way, all the 4146 fault records
due to technical reasons were classified into PSA (1431
records) and non-PSA systems (2715 records). A comparison to the distribution of human failures (87/119) was
carried out by χ²-test. The result indicates that the number
of single human failures in systems with a PSA FT is higher
than random factors would explain (p = 0.03). One reason
for this finding is the large amount of different preventive
maintenance and test tasks carried out in safety related
systems. It may also be that not all the faults in totally safety
insignificant systems are duly booked.
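
The comparison described above is a 2 × 2 contingency test and can be retraced from the four counts given in the text. The following is a minimal sketch; whether the original calculation applied Yates' continuity correction is not stated in the paper, so both variants are computed, and the function name is an illustrative assumption.

```python
import math

# Rows: single human failures, technical faults; columns: PSA systems, other systems.
human = (87, 119)
technical = (1431, 2715)


def chi2_2x2(row1, row2, yates=False):
    """Chi-square statistic and p-value (1 degree of freedom) for a 2 x 2 table."""
    a, b = row1
    c, d = row2
    n = a + b + c + d
    diff = abs(a * d - b * c)
    if yates:
        # Continuity correction; its use in the original study is an assumption.
        diff = max(0.0, diff - n / 2.0)
    stat = n * diff ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    # For one degree of freedom the upper tail equals erfc(sqrt(stat / 2)).
    p_value = math.erfc(math.sqrt(stat / 2.0))
    return stat, p_value


share_human = human[0] / sum(human)
share_technical = technical[0] / sum(technical)
stat_plain, p_plain = chi2_2x2(human, technical)
stat_yates, p_yates = chi2_2x2(human, technical, yates=True)

print(f"share in PSA systems: human {share_human:.1%}, technical {share_technical:.1%}")
print(f"without correction: chi2 = {stat_plain:.2f}, p = {p_plain:.3f}")
print(f"with Yates correction: chi2 = {stat_yates:.2f}, p = {p_yates:.3f}")
```

Both variants stay below the 0.05 criterion, and the corrected one rounds to the reported p = 0.03.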
Table 4 was drawn to study which component groups are
prone to human failures in safety related systems. The distribution of the number of human failures in different equipment
types is not homogeneous according to the χ²-test. The situation is not much affected by whether the division is made
with regard to PSA/non-PSA systems (p = 0.03) or with
regard to TechSpecs/non-TechSpecs systems (p = 0.01).
Electrical faults induced by human actions tend to concentrate in safety related systems, whereas the mechanical ones
are frequent in other systems. Many human failures in
instrumentation valves were in PSA systems.
The share of single human failures in the systems
modelled in PSA fault trees was 39% for I&C faults and
58% for electrical faults. The same ratios for systems
mentioned in TechSpecs were 48 and 55%, respectively. This result has to be interpreted against the background that the number of plant safety related systems is
considerably smaller than that of non-safety related systems.
For example, the number of systems modelled in PSA is 47
and the number of other systems is 185.
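
The shares quoted above follow from Table 4, and the remark about the number of systems can be made concrete by normalising the failure counts per system. The following is a small sketch; the per-system figures are an added illustration under the assumption that the Table 4 totals (87 and 119) can be divided by the system counts given in the text, and they are not reported in the paper.

```python
# Table 4 counts of single human failures: (in safety related systems, total).
psa = {"I&C": (33, 84), "electrical": (23, 40)}
techspecs = {"I&C": (40, 84), "electrical": (22, 40)}


def percent(part, whole):
    """Share of `part` in `whole`, in percent."""
    return 100.0 * part / whole


psa_share = {kind: round(percent(*counts), 1) for kind, counts in psa.items()}
techspecs_share = {kind: round(percent(*counts), 1) for kind, counts in techspecs.items()}

# Failures per system: Table 4 totals divided by the system counts quoted in the text.
# This normalisation is illustrative and not a figure from the paper.
per_psa_system = 87 / 47
per_other_system = 119 / 185

print("PSA share (%):", psa_share)
print("TechSpecs share (%):", techspecs_share)
print(f"failures per system: PSA {per_psa_system:.2f}, other {per_other_system:.2f}")
```

Under this assumption, the systems modelled in PSA carry clearly more single human failures per system than the others, even though they hold fewer failures in total.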
More than 50% of the outage born single human failures
in PSA systems were also identified in outages, as shown in
Table 5. However, the result is not significantly better than
for all single human failures. Valves in PSA systems were
the only group where more safety system related faults
were detected during the power operation than in shutdown.
Some negligible seal leakages were present in the valve data,
which may be the explanation for this finding. Nevertheless,
there was only a symptomatic difference in the frequencies
between the different equipment classes (χ², p = 0.09).
Not all the safety significant events are modelled in fault
trees, e.g. those inducing initiating events of PSA. Furthermore, in systems modelled in PSA FTs, there are components such as small tubes that have nothing to do with core
damage risks. Thus, one should study the human failures
leading to PSA basic events in order to find a realistic
nuclear risk contribution. This is discussed in the following
as the second phase of the safety significance review for the
data of our study.
Altogether 19 single human failures, modelled as a basic
event in the plant PSA model, were found. This amount
corresponds to 9.2% of the single human failures. Two of
them were only modelled as a part of a CCF mechanism.
The most frequent classes were, again, I&C and electrical
component faults, corresponding together to about 63% of the
total. An interesting detail is that in PSA basic events, there
were more omission type human actions and wrong direction failures, at the cost of failure types somehow caused
by careless actions, when compared to Table 3. One potential explanation may be that better care is taken of important
components.
The dominant part of human failures leading to PSA basic
events were born in outages (15 cases, 79%). About 40% of
them remained undetected from outage to start-up or to
power operation. This is slightly less than the average for
Table 5
The detection frequencies of single human failures born in outages taking into account if the system had a FT model in PSA or not

Component | Detected in outage, PSA | Detected in outage, NON-PSA | Detected in power, PSA | Detected in power, NON-PSA | Total
I&C | 13 | 13 | 9 | 19 | 54
Mechanical | 3 | 11 | 2 | 5 | 21
Electrical | 8 | 2 | 6 | 5 | 21
Valveᵃ | 4 | 7 | 7 | 6 | 24
Instr. valveᵃ | 2 | 1 | 2 | 2 | 7
Total | 30 | 34 | 26 | 37 | 127

ᵃ The valves were analysed together in statistical testing.


all systems, which could be explained by tighter check-ups
and tests in safety related systems, especially for important
components. Nevertheless, the difference between the populations (PSA/other events) remains unproven in χ²-testing.
As the last phase of the study of safety impact,
generally used reliability importance measures were also used
to study the significance of human failures. The Fussell–Vesely (FV) importance for basic events was applied, meaning the risk proportion of all cut-sets where the specified
basic event is present. Another importance measure used
was the risk achievement worth (RAW), meaning the relative rise in the core damage risk given that a basic event
takes place with probability 1. These two importance
measures give somewhat different results. For example,
RAW gives more weight to rare CCFs and single component faults do not appear high in its ranking list. For FV, the
situation is often the opposite. Thus, RAW gave no significant importance to the single human failure basic events.
The highest FV importance, less than 1%, was obtained for
some important control valve faults in the reactor water
cleaning system. Should all the identified single faults
have taken place at once, the FV importance contribution
to the core damage risk would have been 1.2%. This shows
clearly the small significance of single failures in a redundant and diversified nuclear power plant. The importance of
dependent failures is discussed under Section 3.2.3.
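
The different behaviour of the two importance measures can be illustrated with a toy model. The following is a minimal sketch using the definitions given above and the rare-event approximation; the cut sets, the event names and all probabilities are hypothetical and have nothing to do with the plant PSA model.

```python
from math import prod

# Hypothetical toy model, not the plant PSA: two minimal cut sets,
# {A, B} (two independent single faults) and {CCF} (a rare common cause failure).
probabilities = {"A": 5e-2, "B": 4e-2, "CCF": 1e-4}
cut_sets = [("A", "B"), ("CCF",)]


def risk(probs):
    """Top event probability by the rare-event approximation (sum over cut sets)."""
    return sum(prod(probs[e] for e in cs) for cs in cut_sets)


def fussell_vesely(event):
    """Fraction of the risk carried by the cut sets that contain the event."""
    containing = sum(prod(probabilities[e] for e in cs)
                     for cs in cut_sets if event in cs)
    return containing / risk(probabilities)


def risk_achievement_worth(event):
    """Risk with the event set to probability 1, relative to the base risk."""
    return risk({**probabilities, event: 1.0}) / risk(probabilities)


for event in probabilities:
    print(f"{event}: FV = {fussell_vesely(event):.3f}, "
          f"RAW = {risk_achievement_worth(event):.1f}")
```

In this toy case the single faults dominate the FV ranking, whereas the rare common cause event dominates the RAW ranking, which mirrors the qualitative contrast described in the text.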
3.2. Dependent human failures
Candidates for dependent human failures were found in
126 maintenance records in the database and in four plant
licensee event reports. After a screening analysis, this
amount could be reduced to 43 records referring to 13
HCCF/HCCN cases and 14 records referring to 10 HSEF
cases. Other records referred to e.g. ageing mechanisms, as
indicated already in Fig. 1. Apart from the amount listed
above, one HCCF and one HSEF case came from other
utility reporting (such as LERs). This makes the number
of HCCFs and HCCNs together 14, as shown in Table 6,
and the number of HSEFs 11. The HCCF
and HCCN cases have been listed in Appendix A.
There were more fault records in the maintenance history
than actual human failures, since dependent failures had an
effect on several components, as shown in Table 6. Fault
records are normally booked on a component basis rather than
on a mechanism basis. Furthermore, many dependent failures
that were first classified as wrong calibrations were due to
ageing etc. and could be screened out in the course of the
study [11]. This shows the need for plant staff interviews as
a part of data collection.
The proportion of human failure records in the whole
maintenance data was about 6.0% and the proportion of
human failure cases about 5.2%. The latter result slightly
underestimates the human failure share, since no grouping
of other dependent faults than human failures was
done. Moreover, some human failures may have remained
unidentified. The dependent human failures (HCCFs,
HCCNs and HSEFs) represent 10.8% of the total amount
of human failures. The share of HCCFs, representing clearly
critical failures, is 3.5% based on Tables 1 and 6. These
results represent rather low probabilistic dependence and
they also confirm, for their part, the findings of Reiman [2].
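
The case-based proportions quoted above can be retraced from the counts given elsewhere in the paper. The following is a small sketch; the choice of denominators (all 4407 fault records for the case share, all human failure cases for the dependent shares) is an interpretation of the text and is labelled as such in the comments.

```python
total_fault_records = 4407  # Table 1
single_cases = 206          # single human failures
hccf_cases = 8
hccn_cases = 6
hsef_cases = 11

dependent_cases = hccf_cases + hccn_cases + hsef_cases
human_failure_cases = single_cases + dependent_cases

# Denominators are an interpretation of the text, not stated explicitly in it.
case_share = 100.0 * human_failure_cases / total_fault_records
dependent_share = 100.0 * dependent_cases / human_failure_cases
hccf_share = 100.0 * hccf_cases / human_failure_cases

print(f"human failure cases: {human_failure_cases} ({case_share:.1f} % of fault records)")
print(f"dependent cases: {dependent_cases} ({dependent_share:.1f} % of human failure cases)")
print(f"HCCF cases: {hccf_cases} ({hccf_share:.1f} % of human failure cases)")
```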
A further analysis showed that, of the eight HCCF cases,
one case affected two redundancies out of two, three cases
affected two out of four and three cases affected four out of
four (different systems had different degrees of redundancy). In addition, in one case the number of redundant
systems was higher than 10, four of which had failed. In one
case, where all redundant subsystems on one plant unit had
become unavailable, also another unit was affected (two out
of four). Thus, dependent human failures can extend across
the system and plant block boundaries.
3.2.1. Affected equipment and human failure types for
HCCFs and HCCNs
Statistical analysis of HCCF and HCCN data was difficult
due to the small amount of data. This also means that no
tables are used here to explain results. In some cases, the χ²-test
and Fisher's exact probability test could be used to support
inference. However, many conclusions could only be based
on qualitative reasoning and comparisons to the results of
single human failures. Nevertheless, the results obtained for
HCCFs and HCCNs resemble those of the single human
failures, as shown in the following.
Instrumentation (10 cases) and electrical equipment
(four cases) were the only affected equipment types. The

Table 6
Identified dependent human failure related records with their distribution both in HCCF/HCCN cases and in reported cause categories

Reported cause category | Fault records | Records referring to single human failures | Records referring to HCCFs/HCCNs | Dependent human failure cases (HCCFs/HCCNs) | Number of records per dependent h.f. case
A Failure in installation or earlier | 500 | 50 | 12 | 4 | 3
B Operating/maintenance staff | 214 | 71 | 13 | 6 | 2.2
C Consequence of operation | 2741 | 13 | 8 | 1 | 8
D Miscellaneous causes | 952 | 70 | 10 | 2 | 5
Total | 4407 | 204ᵃ | 43ᵇ | 13ᵇ | 3.3

ᵃ The amount excludes two reports not coming from the maintenance records, together 206 single human failures.
ᵇ Excludes one case coming from other utility records.


corresponding distribution of the eight HCCFs is five in I&C
and three in electrical equipment. This result supports the
conclusion made in Section 3.1.1 about the importance of
analysing maintenance actions related to electrical and I&C
equipment.
The dominant failure type category for HCCFs and
HCCNs is EoC (see Section 2), as for the single failures.
Its contribution is 86%, which means seven HCCFs and five
HCCNs. Wrong settings (four cases) is also a salient
category, but all of them appear to be non-critical instrumentation related HCCNs. The contribution of wrong
settings is, nevertheless, more significant than in the case of
single human failures. A wrong direction failure was behind
two instrumentation related HCCFs. Six cases could be seen
as other types of EoCs. In contrast to single human failures,
many dependent EoCs were related to systematic deficiencies in work planning and practices rather than to carelessness
and other random causes.
The issue of non-criticality of faults (HCCNs) is sensitive
to the definition of equipment boundaries. Only in one case was the
effect on equipment so negligible that the case was left
outside further study of consequences of failures. For the
remaining 13 HCCF/HCCN cases, five equipment inoperabilities
and eight wrong functions were identified as consequences.
The result was compared to the distribution of single human
failures by using the χ²-test, which showed that the distributions are not homogeneous (p = 0.03). Thus, more wrong
equipment functions appeared as a consequence of dependent human failures. The difference may be easily explained
by the population, since most dependent failures were
instrumentation related and, thus, wrong signals are a
common fault mode.
3.2.2. Fault origination and detection
Only three dependent failure cases (HCCFs and HCCNs)
originated during power operation and the remaining 11
originated in outages. This is analogous to single human failures.
Fig. 4 presents the plant operating mode at the time of
detection of those 11 failures that originated during the outages.
Also in this case, the results do not differ from the ones
obtained for the single human failures. Seven cases including three HCCFs remained undetected at least until the plant
start-up. Furthermore, Fisher's exact probability test
confirms the homogeneous distribution of the detection frequencies of outage-born
HCCFs and HCCNs, when the failures are classified according to their detection in outage or
later, i.e. start-up or power operation (2 × 2 contingency
table). However, the data was very sparse.

Fig. 4. Distribution of the detection modes of human induced dependent
faults introduced during an outage (11 cases). The further left in the picture,
the better the situation.
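The 2 × 2 comparison described above can be illustrated with a short Python sketch of a two-sided Fisher's exact test. This is not the author's calculation: the function name is hypothetical, and the cell counts are only partly taken from the text. The grand total of 11 and the seven late detections (three of them HCCFs) come from the paragraph above; the split of the four cases detected during the outage into two HCCFs and two HCCNs is an assumption made purely for illustration.

```python
from math import comb


def fisher_exact_two_sided(a: int, b: int, c: int, d: int) -> float:
    """Two-sided Fisher's exact test p-value for the 2 x 2 table [[a, b], [c, d]]."""
    row1, row2 = a + b, c + d
    col1 = a + c
    total = row1 + row2
    denom = comb(total, col1)

    def prob(x: int) -> float:
        # Hypergeometric probability of x in the top-left cell
        # when all row and column totals are held fixed.
        return comb(row1, x) * comb(row2, col1 - x) / denom

    lowest = max(0, col1 - row2)
    highest = min(row1, col1)
    p_observed = prob(a)
    # Sum over every table that is no more probable than the observed one.
    return sum(
        p for p in map(prob, range(lowest, highest + 1))
        if p <= p_observed * (1 + 1e-9)
    )


# Rows: HCCF, HCCN. Columns: detected in outage, detected at start-up or later.
# The "later" column (3 HCCFs + 4 HCCNs = 7) and the total (11) follow the text;
# the outage column (2 and 2) is a HYPOTHETICAL split, not reported in the paper.
table = [[2, 3], [2, 4]]
p_value = fisher_exact_two_sided(table[0][0], table[0][1], table[1][0], table[1][1])
print(f"two-sided p = {p_value:.3f}")
```

With so few cases, almost any split of the counts gives a large p-value, which is in line with the remark that the data was very sparse.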
A more thorough analysis of the dependent human failures allowed further inference about their causes and means
of detection. Modifications are an important source with a
share of 50% (seven cases). Further, periodic testing and
alarms together detected 50% of the cases. However, different types of preventive actions also caused five cases. The
importance of modifications is problematic from the safety
point of view, because it is difficult to know which kinds of
hazards are induced by the new equipment and set-ups.
Nuclear utilities normally carry out extensive start-up testing programs for their new equipment. However, in many
cases either the test program was found not to be comprehensive enough, or the tests were not carried out thoroughly. Similarly, flaws were found in barriers like control
and adjustment, personnel training and work planning. In
the future, plant backfittings and modifications must be seen
as activities having an impact on many parts of the plant
and its organisation.
3.2.3. Safety significance of dependent human failures
A deeper analysis of the data reveals that the number of
dependent human failures (HCCFs and HCCNs) is equal or
higher in safety related (PSA or TechSpecs) systems than in
other systems. Seven out of 14 dependent faults were in
systems modelled in PSA FTs. The corresponding figure
for TechSpecs systems was nine out of 14. The small amount
of data did not allow for a statistical confirmation of this
finding, but it is analogous to the one obtained for single
failures.
The dependent failures in systems modelled in PSA fault
trees or mentioned in TechSpecs are detected slightly earlier than
in other systems. The situation is also somewhat better with regard
to preventive actions, since 46% of dependent failures were
detected by them. The slightly better detection efficiency was
not, however, proven statistically. A remarkable fraction of
dependent failures, as was the case with single failures originating
in outages, remains latent until power operation.
An assessment of the safety significance of dependent human
failures was also based on the importance measures
discussed under Section 3.1.3. To identify the corresponding CCF events in the plant PSA models, additional work
and judgement were required. Finally, five approximate
correspondents were found. The highest contribution to
the core damage risk was, according to FV importance,
due to a fourfold HCCF in seawater mussel filters
(≈1.4%) and, according to RAW importance, due to a manifold HCCF in the hydraulic scram system (≈1.2%). These two
cases are significant contributors to core damage frequency.
In the latter case, some uncertainty was left as to whether the human
failures really caused unavailability of several scram
groups. Similarly, the plant specific PSA model is a living
one, which leads to constant change in the results.
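For readers without Section 3.1.3 at hand, the following Python sketch shows how the two importance measures are commonly computed. It assumes the usual textbook forms, Fussell-Vesely (FV) as the fractional risk reduction when an event never occurs and risk achievement worth (RAW) as the risk ratio when the event is certain to occur; the paper may use a different normalisation, as its percentage figures suggest. The toy cut-set model, the event names and all probabilities are hypothetical and have nothing to do with the Olkiluoto PSA.

```python
def toy_risk(q: dict) -> float:
    """Rare-event approximation for a HYPOTHETICAL model with cut sets {A, B} and {C}."""
    return q["A"] * q["B"] + q["C"]


def fussell_vesely(q: dict, event: str) -> float:
    # Fraction of the baseline risk that disappears if the event never occurs.
    base = toy_risk(q)
    without_event = toy_risk({**q, event: 0.0})
    return (base - without_event) / base


def risk_achievement_worth(q: dict, event: str) -> float:
    # Factor by which the risk grows if the event is certain to occur.
    base = toy_risk(q)
    with_event = toy_risk({**q, event: 1.0})
    return with_event / base


# Illustrative basic event probabilities (assumed values, not plant data).
q = {"A": 1e-2, "B": 2e-2, "C": 1e-4}
fv_a = fussell_vesely(q, "A")
raw_a = risk_achievement_worth(q, "A")
print(f"FV(A) = {fv_a:.3f}, RAW(A) = {raw_a:.1f}")
```

The two measures answer different questions: FV ranks events by how much of the present risk they carry, whereas RAW ranks them by how much worse things would become if the event occurred with certainty. This is why different failures can top the two rankings.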
3.3. Human induced shared equipment faults
Altogether 11 shared equipment faults (HSEFs) were
identified based on 14 fault records and one internal utility
report. In HSEFs, single human actions cause multiple
consequences through system structure or component interactions. Examples of HSEFs are a short circuit in two
electrical trains due to deficiently installed jumpers and a
missing restoration work order (many restorations on the
same form). They are not HCCFs or HCCNs, since repeated
human failures were not involved. In eight HSEF cases, two
components were affected. Furthermore, in two cases three
components and in one case four components were affected.
Instrumentation was, again, a dominant equipment group.
Otherwise the findings (type of human failure, consequence,
time of origin, time of detection) resembled those of single
human failures. The result is not surprising, since HSEFs are
single human failures by their nature. The population of
HSEFs was too sparse to allow profound inference, and
it was not carried out in this study. Obviously more attention
should be paid, in the future, to potential HSEFs. This means
studying dependencies caused not only by system structure
but also by human actions.
4. Discussion and conclusions
A large amount of plant specific maintenance data was
used as the source material of this study. The data analysis,
and especially interviewing, required a lot of resources.
Plant maintenance records, however, offer the best database
for maintenance related human failures. Their wider utilisation in HRA work is recommended in the future.
The database used is also a source of some uncertainties.
Identifying human failures as sources of the fault records
and classifying them is not straightforward. The cause categories used in the plant maintenance records neither
directly addressed the faults in redundant components, nor
explicitly allowed for human failure classifications. It is
impossible to totally exclude subjective biases from the
results. In order to reduce them, several discussions were
carried out with the plant and regulatory body personnel.
In the light of the results, instrumentation, control and
electrical components are especially prone to human failures, partly due to the vulnerability and partly due to the
complexity of the equipment. Thus, more emphasis has to
be put on studying I&C and electrical components in safety
related systems. A number of human failures, stemming
from outages, remain undetected until power operation.
In that respect, single and dependent human failures show
similar behaviour, and more rigorous testing and verification are
suggested.

Many single human failures were related to lack of vigilance, whereas most dependent ones were related to
planning and co-operation gaps. The single human failures
led more frequently to equipment unavailability than to
wrong equipment functions. Wrong system functions
were frequent as a consequence of HCCFs, which may
be explained by the amount of I&C equipment.
Human reliability analyses of PSA studies often concentrate upon errors of omission (EoOs) and not on errors of
commission (EoCs). There is confusion in the discussion
about this topic, since one may mean by the acronym EoC
or EoO either the external human failure type or its consequences. There is no fixed mechanism that would lead from
an EoO to a system unavailability consequence only and from
an EoC to wrong system functions only. As shown by the
results of this study, as much as 68% of EoCs led to unavailability of equipment and some EoOs led to a wrong system
response. Thus, more analysis effort than just using the EoO and
EoC paradigm is required.
A high number of human failures take place in
safety related systems. Potential explanations for this
are the high number of scheduled activities in safety
systems and that the official fault reporting in non-safety systems does not work as well as in safety
systems. Electrical faults due to human failures tend
to concentrate in safety related systems, whereas
mechanical ones are rare in them.
The number of human failures in the maintenance data is
not insignificant, but especially the number of dependent
failures remained considerably low. Dependent human failures (HCCFs and HCCNs) and single ones show rather
similar behaviour with regard to many traits. Plant modifications appeared as a very important source of dependent
human failures. Thus, more extensive planning, co-ordination and testing of the modifications may be recommended.
Despite the number of human failures found, only a few
HCCFs turned out to be safety significant in a closer
study. When the human failures related to maintenance
are discussed, one should also remember that more safety
degradation would probably take place if no maintenance
were performed.
Acknowledgements
The author wishes to acknowledge Dr Kari Laakso
for the amount of work put into the screening analysis
of the material used in this study. The help of Dr Urho
Pulkkinen, Dr Lasse Reiman and other reviewers in
preparing the manuscript is also highly appreciated.
The financial support of the Finnish National Nuclear
Research Programmes RETU and FINNUS has been
vital for the work. Finally, the author wishes to warmly
thank the Olkiluoto NPP and STUK regulatory body
personnel who participated both in the data analysis and in
commenting on the manuscript.

Appendix A
HCCF and HCCN failures (Table A1). Type of equipment given in parentheses (IC = instrumentation and control, EL = electrical).
Table A1
HCCF
1. The trip limits lowered on wrong neutron flux trip conditions (IC). Plant units affected: one unit only.
2. Neutron flux trip limits left too low after valve self-closure test (IC). Plant units affected: one unit only.
3. Power cables cut to the supply pumps of the diesel fuel tanks (EL). Plant units affected: one unit only.
4. Difference pressure measurements crosswise connected in mussel filters (IC). Plant units affected: two units.
5. Couplings broken between actuators and control valves (IC). Plant units affected: one unit only.
6. The actuation times too long due to mineral oil impurities in the anchors of the solenoid valves (EL). Plant units affected: two units.
7. Simultaneous work in two subsystems of the auxiliary feedwater system during the refuelling outage of unit 2 (IC). Plant units affected: one unit + single failure on another.
8. Turning pieces of flow measurement devices mixed after cleaning (IC). Plant units affected: one unit + single failure on another.
HCCN
1. The temperature measurement values of the bearing pads of the turbine set too low (IC). Plant units affected: one unit only.
2. The protective coverings broken in the power supply cables of solenoid valves (EL). Plant units affected: one unit only.
3. Air left in instrument lines of the pressure difference measurements of the suction strainers. In addition unnecessary alarms (IC). Plant units affected: one unit only.
4. Wrong settings of the piston position indications of the operating oil pressure accumulators due to start-up problems (IC). Plant units affected: one unit only.
5. The signal lights of the operating oil pressure accumulators do not indicate due to wrong settings (IC). Plant units affected: one unit only.
6. The air pressure correction was lacking in the calibration method of the temperature monitoring limit switches (IC). Plant units affected: two units.

References
[1] Hirschberg S, editor. Dependencies, Human interactions and Uncertainties, nal report of NKS/RAS-470. NORD 1990:57 report. P. 2-12-65. ISBN 87 7303 454 1, 1990.
[2] Reiman L. Expert judgment in analysis of human and organizational
behaviour at nuclear power plants. Helsinki: Finnish Centre for
Radiation and Nuclear Safety. Thesis for the degree of Doctor of
Technology. STUK-A118 report, ISBN 951-712-012-5, 1994. 226 p.
[3] IAEA. Single human failures in nuclear power plants: a human factors
approach to the event analysis. Report of a consultants meeting,
limited distribution. IAEA-CS12/96, 1996. 61 p.
[4] Illman L, Isaksson J, Makkonen L, Vaurio JK, Vuorio U. Human
reliability analysis in Loviisa probabilistic safety assessment.
Proceedings of SRE Symposium '86, Espoo, October 1986. 12 p.
[5] IAEA. Procedures for conducting probabilistic safety assessments of
nuclear power plants (Level 1), Safety Series No. 50-P-4, IAEA,
Vienna, 1992.
[6] Swain AD, Guttmann HE, Handbook of Human Reliability Analysis
with Emphasis on Nuclear Power Plant Applications. NUREG/CR1278, Sandia National Laboratories, Albuquerque, USA, 1983. p. 554.
[7] Samanta PK, O'Brien JM, Morrison HW. Multiple sequential

[8]

[9]

[10]
[11]
[12]
[13]
[14]

One unit only


One unit only
Two units

failur model: evaluation of and procedures for human failure dependency. NUREG/CR-3637. Brookhaven National Laboratory. May
1985.
Vaurio J. Modelling and quantication of testing, maintenance and
calibration failures in system analysis and risk assessment. In: Schueller GI, Kafka P, editors. Safety and Reliability. Proceedings of
ESREL '99 conference, 1999. p. 6639.
Morris IE, Walker TG, Findlay CS, Cochrane EA. Control of maintenance errors. Safety and reliability, Proceedings of ESREL '98
Conference, Trondheim, Norway. Rotterdam: Balkema, 1998. p.
2815.
Siegel AI, Bartter WD, Wolf JJ, Knee HE, Haas HE, Haas PM.
Maintenance Personnel performance simulation (MAPPS) model,
Vol. 1. Summary Description. NUREG/CR-3626, 1984.
Laakso K, Pyy P, Reiman L. Human failures related to maintenance
and modications. STUK-YTO-TR 1998;139:42.
Conover WJ. Practical nonparametric statistics. New York: Wiley,
1971. p. 493.
Siegel S. Non-parametric statistics for the behavioral sciences. New
York: McGraw-Hill, 1956. p. 312.
SPSS. SigmaStat, Statistical Software Version 2.0. User's Manual.
ISBN:1-56827-149-2, 1997. p. 860.
