Benchmarking Industry Practices For The Use of Alarms As Safeguards and Layers of Protection

GCPS 2013 __________________________________________________________________________
Todd Stauffer, PE exida Consulting 64 N. Main Street, Sellersville, PA tstauffer@exida.com Dr. Peter Clarke, CFSE exida Asia Pacific Pte Ltd 51 Goldhill Plaza, #21-08/09, Singapore peter.clarke@exida.com
Copyright exida 2013, all rights reserved. Distributed by AIChE with permission of the authors.
Prepared for Presentation at American Institute of Chemical Engineers 2013 Spring Meeting, 9th Global Congress on Process Safety, San Antonio, Texas, April 28 - May 1, 2013.
UNPUBLISHED. AIChE shall not be responsible for statements or opinions contained in papers or printed in its publications.
Keywords: Alarm Management, ISA-18.2, Independent Protection Layers, Alarm Rationalization, Safety IPL Alarms, Operator PFD, Operator response to alarms, Safeguards, PHA, LOPA
Abstract
Operator response to alarms is a common risk reduction mechanism considered during layer of protection analysis (LOPA). Industry practices on how to treat alarms as independent protection layers can vary greatly. For example, some companies do not allow any credit to be taken for alarms during a LOPA (zero risk reduction), while others allow up to two orders of magnitude (risk reduction factor of 100, SIL 2) to be taken. This paper discusses current industry practices around the use of alarms as safeguards and layers of protection as established by a recent benchmark survey of over 200 safety practitioners from around the world. Areas explored in the survey include: typical and maximum claimed risk reduction, considerations used to determine whether an alarm can be credited with risk reduction, how often IPL alarms are determined to be invalid or ineffective in operation, and practices for display and annunciation through a Human-Machine Interface (HMI). Key results and conclusions are presented as well as recommendations on where industry should focus on improvement.
1. Introduction
Alarms and operator response to them are one of the first layers of protection in preventing a plant upset from escalating into a hazardous event. When alarms fail as a layer of protection, catastrophic accidents such as Milford Haven (UK), Texas City (USA), and Buncefield (UK) can be the result. At the Buncefield Oil Depot, a failure of a tank level sensor prevented its associated high level alarm from being annunciated to the operator. As the level in the tank reached its ultimate high level, a second protection layer, an independent safety switch, failed to trigger an alarm to notify the operator and failed to initiate a trip which would have automatically shut off the incoming flow. The tank overflow and ensuing fire resulted in a £1 billion (1.6 billion USD) loss [1].

Treatment of alarms used as safeguards and protection layers has become an increasingly important topic for companies and regulatory agencies alike. For example, OSHA's Refinery National Emphasis Program includes provision for citing a refinery if it claims an ineffective alarm as a safeguard or if the alarm design and implementation do not comply with RAGAGEP (Recognized and Generally Accepted Good Engineering Practice) [2]. The standard ANSI/ISA-18.2, Management of Alarm Systems for the Process Industries (ISA-18.2), provides guidance on how to design, engineer, implement and maintain an alarm system [3]. It is considered RAGAGEP by OSHA, so following its requirements and recommendations is critical for safety practitioners who want to use alarms as a layer of protection.

This paper documents the results from a survey that was conducted to benchmark the current practices used in industry for the management of safety-critical alarms (those that are used as safeguards and/or independent protection layers). The purpose of the paper is to allow companies to compare their own practices against industry benchmarks and best practices, as well as to highlight areas where companies can improve.
2. Survey Demographics
The survey was conducted over the period September 24th to October 5th, 2012. A total of 225 respondents participated in the survey, which consisted of a series of 26 questions. Relevant results are analyzed and presented for the three largest demographic groups described below in order to highlight differences based on region or industry.

Table 1. Survey Demographics
# | Region | % of Respondents | Industry | % of Respondents
1 | North America | 30% | Oil & Gas | 55%
2 | Europe | 25% | Chemical | 23%
3 | Asia Pacific | 18% | Engineering & Consulting | 10%
3. Process Hazard Analysis (PHA)

Process Hazard Analysis (PHA) is a required activity of the IEC 61511 standard on functional safety and the OSHA Process Safety Management (PSM) regulation [4, 5]. There are numerous different techniques that can be used to perform hazard analysis, including What-If, Checklist, Hazard and Operability Study (HAZOP), and Failure Modes & Effects Analysis (FMEA). The HAZOP technique is one of the most commonly used in the process industry [6]. Some of the survey questions are specific to the use of the HAZOP method while others are generic in nature.
3.1 Alarms Identified as Safeguards
Survey respondents answered the following question: Estimate the number of different alarms in your system that are typically identified as a Safeguard or Recommendation during the Process Hazards Analysis (PHA) process?

Figure 1. Number of Alarms that are Safeguards or Recommendations (response categories: None (0), <10, 11-50, 51-100, 101-500, >500; segment values as extracted: 24.6%, 21.6%, 15.8%, 25.1%, 7.6%, 1.8%)

Figure 2. Alarms as Safeguards by Industry (percentage of respondents, 0-100%, for Chemical, Engineering & Consulting, and Oil & Gas, broken down by the same response categories)

Figure 1 shows that the majority of respondents (>65%) indicated that they have more than 50 alarms in their system that were identified as safeguards / recommendations during a PHA.
Figure 2 shows that the number of alarms identified as safeguards varies considerably by industry. In oil & gas, 73% of the respondents identify more than 50 alarms as safeguards in their system, compared with only 55% in chemical. This can be partly attributed to the size of the respective systems; the most common system size for respondents in the chemical industry was 2,000-5,000 I/O, whereas it was 5,000-10,000 I/O for those in oil & gas.

3.2 Analysis of HAZOP Cause / Consequence Pairs
Survey respondents answered the following question: Estimate what percentage of cause / consequence pairs (in a Hazard and Operability Study) call for the use of an alarm as safeguard or recommendation?

Figure 3. Percent of Cause / Consequence Pairs that call for the Use of An Alarm (response categories: <5%, 5-15%, 16-25%, 26-50%, >50%; segment values as extracted: 23.8%, 19.4%, 20.0%, 13.8%, 23.1%)

Figure 4. Percent of Cause / Consequence Pairs By Industry (percentage of respondents, 0-100%, for Chemical, Engineering & Consulting, and Oil & Gas, broken down by the same response categories)
Figure 3 shows that the responses were relatively evenly distributed between the five choices. This indicates that there is significant variation in the percentage of HAZOP cause / consequence pairs that call for the use of an alarm. The same number of respondents answered < 5% as did 26-50%. One likely explanation for these results is variation in how the PHA / HAZOP process is carried out from company to company and in the rigor with which all potential safeguards are documented.

Figure 4 and Table 2 show that there is also significant variation by industry. At one end of the spectrum, 36% of the respondents in the chemical industry answered that <5% (a small minority) of cause / consequence pairs call for an alarm, compared to only 7% in engineering and consulting. At the opposite end of the spectrum, 33% of engineering and consulting respondents indicated that >50% (a majority) of cause / consequence pairs call for an alarm, versus only 3% for chemical.

Table 2. Disparity in Alarms as Percentage of Cause / Consequence Pairs
Industry | < 5% of pairs call for an alarm (small minority) | > 50% of pairs call for an alarm (majority)
Chemical | 36% | 3%
Engineering & Consulting | 7% | 33%
Oil & Gas | 20% | 15%
3.3 Steps to Ensure Alarms Identified in a PHA are Valid and Effective
Survey respondents answered the following question: When an alarm is identified as a safeguard or recommendation during a PHA, what steps are typically taken to ensure that it is a valid and effective alarm? Check all that apply.
Figure 5. Steps to Ensure Alarms are Valid and Effective
Step | % of Responses
Discuss / document the operator's response (action) to the alarm | 83.5%
Discuss / document whether the operator has sufficient time to respond | 70.6%
Define the basis for the alarm setpoint (limit) | 64.7%
Verify that the alarm is independent from the cause | 62.9%
Discuss / document operator training relative to the alarm | 52.4%
Verify the operator response does not place him / her in danger | 47.1%
Discuss / document alarm mechanical integrity requirements | 34.1%
None | 2.9%

Figure 5 documents the steps that are taken to ensure that an alarm which is identified as a safeguard or recommendation is valid and effective. Best practices such as those documented in ISA-18.2, by exida, and by the Center for Chemical Process Safety (CCPS) suggest that, at a minimum, the following activities should be performed for an alarm that is used as a safeguard:

- Discuss / document the operator's response (action) to the alarm. According to ISA-18.2, if the alarm does not require an operator action, then it should not be considered a valid alarm. During the rationalization process, each alarm is subjected to this review [3].
- Discuss / document whether the operator has sufficient time to respond. This is another criterion which is reviewed during the rationalization process. If an operator does not have sufficient time to respond to prevent the consequences, then the alarm will not be effective and should not be considered a safeguard [6, 7, 8].
- Verify that the alarm is independent from the cause. This must be true for the alarm to be considered a valid Independent Protection Layer, so it makes sense that it should also be applied to a safeguard when appropriate [6, 7, 8].
- Verify the operator response does not place him / her in danger. If the operator's response to the alarm places them in danger, then the alarm should not be considered a safeguard. The survey indicated that over half the respondents (52.9%) do not apply this criterion [8].
If the four criteria described above are accepted as best practice, then 100% of the respondents should have indicated that these steps are taken. Instead only 83.5%, 70.6%, 62.9% and 47.1%, respectively, indicated that they follow these best practices. Thus there is a gap between the practices actually used in industry and those that are recommended and accepted as best practices. When alarm management best practices are not applied upfront during the PHA / HAZOP process, it is more likely that some of the alarms identified as safeguards will prove to be invalid / ineffective during alarm rationalization or operation.
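The four minimum checks discussed above lend themselves to a simple screening step during the PHA. The following is a minimal Python sketch under stated assumptions: the record type `SafeguardAlarmReview`, its field names, and the tag `LAH-101` are hypothetical and are not taken from ISA-18.2 or from the survey; only the four criteria themselves come from the text.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class SafeguardAlarmReview:
    """PHA review answers for one alarm proposed as a safeguard (hypothetical record)."""
    tag: str
    operator_action_documented: bool
    sufficient_time_to_respond: bool
    independent_of_cause: bool
    response_safe_for_operator: bool


def unmet_criteria(review: SafeguardAlarmReview) -> List[str]:
    """Return the names of the four minimum criteria that the alarm does not meet."""
    checks = {
        "operator action documented": review.operator_action_documented,
        "sufficient time to respond": review.sufficient_time_to_respond,
        "independent of cause": review.independent_of_cause,
        "response safe for operator": review.response_safe_for_operator,
    }
    return [name for name, passed in checks.items() if not passed]


def is_valid_safeguard(review: SafeguardAlarmReview) -> bool:
    """An alarm passes the screen only if all four criteria are met."""
    return not unmet_criteria(review)


# Hypothetical example: a high level alarm driven by the same transmitter
# whose failure is the initiating cause, so it is not independent.
example = SafeguardAlarmReview("LAH-101", True, True, False, True)
print(example.tag, unmet_criteria(example), is_valid_safeguard(example))
```

A screen of this kind only records the answers reached by the PHA team; it does not replace the discussion that produces them.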
Figure 6. Steps to Ensure Alarms are Valid and Effective By Region
Figure 7. Steps to Ensure Alarms are Valid and Effective By Industry
Figures 6 and 7 present the results based on region and industry. In Figure 6 the percentage of North American respondents who indicated that they discussed mechanical integrity (MI) requirements (45%) was significantly higher than in Europe (28%) and Asia Pacific (38%). This likely reflects the strength of OSHA in the US in driving compliance with its Process Safety Management (PSM) regulation 1910.119, which includes requirements for the creation of a mechanical integrity program (a management system assuring equipment is inspected, maintained, tested and operated in a safe manner) [5, 7]. Of interest in Figure 7 is that engineering & consulting and chemical had higher response scores (greater compliance with best practices) than oil & gas for all categories except one. This indicates that the understanding, acceptance and adoption of best practices may be higher in these sectors than in oil & gas.

3.4 Treatment of PHA Results
Survey respondents answered the following question: After the PHA or HAZOP has been completed, what is done with the requirements for alarms identified as safeguards or recommendations? Check all that apply.
Figure 8. Treatment of PHA Results (response percent)
Response options:
- They are available for review during alarm rationalization and design (59.3%)
- Management of Change (MOC) process is initiated
- They are transferred automatically to a Master Alarm Database so that they are available during alarm rationalization and design
- They are extracted manually by reviewing all available PHA reports
- They are automatically extracted into a spreadsheet
- None
Other response percentages as extracted: 51.5%, 42.5%, 27.5%, 18.6%, 5.4%
The survey indicates that only 59.3% of respondents make the PHA results available during alarm rationalization. This figure should be 100%, as alarm rationalization revisits some of the topics that are covered during the PHA (establishing likely causes and consequences). Making the PHA results available improves the efficiency of alarm rationalization and helps ensure consistency between the alarm design and the PHA.
4. Layer of Protection Analysis (LOPA)

Layer of Protection Analysis is one of the most commonly used techniques for risk analysis. It is a method of analyzing the likelihood (frequency) of a harmful outcome event based on initiating event frequency and on the probability of failure of a series of independent protection layers capable of preventing the harmful outcome [6]. The primary goal of a LOPA is to determine if there are adequate protective devices or features in the process to produce a tolerable risk level. These protective devices or features are called Protection Layers or Independent Protection Layers (IPLs). Examples of potential protection layers include the mechanical integrity of a vessel, control loops and trips within the basic process control system (BPCS), operator intervention, a safety instrumented function, and physical relief devices.

It is important to note the difference between a safeguard and a layer of protection. A safeguard is any device, system or action that would likely interrupt the chain of events following an initiating event. The benefit of some safeguards may not be easily quantified because of a lack of data, or because of uncertainty about whether they meet specific criteria such as independence, effectiveness, and auditability. An independent protection layer is a safeguard whose effectiveness can be quantified and which meets well-defined criteria. All IPLs are safeguards, but not all safeguards are IPLs [8].

4.1 Origin of Alarms Identified in a LOPA
Survey respondents answered the following question: What percentage of the alarms that are considered during a Layer of Protection Analysis (LOPA) were identified during a PHA?
Figure 9. Percentage of LOPA Alarms Originating during a PHA
Response | % of Respondents
All (approximately 100%) | 12.4%
75 - 99% | 22.6%
50 - 74% | 17.5%
<50% | 33.6%

After the process hazards analysis has been completed, the results and recommendations are reviewed to determine which scenarios require further analysis to determine if there are adequate layers of protection, or if safety instrumented functions (SIF) will be needed to properly manage the risk. One industry reference defines a safeguard as a potential protection layer that has yet to be evaluated in a LOPA to determine effectiveness and independence [6]. Thus it would be expected that ideally 100% of the alarms that are considered in a layer of protection analysis would have first been identified as a safeguard / recommendation in the PHA.

Figure 9 shows that this is far from the case in practice. Only 12.4% of the respondents indicated that all (100%) of the alarms in the LOPA had come from the PHA. Furthermore, 33.6% indicated that fewer than 50% of the LOPA alarms had been identified during the PHA. It is certainly possible that the LOPA may legitimately identify some alarms that were not considered during the PHA. Allowing for this, one could consider the 75-99% response as acceptable. This leaves 51% of the respondents who appear to frequently identify new alarms during LOPA that were missed during the PHA. This would seem to indicate that poor PHA practices are being used. Failing to identify alarms during a PHA could signal various issues, such as a lack of thoroughness, a failure to document all safeguards in order to save time, or a lack of understanding of the process.
Figure 10. Percentage of LOPA Alarms Originating during a PHA By Region (percentage of respondents, 0-100%, for North America, Europe, and Asia Pacific, broken down by response: <50%, 50 - 74%, 75 - 99%, All (approximately 100%))

The results in Figure 10 demonstrate a significant disparity in the quality of the PHAs being conducted based on region. The percentage of poor PHA applications ranged from 41% in North America to 79% in Asia Pacific (where poor is defined as respondents who answered <50% or 50-74%). Note that the variation was much less significant when analyzed by industry.

4.2 Typical Risk Reduction for a Safety IPL Alarm
Survey respondents answered the following question: What level of risk reduction (RRF) do you typically take for a Safety IPL alarm?

The effectiveness of an independent protection layer is typically characterized by assigning a probability of failure on demand (PFD), which is defined as the probability that it will fail to perform a specified function when called upon [8]. The risk reduction factor (RRF), which is a measure of how much a protective function reduces the frequency of the hazardous event, is the inverse of PFD [7].

RRF = 1 / PFD    [Eq. 1]
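The inverse relationship between RRF and PFD can be sketched in a few lines of Python. The function names and the input checks below are illustrative assumptions, not part of any standard; the only relationship taken from the text is RRF = 1 / PFD.

```python
def rrf_from_pfd(pfd: float) -> float:
    """Risk reduction factor for a given probability of failure on demand."""
    if not 0.0 < pfd <= 1.0:
        raise ValueError("PFD must be in the range (0, 1]")
    return 1.0 / pfd


def pfd_from_rrf(rrf: float) -> float:
    """Probability of failure on demand for a given risk reduction factor."""
    if rrf < 1.0:
        raise ValueError("RRF must be at least 1 (1 means no risk reduction)")
    return 1.0 / rrf


# An alarm credited with RRF = 10 is assumed to fail on 1 demand in 10.
print(pfd_from_rrf(10.0))
# A layer that always fails (PFD = 1.0) provides no risk reduction.
print(rrf_from_pfd(1.0))
```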
Figure 11. Typical Level of Risk Reduction for a Safety IPL Alarm (response categories: 1.0 (no risk reduction), Up to 2.0, 2.0 - 9.9, 10.0, >10.0; segment values as extracted: 43.0%, 20.0%, 14.8%, 10.4%, 3.0%)

The level of risk reduction that can be taken for a Safety IPL Alarm (an alarm used as an independent protection layer) is an area of debate in the safety community. The debate arises from the significant disparity that exists from plant to plant, unit to unit and person to person in the ability of an operator to prevent a hazardous situation from developing into an accident.

Figure 11 shows that the most common risk reduction factor taken for a Safety IPL Alarm is 10.0 (43% of the respondents). This corresponds to the risk reduction that is most commonly cited in the literature [4, 8]. Table 3 shows the correspondence between RRF, PFD, and Safety Integrity Level (SIL). It should also be noted that 10% of the respondents claim no risk reduction for a Safety IPL alarm, while 20% claim an RRF between 2.0 and 9.9.

Table 3. Correspondence between RRF, PFD and SIL [4]
Risk Reduction Factor (RRF) | Probability of Failure on Demand (PFDavg) | Safety Integrity Level (SIL) potentially achievable
≤ 10 | 10^-1 to 10^0 | SIL 0
> 10 to 100 | 10^-2 to < 10^-1 | SIL 1
> 100 to 1,000 | 10^-3 to < 10^-2 | SIL 2
> 1,000 to 10,000 | 10^-4 to < 10^-3 | SIL 3
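The bands in Table 3 can be expressed as a small lookup. The sketch below is an illustration only: the function name `sil_band` is hypothetical, and the boundary handling follows the PFDavg ranges as they appear in Table 3 (lower bound inclusive, upper bound exclusive, except for the SIL 0 band, which includes 1.0).

```python
def sil_band(pfd_avg: float) -> int:
    """Return the SIL band (0 to 3) for an average PFD, following Table 3."""
    if not 1e-4 <= pfd_avg <= 1.0:
        raise ValueError("PFDavg is outside the range covered by Table 3")
    if pfd_avg >= 1e-1:
        return 0  # 10^-1 to 10^0
    if pfd_avg >= 1e-2:
        return 1  # 10^-2 to < 10^-1
    if pfd_avg >= 1e-3:
        return 2  # 10^-3 to < 10^-2
    return 3      # 10^-4 to < 10^-3


# The most commonly claimed credit for a Safety IPL alarm (RRF 10, PFD 0.1)
# sits at the boundary of the SIL 0 band.
print(sil_band(0.1))
```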
Figure 12. Typical Level of Risk Reduction for a Safety IPL Alarm By Region (percentage of respondents, 0-100%, for North America, Europe, and Asia Pacific, broken down by RRF: >10.0, 10.0, 2.0 - 9.9, Up to 2.0, 1.0 (no risk reduction))

Figure 13. Typical Level of Risk Reduction for a Safety IPL Alarm By Industry (percentage of respondents, 0-100%, for Chemical, Engineering & Consulting, and Oil & Gas, broken down by the same RRF categories)
Figures 12 and 13 show that the level of risk reduction varies considerably by region and by industry. For example, in North America a clear majority (72%) use an RRF of 10, while in Asia Pacific only 30% use an RRF of 10. In Asia Pacific, a large percentage of the respondents are either very conservative (26% claim no risk reduction) or very aggressive (13% claim a risk reduction greater than 10). It is also interesting to note that numerous respondents in the Engineering & Consulting sector claimed to use RRFs that are not powers of 10 (i.e. in the 2.0 - 9.9 range). This suggests that quantitative LOPA techniques, which can make use of such RRF values, may be used frequently within this sector.

Risk reduction factors greater than 10.0 (PFD < 0.1) should be used sparingly, if ever, for Safety IPL alarms. As shown in Table 4, there are very few situations when it would be appropriate to use such a value. When it is believed to be appropriate, it is necessary to document a sound technical basis for that conclusion.

Table 4. Simplified Technique for Estimating Operator Response [6]
Category: Normal Operator Response
Probability that operator responds successfully: 90% | PFD: 0.1 | RRF: 10
Description: In order for an operator to respond normally to a dangerous situation, the following criteria should be true:
- Ample indications exist that there is a condition requiring a shutdown
- Operator has been trained in proper response
- Operator has ample time (> 20 minutes) to perform the shutdown
- Operator is ALWAYS monitoring the process (relieved for breaks)

Category: Drilled Response
Probability that operator responds successfully: 99% | PFD: 0.01 | RRF: 100
Description: All of the conditions for a normal operator intervention are satisfied and a drilled response program is in place at the facility. Drilled response exists when written procedures, which are strictly followed, are drilled or repeatedly trained by the operations staff. The drilled set of shutdowns forms a small fraction of all alarms where response is so highly practiced that its implementation is automatic. This condition is RARELY achieved in most process plants.

Category: Response Unlikely / Unreliable
Probability that operator responds successfully: 0% | PFD: 1.0
Description: NOT ALL of the conditions for a normal operator intervention have been satisfied.
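The decision logic of Table 4 can be sketched as follows. This is a minimal illustration, assuming that each condition can be answered as a simple yes / no; the class `OperatorResponseConditions`, its field names, and the function name are hypothetical, while the three PFD values and the 20 minute threshold are those listed in Table 4.

```python
from dataclasses import dataclass


@dataclass
class OperatorResponseConditions:
    """Answers to the Table 4 conditions for one alarm (hypothetical record)."""
    ample_indication: bool
    operator_trained: bool
    minutes_available: float
    always_monitoring: bool
    drilled_response_program: bool = False


def operator_response_pfd(c: OperatorResponseConditions) -> float:
    """Return the PFD suggested by Table 4 for the given conditions."""
    normal_conditions_met = (
        c.ample_indication
        and c.operator_trained
        and c.minutes_available > 20
        and c.always_monitoring
    )
    if not normal_conditions_met:
        return 1.0   # Response Unlikely / Unreliable: no credit
    if c.drilled_response_program:
        return 0.01  # Drilled Response (rarely achieved in practice)
    return 0.1       # Normal Operator Response


# Hypothetical alarm with only 10 minutes available to respond: no credit.
short_time = OperatorResponseConditions(True, True, 10.0, True)
print(operator_response_pfd(short_time))
```

Writing the conditions out this way makes it explicit that a single unmet condition removes the credit entirely, rather than reducing it.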
Some alarm management practitioners have proposed that even a risk reduction factor of 10 should not be applied blindly without ensuring that specific alarm management requirements are / will be met, such as the following:
- The alarm system must be rationalized.
- Alarm system performance must be measured and proven to be adequate (based on industry-accepted KPIs) [9].

4.3 Maximum Risk Reduction for a Safety IPL Alarm
Survey respondents answered the following question: In your experience, what is the maximum level of risk reduction (RRF) that has been taken for a Safety IPL alarm?
Figure 14. Maximum Risk Reduction for a Safety IPL Alarm (response categories: 1.0 (no risk reduction), Up to 2.0, 2.0 - 9.9, 10.0, 100.0, >100.0; segment values as extracted: 48.1%, 8.1%, 11.9%, 10.4%, 10.4%, 2.2%)

Figure 14 shows that a risk reduction factor of 10 was again the most popular response (48%). It is interesting to note that the percentage of respondents who indicated that 10.0 was the maximum risk reduction taken (48%) was slightly greater than the percentage who indicated that 10.0 was the typical value taken (43%) in the previous question. It is also of interest that 12.6% of the respondents indicated a maximum RRF of 100.0 or greater.
4.4 Considerations for Determining When an Alarm Can be Credited with Risk Reduction
Survey respondents answered the following question: What considerations are used to determine whether an alarm can be credited with risk reduction? Check all that apply.

Figure 15. Considerations for Determining When an Alarm can be Credited with Risk Reduction
Response options:
- The operators have been trained on the causes, potential consequences, and corrective actions for the alarm
- The alarm is completely independent from the cause of the upset
- The alarm is auditable (proof tested at appropriate frequency)
- The alarm is specifically designed to prevent the consequences under consideration by the operator
- The alarm is dependable (based on calculating the Probability of Failure on Demand for the annunciation of the alarm and successful ...
- There is not more than one alarm credited with risk reduction per layer of protection
- Alarm system performance (# of alarms / per hour, nuisance alarms, alarm floods) is measured and determined to be acceptable
- All alarms in the system (safety and non-safety) have been rationalized
Response percentages as extracted: 73.3%, 67.9%, 63.4%, 48.9%, 42.7%, 38.9%, 59.5%, 32.1%
The general criteria for determining when a safeguard can be considered an IPL are well established in the literature and include the following:

Table 5. Survey of General Criteria Used to Determine when a Safeguard can be used as an IPL
Layer of Protection Analysis [8]: Independent; Auditable; Effective
Practical SIL Target Selection [6]: Independent; Auditable; Dependable; Specific
Guidelines for Safe & Reliable Instrumented Protective Systems [7]: Independence; Auditability; Reliability, Integrity; Functionality; Access Security; Management of Change
The generic criteria above have been used to create specific considerations that should be taken into account to ensure that an alarm can be credited with risk reduction (represented in Figure 15). A more detailed discussion about criteria applied to alarms can be found in Appendix A.

The presence of nuisance alarms, which are alarms that annunciate excessively, unnecessarily, or do not return to normal after the correct response is taken, can interfere with the operator's ability to detect and respond to safety IPL alarms. Standing alarms (lasting > 24 hours) and chattering alarms (points that go needlessly in and out of alarm on a frequent basis) are nuisance alarms that clutter the operator's display, making it more difficult to detect a new alarm and increasing the chances that a critical alarm is missed.

Alarm rationalization, which is the process of reviewing potential or existing alarms to justify that they meet the criteria for being an alarm, is a technique for ensuring the integrity of the alarm system and eliminating problems such as nuisance alarms, alarm overload and alarm floods. It includes defining and documenting the design attributes (such as priority, limit, type and classification) as well as the cause, consequence, time to respond, and recommended operator response.

Since all of the criteria shown in Figure 15 have been cited as recommended best practices in the literature, it can be concluded that a large portion of safety practitioners are NOT following industry recommended practices (otherwise the scores would be close to 100% for each consideration).

4.5 Invalid & Ineffective Safety IPL Alarms
Survey respondents answered the following question: How often do you find that an alarm identified as an IPL is not valid, or is ineffective (does not provide the level of risk reduction expected)?
Figure 16. Frequency of Ineffective Safety IPL Alarms (response categories: Never (0% of the time); Infrequently (< 1% of the Safety IPL Alarms); Sometimes (between 1 to 5% of the Safety IPL Alarms); Frequently (> 5% of the Safety IPL Alarms); Unknown; segment values as extracted: 38.9%, 26.0%, 17.6%, 14.5%, 4.6%)

Figure 16 shows how often a Safety IPL Alarm is found to be ineffective at providing the expected level of risk reduction. Of the respondents, 65% indicated that they sometimes or frequently find that an alarm is an ineffective IPL. This could create a situation where the actual risk reduction no longer meets or exceeds what is needed to reach the company-defined tolerable risk level.
Figure 17. Risk Reduction through the use of multiple protection layers [10]

Figure 17 illustrates how the loss of risk reduction from an ineffective IPL alarm can have a ripple effect on the requirements for other layers of protection, such as a safety instrumented function in a Safety Instrumented System (SIS). The higher the SIL, the more complicated and expensive the SIS becomes. A higher SIL may also require more frequent proof testing, which adds cost and can be burdensome in many plants [11].
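The ripple effect can be illustrated with simple LOPA arithmetic. The sketch below assumes the common simplification that the mitigated event frequency equals the initiating event frequency divided by the product of the RRFs of the credited layers; the function name and every numeric value are hypothetical and were chosen only for illustration.

```python
import math
from typing import Sequence


def required_sif_rrf(
    initiating_freq_per_yr: float,
    tolerable_freq_per_yr: float,
    other_ipl_rrfs: Sequence[float],
) -> float:
    """RRF the SIF must supply once the other IPLs have been credited."""
    credited = math.prod(other_ipl_rrfs)  # combined RRF of the non-SIF layers
    needed = initiating_freq_per_yr / (tolerable_freq_per_yr * credited)
    return max(1.0, needed)  # below 1.0 no SIF is needed


# Hypothetical scenario: initiating event 0.1 / yr, tolerable 1e-4 / yr.
with_alarm = required_sif_rrf(0.1, 1e-4, [10.0])    # alarm credited as RRF 10
without_alarm = required_sif_rrf(0.1, 1e-4, [1.0])  # alarm found ineffective
print(with_alarm, without_alarm)
```

With these illustrative numbers, crediting the alarm leaves the SIF needing an RRF of about 100, while losing the alarm credit raises the requirement to about 1,000, which is one order of magnitude more risk reduction for the SIF to provide.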
One could surmise that the frequency of ineffective Safety IPL alarms is partly caused by the gap in following best practices illustrated by Figure 15. A detailed discussion of failure modes of Safety IPL alarms is the subject of another paper [9].

4.6 Prioritizing Safety IPL Alarms
Survey respondents answered the following question: What statement best describes how the priority of Safety IPL alarms is assigned?

Figure 18. Methodology for Prioritizing Safety IPL Alarms
Response | % of Respondents
Based on company defined risk matrix, taking into consideration consequence to economic, safety, environmental and public image aspects | 30.2%
Based on the ultimate consequence defined in the HAZOP / PHA | 22.5%
Automatically set to the highest priority allowed in the system (e.g. Critical, Emergency, etc) | 21.7%
Based on the direct & immediate consequence (assuming all other layers of protection operate as expected) and the amount of time available for the operator to respond | 17.1%
Not Applicable | 4.7%
Based on the assumption that the associated SIF and other associated IPLs fail | 3.9%

Alarm priority represents the importance assigned to an alarm within the alarm system to indicate the urgency of response. It helps the operator to know which alarm to respond to first. Alarm priority is typically determined based on the severity of the potential consequences (in areas such as personnel safety, equipment damage, environmental impact, and economic loss) and the time available to respond, as shown in Table 6. Analysis of the severity of consequences is an activity that is common within the safety lifecycle. For a safety IPL alarm it is important to work with the direct (proximate) consequences and not the ultimate consequences which could occur after a series of failures [12, 13].
Table 6. Example Alarm Priority Matrix
Figure 18 provides a view into how Safety IPL alarms are prioritized. As shown by Table 7, 48% of the respondents indicated that they use prioritization criteria that do not follow alarm management best practices.
Prioritization Criteria | % | Compliance with Best Practices
Based on company defined risk matrix, taking into consideration consequence to economic, safety, environmental and public image aspects | 30.2% | YES
Based on the ultimate consequence defined in the HAZOP / PHA | 22.5% | NO
Automatically set to the highest priority allowed in the system (e.g. Critical, Emergency, etc.) | 21.7% | NO
Based on the direct & immediate consequence (assuming all other layers of protection operate as expected) and the amount of time available for the operator to respond | 17.1% | YES
Based on the assumption that the associated SIF and other associated IPLs fail | 3.9% | NO
Table 7. Alarm Prioritization Results and Compliance with Best Practices
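The prioritization approach described in this section (severity of the direct consequence combined with the time available to respond) can be sketched as a simple lookup. This is an illustrative example only: the severity classes, time bands, and matrix entries below are hypothetical and are not the contents of Table 6; a real matrix would come from the site's alarm philosophy.

```python
from enum import IntEnum


class Severity(IntEnum):
    """Severity of the DIRECT (proximate) consequence, not the ultimate one."""
    MINOR = 0
    MAJOR = 1
    SEVERE = 2


# Hypothetical matrix: rows are time-to-respond bands, columns are severities.
PRIORITY_MATRIX = [
    # MINOR     MAJOR     SEVERE
    ["low",    "low",    "medium"],  # more than 30 minutes available
    ["low",    "medium", "high"],    # 10 to 30 minutes available
    ["medium", "high",   "high"],    # less than 10 minutes available
]


def time_band(minutes_available: float) -> int:
    """Return the matrix row for the time available to respond (assumed bands)."""
    if minutes_available > 30:
        return 0
    if minutes_available >= 10:
        return 1
    return 2


def alarm_priority(direct_severity: Severity, minutes_available: float) -> str:
    """Look up the priority from direct severity and time available."""
    return PRIORITY_MATRIX[time_band(minutes_available)][direct_severity]


print(alarm_priority(Severity.MAJOR, 20))
print(alarm_priority(Severity.SEVERE, 5))
```

Note that the function takes the direct consequence as its input. Feeding it the ultimate HAZOP / PHA consequence, or hard-coding the highest priority, corresponds to the practices marked NO in Table 7.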
5. Human Machine Interface (HMI) Practices for Safety IPL Alarms

Safety IPL alarms are communicated to the operator through the Human Machine Interface (HMI). Once the alarm is annunciated, a series of steps must be performed by the operator to prevent escalation of the hazardous scenario and bring the process back to the normal operating range (reference) as shown in Figure 19.
Figure 19. Feedback Model of Operator Process Interaction [3]
For a successful outcome, the operator must proceed quickly through three stages of activity: a) the deviation from desired normal operation is detected, b) the situation is diagnosed and the corrective action determined, c) the action is implemented to compensate for the disturbance. The operator also continues to monitor the measurement as it returns to normal. A well designed HMI should support situation awareness and ensure that the operator is able to quickly and repeatably detect, diagnose, and respond within the operator response time. Operator response time represents the time from the activation of the alarm until the last moment the operator action will prevent the consequence (i.e., time available) [3]. Poor graphics, including alarm depiction deficiencies, have been identified as contributing factors to several major industrial accidents (such as Buncefield).
5.1 Display of Safety IPL Alarms
Survey respondents answered the following question: What statement(s) best describes your current practice for display of Safety IPL alarms? Check all that apply.
- They are annunciated through the same HMI as the BPCS: 64.1%
- They are annunciated through hardwired light boxes or panel boards: 31.3%
- They are annunciated through light boxes or panel boards and the same HMI as the BPCS: 21.4%
- They are annunciated through dedicated HMIs: 20.6%
- They are part of a standalone system: 18.3%
Figure 20. Display of Safety IPL Alarms
Safety IPL alarms and information can be presented to the operator in a number of different ways, including:
- Graphic displays on the basic process control system (BPCS) operator interface,
- Dedicated graphic displays on stand-alone video display units,
- Panel mounted graphic displays, and
- Panel mounted annunciators.
Figure 20 illustrates that a variety of architectures are used for the display of safety IPL alarms, with the most popular (64.1%) being annunciation through the same HMI as the BPCS. Selected recommended best practices for display design include the following:
- Lightbox alarms, which provide an independent alarm display that can typically be seen by multiple operators within the control room, should be replicated in the BPCS interface for acknowledgement and logging purposes.
- Lightbox annunciators should be located close to the operator's workstation or work areas so that they are visible from all locations where their information would be considered important [7].
- Graphic displays should be designed to maximize operator situation awareness and "pattern recognition" to aid in operator response.
- Graphic displays should be designed so the visibility of information is related to its operational importance; background information should be given low visibility, normal plant measurements medium visibility, and abnormal conditions (values and states) the highest visibility.
- Alarm state indications should represent the presence of an alarm using not only color but also symbols, patterns and/or text (8-12% of the male population is color blind).
- Alarm colors should be reserved for alarms only and not used for other functions within the HMI (such as process piping or equipment status).
- Alarm color coding should reflect the priority of the alarm.
5.2 Alarm Response Procedures
Survey respondents answered the following question: Do you provide Alarm Response Procedures to the operator for safety IPL alarms? If Yes, please indicate the format:
Alarm Response Procedures for Safety IPL alarms | %
(Yes) - Provided | 74%
(No) - Not Provided | 26%
Format | %
Paper manuals | 54%
On screen display called up in context within the HMI | 28%
Call up files or displays on a dedicated computer (other than the HMI) | 18%
Table 8. Use of Alarm Response Procedures for Safety IPL Alarms
Alarm response procedures typically include the following information [9, 11]:
- Likely cause(s) of the alarm
- Potential consequences of inaction
- Corrective action that is required by the operator to prevent the consequence
- Time available to respond
- Confirmation / Verification of the alarm condition
As shown in Table 8, 26% of the respondents do not provide alarm response procedures to help the operator respond to Safety IPL Alarms. This is inconsistent with the practices that should be followed to ensure that the operator response is effective and dependable [6, 7, 8]. Of those that do provide alarm response procedures, 54% indicated that they are provided in paper format. Printed (paper) manuals can be ineffective if they are not within immediate reach of the operator, are not kept up to date, or require significant time for the operator to locate the relevant procedure. The ability to display the alarm response procedures in context within the HMI, which was selected by 28% of the respondents, is the most effective format and should be considered a best practice [9, 11]. Best practices assert that operator response integrity can be improved by displaying operator action on request [7].
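The typical contents of an alarm response procedure listed above map naturally onto a small record that an HMI could call up in context. The following is a minimal sketch under stated assumptions: the class, field names, tag, and example values are all hypothetical, and the example scenario borrows the compressor suction drum high level alarm discussed in Appendix A.2.

```python
from dataclasses import dataclass, field, fields


@dataclass
class AlarmResponseProcedure:
    """One record per Safety IPL alarm (hypothetical structure)."""
    tag: str
    likely_causes: list = field(default_factory=list)
    consequence_of_inaction: str = ""
    corrective_actions: list = field(default_factory=list)
    time_available_min: float = 0.0
    verification: str = ""


def missing_items(proc: AlarmResponseProcedure) -> list:
    """Names of the fields that are still empty, in declaration order."""
    return [f.name for f in fields(proc) if not getattr(proc, f.name)]


# Illustrative values only; not taken from any real plant or from the survey.
drum_high_level = AlarmResponseProcedure(
    tag="LAH-101",
    likely_causes=["Liquid carryover from upstream", "Drain valve closed"],
    consequence_of_inaction="Liquid enters the compressor",
    corrective_actions=["Confirm level", "Send field operator to drain the drum"],
    time_available_min=20.0,
    verification="Compare against the independent level gauge",
)

draft = AlarmResponseProcedure(tag="PAH-202", likely_causes=["Blocked outlet"])

print(missing_items(drum_high_level))
print(missing_items(draft))
```

A completeness check of this kind could be run during alarm rationalization so that no Safety IPL alarm is commissioned with an incomplete procedure.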
6. Conclusion
Operator response to alarms can be used to reduce risk as a safeguard or as an independent protection layer. Survey results indicate that there is significant variation in the practices employed within industry for the management of safety-critical alarms. In some cases these variations are more significant when analyzing based on industry or region. Analysis of survey results also revealed that there is significant room for improvement when it comes to the adoption of, and compliance with, industry best practices. In particular the following areas were identified:
- Improving the rigor and thoroughness of PHAs so that, for example, all alarm safeguards are identified and documented
- Verifying that an alarm identified as a safeguard or recommendation is likely to be valid and effective
- Ensuring that alarms credited with risk reduction meet the criteria established for them to be independent protection layers as cited in industry best practices [6, 7, 8]
- Understanding the implications and guidelines for assigning a risk reduction factor or probability of failure on demand to a Safety IPL alarm
- Prioritizing safety IPL alarms based on the ISA-18.2 standard and alarm management best practices
- Providing operators with alarm response procedures for Safety IPL alarms in context and within the HMI
Safety practitioners are encouraged to compare their own practices against the benchmark survey results and the best practices cited in this paper. This should highlight areas of improvement that can help improve the safety of the people and the processes they work with. It is also recommended that safety practitioners increase their knowledge of alarm management best practices such as those in ISA-18.2.
7. References
[1] The Buncefield Incident; The final report of the Major Incident Investigation Board, Volume 2, Crown publishing, United Kingdom, (2008).
[2] Occupational Health and Safety Administration (OSHA), Petroleum Refinery Process Safety Management National Emphasis Program, Directive CPL-03-00-010, Washington, DC, (2009).
[3] ANSI/ISA 18.00.02-2009 Management of Alarm Systems for the Process Industries.
[4] ANSI/ISA-84.00.01-2004 Part 1 (IEC 61511-1 Mod) Functional Safety: Safety Instrumented Systems for the Process Industry Sector.
[5] OSHA, Process safety management of highly hazardous chemicals, 29 CFR 1910.119, Washington, DC, (1992).
[6] Hartmann, H., Scharpf, E., and Thomas, H., Practical SIL Target Selection: Risk Analysis per the IEC 61511 Safety Lifecycle, exida, Sellersville, PA, (2012).
[7] CCPS, Guidelines for Safe and Reliable Instrumented Protective Systems, Center for Chemical Process Safety, New York, NY, (2007).
[8] CCPS, Layer of Protection Analysis: Simplified Process Risk Assessment, Center for Chemical Process Safety, New York, NY, (2001).
[9] Stauffer, T. and Clarke, P., Using Alarms as a Layer of Protection, AIChE 8th Global Congress on Process Safety, Houston, TX, (2012).
[10] Hatch, D. and Stauffer, T., Operators on Alert: Operator response, alarm standards, protection layers keys to safe plants, Intech, (September 2009).
[11] Stauffer, T., Making the Most of Alarms as a Layer of Protection, Safety Control Systems Conference, IDC Technologies, (May 2010).
[12] Stauffer, T., Sands, N., and Dunn, D., Get a Life(cycle)! Connecting Alarm Management and Safety Instrumented Systems, ISA Safety & Security Symposium, (2010).
[13] Hollifield, B. and Habibi, E., Alarm Management: A Comprehensive Guide (2nd Edition), ISA, Research Triangle Park, NC, (2011).
Additional references not cited:
[14] EEMUA 191, Alarm Systems: A Guide to Design, Management and Procurement, Edition 2, The Engineering Equipment and Materials Users Association, (2007).
[15] Nimmo, I., The Operator as IPL, Hydrocarbon Engineering, (September 2005).
[16] Stauffer, T., Sands, N., and Dunn, D., Alarm Management and ISA-18: A Journey, Not a Destination, Texas A&M Instrumentation Symposium, (2010).
[17] Suttinger, L. and Sossman, C., Operator Action within a Safety Instrumented Function, WSRC-MS-2002-00091, (2002).
[18] The Explosion and Fires at the Texaco Refinery, Milford Haven, 24 July 1994, HSE Books, Sudbury, U.K., (1995).
[19] BP America Refinery Explosion, U.S. Chemical Safety Board, www.chemsafety.gov/investigations, (2009).
Appendix A. Survey of Criteria for using Alarms as Layers of Protection
A.1 Guidelines for Safe and Reliable Instrumented Protective Systems [7]
Protection layers known as IPLs are designed and managed to meet the following seven core attributes:
- Independence: the performance of a protection layer is not affected by the initiating cause of a hazardous event or by the failure of other protection layers;
- Functionality: the required operation of the protection layer in response to a hazardous event;
- Integrity: related to the risk reduction that can reasonably be expected given the protection layer's design and management;
- Reliability: the probability that a protection layer will operate as intended under stated conditions for a specified time period;
- Auditability: ability to inspect information, documents and procedures, which demonstrate the adequacy of and adherence to the design, inspection, maintenance, testing and operation practices used to achieve the other core attributes;
- Access Security: use of administrative controls and physical means to reduce the potential for unintentional or unauthorized changes; and
- Management of Change: formal process used to review, document, and approve modifications to equipment, procedures, raw materials, processing conditions, etc., other than replacement in kind, prior to implementation.
Applying the seven core attributes to alarms allows definition of specific recommendations and best practices.
- Alarms should only be used when the operator is expected to take a specified action, which is covered by operating procedure.
- Operators should be trained on how to respond to the alarm according to a written procedure.
- For most hazardous events, only one protective function can be claimed in the supervisory layer, irrespective of the number of indications or alarms.
For an alarm to be classified as an IPL, it must meet the following three criteria:
- The alarm is independent of the initiating cause and other protective layers addressing the identified hazardous event.
- The alarm function, including inputs and outputs, is designed to provide the allocated risk reduction.
- There is sufficient time for the operator to detect that a problem exists, to determine what to do, and to take the appropriate action necessary to return the process to normal operating limits.
The total operator response time should be less than one-half of the available process safety time. For a protective alarm, the process safety time is the time between the alarm occurrence and the hazardous event occurrence.
A.2 Practical SIL Target Selection: Risk Analysis per the IEC 61511 Safety Lifecycle [6]
- The sensor and logic solver used to activate the alarm must be at least 90 percent reliable and independent of the initiating event and other IPLs (independent)
- The alarm must be part of a well-rationalized alarm annunciation system such that the operator is not overwhelmed with too many alarms
- The alarm setpoint must be within the operating range of the sensor and may not be changed without permission and a change management procedure (dependable and auditable)
- The alarm must not be capable of being bypassed or inhibited and it must be annunciated in a control room that is continually manned when the process is operating (dependable)
- The operator must have adequate time to respond to the alarm. This response time includes the time it takes the operator to detect the alarm, diagnose what should be done, physically move to the final elements to be manipulated and execute the manipulation (dependable). For example, a high level alarm on a compressor suction drum will require the control room operator to acknowledge the alarm, determine the need to drain the drum, call the field operator, and request the action. Then the field operator must stop their current activity and physically go to the compressor, locate the correct drain valve, and then open the valve. This response time must also include the time it takes the operator to recover from making an incorrect decision or process manipulation or come back into the control room to get a wrench to move a stuck valve!
- An alarm response procedure detailing the actions required by each type of operator (control room and field) must exist and be available to the operators.
- All operators must be trained, drilled and periodically audited on the procedure and its required actions (auditable)
- All operators must be capable, and willing, to make the correct intervention actions at least 90% of the time (dependable)
- The operators must have a final element to manipulate that is independent of the initiating event and other IPLs, including any SIFs (independent)
- The alarm must reveal the dangerous condition under all circumstances (specific)
- The proper functionality of the alarm must be periodically verified and documented (auditable)
- Alarm system performance must be measured and proven to be adequate (dependable). To ensure performance is acceptable it must be measured and compared to key performance metrics (targets) such as those defined in the ISA-18.2 standard.
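Measuring alarm system performance against such targets can be sketched with a few lines of code. The example below is a minimal illustration, assuming that annunciated alarms for one operating position are available as a list of timestamps; the function names are hypothetical, and the thresholds are the approximate ISA-18.2 values (about 150 alarms per day very likely acceptable, about 300 maximum manageable, more than 10 alarms in 10 minutes counted as an overloaded period) treated here as hard limits for simplicity.

```python
from collections import Counter
from datetime import datetime, timedelta

TEN_MIN = timedelta(minutes=10)


def alarm_rate_metrics(timestamps, start, end):
    """Compute simple rate metrics for one operating position.

    ISA-18.2 bases its targets on at least 30 days of data; a shorter
    window is used below only to keep the example small.
    """
    days = (end - start) / timedelta(days=1)
    n_windows = int((end - start) / TEN_MIN)
    # Count alarms in each consecutive 10-minute window.
    per_window = Counter(
        int((t - start) / TEN_MIN) for t in timestamps if start <= t < end
    )
    total = sum(per_window.values())
    overloaded = sum(1 for count in per_window.values() if count > 10)
    return {
        "alarms_per_day": total / days,
        "pct_10min_over_10": 100.0 * overloaded / n_windows,
        "max_in_10min": max(per_window.values(), default=0),
    }


def classify_daily_rate(alarms_per_day):
    """Compare the average daily rate with the approximate targets."""
    if alarms_per_day <= 150:
        return "very likely acceptable"
    if alarms_per_day <= 300:
        return "manageable"
    return "exceeds maximum manageable"


# Made-up data: a burst of 12 alarms in the first 10 minutes, then one more.
start = datetime(2013, 1, 1)
end = start + timedelta(days=1)
alarms = [start + timedelta(seconds=30 * i) for i in range(12)]
alarms.append(start + timedelta(hours=5))

metrics = alarm_rate_metrics(alarms, start, end)
print(metrics, classify_daily_rate(metrics["alarms_per_day"]))
```

An average daily rate can look acceptable while individual 10-minute periods are overloaded, which is why the standard tracks both kinds of metric.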
Alarm Performance Metrics (based upon at least 30 days of data)
Metric | Target Value: Very Likely to be Acceptable | Target Value: Maximum Manageable
Annunciated Alarms Per Day per Operating Position | ~150 alarms per day | ~300 alarms per day
Annunciated Alarms Per Hour per Operating Position | ~6 (average) | ~12 (average)
Annunciated Alarms Per 10 Minutes per Operating Position | ~1 (average) | ~2 (average)
Metric | Target Value
Percentage of hours containing more than 30 alarms | ~<1%
Percentage of 10-minute periods containing more than 10 alarms | ~<1%
Maximum number of alarms in a 10 minute period | ≤10
Percentage of time the alarm system is in a flood condition | ~<1%
Percentage contribution of the top 10 most frequent alarms to the overall alarm load | ~<1% to 5% maximum, with action plans to address deficiencies.
Quantity of chattering and fleeting alarms | Zero, action plans to correct any that occur.
Stale Alarms | Less than 5 present on any day, with action plans to address
Annunciated Priority Distribution | 3 priorities: ~80% Low, ~15% Medium, ~5% High or 4 priorities: ~80% Low, ~15% Medium, ~5% High, ~<1% highest. Other special-purpose priorities excluded from the calculation
Unauthorized Alarm Suppression | Zero alarms suppressed outside of controlled or approved methodologies
Unauthorized Alarm Attribute Changes | Zero alarm attribute changes outside of approved methodologies or MOC
Table 10. ISA-18.2 Alarm Performance Metrics [3]
A.3 Layer of Protection Analysis: Simplified Process Risk Assessment [8]
- The indication for action required by the operator must be detectable. The indication must always be:
o Available for the operator,
o Clear to the operator even under emergency conditions,
o Simple and straightforward to understand.
- The time available to take action must be adequate. This includes the time necessary to decide that the action is required and the time necessary to take the action. The longer the time available for the action, the lower the PFD given for human action as an IPL. The decision making for the operator should require:
o No calculations or complicated diagnostics,
o No balancing of production interruption costs versus safety.
- The operator should not be expected to perform other tasks at the same time as the action required by the IPL, and the normal operator workload must allow the operator to be available to act as an IPL.
- The operator is capable of taking the action required under all conditions expected to be reasonably present. As an example, consider a proposed IPL where an operator is required to climb a platform to open a valve. If a fire (as the initiating event) could prevent this action, it would not be appropriate to consider the operator action as an IPL.
- Training for the required action is performed regularly and is documented. This involves drills in accordance with the written operating instructions and regular audits to demonstrate that all operators assigned to the unit can perform the required tasks when alerted by the specified alarm.
- The indication and action should normally be independent of any alarm, instrument, SIF or other system already credited as part of another IPL or initiating event sequence.
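The timing criteria that recur throughout this appendix can be combined into a simple screening check. This is a minimal sketch rather than a prescribed method: it assumes the operator response is split into detect, diagnose, and act phases (with an optional allowance for recovering from a mistake, as A.2 suggests), and it applies the A.1 guideline that total response time should be less than one-half of the process safety time. The function and parameter names are hypothetical.

```python
def total_response_time(detect_min, diagnose_min, act_min, recovery_min=0.0):
    """Sum the phases of the operator response, in minutes."""
    return detect_min + diagnose_min + act_min + recovery_min


def response_time_ok(detect_min, diagnose_min, act_min,
                     process_safety_time_min, recovery_min=0.0, margin=0.5):
    """True if the response fits within margin * process safety time.

    margin=0.5 reflects the one-half guideline cited in A.1; a site may
    choose a different value.
    """
    total = total_response_time(detect_min, diagnose_min, act_min, recovery_min)
    return total < margin * process_safety_time_min


# Hypothetical alarm: 2 min to detect, 5 min to diagnose, 8 min for the
# field operator to reach and open the valve.
print(response_time_ok(2, 5, 8, process_safety_time_min=40))  # 15 < 20
print(response_time_ok(2, 5, 8, process_safety_time_min=25))  # 15 < 12.5 fails
```

An alarm that fails this screen would not satisfy the adequate-time criterion and, under the guidelines surveyed here, should not be credited as an IPL without redesign.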
