Abstract. The focus of safety, and therefore also the assumptions about the sources of risks, must change to match the nature of industrial environments. Technology and engineering were from the beginning the major concerns, and models and methods were developed to address them. Developments from the 1970s onwards demonstrated the need to also consider human and organisational factors. Today, however, it is no longer possible to describe socio-technical systems precisely enough to apply the time-honoured methods. A change of approach is required, in which safety is seen as the ability to function effectively in expected and unexpected situations, rather than as the absence of adverse outcomes.
Introduction
The rationale for safety has always been to prevent a process or an activity from failing or malfunctioning, with unintended and unwanted outcomes as the result. The manifestations of failures or malfunctions usually lead to the loss of property, material, or life, and also bring normal functioning to a halt for a shorter or longer period of time. In the aftermath of such failures or malfunctions it is necessary both to recover or restore normal functioning and to find an explanation of why things went wrong, so that steps can be taken to prevent it from happening again. The realisation that things can go wrong is as old as civilisation itself. Early evidence can be found in the Code of Hammurabi, written circa 1760 BC. It was nevertheless not until the late 19th century that industrial risk and safety became a common concern, as described by Hale & Hovden (1998). These authors described three distinct ages in the scientific study of safety, which they named the age of technology, the age of human factors, and the age of safety management.
In the 1940s, during and around the period of the Second World War, the development of new sciences and technologies (such as information theory, cybernetics, digital computers, and the transistor) radically changed the setting. From the 1950s onwards, the technologies for control, computation, and communication began to develop in an exponential fashion. This meant that the size, and therefore also the complexity, of technological systems expanded very rapidly. Industrial and military systems alike soon became so complex that they challenged established practices and ways of working, and new methods were needed to address risk and safety issues. Reliability engineering, as a combination of probability theory and reliability theory, became a separate discipline by the early 1950s. Fault tree analysis was developed in 1961 to evaluate the Minuteman Launch Control System for the possibility of an unauthorized missile launch. Other methods such as FMEA and HAZOP were developed not just to analyse possible causes of hazards (and later on, causes of accidents), but also to identify hazards and risks. Probabilistic Risk Assessment (PRA) was successfully applied to the field of nuclear power generation with the WASH-1400 Reactor Safety Study (Atomic Energy Commission, 1975). The WASH-1400 study considered the course of events which might arise during a serious accident at a large modern Light Water Reactor, using a fault tree/event tree approach, and established PRA as the standard approach in the safety assessment of modern nuclear power plants.
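The arithmetic underlying a fault tree of the kind used in studies such as WASH-1400 can be sketched briefly: basic events are combined through OR and AND gates to yield a top-event probability. The following is a minimal illustration only, with hypothetical event names and probabilities, and it assumes statistically independent basic events (a simplification that real PRA studies must often relax).

```python
def or_gate(probs):
    """P(at least one input event occurs) = 1 - prod(1 - p_i),
    assuming the input events are independent."""
    complement = 1.0
    for p in probs:
        complement *= (1.0 - p)
    return 1.0 - complement

def and_gate(probs):
    """P(all input events occur) = prod(p_i),
    assuming the input events are independent."""
    product = 1.0
    for p in probs:
        product *= p
    return product

# Hypothetical example: the top event (loss of coolant flow) occurs if
# power fails, OR if both the main valve and the backup valve stick.
p_power_failure = 1e-3
p_valve_stuck = 1e-2
p_backup_valve_stuck = 5e-2

p_top = or_gate([p_power_failure,
                 and_gate([p_valve_stuck, p_backup_valve_stuck])])
print(f"P(top event) = {p_top:.7f}")
```

Note how the AND gate makes the valve branch contribute only 5e-4, so the top-event probability is dominated by the power-failure branch; identifying such dominant contributors is precisely what made the fault tree/event tree approach useful for safety assessment.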
The established approaches to risk assessment require that systems are tractable and that it is possible to describe them in detail. In order for this to be the case, systems must be relatively simple and reasonably stable. Neither of these conditions is fulfilled by socio-technical systems, which generally are intractable or underspecified. This means that the established methods are not suitable and that it is necessary to develop approaches that can be used for intractable systems.
Resilience Engineering represents such an approach. Traditional approaches to risk and safety depend on detailed descriptions of how systems are composed and how their processes work in order to count errors and calculate failure probabilities. Resilience Engineering instead starts from a description of characteristic functions, and looks for ways to enhance an organisation's ability to create processes that are robust yet flexible, to monitor and revise risk models, and to use resources proactively in the face of unexpected developments or ongoing production and economic pressures. Because socio-technical systems are incompletely described, hence underspecified, individuals and organisations must always adjust their performance to the current conditions in order to accomplish their objectives. Since resources and time are finite, it is inevitable that such adjustments are approximate. In resilience engineering, accidents and failures are therefore not seen as representing a breakdown or malfunctioning of normal system functions, but rather as the converse of the adaptations necessary to cope with real-world complexity.
even when nothing went wrong. Indeed, sneak circuits were formally defined as "... conditions which are present but not always active, and (which) do not depend on component failure" (Hill & Bose, 1975). (It is interesting to note that the concern for sneak paths has reappeared to confront so-called cyber threats, defined as access by adversaries who want to obtain, corrupt, damage, destroy, or prohibit access to very valuable information.) Despite the open acknowledgement that unwanted outcomes were not always a consequence of failures, the mainstream of safety methods remained faithful to the assumption that things work until they fail, and further that they worked as intended. This assumption may be quite reasonable in the case of technological systems, which have been designed to work in a certain way, with as little variability (and with as much precision and reliability) as possible. But the same assumption does not hold for socio-technical systems.
the consequences of failures can propagate. But for systems that are intractable, risks are emergent rather than resultant, and typically represent combinations of socio-technical performance variability rather than the consequences of failures and malfunctions. It is therefore necessary to deal with emergent risks in a different way. In technological systems (especially electronics and software), a few failures may be due to sneak paths, but on the whole cause-effect thinking will suffice. In socio-technical systems, and in general in systems that are more complex or intractable, adverse outcomes can happen in the absence of failures or malfunctions, and cause-effect thinking is therefore inappropriate. Indeed, for socio-technical systems such accidents without failures are likely the norm rather than the exception. For these it is necessary to think in a different way and to have access to different methods. The most important risks are not those that arise from single events but rather those that emerge from the underspecification that is a part of everything that happens.