
Time to Think and Time to Do? I can fail, and so can you!

ERIC HOLLNAGEL
University of Linköping, Sweden

Abstract

Efforts to account for incidents and accidents usually include one or more instances of human actions gone wrong, which in turn are explained as human error. An abundance of theories, models and methods have been developed to help clarify what exactly human errors are and how they can be reduced. Most of this research has focused on what happens when things go wrong, and has neglected to look at what happens when things go right. This paper argues that we cannot understand the former without understanding the latter, and that we therefore need better theories and models of normal actions. It is proposed that both normal performance and failures should be seen as emergent phenomena, hence that neither can be attributed to nor explained by failures of specific components or parts. The paper discusses how human performance can be seen as a dynamic trade-off between efficiency and thoroughness, both at the sharp-end and the blunt-end.
It is one thing to show people they are in an error, and another to put them in possession of truth. John Locke (1632-1704). An Essay Concerning Human Understanding, Bk. IV, Ch. 7.

When Things Go Wrong

When things go wrong, and in particular when the outcomes are so serious that they are classified as accidents, there is the inevitable rush to find the root causes or at least to provide a satisfactory explanation (Hollnagel, in preparation). This is done both to make a purportedly practical contribution to even higher system safety, and to fill the affective and cognitive void that often develops when something unexplained happens. This hunt for causes and explanations will always at some point come across a function or an action where humans play a part. For all technological systems, people are involved in one way or another, from design and construction to operation and maintenance. It is therefore guaranteed that the analysis of the events leading up to an accident sooner or later will find traces of human actions. Yet all this proves is that the system was an artefact. Indeed, we reserve labels such as acts of God and natural disasters for unwelcome events where people are not involved. Examples are earthquakes, tsunamis, volcanic eruptions and the like, although in some cases mankind is not totally innocent even here.

When something has gone wrong, the human actions that are implicated by the explanation are usually seen as failures or errors. It has always been known that humans can err, although it probably is impossible to find the first evidence of that. Before the industrial age the study of error was mostly directed at perception and reasoning. David Hume, for instance, in his monumental work A Treatise of Human Nature noted that: "In all demonstrative sciences the rules are certain and infallible; but when we apply them, our fallible and uncertain faculties are very apt to depart from them, and fall into error." (Part IV, Section I.)

As technology became more ubiquitous and complex, the significance of incorrect human actions or human failures grew. This is clearly demonstrated by the Domino Theory of Accidents (Heinrich, 1931), which states that:

- Industrial injuries result only from accidents
- Accidents are caused directly only by (a) the unsafe acts of persons, or (b) exposure to unsafe mechanical conditions
- Unsafe actions and conditions are caused only by faults of persons
- Faults of persons are created by environment or acquired by inheritance

According to this view, faults of persons are directly or indirectly the root causes of accidents. This was expressed even more directly in the Second Axiom of Industrial Safety, according to which the unsafe acts of persons are responsible for a majority of accidents. The danger of any simplified formulation is that it may be used out of context. In this case the Domino Theory probably became the starting point for the belief that human error and human failures are indispensable to explain why things sometimes go wrong.

When Things Go Right

The popular and technical/academic literature is replete with studies of accidents, and event reporting systems are aplenty. There are even specialised journals such as Accident Analysis and Prevention. Yet by focusing on all the cases when things go wrong, we fail to pay attention to all the cases when things go right. In medicine, for instance, much is made of the claim that in 1997 about 44,000 patients in the US died because of medical error (an uncertain attribution, if ever there was one) (Kohn, Corrigan, & Donaldson, 2000). Yet this number must be seen in relation to the annual figure of 33.6 million admissions, which puts the rate at roughly 1.3 deaths per 1,000 admissions. To take another example, the accident rate on freeways is 0.80 accidents per million vehicle miles travelled (the corresponding number for two-lane highways is 2.9). Finally, in aviation the number of major accidents per million hours flown has steadily decreased over the last 20 years, as shown by Figure 1.
[Figure 1: line graph of accidents per million hours flown (scale 0 to 1.0) by year, 1983-2001, with separate curves for major and serious accidents.]
Figure 1: Major and serious accidents per million hours flown.

Indeed, in most cases of everyday activities accidents are the exception and successful functioning is the norm. We should therefore consider the following conundrum: Human activity is a necessary part of every system and humans play a pivotal role when systems are built, operated, and maintained. Most systems function perfectly well in nearly all cases, but every now and then something gives. That being the case, why do we continue to focus on the relatively few
cases where something goes wrong, instead of the far more numerous cases where things work out all right?

Why Do Things Go Right?

In pondering the relation between normal outcomes and accidents, there are two obvious but contradictory hypotheses. One is that actions that lead to accidents are qualitatively different from actions that lead to normal outcomes. Since accidents therefore have different explanations than normal actions, it makes sense to study the actions leading to accidents per se, i.e., to study errors. The other hypothesis is that accidents and normal actions spring from the same source and that, in the words of Ernst Mach, only success can tell one from the other. In this case it makes little sense to study merely accidents, not least because they are few and far between. The study should instead be directed at correct actions and accidents alike in order to understand what the crucial differences are. In other words, the question to answer is why qualitatively indistinguishable actions sometimes lead to an accident but in most cases do not. An artless explanation of why there are relatively few accidents could, for instance, be as follows:

- Systems are well designed and scrupulously maintained
- Procedures and instructions are complete and correct
- People behave as they are taught and expected to
- Designers can effectively anticipate and prepare for every contingency

According to this line of reasoning, accidents are unique occurrences and system safety can be improved by further regulating actions and restraining human variability. Since general experience strongly suggests that the above assumptions are somewhat off the mark, an alternative explanation must be found. This could, for instance, have the following parts:

- Despite the fact that most designs have flaws and functional glitches, people soon learn to overcome these
- People can interpret and fine-tune procedures and instructions to match the actual conditions
- People can balance resources and demands to achieve an effective level of performance
- When things do go wrong, people can effectively detect and correct them

According to this line of reasoning, accommodating and furthering human variability can improve system safety. In this view humans are an asset rather than a liability.

The Importance Of Time

The most fundamental characteristic of human actions is unquestionably that they take place in time. This means that actions take time; indeed, doing anything, including thinking about what to do, takes time. When actions furthermore take place relative to some process or set of events that develops over time, i.e., in a dynamic system, time is also a limited resource. If more time is spent to plan and carry out the actions than is available, control of the situation will sooner or later be lost (Hollnagel, 2002). The fact that actions take time is so omnipresent that it has normally been overlooked. Said politely, one may assume that it has been taken for granted. (There is also a more impolite version.) Whatever the reason may be, very few models of human performance take time into consideration. This goes for models of human error (Kirwan, 1994), of decision making (Rasmussen, 1986), of limited attention (Wickens, 1987), and so on. Most models focus on the processes that take place in the mind (cognition) and describe these as reactions to events that happen in the world, without considering the time this takes. Yet it is obvious that people must cope with limited time and that whatever they do takes time. Models of normal human actions, and in
particular models of action failures, must therefore comprise time and be able to describe how time, as a resource and a demand, affects what people do. One example of that is a model which emphasises how humans, or more generally a system, can maintain control of a dynamic situation by establishing a balance between feedforward and feedback. The dynamics of control are described in terms of a cyclical relation among the current understanding of the situation, the ensuing actions, and the events (Hollnagel, 2002). The actions taken reflect the current understanding, and the events reflect the outcome of the actions taken. In addition to these events, which basically are the anticipated responses, there may also be unanticipated events and developments. While the anticipated responses generally reinforce the current understanding, unexpected events challenge it and may force a revision, which leads to new actions, and so on.

The model comprises several types of time that affect the ability to remain in control, as shown in Figure 2. These are the time needed to evaluate events and update or develop an understanding of the situation (TE), the time needed to choose or select an appropriate action (TS), and the time that is available in the current situation (TA). (A more complete version of the model also includes TP, the time needed to perform an action.) According to this line of reasoning, if (TE + TS) > TA then the person has too little time to understand what is going on and to respond effectively, and control will therefore sooner or later be lost. Conversely, when (TE + TS) < TA the person has enough time to understand what is going on and to choose and effectuate a response, and is therefore likely to remain in control and possibly even able to plan ahead. TE and TS can to some extent be reduced by judiciously applied technology, for instance interaction design and automation.

[Figure 2 shows a timeline: from the point where something needs to be done (the intention), the time needed, comprising the time to evaluate the event (time to think) and the time to select an action (time to do), is set against the time available, bounded by the latest starting time and the latest finishing time.]

Figure 2: Time to think and time to do.

To complicate matters even further, humans are normally involved with several processes and sub-processes that cover multiple time spans and have different temporal characteristics. Related tasks may be embedded in each other, while other tasks may be in direct competition. The situation is therefore rarely as neat as Figure 2 suggests. Time and resources must be juggled to meet multiple demands, and a mixture of low-level and high-level control is therefore usually required. This makes it even more remarkable that things usually go well, and even more urgent to understand why this is so.
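
The time conditions described above, (TE + TS) < TA for remaining in control, lend themselves to a compact illustration. The following is a minimal sketch, not taken from the paper: only the notation (TE, TS, TP, TA) follows the model, while the task names and all timings are invented for the example.

```python
# A minimal sketch, not from the paper: the time-balance condition expressed
# in code. TE, TS, TP and TA follow the paper's notation; the Task examples
# and all numbers are invented for illustration.

from dataclasses import dataclass


@dataclass
class Task:
    name: str
    t_evaluate: float   # TE: time needed to evaluate events and update the understanding
    t_select: float     # TS: time needed to choose or select an appropriate action
    t_perform: float    # TP: time needed to perform the action (fuller version of the model)
    t_available: float  # TA: time available in the current situation


def in_control(task: Task, include_performance: bool = False) -> bool:
    """Return True if the time needed fits within the time available,
    i.e. (TE + TS) < TA, or (TE + TS + TP) < TA in the fuller version."""
    needed = task.t_evaluate + task.t_select
    if include_performance:
        needed += task.t_perform
    return needed < task.t_available


if __name__ == "__main__":
    tasks = [
        Task("routine check", t_evaluate=2.0, t_select=1.0, t_perform=3.0, t_available=10.0),
        Task("alarm flood", t_evaluate=8.0, t_select=5.0, t_perform=4.0, t_available=10.0),
    ]
    for task in tasks:
        verdict = "in control" if in_control(task, include_performance=True) else "losing control"
        print(f"{task.name}: {verdict}")
```

Running the sketch prints "in control" for the routine task and "losing control" for the alarm flood, mirroring the argument that control is lost when the time needed exceeds the time available.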

The Efficiency-Thoroughness Trade-Off

Humans must always meet multiple, changing, and often conflicting performance criteria. Humans are usually able to cope with this complexity because what they do and how they do it can be adjusted to match the current conditions. This ability has been described in several ways, by terms such as adaptation, optimisation (and sub-optimisation), satisficing, suffisance, minimising cognitive effort, minimising workload, balancing arousal, etc. For instance, Amalberti (1996) described how people make cognitive compromises in order to produce acceptable performance while safeguarding cognitive resources. It is probably impossible to account for this behaviour using a single criterion or concept, and the issue here is not to attempt a comprehensive description of how this adjustment, adaptation or optimisation takes place, but rather to consider the consequences it has.

As a starting point, we take for granted that people constantly try to optimise their performance. This is tantamount to striking a balance between resources and demands, which both may vary over time, and involves making a trade-off between efficiency and thoroughness. On the one hand people genuinely try to meet their (internalised) goals, i.e., they try to do what they are supposed to do, as they see it. They also try to be as thorough as they believe is necessary, since otherwise they expose themselves to risk. On the other hand they try to carry out their tasks as efficiently as possible, which means that they try to do so without spending unnecessary effort or time. In making this trade-off people are greatly helped by the regularity or stability of their work environment and, indeed, the regularity of the world at large. If the work environment were continually changing it would be unpredictable. Such a lack of predictability would in effect make it impossible to take any shortcuts or indeed to learn how things could be done in a more efficient manner. Performance would be limited to reacting to external events, and that would sooner or later lead to a loss of control. It is precisely because the work environment has some measure of regularity or stability that it becomes predictable, and therefore allows performance to be optimised by making various shortcuts.

ETTO Rules

Being able to make an efficiency-thoroughness trade-off (ETTO) is a fundamental ingredient in coping with complexity and making up for the fact that things take time. Because the environments in which we must function are complex, it is rarely possible to focus on a single line of activity or to spend enough time on all the things that must be done. Something has to give, and by knowing where efficiency can be traded off for thoroughness, i.e., where shortcuts can safely be made and where one can rely on the actions of others, a person can allocate the limited resources to that which is most important. The benefits of making shortcuts are obvious, since they save time and effort. If a person can always assume that condition A is true in situation B, then there is no real need to check for the condition. Instead of examining every possible condition or prerequisite of an action, efforts can be reserved for checking conditions that are known to vary across situations, or conditions that are seen as being more salient and important. In the case of RO-RO ferries, for instance, if the bow port always is closed when the ferry leaves harbour, then there is no need explicitly to verify this condition.
Or, to take another example, if a hospital laboratory has routines to ensure that the right type of blood is issued, then it is only necessary to check that the identification of the patient is correct. The nurse has to bring the blood to the right patient, but need not check whether the blood is of the right type. People develop a number of efficiency heuristics or ETTO rules during their daily practice. Examples at the level of individual cognition are the heuristics for judgment under uncertainty described by Kahneman, Slovic & Tversky (1982), and the cognitive primitives of similarity matching and frequency gambling described by Reason (1990). In relation to work, some of the most common ETTO rules are the following (a small illustrative sketch of one such rule is given after the list):

- It looks fine!
- It is not really important
- It is normally OK, so we don't need to check it
- It will be checked by someone else (usually followed by: It has been checked by someone else)
- I can't remember how to do it (and it takes too long to find out)
- I have no time (or no resources) now, so I will do it later
- and everybody's favourite: It worked the last time
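
To make the nature of these rules a little more concrete, here is a minimal, purely hypothetical sketch of the rule "It is normally OK, so we don't need to check it". The reliability threshold, the stakes parameter and the bow-port figures are all invented and not part of the paper; the point is only that thoroughness (the check) is traded for efficiency once past experience makes the condition seem dependable.

```python
# A purely hypothetical sketch of the ETTO rule "It is normally OK, so we
# don't need to check it". The reliability threshold and the notion of
# "perceived stakes" are invented; they are not part of the paper.

def should_check(times_ok: int, times_failed: int, perceived_stakes: float,
                 reliability_threshold: float = 0.999) -> bool:
    """Decide whether to verify a precondition before acting.

    times_ok, times_failed: remembered outcomes of this precondition.
    perceived_stakes: subjective importance of the check, between 0.0 and 1.0.
    """
    total = times_ok + times_failed
    if total == 0:
        return True  # no experience yet, so be thorough
    observed_reliability = times_ok / total
    # The trade-off: skip the check (return False) only if the condition has
    # "always" held and the stakes are judged to be low.
    return not (observed_reliability >= reliability_threshold
                and perceived_stakes < 0.5)


if __name__ == "__main__":
    # Hypothetical bow-port example: after 500 uneventful departures the
    # check starts to look dispensable ...
    print(should_check(times_ok=500, times_failed=0, perceived_stakes=0.3))  # False: skip it
    # ... but one remembered failure is enough to bring the check back.
    print(should_check(times_ok=500, times_failed=1, perceived_stakes=0.3))  # True: check it
```

In the example, a long run of uneventful departures makes the check look dispensable, while a single remembered failure is enough to restore it.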

The efficiency-thoroughness trade-off is, however, not confined to individuals but can also be found at the organisational level. A typical example is the principle of negative reporting, which means that only deviations from the normal should be reported. The implication is that if nothing is reported, then everything is OK. Another example is the principle not to do the same work twice, in particular to avoid double-checking.

ETTO At The Sharp-End And The Blunt-End

It has become common practice to make a distinction between the sharp-end and the blunt-end, the former being the people who actually interact with the hazardous process "in their roles as pilots, physicians, space controllers, or power plant operators" (Woods, Johannesen, Cook, & Sarter, 1994, p. 20) and the latter the people who affect safety "through their effect on the constraints and resources acting on the practitioners at the sharp end" (ibid). While it is right and proper to acknowledge that accidents are the result of many factors, and not just blame the people at the sharp-end, it should not be forgotten that everyone is at the sharp-end when they carry out their actions. The blunt-end/sharp-end distinction is therefore relative rather than absolute, since in either case people have to cope with the complexity in which they find themselves. This means that people at the blunt-end are as likely to use ETTO rules as are people at the sharp-end, the main difference being that the consequences of their choices and actions may be less direct and considerably delayed.

The efficiency-thoroughness trade-offs that people make are of course not the only reason why things sometimes go wrong. In order to carry out their work people have to cope with a host of problems, which have been described in detail in, e.g., the human factors literature. Common examples are inefficient or deficient organisation, incompatible working conditions, inappropriate Human-Machine Interface (HMI) and operational support, inappropriate availability of procedures/plans, too many simultaneous goals and too little available time, lack of adjustment to time of day (circadian rhythm), inadequate training and experience, inefficient crew collaboration, inefficient communication, and shortage of resources (both human and technological). Each of these can be specified in further detail, for instance as disregard of basic ergonomic principles in the interface, colour coding, confusing signals, etc. Many of these problems are routinely attributed to failures at the blunt-end, such as deficient planning and resource allocation, inadequate interface design, etc. The point is, however, that instead of just seeing this as the result of bad decisions at the blunt-end, one should realise that the people who made these decisions were themselves at the sharp-end when it happened, even though they are at the blunt-end now. In other words, the somewhat surprising deficiencies that are found at the workplace can themselves be explained as a result of an ETTO, as well as of overconfidence, strong beliefs in basic values, etc. (cf. Merton, 1979).

The Sources Of Success

Socio-technical systems on the whole function safely and efficiently because people quickly learn to disregard those aspects or conditions that normally are insignificant. This adjustment is furthermore not only a convenient ploy for the individual, but also a necessary condition for the joint system (i.e., people plus other people plus technology) as a whole.
Just as individuals adjust their performance to avoid wasting effort, so does the joint system. This creates a functional entanglement, which is essential for understanding why failures occur. The performance
adjustment on the joint system level cannot be effective unless the aggregated effects of what individuals do are relatively stable, since these constitute an important part of the joint system's environment. On the other hand, the efficient performance of the joint system contributes in a significant manner to the regularity of the work environment for the individuals, which is a precondition for their performance adjustment.

As far as the level of individual human performance is concerned, the local optimisation through shortcuts, heuristics, and expectation-driven actions is the norm rather than the exception. Indeed, normal performance is not what is prescribed by rules and regulations but rather what takes place as a result of the adjustments, i.e., the equilibrium that reflects the regularity of the work environment. This means that it is a mistake to look for the cause of failures in the normal actions since they, by definition, are not wrong. Normal actions are successful because people adjust to the local conditions and correctly anticipate the developments. Failures occur when this adjustment goes awry, but both the actions and the principles of adjustment are correct. This is consistent with the view of complexity theory, according to which some properties of the system cannot be attributed to individual components but rather emerge from the whole system. The conclusion is that both normal performance and failures are emergent phenomena, hence that neither can be attributed to or explained by specific components or parts. For the humans in the system this means in particular that the reason why they sometimes fail, in the sense that the outcome of their actions differs from what was intended or required, is the variability of the context and conditions rather than failures of the actions themselves. The adaptability and flexibility of human work is the reason for its efficiency. At the same time it is also the reason for the failures that occur, although it is never the cause of the failures. Herein lies the paradox of optimal performance at the individual level. If anything is unreasonable, it is the requirement to be both efficient and thorough at the same time, or rather to be thorough when, with hindsight, it was wrong to be efficient.

References

Amalberti, R. (1996). La conduite de systèmes à risques. Paris: Presses Universitaires de France.
Heinrich, H. (1931). Industrial accident prevention. New York: McGraw-Hill.
Hollnagel, E. (2002). Time and time again. Theoretical Issues in Ergonomics Science, 3(2), 143-158.
Hollnagel, E. (in preparation). Of mountains and molehills. In M. S. Bogner (Ed.), Human error in medicine (2nd ed.). Erlbaum.
Kahneman, D., Slovic, P. & Tversky, A. (Eds.) (1982). Judgment under uncertainty: Heuristics and biases. New York: Cambridge University Press.
Kirwan, B. (1994). A guide to practical human reliability assessment. London: Taylor & Francis.
Kohn, L. T., Corrigan, J. M. & Donaldson, M. S. (Eds.) (2000). To err is human: Building a safer health system. Washington, D.C.: National Academy Press.
Merton, R. K. (1979). Sociological ambivalence and other essays. New York: The Free Press.
Rasmussen, J. (1986). Information processing and human-machine interaction: An approach to cognitive engineering. New York: North-Holland.
Reason, J. T. (1990). Human error. Cambridge, U.K.: Cambridge University Press.
Wickens, C. D. (1987). Information processing, decision-making and cognition. In G. Salvendy (Ed.), Handbook of human factors. New York: Wiley.
Woods, D. D., Johannesen, L. J., Cook, R. I. & Sarter, N. B. (1994). Behind human error: Cognitive systems, computers and hindsight. Columbus, Ohio: CSERIAC.
