
IEEE TRANSACTIONS ON CONTROL SYSTEMS TECHNOLOGY, VOL. 5, NO. 6, NOVEMBER 1997

A Neural-Network Approach to Fault Detection and Diagnosis in Industrial Processes


Yunosuke Maki and Kenneth A. Loparo, Senior Member, IEEE
Abstract: Using a multilayered feedforward neural-network approach, this paper studies the detection and diagnosis of faults in industrial processes that require observing multiple data simultaneously. The main feature of our approach is that faults are detected during transient periods of process operation. A two-stage neural network is proposed as the basic structure of the detection system. The first stage of the network detects the dynamic trend of each measurement, and the second stage detects and diagnoses the faults. The potential of this approach is demonstrated in simulation using a model of a continuously well-stirred tank reactor. The neural-network-based method successfully detects and diagnoses pretrained faults during transient periods and can also generalize properly. Finally, a comparison with a model-based method is presented.

Index Terms: Fault detection, fault diagnosis, neural networks.

I. INTRODUCTION

Manuscript received August 22, 1995; revised February 5, 1997. Recommended by Associate Editor, E. O. King. The authors are with the Case School of Engineering, Case Western Reserve University, Cleveland, OH 44106-7082 USA. Publisher Item Identifier S 1063-6536(97)07770-1.

FAULT detection and diagnosis problems have been studied intensively in industries such as chemical processing and utility power generation. Prompt detection and diagnosis of faults is essential for the reliable, safe, and efficient operation of the plant and for maintaining product quality. Faults may occur in the process, the sensors, the actuators, and the instruments, independently or simultaneously. For a simple fault that can be detected by a single measurement, a conventional alarm circuit may be sufficient. However, because it is usually very difficult in complex industrial systems to directly measure process states that are good indicators of faults, more elaborate and automatic measures are necessary. Skilled operators, observing multiple data simultaneously, are often required to make tough decisions based on their experience and empirical knowledge.

One of the common approaches is to use model-based methods for detection and diagnosis. This requires modeling the process, filtering the measured data, and estimating the unknown state variables. The basic idea is to compare the output of the model to the measurements from the process, thereby generating a residual or error that is used to make a decision about the operating state of the system. A wide variety of methods and applications have been studied and are summarized by Himmelblau [1], Frank [2], and Gertler [3]. The nonlinear filtering approach studied in [4] is in this category. Himmelblau et al. [5], [6] demonstrated the application of extended Kalman filtering (EKF) to fault detection and diagnosis in chemical processes. Model-based approaches can use either state-space or input-output representations of dynamic systems. In either case, the system model must be known and accurate for these methods to be highly effective. Uncertainty in the process model can easily degrade the estimation output and cause either missed detections or false alarms. The nonlinear filtering approach is usually more robust than conventional linear filtering-based methods, but a substantial modeling effort may be required.

On the other hand, qualitative approaches that do not require process models have elicited considerable research interest in the last ten years. Decision-table-based methods, knowledge-based expert systems [7], and artificial neural-network-based methods are considered to be in this category. Neural-network-based methods have received much attention because of their fast and robust implementation, their performance in learning arbitrary nonlinear mappings, and their ability for pattern recognition and association. The fault detection and diagnosis problem can be interpreted as a pattern recognition task. Neural networks are an appropriate tool for fault detection and diagnosis, in which measured data that are not discernible at the instant of sensing are transformed into useful information for decision-making. The potential of this approach for chemical processes was initially proposed by Hoskins and Himmelblau [8] and Venkatasubramanian and Chan [9]. Watanabe et al. [10] demonstrated the use of a two-stage neural network to add information about the severity of the fault. A more detailed analysis of the learning, recall, and generalization characteristics of the method was given by Venkatasubramanian et al. [11], and a large-scale application to a complex chemical plant was demonstrated by Hoskins et al. [12].
However, these approaches are static in nature because the neural networks are trained using only steady-state data. If the steady-state operating conditions change, the network must be retrained in order to work properly. Often, faster detection of the fault is required, and transient data must be used for this purpose. Dietz et al. [13] trained the network by presenting dynamic data, and Li et al. [14] developed an approach using a moving time window. Ohga and Seki [15] trained the network using a number of sets of time series data.

The major motivation of this work is the use of artificial neural networks, capable of operating during process transients, for fault detection and diagnosis of industrial processes. The ultimate goal is to develop a general method that can be applied to a broad spectrum of industrial processes. The main feature of the proposed method is the rapid and robust detection of faults during transient periods of the process. Maintenance of the neural network should also require little effort.

This paper is organized as follows. First, a model of a well-stirred tank is developed as a target process for this study. Second, the neural-network-based method is developed and implemented in a simulation environment. Third, a simulation study using the plant model is performed and the results of various tests are discussed. Finally, the proposed neural-network-based approach is compared with a conventional model-based method, the EKF.

1063-6536/97$10.00 © 1997 IEEE

II. PLANT MODEL

In order to illustrate the method proposed in this work, a target plant model is developed. A continuously well-stirred tank reactor (CSTR) is used for all case studies in this work. In this section, the plant model and its implementation are described in detail and the faults considered in the study are introduced.

A. Description of the Plant

Fig. 1 illustrates the jacketed CSTR in which an irreversible and exothermic reaction A → B takes place. The reactor is operated by three control loops that regulate the outlet temperature, the inlet flow rate of the reactant, and the tank level. A cooling jacket surrounds the reactor; the coolant is water in this case. Negligible heat losses, constant densities, and perfect mixing inside the tank are assumed. Therefore the temperature in the jacket is uniform and equal to the outlet temperature.

Fig. 1. Process flow of well-stirred tank (Luyben model).

The process variables are as follows: concentration of A at the inlet and outlet, respectively; flow rate of the liquid at the inlet and outlet, respectively; flow rate of coolant; volume of the tank; temperature of the inlet reactant; temperature of the outlet coolant (water); temperature of the tank; control valve openings.

The parameters, assumed to be constant, are as follows: frequency factor; activation energy; gas constant; volumetric heat capacity; H heat of reaction; heat exchange area; overall heat transfer coefficient; area of the tank; jacket volume; heat capacity of water; density of water; temperature of the inlet coolant (water).

The equations describing the system are

Mass balance: (1), (2)

Energy balance: (3), (4)

The system is highly nonlinear. All equations and parameter values are taken from Luyben [16]. Moreover, this CSTR model has been used in previous neural-network-based studies; see [8] and [11].

B. Implementation of the Plant Model

For computer simulation, the plant model is implemented using Simulink in Matlab. The basic time unit is the hour. The step size for Euler integration is denoted by dt and is usually 0.01 h. A block diagram of each control loop is shown in Fig. 2. Three PI controllers are used to regulate the outlet temperature, the inlet flow rate of the reactant, and the tank level. Equal-percentage valves are used to control the flow rates of the reactant, coolant, and outlet liquid. To simplify the simulation, upstream and downstream pressures are assumed to be constant. A first-order lag is used to model the actuator and sensing dynamics.

C. Faults Studied

Complex and frequently observed faults are selected for this study, as listed in Table I. Not all possible faults are included; however, three different kinds of faults, affecting the sensors, the actuators, and the process, are considered.
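The Euler integration step and the first-order lag mentioned in Section II-B can be illustrated with a minimal sketch. The function name `first_order_lag`, the time constant of 0.1 h, and the unit-step input are illustrative assumptions; only the step size of 0.01 h comes from the text, and the sketch is not the Simulink implementation used in the study.

```python
def first_order_lag(inputs, tau, dt=0.01, y0=0.0):
    """Euler-integrate tau * dy/dt = u - y for a sequence of inputs u.

    tau and dt share the same time unit (hours here). The values used
    below are illustrative and are not taken from the paper's tables.
    """
    y = y0
    outputs = []
    for u in inputs:
        # Forward Euler step: y(t + dt) = y(t) + dt * (u - y(t)) / tau
        y = y + dt * (u - y) / tau
        outputs.append(y)
    return outputs


if __name__ == "__main__":
    # Unit step applied to a lag with an assumed time constant of 0.1 h.
    response = first_order_lag([1.0] * 100, tau=0.1)
    print(response[0], response[-1])
```

With forward Euler the step size must stay small relative to the time constant; when `dt` equals `tau` the lag output jumps to the input in a single step.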

MAKI AND LOPARO: FAULT DETECTION AND DIAGNOSIS IN INDUSTRIAL PROCESSES


Fig. 2. Block diagram of the control loop.

TABLE I
LIST OF FAULTS STUDIED

III. NEURAL-NETWORK-BASED METHOD

A. Introduction

In recent years, artificial neural networks have generated considerable interest in engineering as problem-solving tools. The fundamental element is a neuron, which has multiple inputs and a single output. Each input is multiplied by a weight, the weighted inputs are summed, and the transfer function of the neuron operates on this sum to generate the output. The output is sometimes referred to as an activity level. In this study, a multilayer feedforward neural network with one hidden layer is used. The bias unit, whose activity level is fixed at one, is connected to all neurons in the hidden and output layers to adjust the weighted-sum input of each neuron. The number of neurons in the input and output layers is determined by each application, and the number of neurons in the hidden layer must be adjusted during the learning phase so that the network can be trained efficiently. The activity level of the $j$th neuron is obtained as

$$a_j = f_j(n_j), \qquad n_j = \sum_i w_{ji}\, a_i + b_j \tag{5}$$

where
$a_j$ is the activity level (output) of the $j$th neuron;
$n_j$ is the input to the $j$th neuron;
$f_j$ is the transfer function of the $j$th neuron;
$w_{ji}$ is the connection weight from the $i$th neuron to the $j$th neuron;
$a_i$ is the activity level of the $i$th neuron in the prior layer;
$b_j$ is the connection weight from the bias unit to the $j$th neuron.
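The weighted-sum-plus-transfer-function computation in (5) can be sketched as follows. This is a minimal illustration, assuming plain Python lists for weights and the log-sigmoid transfer function that the study uses; the function names are hypothetical and the sketch is not the authors' implementation.

```python
import math


def logsig(x):
    """Log-sigmoid transfer function, mapping any real input to (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def layer_forward(activities, weights, biases):
    """Activity levels of one layer.

    weights[j][i] is the connection weight from neuron i in the prior
    layer to neuron j; biases[j] is the weight from the bias unit,
    whose own activity level is fixed at one.
    """
    outputs = []
    for w_row, b in zip(weights, biases):
        net = sum(w * a for w, a in zip(w_row, activities)) + b
        outputs.append(logsig(net))
    return outputs


def network_forward(inputs, hidden_w, hidden_b, output_w, output_b):
    """Forward pass through a feedforward network with one hidden layer."""
    hidden = layer_forward(inputs, hidden_w, hidden_b)
    return layer_forward(hidden, output_w, output_b)
```

For example, a network whose weights and biases are all zero returns 0.5 at every output neuron, regardless of the input.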

The log-sigmoid function is used as the transfer function in this study. The backpropagation algorithm [17] is used to train the network. The connection weights, such as $w_{ji}$ and $b_j$, are adjusted so that the average squared error between the network output and the desired output (target) for a given reference input is minimized. Learning continues iteratively until the sum of the squared errors is below a certain goal. The incremental change of the weight from the $i$th neuron to the $j$th neuron is computed by

$$\Delta w_{ji}(t) = \eta\, \delta_j\, a_i + \alpha\, \Delta w_{ji}(t-1) \tag{6}$$

$$\delta_j = (d_j - a_j)\, f_j'(n_j) \tag{7}$$

$$\delta_j = f_j'(n_j) \sum_k \delta_k\, w_{kj} \tag{8}$$

where
$\Delta w_{ji}(t)$ is the incremental change in the weight at time $t$;
$d_j$ is the desired output of the $j$th neuron in the output layer;
$\eta$ is the learning rate (usually a constant);
$\alpha$ is the momentum (usually a constant).

Equation (7) holds for the $j$th neuron in the output layer, and (8) holds for the $j$th neuron in the hidden layer. In (6), $\eta$ and $\alpha$ are adjustable parameters. In order to accelerate the learning, the following methods are applied in this study.

1) The second term on the right-hand side of (6) is added to the original update term to improve the learning [17]. Momentum is the key parameter here and is set at 0.95 in this study.
2) The adaptive learning rate [18], which attempts to keep the learning rate as large as possible while maintaining the stability of the learning process, is also used. This has a significant effect on the convergence of the weights.
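A single weight update with momentum, in the spirit of (6)-(8), can be sketched as below. The helper names and the fixed learning rate `eta=0.1` are assumptions made for illustration (the study uses an adaptive learning rate, which is not reproduced here); the momentum default of 0.95 is the value stated in the text.

```python
def logsig_derivative(a):
    """Derivative of the log-sigmoid, expressed through its output a."""
    return a * (1.0 - a)


def output_delta(desired, a):
    """Error signal of an output neuron: (d - a) * f'(n)."""
    return (desired - a) * logsig_derivative(a)


def hidden_delta(a, downstream_deltas, downstream_weights):
    """Error signal of a hidden neuron, back-propagated from the next layer.

    downstream_weights[k] is the weight from this neuron to neuron k of
    the next layer, and downstream_deltas[k] is that neuron's error signal.
    """
    backprop = sum(d * w for d, w in zip(downstream_deltas, downstream_weights))
    return logsig_derivative(a) * backprop


def weight_change(delta_j, a_i, previous_change, eta=0.1, alpha=0.95):
    """Gradient term plus momentum term for one connection weight."""
    return eta * delta_j * a_i + alpha * previous_change
```

With a momentum as large as 0.95, most of each update is carried over from the previous step, which smooths the weight trajectory but can overshoot if the learning rate is not kept in check.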


Fig. 3. Schematic diagram of fault detection and diagnosis system.

Fig. 4. Three training patterns for primary neural network.

B. Design of the Neural-Network-Based Detection System

1) Basic Concept: The capability of neural-network-based methods for fault detection has been established in previous works. The particular goals of this study are as follows.

1) Preknown faults should be detectable by the neural-network-based system, and unknown operating conditions should not generate a false alarm. In other words, the system is designed to detect faults that have occurred in the past and to be robust to unmodeled operating conditions.
2) The transient state of the fault can be detected dynamically. No steady-state values of process variables are required as parameters in the design of the detection system.
3) Detection is expected to be fast, reliable, and robust to noise.
4) The method should be applicable to various industrial processes with little additional effort, and adjustments to the parameters and network structure should be easy to make.

2) Basic Structure: Fig. 3 depicts the basic structure of the fault detection and diagnosis system developed in this work. A two-stage neural-network system is proposed to improve flexibility and applicability to other industrial processes. The first-stage network is referred to as the primary neural network and the second-stage network is referred to as the secondary neural network. Each primary neural network corresponds to a channel of measured data and is used to detect the trend of that channel (increasing, decreasing, or steady), with numbers that indicate the extent of the change. Therefore, the primary neural network can be designed independently of the secondary neural network. Furthermore, we do not have to design more than two primary neural networks, even for multiple measurements, because the same network can be applied to different measurement channels. The primary neural network eliminates the need for additional input neurons to capture the dynamic aspects of the data; see Li et al. [14]. The moving time window technique described in [14] is used, and a delay unit reduces the effect of plant fluctuations and accommodates differences in response time between measurement channels. A reset and restriction rule is used to reduce the probability of false alarms. Details are presented later.

3) Design of Primary Neural Network: As mentioned above, this network is designed to be used with the various observations that are available. Observation histories are categorized into three types of behavior (increasing, decreasing, and steady), and the network is trained to give this trend information, including the extent of change. We assume that the measured data are normalized to the range [-1, 1] before being used as an input to the network. We begin with data obtained by periodic sampling (21 samples) from a single measurement source. After normalization, a vector of 21 elements is given to the network as an input. A feedforward type network is used. The number of units in the input layer is 21 and the number of units in the output layer is three. The activity level of each output unit, defined to be between zero and one, corresponds to the extent of increase, decrease, and steadiness of the input, respectively. These three activity levels are the values reported in the following examples. The number of hidden units is adjustable; after some trial and error during the learning phase it is set to 15. The advantage of the feedforward type network is that its output can include information about both the direction and the extent of change, as mentioned above. Training is performed by presenting the three target patterns given in Fig. 4. Fig. 5 shows examples of how well the network can generalize.
Two different sets of noisy data, marked by x, are presented to the network, and the recognized output values are given under each graph. The extent of increase in case (a) is clearly greater because the corresponding "increase" output is larger. On the other hand, the extent of steadiness in case (b) is greater because the corresponding "steady" output is larger.

Fig. 5. Examples of given input and recognized output. (a) Input: typical increase. (b) Input: slight increase.

4) Design of Secondary Neural Network: The secondary neural network receives the outputs from the primary neural networks and produces information about the faults. A conceptual diagram of the two-stage network system is depicted in Fig. 6. This network must be designed and trained to satisfy the particular requirements of each application problem. A feedforward type network that can be trained using preknown information is also appropriate for this case. If the number of sensors used for detection is m and the number of faults to be detected is n, the secondary neural network has 3m neurons in the input layer and (n + 1) neurons in the output layer. The number of units in the hidden layer is adjustable; 15 is chosen for this paper by experimental trial and error. The transfer function chosen is of the log-sigmoid type. This is a reasonable choice because the extent of each fault can be represented by a number between zero and one.

Fig. 6. Conceptual diagram of two-stage neural network.

For the plant model of the CSTR, eight variables are assumed to be measurable. In actual processes, it is very difficult to measure concentration continuously. Hence, the inlet and outlet concentrations of substance A are assumed not to be measurable. The number of input neurons of the secondary network is therefore 3 × 8 = 24. For the training of the network, target patterns must be set beforehand. According to the faults defined in Table I, Table II gives the 12 sets of target patterns used for this study. Each column corresponds to one specific fault and each row corresponds to a neuron in the input layer. The values in each column of the table are used as a reference input to the network for each of the faults to be learned. Any combination of faults can be chosen, and the number of output neurons is determined accordingly. In the target patterns, the value of the corresponding output neuron is set to one and the values of the other outputs are set to zero. These targets for the network can be determined empirically by carefully investigating the faults that have occurred in the past, but fine adjustment may be necessary for the network to generalize satisfactorily. Details will be discussed later.

5) A Moving Time Window and Normalization: A moving time window is an indispensable technique for tracking dynamic data and detecting the transient state of faults. As shown in Fig. 7, the window moves forward at each time increment dt. The right side of each window corresponds to the current time, and the time span of the window is set by the window length (the number of samples), which is adjustable for each application. For this study, only three different lengths are used: 20, 50, and 100. The vertical window height must be specified according to the range of each measurement and the amount of change in the measurement signals that can be caused by the faults. Table III gives the window length and height for each window used in this study. Using uniformly spaced samples of a time series of data, the window calculates the average of the samples and rescales the vertical axis. The average value is set equal to zero, the upper value is set to one, and the lower value is set to -1 using the window height. Values that exceed one are set equal to one and values below -1 are set equal to -1. Finally, the output of the moving window is used as the input to the primary neural network. In adjusting the window, there is a tradeoff between prompt detection and disturbance rejection. With a larger window, the network is unlikely to be affected by plant disturbances. However, the detection can become insensitive to the measurements and the response of the detection system can become sluggish.

Fig. 7. Moving time window that traces the dynamic data.
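The subsampling and rescaling performed by the moving time window can be sketched as follows. The function names are hypothetical, and the sketch assumes that a deviation of plus or minus one window height from the window average maps to plus or minus one; the text does not state whether the height is measured this way or as the full top-to-bottom span, so this scaling is an assumption.

```python
def subsample_uniform(window, count=21):
    """Pick `count` uniformly spaced samples, keeping both end points."""
    last = len(window) - 1
    return [window[round(i * last / (count - 1))] for i in range(count)]


def normalize_window(samples, height):
    """Center samples on their average and scale by the window height.

    Assumption: a deviation of +height from the average maps to +1 and
    -height maps to -1; larger deviations are clipped to [-1, 1].
    """
    average = sum(samples) / len(samples)
    return [max(-1.0, min(1.0, (s - average) / height)) for s in samples]


if __name__ == "__main__":
    # A slow ramp held in a window of 101 raw samples (an assumed count).
    raw = [0.02 * k for k in range(101)]
    network_input = normalize_window(subsample_uniform(raw), height=2.0)
    print(len(network_input), network_input[0], network_input[-1])
```

Because each window is centered on its own average, the network input carries only the shape of the recent trend and no information about the absolute steady-state level.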


TABLE II
TRAINING PATTERNS FOR THE SECONDARY NEURAL NETWORK

TABLE III
HEIGHTS OF EACH WINDOW (NOTE: dt = 0.01 h)

TABLE IV
TIME CONSTANTS OF DELAY UNITS

Normalization of the inputs to the primary network is performed automatically by this moving time window. Each output of the primary network is between zero and one; consequently, normalization for the secondary network is also accomplished within the primary network. Although the parameters shown in Table III must be adjusted for every application, knowledge of the steady-state values for each measurement is not necessary. Even though the steady-state operating conditions of the plant are likely to change, the network parameters should rarely require readjustment.

6) Delay Unit, Reset, and Restriction Rules: Some supplementary functions of the neural-network-based method are briefly discussed in this section.

When multiple observations are necessary to detect a certain fault, the response time for each observation may be different because of the plant characteristics. In addition, some observations may be contaminated by noise, and the intensity of this sensor noise may be different for each measurement. In order to accommodate the different response characteristics of multiple data and to remove the effects of noise, a first-order lag (delay unit) is incorporated into the detection system. The sensitivity of the detection system to changes in the parameters of the delay unit should be evaluated with the width of the window fixed and the training patterns specified. The time constants chosen for this study are shown in Table IV. Note that if the window length is chosen to be large enough, and the time constant tau of the coolant flow rate delay unit is either 0.01 or 0.10, the network does not yield a false alarm from the unmodeled disturbances considered in this study. For particular applications, incorporating a pure dead-time delay in the network could also be effective.

A key feature of the detection system developed in this work is the detection of faults during transient operating conditions.


TABLE V
RESULTS OF TRAINING AND RECALL (MULTIPLE FAULTS)

Because the detection system does not include information on normal steady-state values, it cannot determine whether the plant is in a normal steady state or in an abnormal steady state. If all observations are steady during a fault condition, the detection system may misdiagnose the situation and conclude that the plant is normal. Hence, the detection system must be manually reset (reinitialized) after a fault is detected and the operator concludes that the plant has resumed normal steady-state operation. This is not a problem in practical implementations because once the system detects a fault, the alarm is kept until it is manually reset. The outputs of the network have values between zero and one, and a threshold value of 0.9 is used in this study to set the alarms for the operator.

From Table II, we notice that most faults are enumerated in pairs that have opposite directions. For example, if we want to detect fault #8p, it is natural to train the network with the target patterns of faults #8p and #8n. Because many of the process variables have second-order dynamic response characteristics, a false alarm of #8n is likely to occur after the correct detection of fault #8p. However, the probability that the opposite fault actually occurs immediately afterward is quite low in actual applications. Therefore, the fault detection system in this work is designed and implemented so that once a fault is detected, a fault with the opposite direction is not considered until the system is reset manually.

IV. SIMULATION RESULTS

The proposed fault detection and diagnosis system is expected to recall pretrained faults correctly. It should also generalize appropriately, even from distorted or noisy input data. From a different point of view, it is also desirable that the neural network can be trained to detect as many faults as possible.
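The threshold, reset, and restriction rules of Section III-B6 can be sketched as a small piece of alarm logic. The class `AlarmLogic` and its method names are hypothetical, and the sketch assumes that fault names end in "p" or "n" to mark the two directions of a pair, as in the text; the 0.9 threshold is the value used in the study.

```python
class AlarmLogic:
    """Latches alarms and blocks the opposite-direction fault until reset.

    Fault names follow the pairing used in the text ("#8p" / "#8n").
    This class is an illustrative helper, not part of the paper.
    """

    def __init__(self, fault_names, threshold=0.9):
        self.fault_names = list(fault_names)
        self.threshold = threshold
        self.active = set()

    @staticmethod
    def opposite(name):
        """Return the paired fault with the opposite direction."""
        return name[:-1] + ("n" if name.endswith("p") else "p")

    def update(self, outputs):
        """Process one vector of network outputs; return latched alarms."""
        for name, level in zip(self.fault_names, outputs):
            blocked = self.opposite(name) in self.active
            if level > self.threshold and not blocked:
                self.active.add(name)
        return sorted(self.active)

    def reset(self):
        """Manual reset by the operator once the plant is normal again."""
        self.active.clear()
```

In this sketch an alarm for #8p stays latched even after its output falls below the threshold, and a later high output for #8n is ignored until `reset()` is called.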
In this section, the capabilities and limitations regarding these requirements are examined and discussed using the CSTR simulation.

A. Recall to Trained Faults

Basically, if the training of the secondary neural network is accomplished within the time allocated for training and the error goal is achieved, recall should not be a problem. The error goal of the backpropagation algorithm is set at 1 × 10^(…) for the following experiments.

For a single fault, it is necessary to train the network by presenting the target pattern of the fault and the normal condition. Learning a faulty pattern along with the pattern of normal operation is important for achieving correct recall. If the network is trained using only the target pattern of a single fault, then only a single neuron in the output layer is fired for any input. Because most faults are considered in pairs, as mentioned previously, from a practical point of view it is recommended that the network be trained with the normal pattern and at least one pair of faulty patterns.

For multiple faults, target patterns should include the faults and the pattern for normal operation. The recall ability of the network is investigated by presenting both the normal pattern and one pair of faulty patterns. Then, by increasing the number of faults, training and recall capabilities are also investigated. Results are summarized in Table V. For example, suppose the network is trained using five sets of data that represent the faults #1p, #1n, #2p, #2n and normal operation. Fig. 8 shows an example of recall obtained by presenting the data of fault #1p. A symptom of the fault starts at time 2.00. The first neuron, which corresponds to fault #1p, fires promptly at time 2.20, detecting the change that occurred. Note that the fifth neuron, which corresponds to normal operation, also responds, but no false alarm occurs. The alarm comes before the plant stabilizes to the faulty steady condition at about time 2.60. Therefore, detection by this method is faster than the conventional method that uses steady-state data.

Fig. 8. An example of correct recall.

From the results shown in Table V, this fault detection and diagnosis system can also be trained using multiple faults within an allowable time period. However, as the number of faults increases, trapping in a local minimum of the error surface is more likely to happen. This actually occurred when we attempted to train 12 patterns simultaneously. Randomizing the network weights and biases at the beginning of training can help mitigate this problem.

As Table V shows, the correct recall rate decreases as the number of trained patterns increases. One reason is that some faults are similar to each other, such as faults #1n and #2p. A network trained using many faults is likely to give a false alarm by confusing such similar patterns. Furthermore, right after the occurrence of fault #2p, #2n, #6p or #6n, the coolant flow rate fluctuates for a while. This can also initiate a false alarm of fault #3p/#3n or #5p/#5n because the coolant flow rate is also the key observation for these fault pairs. By updating the error goal to a smaller value, the occurrence of such false alarms can be reduced, but not eliminated entirely.

There is no distinct limit on the number of patterns to be trained. For this particular application, however, we found that it is better to train the network using fewer than ten patterns. Of course, this will vary from application to application, and this limitation must be discovered as a part of the learning and training process.

B. Generalization from Untrained Faults, Input with Noise, and Faults with Different Severity

How does the secondary neural network respond to untrained faults? How does it work with noise-corrupted input data or faults with different severity? These generalization issues are examined next. Furthermore, this section contains a discussion of the applicability of the proposed system to real-world industrial processes.

First, the response of the detection system to an input that represents an untrained fault is investigated. According to the experiments performed with many pretrained networks, untrained faults were diagnosed as the normal operating condition. In the hyperplane generated by the input vectors, an arbitrary input vector is located closest to the vector of normal operation. Arbitrary input vectors can be considered to be outputs of the primary network representing unknown faults. Therefore, the network cannot generalize to this situation, but this is consistent with our objectives for the design of the detection system. As depicted in Fig. 6, the output neuron for the normal condition should be interpreted as "normal or untrained faults."

When the set point of the controller is changed, an unsteady operating condition is generated in the plant, but this is still normal operation. If this response is not similar to one of the trained faults, the network diagnoses a normal operating condition, as in the case of an untrained fault. However, faults #1p and #1n, for example, are quite similar to the response of the plant to a set point change of the inlet flow rate. It is quite likely that a false alarm is generated during such set point changes. As these set point changes are commonly initiated by an operator, the detection system should be temporarily disabled while the plant is in such a transient state.

Second, noisy inputs of pretrained faults are given to the network. White noise with a normal distribution [N(0, 1)] is multiplied by a constant and added to each measurable variable. The control actions of the three controllers are also affected by the noise processes. The noise level is expressed as a percentage that represents the ratio of the standard deviation of the noise term to the amount of change caused by the fault. Let us consider the same example as given in Fig. 8, in which the change of the coolant flow rate is about 4 ft³/h and the change of the control valve opening is about 8.1% of total stroke. If the standard deviation of the noise term of the coolant flow rate is 0.55 ft³/h, the noise level of the coolant flow rate is 0.55/4 × 100 ≈ 13.8%. If the standard deviation of the noise term of the control valve opening is 0.56%, the noise level of the control valve opening is 0.56/8.1 × 100 ≈ 6.9%.

For simplicity, the noise levels of the temperature and level measurements are kept constant. Other noise levels are altered as in Table VI, which also shows the results of this experiment. As expected, the more noise that is added, the more difficult it is to detect faults. However, the results demonstrate that the network is adequately robust to noise. Fig. 9 illustrates an example where fault #1p is correctly generalized from a faulty input with noise. Because the threshold level for firing each neuron is set at 0.9, the system can detect this fault. Further discussion of noise is given in the next section.

TABLE VI
GENERALIZATION RESULTS FROM TRAINED DATA WITH NOISE


Fig. 9. An example of generalization from input data with noise (fault #1p occurred at time 2.00).

Finally, the severity of the trained faults is changed and the ability of the network to generalize is investigated. For the network trained with the data of faults #1p, #1n, #2p, #2n and normal operation, the data for fault #1p with different severity are presented. In Fig. 10, the flow rate of the inlet reactant is changed from 40 to 44 ft³/h. In this case, the generalization results are studied by changing it from 40 to 48 ft³/h. Even though the severity (the magnitude of the bias in this example) is doubled, the output of the secondary neural network remains similar. The height of the moving window determines how sensitive detection is to the severity of the fault. As long as the severity is not negligible compared to the window height, the fault can usually be detected.

Fig. 10. Results of fault #3p: (a) trend of estimated and actual v ca and (b) trend of estimated and actual fo.

V. COMPARISON WITH MODEL-BASED METHOD

We have shown the effectiveness of the proposed neural-network-based approach to fault detection and diagnosis problems of the CSTR. In this section, the results of a comparison with a conventional model-based method are presented. Because the CSTR plant is highly nonlinear, the extended Kalman filter (EKF) is chosen to estimate the unmeasurable state variables and relevant parameters [5], [6].

A. Introduction to the Extended Kalman Filter [19], [20]

The recursive form of the EKF algorithm is used. The system and measurement equations are given by (9) and (10), in which the following quantities appear: the state vector; the measurable output vector at each sampling time; the modeling uncertainty vector, distributed as N(0, 1); the process noise covariance matrix; the measurement noise vector at each sampling time, distributed as N(0, 1); the measurement noise covariance matrix; the nonlinear function of the state in the system equation; and the nonlinear function of the state at each sampling time in the measurement equation. The two noise covariance matrices are assumed to be known diagonal matrices. Equations (9) and (10) represent a problem where the process is continuous and the measurement is discrete. Computation is performed by repeating the following propagation and updating steps in a discrete manner.

Propagation of State Estimate and Error Covariance: Suppose the state estimate and error covariance at a sampling time are known. Integration of the two equations (11) and (12) over the sampling interval (specified in hours) yields the state estimate and error covariance at the next sampling time. These equations involve the Jacobian matrix of the nonlinear system function evaluated at the current state estimate, defined in (13), together with the state estimate and the error covariance matrix of that estimate at the given time. The system function is linearized about the current state estimate at every time step.

Updating of the State Estimate and Error Covariance: Consider the estimated state and the error covariance obtained by propagation to the measurement time. The measurement taken at that time is used to update them to


give the updated state estimate and error covariance, (14) and (15), where the auxiliary quantities are defined in (16) and (17). The updated values are used to compute the state estimate and error covariance at the next time step.

B. Application of EKF to the CSTR

The state vector is defined in (18), together with a term that represents the modeling uncertainty. From the assumption on measurability, the output equation is written in terms of an observation matrix, where two of the state variables are assumed to be measurable and the others are not. Some of these variables are usually treated as inputs in such a state-space representation. However, in this application they are defined as state variables because they must be estimated.

For computational reasons, the system equations are modified to be dimensionless. The normalized state variables in deviation form are defined as shown in (19), where the normalizing constants are the steady-state values. From here on, the * is omitted to simplify the notation. Because the variables are defined as deviation variables, the initial state is the zero vector. From (1)–(4), the system equations of the CSTR are written as shown in (20), with the associated definitions in (21)–(23), which include the measurement noise vector.

Examining the parameters given in (21), we notice that there are variables that require estimation. For example, the coolant flow rate and the reactant concentration at the inlet can change during the operation of the process. In a fault mode, other parameters can also vary. The reason for



the choice of the system state as given in (18) is discussed in the next section.

C. Observability

The model (20) has unmeasured state variables and parameters that can change during different operating modes of the process. Estimates of these parameters are very helpful for fault detection and diagnosis. Because they are not directly measurable and are necessary for the detection of certain faults, the state vector is augmented to include these variables and parameters so that they can be estimated by the EKF. The augmented state must be observable; otherwise, it is meaningless to incorporate these variables and parameters into the model. Various realizations, other than the system representation given in the previous section, were considered. Unfortunately, all other realizations that were tried were unobservable.

D. EKF Tuning and Simulation for Fault Detection

The parameters of the EKF are the process noise covariance, the measurement noise covariance, and the initial error covariance, all of which are assumed to be diagonal matrices. Also, the initial state must be specified. As described in (9) and (10), the two noise covariance matrices can be determined from the known noise covariances. After that, however, further tuning by trial and error is always necessary to achieve stable and accurate estimation. Each element of the measurement noise covariance is inversely proportional to the gain matrix. Hence, smaller elements are chosen as long as the estimation process is stable. As each element of the process noise covariance increases, a faster response is obtained but the amplitude of fluctuation increases. Conversely, the smaller each element is, the slower the response and the smaller the fluctuations. The elements that correspond to unmeasurable states need to be chosen so that the estimated states track the true values. As long as the initial condition is given as specified and the two noise covariances are chosen appropriately, it is not necessary to adjust the initial error covariance. This delicate balancing of estimator parameters and performance is similar to the situation we discussed earlier regarding the influence of window height on network performance for different fault severity scenarios.

Two faults, #3p and #5p, are selected for the case study to compare the performance of the EKF with that of the neural-network-based approach. For #3p, an unmeasurable state rises due to the sticking of the control valve. For #5p, a model constant rises and consequently an unmeasurable state increases. Hence, the EKF estimates of these unmeasurable states are used to detect these faults. The square roots of the two noise covariance matrices and the covariance of the initial state estimation error are specified for the simulation. Computer simulations for the two faults are performed under these conditions.

Fault #3p: For fault #3p, the EKF parameters are adjusted. Results of the estimated states are shown in Fig. 10. The change of coolant flow rate affects the accuracy of the plant model. Nevertheless, the estimates of the two states track the actual values very well. The fault is detected by trending of the estimated states.

Fig. 11. Results of fault #5p.

Fault #5p: For fault #5p, the EKF parameters are adjusted again. Results of the estimated states are shown in Fig. 11. Despite the intensive effort to search for the optimal EKF parameters, the estimate does not follow the actual value. The coolant flow rate and the inlet concentration change due to the fault and affect the accuracy of the model. The plant model with degraded accuracy hinders the estimation and proper tracking. Hence, the fault is not detectable.

E. Comparison of the Two Methods

Following the above results, the EKF and the neural-network-based method are compared for the two faults #3p and #5p. Five different levels of noise are applied, and the performance for different S/N ratios is compared. For fault #3p, as the noise level increases, false alarms become likely and tuning of the parameters is necessary for more than half the cases. For all cases of fault #5p, the EKF does not work well even though extensive efforts were made to tune the parameters. We conclude this section with the following comments regarding the comparison.

1) A model-based approach such as the EKF depends significantly on the validity of the model. The performance of a model-based system is easily degraded by unmodeled disturbances such as measurement or process noise caused by perturbations or other malfunctions of the plant.
2) Use of the EKF is limited by observability of the realization, including unmeasurable states and parameters. If the system is not observable, we must look for a reduced set of states and/or parameters that are observable. Reducing the state variables makes the model vulnerable to uncertainty.
3) Parameters of the EKF need to be adjusted every time the noise level changes. Parameters of the neural-network-based approach are generally more robust in this sense.
4) The advantage of the model-based method is that it can estimate unmeasurable parameters, if they are observable. On the other hand, the neural-network method must rely on the measurable information and can best correlate with preknown faults through training.
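The propagation and updating steps described in Section V-A can be sketched in Python as follows. This is a generic continuous-discrete EKF in its textbook form, not the authors' implementation: the two-state test system, the linear measurement matrix, the Euler integration, and all numerical values are assumptions chosen only to make the example runnable. The second state plays the role of an unmeasured parameter appended to the state vector, in the spirit of Section V-C.

```python
import numpy as np


def ekf_propagate(x, P, f, jac_f, Q, dt, n_sub=10):
    """Euler-integrate dx/dt = f(x) and dP/dt = F P + P F^T + Q over one sampling interval."""
    h = dt / n_sub
    for _ in range(n_sub):
        F = jac_f(x)                       # linearize about the current estimate
        P = P + h * (F @ P + P @ F.T + Q)  # error covariance propagation
        x = x + h * f(x)                   # state estimate propagation
    return x, P


def ekf_update(x, P, y, H, R):
    """Measurement update for a linear measurement y = H x + v (an assumption here)."""
    S = H @ P @ H.T + R
    K = P @ H.T @ np.linalg.inv(S)         # gain matrix
    x = x + K @ (y - H @ x)
    P = (np.eye(x.size) - K @ H) @ P
    return x, P


# Hypothetical test system: x[0] is measured, x[1] is an unmeasured constant parameter.
def f(x):
    return np.array([-x[0] - 0.1 * x[0] ** 3 + x[1], 0.0])


def jac_f(x):
    return np.array([[-1.0 - 0.3 * x[0] ** 2, 1.0], [0.0, 0.0]])


H = np.array([[1.0, 0.0]])
Q = np.diag([1e-4, 1e-3])  # assumed tuning values
R = np.array([[1e-4]])     # variance of the simulated measurement noise
dt = 0.05

rng = np.random.default_rng(0)
x_true = np.zeros(2)
x_hat, P = np.zeros(2), 0.1 * np.eye(2)
for k in range(400):
    if k == 40:
        x_true[1] = 0.5  # a step change in the unmeasured parameter stands in for a fault
    for _ in range(10):  # simulate the plant with the same Euler step as the filter
        x_true = x_true + (dt / 10) * f(x_true)
    y = H @ x_true + 0.01 * rng.standard_normal(1)
    x_hat, P = ekf_propagate(x_hat, P, f, jac_f, Q, dt)
    x_hat, P = ekf_update(x_hat, P, y, H, R)

print("estimated parameter:", x_hat[1], "true parameter:", x_true[1])
```

In this toy case the filter and the plant share the same model, so the estimate of the unmeasured parameter can be trended to reveal the step change. When the model loses accuracy after a fault, as reported above for fault #5p, such tracking is no longer guaranteed.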

VI. CONCLUSIONS

A neural-network-based fault detection and diagnosis system is developed and applied to a plant model of a CSTR. We summarize and review the main results of this work, evaluate the neural-network-based approach for fault detection, and discuss the applicability to general industrial processes. A nonlinear CSTR was chosen as the study system. A plant model was developed and implemented for computer simulation. The following results were obtained from the case studies conducted using the CSTR model.

1) A two-stage neural-network system, where each element can be designed independently, has been proposed. Using data preprocessed by the moving time window, the primary network detects the transient state of each measurement dynamically, and this architecture can be used for many industrial process applications. A secondary neural network was developed to detect and diagnose a set of preknown faults according to the specification of the application. The two-stage network approach yields an efficient and simplified design procedure.
2) For the training of the neural networks, the backpropagation algorithm was chosen. This approach is considered to be better for training feedforward neural networks, as opposed to Hopfield networks. Combined with momentum and the adaptive learning rate method, training of the neural networks was performed very efficiently.
3) The secondary neural network can be trained for multiple faults as long as all the patterns differ from each other. It can also recall the trained faults correctly. However, the more patterns that are trained, the more likely it is to be trapped in a local minimum. Moreover, similar patterns of faults or perturbations that occur after a fault are likely to produce a false alarm.
4) The neural-network-based system can detect trained faults promptly during the transient period. It is faster than the method trained using steady-state data. It detects an untrained fault as the normal operating condition, as desired. When a set point of a controller is changed during normal operation of the plant, the detection system diagnoses that the plant is normal unless the response of the plant induced by the change in set point is similar to that of the trained faults. Because an operator should be aware of either a set point change or the occurrence of a measurable disturbance, in these situations the detection system should be turned off temporarily until the process returns to a normal operating state.
5) The secondary neural network can generalize from faulty data with noise if the amplitude of the noise is within certain bounds. It can also generalize from faulty data with different severity, unless the severity is much smaller than the window height.
6) A conventional model-based approach, the EKF, is chosen for this study. Its performance strictly depends on the accuracy of the model, and its applicability is restricted by observability of the realization. Compared with the EKF model-based approach, the neural-network-based approach is more robust with respect to noise. Generally, we do not have to change parameters of the neural-network-based method for different faults and different noise levels.
7) The secondary neural network must be designed using the given specifications of the plant. However, tuning of the parameters can be completed efficiently. This feature implies that a wide scope of industrial process applications can be addressed using the approach developed in this work.

REFERENCES
[1] D. M. Himmelblau, Fault Detection and Diagnosis in Chemical and Petrochemical Processes. New York: Elsevier, 1978.
[2] P. M. Frank, "Fault diagnosis in dynamic systems using analytical and knowledge-based redundancy: A survey and some new results," Automatica, vol. 26, no. 3, pp. 459–474, 1990.
[3] J. J. Gertler, "Survey of model-based failure detection and isolation in complex plants," IEEE Contr. Syst. Mag., vol. 8, pp. 3–11, 1988.
[4] K. A. Loparo, M. R. Buchner, and K. S. Vasudeva, "Leak detection in an experimental heat exchanger process: A multiple model approach," IEEE Trans. Automat. Contr., vol. 36, 1991.
[5] S. Park and D. M. Himmelblau, "Fault detection and diagnosis via parameter estimation in lumped dynamic systems," Ind. Eng. Chem. Process Des. Dev., vol. 22, no. 3, pp. 482–487, 1983.
[6] R. Li and J. H. Olson, "Fault detection and diagnosis in a closed-loop nonlinear distillation process: Application of extended Kalman filters," Ind. Eng. Chem. Res., vol. 30, no. 5, pp. 898–908, 1991.
[7] S. K. Shum, J. F. Davis, W. F. Punch, and B. Chandrasekaran, "An expert system approach to malfunction diagnosis in chemical plants," Comput. Chem. Eng., vol. 12, no. 1, pp. 27–36, 1988.
[8] J. C. Hoskins and D. M. Himmelblau, "Artificial neural-network models of knowledge representation in chemical engineering," Comput. Chem. Eng., vol. 12, nos. 9/10, pp. 881–890, 1988.
[9] V. Venkatasubramanian and K. Chan, "A neural-network methodology for process fault diagnosis," AIChE J., vol. 35, no. 12, pp. 1993–2001, 1989.
[10] K. Watanabe, I. Matsuura, M. Abe, and M. Kubota, "Incipient fault diagnosis of chemical processes via artificial neural networks," AIChE J., vol. 35, no. 11, pp. 1803–1812, 1989.
[11] V. Venkatasubramanian, R. Vaidyanathan, and Y. Yamamoto, "Process fault detection and diagnosis using neural networks, I: Steady-state processes," Comput. Chem. Eng., vol. 14, no. 7, pp. 699–712, 1990.
[12] J. C. Hoskins, K. M. Kaliyur, and D. M. Himmelblau, "Fault diagnosis in complex chemical plants using artificial neural networks," AIChE J., vol. 37, no. 1, pp. 137–141, 1991.
[13] W. E. Dietz, E. L. Kiech, and M. Ali, "Jet and rocket engine fault diagnosis in real time," J. Neural-Network Computing, vol. 1, no. 5, pp. 5–17, 1989.
[14] R. Li, J. H. Olson, and D. L. Chester, "Dynamic fault detection and diagnosis using neural networks," in Proc. 5th IEEE Symp. Intell. Contr., 1990, pp. 1169–1174.
[15] Y. Ohga and H. Seki, "Abnormal event identification in nuclear power plants using a neural network and knowledge processing," Nuclear Technol., vol. 101, pp. 159–167, Feb. 1993.
[16] W. L. Luyben, Process Modeling, Simulation, and Control for Chemical Engineers. New York: McGraw-Hill, 1990.


[17] D. E. Rumelhart, G. E. Hinton, and R. J. Williams, "Learning internal representations by error propagation," in Parallel Distributed Processing: Explorations in the Microstructure of Cognition, vol. 1: Foundations, D. E. Rumelhart and J. L. McClelland, Eds. Cambridge, MA: MIT Press, 1986.
[18] T. P. Vogl, J. K. Mangis, A. K. Rigler, W. T. Zink, and D. L. Alkon, "Accelerating the convergence of the backpropagation method," Biol. Cybern., vol. 59, pp. 257–263, 1988.
[19] C. K. Chui and G. Chen, Kalman Filtering. New York: Springer-Verlag.
[20] A. H. Jazwinski, Stochastic Processes and Filtering Theory. New York: Academic, 1970.

Yunosuke Maki was born in Toyohashi, Japan, in 1958. He received the B.S. degree in mathematics and instrumentation from the University of Tokyo in 1981 and the M.S. degree in Systems and Control Engineering from Case Western Reserve University, Cleveland, OH, in 1994. Since 1981, he has been working for the Kawasaki Steel Corporation in Japan. His research interests include the industrial applications of systems and control theory, especially to the steel making process. Mr. Maki is currently a member of the Iron and Steel Institute in Japan.

Kenneth A. Loparo (S'75–M'77–SM'89) received the Ph.D. degree in systems and control engineering from Case Western Reserve University, Cleveland, OH, in 1977. He was an Assistant Professor in the Mechanical Engineering Department at Cleveland State University, OH, from 1977 to 1979, where he received the Distinguished Faculty Award for contributions to teaching and research. From 1979 to the present time, he has been on the faculty of The Case School of Engineering, Case Western Reserve University, where he is currently Associate Dean of The Case School of Engineering and Professor of Systems and Control Engineering. He is also Professor of Mechanical and Aerospace Engineering and Professor of Mathematics. He served as Chair of the Department of Systems Engineering from 1990 to 1994 and as Associate Director of the Center for Automation and Intelligent Systems Research from 1985 to 1989. His research interests include stability and control of nonlinear and stochastic systems with applications to large-scale electric power systems; nonlinear filtering with applications to monitoring, fault detection, diagnosis, and reconfigurable control; information theory aspects of stochastic and quantized systems with applications to adaptive and dual control; and the design of digital control systems. At Case Western Reserve University he has received numerous awards including the Sigma Xi Research Award for contributions to stochastic control, the John S. Diekhoff Award for Distinguished Graduate Teaching, the Tau Beta Pi Outstanding Engineering and Science Professor Award, the Undergraduate Teaching Excellence Award, and the Carl F. Wittke Award for Distinguished Undergraduate Teaching.
