Abstract: Data-driven process monitoring has been extensively discussed in both academia and industry because of its applicability and effectiveness. One of the most applied techniques is principal component analysis (PCA). Recently, a new technique called principal component pursuit (PCP) has been introduced. Compared to PCA, PCP is more robust to outliers. In this paper, the application of the PCP technique to process monitoring is thoroughly discussed, from training data preprocessing to residual signal post-filtering. A new scaling preprocessing step is proposed to improve the quality of data matrices in the sense of low coherence. A residual generator and a post-filter suitable for PCP-generated process models are also provided. The post-filtered residual represents the fault signal, which makes the fault detection, isolation and reconstruction procedures simple and straightforward. A numerical example is provided to describe and illustrate the PCP-based process modeling and monitoring procedures.

This work was supported by an NSERC strategic project.
Y. Cheng and T. Chen are with the Department of Electrical and Computer Engineering, University of Alberta, Alberta, T6G 2V4, Canada. Email: cheng5@ualberta.ca; tchen@ualberta.ca.

NOTATION

$\|X\|_*$        nuclear norm (sum of singular values)
$\|X\|_0$        $\ell_0$ norm of a matrix (number of non-zero entries)
$\|X\|_1$        $\ell_1$ norm of a matrix (sum of magnitudes of all entries)
$\|X\|_\infty$   $\ell_\infty$ norm of a matrix (maximum magnitude of all entries)
diag(X)          the set of diagonal entries of a square matrix

I. INTRODUCTION

In process monitoring, data-driven methods are widely used because of their low cost and the availability of an enormous amount of process data from historical databases. It is generally true that high dimensional process data are governed by underlying low dimensional subspaces. Therefore, principal component analysis (PCA) has become one of the most popular and widely used data-driven methods in industry to capture such low dimensional subspaces. However, PCA is sensitive to outliers, also called gross errors, in the data. In recent years, a new matrix decomposition technique called principal component pursuit (PCP) has been introduced and extensively discussed [1]–[5]. Compared with PCA, PCP is robust to outliers. Moreover, the new technique outperforms other robust PCA methods because of its polynomial-time complexity and the mild conditions under which good performance can be guaranteed.

The PCP technique stems from compressed sensing [6], [7], which reveals a surprising message: the minimum number of data needed to reconstruct a signal may overcome the limitation imposed by the Nyquist-Shannon criterion if the signal is sparse in a certain sense. Inspired by this idea, a matrix completion method was proposed to recover a data matrix from only a few entries [8]. Going one step further, researchers focused on a more challenging problem: recovering a low rank data matrix contaminated by gross errors on some of its entries. A novel solution, the PCP technique, was then provided [1], [2]. The essential idea of the PCP technique is to replace the original non-convex optimization problem of the matrix rank and the count of non-zero entries by a convex optimization problem using the nuclear and $\ell_1$ norms. In [2], [3], deterministic conditions under which the two optimization problems are exactly equivalent were provided; statistical counterparts were provided in [1]. While the conditions are relatively mild, they greatly depend on the coherence of the uncontaminated data matrix. The concept of coherence and a related coherence index were first introduced in a compressed sensing method in [9]; the same authors then adjusted this concept to matrix completion and PCP techniques [1], [8], and defined three coherence indices. Generally speaking, the smaller the coherence indices are, the lower the requirement on the completeness of the signal or the matrix is.

The PCP technique has become popular in the image processing area. The technique has been applied successfully to video surveillance [1], [10] and face recognition [11]. There are also some attempts to apply PCP to latent semantic indexing [12]. The PCP technique can also be applied to many other potential problems. Briefly speaking, in all the areas where PCA can be used and outliers are inevitable, PCP may be a good alternative. Process monitoring is such an area. However, to the best of our knowledge, there has been only one paper on this topic [13]. In [13], a comparison between the process monitoring approaches based on PCA and PCP was given. A conclusion was drawn that the PCP technique was promising in process monitoring because the PCP-based method could overcome most of the shortcomings of PCA-based methods.

This paper further discusses how to apply the PCP technique to process monitoring, especially the data preprocessing. The rest of the paper is organized as follows. In Section II, a brief introduction to PCP is given. In Section III, a new scaling method is proposed as a preprocessing step to improve the PCP-based modeling result. An online fault detection and diagnosis procedure [...]
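The convex relaxation described in the introduction (nuclear norm in place of rank, $\ell_1$ norm in place of the count of non-zero entries) can be illustrated with a minimal sketch. The solver below is an assumption made for illustration only: it uses a plain augmented Lagrangian iteration, and the defaults for `lam` and `mu` are heuristics commonly seen in the PCP literature, not values taken from this paper. The function names `pcp` and `demo` are hypothetical.

```python
import numpy as np


def soft_threshold(X, tau):
    """Entry-wise shrinkage: proximal operator of tau * l1 norm."""
    return np.sign(X) * np.maximum(np.abs(X) - tau, 0.0)


def singular_value_threshold(X, tau):
    """Shrink singular values: proximal operator of tau * nuclear norm."""
    U, s, Vt = np.linalg.svd(X, full_matrices=False)
    return (U * soft_threshold(s, tau)) @ Vt


def pcp(M, lam=None, mu=None, tol=1e-7, max_iter=2000):
    """Sketch of min ||L||_* + lam * ||S||_1 subject to L + S = M.

    lam and mu defaults are assumed heuristics, not the paper's settings.
    """
    n1, n2 = M.shape
    if lam is None:
        lam = 1.0 / np.sqrt(max(n1, n2))
    if mu is None:
        mu = n1 * n2 / (4.0 * np.abs(M).sum())
    L = np.zeros_like(M)
    S = np.zeros_like(M)
    Y = np.zeros_like(M)  # Lagrange multiplier for L + S = M
    norm_M = np.linalg.norm(M)
    for _ in range(max_iter):
        L = singular_value_threshold(M - S + Y / mu, 1.0 / mu)
        S = soft_threshold(M - L + Y / mu, lam / mu)
        R = M - L - S  # constraint violation
        Y = Y + mu * R
        if np.linalg.norm(R) <= tol * norm_M:
            break
    return L, S


def demo(n1=60, n2=40, r=2, outlier_fraction=0.05, seed=0):
    """Synthetic low rank matrix plus sparse gross errors (made-up data)."""
    rng = np.random.default_rng(seed)
    L0 = rng.standard_normal((n1, r)) @ rng.standard_normal((r, n2))
    S0 = np.zeros((n1, n2))
    mask = rng.random((n1, n2)) < outlier_fraction
    S0[mask] = rng.choice([-10.0, 10.0], size=int(mask.sum()))
    L_hat, _ = pcp(L0 + S0)
    # Relative recovery error of the low rank part
    return np.linalg.norm(L_hat - L0) / np.linalg.norm(L0)
```

Calling `demo()` returns the relative error between the recovered and the true low rank matrix on synthetic data, which gives a quick feel for how well the outliers are separated.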
[Figure: time trends of the variables x1 to x7 and x29 to x35 over samples 0 to 1000.]

$n_1 > n_2$. Assume that the outlier-free low rank matrix $L_0$ has the SVD $L_0 = U \Sigma V^T$, and that the scaling vector is

$$\alpha = [\, \alpha_1 \;\; \alpha_2 \;\; \cdots \;\; \alpha_{n_2} \,],$$

where $\alpha_i > 0$, $i = 1, 2, \cdots, n_2$. The scaled outlier-free matrix is

$$L_s = U \Sigma V^T \begin{bmatrix} \alpha_1 & 0 & \cdots & 0 \\ 0 & \alpha_2 & \ddots & \vdots \\ \vdots & \ddots & \ddots & 0 \\ 0 & \cdots & 0 & \alpha_{n_2} \end{bmatrix} = U_s \Sigma_s V_s^T.$$

Obviously, the left null space of $L_0$ is the same as that of $L_s$. As a result, $\mu(U) = \mu(U_s)$, since

$$\frac{r}{n_1}\,\mu(U) = \max_i \|U^T e_i\|^2 = \max(\mathrm{diag}(U U^T)) = 1 - \min(\mathrm{diag}(I - U U^T)) = \max(\mathrm{diag}(U_s U_s^T)) = \frac{r}{n_1}\,\mu(U_s).$$

However, $\mu(V) \neq \mu(V_s)$. Given an orthogonal basis of the null space of $L_0$, say $V_\perp$, and the scaling vector $\alpha$, the procedure to obtain $\mu(V_s)$ is as follows:

1) Calculate a basis of the null space of $L_s$:

$$\begin{bmatrix} \alpha_1 & 0 & \cdots & 0 \\ 0 & \alpha_2 & \ddots & \vdots \\ \vdots & \ddots & \ddots & 0 \\ 0 & \cdots & 0 & \alpha_{n_2} \end{bmatrix}^{-1} V_\perp;$$

2) Find a normalized orthogonal basis $V_{s\perp}$;

3) Calculate the coherence index:

$$\mu(V_s) = 1 - \min(\mathrm{diag}(V_{s\perp} V_{s\perp}^T)).$$
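The three-step procedure above can be checked numerically with a short sketch. Everything here is illustrative: the matrix sizes, the random data and the function names are assumptions, and the value returned is the quantity written in step 3 (the largest squared row norm of $V_s$), with no additional normalization applied. The sketch also confirms that the left-side quantity $\max(\mathrm{diag}(UU^T))$ is unchanged by column scaling.

```python
import numpy as np


def max_row_energy(B):
    """Largest squared row norm of B, i.e. max(diag(B B^T))."""
    return float(np.max(np.sum(B * B, axis=1)))


def scaled_right_coherence(V_perp, alpha):
    """Steps 1 to 3: quantity for V_s from V_perp and the scaling vector."""
    # Step 1: diag(alpha)^{-1} V_perp spans the null space of L_s = L_0 diag(alpha)
    basis = V_perp / alpha[:, None]
    # Step 2: orthonormalize the basis (reduced QR keeps the same column span)
    Q, _ = np.linalg.qr(basis)
    # Step 3: 1 - min(diag(Q Q^T))
    return 1.0 - float(np.min(np.sum(Q * Q, axis=1)))


def demo(n1=50, n2=8, r=3, seed=0):
    """Compare the procedure with a direct SVD of the scaled matrix."""
    rng = np.random.default_rng(seed)
    L0 = rng.standard_normal((n1, r)) @ rng.standard_normal((r, n2))
    alpha = rng.uniform(0.5, 5.0, size=n2)
    Ls = L0 * alpha  # broadcasting scales column j by alpha[j]
    U, _, Vt = np.linalg.svd(L0)    # full SVD: Vt is n2 x n2
    Us, _, Vst = np.linalg.svd(Ls)
    V_perp = Vt[r:].T               # orthonormal basis of the null space of L0
    return {
        "left_L0": max_row_energy(U[:, :r]),
        "left_Ls": max_row_energy(Us[:, :r]),
        "right_direct": max_row_energy(Vst[:r].T),
        "right_procedure": scaled_right_coherence(V_perp, alpha),
    }
```

In `demo()`, `left_L0` and `left_Ls` should agree, and `right_procedure` should match `right_direct`. The practical point of the procedure is that it needs only $V_\perp$ and $\alpha$, so a search over scaling vectors does not require a new SVD of the scaled matrix at every candidate.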
[...] in the 50 trials are shown in Fig. 2. The solutions are not quite stable, but they have similar trends. Figs. 3 and 4 provide the values of $\mu(V_s)$ and $\mu_1$ corresponding to the optimal solutions found in the 50 trials. The red dash-dot line and the black dashed line show the values of the original data [...] but the scaled results are obviously better than the original and standard normalized ones. Then we apply Algorithm [...]

[Figure legend: $\mu_1$ of well scaled data; $\mu_1$ of standard normalized data; $\mu_1$ of original data.]

Fig. 5. Histogram of the number of iterations used to reach convergence

[...] we search for the optimal scaling parameters. As a result, [...] the RS of the well-scaled data rather than the original data [...] using the outlier-free vector $\hat{x}$ is the stationary requirement [...] hold in practice, especially when the process works at several different operating points. [...] residual variable. Because of the existence of outliers, also called impulse noise in the signal processing literature, the [...]

Example 3: We continue with Examples 1 and 2. 200 more samples are generated in the original PCS with 5% [...]
2) the 101st to 120th samples of the 5th variable have an offset of -4;
3) from the 31st sample, an offset of 800 is added on the 35th variable.

Randomly choosing one group of scaling vectors and the corresponding PCS and RS we obtained in Example 2, we solve the optimization problem in (7) for each sample. Finally the residual signal is obtained. Since the residual is the [...]

Fig. 6. Time trends of the fault signals and the filtered residual signals

V. CONCLUDING REMARKS AND FUTURE WORK

A process monitoring method based on PCP is thoroughly discussed. In order to improve modeling accuracy, an optimal scaling method is proposed as a preprocessing step. After modeling, a PCP-based fault detection and diagnosis approach is introduced. Compared with PCA-based approaches, the proposed approach determines the residual signal by solving constrained linear programming problems instead of unconstrained quadratic programming problems. By filtering the residual signals with univariate generalized median filters, fault detection, isolation and reconstruction can be fulfilled simultaneously with ease, which is the main advantage over PCA-based approaches.

However, there is still much room to improve the optimal scaling algorithm. In future work, we will adopt other heuristic or non-heuristic algorithms that may be better than the DE algorithm in terms of accuracy and speed. Coordinate search may be a good choice because of its low computational burden, but its convergence property in our problem still needs further study. Iterating between the optimal scaling and modeling steps can improve the accuracy of the coherence indices, which is degraded by using the observed matrix $M$ instead of the outlier-free matrix $L_0$ to estimate $V_\perp$; however, this is time consuming until computationally efficient algorithms are introduced.

Moreover, in practice, both outliers and noise exist in the disturbance, whereas in our work only outliers are considered. A robust PCP technique that also considers small entry-wise noise was recently proposed in [4]. In the future, the performance of the robust PCP technique with both noise and outliers will therefore also be studied.

REFERENCES

[13] J. D. Isom and R. E. LaBarre, “Process fault detection, isolation, and reconstruction by principal component pursuit,” in 2011 American Control Conference, San Francisco, CA, USA, 2011, pp. 238–243.
[14] M. G. Borgognone, J. Bussi, and H. Guillermo, “Principal component analysis in sensory analysis: covariance or correlation matrix?” Food Quality and Preference, vol. 12, no. 5-7, pp. 323–326, 2001.
[15] J. Wen, X. Xiao, J. Dong, Z. Chen, and X. Dai, “Data normalization for diabetes II metabonomics analysis,” in The 1st International Conference on Bioinformatics and Biomedical Engineering, Wuhan, China, 2007, pp. 682–685.
[16] M. Elad, “Optimized projections for compressed sensing,” IEEE Transactions on Signal Processing, vol. 55, no. 12, pp. 5695–5702, 2007.
[17] J. M. Duarte-Carvajalino and G. Sapiro, “Learning to sense sparse signals: simultaneous sensing matrix and sparsifying dictionary optimization,” IEEE Transactions on Image Processing, vol. 18, no. 7, pp. 1395–1408, 2009.
[18] R. Storn and K. Price, “Differential evolution - a simple and efficient heuristic for global optimization over continuous spaces,” Journal of Global Optimization, vol. 11, no. 4, pp. 341–359, 1997.
[19] L. H. Chiang, E. L. Russell, and R. D. Braatz, Fault Detection and Diagnosis in Industrial Systems. London, Great Britain: Springer-Verlag, 2001.
[20] S. J. Qin, “Statistical process monitoring: Basics and beyond,” Journal of Chemometrics, vol. 15, pp. 480–502, 2003.
[21] F. A. Carlos and S. J. Qin, “Reconstruction-based contribution for process monitoring,” Automatica, vol. 45, pp. 1593–1600, 2009.
[22] D. L. Donoho, “For most large underdetermined systems of linear equations the minimal ℓ1-norm solution is also the sparsest solution,” Communications on Pure and Applied Mathematics, vol. 59, no. 6, pp. 797–829, 2006.
[23] A. C. Bovik, T. S. Huang, and D. C. Munson, “A generalization of median filtering using linear combinations of order statistics,” IEEE Transactions on Acoustics, Speech, and Signal Processing, vol. 31, no. 6, pp. 1342–1350, 1983.
[24] Y. H. Lee and S. A. Kassam, “Generalized median filtering and related nonlinear filtering techniques,” IEEE Transactions on Acoustics, Speech, and Signal Processing, vol. 33, no. 3, pp. 672–683, 1985.