
Energy and Buildings 39 (2007) 52–58

www.elsevier.com/locate/enbuild

Using intelligent data analysis to detect abnormal energy consumption in buildings
John E. Seem *
Johnson Controls, Inc., 507 East Michigan Street, Milwaukee, WI 53202, USA
Received 31 October 2005; received in revised form 11 March 2006; accepted 18 March 2006

Abstract
This paper describes a novel method for detecting abnormal energy consumption in buildings based on daily readings of energy consumption
and peak energy consumption. The method uses outlier detection to determine if the energy consumption for a particular day is significantly
different than previous energy consumption. For buildings with abnormal energy consumption, the amount of variation from normal is determined
using robust estimates of the mean and standard deviation. This new data analysis method will reduce operating costs by detecting problems that
previously would have gone unnoticed. Also, operators should save time by not having to manually detect faults or diagnose false alarms. The new
data analysis method has successfully detected high energy consumption in many buildings. This paper presents field test results for buildings that
had the following problems: (1) chiller failure and a poor control strategy, (2) poor design of ventilating and air-conditioning equipment, and (3)
improper operation of equipment following a change in the electrical panel.
© 2006 Elsevier B.V. All rights reserved.

Keywords: Energy consumption; Fault detection; Outlier analysis; Performance monitoring; Robust statistics

* Tel.: +1 414 524 4677; fax: +1 414 524 5810.
E-mail address: john.seem@gmail.com.
0378-7788/$ - see front matter © 2006 Elsevier B.V. All rights reserved.
doi:10.1016/j.enbuild.2006.03.033

1. Introduction

Energy management and control systems can collect and store massive quantities of energy
consumption data. Facility operators can be overwhelmed with the quantity of data. For many
operators, it is not possible to detect equipment, design, or operation problems because of data
overload. Modern building management systems have two systems to help the operators with this data
overload: alarm and warning systems and data visualization programs. Today, operators must select
the thresholds for alarms and warnings. This is a difficult task. If the thresholds are too tight,
then a number of false alarms are issued, and if the thresholds are too loose, then equipment or
system failures can go undetected. The data visualization programs can help building operators
detect and diagnose problems, but a large amount of time can be spent detecting problems. Also, the
expertise of building operators varies greatly. New or inexperienced operators may have difficulty
detecting faults and the performance of an operator can vary with the time of day or day of the week.

The research community has developed a number of methods for detecting faults in buildings and
heating, ventilating, and air-conditioning systems. Two major research efforts have been sponsored
by the International Energy Agency: Annex 25 [1,2] and Annex 34 [3]. There are two basic approaches
to fault detection and diagnostics in buildings: a component level (bottom-up) approach and a
whole-building (top-down) approach. The component level approach looks for faults in individual
systems such as variable-air-volume boxes, air-handling units, chillers, or boilers. The
whole-building approach looks for unusual behavior in high-level measurements such as the
whole-building cooling, heating, or electrical consumption.

Claridge et al. [4] describe an energy consumption report method that helps building operators and
facility managers identify if the building systems are working properly. The report contains
scatter plots of daily chilled water energy consumption versus average daily temperature and daily
hot water consumption versus average daily temperature for a 3-month period. For the last month,
the scatter plot uses letters (M, T, W, H, F, S, U) to identify the days of the week. The letters
help building operators identify outliers in energy consumption for a particular day. The report
also contains two- and three-dimensional time series plots of chilled water consumption and
whole-building electric consumption. By inspecting these plots, building operators can identify
days of abnormal energy consumption. Haberl and Abbas [5,6] review several new graphical displays
for viewing building energy data.

Dodier and Kreider [7] present a method for detecting whole-building energy problems for the
following energy uses: whole-building total electric energy, whole-building total thermal energy,
HVAC-other-than-chiller electric energy, and chiller energy usage. They used an Energy Consumption
Index (ECI) to determine if the energy consumption was higher than normal, normal, or lower than
normal. The ECI is the ratio of actual energy consumption to expected energy consumption as
determined from a neural network. If the ratio is larger than an upper limit (e.g., 1.125), then
the state of the system is higher than normal. If the ratio is lower than a lower limit (e.g.,
0.875), then the state of the system is lower than normal. If the ratio is between the lower limit
and upper limit, then the state of the system is normal. Graphs of the ECI will help building
operators identify major changes in energy consumption. Figures presented by Dodier and Kreider [7]
show a weekly cycle of ECI.

This paper presents an intelligent data analysis method [8] for automatically detecting abnormal
energy consumption in buildings. With this method, operators will not have to go through the
tedious process of manually inspecting graphs to detect abnormal energy consumption. Instead, the
operator or maintenance operator can investigate only buildings with abnormal energy consumption.
The method accounts for weekly variation in energy consumption by grouping days of the week with
similar power consumption. A robust outlier detection method is used to determine if the energy
consumption is significantly different than previous energy consumption. For time periods with
abnormal energy consumption, the amount of deviation from normal is determined using robust
statistical methods.

Nomenclature

$a \in B$    a is an element of set B
$a \notin B$    a is not an element of set B
$i$    index used in for loop in Fig. 1
$n$    number of elements in set X
$n_{out}$    number of outliers in set X
$p$    right tail area probability for t-distribution
$R_i$    extreme studentized deviate for ith extreme
$s$    standard deviation for elements in set X
$s_{robust}$    robust estimate of standard deviation for elements in set X
$t_{n,p}$    critical value for the Student's t-distribution with n degrees of freedom and a right
             tail area probability of p
$x_{e,i}$    value of ith extreme
$x_j$    value of jth observation in set X
$\bar{x}$    average of elements in set X
$\bar{x}_{robust}$    robust estimate of average of elements in set X
$X$    set of observations that contain outliers and non-outliers
$X_{non\text{-}out}$    set of observations that contain no outliers
$X_{out}$    set of observations that contain outliers
$z_m$    modified z-score (standard score)
$\{\,\}$    set of observations or elements
$|$    such that

Greek letters
$\alpha$    probability of declaring a normal value an outlier
$\lambda_i$    critical value for Rosner's generalized ESD many-outlier procedure

2. Overview of data analysis method

Fig. 1 shows the major steps required to identify abnormal energy consumption in buildings. The
feature extraction block determines features such as the average daily consumption or peak demand
for a day from energy data such as the whole-building electrical consumption. The features are then
sorted into groups based on days of the week with similar energy consumption profiles. (In this
paper, the term "day type" refers to days of the week with similar consumption profiles.) After the
data is grouped based on day type, outlier identification is used to determine the features that
are significantly different from the features for the same day type. If any outliers are
identified, then a modified z-score [9] is used to determine the amount and direction of variation
from a normal observation. (z-Scores are also called standard scores [10].) Next, details on a
robust outlier identification method and a robust method for determining the amount of variation
from normal are presented.

Fig. 1. Block diagram for detecting abnormal energy consumption.
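
The first two steps of Fig. 1 (feature extraction and grouping by day type) can be illustrated with a minimal Python sketch. It assumes the raw data is a list of (timestamp, power) samples taken at equal intervals, so that the mean of a day's samples can stand in for the average daily consumption. The function names, the sample values, and the fixed weekday/weekend mapping are illustrative assumptions and are not taken from the paper.

```python
from collections import defaultdict
from datetime import datetime


def day_type(day):
    """Hypothetical mapping: Saturday/Sunday -> 'weekend', otherwise 'weekday'."""
    return "weekend" if day.weekday() >= 5 else "weekday"


def daily_features(readings):
    """Reduce (datetime, power) samples to {date: (average, peak)} per day."""
    by_day = defaultdict(list)
    for stamp, power in readings:
        by_day[stamp.date()].append(power)
    return {day: (sum(v) / len(v), max(v)) for day, v in by_day.items()}


def group_by_day_type(features):
    """Sort daily features chronologically into one list per day type."""
    groups = defaultdict(list)
    for day in sorted(features):
        groups[day_type(day)].append((day, features[day]))
    return dict(groups)


# Made-up samples: a Friday and a Saturday, two readings each.
readings = [
    (datetime(2024, 1, 5, 9, 0), 120.0),
    (datetime(2024, 1, 5, 15, 0), 180.0),
    (datetime(2024, 1, 6, 9, 0), 60.0),
    (datetime(2024, 1, 6, 15, 0), 80.0),
]
groups = group_by_day_type(daily_features(readings))
```

Each list in `groups` would then be passed to the outlier identification step separately, so that a weekend day is only ever compared with other weekend days.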

3. Outlier identification: GESD many-outlier procedure

An outlier is an observation that appears to be inconsistent with the majority of observations in
a data set. For example, in the data set {1, 2, 1, 0, 3, 2, 101, 2}, the observation 101 appears to
be an outlier. Data sets may contain more than one outlier. For example, in the data set
{1, 2, 1, 0, 3, 2, 101, 2, 96, 2, 0, 209}, the observations 101, 96, and 209 appear to be outliers.

Barnett and Lewis [11] provide details on several common outlier identification methods. After
comparing several popular outlier identification methods, Iglewicz and Hoaglin [9] highly recommend
the generalized extreme studentized deviate (ESD) many-outlier procedure that was proposed by
Rosner [12] because it works well under a variety of conditions.

The generalized ESD many-outlier procedure can identify the elements in a set that are outliers.
Fig. 2 is a flow chart for determining one or more outliers from a set of n observations
$X = \{x_1, x_2, x_3, \ldots, x_n\}$. The user needs to specify the probability, $\alpha$, of
incorrectly declaring one or more outliers when no outliers exist and an upper bound, $n_u$, on the
number of potential outliers. Carey et al. [13] said the upper bound ($n_u$) could be determined by
finding the largest integer that satisfies the following inequality: $n_u \le 0.5(n - 1)$.
Following are details on the numbered blocks in Fig. 2.

Fig. 2. Flow chart for implementing Rosner's generalized many-outlier procedure.

• Block 1: Set $n_{out} = 0$. This step is used to initialize the number of outliers to zero.
• Block 2: Compute average ($\bar{x}$) of elements in set X. The average is determined from

$$\bar{x} = \frac{\sum_{j=1}^{n} x_j}{n} \qquad (1)$$

  where $x_j$ is a member of set X and n equals the number of elements in set X.
• Block 3: Compute standard deviation (s) of elements in set X. The standard deviation is
  determined from

$$s = \sqrt{\frac{\sum_{j=1}^{n} (x_j - \bar{x})^2}{n - 1}} \qquad (2)$$

• Block 4: $s = 0$. This block checks if the standard deviation of the elements in set X is zero.
  If the standard deviation equals zero, then the elements in set X all have the same value and
  there are no outliers in the remaining elements in set X. (During field-testing of this method,
  several data sets had a standard deviation of zero.) To prevent a divide by zero in Block 6,
  execution goes to Block 10 when the standard deviation determined in Block 3 equals zero.
• Block 5: Find ith extreme ($x_{e,i}$) in set X. The extreme element, $x_{e,i}$, is the element in
  set X that is furthest from $\bar{x}$. Of all the elements in set X, the extreme element
  $x_{e,i}$ maximizes the function $|x_j - \bar{x}|$ where $x_j$ is an element of set X.
• Block 6: Compute ith extreme studentized deviate $R_i$. The extreme studentized deviate is
  determined from

$$R_i = \frac{|x_{e,i} - \bar{x}|}{s} \qquad (3)$$

  where $R_i$ is a normalized measure of how far the ith extreme is from the average value
  ($\bar{x}$) determined in Block 2.
• Block 7: Compute ith critical value $\lambda_i$. Rosner [12] developed the following equation for
  determining the critical value:

$$\lambda_i = \frac{(n - i)\, t_{n-i-1,p}}{\sqrt{(n - i + 1)\left(n - i - 1 + t_{n-i-1,p}^{2}\right)}} \qquad (4)$$

  where $t_{n-i-1,p}$ is the critical value of the Student's t-distribution with $(n - i - 1)$
  degrees of freedom and the tail area probability p is determined from

$$p = \frac{\alpha}{2(n - i + 1)} \qquad (5)$$

  Abramowitz and Stegun [14] review equations for estimating the Student's t-distribution.
• Block 8: $R_i > \lambda_i$. This block determines if the ith extreme studentized deviate, $R_i$,
  determined in Block 6 is greater than the ith critical value, $\lambda_i$, determined in Block 7.
• Block 9: Set $n_{out} = i$. This block sets the number of outliers, $n_{out}$, equal to i.
• Block 10: Remove extreme element $x_{e,i}$ from set X. The extreme element $x_{e,i}$ is removed
  from set X and after removing the extreme element $x_{e,i}$, the number of elements in set X is
  $n - i$. If i equals $n_u$, then execution goes to Block 11; otherwise, return to the for loop
  on i.

• Block 11: Outliers $\{x_{e,1}, x_{e,2}, \ldots, x_{e,n_{out}}\}$. This block determines the
  extreme values that are outliers. The first $n_{out}$ extremes identified in Block 5 are
  considered outliers. Note that not all of the extreme values determined in Block 5 are
  necessarily outliers.

4. Modified z-score

To help facility operators rank the severity of an outlier, a modified z-score [9] is used to
quantify how far and in which direction an outlier is from the mean value of typical (i.e.,
non-outlier) observations. Fig. 3 is a flow chart for determining a modified z-score from robust
estimates of the mean and standard deviation [15,16] for a set of n observations
$X = \{x_1, x_2, x_3, \ldots, x_n\}$. Following are details on the numbered blocks in Fig. 3.

• Block 1: Identify outliers in set X. Outliers $x_{e,1}, x_{e,2}, \ldots, x_{e,n_{out}}$ in set X
  are identified with the generalized ESD many-outlier procedure described in the previous section.
  The outliers are members of set $X_{out}$ (i.e.,
  $X_{out} = \{x_{e,1}, x_{e,2}, \ldots, x_{e,n_{out}}\}$). If there are no outliers in set X, then
  $X_{out}$ is an empty set.
• Block 2: Determine set of non-outlier observations. Determine the set of observations
  $X_{non\text{-}out}$ in set X that do not include the outliers. In equation form, the set of
  non-outlier observations is determined from

$$X_{non\text{-}out} = \{x \mid x \in X \text{ and } x \notin X_{out}\} \qquad (6)$$

• Block 3: Compute mean of non-outlier observations. The robust estimate of the mean,
  $\bar{x}_{robust}$, is the average value of the elements in set $X_{non\text{-}out}$.

$$\bar{x}_{robust} = \frac{\sum_{j=1}^{n - n_{out}} x_j}{n - n_{out}} \qquad (7)$$

  where $x_j \in X_{non\text{-}out}$ and $n_{out}$ is the number of outliers in set X.
• Block 4: Compute standard deviation of non-outlier observations. The robust estimate of the
  standard deviation, $s_{robust}$, is the sample standard deviation of the elements in set
  $X_{non\text{-}out}$.

$$s_{robust} = \sqrt{\frac{\sum_{j=1}^{n - n_{out}} (x_j - \bar{x}_{robust})^2}{n - n_{out} - 1}} \qquad (8)$$

• Block 5: Compute modified z-score(s). The modified z-score is a measure of the number of robust
  standard deviations an outlier is from the robust estimate of the mean. In equation form, we
  determine the modified z-score with

$$z_m = \frac{x_{outlier} - \bar{x}_{robust}}{s_{robust}} \qquad (9)$$

  where $x_{outlier}$ is the value for an outlier.

5. Field test results

We used the data analysis method presented in the previous section to analyze electrical
consumption data for 97 buildings. When analyzing energy consumption for a particular day, the
energy consumption was compared to previous energy consumption for days of the same day type. The
maximum number of previous days used in the analysis was limited to 30. Many of the 97 buildings
did not have abnormally high energy consumption. This section contains field test results for three
buildings that had abnormally high energy consumption.

Many buildings had abnormally low energy consumption during holidays and special events such as
offsite company meetings. The method described in this paper is robust to these unusual conditions.
For example, following periods of low energy consumption during the winter holiday season, the
method did not detect false outliers in the beginning of January. Also, for this study, we did not
find situations where the method failed by incorrectly detecting high outliers.

The method for detecting abnormal energy consumption requires knowledge of day type (i.e., days of
the week with similar energy consumption profiles). There are three basic approaches to determining
the day type: (1) an operator can select the day type based on knowledge of the building
consumption, (2) an operator can use time series or box plots [17,18] to determine the day type, or
(3) a pattern recognition algorithm can automatically determine day type [19]. The operator can
select the days based on knowledge of building use, or the operator can use schematic or time
series plots to determine the days with similar power consumption. Fig. 4 is a two-panel box plot
(trellis plot [20,21]) of the average daily consumption and peak daily demand for a
telecommunications building. From this plot, we can conclude there are two day types: weekdays
(Monday through Friday) and weekends (Saturday and Sunday).
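
For one day type, the outlier test of Section 3 (Eqs. (1) to (5)) and the modified z-score of Section 4 (Eqs. (6) to (9)) can be sketched together in Python as below. This is an illustrative reading of Figs. 2 and 3, not the author's implementation. Several things are assumptions: the function names; the use of list indices rather than values to mark outliers; stopping the loop when s = 0 (the text routes execution to Block 10 instead); and computing the Student's t critical value by bisection on the regularized incomplete beta function instead of the approximations reviewed in [14]. In the field test setting, `values` would hold one feature for the current day and up to 30 previous days of the same day type.

```python
import math


def _betacf(a, b, x):
    """Continued fraction for the incomplete beta function (modified Lentz)."""
    tiny = 1e-300
    qab, qap, qam = a + b, a + 1.0, a - 1.0
    c = 1.0
    d = 1.0 - qab * x / qap
    d = 1.0 / (d if abs(d) > tiny else tiny)
    h = d
    for m in range(1, 301):
        m2 = 2 * m
        even = m * (b - m) * x / ((qam + m2) * (a + m2))
        odd = -(a + m) * (qab + m) * x / ((a + m2) * (qap + m2))
        for aa in (even, odd):
            d = 1.0 + aa * d
            d = 1.0 / (d if abs(d) > tiny else tiny)
            c = 1.0 + aa / c
            c = c if abs(c) > tiny else tiny
            delta = d * c
            h *= delta
        if abs(delta - 1.0) < 3e-14:
            break
    return h


def _betai(a, b, x):
    """Regularized incomplete beta function I_x(a, b)."""
    if x <= 0.0:
        return 0.0
    if x >= 1.0:
        return 1.0
    bt = math.exp(math.lgamma(a + b) - math.lgamma(a) - math.lgamma(b)
                  + a * math.log(x) + b * math.log(1.0 - x))
    if x < (a + 1.0) / (a + b + 2.0):
        return bt * _betacf(a, b, x) / a
    return 1.0 - bt * _betacf(b, a, 1.0 - x) / b


def t_critical(df, p):
    """Value t whose right tail area is p (0 < p < 0.5) for Student's t with df d.o.f."""
    def tail(t):
        return 0.5 * _betai(0.5 * df, 0.5, df / (df + t * t))
    lo, hi = 0.0, 1.0
    while tail(hi) > p:          # widen the bracket until it contains t
        hi *= 2.0
    for _ in range(200):         # bisection
        mid = 0.5 * (lo + hi)
        if tail(mid) > p:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def gesd_outlier_indices(values, alpha=0.05, max_outliers=None):
    """Indices of outliers in values, following Blocks 1-11 of Fig. 2."""
    n = len(values)
    if n < 3:
        return []
    if max_outliers is None:
        max_outliers = int(0.5 * (n - 1))       # largest n_u with n_u <= 0.5(n - 1)
    max_outliers = min(max_outliers, n - 2)     # keep degrees of freedom >= 1
    remaining = list(range(n))
    extremes, n_out = [], 0                     # Block 1
    for i in range(1, max_outliers + 1):
        xs = [values[k] for k in remaining]
        mean = sum(xs) / len(xs)                # Eq. (1)
        s = math.sqrt(sum((x - mean) ** 2 for x in xs) / (len(xs) - 1))  # Eq. (2)
        if s == 0.0:                            # Block 4 (assumption: stop here)
            break
        k_ext = max(remaining, key=lambda k: abs(values[k] - mean))      # Block 5
        r_i = abs(values[k_ext] - mean) / s     # Eq. (3)
        t = t_critical(n - i - 1, alpha / (2.0 * (n - i + 1)))           # Eq. (5)
        lam = (n - i) * t / math.sqrt((n - i + 1) * (n - i - 1 + t * t)) # Eq. (4)
        if r_i > lam:                           # Blocks 8 and 9
            n_out = i
        extremes.append(k_ext)                  # Block 10
        remaining.remove(k_ext)
    return extremes[:n_out]                     # Block 11


def modified_z_scores(values, outlier_indices):
    """Map each outlier index to its modified z-score, Eqs. (6)-(9)."""
    out = set(outlier_indices)
    typical = [x for k, x in enumerate(values) if k not in out]          # Eq. (6)
    mean_r = sum(typical) / len(typical)                                 # Eq. (7)
    s_r = math.sqrt(sum((x - mean_r) ** 2 for x in typical) / (len(typical) - 1))
    if s_r == 0.0:
        raise ValueError("non-outlier observations have zero standard deviation")
    return {k: (values[k] - mean_r) / s_r for k in outlier_indices}      # Eq. (9)


# First example data set from Section 3, as printed in the text.
history = [1, 2, 1, 0, 3, 2, 101, 2]
outliers = gesd_outlier_indices(history)
scores = modified_z_scores(history, outliers)
```

For this data set the sketch flags only the observation 101 (index 6). Because the robust mean and standard deviation exclude that value, its modified z-score is far larger than an ordinary z-score computed from all eight observations would be.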
Fig. 5 is a time series graph of the peak daily electric
consumption for a telecommunications building. On 7 August,
the outlier analysis method determined that the energy
consumption was abnormally high. Robust estimates of the
mean and standard deviation determined that the energy
consumption was 10 standard deviations above the mean on 7
August. Investigation into the high-energy consumption
revealed a controls problem that was caused by failure of
the primary chiller. After the primary chiller failed, a secondary
chiller was started to meet the cooling load. After the primary
chiller restarted, both chillers were operating simultaneously
and demand limiting control was not used to limit the peak
electric consumption of the chillers. The high electric usage on
7 August cost the customer around 12,000 United States Dollars (USD). To prevent this problem in
the future, the customer revised the control strategy to prevent the two chillers from operating
simultaneously.

Fig. 3. Flow chart for determining modified z-scores of outliers.

Fig. 4. Two-panel boxplot of average daily consumption and peak demand.

Fig. 5. High outlier caused by chiller failure at a telecommunications building.

Fig. 6 shows the peak and average daily power usage for a second telecommunications building that
uses district heating. The customer had an agreement to not use (curtail) district-heat during
times requested by the supplier. On 16 and 17 December, the average and peak electric consumption
were abnormally high because the supplier asked the customer to not use district-heat. The customer
then used electric boilers to heat water. This caused the electric bill to increase by
approximately 4300 USD. The customer should determine if the curtailment savings from the
district-heating supplier are greater than the higher electric energy costs that result from
curtailing district-heat.

Fig. 6. High outliers caused by district-heating curtailment at a data center for a
telecommunications building.

Fig. 7 shows the peak and daily energy usage for the second telecommunications building. On 8
February 2000 a new electric distribution panel was installed. To test the panel, a number of
electric devices were turned on. This caused the peak electric consumption to rise significantly,
and increased the electric bill for the customer by about 10,000 USD. The customer could have
reduced the costs by performing tests during periods of low electric costs.

Fig. 8 shows the peak and daily power consumption for a third telecommunications building that had
a duct system that was not sized to allow free cooling during time periods when the outdoor
temperature is low. The control strategy of using cool outside air to reduce or eliminate
mechanical cooling is commonly called economizer cycle control [22]. The high
outliers shown in Fig. 8 could have been avoided if the duct system was sized to allow economizer
cycle control. The higher electric consumption on 9 and 10 November cost the building owner
approximately 600 USD.

Fig. 7. High outliers caused by equipment testing at a telecommunications building. Next to the
high outliers are the corresponding modified z-scores.

Fig. 8. High outliers caused by HVAC system problem at a telecommunications building. Next to the
high outliers are the corresponding modified z-scores.

6. Conclusions

This paper presented a new method for converting energy consumption data into information. The
method is computationally efficient and thus can be implemented in today's building energy
management and control systems. The method uses robust statistical methods to determine if the
energy consumption is significantly different than previous energy consumption. If the energy
consumption increases significantly, then building operators or maintenance staff can investigate
and correct the problem to prevent it from happening in the future. This new data analysis method
will make building operators more productive by reducing time requirements for detecting faulty
operation. Also, building energy costs will decrease because control, system, or operation problems
will be detected and corrected.

Acknowledgements

I would like to thank Jim Kummer, Bill Huth, and the field organization of Johnson Controls, Inc.
for their valuable contribution in collecting data and working with our customers to identify the
causes and associated costs of abnormal energy consumption.

References

[1] J. Hyvarinen, S. Karki (Eds.), Building Optimization and Fault Diagnosis Source Book,
    International Energy Agency: Energy Conservation in Buildings and Community Systems Annex 25,
    Real Time Simulation of HVAC Systems for Building Optimization, Fault Detection and Diagnosis,
    VTT Building Technology, Espoo, Finland, 1996.
[2] J. Hyvarinen (Ed.), Technical Papers of the IEA Annex 25, International Energy Agency: Energy
    Conservation in Buildings and Community Systems Annex 25, Real Time Simulation of HVAC Systems
    for Building Optimization, Fault Detection and Diagnosis, VTT Building Technology, Espoo,
    Finland, 1996.
[3] A. Dexter, J. Pakanen (Eds.), Demonstrating Automated Fault Detection and Diagnosis Methods in
    Real Buildings, International Energy Agency: Energy Conservation in Buildings and Community
    Systems Annex 34, Computer Aided Evaluation of HVAC System Performance, VTT Technical Research
    Centre of Finland, Espoo, 2001.
[4] D.E. Claridge, J.S. Haberl, R.J. Sparks, R.E. Lopez, K. Kissock, Monitored commercial building
    energy data: reporting the results, ASHRAE Transactions 98 (1) (1992) 881–889.
[5] J.S. Haberl, M. Abbas, Development of graphical indices for viewing building energy data. Part
    1, Journal of Solar Energy Engineering, The American Society of Mechanical Engineers 120 (3)
    (1998) 156–161.
[6] J.S. Haberl, M. Abbas, Development of graphical indices for viewing building energy data. Part
    2, Journal of Solar Energy Engineering, The American Society of Mechanical Engineers 120 (3)
    (1998) 156–161.
[7] R.F. Dodier, J.F. Kreider, Detecting whole building energy problems, ASHRAE Transactions 105
    (1) (1999) 579–589.
[8] M. Berthold, D.J. Hand (Eds.), Intelligent Data Analysis: An Introduction, Springer-Verlag,
    Berlin, 1999.
[9] B. Iglewicz, D.C. Hoaglin, How to Detect and Handle Outliers, American Society for Quality,
    Milwaukee, 1993.
[10] B.S. Everitt, The Cambridge Dictionary of Statistics, Cambridge University Press, 1998.
[11] V. Barnett, T. Lewis, Outliers in Statistical Data, Wiley Series in Probability and
     Mathematical Statistics, 3rd ed., John Wiley & Sons, New York, 1994.
[12] B. Rosner, Percentage points for a generalized ESD many-outlier procedure, Technometrics 25
     (2) (1983) 165–172.
[13] V.J. Carey, C.G. Wagner, E.E. Walters, B.A. Rosner, Resistant and test-based outlier
     rejection: effects on Gaussian one- and two-sample inference, Technometrics 39 (3) (1997)
     320–330.
[14] M. Abramowitz, I.A. Stegun, Handbook of Mathematical Functions with Formulas, Graphs and
     Mathematical Tables, Dover Publications, Inc., New York, 1970.
[15] J.S. Simonoff, A comparison of robust methods and detection of outliers techniques when
     estimating a location parameter, Communications in Statistics-Theory and Methods (1984)
     275–285.
[16] J.S. Simonoff, Outlier detection and robust estimation of scale, Journal of Statistical
     Computation and Simulation 27 (1987) 79–92.
[17] J.W. Tukey, Exploratory Data Analysis, Addison-Wesley Publishing Company, 1977.
[18] D.C. Hoaglin, F. Mosteller, J.W. Tukey, Understanding Robust and Exploratory Data Analysis,
     John Wiley & Sons, Inc., 1983.
[19] J.E. Seem, Pattern recognition algorithm for determining days of the week with similar energy
     consumption profiles, Energy and Buildings 37 (2) (2005) 127–139.
[20] W.S. Cleveland, Visualizing Data, Hobart Press, Summit, NJ, 1993.
[21] W.N. Venables, B.D. Ripley, Modern Applied Statistics with S-PLUS, 3rd ed., Springer-Verlag,
     Inc., New York, 1999.
[22] ASHRAE Handbook: Heating, Ventilating, and Air-conditioning Applications, American Society of
     Heating, Refrigerating and Air-conditioning Engineers, Inc., Atlanta, GA, 1999, p. 45.6.
