Cover figure: scheme of the multistep estimator, with input ψ = f(u), system ẋ = g(x, ψ), y = h(x, ψ), input recovery u = f⁻¹(ψ), and the discrete-time signals ψk, uk, xk, yk.
Joint state/input estimation in structural dynamics
State/force estimation using compressive sensing within a multistep
approach
Matteo KIRCHNER
November 2018
© 2018 KU Leuven – Faculty of Engineering Science
Self-published by Matteo Kirchner, Celestijnenlaan 300 box 2420, B-3001 Leuven (Belgium)
All rights reserved. No part of the publication may be reproduced in any form by print, photoprint, microfilm,
electronic or any other means without written permission from the publisher.
Fruitur tamen etas nostra beneficio precedentis, et sepe plura novit non suo
quidem precedens ingenio, sed innitens viribus alienis et opulenta patrum.
Dicebat Bernardus Carnotensis nos esse quasi nanos gigantium humeris
insidentes, ut possim plura eis et remotiora videre, non utique proprii visus
acumine aut eminentia corporis, sed quia in altum subvehimur et extollimur
magnitudine gigantea.
Metalogicon (III, 4)
Iohannes Saresberiensis
Preface
This is the outcome of almost six years of research, which would not have
been possible without funding. Therefore, I would like to begin this book
by acknowledging the European Commission for its support through the
Marie Skłodowska-Curie ITN-EID project eLiQuiD within the 7th Framework
Programme (GA 316422). Moreover, my research was partially supported by
Flanders Make, the strategic research centre for the manufacturing industry,
within the MoForM project. I am also grateful to the Flanders Innovation
& Entrepreneurship Agency and the Research Fund KU Leuven for their
support. Additionally, my work has been partially funded by the COMET K2 –
Competence Centres for Excellent Technologies Programme of the Austrian
Federal Ministry for Transport, Innovation and Technology (BMVIT), the
Austrian Federal Ministry for Digital and Economic Affairs (BMDW), the
Austrian Research Promotion Agency (FFG), the Province of Styria and the
Styrian Business Promotion Agency (SFG).
Many people contributed to my life as a PhD student, and I would like to take
some time to express my gratitude to everyone who surrounded me during
the past years. Chronologically, my first big thanks goes to Stijn Donders. In
July 2012 you brought to my attention the open positions within the eLiQuiD
project at the KU Leuven Noise & Vibration Research Group, and you helped
me get in touch with my future supervisors. This is how my PhD journey
started and I am very grateful to you for this.
My second and greatest thanks goes to my supervisors: Wim Desmet and Bert
Pluymers. I remember very well my interview during ISMA 2012, when we
had a nice talk over an Irish coffee. Wim, thanks a lot for the opportunity
of pursuing a PhD within your group. The technical as well as the human
interaction with many colleagues made my research interesting, challenging
and certainly also enjoyable. Thanks for the feeling of freedom and trust that
accompanied my research in the past years, and thanks for having me still on
board in the upcoming months. Bert, organisation, projects and practicalities
could not go any smoother, thanks a lot for all your help.
My gratitude goes also to all members of my examination committee for the
discussion during the preliminary defence and for going through my manuscript:
to Bert Pluymers, Davy Pissoort, Eugène Nijman, Paul Sas, Wim Desmet, for
your constructive hints since day one as members of my supervisory committee;
to Claus-Peter Fritzen, for reading my text in detail and spotting the main
aspects that I could improve as external jury member; to Jan Croes, for the
countless inputs, technical discussions, feedback and revisions during the past
four years as colleague, friend and additional jury member; finally, to Carlo
Vandecasteele for making sure everything went smoothly as the chair of my
examination committee.
My PhD research started at Virtual Vehicle Research Center (ViF) in Graz, and
my next thanks goes to Anton Fuchs, Eugène Nijman and Jan Rejlek. I really
enjoyed the nice environment at ViF, for which you are certainly responsible:
Toni from the top, Eugène with great research ideas and Jan supporting me
during every step in the eLiQuiD project.
My initial PhD track was not at all oriented towards the development of a
technique for virtual sensing, and two occurrences played a crucial role in
establishing my final topic, which I want to mention here. On December 19th,
2013, I was sitting at my desk at ViF, finalising and submitting my first
conference paper. Unexpectedly I received an email from Eugène Nijman, with
subject “article”, body “For the future. . . ”, and with a paper attached. Eugène,
it was kind of exceptional receiving an email from you with both subject and
body that made sense: I was used to nothing at all or possibly the result of
randomly pressing keys. That said, attached was the first paper I
read about compressive sensing. So right you were!
Almost one year later, after moving to KU Leuven, I was trying to find an
application of compressive sensing that would steer my research. On October
14th, 2014, I was at a graduate school on innovative technologies for energy
conversion, where Jan Croes gave a talk about filtering techniques. Jan, it is still
not fully clear to me what both of us were doing at a course on energy conversion.
My best guess is that you were replacing a speaker who declined last minute,
whereas I was curious to know more about topics beyond my main research
interests (and at the same time I was collecting precious credits for my doctoral
diary). The point is that there we met for the first time, and you very
enthusiastically suggested considering compressive sensing for input representation within a
moving horizon estimator. Well, nice intuition! Eugène, Jan, thanks a lot!!!
My PhD journey brought me to different locations and several topics, and
this gave me the chance to interact with a lot of people. Concerning the
technical part of my research, I am very grateful to Eugène Nijman, Francesco
Cosco, Frank Naets, Goele Pipeleers, Jakob Fiszer, Jan Croes, Karim Asrih,
Luca Sangiuliano, Noé Geraldo Rocha de Melo Filho, Simon Vanpaemel, Ward
Rottiers, Wim Desmet, for all the help and inputs that I received from you.
Thanks to Alex Ricardo Mauricio, Daniel de Gregoriis, Daniele Brandolisio,
Eddy Smets, Elke Deckers, Florian Maurin, Gunther Penninckx, Jean-Pierre
Merckx, Sebastiaan van Aalst, Tom Henskens, for being present whenever I
needed something practical. Thanks to Daniel de Gregoriis, Mathijs Vivet,
Siemen Timmermans and Simon Vanpaemel for translating title and abstract of
this dissertation in Dutch. Thanks to all my colleagues in the Noise & Vibration
Research Group at KU Leuven, and in particular thanks to the vibro-acoustics
group. Thanks also to the members of the consortia I interacted with during the
past years. Getting in touch with you helped me understand the importance
of keeping an eye on the bigger picture. Among others, I would like to mention
COST TU1105, EARPA, eLiQuiD, GRESIMO, MoForM.
It has been a pleasure to share my daily working time with the colleagues in the
open space of Area C at ViF, in my previous office in the MECH building, and
recently at LVL. Hoping not to forget too many of you, thanks for all the coffee
breaks and the funny moments with Christopher, Elmar, Fred, Giorgio, Jan,
Markus, Petra, Rafael, Sanaz, Sophie, Vittorio, Thomas, Yasser, Zoran (ViF),
Alireza, Anna, Axel, Elke, Jaime, Kengo, Matt, Nicolas, Philip, Sjoerd, Vamsi,
Yasuo (MECH), Amar, Andrea, Bart, Daniel, Daniele, Dries, Emin, Enrico,
Francesco, Giovanni, Harald, Hui, Jan, Jakob, Jelle, Karim, Kylian, Lorenzo,
Marco, Martijn, Mathijs, Maurice, Mikel, Niccolò, Pavel, Rocco, Sebastiaan,
Siemen, Simon, Simone, Thijs, Ward (LVL). Thanks to the MECH newcomers
2014, the MECH happy hour (in particular to Alireza, Laurens, Pavel, Philip,
Sjoerd), the PMA social event team, the ISMA 2018 team, the colleagues at
ICT, HR, financial office and secretary (of course including the lekkere broodjes).
Almost six years of my life do not include exclusively research, and many people
contributed to my well-being outside ViF and KU Leuven. Thanks to Ali,
Christopher, Francesca, Michael, Rafael, Sanaz, Sophie, for the multiple dinners
at Schillerheim. Thanks to Alberto, Alex, Alireza, Andrea, Anna, Barbara, Bart,
Costanza, Ettore, Florian, Gianmaria, Gorka, Hendrik, Hervé, Ines, Jonathan,
Laura, Laurens, Lise, Luca, Marco, Marcello, Matt, Mikel, Mirian, Nico, Noé,
Pavel, Philip, Sepide, Sergio, Siemen, Simona, Sjoerd, Vyacheslav, Wim, for the
nights in the Oude Markt, the barbecues, and similar high quality activities. Nen
dikke merci aan Barbara, ik ben blij dat je naast mij staat. Thanks to Adrian,
Attila, Dorota, Janick, Mario, Michal, Sebastian, Tânia, for the nice weekends
around Europe that we manage to have once in a while even if geography does
not help us much. Thanks to my friends in Trento. Knowing that you are there
means the world to me, and I am happy that somehow feelings do not change
much with time and distance. To Alessandro, Marco, Thomas, it is always a
good day to brew another batch of Cagnara. To Alessandro, Annalisa, Cristian,
Matteo
Abstract

The aim of this dissertation is to overcome the typical challenges of joint estimation problems, related to observability and disturbance dynamics. We propose a novel time-domain approach for the joint state/input estimation of mechanical systems, the novelty being the combination of compressive sensing (CS) principles in a moving horizon estimator (MHE), allowing for the
observation of a large number of input locations with a small set of measurements.
In the new approach, called compressive sensing–moving horizon estimator
(CS-MHE), the capability of the MHE of minimising the noise while correlating
a model with measurements is enriched with an ℓ1-norm optimisation in order
to promote a sparse solution for the input estimation. This allows us to model
an input through a few shape functions belonging to a predefined set, i.e., we
exploit a known input shape to represent an input and to relax the typical
limitations of estimation problems (observability, disturbance dynamics and
required sampling rate). Combining the estimation of states and inputs is
extremely valuable, since in the case of mere state estimation a model error
could be perceived as an input, i.e., the estimated input incorporates the model
inaccuracy.
This book includes the state of the art of model-based estimation techniques,
the mathematical derivation of the CS-MHE formulas, and two application
cases in the field of structural dynamics. In a first example we employ the
CS-MHE for the estimation of force impacts entering a mechanical system at
an unknown location, while in a second example we focus on the estimation of
periodic loads, which we typically find in rotating machinery.
Beknopte samenvatting
bemonsteringssnelheid.
Het doel van dit proefschrift is om de typische uitdagingen van gezamenlijke
schattingsproblemen, gerelateerd aan observeerbaarheid en verstorende dy-
namica, te overwinnen. We stellen een nieuwe tijdsdomeinaanpak voor met
betrekking tot de gezamenlijke toestands/input-schatting van mechanische
systemen, waarbij de nieuwigheid zit in het combineren van compressive sensing
(CS) en een bewegende horizon-schatter (moving horizon estimator, MHE),
wat toelaat om een groot aantal inputlocaties te observeren voor een kleine
set metingen. In de nieuwe aanpak, genaamd compressive sensing–moving
horizon estimator (CS-MHE), is aan de functionaliteit van de MHE, die de ruis
minimaliseert wanneer het model met metingen wordt gecorreleerd, een ℓ1-norm
optimalisatie toegevoegd met als doel een ijle oplossing voor de input-schatting
te verkrijgen. Het stelt ons in staat om een input te modelleren met enkele
vormfuncties behorend tot een vooraf gedefinieerde set. Dit wil zeggen dat
we een bekende inputvorm gebruiken om een input voor te stellen en om
de typische beperkingen van schattingsproblemen (observabiliteit, verstorende
dynamica en minimale bemonsteringssnelheid) te versoepelen. Het combineren
van toestands- en inputschatting is uiterst waardevol, aangezien in het geval
van loutere toestandsschatting een modelfout kan worden waargenomen als een
input: de geschatte input omvat de modelonnauwkeurigheid.
Dit boek bevat de state of the art van modelgebaseerde schattingstechnieken,
de wiskundige afleiding van de CS-MHE-formules en twee toepassingen in het
gebied van structurele dynamica. In een eerste voorbeeld gebruiken we de
CS-MHE voor het schatten van krachtimpacten die een mechanisch systeem
exciteren op een onbekende locatie. In een tweede voorbeeld focussen we op de
schatting van periodieke belastingen die typisch terug te vinden zijn in roterende
machines.
List of abbreviations
1D one-dimensional
2D two-dimensional
3D three-dimensional
AD algorithmic differentiation
cf. confer
cond condition number
CP convex programming problem
CPU central processing unit
CS compressive sensing
CS-MHE compressive sensing–moving horizon estimator
CS-MUSIC compressive sensing–multiple signal classification
FE finite element
Fig Figure
fps frames per second
Freq. Frequency
FRF frequency response function
i.e. id est
ID identification number
IP interior point
IST iterative shrinkage/thresholding
KF Kalman filter
KKT Karush-Kuhn-Tucker
PA polynomial approximation
PBH Popov-Belevitch-Hautus
PC personal computer
PDE partial differential equation
PDF probability distribution function
pdf probability density function
Thm Theorem
TTL transistor–transistor logic
General symbols
(·)˙ first time derivative
(·)LB lower bound
(·)UB upper bound
(·)ᵀ transpose
(·)⁻¹ inverse
(·)k variable evaluated at time step t = tk
0 null matrix
1 vector with all entries equal to 1
C set of complex numbers
C^{n_n} vector of n_n complex numbers
C^{n_n×n_m} matrix of complex numbers with n_n rows and n_m columns
d(·) differential
∂(·) partial differential
E(·) expected value
I identity matrix
ℑ(·) imaginary part of a complex number
k generic time step t = tk
||·||₀ ℓ0-norm
||·||₁ ℓ1-norm
||·||₂ ℓ2-norm (Euclidean norm)
N(z̄, σz²) Gaussian distribution with mean z̄ and standard deviation σz
R set of real numbers
R^{n_n} vector of n_n real numbers
R^{n_n×n_m} matrix of real numbers with n_n rows and n_m columns
ℜ(·) real part of a complex number
t time variable (continuous time)
tk time variable at time step k (discrete time)
Note 1: this list is not exhaustive. It includes some important general symbols
as well as most of the symbols that we use in the development of the
CS-MHE in chapter B1. The remaining symbols are explained in the text.
Note 2: throughout this dissertation, we refer to a column vector as vector.
Accordingly, we refer to a row vector through the transpose (·)ᵀ.
Note 3: we refer to matrices and vectors with uppercase and lowercase Roman
italic characters, respectively. The only exceptions concern the letters f(·),
g(·) and h(·), which are reserved to indicate functions, the letters k, n, N
and T, which are scalars, and the letter d, which indicates a differential.
Scalars and other particular instances are indicated by Roman characters
(“upright”, both uppercase and lowercase). The use of Greek characters
and calligraphic letters will be clear from the context.
Contents
Abstract v
List of abbreviations ix
Contents xvii
Introduction 1
Outline and structure of the dissertation . . . . . . . . . . . . . . . 4
Contributions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7
A1 State estimation 13
A1.1 Dynamical system modelling . . . . . . . . . . . . . . . . . 14
A1.2 Probability theory . . . . . . . . . . . . . . . . . . . . . . . 18
A1.3 Least squares estimators and recursive formulation . . . . . 25
A1.4 Single step estimators . . . . . . . . . . . . . . . . . . . . . 27
A2 Optimisation 53
A2.1 An introduction to nonlinear programming . . . . . . . . . 54
A2.2 Convex programming problems . . . . . . . . . . . . . . . . 58
A2.3 Numerical methods for nonlinear programming . . . . . . . 59
A2.4 Norm approximation problems . . . . . . . . . . . . . . . . 64
A2.5 Complex optimisation variables . . . . . . . . . . . . . . . . 68
A2.6 Covariance matrix of constraint optimisation problems . . . 69
A2.7 Conclusions . . . . . . . . . . . . . . . . . . . . . . . . . . . 71
A3 Compressive sensing 73
A3.1 Introduction to compressive sensing . . . . . . . . . . . . . 74
A3.2 Methods for solving the compressive sensing problem . . . . 75
A3.3 Feasibility considerations for compressive sensing . . . . . . 77
A3.4 Conclusions . . . . . . . . . . . . . . . . . . . . . . . . . . . 80
C Applications 119
Bibliography 203
Introduction

Nowadays, automation is present in many aspects of our daily life. From small
home appliances to heating, ventilation and air conditioning systems, from
electric bicycles to spacecraft, from family businesses to industrial production
plants, an increasing number of tasks is performed by CPUs in an automated
fashion. Digital controllers make use of various configurations, algorithms,
modelling tools and measurements with the aim of obtaining better performance,
reducing costs and decreasing emissions. Furthermore, some specific tasks cannot
be carried out unless a controller continuously monitors a certain system flow.
Control strategies require specific information, which is not always available. In
such a situation, or if prediction is needed for taking future decisions, estimators
can help a controller by providing the missing information.
Besides control engineering, estimators are key elements within the communities
of structural health monitoring (with a primary interest in bridges and buildings)
and condition monitoring (related to machinery and moving components). In
fact, the evaluation of the health of a structural component or of the whole
assembly makes it possible to predict and possibly avoid failure and to plan maintenance,
thus improving safety and reliability and reducing costs and energy consumption. An
accurate assessment of the health status requires some information, which could
be acquired by direct measurement. Unfortunately, there are many situations
in which economic or physical constraints do not allow placing a certain sensor.
In particular, forces and torques are of paramount importance for durability.
Let us just mention forces and torques in bearings and gearboxes, contact forces
on gear teeth, aerodynamic loads on wind turbine blades, and the interaction
between a machine tool and a work piece. In such situations a direct force
measurement is often not possible, while estimators can retrieve this information
by combining a model with other measurements such as accelerations or strains.
Such an estimator is also referred to as a virtual sensor.
In order to implement an automatised routine on a digital computer, we can
mathematically describe a generic system through some so-called states, which
Industry 4.0 [76] is the current trend of automation and data exchange in
manufacturing technologies. Its name was proposed by the German federal
government (Bundesministerium für Bildung und Forschung [22]) and refers
to what is expected to be the fourth industrial revolution. Industry 4.0 aims
at creating a so-called smart factory, in which cyber-physical systems monitor
physical processes, create a virtual copy of the physical world and make
decentralised decisions. Cyber-physical systems communicate and cooperate
with each other and with humans in real time, and both internal and cross-
organisational services are offered and used by the participants of the product’s
value chain. Industry 4.0 has the following four design principles [76]:
An example for Industry 4.0 is a machine which can predict failures, trigger
maintenance processes autonomously and react to unexpected changes in
production. Such a smart product knows its history and its current and
target states. It can steer itself through the production process by instructing
other machines to perform some tasks in a certain production stage [76]. In
such a framework, the CS-MHE can provide a smart product with extra data to
be employed for self-optimisation, self-configuration and self-diagnosis, as well
as further support to human workers. The CS-MHE is based on physical models
(which are a central component in Industry 4.0) and allows for the estimation
of inputs which cannot be detected by other state-of-the-art technologies. This
improves the knowledge of a process, and can be implemented by exploiting
already available hardware.
Part A collects the state of the art regarding model-based state/input
estimation, and includes a primer in nonlinear optimisation theory and
compressive sensing.
In Chapter A1 we give an overview of the most common state-of-the-art
state estimation techniques which are typically employed in the automotive,
manufacturing and chemical industries. After introducing single step estimators,
we focus on the (multistep) moving horizon estimator. Next, we consider several
approaches to model an input. In particular, this chapter includes the state
of the art of compressive sensing as a tool to improve the performance of
an estimator and as a way to represent a force input. Finally, we discuss the
concept of observability, which is relevant for all state estimation techniques and
becomes even more crucial in the framework of joint state/input estimation.
In Chapter A2 we outline the fundamentals of constrained optimisation, since
the CS-MHE needs to solve a minimisation problem. Moreover, we introduce
some relevant concepts such as norm approximation problems, optimisation
problems with complex variables, and how to obtain the covariance matrix in
case of constraint optimisation problems.
In Chapter A3 we treat compressive sensing. Together with the moving horizon
estimator (cf. chapter A1), compressive sensing forms the foundation of the
CS-MHE.
Figure I: First application example. Estimated force impact F [N] as a function of the location x [m] and of the time t [s], with the sensor locations s1, s2 and s3.
Part C shows a few application cases of the CS-MHE for joint state/input
estimation.
In Chapter C1 we present an application case that serves as first validation
scenario for the CS-MHE. An LTI mechanical system is subjected to a force
impact, and the CS-MHE estimates the states of the system (i.e., the first three
structural eigenmodes and their time derivatives) as well as the force impact
in terms of magnitude, time and location (Fig. I). The chapter includes some
numerical simulations and one experiment.
In Chapter C2 we present a few examples that exploit a complex input
representation. First, we present two numerical test cases with a 1D and a 2D
Fourier dictionary, respectively. Furthermore, we introduce an experimental
application case which exhibits more challenges in terms of system modelling
and measurements, i.e., we treat the case of a structure excited by a shaker,
modelled by finite elements, and measured through a high-speed camera (Fig. II).
We show the estimates for a periodic load applied at a known location and
consisting of up to four Fourier components, and we investigate the performance
of the CS-MHE in relation to the system calibration and to the length of the
estimation window.

Figure II: Second application example. Experimental set-up (top left), model
(bottom left), periodic load estimation in time (right).
Contributions
Methodological contributions
Applications
The CS-MHE for the estimation of force impacts: application case (vii)
As a first application example, we propose an LTI mechanical system (i.e., a
cantilever beam) modelled analytically, subject to a force impact entering the
system at an unknown location. This example demonstrates the effectiveness of
compressive sensing as a tool to model an input, and includes an experimental
validation.
cf. chapter C1
Chapter A1
State estimation
This chapter outlines the state of the art concerning estimation techniques
based on probability theory. Most of the approaches aim at correlating a model
with measurements while taking into account the related stochastic phenomena.
Our discussion focuses on discrete time systems and state-space representations,
which are widely employed in digital control engineering. In this chapter we
describe the moving horizon estimator (MHE) which, together with compressive
sensing (cf. chapter A3), constitutes the foundation of the CS-MHE.
We start by defining a state in section A1.1. Next, in section A1.2 we outline a few
concepts of probability theory, which represent the groundwork of least squares
estimators (section A1.3), of the Kalman filter and its extensions (section A1.4),
and of the MHE (section A1.5). We dedicate some space to the latter since it is
a key ingredient of the CS-MHE. Furthermore, in section A1.6 we introduce the
concept of joint estimation, while in section A1.7 we describe four approaches
for input modelling. These subjects raise the topic of observability, which
we address in section A1.8. Finally, in section A1.9 we close this chapter by
explaining the choice of the moving horizon estimator and compressive sensing
as the two main ingredients of the CS-MHE.
Acknowledgements
This chapter is an overview of the state of the art on the topic. Great sources
of inspiration for writing this chapter were [138, 183] and the excellent MHE
references such as [72, 166, 167]. Concerning input modelling, we acknowledge
the work related to KU Leuven and reported in [38, 124, 125, 127, 128, 133, 134].
The section about compressive sensing within estimation techniques comes
primarily from [104], of which Matteo Kirchner is first author.
The states of a system are those variables that provide a complete representation
of the internal condition (or status) of the system at a given time instant. In
general a system does not have a unique representation, and different applications
may require different models, resulting in different sets of states. Therefore, a
state (or state variable) represents a certain quantity of a system which we want
to monitor or control. The aim of state estimation is then to extract from a
system all available information about a state. State estimation is applicable to
many scientific disciplines where it is possible to have a mathematical model of
the system under analysis.
The mathematical representation of a system through state variables is referred
to as state-space model (or state-space representation). Eq. (A1a) shows a
generic state-space representation of a continuous-time time-varying nonlinear
system.A1 Throughout this thesis we will always consider two equations, i.e., the
aforementioned state equation (A1a) combined with a measurement equation
(A1b). Time dependency is indicated by t, belonging to a continuous time
horizon.
$$ \dot{x}(t) = f\big(x(t), u(t), w(t), t\big) \tag{A1a} $$

$$ y(t) = h\big(x(t), u(t), v(t), t\big) \tag{A1b} $$
do not exhibit any dependency of the model on the time derivative of those states [183].
A2 Throughout this dissertation, we refer to a column vector as vector. Accordingly, we refer to a row vector through the transpose (·)ᵀ.
State estimation and state-space models became popular within the engineering
community during the second half of the twentieth century, and got a strong
boost after the introduction of the Kalman filter [94]. The two main engineering
applications of state estimation regard control engineering (the estimate of
a system state is needed in order to implement a state-feedback controller)
and measurement systems (certain estimates are sought if direct measurements
are not possible). A state equation such as Eq. (A1a) represents a dynamic
system, where the term dynamic refers to the time-changing characteristics of a
system. It is possible to describe many real world processes by mathematical
models such as Eq. (A1a) by means of ordinary differential equations (ODEs)
or differential algebraic equations (DAEs).A3
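As a minimal sketch of such an ODE-based state-space model (with illustrative parameters, not a system taken from this thesis), consider a mass-spring-damper written in the form ẋ = f(x, u):

```python
import numpy as np

def f(x, u, m=1.0, c=0.4, k=2.0):
    """State equation x_dot = f(x, u) of a mass-spring-damper.

    States: x[0] = position, x[1] = velocity; input u = external force.
    """
    return np.array([x[1], (u - c * x[1] - k * x[0]) / m])

# Forward-Euler simulation of the free response from an initial displacement.
dt = 1e-3
x = np.array([1.0, 0.0])
for _ in range(40_000):  # 40 s of simulated time
    x = x + dt * f(x, 0.0)

print(x)  # the damped system has settled close to the origin
```

The forward-Euler integrator is only a quick stand-in for the more careful discretisation discussed in section A1.1.3.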
It is difficult to handle nonlinear systems such as Eq. (A1a) mathematically,
while linear system theory provides us with a series of tools for linear systems
[183]. In order to apply those tools we need to linearise a system, obtaining a
matrix representation such as Eq. (A2), where A_C ∈ R^{n_x×n_x} is the state matrix
(or system matrix), B_C ∈ R^{n_x×n_u} is the input matrix (or control matrix),
C_C ∈ R^{n_y×n_x} is the output matrix, D_C ∈ R^{n_y×n_u} is the feedthrough matrix
(or feedforward matrix), L_C(t) ∈ R^{n_x×n_x} and G_C(t) ∈ R^{n_y×n_y} are two error
matrices, and finally c_x ∈ R^{n_x} and c_y ∈ R^{n_y} are constants which may appear
due to the linearisation (for linear systems, these constants are equal to zero
[183, 192]). Subscript C indicates continuous time.
$$ y_k = h(x_k, u_k, v_k, k) \tag{A3b} $$
A3 Partial differential equations (PDEs) are also possible. They can be reduced to ODEs and DAEs [204].
A1.1.2 Linearisation
A nonlinear scalar function f(x) can be expanded in a Taylor series around a reference point x̄, with x̃ = x − x̄:

$$ f(x) = f(\bar{x}) + \left.\frac{\partial f}{\partial x}\right|_{\bar{x}}\tilde{x} + \frac{1}{2!}\left.\frac{\partial^2 f}{\partial x^2}\right|_{\bar{x}}\tilde{x}^2 + \frac{1}{3!}\left.\frac{\partial^3 f}{\partial x^3}\right|_{\bar{x}}\tilde{x}^3 + \cdots \tag{A5} $$

For a vector argument x ∈ R^{n_x}, the expansion reads

$$ f(x) = f(\bar{x}) + \left.\left(\sum_{i=1}^{n_x}\tilde{x}_i\frac{\partial}{\partial x_i}\right)f(x)\right|_{\bar{x}} + \frac{1}{2!}\left.\left(\sum_{i=1}^{n_x}\tilde{x}_i\frac{\partial}{\partial x_i}\right)^{2}f(x)\right|_{\bar{x}} + \frac{1}{3!}\left.\left(\sum_{i=1}^{n_x}\tilde{x}_i\frac{\partial}{\partial x_i}\right)^{3}f(x)\right|_{\bar{x}} + \cdots \tag{A6a} $$

$$ \approx f(\bar{x}) + \left.\left(\sum_{i=1}^{n_x}\tilde{x}_i\frac{\partial}{\partial x_i}\right)f(x)\right|_{\bar{x}} \tag{A6b} $$

Applying the first-order truncation (A6b) to the state equation yields

$$ \dot{x} = f(x, u, w) \tag{A7a} $$

$$ \approx f(\bar{x}, \bar{u}, \bar{w}) + \left.\frac{\partial f}{\partial x}\right|_{(\bar{x},\bar{u},\bar{w})}\tilde{x} + \left.\frac{\partial f}{\partial u}\right|_{(\bar{x},\bar{u},\bar{w})}\tilde{u} + \left.\frac{\partial f}{\partial w}\right|_{(\bar{x},\bar{u},\bar{w})}\tilde{w} \tag{A7b} $$

$$ = \dot{\bar{x}} + A_C\tilde{x} + B_C\tilde{u} + L_C\tilde{w} \tag{A7c} $$
If we also linearise the measurement equation (A1b), we finally obtain the linear
system (A8), where all matrices are given explicitly in Eqs. (A9).A4

$$ \dot{\tilde{x}} = A_C\tilde{x} + B_C\tilde{u} + L_C\tilde{w} \tag{A8a} $$

$$ \tilde{y} = C_C\tilde{x} + D_C\tilde{u} + G_C\tilde{v} \tag{A8b} $$

$$ A_C = \left.\frac{\partial f}{\partial x}\right|_{(\bar{x},\bar{u},\bar{w})} \quad B_C = \left.\frac{\partial f}{\partial u}\right|_{(\bar{x},\bar{u},\bar{w})} \quad L_C = \left.\frac{\partial f}{\partial w}\right|_{(\bar{x},\bar{u},\bar{w})} \tag{A9a} $$

$$ C_C = \left.\frac{\partial h}{\partial x}\right|_{(\bar{x},\bar{u},\bar{v})} \quad D_C = \left.\frac{\partial h}{\partial u}\right|_{(\bar{x},\bar{u},\bar{v})} \quad G_C = \left.\frac{\partial h}{\partial v}\right|_{(\bar{x},\bar{u},\bar{v})} \tag{A9b} $$
A1.1.3 Discretisation
The solution of the continuous-time linear state equation is

$$ x(t) = e^{A_C(t-t_0)}x(t_0) + \int_{t_0}^{t} e^{A_C(t-\tau)}\big[B_C(\tau)u(\tau) + w(\tau)\big]\,\mathrm{d}\tau \tag{A10a} $$

Evaluating Eq. (A10a) over one sampling interval, with the input held constant between samples (zero-order hold), gives

$$ x(t_k) = e^{A_C(t_k-t_{k-1})}x(t_{k-1}) + \left[\int_{t_{k-1}}^{t_k} e^{A_C(t_k-\tau)}B_C(\tau)\,\mathrm{d}\tau\right]u(t_{k-1}) + \int_{t_{k-1}}^{t_k} e^{A_C(t_k-\tau)}w(\tau)\,\mathrm{d}\tau \tag{A10b} $$

$$ x_k = A_{D,k-1}x_{k-1} + B_{D,k-1}u_{k-1} + \int_{t_{k-1}}^{t_k} e^{A_C(t_k-\tau)}w(\tau)\,\mathrm{d}\tau \tag{A10c} $$

with the discrete-time quantities

$$ x_k = x(t_k) \tag{A11a} $$

$$ u_k = u(t_k) \tag{A11b} $$
The state estimation techniques that we will mention in this chapter derive
from probability theory, the main reason being that process noise can often be
modelled as a random variable (also referred to as random quantity, aleatory
variable, or stochastic variable). Consequently, we introduce a few definitions
concerning probability theory. Our discussion follows mainly from [183].
$$ P(A \mid B) = \frac{P(A, B)}{P(B)}. \tag{A12} $$
We note that in general P(A | B) ≥ P(A, B). Moreover, we can apply Eq. (A12)
to P(B | A), leading to the equivalence of Eq. (A13).

$$ P(A \mid B)\,P(B) = P(B \mid A)\,P(A) \tag{A13} $$

Theorem A1.4 (Bayes’ rule). The so-called Bayes’ rule (or Bayes’ theorem)
is obtained by rearranging Eq. (A13) as follows:

$$ P(A \mid B) = \frac{P(B \mid A)\,P(A)}{P(B)}. \tag{A14} $$
$$ f_X(x) = \frac{\mathrm{d}F_X(x)}{\mathrm{d}x}. \tag{A16} $$
$$ \sigma^2 = E\big[(X - \bar{x})^2\big]. \tag{A17} $$
We use the notation in Eq. (A18) to indicate that X is a random variable with
mean x̄ and variance σ². A pdf of a random variable may be asymmetric around
its mean, and this phenomenon is referred to as skewness [183].
$$ X \sim (\bar{x}, \sigma^2) \tag{A18} $$
Definition A1.11 (Gaussian random variable). A random variable X is said
to be Gaussian (or normal) if its pdf is as follows:
$$ f_X(x) = \frac{1}{\sigma\sqrt{2\pi}}\exp\left(\frac{-(x-\bar{x})^2}{2\sigma^2}\right), \tag{A19} $$
where x̄ and σ are the mean and standard deviation of the Gaussian random
variable, respectively.
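A direct implementation of Eq. (A19) can be sanity-checked: the peak value at the mean is 1/(σ√(2π)), and the pdf integrates to one (the values of x̄ and σ below are illustrative):

```python
import math

def gaussian_pdf(x, mean, sigma):
    """Gaussian probability density function of Eq. (A19)."""
    return math.exp(-(x - mean) ** 2 / (2 * sigma ** 2)) / (sigma * math.sqrt(2 * math.pi))

mean, sigma = 1.0, 0.5

# Peak value at the mean: 1 / (sigma * sqrt(2 * pi))
peak = gaussian_pdf(mean, mean, sigma)

# The pdf integrates to one: composite trapezoidal rule over +/- 8 sigma
n = 4001
lo, hi = mean - 8 * sigma, mean + 8 * sigma
h = (hi - lo) / (n - 1)
vals = [gaussian_pdf(lo + i * h, mean, sigma) for i in range(n)]
area = h * (sum(vals) - 0.5 * (vals[0] + vals[-1]))

print(peak, area)
```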
$$ X \sim \mathcal{N}(\bar{x}, \sigma^2) \tag{A20} $$
A peculiarity of Gaussian random variables concerns the PDF, which can be
obtained by integrating the pdf (A19) (cf. Def. A1.7). In fact, a Gaussian PDF
can be approximated by a closed-form expression [183], i.e., normal random
variables are mathematically convenient [167]. For our purposes, it is important
to consider how the pdf of a random variable propagates through a function.
to consider how the pdf of a random variable propagates through a function.
It can be proven that a linear transformation of a Gaussian random variable
results in a new Gaussian random variable [183]. This aspect is crucial for the
Kalman filter (cf. section A1.4).
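The closure of Gaussians under linear transformations can be checked empirically. The following Monte Carlo sketch is illustrative only (the constants are arbitrary): if X ∼ N(x̄, σ²) and Y = aX + b, then Y ∼ N(a x̄ + b, a²σ²).

```python
import numpy as np

rng = np.random.default_rng(0)
a, b = 2.0, 1.0
x_mean, x_std = 3.0, 0.5

x = rng.normal(x_mean, x_std, size=200_000)   # X ~ N(3, 0.25)
y = a * x + b                                  # linear transformation

# Analytical propagation of mean and standard deviation
y_mean_theory = a * x_mean + b      # 7.0
y_std_theory = abs(a) * x_std       # 1.0
```

The sample mean and standard deviation of y agree with the analytical values to within Monte Carlo error, which is the property the Kalman filter exploits.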
Up to this point we have limited the discussion to a single random variable.
However, the framework of state/input/parameter estimation typically involves
many random variables, which may or may not be correlated. This topic
requires the concept of covariance.
Definition A1.12 (Covariance). The covariance of two scalar random
variables X and Y is defined as CXY = E[(X − x̄)(Y − ȳ)]. The corresponding
(normalised) correlation coefficient is

ρ = CXY / (σx σy), (A22)

where σx and σy are the standard deviations of X and Y, respectively.
We note that if two random variables are independent, then they are also
uncorrelated, but the converse does not hold.
Definition A1.15 (Orthogonality). Two random variables X and Y are said
to be orthogonal if RXY = 0.
Two uncorrelated random variables are orthogonal only if at least one of them
is zero-mean.
All previous definitions can be generalised to vectors, leading to quantities which
are vectors and matrices. For two vectors x ∈ R^nx and y ∈ R^ny, the correlation
and covariance are defined as shown in Eqs. (A24) and (A25), respectively.

      ⎡ E(x1 y1)    · · ·  E(x1 y_ny)  ⎤
Rxy = ⎢     ⋮        ⋱        ⋮       ⎥ = E(x y^⊤) (A24)
      ⎣ E(x_nx y1)  · · ·  E(x_nx y_ny) ⎦

Rx = E(x x^⊤) (A26a)
The error term of Eq. (A27a) has a covariance given by Eq. (A28), and a similar
formulation applies to the error term of the measurement equation (A27b)
[183]. Consequently, Eq. (A27) assumes the form of Eq. (A29), with the purely
additive error terms following the distributions indicated in Eq. (A30). It is
common practice to define covariances QD,k = LD,k Q̃D,k LD,k^⊤ and RD,k =
GD,k R̃D,k GD,k^⊤, such that wk ∼ (0, QD,k) and vk ∼ (0, RD,k).
Throughout this thesis we only deal with discrete-time systems. We will
therefore drop the subscript D indicating discretisation, retaining only the
time-step index k, and we will consider purely additive noise. Under those
assumptions, the generic discrete-time (nonlinear) system of Eq. (A3) becomes
as shown in Eq. (A31).
xk+1 = f(xk, uk) + wk (A31a)
yk = g(xk, uk) + vk (A31b)
xk+1 = Ak xk + Bk uk + wk (A32a)
yk = Ck xk + Dk uk + vk (A32b)
wk ∼ N (0, Qk ) (A34a)
vk ∼ N (0, Rk ) (A34b)
system (A29a), where wk ∼ N (0, QD,k ) is a Gaussian noise with covariance QD,k .
Moreover, let its corresponding continuous-time error be w(t) ∼ N (0, QC (t)).
We can investigate how the mean of the state xk changes with time by computing
the expected value of both sides of Eq. (A29a), as indicated in Eq. (A35).
Furthermore, we can define the quantity (xk − x̄k ), and obtain the covariance
Pk of xk as shown in Eq. (A36). All mathematical details can be found in [183].
Eq. (A37) shows the integral formula for QD,k−1 [183], which is strictly linked
to the integral in Eq. (A10c), obtained from the discretisation of w(t). QD,k−1
is in general difficult to calculate, but there exist a few approximations for small
∆t [183] and a convenient formula if AD is invertible [196].
QD,k−1 = ∫_{tk−1}^{tk} e^{AC (tk−τ)} QC(τ) e^{AC^⊤ (tk−τ)} dτ (A37)
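Eq. (A37) can be evaluated numerically for constant AC and QC. The sketch below is not from the thesis: it uses Van Loan's well-known matrix-exponential construction, which returns both AD and QD at once; a truncated-Taylor matrix exponential is adequate for small AC Δt.

```python
import numpy as np

def expm_taylor(M, terms=40):
    """Matrix exponential by truncated Taylor series."""
    out = np.eye(M.shape[0])
    term = np.eye(M.shape[0])
    for n in range(1, terms):
        term = term @ M / n
        out = out + term
    return out

def discrete_process_noise(A_C, Q_C, dt):
    """Van Loan's method: evaluates Eq. (A37) for constant A_C and Q_C."""
    n = A_C.shape[0]
    M = np.zeros((2 * n, 2 * n))
    M[:n, :n] = -A_C
    M[:n, n:] = Q_C
    M[n:, n:] = A_C.T
    E = expm_taylor(M * dt)
    A_D = E[n:, n:].T              # e^{A_C dt}
    Q_D = A_D @ E[:n, n:]          # discrete-time process noise covariance
    return A_D, Q_D

# Scalar sanity check: for A_C = -1, Q_C = 2, Eq. (A37) gives Q_D = 1 - e^{-2 dt}
A_D, Q_D = discrete_process_noise(np.array([[-1.0]]), np.array([[2.0]]), dt=0.1)
```

In the limit AC → 0 the routine recovers the small-Δt approximation QD ≈ QC Δt mentioned in the text.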
The idea on which least square (LS) estimators are based dates back to the
beginning of the nineteenth century, when Karl Friedrich Gauss published his
Theoria Motus [59]. An example which most engineers face at least once during
their career is linear regression, which aims at minimising an LS residual function
in order to determine some parameters from a set of noisy measurements.A5
Let x ∈ Rnx be a vector of unknowns and let y ∈ Rny be a vector of (noisy)
measurements. Furthermore, let y be a linear combination of x plus some noise
A5 We will consider the same example also in section A2.4, where we discuss optimisation
problems with norm approximations. In the context of optimisation, we will show that LS
minimisation involves a squared `2 -norm.
term v ∈ Rny . Then every measurement can be expressed by Eq. (A38) or, in
matrix form, as Eq. (A39), where C ∈ Rny ×nx .
yi = Σ_{j=1}^{nx} Ci,j xj + vi   ∀ i = 1, . . . , ny (A38)
y = Cx + v (A39)
εy = y − C x̂ (A40)
J = Σ_{i=1}^{ny} ε²y,i = εy^⊤ εy (A41)
∂J/∂x̂ = −2 y^⊤ C + 2 x̂^⊤ C^⊤ C = 0 (A42)
The solution of Eq. (A42) is shown in Eq. (A43), and contains the term (C^⊤ C)^{−1} C^⊤,
the so-called (left) Moore-Penrose pseudo-inverse, which exists if ny ≥ nx
and C has full rank.
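The pseudo-inverse solution is a one-liner in practice. A minimal sketch with synthetic data (all numbers hypothetical):

```python
import numpy as np

rng = np.random.default_rng(1)
n_x, n_y = 2, 50                                  # n_y >= n_x, as required
x_true = np.array([1.5, -0.7])                    # unknowns to recover
C = rng.standard_normal((n_y, n_x))               # full rank with probability 1
v = 0.01 * rng.standard_normal(n_y)               # measurement noise
y = C @ x_true + v                                # Eq. (A39)

# Left Moore-Penrose pseudo-inverse applied to the measurements
x_hat = np.linalg.inv(C.T @ C) @ C.T @ y
```

In production code np.linalg.lstsq (or a QR/SVD factorisation) is preferred over forming C^⊤C explicitly, since the explicit normal equations square the condition number.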
yk = Ck x + vk (A45a)
The new estimate x̂k is computed only from the previous estimate x̂k−1 and the
new measurement yk without the need to augment the system. We note that if
either the gain or the correction term is zero, then the new estimate is equal to
the previous estimate. For a procedure to determine the optimal gain Kk we
refer to [183]. The least squares estimator is historically important because it
built a bridge between Bayes' rule (Thm. A1.4) and optimal state estimation.
Moreover, it paved the way to the Wiener filter [198, 199] and to the Kalman
filter [94]. It is worth mentioning that Kk relies on a recursive formula for the
covariance of the LS estimation error. In fact, the propagation of states and
covariances (cf. section A1.2) is crucial for recursive filters for dynamic systems.
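The recursive update of estimate, gain and error covariance can be sketched for a constant unknown vector. This illustration is mine, not the thesis implementation; the gain formula follows the standard recursive LS derivation in [183], and all numbers are made up.

```python
import numpy as np

rng = np.random.default_rng(2)
x_true = np.array([2.0, -1.0])     # constant vector to estimate
x_hat = np.zeros(2)                # initial estimate
P = 100.0 * np.eye(2)              # large initial covariance: low confidence
R = 0.01                           # scalar measurement noise variance

for _ in range(500):
    C_k = rng.standard_normal((1, 2))                        # measurement row
    y_k = C_k @ x_true + np.sqrt(R) * rng.standard_normal(1)
    K = P @ C_k.T / (C_k @ P @ C_k.T + R)                    # gain
    x_hat = x_hat + K @ (y_k - C_k @ x_hat)                  # correction term
    P = (np.eye(2) - K @ C_k) @ P                            # covariance recursion
```

Note that if the correction term yk − Ck x̂k is zero, the estimate is left unchanged, exactly as stated above.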
Single step estimators are recursive filters in which the estimation takes place
at one single time step. Any available prior or future information is typically
condensed into a covariance matrix and propagated to the estimation time
step. According to the choice of the time step it is possible to distinguish four
approaches, recalling the concepts of conditional probability (Def. A1.3) and
expected value (Def. A1.8). The following four definitions outline the differences
between those approaches [183].
Definition A1.16 (A posteriori estimate). The a posteriori estimate x̂+_k =
x̂k|k is defined as the expected value of xk conditioned on all of the measurements
up to and including time step k.

x̂+_k = E(xk | y1, y2, . . . , yk) (A46)
Definition A1.17 (A priori estimate). The a priori estimate x̂−_k = x̂k|k−1
is defined as the expected value of xk conditioned on all of the measurements up
to and including time step k − 1.

x̂−_k = E(xk | y1, y2, . . . , yk−1) (A47)
The most popular linear estimator is the Kalman filter (KF) [94]. It exploits
Defs. A1.16–A1.17 and consists of two phases. The first phase involves Eqs. (A50),
which derive from Eqs. (A35–A36) and show the time update equations for x̂−_k and
P−_k, i.e., the propagation from the a posteriori estimate of the previous time
step to the a priori estimate of the current time step. This phase is referred to
as the prediction.
x̂−_k = Ak−1 x̂+_{k−1} + Bk−1 uk−1 (A50a)
P−_k = Ak−1 P+_{k−1} Ak−1^⊤ + Qk−1 (A50b)
Next, Eqs. (A50) (which are a priori estimates) are updated with a new
measurement, recalling Eq. (A45). This is the so-called update phase and results
in the a posteriori estimates (A51). For the details of the KF we refer to [183].
We note that at time step k = 0 we need to initialise x̂+_0 and P+_0 according to
our best knowledge of the system.
x̂+_k = x̂−_k + Kk (yk − Ck x̂−_k) (A51b)
The literature offers alternative forms for P+_k and Kk. These can be found in [183],
together with their derivations and a discussion regarding their strengths and
weaknesses. Furthermore, it can be shown that if xk is a constant, then Ak = I,
Qk = 0 and uk = 0, and the KF reduces to a recursive LS estimator [183].
The most important feature of the KF is that Pk− , Kk and Pk+ do not depend
on the measurements yk , but depend only on the system parameters Ak , Ck ,
Qk and Rk . This implies that Kk can be precomputed off-line before starting
the filter, saving computational effort and allowing for a convenient real time
implementation.
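Eqs. (A50)–(A51) translate directly into code. A minimal, generic sketch (not tied to any system in the thesis; the toy usage estimates a scalar constant):

```python
import numpy as np

def kf_step(x_hat, P, u, y, A, B, C, Q, R):
    """One Kalman filter cycle: prediction, Eqs. (A50), then update, Eq. (A51)."""
    # Prediction (time update)
    x_pred = A @ x_hat + B @ u                 # Eq. (A50a)
    P_pred = A @ P @ A.T + Q                   # Eq. (A50b)
    # Update (measurement update)
    S = C @ P_pred @ C.T + R                   # innovation covariance
    K = P_pred @ C.T @ np.linalg.inv(S)        # Kalman gain
    x_new = x_pred + K @ (y - C @ x_pred)      # Eq. (A51b)
    P_new = (np.eye(x_hat.size) - K @ C) @ P_pred
    return x_new, P_new

# Toy usage: estimating a scalar constant from repeated measurements equal to 5
x_hat, P = np.zeros(1), 100.0 * np.eye(1)
A = np.eye(1); C = np.eye(1); B = np.zeros((1, 1))
Q = np.zeros((1, 1)); R = np.eye(1)
for _ in range(100):
    x_hat, P = kf_step(x_hat, P, np.zeros(1), np.array([5.0]), A, B, C, Q, R)
```

As the text notes, K and P never touch the data y, so the whole gain sequence could be precomputed off-line.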
We can assess the performance of the KF by introducing the error x̃k = xk − x̂k,
which is a random variable, since xk is driven by the stochastic process wk and
x̂k is computed from measurements corrupted by the stochastic process vk [183]. We can then minimise
the LS error as indicated by Eq. (A52), where Sk is an arbitrary (diagonal)
weighting matrix.
Let us now consider some nonlinear approaches based on the KF. Nonlinear
systems are common in practice since perfectly linear systems do not exist. It is
true on the one hand that some systems can be well represented by a linear model,
but on the other hand nonlinear models are unavoidable in many situations,
where a linear system would not lead to a sufficiently accurate approximation.
Furthermore, in the framework of joint estimators (cf. section A1.6) a linear
system may become nonlinear as far as its inputs and/or parameters are concerned.
A straightforward approach for handling nonlinearities consists of linearising
the system (cf. section A1.1.2) and running a KF. The only problem is to choose a
nominal state trajectory for the linearisation. The so-called extended Kalman
filter (EKF) [14, 183] tackles this problem by exploiting the a posteriori KF
estimation at the previous time step, i.e., the EKF performs a Taylor linear
approximation around xk−1 = x̂+ k−1 and wk = 0 [183]. The EKF is the most
popular estimation approach for mildly nonlinear systems since the only extra
element in comparison with the KF is the linearisation. Consequently, its
computational demand remains relatively low.
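The linearisation step of the EKF can be sketched with finite-difference Jacobians. This is a generic illustration, not the thesis implementation; analytical Jacobians are normally preferred in practice.

```python
import numpy as np

def jacobian_fd(fun, x, eps=1e-6):
    """Forward-difference Jacobian of fun at x."""
    fx = np.asarray(fun(x))
    J = np.zeros((fx.size, x.size))
    for i in range(x.size):
        xp = x.copy()
        xp[i] += eps
        J[:, i] = (np.asarray(fun(xp)) - fx) / eps
    return J

def ekf_step(x_hat, P, y, f, g, Q, R):
    """One EKF cycle: linearise f and g around the latest estimate (w = 0)."""
    A = jacobian_fd(f, x_hat)                  # Taylor linearisation at x_hat
    x_pred = f(x_hat)
    P_pred = A @ P @ A.T + Q
    C = jacobian_fd(g, x_pred)
    K = P_pred @ C.T @ np.linalg.inv(C @ P_pred @ C.T + R)
    x_new = x_pred + K @ (y - g(x_pred))
    P_new = (np.eye(x_hat.size) - K @ C) @ P_pred
    return x_new, P_new

# Sanity check on a linear system, where the EKF must behave like a KF
f = lambda x: x            # identity dynamics
g = lambda x: x            # direct state measurement
x_hat, P = np.zeros(1), 10.0 * np.eye(1)
Q, R = 0.0 * np.eye(1), np.eye(1)
for _ in range(100):
    x_hat, P = ekf_step(x_hat, P, np.array([2.0]), f, g, Q, R)
```

For a linear f and g the finite-difference Jacobians are exact and the EKF reduces to the KF, which is a convenient correctness check before moving to a nonlinear model.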
For highly nonlinear systems, higher-order approaches based on the EKF
may perform well. For example, the iterated EKF applies a further first-
order Taylor approximation around the new EKF estimate, and runs the EKF
again. This can be repeated iteratively until a certain linearisation accuracy
is obtained. Alternatively, a second-order EKF employs a second-order Taylor
series expansion of the system equations [6, 183]. Another approach for nonlinear
systems is the Gaussian sum filter [4, 183], in which a non-Gaussian pdf is
approximated by a sum of nm Gaussian pdfs, resulting in nm parallel KFs.
Their nm estimates are then combined in a final estimate. The pdf can also be
approximated by using non-Gaussian functions. In comparison with the EKF,
all the other above-mentioned approaches require more computational power, and
their popularity is not growing. In fact, an EKF running at a high sampling
rate may be more efficient.
In sections A1.4.1 and A1.4.2 we presented a few single step estimators, from
linear approaches (recursive LS estimator, KF, H∞ filter) up to methodologies
that can handle mild as well as severe nonlinearities (EKF and its variants,
UKF, particle filter). For linear systems, the KF provides the optimal estimate.
The moving horizon estimator (MHE), by contrast, performs the estimation
within a time window instead of at a single time step. Even if optimality cannot
be guaranteed, the MHE has been shown to perform better than the EKF in
case of cost functions with local optima [71, 72]. The price to pay is again a
higher computational effort, which scales proportionally to the window length.
The MHE is becoming a popular estimator thanks to its intrinsic capability
to handle nonlinearities. Moreover, a recent research stream regarding model
predictive control (MPC) and optimisation algorithms is developing tools with
the aim to improve MPC/MHE real time implementation (cf. chapter A2).
However, the basic idea of MHE is not new. A list of references to trace the
history of multistep estimators is given in [167].
In the context of the CS-MHE for joint state/input estimation, we chose the
MHE among all other estimators since it provides a time dimension in which
we can exploit input sparsity (cf. section A1.7.4). In the remaining part of
this section we introduce the MHE, which we will recall in chapter B1 for the
derivation of the CS-MHE.
In this section we show how the MHE is derived from probability theory. Since
we are interested in implementing an MHE scheme on a digital computer, we
limit the discussion to a discrete-time MHE. Let us consider the full length
time signal of a nonlinear discrete-time dynamic system such as Eq. (A31),
from an initial point (k = 0) to the current time step (k = T). We recall
that noise is assumed to be purely additive (cf. section A1.2). Under the
assumption of a discrete time Markov process (cf. section A1.2) [140], we can
formulate a state estimation problem by recalling the definition of conditional
probability (Def. A1.3) applied to a probability density function (pdf). We are
interested in the conditional pdf in Eq. (A53), i.e., the pdf of the state
evolution x0, x1, . . . , xT given the process measurements y0, y1, . . . , yT−1. In
fact, the optimal estimate of the state at time step k given measurements
y0, y1, . . . , yT−1 is a function of Eq. (A53) [167].
p(x0 , x1 , . . . , xT | y0 , y1 , . . . , yT −1 ) (A53)
We can express the joint probability of the state (Def. A1.2) as in Eq. (A54),
where px0 represents the prior information about the initial state of the system.
Moreover, Eq. (A31b) for the measurements leads to the conditional probability
in Eq. (A55).
p(x0, x1, . . . , xT) = px0(x0) ∏_{k=0}^{T−1} p(xk+1 | xk) (A54)

p(y0, y1, . . . , yT−1 | x0, x1, . . . , xT−1) = ∏_{k=0}^{T−1} pvk[yk − g(xk, uk)] (A55)

p(x0, x1, . . . , xT | y0, y1, . . . , yT−1)
∝ px0(x0) ∏_{k=0}^{T−1} pvk[yk − g(xk, uk)] p(xk+1 | xk) (A56)
Maximising Eq. (A56) yields the maximum a posteriori estimate of the state
sequence. Taking the logarithm, which preserves the maximiser, the problem becomes

arg max_{x0,x1,...,xT} { log px0(x0) + Σ_{k=0}^{T−1} log pvk[yk − g(xk, uk)]
                         + Σ_{k=0}^{T−1} log pwk[xk+1 − f(xk, uk)] }
If we choose pvk (·), px0 (·) and pwk (·) as the three Gaussian distributions in
Eq. (A60), we obtain problem (A61), where we defined kzk2H = z > Hz. It is
worth noting that the new problem involves a minimisation of the errors of a
model and measurements.
arg min_{x0,x1,...,xT} { ‖x0 − x̄0‖²_{Π0^{−1}} + Σ_{k=0}^{T−1} ( ‖yk − g(xk, uk)‖²_{R^{−1}}
                         + ‖xk+1 − f(xk, uk)‖²_{Q^{−1}} ) } (A61)
Furthermore, we add a set of bounds on the optimisation variables, i.e.,
xk ∈ [xk^LB, xk^UB], wk ∈ [wk^LB, wk^UB], vk ∈ [vk^LB, vk^UB], where the suffixes LB and UB
indicate lower and upper bounds, respectively. Those bounds are typically closed
and convex, such as polyhedral convex sets. A constraint on wk is common
practice, but the same is not true for vk and xk. In fact, reference [167]
advises against a constraint on the measurement noise due to the possibility
of outliers. Furthermore, constraining a state is not a trivial task since xk
and wk may be correlated, and a constraint may result in a violation
of causality [167]. In general, constraints can significantly alter the
probabilistic structure of the problem. Nevertheless, their advantage is great
during the modelling process, since constraints allow for simplified models
[167, 184]. For example, in problem (A62) we can see that a generic state-space
model has been directly implemented by the constraints (A62b–A62c). We
defined {wk}_{k=0}^{T−1} = {w0, w1, . . . , wT−1}.
minimise_{x0, {wk}_{k=0}^{T−1}}  Σ_{k=0}^{T−1} ( ‖vk‖²_{R^{−1}} + ‖wk‖²_{Q^{−1}} ) + ‖x0 − x̄0‖²_{Π0^{−1}} (A62a)

subject to  xk+1 = f(xk, uk) + wk (A62b)
            yk = g(xk, uk) + vk (A62c)
            xk ∈ [xk^LB, xk^UB], wk ∈ [wk^LB, wk^UB], vk ∈ [vk^LB, vk^UB] (A62d)
We notice that matrices Q and R are the tuning parameters for matching
the process model with the measurements. In fact, Q provides a measure of
confidence in the model, whereas R provides a measure of confidence in the
measurement system. Furthermore, matrix Π0 provides a measure of confidence
about the knowledge of the initial state x0. Covariances Q and R have to be
built according to the information that is available about the model and the
measurement system. For the latter this is usually a simple task, since the
accuracy of the measurement system is known, while a few assumptions are
needed to choose a value for the model uncertainty.
Problem (A62) grows without bounds in time, up to the point at which solving
the system becomes infeasible. For this reason, in Eq. (A63) the cost
function (A62a) is divided into two parts, the second of which is characterised
by a fixed horizon of N time steps.
ΓT = Σ_{k=0}^{T−N−1} ( ‖vk‖²_{R^{−1}} + ‖wk‖²_{Q^{−1}} ) + ‖x0 − x̄0‖²_{Π0^{−1}} (A63a)

   + Σ_{k=T−N}^{T−1} ( ‖vk‖²_{R^{−1}} + ‖wk‖²_{Q^{−1}} ) (A63b)
The second line of ΓT, indicated by Eq. (A63b), depends only on the state
xT−N, the disturbance {wk}_{k=T−N}^{T−1} and the process measurements {yk}_{k=T−N}^{T−1}.
The last step before we can obtain the MHE formulation is to define the arrival
cost from Eq. (A63a). The arrival cost is labelled as ZT −N (x̄0 ) in Eq. (A64),
where x̂(·) is an already available estimate. We will discuss this MHE parameter
in section A1.5.2. Finally, problem (A65) shows a generic MHE [167]. It is
worth mentioning that for unconstrained linear systems the MHE collapses to
the KF [71].
minimise_{xT−N, {wk}_{k=T−N}^{T−1}}  ‖xT−N − x̂T−N‖²_{Π_{T−N}^{−1}} + Σ_{k=T−N}^{T−1} ( ‖wk‖²_{Q^{−1}} + ‖vk‖²_{R^{−1}} ) (A65a)

subject to  xk+1 = f(xk, uk) + wk (A65b)
            yk = g(xk, uk) + vk (A65c)
            xk ∈ [xk^LB, xk^UB], wk ∈ [wk^LB, wk^UB], vk ∈ [vk^LB, vk^UB] (A65d)
[Figure: the moving estimation window, spanning time steps T − N + 1 to T, preceded by the arrival cost (a.c.) that summarises all information up to the start of the window.]
minimise_{wa, wk, vk}  wa^⊤ Pa^{−1} wa + Σ_{k=T−N+1}^{T−1} wk^⊤ Qk^{−1} wk + Σ_{k=T−N+1}^{T} vk^⊤ Rk^{−1} vk (A66a)

subject to  xk+1 = f(xk, uk) + wk (A66b)
            yk = g(xk, uk) + vk (A66c)
            xT−N+1 = x̄T−N+1 + wa (A66d)
            xk ∈ [xk^LB, xk^UB], wk ∈ [wk^LB, wk^UB], vk ∈ [vk^LB, vk^UB] (A66e)
Eq. (A66a) is a cost function and consists of three noise terms to be minimised.
From left to right, they are related to the arrival cost wa ∈ Rnx , the model
error wk ∈ Rnx and the measurement error vk ∈ Rnr , where nx and nr are the
number of states and transducers, respectively. We assume that each variable
is associated with a covariance matrix as indicated in Eq. (A67).
wa ∼ N (0, Pa ) (A67a)
wk ∼ N (0, Qk ) (A67b)
vk ∼ N (0, Rk ) (A67c)
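For a linear system without bounds, the structure of this problem can be illustrated with a deliberately simplified sketch (my own illustration, not the thesis formulation): fixing wk = 0 inside the window reduces the MHE to a weighted least squares problem for the state at the start of the window, with the arrival cost acting as one extra weighted row.

```python
import numpy as np

def mhe_window_ls(A, C, y_win, x_prior, P_a, R):
    """Simplified linear MHE: no bounds, w_k = 0 inside the window.
    Estimates the state at the start of the window by stacking the
    arrival-cost row and one weighted row per measurement."""
    n = A.shape[0]
    rows, rhs = [], []
    # Arrival cost: || x - x_prior ||^2 weighted by P_a^{-1}
    L_a = np.linalg.cholesky(np.linalg.inv(P_a))
    rows.append(L_a.T)
    rhs.append(L_a.T @ x_prior)
    # Measurements: y_k = C A^k x + v_k, weighted by R^{-1}
    L_r = np.linalg.cholesky(np.linalg.inv(R))
    A_pow = np.eye(n)
    for y_k in y_win:
        rows.append(L_r.T @ C @ A_pow)
        rhs.append(L_r.T @ y_k)
        A_pow = A @ A_pow
    H = np.vstack(rows)
    b = np.concatenate(rhs)
    x0_hat, *_ = np.linalg.lstsq(H, b, rcond=None)
    return x0_hat

# Scalar example: five identical measurements of a constant state, weak prior
x0 = mhe_window_ls(np.eye(1), np.eye(1),
                   [np.array([3.0])] * 5,
                   np.zeros(1), 1e6 * np.eye(1), np.eye(1))
```

A full MHE keeps {wk} as decision variables and adds the bound constraints, which turns the problem into a constrained QP; the sketch only exposes how the arrival cost and the measurement residuals enter one common least squares objective.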
The role of the arrival cost for the MHE is to include the past information in
the estimation. Reference [71] outlines two families of approaches for
computing the arrival cost, i.e., the filtering and the smoothing schemes
(cf. Defs. A1.17–A1.18). In a filtering scheme, the optimisation takes the
past information into account by penalising the deviation of the initial estimate
within a horizon from an a priori estimate. In other words, the arrival cost is
evaluated based on information collected before (and not including)
the moving horizon. On the other hand, a smoothing scheme penalises
the deviation of the trajectory of the states within the whole
estimation horizon from an a priori estimate. In this case, the arrival cost can
benefit from information located in the past (i.e., before the current
time step) but at the same time virtually in the future with respect
to the first time step of the estimation horizon. Reference [71] indicates the
smoothing scheme as superior to the filtering scheme.
There are several ways to deal with the arrival cost. A first approach is simply
not to take into account any information prior to the sliding window. A second
method is to set a constant value for the arrival cost (i.e., Pa = constant),
and a third technique employs a recursive filter, such as the EKF or the UKF
(cf. section A1.4.2) [155, 167]. If we follow the smoothing approach, the so-
called smoothed arrival cost exploits the covariance matrix of the optimisation
problem [122, 123, 168], which implements a quadratic approximation of the past
information [37]. It will become clear in chapter B1 that this approach is well
suited for the CS-MHE, since the covariance matrix is required independently
of the arrival cost (cf. section B1.2.1).
The effect of the arrival cost is minimal for a long window, in which a large set of
data is available for the optimisation, such that one extra element does not have
a strong influence on the solution. Conversely, a good estimate of the
arrival cost is crucial if the window is short. The latter is the most interesting
case from a practical point of view, because a short window allows for a faster
computation, which may be the bottleneck for online (real time) applications.
The window length is then a tuning parameter for the MHE. According to [167],
a good trade-off between accuracy and computational effort can be obtained by
setting the length of the moving window as twice the observability index of the
system.
In section A1.4 we showed that the KF provides the optimal estimates for
linear Gaussian systems at a very reasonable computation cost. Therefore, in
such a situation an MHE would not perform better and would require a higher
computational effort. However, optimality is not guaranteed for nonlinear
systems, and this applies to single step approaches as well as to the MHE.
Furthermore, both the nonlinear extensions of the KF and the MHE require a
higher computational cost than the KF. A comparison is not a simple task since
the computational performance of the MHE depends on the horizon length and
on the implementation strategy.
In general we expect the MHE to be more computationally demanding than
the EKF, but this comes with the great advantages of a better description of the
nonlinear behaviour, the possibility to capture fast changes in the system, and
robustness. In other words, optimality is not guaranteed for the MHE, but in
practice the MHE performs better than an EKF for nonlinear systems and in
case of estimation problems with multiple optima [71, 72]. For those reasons, the
MHE is gaining popularity over other model-based estimation techniques.
Finally, we will mention in chapter A2 that tailored algorithms are available for
a fast implementation of the MHE.
Figure A1.2: Two strategies for the estimation of multiple variables. Dual filter
(left) and joint estimator (right).
exchanging the updated estimates. This method does not imply that one filter is
dedicated to the states while the other takes into account the inputs/parameters.
In fact, it may be convenient for certain applications to split the state vector
into two parts, provided that the dependency between the two new state vectors
is not too strong. An example is the separation between variables referring
to a vehicle longitudinal and lateral dynamics in [80]. On the other hand, a
joint estimator is a single filter based on state augmentation, i.e., the number of
state variables increases by the number of inputs/parameters to be estimated
[133, 183]. In parallel, the state equations themselves can be augmented with
extra equations that refer to the newly introduced variables.
The CS-MHE is a joint filter, and in the next section we will present some
state of the art techniques to model an input. The main advantage of a joint
estimator over a dual filter approach consists in capturing the cross-coupling
between all (augmented) states by means of a single covariance matrix, while
a drawback involves possible observability issues that can arise if we want to
estimate too many variables from too little information (cf. section A1.8). It
is worth mentioning that a linear system can be nonlinear with respect to its
inputs and/or parameters. For this reason, joint estimators require nonlinear
techniques, such as the nonlinear single step approaches based on the KF (cf.
section A1.4.2) and the MHE (cf. section A1.5).
In the previous section we introduced the concepts of joint estimation and state
augmentation, i.e., inputs and/or parameters to be estimated become part of
the state vector (cf. section A1.6). At the same time, additional equations may
be needed to model these new states. Furthermore, the system may become
unobservable (cf. section A1.8). Joint state/input estimators can benefit from
input representations. In fact, these provide the estimation problem with some
extra information aiming at an improved observability and a higher estimation
accuracy.
In this section we outline four state of the art approaches for modelling an
augmented state in case of joint estimators (sections A1.7.1 to A1.7.4). We draw
attention to the basic concepts, main applications, and points of strength
and weakness of each scheme. All methodologies that we present here can
be applied to represent both inputs and parameters. Nevertheless, parameter
estimation is in practice a simpler task since parameters influence the system
from an internal source which is relatively easy to locate. Moreover, their
dynamic behaviour is rather slow, since a parameter is likely to evolve slowly
over time. Since the estimation of an unknown input is our primary interest,
high dynamic ranges and uncertainties make the task more challenging. This
section ends with a discussion that allows us to justify the choice of the MHE
and CS as foundation of the CS-MHE, and consequently of this dissertation
(section A1.7.5).
Let us begin this overview of representations for unknown inputs and parameters
by mentioning that it is possible to set up an estimation problem without
including any additional model. In such case there are neither extra equations
nor augmented states. The unknowns are taken into account by some uncertainty
terms (covariances), which have to be properly tuned. Furthermore, the model
for the state estimation problem should be as accurate as possible, to avoid
that any model uncertainty is considered as an input. We refer to this situation
with the acronym NI (no input).
The NI approach is used in the field of linear observers for unknown inputs, it
is very general and it is proven to be optimal for linear cases, i.e., it results
in the minimum variance [61, 62, 81, 82, 83, 111, 124, 125, 128]. It requires
measurements (including acceleration) to be processed at their exact time step
and for this reason it is often implemented following a smoothing approach (cf.
Def. A1.18) [124, 127]. If the input location is not known, it is possible to define
an arbitrary input which excites the system in an equivalent way [124]. The NI
approach requires a good knowledge of the system model, is not suitable for
many inputs, and requires a certain number of measurements (more than the
number of states [124]). Among the different methods for force reconstruction that fall into the
NI group, we mention inverse methods (e.g., the inverse structural filter [186])
and the unknown input observer [112]. Inverse methods are widely employed
but they often suffer from numerical instability [70], while the unknown input
observer [69, 113] is a time domain approach which involves state augmentation.
It was originally developed for pole placement in control engineering, it operates
only on time-invariant systems, and it requires a full column rank feedthrough
matrix and a number of sensors at least equal to the number of unknown
inputs plus the number of any nonlinear terms [112].
To summarise, the NI is well suited for linear systems if a very accurate model
is available and if a large number of transducers can be placed on the system.
While this could be the case in civil engineering, these requirements are likely not
to be satisfied for mechanical systems. For example, the joints of a mechanism
may bring in several model uncertainties as well as nonlinear behaviours, which
are extremely challenging to be modelled accurately. Furthermore, conditions
such as temperature gradients and high rotational speeds make measurements
more difficult from a hardware point of view. In general, the installation of
transducers modifies the system dynamics and adds extra costs (for purchase
and maintenance). For these reasons, in this dissertation we will not focus on
the NI approach. However, in section B2.1 we will set up an MHE scheme for
joint state/input estimation that utilises the NI, which we employ as a term
of comparison while discussing rank and condition number of the CS-MHE
matrices.
u̇ = 0 + wC,u (A68a)
The RW involves a time derivative, and the case illustrated by Eqs. (A68–A69)
is referred to as a zero-order RW. Higher orders of RW models are possible,
but they are not popular since they require the estimation of extra augmented
states, leading to possible observability issues. Variances QC,u and QD,u are
crucial tuning parameters for the estimation. In fact, they determine how much
a variable can vary between two consecutive time steps. A low variance allows
the parameter to vary smoothly. At the same time, it implies a high accuracy
of the input model. On the other hand, we would require a high variance if we
expect a variable to evolve with a fast dynamics, but this is very likely to make
the input model unreliable up to the point that the augmented state has no
weight in the estimation problem. For this reason, the designer of a filter should
pay special attention when setting the covariances, since a wrong choice can
jeopardise the estimation [195]. In case a covariance needs to be adapted online,
reference [75] proposes a so-called forgetting factor for the estimation of the
covariance of a parameter, while the Robbins-Monroe stochastic approximation
scheme recursively adapts the covariance of the process noise based on the
Kalman gain [119].
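The zero-order RW model of Eq. (A68) is implemented by plain state augmentation. A generic sketch (illustrative; the matrix names and the example numbers are mine):

```python
import numpy as np

def augment_with_rw_input(A, B, C, Q, Q_u):
    """Augment x_a = [x; u] with a zero-order random-walk input model:
    x_{a,k+1} = [[A, B], [0, I]] x_{a,k} + w_a,   y_k = [C, 0] x_{a,k} + v_k."""
    n, m = A.shape[0], B.shape[1]
    A_a = np.block([[A, B],
                    [np.zeros((m, n)), np.eye(m)]])      # input held by the RW
    C_a = np.hstack([C, np.zeros((C.shape[0], m))])      # input is not measured
    Q_a = np.block([[Q, np.zeros((n, m))],
                    [np.zeros((m, n)), Q_u]])            # Q_u tunes input agility
    return A_a, C_a, Q_a

# Example: 1-DOF model, position measured, one unknown force input
A = np.array([[1.0, 0.1], [0.0, 1.0]])
B = np.array([[0.0], [0.1]])
C = np.array([[1.0, 0.0]])
A_a, C_a, Q_a = augment_with_rw_input(A, B, C, 0.01 * np.eye(2), np.array([[1.0]]))
```

Any standard KF/EKF can then run on (A_a, C_a, Q_a); a large Q_u lets the estimated input move quickly, at the price of a noisier estimate, exactly the trade-off discussed above.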
A zero-order RW model can effectively represent a parameter, since parameters
are typically not expected to vary much over two consecutive time steps. On
the other hand, if an input signal is characterised by fast dynamics, then the
covariance associated with it would be relatively high and the input model may lose its influence
within the estimator. An RW model is implemented by augmenting the state
vector with each input, assuming that the input signal evolves relatively slowly
and according to a predefined covariance value. Its applicability holds for
single step as well as for multistep filters (cf. sections A1.4–A1.5). We can
expect better observability in comparison with the NI approach provided that
the number of augmented states satisfies the observability requirements (cf.
section A1.8). In a filtering scheme, the RW model may result in a delayed
input estimation.
To summarise, the RW is widely employed, since it is a very generic modelling
tool. It can be instrumental for both single step and multistep estimators, and
offers better observability in comparison with the NI approach, provided that
the dynamics of the estimate is relatively slow. The RW does not model any
physical behaviour, often results in a delayed estimation, and the performance
of the estimation problem depends strongly on the tuning of the covariances.
Each augmented state is linked to an RW equation, and observability issues
[Figure: random walk models of order 0, 1 and 2 over the estimation window, spanning time steps T − N + 1 to T.]
may arise if we need to model too many estimates, e.g., in case the location of
an input is not known a priori. Furthermore, an input applied at a location
which is not modelled may leak into both the state estimation and the input
estimation in an uncontrolled manner.
Table A1.1 summarises the major features of the input models that we discussed
in this section (we omitted the NI approach). The features marked with a cross
between brackets indicate that in practice it is not possible for the RW and
the PA approaches to deal with unknown input locations and distributed loads
because of observability issues, even if in theory one could discretise the system
and apply a model to each node. From Table A1.1 we understand that the
estimation of lumped inputs with slow dynamics applied at a known location
is in general not a critical task, while the same does not apply to distributed
inputs with fast dynamics applied at unknown locations. We note that the CS
can provide a solution to these problems provided that the input shape is known
and the estimation problem is formulated on a time window instead of on a
single step.

Table A1.1: Features of the input models (✓ = supported, ✗ = not supported).

Feature                               RW    PA    CS
known input location                  ✓     ✓     ✓
unknown input location               (✗)   (✗)    ✓
single step estimator                 ✓     ✗     ✗
multistep estimator                   ✓     ✓     ✓
slow input dynamics                   ✓     ✓     ✓
fast input dynamics (e.g., impulse)   ✗     ✗     ✓
lumped input                          ✓     ✓     ✓
distributed input                    (✗)   (✗)    ✓
based on a model                      ✗     ✓     ✓
unknown input shape                   ✓     ✓     ✗
the initial state x(0) can be uniquely determined by the knowledge of the input
u(τ ) and output y(τ ) ∀ τ ∈ [0, t].
Definition A1.21 (Observability of a discrete-time system). A discrete-time
system is observable if for any initial state x0 and any final time k the initial
state x0 can be uniquely determined by the knowledge of the input ui and output
yi ∀ i ∈ [0, k].
Def. A1.20 is stricter than Def. A1.21: for continuous-time systems, the initial state must be determinable at any final time. The definition
of observability involves the initial state, and implies that all states between the
initial and final times can also be determined [183]. Literature offers several
equivalent tests to assess observability for continuous-time and discrete-time
LTI systems. Here we limit our discussion to the observability matrix (also
referred to as the Kalman observability matrix), while we refer to [114, 183] for
further theorems.
\[
O = \begin{bmatrix} C_C \\ C_C A_C \\ \vdots \\ C_C A_C^{\,n_x-1} \end{bmatrix} \tag{A71}
\]
\[
\operatorname{rank}(O) = n_x \tag{A72}
\]
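As a quick numerical illustration, the test of Eqs. (A71)–(A72) can be sketched in a few lines (a hypothetical two-state example of our own, not taken from the text):

```python
import numpy as np

def observability_matrix(A, C):
    """Stack C, CA, ..., CA^(nx-1) as in Eq. (A71)."""
    nx = A.shape[0]
    blocks = [C]
    for _ in range(nx - 1):
        blocks.append(blocks[-1] @ A)
    return np.vstack(blocks)

# Hypothetical single-DOF oscillator in state-space form, position measurement.
A = np.array([[0.0, 1.0], [-4.0, -0.2]])   # state-space matrix A_C
C = np.array([[1.0, 0.0]])                  # output matrix C_C
O = observability_matrix(A, C)
print(np.linalg.matrix_rank(O))  # 2 -> rank equals nx, Eq. (A72), so observable
```

A rank deficiency of `O` (rank below `nx`) would instead flag an unobservable mode.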
A possible test which relates to the topic of observability involves the rank and condition number of the discrete-time system, when we consider the problem as an overdetermined weighted least-squares fit. These values establish a link between observability and estimation uncertainty, and can be used to tune estimation performance [79, 105]. Furthermore, such an approach is suited to estimators and inverse problems in general, thus including multistep estimators such as the MHE. We would like to stress that a test of rank and condition number does not assess observability as it was defined in Defs. A1.20 and A1.21. However, those two metrics are of paramount importance to address
whether a system is ill-posed. In particular, we point out the following:
The (`2 -norm) condition number of a matrix is defined as the ratio of the largest
singular value of the matrix to the smallest. For arbitrary-size matrices we can
obtain the singular values through a singular value decomposition (SVD), while
for a square matrix we can consider its eigenvalues. Following this line, the
Popov-Belevitch-Hautus (PBH) criterion for local observability (Thm. A1.24)
[60, 114, 134] provides us with more information since it takes into account
every eigenvalue λ of the continuous-time state-space matrix AC . We refer to
[38] and the references therein for further metrics and considerations related to the
topic of observability.
Theorem A1.24 (PBH observability). The nx state continuous-time LTI
system (A70) is observable over either R or C if and only if it satisfies
condition (A73).
\[
\operatorname{rank}\!\begin{bmatrix} A_C - \lambda I \\ C_C \end{bmatrix} = n_x \quad \forall\ \lambda \in \mathbb{C} \tag{A73}
\]
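The PBH test of Eq. (A73) can likewise be sketched numerically, looping over the eigenvalues of the state-space matrix (again with hypothetical matrices of our own):

```python
import numpy as np

def pbh_observable(A, C, tol=1e-9):
    """PBH test (Thm. A1.24): rank([A - lam*I; C]) must equal nx for every eigenvalue lam."""
    nx = A.shape[0]
    for lam in np.linalg.eigvals(A):
        M = np.vstack([A - lam * np.eye(nx), C])
        if np.linalg.matrix_rank(M, tol=tol) < nx:
            return False  # this eigenvalue corresponds to an unobservable mode
    return True

A = np.array([[0.0, 1.0], [-4.0, -0.2]])
print(pbh_observable(A, np.array([[1.0, 0.0]])))                    # True
print(pbh_observable(np.diag([1.0, 2.0]), np.array([[1.0, 0.0]])))  # False: second mode unseen
```

Compared with the rank test of Eq. (A72), this variant indicates which eigenvalue (mode) breaks observability.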
In order to improve the observability of a system, one can typically:
1. add measurements;
2. implement or change an input model (cf. section A1.7);
3. keep the number of inputs within the observability threshold;
4. reduce the model complexity.
In general we prefer to avoid points 3–4, since they provide us with less
information. On the other hand, solution 1 is relatively easy to implement,
and for this reason it is widely employed. The major concerns regard economic
constraints (measurements may be expensive due to the costs of transducers and
acquisition systems) and the fact that certain measurements which are needed
to increase observability may not be available. In fact, specific sensors may
not exist or cannot be physically mounted on a given set-up. Finally, point 2 suggests drawing attention to input models, going in the direction of virtual sensing and advanced model-based estimation techniques. The CS-MHE follows this concept, since it exploits known input information in order to implement a tailored modelling approach. This requires on the one hand some effort to build the models, but on the other hand it reduces the need for an overwhelming amount of measurements.
In chapter B2 we will address rank and condition number of the CS-MHE
[105]. Besides observability, the state of the art of control engineering and state
estimation offers additional system properties such as stability, detectability,
reachability and controllability. Throughout this dissertation we do not employ
those definitions, and therefore we refer to [5, 12, 33, 34, 53, 93, 183] for further
information.
A1.9 Conclusions
Consequently, joining the MHE with CS into the CS-MHE seemed very
promising to us. We will describe the derivation of the CS-MHE in part B
of this dissertation, while in the next chapter we introduce a few concepts of
optimisation.
Chapter A2
Optimisation
Acknowledgements
This chapter is an overview of the state of the art on the topic. Great sources of
inspiration for writing this chapter were [21, 54, 137, 204]. We are grateful to
the developers of SPGL1 [193, 194] and CVX [67, 68], which are two packages
compatible with MATLAB® for specifying and solving convex programs that
helped us during our first steps in the world of numerical optimisation.
A generic NLP can be written in the standard form of problem (A74):
\[
\underset{z}{\text{minimise}}\ \ f(z) \tag{A74a}
\]
\[
\text{subject to}\ \ g(z) = 0 \tag{A74b}
\]
\[
h(z) \le 0 \tag{A74c}
\]
Definition A2.3 (Feasible set). The feasible set (or domain) is defined as
follows:
F = {z | g(z) = 0, h(z) ≤ 0}. (A77)
Definition A2.4 (Feasible point). A point z is called feasible if and only if
z ∈ F. Problem (A74) is feasible if a feasible point exists.
Definition A2.5 (Global optimum). A point z∗ is called a global optimum (or, in this case, a global minimum) if and only if
\[
f(z^*) \le f(z) \quad \forall\, z \in \mathcal{F}. \tag{A78}
\]
A point z∗ is called a local optimum if the same condition holds in a neighbourhood B(z∗) of z∗:
\[
f(z^*) \le f(z) \quad \forall\, z \in \mathcal{F} \cap B(z^*). \tag{A79}
\]
Dual variables are employed to build the so-called dual problem [137]. On the
other hand, the optimisation variables z are also referred to as primal variables,
and problem (A74) is called the primal problem.
In this section we define the conditions for local optimality. Let us first introduce the concept of constraint qualification, which is necessary to ensure that the constraint linearisation captures the essential geometric features of the feasible set.
We can now introduce the following theorem concerning the first order necessary
conditions for local optimality.
Theorem A2.13 (Karush-Kuhn-Tucker (KKT) conditions). Let z ∗ be a local
solution to problem (A74) that satisfies the LICQ. Then there exist Lagrange
multipliers λ∗ and µ∗ such that the KKT conditions (A82) hold.
\[
\nabla_z \mathcal{L}(z^*, \lambda^*, \mu^*) = 0 \tag{A82a}
\]
\[
g(z^*) = 0 \tag{A82b}
\]
\[
h(z^*) \le 0 \tag{A82c}
\]
\[
\mu^* \ge 0 \tag{A82d}
\]
\[
\mu_i^*\, h_i(z^*) = 0, \quad i = 1, \ldots, n_{\mathrm{ieq}} \tag{A82e}
\]
Conditions (A82e) are called complementarity conditions and imply that either
constraint hi (z ∗ ) is active (i.e., hi (z ∗ ) = 0) or µ∗i = 0, or possibly both.
Definition A2.15 (Weakly active constraints and strictly active constraints).
Let point (z ∗ , λ∗ , µ∗ ) be a KKT point of problem (A74). If for an index
i ∈ A(z) it holds µ∗i = 0 then hi (z ∗ ) is said to be weakly active, and i ∈ Aw (z).
Alternatively, if µ∗i > 0 then hi (z ∗ ) is said to be strictly active, and i ∈ As (z).
Definition A2.16 (Strict complementarity). Let point (z ∗ , λ∗ , µ∗ ) be a KKT
point of problem (A74). If all active constraints are strictly active, i.e., if µ∗i > 0
for all i ∈ I ∩ A(z ∗ ), then strict complementarity holds.
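As a sanity check, the KKT conditions can be verified numerically on a toy problem. The sketch below is our own example, not from the text; it assumes SciPy's SLSQP solver and the sign convention L = f + μ⊤h for h(z) ≤ 0 (multiplier sign conventions differ across references):

```python
import numpy as np
from scipy.optimize import minimize

# Toy problem: minimise (z1-1)^2 + (z2-2)^2  s.t.  h(z) = z1 + z2 - 1 <= 0.
f = lambda z: (z[0] - 1) ** 2 + (z[1] - 2) ** 2
grad_f = lambda z: np.array([2 * (z[0] - 1), 2 * (z[1] - 2)])
grad_h = np.array([1.0, 1.0])

# SciPy's "ineq" constraints mean fun(z) >= 0, so we pass 1 - z1 - z2 >= 0.
res = minimize(f, x0=[0.0, 0.0],
               constraints=[{"type": "ineq", "fun": lambda z: 1 - z[0] - z[1]}])
z_star = res.x  # analytic solution: (0, 1), on the constraint boundary

# Recover the multiplier from stationarity grad_f + mu * grad_h = 0
# (least-squares projection onto grad_h).
mu = -grad_f(z_star) @ grad_h / (grad_h @ grad_h)
print(np.round(z_star, 3), round(float(mu), 3))
```

Here the constraint is strictly active: h(z∗) = 0 with μ∗ = 2 > 0, so strict complementarity (Def. A2.16) holds at this point.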
The KKT conditions (A82) characterise critical points, while second derivative information allows us to further discard undesirable candidates such as saddle points [137] and can also be employed to state sufficient conditions for local optimality.
Definition A2.18 (Jacobian of the strictly active constraints). The Jacobian of the strictly active constraints is defined as
\[
G_{\mathrm{sa}} = \begin{bmatrix} \nabla g(z^*) & \nabla h_{\mathcal{A}_s(z^*)}(z^*) \end{bmatrix}^{\!\top}. \tag{A83}
\]
The definitions and theorems regarding local optimality conditions that we included in this section aim to give an overview of the topic, and do not pretend to be exhaustive.
Convex programming problems (CPs) are an NLP subclass which derives from
problem (A74) under the additional assumptions of Def. A2.7. In this section
we list a few noteworthy types of CPs.
Quadratic programs (QPs) are crucial in the framework of state estimation
problems based on an MHE, which include the CS-MHE that we will discuss
in chapter B1. They are relatively easy to solve and for this reason they often
appear in real time optimal control problems such as MPC [54]. Formally, a
generic QP has the standard form of problem (A85), where z ∈ Rnz , H ∈ Rnz ×nz
is symmetric, q ∈ Rnz , Aeq ∈ Rneq ×nz , beq ∈ Rneq , Aieq ∈ Rnieq ×nz , bieq ∈ Rnieq .
We assume that the QP is convex, i.e., H ⪰ 0.^(A9)
\[
\underset{z}{\text{minimise}}\ \ \tfrac{1}{2} z^\top H z + q^\top z \tag{A85a}
\]
\[
\text{subject to}\ \ A_{\mathrm{eq}} z = b_{\mathrm{eq}} \tag{A85b}
\]
\[
A_{\mathrm{ieq}} z \le b_{\mathrm{ieq}} \tag{A85c}
\]
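For the special case with equality constraints only, the KKT conditions of the QP reduce to one linear system, which hints at why QPs are comparatively easy to solve. A minimal sketch with hypothetical data (and one common sign convention for the multiplier; conventions vary):

```python
import numpy as np

# Hypothetical equality-constrained QP: minimise z1^2 + z2^2 subject to z1 + z2 = 1.
H = 2 * np.eye(2)
q = np.zeros(2)
A_eq = np.array([[1.0, 1.0]])
b_eq = np.array([1.0])

# KKT system:  [H      A_eq^T] [z  ]   [-q  ]
#              [A_eq   0     ] [lam] = [b_eq]
n, m = H.shape[0], A_eq.shape[0]
K = np.block([[H, A_eq.T], [A_eq, np.zeros((m, m))]])
sol = np.linalg.solve(K, np.concatenate([-q, b_eq]))
z, lam = sol[:n], sol[n:]
print(z)  # [0.5 0.5]
```

With inequality constraints present, an active-set or interior point strategy (cf. section A2.3) decides which constraints enter this system.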
Quadratically constrained quadratic programs (QCQPs) extend QPs by allowing quadratic inequality constraints of the form given in Eq. (A86), where 0 ⪯ Qᵢ ∈ R^(nz×nz) is symmetric and qᵢ ∈ R^(nz) for all i ∈ {1, . . . , nieq}.
The last family of CPs that we indicate here is the second-order cone program
(SOCP), which is closely related to QCQPs [21]. In fact, the only difference is
the form of the inequality constraint, which we illustrate in Eq. (A87) for an
SOCP, where z ∈ R^(nz) and Aᵢ ∈ R^(ni×nz). We call such a constraint a second-order cone constraint [21].
\[
\lVert A_i z + b_i \rVert_2 \le c_i^\top z + d_i \quad \forall\, i \in \{1, \ldots, n_{\mathrm{ieq}}\} \tag{A87}
\]
SOCPs are more general than QCQPs, and will become relevant in the framework
of the CS-MHE in chapter B1, where we deal with the estimation of inputs
represented by complex shape functions (cf. section B1.4). A list of the
properties of CPs is given in [21]. For our purposes we just remind that every
local solution of a CP is also a global solution (Thm. A2.8), and CPs are
generally considered significantly more tractable than (non-convex) NLPs [54].
In this section we briefly outline two popular numerical methods for solving
NLP problems of the form of problem (A74). Further approaches and details
can be found in references such as [21, 137] and references therein. Interior
point (IP) methods (section A2.3.1) and sequential quadratic programming
(SQP) (section A2.3.2) are both employed in the frameworks of MHE and
CS. Both methods aim at finding a local minimum by computing a KKT
point v ∗ = (z ∗ , λ∗ , µ∗ ) of a given NLP such as problem (A74). If the KKT
conditions (A82) were smooth, we could compute a KKT point by applying
Newton’s method. Unfortunately, Eqs. (A82c–A82e) are nonsmooth and need
special care [204].
The core idea of IP methods is to set the barrier parameter to a high value and progressively reduce it as the algorithm converges. IP methods are among the most popular classes of NLP
algorithms, since they are proven to be highly competitive especially for convex
problems. Moreover, the recent evolution of compressive sensing is said to have
occurred thanks to some breakthroughs concerning IP methods [36].
One main disadvantage of IP methods is the need to always go back to the so-called central path [137], even if an initial guess very close to the solution is available [204]: the solver needs to start with a large barrier parameter even if the initial guess is very close to the solution, and then performs some iterates which drift away from the solution before going back to it. This fact can be limiting if
one is interested in solving a series of parametric problems for small variations
of the parameter. In such framework SQP (cf. section A2.3.2) is typically much
faster, as it can be warm-started [204].
In IP methods, a vector of slack variables s ∈ Rnieq is usually introduced and
the inequality constraints (A74c) are reformulated as shown in Eq. (A88).
Primal interior point methods implement Eq. (A88) and add a barrier function to the cost. Problem (A89) shows a logarithmic barrier, i.e., −τ Σᵢ log(sᵢ). For τ → 0 the barrier function becomes the indicator function i(s), as shown in Eq. (A90).
\[
\underset{z,s}{\text{minimise}}\ \ f(z) - \tau \sum_{i=1}^{n_{\mathrm{ieq}}} \log(s_i) \tag{A89a}
\]
\[
\text{subject to}\ \ g(z) = 0 \tag{A89b}
\]
\[
h(z) - s = 0 \tag{A89c}
\]
\[
\lim_{\tau \to 0} -\tau \log(s) = i(s) = \begin{cases} 0 & \text{for } s \ge 0 \\ \infty & \text{otherwise} \end{cases} \tag{A90}
\]
The KKT conditions of problem (A89) are given in Eq. (A91), after having defined S = diag(s) and e = (1, . . . , 1)^⊤ ∈ R^(nieq). Conditions sᵢ∗ ≥ 0 and µᵢ∗ ≥ 0 must be enforced by selecting an appropriate step size [137].
\[
\nabla f(z^*) - \nabla g(z^*)\lambda^* - \nabla h(z^*)\mu^* = 0 \tag{A91a}
\]
\[
-\tau (S^*)^{-1} e + \mu^* = 0 \tag{A91b}
\]
\[
g(z^*) = 0 \tag{A91c}
\]
\[
h(z^*) - s^* = 0 \tag{A91d}
\]
Starting from a rather large value for τ , the IP approach consists of iteratively
solving problem (A89) and decreasing the value of the barrier parameter τ .
Once τ reaches a prescribed tolerance, the problem is solved one last time and
the solution is given as output.
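The loop described above can be illustrated on a one-dimensional toy problem of our own, chosen so that each barrier subproblem has a closed-form minimiser (in general a Newton-type solver would be used instead):

```python
import numpy as np

# Toy illustration: minimise z^2 subject to 1 - z <= 0 (i.e. z >= 1); the
# constrained minimiser is z* = 1.  The barrier subproblem
#     minimise  z^2 - tau * log(z - 1)
# has stationarity condition 2z - tau/(z - 1) = 0, i.e. 2z^2 - 2z - tau = 0,
# whose interior root (z > 1) is z = (1 + sqrt(1 + 2*tau)) / 2.
tau = 10.0
while tau > 1e-8:
    z = (1.0 + np.sqrt(1.0 + 2.0 * tau)) / 2.0  # central-path point for this tau
    tau *= 0.1                                   # progressively reduce the barrier parameter
print(round(float(z), 6))  # close to 1: the iterates approach the constrained minimiser
```

The sequence of minimisers traced out for decreasing τ is exactly the central path mentioned above.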
Two main disadvantages of primal interior point methods are the need of finding
a feasible initial guess, as well as dealing with Eq. (A91b), which becomes quite
nonlinear near the solution, as s → 0. Primal-dual interior point methods
tackle the latter issue by multiplying Eq. (A91b) by matrix S ∗ and solving the
equivalent system (A92), with s∗ ≥ 0 and µ∗ ≥ 0. We note that if τ = 0 we get
the standard KKT conditions (A82) [204].
\[
\nabla f(z^*) - \nabla g(z^*)\lambda^* - \nabla h(z^*)\mu^* = 0 \tag{A92a}
\]
\[
-\tau e + S^* \mu^* = 0 \tag{A92b}
\]
\[
g(z^*) = 0 \tag{A92c}
\]
\[
h(z^*) - s^* = 0 \tag{A92d}
\]
\[
\underset{\Delta z \in \Omega^{(k)}}{\text{minimise}}\ \ \tfrac{1}{2} \Delta z^\top H^{(k)} \Delta z + \nabla f(z^{(k)})^\top \Delta z \tag{A95a}
\]
\[
H^{(k)} \Delta \tilde{z}^{(k,i)} + \nabla f(z^{(k)}) - \nabla g(z^{(k)})\,\tilde{\lambda}^{(k,i)} - \nabla h_{\mathcal{A}^{(i)}}(z^{(k)})\,\tilde{\mu}_{\mathcal{A}^{(i)}}^{(k,i)} = 0 \tag{A96a}
\]
Problem (A98) is convex and solvable [21]. Moreover, we assume without loss of generality that the nz columns of A are independent, and nm > nz, i.e., the system is overdetermined. Note that if nm = nz the optimal solution is simply z = A⁻¹b, while the case of nm < nz falls outside the scope of this discussion.
Problem (A98) is a regression problem, and can assume multiple interpretations
such as approximation, estimation, projection, design [21]. The most common
approximation problem is the least-squares approximation problem of Eq. (A100), which squares a cost function consisting of an ℓ2-norm. The objective is the sum of squares of the residuals, which are defined in Def. A2.22. The resulting problem is Eq. (A100a), which we can solve analytically via the Moore-Penrose pseudoinverse, i.e., z = (A^⊤A)⁻¹A^⊤b, obtained starting from the formulation given in Eq. (A100b). We note that this is exactly the philosophy upon which LS estimators are based (cf. section A1.3).
Definition A2.22 (Residual). The vector
\[
r = Az - b \tag{A99}
\]
is called the residual.
\[
\underset{z}{\text{minimise}}\ \lVert Az - b \rVert_2^2 \;=\; \underset{z}{\text{minimise}}\ \sum_{i=1}^{n_m} r_i^2 \tag{A100a}
\]
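A small numerical sketch of the normal-equations solution, with hypothetical random data (in practice a QR/SVD-based routine such as NumPy's `lstsq` is numerically preferable to forming A^⊤A explicitly):

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((100, 50))   # overdetermined: nm = 100 > nz = 50
b = rng.standard_normal(100)

z_normal = np.linalg.solve(A.T @ A, A.T @ b)    # z = (A^T A)^(-1) A^T b, Moore-Penrose route
z_lstsq = np.linalg.lstsq(A, b, rcond=None)[0]  # QR/SVD-based least squares
print(np.allclose(z_normal, z_lstsq))  # True
```

Forming A^⊤A squares the condition number of the system, which is why the factorisation route is preferred for ill-conditioned problems.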
When the `1 -norm is used, the resulting norm approximation problem assumes
the form of Eq. (A101), and it represents the sum of the absolute residuals [21].
\[
\underset{z}{\text{minimise}}\ \lVert Az - b \rVert_1 \;=\; \underset{z}{\text{minimise}}\ \sum_{i=1}^{n_m} \lvert r_i \rvert \tag{A101}
\]
Problem (A101) can be recast as a linear program by introducing a slack vector s ∈ R^(nm):
\[
\underset{z,s}{\text{minimise}}\ \ \mathbf{1}^\top s \tag{A102a}
\]
\[
\text{subject to}\ -s \le Az - b \le s \tag{A102b}
\]
More generally, the ℓp-norm approximation problem is
\[
\underset{z}{\text{minimise}}\ \left( \sum_{i=1}^{n_m} \lvert r_i \rvert^p \right)^{1/p} \tag{A103}
\]
which has the same solution as
\[
\underset{z}{\text{minimise}}\ \sum_{i=1}^{n_m} \lvert r_i \rvert^p \tag{A104}
\]
All these problems are special cases of the penalty function approximation problem:
\[
\underset{z}{\text{minimise}}\ \sum_{i=1}^{n_m} \phi(r_i) \tag{A105a}
\]
\[
\text{subject to}\ \ r = Az - b \tag{A105b}
\]
It is crucial to note that the shape of the penalty function influences the solution
of problem (A105), i.e., φ(u) is a measure of dislike of a residual of value u.
We can stress this aspect by showing the comparison between the `1 -norm and
`2 -norm penalty function approximation problems [21]. The penalty functions
associated with the `1 -norm and `2 -norm are φ1 (u) = |u| and φ2 (u) = u2 ,
respectively, where the subscript refers to the norm type. For |u| = 1, the two penalty functions assign the same penalty. For small u (|u| ≤ 1) we have φ1(u) ≥ φ2(u), i.e., the ℓ1-norm approximation puts a larger emphasis on small residuals compared to the ℓ2-norm approximation. On the other hand, large u (|u| ≥ 1) results in φ2(u) ≥ φ1(u), i.e., the ℓ1-norm approximation puts less weight on large residuals, compared to the ℓ2-norm approximation. This difference in relative weightings for small and
large residuals is reflected in the solutions of the associated approximation
problems. The amplitude distribution of the optimal residual for the `1 -norm
approximation problem will tend to have more zero and very small residuals,
compared to the `2 -norm approximation solution. In contrast, the `2 -norm
Figure A2.1: Histogram of residual amplitudes for `1 -norm and `2 -norm penalty
functions, with the (scaled) penalty functions also shown for reference. For this
example nm = 100 and nz = 50.
solution will tend to have relatively fewer large residuals [21]. This fact becomes
clear in Fig. A2.1, for an example involving A ∈ R100×50 and b ∈ R100 chosen
from a normal distribution.
From Fig. A2.1 we notice that the `1 -norm approximation generates many very
small (or even exactly zero) optimal residuals. This means that in `1 -norm
approximation we typically find that many of the equations are satisfied exactly
[21]. We refer to this phenomenon by defining the concept of sparsity, from
which we infer that the `1 -norm promotes sparsity.
Definition A2.23 (Sparsity). The sparsity S of a vector is defined as the
number of its nonzero elements.
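The experiment behind Fig. A2.1 can be reproduced approximately with the LP reformulation (A102). This is our own sketch with freshly drawn random data, so the counts are only indicative of the qualitative behaviour, not of the exact figure:

```python
import numpy as np
from scipy.optimize import linprog

rng = np.random.default_rng(0)
nm, nz = 100, 50
A = rng.standard_normal((nm, nz))
b = rng.standard_normal(nm)

# l2 residual (least squares).
r2 = A @ np.linalg.lstsq(A, b, rcond=None)[0] - b

# l1 residual via the LP of Eq. (A102): variables x = [z; s], minimise 1^T s.
c = np.concatenate([np.zeros(nz), np.ones(nm)])
A_ub = np.block([[A, -np.eye(nm)], [-A, -np.eye(nm)]])
res = linprog(c, A_ub=A_ub, b_ub=np.concatenate([b, -b]),
              bounds=[(None, None)] * nz + [(0, None)] * nm)
r1 = A @ res.x[:nz] - b

count = lambda r: int(np.sum(np.abs(r) < 1e-6))
print(count(r1), count(r2))  # many (near-)zero l1 residuals, very few for l2
```

The ℓ1 solution typically satisfies a large subset of the equations exactly (an LP vertex solution zeroes many residuals), while the ℓ2 residual spreads the error over all entries.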
The last relevant problem for the purposes of this thesis is problem (A107),
which differs from (A106) since the Euclidean norm is squared. We will show
in chapter B1 that in the simplest case it leads to a QP (cf. section A2.2).
\[
\lVert z \rVert_1 = \sum_{i=1}^{n_z} \sqrt{\Re(z_i)^2 + \Im(z_i)^2} \tag{A108}
\]
Problem (A107) can be reformulated as the convex problem (A109), where the
new variables are defined in Eq. (A110), and s ∈ Rnz is a slack variable.
\[
\tilde{z} = \begin{bmatrix} \Re(z) \\ \Im(z) \end{bmatrix} \in \mathbb{R}^{2n_z} \tag{A110a}
\]
\[
\tilde{A} = \begin{bmatrix} \Re(A) & -\Im(A) \\ \Im(A) & \Re(A) \end{bmatrix} \in \mathbb{R}^{2n_m \times 2n_z} \tag{A110b}
\]
\[
\tilde{b} = \begin{bmatrix} \Re(b) \\ \Im(b) \end{bmatrix} \in \mathbb{R}^{2n_m} \tag{A110c}
\]
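The stacking of Eq. (A110) can be checked numerically: the real-valued product Ãz̃ must reproduce the real and imaginary parts of the complex product Az (hypothetical random data):

```python
import numpy as np

rng = np.random.default_rng(0)
nm, nz = 4, 3
A = rng.standard_normal((nm, nz)) + 1j * rng.standard_normal((nm, nz))
z = rng.standard_normal(nz) + 1j * rng.standard_normal(nz)

# Real-valued stacking of Eq. (A110).
z_t = np.concatenate([z.real, z.imag])
A_t = np.block([[A.real, -A.imag], [A.imag, A.real]])

lhs = A_t @ z_t
rhs = np.concatenate([(A @ z).real, (A @ z).imag])
print(np.allclose(lhs, rhs))  # True: the real formulation reproduces the complex product
```

This is why the complex problem can be handed to a real-valued convex solver without loss of information.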
\[
\underset{z}{\text{minimise}}\ \ \tfrac{1}{2}\lVert f(z) \rVert_2^2 \tag{A111a}
\]
\[
\text{subject to}\ \ g(z) = 0 \tag{A111b}
\]
It is worth underlining that g(z) includes all constraints, i.e., all the active equality and inequality constraints must be taken into account (cf. Defs. A2.9–A2.10). Moreover, the equality constraints are assumed to be perfectly true. f(z) and g(z) are generic nonlinear functions, which we can linearise through a truncated Taylor series at the operating point z̄ (cf. section A1.1.2), resulting in problem (A112). Jf(z) and Jg(z) are the Jacobians with respect to z of f(z) and g(z), respectively. Furthermore, Eq. (A113) shows the expansion of the cost function, which is a QP with a quadratic term governed by the Hessian H(z) = Jf(z)^⊤Jf(z) and a linear term f(z̄)^⊤Jf(z)∆z.
\[
\underset{z}{\text{minimise}}\ \ \tfrac{1}{2}\lVert f(\bar{z}) + J_f(z)\,\Delta z \rVert_2^2 \tag{A112a}
\]
\[
\text{subject to}\ \ g(\bar{z}) + J_g(z)\,\Delta z = 0 \tag{A112b}
\]
\[
\tfrac{1}{2} f(\bar{z})^\top f(\bar{z}) + f(\bar{z})^\top J_f(z)\,\Delta z + \tfrac{1}{2}\Delta z^\top J_f(z)^\top J_f(z)\,\Delta z \tag{A113}
\]
Finally, Eq. (A114a) is the formula for the covariance matrix C ∈ Rnz ×nz given
in [19]. C is a symmetric positive semidefinite square matrix. Its diagonal
entries are the variances (or weighting numbers) of the optimisation variables,
while any nonzero off-diagonal element denotes cross-correlation between the
elements of z. By recompiling Eq. (A114a) into Eq. (A114b), we notice that the
Jacobian of the cost function Jf is no longer present, limiting the computational
effort and avoiding any related numerical errors.
\[
C = \begin{bmatrix} I & 0 \end{bmatrix}
\begin{bmatrix} H & J_g^\top \\ J_g & 0 \end{bmatrix}^{-1}
\begin{bmatrix} J_f^\top & 0 \\ 0 & I \end{bmatrix}
\begin{bmatrix} I & 0 \\ 0 & 0 \end{bmatrix}
\begin{bmatrix} J_f & 0 \\ 0 & I \end{bmatrix}
\begin{bmatrix} H & J_g^\top \\ J_g & 0 \end{bmatrix}^{-1}
\begin{bmatrix} I \\ 0 \end{bmatrix}
\tag{A114a}
\]
\[
C = \begin{bmatrix} I & 0 \end{bmatrix}
\begin{bmatrix} H & J_g^\top \\ J_g & 0 \end{bmatrix}^{-1}
\begin{bmatrix} H & 0 \\ 0 & 0 \end{bmatrix}
\begin{bmatrix} H & J_g^\top \\ J_g & 0 \end{bmatrix}^{-1}
\begin{bmatrix} I \\ 0 \end{bmatrix}
\tag{A114b}
\]
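A numerical sketch of Eq. (A114b) with hypothetical Jacobians, checking that the resulting C is symmetric and positive semidefinite; this assumes the Gauss-Newton Hessian H = Jf^⊤Jf introduced above:

```python
import numpy as np

rng = np.random.default_rng(0)
nz, nf, ng = 4, 10, 2
Jf = rng.standard_normal((nf, nz))   # hypothetical Jacobian of the cost residuals f(z)
Jg = rng.standard_normal((ng, nz))   # hypothetical Jacobian of the equality constraints g(z)
H = Jf.T @ Jf                        # Gauss-Newton Hessian

# Eq. (A114b): C = [I 0] M^{-1} [[H,0],[0,0]] M^{-1} [I;0], with M the KKT matrix.
M = np.block([[H, Jg.T], [Jg, np.zeros((ng, ng))]])
Minv = np.linalg.inv(M)
mid = np.block([[H, np.zeros((nz, ng))], [np.zeros((ng, nz)), np.zeros((ng, ng))]])
C = (Minv @ mid @ Minv)[:nz, :nz]

print(np.allclose(C, C.T))                      # symmetric
print(bool(np.linalg.eigvalsh(C).min() > -1e-10))  # positive semidefinite (to tolerance)
```

The diagonal of `C` then provides the variances of the optimisation variables discussed above.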
We will show in part B that the covariance matrix provides the CS-MHE not
only with the confidence levels for the estimated states and inputs, but also
with the arrival cost, which is crucial to include the past information in the
sliding estimation window (cf. section A1.5).
A2.7 Conclusions
Chapter A3
Compressive sensing
Acknowledgements
This chapter is mainly based on reference [107], which documents some research
carried out at Virtual Vehicle Research Center in Graz (Austria). Matteo
Kirchner is the first author of [107]. A special thanks goes to Eugène Nijman,
who is second author of [107], first spotted the potential of compressive sensing,
and served as daily discussion partner. Furthermore, Matteo Kirchner and
Eugène Nijman outlined the principles of compressive sensing also in [109, 110].
Compressive sensing^(A10) (CS) is a well-known scheme for data acquisition and compression in the field of audio and image processing. CS aims to directly
acquire the minimum amount of data which is needed to fully represent the
signal. This does not usually happen with digital pictures, where cameras
acquire a huge amount of data which is compressed before being stored.
CS is based on signal sparsity (cf. Def. A2.23), i.e., a signal can be represented
(fully or in an approximate way) by just a few components belonging to a certain
transformed space. This space is referred to as dictionary, and its components are
the so-called basis functions or atoms [10, 25, 73]. Moreover, the sensing scheme
should have a dense representation in the dictionary [25]. This is often achieved
by a random sampling scheme (in space for images, in time for time signals).
Among others, we cite references [8, 10, 17, 25, 28, 30, 31, 73, 101, 146, 150].
Consequently, signals are compressible (i.e., they are well approximated by
sparse representations) when they have a sparse representation in some domain
[10, 25, 73]. The challenge is then to represent a signal in that specific domain
and keep only the relevant nonzero elements (the sparser the solution, the
better the compression). We indicate reference [73] for a simple overview
on compressive sensing, while references [25, 26] provide the mathematical
guarantees to be taken into account when dealing with CS.
Eq. (A115a) shows a sensing process. Vector u ∈ R^(nu) is an unknown signal to be measured^(A11), y ∈ R^(ny) is a set of measurements and Φ ∈ R^(ny×nu) is
the sensing matrix (or measurement matrix), and implements the operations
related to signal acquisition. In such way, the measurements can be seen as
a linear combination of the signal. Furthermore, Eq. (A115b) shows how u is
projected onto dictionary Ψ ∈ Rnu ×nα , such that α ∈ Rnα is a sparse vector.
Finally, Θ ∈ Rny ×nα in Eq. (A115c) brings together the sensing matrix and the
dictionary in the so-called global sensing basis.
y = Φu (A115a)
u = Ψα (A115b)
y = ΦΨα = Θα (A115c)
^(A10) In literature compressive sensing is also referred to as compressive sampling or compressed sensing.
Compressive sensing can solve Eq. (A115c) provided that y is sufficiently long
and α is sufficiently sparse [25]. Among all possible solutions of Eq. (A115c), CS
is interested in finding the sparsest solution, since the number of measurements y
needed to capture a sparse signal α is proportional to its sparsity. Consequently,
we need to choose a solution method that promotes a sparse solution when
inverting Eq. (A115c) in order to determine α. We recall that the Moore-Penrose pseudoinverse (cf. sections A1.3 and A2.4) does not provide a sparse solution to an arbitrary underdetermined system such as Eq. (A115c) [10].
Literature offers several algorithms for obtaining a sparse reconstruction [191,
205]. Throughout this dissertation we focus on convex optimisation via `1 -norm
minimisation (cf. sections A2.4), but this is not the only way to generate sparse
solutions. Within the CS community, a possible alternative approach involves
greedy algorithms [190]. These are approaches that aim at solving NP-hard
problems such as problem (A116) in a heuristic fashion by choosing certain
paths within a combinatorial problem. This often results in short computation
times at the high risk of ending up in a local optimum. Reference [25] states
that when α is sufficiently sparse, the recovery via `1 -norm minimisation is
provably exact. Consequently, greedy algorithms may be preferred over convex
optimisation if a certain sparsity level cannot be guaranteed. The price to
pay is that greedy algorithms very rarely converge to a global optimum. Problem (A116) states the ℓ0-norm minimisation:
\[
\underset{\alpha}{\text{minimise}}\ \lVert \alpha \rVert_0 \tag{A116a}
\]
\[
\text{subject to}\ \ \Theta\alpha = y \tag{A116b}
\]
There are no efficient algorithms to solve such NP-hard problem, due to the
non-convexity of the `0 -norm optimisation [23, 24]. We can overcome this by
“relaxing” the `0 -norm up to an `1 -norm problem [42, 44], for the solution of
which some methods are available. Examples are the method of frames (MOF),
matching pursuit (MP), orthogonal matching pursuit (OMP), best orthogonal
basis (BOB), basis pursuit (BP) [36]. Among these methods, basis pursuit has
strongly been developed, and efficient algorithms are now available, especially
thanks to IP methods (cf. section A2.3.1) [36]. BP is capable of finding
the sparsest solution within a dictionary composed of non-orthogonal basis functions [35, 36], while the other methods need the atoms to be orthogonal. This becomes crucial when implementing overcomplete dictionaries with ad hoc
basis functions. Moreover, BP is based on global optimisation, offers better
sparsity and stable superresolution, and can be used with noisy data [36]. For
those reasons, we decided to employ convex optimisation rather than greedy
algorithms.
Under some conditions, which we will discuss in section A3.3, both the `0 -norm
and `1 -norm problems are proven to give the same and unique result. The
relaxation from ℓ0-norm to ℓ1-norm of Eq. (A116) leads to problem (A117), which we solve through BP [36]:
\[
\underset{\alpha}{\text{minimise}}\ \lVert \alpha \rVert_1 \tag{A117a}
\]
\[
\text{subject to}\ \ \Theta\alpha = y \tag{A117b}
\]
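Problem (A117) can be sketched as an LP via the standard positive/negative split α = p − n with p, n ≥ 0. This is our own illustration with a random Gaussian Θ and a generic LP solver; the thesis itself relies on dedicated packages such as SPGL1 and CVX:

```python
import numpy as np
from scipy.optimize import linprog

rng = np.random.default_rng(0)
n_alpha, n_y, S = 60, 30, 3
Theta = rng.standard_normal((n_y, n_alpha)) / np.sqrt(n_y)  # Gaussian sensing basis

alpha_true = np.zeros(n_alpha)                              # S-sparse ground truth
alpha_true[rng.choice(n_alpha, S, replace=False)] = rng.standard_normal(S)
y = Theta @ alpha_true

# Basis pursuit as an LP: alpha = p - n, p, n >= 0, minimise 1^T (p + n).
c = np.ones(2 * n_alpha)
A_eq = np.hstack([Theta, -Theta])
res = linprog(c, A_eq=A_eq, b_eq=y, bounds=(0, None))
alpha_rec = res.x[:n_alpha] - res.x[n_alpha:]

print(bool(np.linalg.norm(Theta @ alpha_rec - y) < 1e-6))   # feasible solution
print(float(np.linalg.norm(alpha_rec - alpha_true)))        # recovery error (typically small)
```

By construction the recovered ℓ1 norm can never exceed that of the true sparse vector, since the true vector is itself feasible for the LP.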
Starting from the first papers regarding the modern theory of CS [23, 27, 42], literature offers a considerable number of publications on this topic. Some extend the already mentioned references, while others apply the theory to several scientific areas, among which image compression may be the most active field of application. A list of publications regarding the theory of compressive sensing is given in [187], highlighting the main outcomes of each paper and indicating whether a result has been superseded by a later publication. In particular, references [25, 26] include mathematical conditions that replace many previous ones.
In order to get the correct result of Eq. (A116) through the solution of problem (A117), with few measurement points and few nonzero basis functions, the matrices of the underdetermined system have to satisfy a condition known as the restricted isometry property (RIP), which was proposed in [24]. The RIP characterises matrices which are nearly orthonormal when they operate on sparse vectors.
Let us consider the linear system of Eq. (A115c) and the `1 -norm problem (A117).
The RIP is a matrix condition which is set by means of restricted isometry
constants, defined as the smallest number δS such that Eq. (A119a) (or
alternatively Eq. (A119b) provided that kαk22 6= 0) holds for all S-sparse vectors
(S ≤ K, so that δS is defined for every S = 1, 2, . . . , K) [26]. A vector is said to
be S-sparse if it has S nonzero entries (cf. Def. A2.23).
\[
(1 - \delta_S)\,\lVert \alpha \rVert_2^2 \;\le\; \lVert \Theta\alpha \rVert_2^2 \;\le\; (1 + \delta_S)\,\lVert \alpha \rVert_2^2 \tag{A119a}
\]
\[
1 - \delta_S \;\le\; \frac{\lVert \Theta\alpha \rVert_2^2}{\lVert \alpha \rVert_2^2} \;\le\; 1 + \delta_S \tag{A119b}
\]
A matrix Θ satisfies a certain RIP if, for any arbitrary vector α having S ≤ K
nonzero entries, the central term of Eq. (A119b) is confined within a certain
region. In other words, there is a certain sparsity K below which the amplification
introduced by the matrix transformation remains bounded. Concerning the
bounds, reference [26] states the following:
• if δ2S < 1, then the ℓ0-norm problem (A116) has a unique S-sparse solution, i.e., if we can prove that for a certain sparsity S the amplification introduced by matrix Θ on any vector α with sparsity up to 2S deviates from unity by less than 1 (when normalised by ‖α‖₂²), then there is a unique solution with sparsity S;
• if δ2S < √2 − 1, then the solution to the ℓ1-norm problem (A117) is that of the ℓ0-norm problem (A116), and the convex relaxation is exact, i.e., if we can prove that for a certain sparsity S the amplification introduced by matrix Θ on any vector α with sparsity up to 2S deviates from unity by less than √2 − 1 (when normalised by ‖α‖₂²), then the solution of the ℓ1-norm problem corresponds to the sparsest solution that the ℓ0-norm problem would give [107].
Eq. (A120) indicates that a sparse signal (S small) requires a low number of
measurements. It is worth mentioning that if Eq. (A120) holds (i.e., if there
are enough measurements ny for a certain sparsity S), then with overwhelming
probability a Gaussian random matrix obeys the RIP (i.e., a matrix whose
elements are independent and identically distributed random variables from a
Gaussian pdf with zero mean and variance 1/ny [25]). Further matrix typologies
with a similar behaviour are listed in [25]. To summarise, the RIP holds with overwhelming probability if a sufficient number of measurements is taken, in the sense of Eq. (A120), and if the sensing matrix is drawn from a suitable random distribution.
The RIP condition cannot be verified for arbitrary matrices [10, 25, 30].
Nevertheless, we would like problem (A117) to yield the correct result. One
approach in the field of digital image processing proposes to pre-multiply both
sides of Eq. (A115c) by an additional sensing matrix, i.e., matrix Σ in Eq. (A121)
[10, 150]. Such operation aims at turning the quantity ΣΘ into a Gaussian
random matrix, which satisfies the RIP with high probability [11, 25]. A new
measurement vector y ∗ = Σy results then as a linear combination of the actual
measurements.
Σy = ΣΘα (A121)
Another approach consists of replacing a regular sampling with a random sampling (cf. section A3.1 and references
therein). It is important to underline that both a random measurement matrix
and a low matrix coherence are not sufficient conditions to have the RIP satisfied,
i.e., they are just tools to enhance (but not guarantee) the RIP.
In references [107, 109] we presented a situation in which CS fails because of
poor matrices and unknown input sparsity, where we investigated the possibility
to apply CS to nearfield acoustical holography (NAH) [200] in order to decrease
the number of measurements needed in the high frequency range [108]. The
conclusion of our research was that for NAH applications it is not possible to
increase the rate of success of BP just by randomising the microphone positions,
because of the intrinsic structure of the matrices, which could not be modified
just by randomising the sampling scheme.
A3.4 Conclusions
The compressive sensing–moving horizon estimator (CS-MHE) for joint state/input estimation

Chapter B1
Formulation of the CS-MHE
Acknowledgements
This chapter is based on references [103, 104], of which Matteo Kirchner is
first author. A special thanks goes to Jan Croes, who is the second author in
[103, 104] and actively contributed to the development of the CS-MHE with
great ideas, bringing in his expertise in the fields of state estimation and input
modelling, and being always an excellent discussion partner. Thanks also to
Francesco Cosco and Wim Desmet, who are both co-authors in [103, 104] and
provided precious pieces of advice during both the research and the paper
revision. Thanks to Goele Pipeleers for the suggestions and hints concerning
the practical implementation of the resulting SOCP. Finally, thanks to Eugène
Nijman who encouraged me to study the matrix implementation of the discrete
Fourier transform.
Problem (B1) shows the CS-MHE [104]. We explained most of the notation
already in Eq. (A66), that refers to the classical MHE (cf. section A1.5.1). The
new parts involve the last two terms of the cost function (B1a), their related
bounds in Eq. (B1g), and the new constraints denoted as Eqs. (B1e–B1f).
\[
\begin{aligned}
\underset{w_a,\,w_k,\,v_k,\,\nu_{\alpha^*},\,\alpha_k}{\text{minimise}}\quad
& w_a^\top P_a^{-1} w_a
+ \sum_{k=T-N+1}^{T-1} w_k^\top Q_k^{-1} w_k
+ \sum_{k=T-N+1}^{T} v_k^\top R_k^{-1} v_k \\
& + \nu_{\alpha^*}^\top P_{\alpha^*}^{-1} \nu_{\alpha^*}
+ \lambda \sum_{k=T-N+1}^{T-1} \lVert \alpha_k \rVert_1
\end{aligned} \tag{B1a}
\]
subject to
\[
x_{k+1} = f(x_k, u_k) + w_k \tag{B1b}
\]
\[
y_k = g(x_k, u_k) + v_k \tag{B1c}
\]
\[
x_{T-N+1} = \bar{x}_{T-N+1} + w_a \tag{B1d}
\]
\[
u_k = \psi_k \alpha_k \tag{B1e}
\]
\[
\alpha^* = \bar{\alpha}^* + \nu_{\alpha^*} \tag{B1f}
\]
\[
x_k \in \left[x_k^{\mathrm{LB}}, x_k^{\mathrm{UB}}\right],\;
w_k \in \left[w_k^{\mathrm{LB}}, w_k^{\mathrm{UB}}\right],\;
v_k \in \left[v_k^{\mathrm{LB}}, v_k^{\mathrm{UB}}\right],\;
\nu_{\alpha^*} \in \left[\nu_{\alpha^*}^{\mathrm{LB}}, \nu_{\alpha^*}^{\mathrm{UB}}\right],\;
\alpha_k \in \left[\alpha_k^{\mathrm{LB}}, \alpha_k^{\mathrm{UB}}\right] \tag{B1g}
\]
The CS term in Eq. (B1a) consists of the `1 -norm of the sparse representation αk
of the input. It is expressed by Eq. (B1e), where ψk is the part of a dictionary Ψ
that refers to time step k, and Ψ was defined in Eq. (A115b). We formulated the
optimisation problem under the assumption that the input is fully unknown. If
this is not the case, the equations can be extended to include any available input
information, without loss of generality. The CS term is the only linear term of
the cost function, while all other components are quadratic. A constant weight
λ balances this term with the rest of the cost function, and plays a crucial role
in the optimisation (cf. section A2.4). In fact, λ scales the contribution of the
sparsity exploitation with regard to the noise terms of model and measurements,
which are represented by the covariance matrices Qk and Rk , respectively.
Eqs. (B1d) and (B1f) and their related terms in the cost function contribute to
the CS-MHE by including information prior to the current window. Specifically,
Eq. (B1d) refers to the arrival cost (cf. section A1.5), while Eq. (B1f) allows
to exploit any available knowledge about an input, and we will discuss it in
section B1.2.1. Despite its similarity to a typical random walk equation (cf.
section A1.7.2), it is important to notice that Eq. (B1f) does not refer to the
input estimation. In fact, it propagates the participation factors of an already
detected input to the next iteration, while the estimation is performed by the CS
part. Eq. (B1g) includes two extra bounds on the newly introduced optimisation
variables (ν_{α*} and α_k). If the variables are real, i.e., if the dictionary Ψ is
real, problem (B1) is a QP (cf. section A2.2). Otherwise, we can recast it
as an SOCP (cf. section A2.5). We will discuss the latter case in detail in
section B1.4.
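If the variables are real, problem (B1) is essentially an ℓ1-regularised least-squares (lasso-type) problem. As a minimal illustration of how the λ-weighted ℓ1 term drives small coefficients exactly to zero — a sketch only, not the solver used in this work — consider iterative soft-thresholding (ISTA) applied to min_x ‖Ax − b‖² + λ‖x‖₁ with a synthetic sparse vector:

```python
import numpy as np

def ista(A, b, lam, n_iter=1000):
    """Iterative soft-thresholding for min_x ||A x - b||^2 + lam * ||x||_1."""
    step = 1.0 / (2.0 * np.linalg.norm(A, 2) ** 2)  # 1 / Lipschitz constant of the gradient
    thr = lam * step                                # soft-threshold level of the proximal step
    x = np.zeros(A.shape[1])
    for _ in range(n_iter):
        g = x - step * 2.0 * A.T @ (A @ x - b)      # gradient step on the quadratic part
        x = np.sign(g) * np.maximum(np.abs(g) - thr, 0.0)  # soft threshold (l1 proximal map)
    return x

# Synthetic sparse coefficient vector observed through a random matrix
rng = np.random.default_rng(0)
A = rng.standard_normal((30, 10))
x_true = np.zeros(10)
x_true[2], x_true[7] = 1.5, -2.0
b = A @ x_true
x_hat = ista(A, b, lam=0.1)
```

A larger λ favours sparsity over data fit, mirroring the balance between the CS term and the noise covariances Q_k and R_k discussed above.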
One of the motivations that drove us to develop the CS-MHE was observability.
Unfortunately, it is not easy to apply any of the methods for assessing
observability that we presented in section A1.8, including an evaluation based
on rank and condition number, to the constrained optimisation problem (B1).
In order to deal with this crucial aspect, we will introduce an equivalent
unconstrained CS-MHE formulation in section B1.3, which we employ in
chapter B2 for assessing rank and condition number.
2–4 These lines refer to the formulation and solution of the optimisation
problem i, as well as the computation of its associated covariance matrix.
Setting the problem includes any available prior information, i.e., the
arrival cost and the knowledge about a possible input. The solution of
the problem returns an estimation of states and inputs. The latter are
represented by the sparse vector αi|i , which is the collection of all αk
within the window.B1
We omitted index i from problem (B1). Similarly to the notation of the
arrival cost in Eqs. (A66d) and (B1d), we marked the prior data with a
bar, such that ᾱ* corresponds to α*_{i|i−1}.
5 Variable α*_{i|i} collects the nonzero elements of α_{i|i}, and their number is
denoted as n_{α*,i|i}. We consider an element of α_{i|i} as nonzero if its
absolute value exceeds a predefined threshold level εα . The purpose of εα
is to filter out the noise components and limit the size of the optimisation
problem. Its side effect is that some energy coming from the input is
discarded, and consequently the input magnitude is underestimated. We
will show a simple way to avoid this energy loss for the application
examples in chapter C1.
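The εα selection step can be sketched as follows (the function name and the numbers are illustrative, not from the dissertation):

```python
import numpy as np

def detect_active(alpha, eps_alpha):
    """Return indices and values of the entries of alpha whose magnitude
    exceeds the detection threshold eps_alpha (cf. the definition of alpha*)."""
    idx = np.flatnonzero(np.abs(alpha) > eps_alpha)
    return idx, alpha[idx]

alpha = np.array([0.01, -0.02, 1.4, 0.0, 0.03, -2.1])
idx, alpha_star = detect_active(alpha, eps_alpha=0.1)
# idx holds the active positions; len(alpha_star) plays the role of n_{alpha*}
```

Entries below εα are treated as noise and dropped, which is exactly the energy-discarding side effect mentioned above.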
ν_{α*} ∈ R^{n_{α*}} in Eqs. (B1a) and (B1f) is a noise term related to the input,
B1 Notation zi|j refers to the estimation of variable z at time step i given the information at
time step j.
may differ since the window is sliding in time, and any information at
T −N +1 is thus omitted, i.e., time step T −N +1 of the current iteration
does not belong to the window of the next iteration.
7 If at least one element is shared with the next time step, the updating
procedure takes place. Otherwise, nothing is transferred to the next
iteration (line 13).
8–11 The comparison between current and previous windows governs the way
the input weighting numbers are updated. In fact, if an element was present
during the previous window, then the current problem and covariance
matrix include its weighting number P_{α*,i|i}. This is added to a drift term
Q_drift, resulting in P_{α*,i+1|i} (line 10). On the other hand, if an element is
new, only Q_drift is assigned to P_{α*,i+1|i} (line 11).
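A minimal sketch of the update logic of lines 8–11, assuming scalar weighting numbers bookkept per active dictionary index (the dictionary-of-floats representation is an illustrative assumption, not the author's implementation):

```python
def update_weighting(prev_indices, prev_P, new_indices, Q_drift):
    """Propagate the input weighting numbers to the next iteration:
    an element already present keeps its weight plus the drift term (line 10);
    a newly detected element starts from the drift term alone (line 11)."""
    P_next = {}
    for j in new_indices:
        if j in prev_indices:
            P_next[j] = prev_P[j] + Q_drift   # line 10: previous weight + drift
        else:
            P_next[j] = Q_drift               # line 11: drift term only
    return P_next

prev = {3: 0.5, 7: 0.2}                      # weights from the previous window
P_next = update_weighting(set(prev), prev, [3, 9], Q_drift=0.1)
# index 3 is carried over with drift; index 9 is new; index 7 left the window
```

A small Q_drift tightens the constraints of the next problem, while a large one gives the solver more freedom, as discussed below.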
In plain words, whenever the CS-MHE detects an input, the algorithm assumes
that the same input is likely to be detected also in the following iteration,
until the step at which that input reaches the end of the window. In other
words, the sparsity pattern of the input does not change in
space, while it is shifted in time according to the sliding window. Moreover, a
drift term Qdrift is added to the input magnitude, to relax the constraints of the
next optimisation problem. Note that a small Qdrift implies tighter constraints,
while higher values give more freedom to the solver. The choice of Qdrift is
linked to the knowledge of the system, which is evaluated by the covariances
Qk and Rk [104].
The knowledge of an input is shared with the next iteration in terms of
augmented states. Consequently, the number of variables of the optimisation
problem grows proportionally to n_{α*,i+1|i}. The size of the problem may be kept
smaller by raising the input detection threshold εα. There are several
ways to model the drift term to correct the input estimation. The simplest
B2 In this chapter we adopt notation x ∈ R^{n_x} to indicate the size of a (column) vector, in
line with the notation of the classical MHE in problem (A66). However, in case of a complex
dictionary, notation x ∈ C^{n_x} would be more appropriate. This will become clear in chapter B2,
where we will introduce a complex dictionary (cf. section B1.4). For this chapter, matrix and
vector sizes indicated by R may also refer to complex values.
approach employs a constant value, but a function of the time steps within
the window may be more appropriate [104]. We will show an example for the
numerical test case in chapter C1 (cf. section C1.1).
Before concluding this section, it is worth mentioning that Eq. (B1f) does
not involve the last time step of the window, since it does not go beyond
k = T −1. In fact, it is based exclusively on the previous window (ᾱ∗ ). This
implies that the CS-MHE cannot estimate a force input acting at k = T unless
additional information enters the estimation problem. This can happen thanks to
acceleration measurements, since the state-space matrix D (direct feedthrough,
cf. section A1.1) is full when accelerometers are employed, while it is empty
for displacement transducers. The situation is different if a zero-order
random walk model is introduced to represent an input (cf. section B2.1) [105].
The accuracy of the CS-MHE depends on a few tuning parameters, such as the
window length (N ), the covariances associated with the model and measurement
errors (Q and R, respectively), the covariance related to the arrival cost (Pa ), a
drift term to propagate any input information to the next iteration (Qdrift ), a
threshold for the input estimation (εα ) and the balancing weight for the `1 -norm
term (λ) [104]. We will discuss the choice of these parameters in chapter C1,
where we introduce a numerical test case as well as an experimental validation.
Let us consider the CS-MHE described by problem (B1) and focus on the
constraint equations labelled as Eqs. (B1b–B1f). The hypothesis of additive
noise for discrete-time systems (cf. section A1.2, page 23) allows us to write
down explicitly the noise terms as shown in Eqs. (B2) and substitute them in
the cost function. This results in the optimisation problem (B3). Note that we
have first inserted Eq. (B1e) into Eqs. (B2a–B2b). The notation is the same as in
sections A1.5 and B1.2 (cf. Eqs. (A66) and (B1)). Problem (B3) represents the
starting point for introducing the complex dictionaries in section B1.4 and for
discussing rank and condition number in chapter B2.
$$
\begin{aligned}
w_k &= x_{k+1} - f(x_k, u_k) && \text{(B2a)} \\
v_k &= y_k - h(x_k, u_k) && \text{(B2b)} \\
w_a &= x_{T-N+1} - \bar{x}_{T-N+1} && \text{(B2c)} \\
\nu_{\alpha^*} &= \alpha^* - \bar{\alpha}^* && \text{(B2d)}
\end{aligned}
$$

$$
\begin{aligned}
\underset{x_k,\,\alpha_k}{\text{minimise}}\quad
& (x_{T-N+1} - \bar{x}_{T-N+1})^{\top} P_a^{-1} (x_{T-N+1} - \bar{x}_{T-N+1}) \\
&+ \sum_{k=T-N+1}^{T-1} \big(x_{k+1} - f(x_k, \psi_k \alpha_k)\big)^{\top} Q_k^{-1} \big(x_{k+1} - f(x_k, \psi_k \alpha_k)\big) \\
&+ \sum_{k=T-N+1}^{T} \big(y_k - h(x_k, \psi_k \alpha_k)\big)^{\top} R_k^{-1} \big(y_k - h(x_k, \psi_k \alpha_k)\big) \\
&+ (\alpha^* - \bar{\alpha}^*)^{\top} P_{\alpha^*}^{-1} (\alpha^* - \bar{\alpha}^*)
+ \lambda \sum_{k=T-N+1}^{T-1} \lVert \alpha_k \rVert_1
&& \text{(B3a)} \\
\text{subject to}\quad
& x_k \in \left[x_k^{\mathrm{LB}}, x_k^{\mathrm{UB}}\right],\;
  \alpha_k \in \left[\alpha_k^{\mathrm{LB}}, \alpha_k^{\mathrm{UB}}\right]
&& \text{(B3b)}
\end{aligned}
$$
α = Fu (B4a)
u = F −1 α = Ψα (B4b)
ψT −N +1,T −N +1 ··· ψT −N +1,T −1
Ψ = .. .. .. (B5)
. . .
ψT −1,T −N +1 ··· ψT −1,T −1
Eq. (B6) shows the input projection onto dictionary Ψ at time step k.
Consequently, the discrete-time state-space representation of a (linearised)
system with additive noise (cf. Eq. (A32)) becomes as in Eq. (B7). We note
that each time step k involves all atoms of the dictionary [103].
$$
u_k = \sum_{j=T-N+1}^{T-1} \psi_{k,j}\, \alpha_j \tag{B6}
$$

$$
\begin{aligned}
x_{k+1} &= A_k x_k + B_k \sum_{j=T-N+1}^{T-1} \psi_{k,j}\, \alpha_j + w_k && \text{(B7a)} \\
y_k &= C_k x_k + D_k \sum_{j=T-N+1}^{T-1} \psi_{k,j}\, \alpha_j + v_k && \text{(B7b)}
\end{aligned}
$$
By extracting the noise terms from Eq. (B7) and inserting them into the CS-
MHE problem (B3), we obtain the formulas for the CS-MHE for a discrete-time
system that implements dictionary Ψ, which we show in Eq. (B8). All notation
should already be clear. In matrix form, problem (B8) corresponds to Eq. (B9).
The new optimisation variable z and consequently all other matrices are given
in Eq. (B10). In appendix 1 we further discuss the matrix implementation of
the CS-MHE, giving the example of an LTI system and a horizon length of
N = 4 time steps.
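To make Eq. (B4) concrete, a small sketch with a Fourier dictionary (Ψ = F⁻¹; the signal length and harmonic are chosen arbitrarily): a harmonic input that is dense in time has a sparse projection α = Fu.

```python
import numpy as np

# Eq. (B4): a signal that is dense in time can have a sparse projection
# alpha = F u on a Fourier dictionary (Psi = F^{-1}).
n = 64
t = np.arange(n)
u = np.cos(2 * np.pi * 4 * t / n)      # single harmonic: dense in time
alpha = np.fft.fft(u) / n              # projection onto the Fourier basis
active = np.flatnonzero(np.abs(alpha) > 1e-8)
# only the two conjugate bins of the harmonic are nonzero
```

This is the situation the CS term exploits: a few active atoms of Ψ suffice to represent the whole input within the window.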
$$
\begin{aligned}
\underset{x_k,\,\alpha_k}{\text{minimise}}\quad
& (x_{T-N+1} - \bar{x}_{T-N+1})^{\top} P_a^{-1} (x_{T-N+1} - \bar{x}_{T-N+1}) \\
&+ \sum_{k=T-N+1}^{T-1} \Big(x_{k+1} - A_k x_k - B_k \!\!\sum_{j=T-N+1}^{T-1}\!\! \psi_{k,j}\, \alpha_j\Big)^{\top}
   Q_k^{-1} \Big(x_{k+1} - A_k x_k - B_k \!\!\sum_{j=T-N+1}^{T-1}\!\! \psi_{k,j}\, \alpha_j\Big) \\
&+ \sum_{k=T-N+1}^{T} \Big(y_k - C_k x_k - D_k \!\!\sum_{j=T-N+1}^{T-1}\!\! \psi_{k,j}\, \alpha_j\Big)^{\top}
   R_k^{-1} \Big(y_k - C_k x_k - D_k \!\!\sum_{j=T-N+1}^{T-1}\!\! \psi_{k,j}\, \alpha_j\Big) \\
&+ (\alpha - \bar{\alpha})^{\top} P_{\alpha}^{-1} (\alpha - \bar{\alpha})
+ \lambda \sum_{k=T-N+1}^{T-1} \lVert \alpha_k \rVert_1
&& \text{(B8a)}
\end{aligned}
$$
$$
z = \begin{bmatrix} x \\ \alpha \end{bmatrix} \quad \text{(B10a)}
\qquad
H = \begin{bmatrix} H_{xx} & H_{x\alpha} \\ H_{\alpha x} & H_{\alpha\alpha} \end{bmatrix} \quad \text{(B10b)}
\qquad
q = \begin{bmatrix} q_x \\ q_\alpha \end{bmatrix} \quad \text{(B10c)}
$$
Although some solvers may be able to deal with complex variables and with
an `1 -norm term, it is more convenient for computational speed to recast
the problem into the SOCP of Eq. (B11). We can obtain this by applying
the formulas that we introduced in section A2.5, defining a new optimisation
variable z̃ in which the real and imaginary parts of α are split as in Eq. (B12).
Moreover, a slack variable s keeps the relationship between <(α) and =(α)
through the new constraint Eq. (B11b) [101, 120]. Eq. (B12) gives all vectors
and matrices of problem (B11).
$$
\tilde{z} = \begin{bmatrix} x \\ \Re(\alpha) \\ \Im(\alpha) \end{bmatrix} \quad \text{(B12a)}
\qquad
\tilde{H} = \begin{bmatrix}
H_{xx} & \Re(H_{x\alpha}) & -\Im(H_{x\alpha}) \\
\Re(H_{\alpha x}) & \Re(H_{\alpha\alpha}) & -\Im(H_{\alpha\alpha}) \\
\Im(H_{\alpha x}) & \Im(H_{\alpha\alpha}) & \Re(H_{\alpha\alpha})
\end{bmatrix} \quad \text{(B12b)}
\qquad
\tilde{q} = \begin{bmatrix} q_x \\ \Re(q_\alpha) \\ \Im(q_\alpha) \end{bmatrix} \quad \text{(B12c)}
$$
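The 2×2 lower-right pattern of Eq. (B12b) is the standard real embedding of a complex Hermitian quadratic form, ℜ(αᴴHα) = [ℜα; ℑα]ᵀ [[ℜH, −ℑH], [ℑH, ℜH]] [ℜα; ℑα]. A quick numerical check, restricted to the α block (the random Hermitian matrix and sizes are hypothetical):

```python
import numpy as np

def real_embedding(H):
    """Real block embedding of a complex Hermitian matrix H,
    matching the alpha-block pattern of Eq. (B12b)."""
    return np.block([[H.real, -H.imag],
                     [H.imag,  H.real]])

rng = np.random.default_rng(1)
M = rng.standard_normal((3, 3)) + 1j * rng.standard_normal((3, 3))
H = M.conj().T @ M                       # Hermitian positive definite
a = rng.standard_normal(3) + 1j * rng.standard_normal(3)

v = np.concatenate([a.real, a.imag])     # split variable as in Eq. (B12a)
lhs = np.real(a.conj() @ H @ a)          # complex quadratic form
rhs = v @ real_embedding(H) @ v          # real quadratic form on the embedding
# lhs and rhs agree
```

This identity is what allows the complex problem to be posed over the real variable z̃ without changing the cost value.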
The new variable ζ in Eq. (B13) for the computation of the covariance matrix
of the optimisation problem comes with its associated Hessian H̃ζ in Eq. (B14).
Furthermore, we linearise the cone constraint Eq. (B11b) following the Taylor
linearisation around point ᾱ (cf. Eq. (A5)), which we obtain from the solution of
problem (B11). Eq. (B15) shows the linearisation, which results in the Jacobian
J˜ζ in Eq. (B16). H̃ζ and J˜ζ correspond to H and Jg in Eq. (A114), respectively,
and allow us to compute the covariance matrix.
$$
\zeta = \begin{bmatrix} x \\ \Re(\alpha) \\ \Im(\alpha) \\ s \end{bmatrix} \tag{B13}
$$

$$
\tilde{H}_\zeta = \begin{bmatrix}
H_{xx} & \Re(H_{x\alpha}) & -\Im(H_{x\alpha}) & 0 \\
\Re(H_{\alpha x}) & \Re(H_{\alpha\alpha}) & -\Im(H_{\alpha\alpha}) & 0 \\
\Im(H_{\alpha x}) & \Im(H_{\alpha\alpha}) & \Re(H_{\alpha\alpha}) & 0 \\
0 & 0 & 0 & 0
\end{bmatrix} \tag{B14}
$$

$$
\sqrt{\Re(\bar{\alpha})^2 + \Im(\bar{\alpha})^2}
- \frac{\Re(\bar{\alpha})^2 + \Im(\bar{\alpha})^2}{\sqrt{\Re(\bar{\alpha})^2 + \Im(\bar{\alpha})^2}}
+ \frac{\Re(\bar{\alpha})}{\sqrt{\Re(\bar{\alpha})^2 + \Im(\bar{\alpha})^2}} \cdot \Re(\alpha)
+ \frac{\Im(\bar{\alpha})}{\sqrt{\Re(\bar{\alpha})^2 + \Im(\bar{\alpha})^2}} \cdot \Im(\alpha) \le s \tag{B15}
$$

$$
\tilde{J}_\zeta = \begin{bmatrix}
0 & \dfrac{\Re(\bar{\alpha})}{\sqrt{\Re(\bar{\alpha})^2 + \Im(\bar{\alpha})^2}}
  & \dfrac{\Im(\bar{\alpha})}{\sqrt{\Re(\bar{\alpha})^2 + \Im(\bar{\alpha})^2}}
  & -I
\end{bmatrix} \tag{B16}
$$
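The fraction entries of J̃ζ in Eq. (B16) are the gradient of the magnitude function √(ℜ(α)² + ℑ(α)²). A scalar sanity check against a central finite difference (the evaluation point is chosen arbitrarily):

```python
import numpy as np

def cone_jacobian(re_bar, im_bar):
    """Gradient of sqrt(re^2 + im^2) with respect to (re, im),
    evaluated at (re_bar, im_bar): the nonzero entries of the
    Jacobian row in Eq. (B16)."""
    m = np.sqrt(re_bar ** 2 + im_bar ** 2)
    return re_bar / m, im_bar / m

re_bar, im_bar, h = 0.6, -0.8, 1e-6       # |alpha_bar| = 1 here
g_re, g_im = cone_jacobian(re_bar, im_bar)

# central finite difference in the real direction
num_re = (np.hypot(re_bar + h, im_bar) - np.hypot(re_bar - h, im_bar)) / (2 * h)
# g_re matches num_re
```

The same rows, stacked per component of α together with the −I block for the slack s, form the full Jacobian used in the covariance computation.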
The system as such is likely not to be observable (cf. section B2.1) since
all possible inputs are included, and this would lead to problems in the
matrix inversion in Eq. (A114). For the CS-MHE described in chapter B1
we circumvented this problem by considering only the active component of the
dictionary, with the consequence of changing the problem size according to the
number of nonzero elements nα∗ . For the case of Fourier components we propose
a different approach that does not require changing the problem size, allowing
us to keep the same matrix structure throughout the estimation. We implemented
this by introducing a regularisation factor on all the zero components of α. We
will show an example in chapter C2 (cf. section C2.1).
Figure B1.1: Dirac delta dictionary (left) and magnitude of a complex Dirac
delta (right). Legend: Dirac delta (—–•); signal to be modelled (- - -×).
presented two algorithms to exploit the prior information in case of time domain
and Fourier dictionaries. We gave a few hints of the application domains of the
different formulations, and in this section we further discuss this aspect.
As an example, let us consider a force impact. Throughout this dissertation we
already mentioned that the CS-MHE is well suited for the estimation of impacts
due to their sparsity in time and space. Formally, we model such an idea through a
Dirac delta dictionary, which consists of a Dirac delta for every possible location
and time step within the estimation window [36, 63]. We can visualise this by
looking at the graphs on the left hand side in Fig. B1.1. From top to bottom,
there are a Dirac delta dictionary (made of unit impulses), its adaptation for
the representation of an impulse with amplitude A (green cross) applied at a
modelled location, and finally also for a location which is not modelled. The
latter case degrades the sparsity level, and the true location can be determined
by linear interpolation [63].B3 Let us now think of modelling an impulse in
a different way, i.e., by using a single Dirac delta which is being shifted by
a phase relation (e.g., through complex numbers). We depicted such an idea in
the right hand side graphs of Fig. B1.1 (where we show only the amplitude).
Such a representation is quite appealing for observability reasons, but it does
not comply with sparsity. In fact, a single basis function cannot form a sparse
signal. Because of this latter reason, in the context of the CS-MHE we prefer
to have a dictionary composed of Dirac deltas spanning a series of points in
time and/or space (pre-multiplied by a factor that determines the amplitude),
B3 The graphs assume an impulse of magnitude 1, which can be scaled arbitrarily without
loss of generality.
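The single shifted Dirac delta of Fig. B1.1 (right) corresponds to the Fourier shift theorem: multiplying a flat spectrum by a linear phase moves the impulse in time. A small sketch (signal length and shift are arbitrary):

```python
import numpy as np

# Shift a single unit impulse purely through a phase relation in the
# frequency domain, as sketched on the right of Fig. B1.1.
n, shift = 16, 5
delta = np.zeros(n)
delta[0] = 1.0
spectrum = np.fft.fft(delta)                       # unit impulse: flat spectrum
phase = np.exp(-2j * np.pi * np.arange(n) * shift / n)
shifted = np.real(np.fft.ifft(spectrum * phase))   # impulse moved to index `shift`
```

One basis function thus reaches every time step, which is appealing for observability but, as noted above, cannot yield a sparse representation.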
B1.6 Conclusions
Acknowledgements
This chapter reports the research that we presented in reference [105], of which
Matteo Kirchner is the first author. A big thanks goes to Jan Croes, who is
second author in [105] and contributed to the topic with ideas and served as
discussion partner.
Chapter B2

Rank and condition number considerations for the CS-MHE
The discussion about rank and condition number of the CS-MHE consists of
a comparison with an MHE scheme with a random walk model for the input
representation (RW-MHE, cf. section A1.7.2) and an MHE without any input
model (NI-MHE, cf. section A1.7.1), and it is based on the test settings in the
following list [105].
• We choose a Dirac delta dictionary for the CS-MHE (cf. section B1.5),
such that the input u and its projection α coincide, since a Dirac delta
dictionary is an identity matrix, i.e., uk = Iαk ≡ αk. Related to this
aspect, it is important to mention that the matrices contained in this
section (Eqs. (B20–B21)) are not generic, and hold only in case of a Dirac
delta dictionary. For other dictionaries we may obtain more complex
matrices, such as the ones in the example in appendix 1.
• We exclude bounds (B3b) from the set of active constraints, since
mathematically bounds introduce equations with zero covariance.
• We introduce an LTI numerical test case in section B2.2. In fact,
for multistep estimators such as the MHE we can have an indication
concerning observability by assessing rank and condition number of the
discretised system, looking at the problem as an overdetermined weighted
least square fitting (cf. section A1.8). Accordingly, in the matrices that
we report in this section (Eqs. (B20–B22)) we omit the time step dependency
(subscript k), resulting in the formulation of an LTI system.
However, in case of nonlinear systems the time step dependency can be
reinserted in the matrices following the notation in Eq. (B18) [105].
$$
z^{\mathrm{CS}} =
\begin{bmatrix}
x_{T-N+1} \\ x_{T-N+2} \\ \vdots \\ x_{T-1} \\ x_T \\
\alpha_{T-N+1} \\ \alpha_{T-N+2} \\ \vdots \\ \alpha_{T-1} \\ \alpha_T
\end{bmatrix}
\equiv z^{\mathrm{RW}} \equiv z^{\mathrm{NI}} =
\begin{bmatrix}
x_{T-N+1} \\ x_{T-N+2} \\ \vdots \\ x_{T-1} \\ x_T \\
u_{T-N+1} \\ u_{T-N+2} \\ \vdots \\ u_{T-1} \\ u_T
\end{bmatrix}
\tag{B18}
$$
$$
H = A_H^{\top} \Sigma A_H \tag{B19}
$$
$$
A_H^{\mathrm{CS}} =
\begin{bmatrix}
I & & & & & & \\
-A & I & & & -BI & & \\
& \ddots & \ddots & & & \ddots & \\
& & -A & I & & & -BI \\
-C & & & & -DI & & \\
& \ddots & & & & \ddots & \\
& & & -C & & & -DI \\
& & & & I & & \\
& & & & & \ddots & \\
& & & & & & I
\end{bmatrix}
\tag{B20}
$$

$$
A_H^{\mathrm{RW}} =
\begin{bmatrix}
I & & & & & & \\
-A & I & & & -BI & & \\
& \ddots & \ddots & & & \ddots & \\
& & -A & I & & & -BI \\
-C & & & & -DI & & \\
& \ddots & & & & \ddots & \\
& & & -C & & & -DI \\
& & & & -I & I & \\
& & & & & \ddots & \ddots \\
& & & & & -I & I
\end{bmatrix}
\tag{B21}
$$
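The banded pattern of Eq. (B20) can be reproduced for a toy problem. The 2-state LTI system below is hypothetical (it is not the cantilever beam of section B2.2), uses a Dirac dictionary (Ψ = I) and omits the input-prior rows, sketching only the arrival-cost, dynamics and measurement blocks. With a displacement output (D = 0) the column of the last input is identically zero, which illustrates why a force acting at k = T cannot be estimated without feedthrough:

```python
import numpy as np

# Hypothetical 2-state LTI system, one displacement output, N = 3 window
A = np.array([[0.9, 0.1], [-0.1, 0.9]])
B = np.array([[0.0], [1.0]])
C = np.array([[1.0, 0.0]])
D = np.zeros((1, 1))                       # displacement sensor: no feedthrough
N, nx, nm = 3, 2, 1

n_cols = N * (nx + nm)                     # columns: states then inputs, per step
rows = [np.hstack([np.eye(nx), np.zeros((nx, n_cols - nx))])]  # arrival-cost block
for k in range(N - 1):                     # dynamics rows: -A x_k + x_{k+1} - B u_k
    r = np.zeros((nx, n_cols))
    r[:, k * nx:(k + 1) * nx] = -A
    r[:, (k + 1) * nx:(k + 2) * nx] = np.eye(nx)
    r[:, N * nx + k * nm:N * nx + (k + 1) * nm] = -B
    rows.append(r)
for k in range(N):                         # measurement rows: -C x_k - D u_k
    r = np.zeros((1, n_cols))
    r[:, k * nx:(k + 1) * nx] = -C
    r[:, N * nx + k * nm:N * nx + (k + 1) * nm] = -D
    rows.append(r)
A_H = np.vstack(rows)
rank = np.linalg.matrix_rank(A_H)          # rank deficient: last input column is zero
```

Swapping in a nonzero D (accelerometer-like feedthrough) fills that last column, in line with the discussion of Eq. (B1f) in chapter B1.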
Let us now introduce a numerical test case to assess rank and condition number
of the three estimation schemes. We modelled analytically a cantilever beam,
$$
\Sigma = \operatorname{blkdiag}\!\left(
P_a^{-1},\;
Q^{-1}, \ldots, Q^{-1},\;
R^{-1}, \ldots, R^{-1},\;
P_\alpha^{-1}
\right)
\tag{B22}
$$
Figure B2.1: Cantilever beam. Legend: 1st mode (—–); 2nd mode (- - -); 3rd
mode (- · -); transducers (s1 , s2 , s3 ); input locations (+). Figure reproduced
from [105].
Parameter Value
Beam length [m] 0.400
Beam width [m] 0.025
Beam thickness [m] 0.003
Density [kg/m3 ] 2727
Young’s modulus [GPa] 67.8
ζ1 0.010
ζ2 0.012
ζ3 0.040
1. Three estimation approaches
(a) NI-MHE
(b) RW-MHE
(c) CS-MHE
2. Two types of transducers
(a) displacements
(b) accelerations
3. Knowledge about the prior information
(a) the arrival cost is known
(b) the arrival cost is not available
4. Number of input locations nm = 1, . . . , 5
5. Two final test cases involving the NI-MHE without arrival cost. First, we
consider only one time step of a window for the input estimation, within
which we take into account a growing number of input locations [105].
Next, we discuss the case of one input location and an increasing amount
of time steps within the estimation window.
Figure B2.2: Rank of the observability matrix O as a function of the number of states nz (dashed line: rank(O) = nz).
Before presenting the results of the numerical study, it is worth looking
at the observability of the system for a single step estimator (cf. section A1.4)
with an RW model to represent an input. Let us consider the case in which
only the states are being estimated. Then, let us add one by one the possible
input locations as augmented states. We did this such that we always take into
account the beam tip (x = L), we never consider point x = 0, and the rest of the
beam is covered uniformly. Moreover, we assign a random walk model to each
input position [133].
For the system to be observable, matrices A and C of the state-space
representation have to satisfy the condition given in Thm. A1.23. Fig. B2.2
shows the rank of O as function of the number of states nz , that includes the
nx = 6 position and velocity MPFs as well as the nm augmented states for the
input estimation. The graph shows that the system becomes unobservable if
nz > 9, i.e., when the number of input locations is nm > 3. On the other hand,
we will show in chapter C1 that the CS-MHE is able to observe a higher number
of input positions, provided that the input is sparse. In fact, Figs. C1.1 and
C1.13 both exhibit nm = 8 (grey crosses).
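The rank condition of Thm. A1.23 for the RW-augmented system can be checked numerically. The 2-state system below is a hypothetical stand-in for the beam model; duplicating an input column makes the two augmented states indistinguishable, and the rank drops:

```python
import numpy as np

def obsv_rank(A, C):
    """Rank of the observability matrix O = [C; CA; ...; CA^(n-1)]."""
    n = A.shape[0]
    O = np.vstack([C @ np.linalg.matrix_power(A, i) for i in range(n)])
    return np.linalg.matrix_rank(O)

def augmented(A, B_cols, C):
    """Append one random-walk augmented state per input column of B_cols."""
    nm = B_cols.shape[1]
    Az = np.block([[A, B_cols],
                   [np.zeros((nm, A.shape[0])), np.eye(nm)]])
    Cz = np.hstack([C, np.zeros((C.shape[0], nm))])
    return Az, Cz

# Hypothetical discrete-time system with one displacement output
A = np.array([[1.0, 0.1], [-0.2, 0.9]])
C = np.array([[1.0, 0.0]])

B_one = np.array([[0.0], [0.1]])           # one input location: observable
B_two = np.array([[0.0, 0.0], [0.1, 0.1]]) # two identical locations: not observable
r_one = obsv_rank(*augmented(A, B_one, C))
r_two = obsv_rank(*augmented(A, B_two, C))
```

The same mechanism, at larger scale, produces the saturation of rank(O) visible in Fig. B2.2.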
At the same time, the PBH test (cf. Thm. A1.24) suggested that some states may
encounter estimation difficulties when accelerometers are employed, due to the
fact that accelerometers cannot measure a DC component.
Fig. B2.3 shows matrix AH for the 12 resulting possible combinations of the
different estimation approaches, types of transducers and knowledge on the prior
information [105]. The columns correspond to the three different estimation
approaches, NI-MHE, RW-MHE and CS-MHE, respectively. The top block
considers the arrival cost to be known, whereas the bottom block does not
take this information into account. The two rows within each block make
a comparison between displacement and acceleration transducers. For the
numerical study of this section, all dependent variables are eliminated from
matrix H, which means that the system is observable if the rank equals the
number of columns, i.e., z contains only independent variables [105]. Every
set-up can be uniquely distinguished by looking at the distribution of the
nonzero components inside AH (small circles), to be compared with the blocks
in Eqs. (B20–B21).
Each subplot of Fig. B2.3 presents rank, condition number (cond) and size
of AH (number of rows × number of columns). All values derive from the
cantilever beam described in section B2.2. Here follow a few comments about
the structure of matrix AH [105]:
Figure B2.3: Structure of matrix AH for the 12 combinations of estimation approach (NI-MHE, RW-MHE, CS-MHE; columns), transducer type (displacement, acceleration; rows) and arrival-cost availability (top block with, bottom block without arrival cost); each subplot reports the size (rows × columns), rank and condition number of AH.
The number of states and inputs that can be observed with a certain number
of transducers is an important factor to take into account when setting up a
filter. Fig. B2.4 shows ranks and condition numbers for a set of input locations
nm = 1, . . . , 5. The graphs correspond to each case of Fig. B2.3, and include
also the test case with nm = 2 that we described in section B2.3. The solid grey
lines show the condition numbers, plotted on a base 10 logarithmic scale (grey
right vertical axes). The dashed black lines refer to the ranks (black left vertical
axes of the graphs), and different markers characterise each point indicating
one of the following situations:
○ full rank–column, i.e., AH has more rows than columns; its rank is full,
limited by the number of columns. All states are thus observable.
□ full rank–row, i.e., AH has more columns than rows; its rank is full,
limited by the number of rows. More information is needed in order to
observe all states.
× rank deficient–column, i.e., AH has more rows than columns; its rank is
not full, meaning that some states are not observable. The red digit next
to the marker indicates the difference between the rank and the number
of columns.
◇ rank deficient–row, i.e., AH has more columns than rows; its rank is not
full, meaning that more information is needed in order to observe all
states, and some information is redundant (linearly dependent). This can
happen because of critical locations of inputs and transducers.
Figure B2.4: Ranks (dashed black lines, left axes) and condition numbers (solid grey lines, base 10 logarithmic scale, right axes) of AH as a function of the number of input locations nm = 1, . . . , 5, for each case of Fig. B2.3.
The following list discusses the main outcomes of the simulations [105]:
The last comment motivated two further investigations. First, we consider only
one time step for the input estimation (k = T−N+1, i.e., there are no B matrices
on the other time steps k = T −N +2, . . . , T −1), while as usual we estimate the
states within the whole window [105]. We only examine the NI-MHE without
arrival cost, and we start by taking into account one input location (nm = 1,
corresponding to a matrix B of size nx × 1). Next, we simulate an increasing
Figure B2.5: NI-MHE with no arrival cost. Scenario with an increasing number
of input locations within one time step.
Figure B2.6: NI-MHE with no arrival cost. Scenario with one input location on
an increasing number of consecutive time steps.
nm . Fig. B2.5 shows ranks and condition numbers for this test case. We see
that the rank grows proportionally to the number of inputs for nm ≤ 3, and
numerical instability arises for nm > 3. Finally, Fig. B2.6 shows ranks and
condition numbers for a last scenario with one single input location (nm = 1),
first active only at time step k = T −N +1 and then extended to consecutive
time steps up to k = T −1. Similarly to the previous case, we report the results
for the NI-MHE without arrival cost, both for displacement and acceleration
measurements. We notice that the rank grows proportionally to the number
of added time steps. However, the condition number corresponding to 4 time
steps is much higher, revealing that the threshold nm ≤ 3 may still influence
the accuracy of the results.
Up to this point, we showed the results of a set of simulations that allowed us
to investigate the behaviour of rank and condition number of AH for different
scenarios involving input models, types of transducers and arrival cost. A
comparison helped us to understand strengths and limitations of each test case.
In many situations, a number of inputs nm > 3 results in numerical instability,
expressed by rank deficiency and/or a badly conditioned matrix AH . The
threshold nm > 3 is in line with the observability matrix (cf. section B2.2.1)
and with the PBH test for single step estimators.
Let us now focus on the CS-MHE, recalling that it deals with a sparse
representation of an input. Specifically, dictionary Ψ in Eq. (B1e) is now an
identity matrix. For the CS-MHE, the analysis in this chapter applies to the
number of nonzero components of a sparse vector, and the threshold nm ≤ 3
indicates a sparsity level that has to be guaranteed. Consequently, the CS-
MHE can take into account a much higher number of input locations, provided
that the number of nonzero basis functions stays within the aforementioned
requirements. In case of the cantilever beam example, it is possible to choose
nm > 3 provided that the input sparsity is S ≤ 3 (cf. chapter C1). From the
second last case (Fig. B2.5), it is clear that a number of input locations nm > 3
leads to numerical instability, while we could not draw any specific conclusions
with regard to the input distribution in time (Fig. B2.6). As far as compressive
sensing is concerned, we recall that the whole sampling scheme (i.e., N ·nm )
should be proportional to the input sparsity (cf. chapter A3) [25].
If we look back at Eqs. (A71–A72) on page 49 and Fig. B2.2 on page 107, we can
see that the rank of the observability matrix saturates quite fast, i.e., once full
rank is reached it is not worth adding any further information. This information
may come from additional sensors (as we have illustrated in Fig. B2.2) as well as
from additional time steps, going in the direction of the MHE. In other words,
in case of an MHE with a random walk model the window length follows from
a go/no go assessment driven by observability requirements. Reference [38]
includes a study of the influence of the MHE window length for an unknown
input estimation with a random walk model. The outcome is a confirmation
of the fact that the random walk model benefits only mildly from a longer
window, and this trend dies out fast. On the other hand, let us consider
Eq. (A120) of compressive sensing on page 78, which indicates how the number
of required measurements (ny ) scales with the signal length (nα ) and its sparsity
(S). Fig. B2.7 (left) shows a numerical example of Eq. (A120) for a signal up to
nα = 50 and S = 10 (with cS = 3). We note that ny scales strongly with S and
smoothly with nα especially for small values of S. Moving towards an MHE,
we can rewrite Eq. (A120) by considering that the number of measurements is
equal to the number of sensors times the number of time steps, i.e., ny = N ·nr .
This results in Eq. (B23).
$$
n_r \ge \frac{c_S\, S \log(n_\alpha / S)}{N} \tag{B23}
$$
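Eq. (B23) can be evaluated directly; for one input location (nα = N) and cS = 3 as in Fig. B2.7, the sensor bound decreases as the window grows:

```python
import numpy as np

def required_sensors(S, n_alpha, N, c_S=3.0):
    """Lower bound on the number of sensors from Eq. (B23):
    n_r >= c_S * S * log(n_alpha / S) / N."""
    return c_S * S * np.log(n_alpha / S) / N

# One input location: n_alpha = N, so the bound shrinks with the window length
S = 3
bounds = [required_sensors(S, N, N) for N in (10, 20, 50)]
# the sequence of bounds is strictly decreasing
```

This is the quantitative version of the observation below: for a fixed sparsity, a longer CS-MHE window keeps reducing the number of sensors the bound demands.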
Moreover, the number of basis functions is equal to the number of input locations
times the number of time steps, i.e., nα = N ·nm . Fig. B2.7 (right) depicts
Eq. (B23) for nm = 1 (i.e., nα ≡ N ). We can see that for a certain sparsity
the number of required sensors decreases with an increasing number of basis
functions, indicating that given a certain sparsity a longer signal may bring in
Figure B2.7: Number of required measurements ny according to Eq. (A120), as a function of signal length nα and sparsity S (left), and number of required sensors nr according to Eq. (B23) for nm = 1, as a function of window length N and sparsity S (right).
Figure B2.8: Sections from the curves in Fig. B2.7 for a constant sparsity S = 3.
In the right graph we indicate also the dependency on the number of input
locations (nm ).
extra information with fewer sensors. In other words, a long MHE window may
be worthwhile if CS is used for input modelling. This behaviour is more evident if
we look at a section of constant sparsity, and Fig. B2.8 shows the case of S = 3.
The tendency of a decreasing amount of sensors holds also for multiple input
locations (right graph). Eq. (B23) does not express the amount of sensors required
by the CS-MHE, since it reflects exclusively the theory behind compressive
sensing, and not its involvement in the CS-MHE. However, it indicates that the
need for sensors decreases with an increasing window length, and this trend
persists. Although these considerations come from Eq. (A120), which is a rather
empirical formula of CS, they constitute an interesting starting point for future
B2.5 Conclusions
In this chapter we presented a few scenarios where we compared the rank and
condition number of the CS-MHE with an MHE with no input information
(NI-MHE) and an MHE with a random walk model for the input (RW-MHE).
The comparison involved different types of measurements (displacement and
acceleration) and the availability of prior information (arrival cost). This allows
us to list the following conclusions [105]:
• For multistep estimators based on the MHE, the limit on the number
of inputs which is possible to observe for the NI-MHE and RW-MHE
translates into a limit on sparsity for the CS-MHE, and in this chapter we
showed this threshold. For an LTI numerical test case with 6 states and
3 displacement transducers, the NI-MHE can estimate 3 inputs within
a window, the RW-MHE can follow 3 input positions evolving in time,
while the CS-MHE can reconstruct a signal from its sparse projection up
to 3 active basis functions. In case of accelerometers, the DC component
cannot be measured. An accurate arrival cost may improve the rank, but
the system is likely to deviate in time, leading to a poor estimation.
• The random walk model is a powerful representation which allows
the observation of a limited number of inputs under the hypothesis
of slow dynamics and known locations, whereas the CS-MHE admits
more locations and no dynamic constraints provided that the input
representation has a limited sparsity.
Applications
Chapter C1

Estimation of force impacts
Throughout this dissertation we developed the CS-MHE with the aim of reducing
the observability issues which are typical of joint state/input estimators and
allowing for the estimation of inputs characterised by fast dynamics. In
particular, in chapter B1 we presented the formulas of the CS-MHE, while
in chapter B2 we discussed a few numerical aspects such as its rank and
condition number. However, up to here we did not show any application
example of the CS-MHE. This chapter presents a few numerical test cases and
one experimental validation that illustrate the capability of the CS-MHE to
go beyond the observability threshold linked to the amount of possible input
locations and to capture the fast behaviour of force impacts. Furthermore, these
application cases allow us to illustrate how to determine some of the CS-MHE
tuning parameters such as the balancing weight λ and the covariance matrices,
which are crucial to achieve a good estimation accuracy.
We begin by introducing a numerical test case in section C1.1, which
demonstrates the potential of the CS-MHE over other state of the art techniques.
Furthermore, we present an experimental validation in section C1.2, before
concluding the chapter in section C1.3.
Acknowledgements
This chapter refers mostly to [104], of which Matteo Kirchner is first author. A
big thanks goes to Jan Croes and Francesco Cosco, who are co-authors of [104]
and actively contributed to the first experimental validation of the CS-MHE.
Thanks also to Jean-Pierre Merckx and Eddy Smets for their precious support
during the measurement campaign.
122 ESTIMATION OF FORCE IMPACTS
The evaluation of force impacts is in general not an easy task. Force transducers
do exist, spanning from simple resistive solutions up to accurate sensors based
either on strain gauges (for low frequency ranges) or on piezoelectric elements
(for high frequency ranges). However, such measurement systems can be very
expensive, and it is not always possible to integrate them in the design of a
mechanical system due to geometrical aspects or durability issues. In this
context, model-based estimators and virtual sensors offer a very appealing
solution to the problem. In this chapter we employ the CS-MHE formulation
that we described in section B1.2 in order to perform joint state/input estimation
for the detection of force impacts entering the system at an unknown location.
For the reasoning behind the choice of this particular formulation we refer to
section B1.5 and to the literature regarding force modelling through sparse
dictionaries and CS in section A1.7.4. On the one hand, the examples in this
section demonstrate the capability of the CS-MHE to go beyond the observability
issues that are typical of other approaches such as a random walk model. On
the other hand, the estimation of an impulse offers a further challenge related to
its intrinsically fast dynamics, which other state-of-the-art techniques (especially
single step estimators) have strong limitations in dealing with (cf. section A1.7).
Most of the material of this chapter comes from reference [104].
We begin this section by introducing a numerical example to test the CS-MHE
formulation that we described in section B1.2. We consider an LTI mechanical
system, whose generic state-space equations are given in Eq. (A33). More
specifically, we modelled analytically a cantilever beam with a uniform
rectangular cross-section, according to the Euler–Bernoulli beam theory. We
obtained the state-space model following the procedure given in reference [63],
which we adapted to a cantilever beam with displacement transducers. The
system is similar to the one we introduced in section B2.2 to assess the rank and
condition number of the CS-MHE. Fig. C1.1 shows the beam as well as:
Figure C1.1: Numerical test case. Legend: 1st mode (—–); 2nd mode (- - -);
3rd mode (- · -); spatial sampling (+); transducers (s1, s2, s3); input (F1, F2,
F3, F4). Figure reproduced from [104].
Table C1.1: Geometry and material properties of the beam.

Parameter               Value
Beam length [m]         0.405
Beam width [m]          0.025
Beam thickness h [m]    0.003
Density [kg/m3]         7502
Young's modulus [GPa]   65.9
x(s1) [m]               0.245
x(s2) [m]               0.325
x(s3) [m]               0.405
ζ1                      0.030
ζ2                      0.037
ζ3                      0.119
εα [N]                  1.0
Figure C1.2: Reference input for the numerical test case (—–×, the nonzero
components are marked with a darker circle), and window for the first CS-MHE
iteration (light blue). Figure reproduced from [104].
We note that the force (green crosses in Fig. C1.2) is zero except for the 4
impacts, such that the input signal is sparse in time and space. Consequently,
there is no need to project the signal onto any specific dictionary, and u and α
are equivalent (cf. section B2.1). The first estimation window is marked in light
blue, and consists of nm = 8 spatial points (the grey crosses in Fig. C1.1) and
N = 11 time steps, such that the input estimation takes place on 10 time steps
[105]. This value allows for good accuracy [167] and for a fast computation. The
system is at rest and this status does not change until F1 is applied. Table C1.1
summarises geometry and material properties of the beam.
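The modal state-space construction can be sketched as follows. This is a generic illustration rather than the exact procedure of reference [63]: the natural frequencies and the mode-shape matrices below are placeholder assumptions, and only the damping ratios come from Table C1.1.

```python
import numpy as np

def modal_state_space(freqs_hz, zetas, phi_sensors, phi_inputs):
    """Continuous-time modal state-space model
        x_dot = A x + B u,   y = C x
    with states [modal displacements; modal velocities] and displacement
    outputs (mass-normalised mode shapes assumed)."""
    w = 2.0 * np.pi * np.asarray(freqs_hz, dtype=float)   # pulsations [rad/s]
    n = len(w)
    A = np.block([[np.zeros((n, n)), np.eye(n)],
                  [-np.diag(w**2), -np.diag(2.0 * np.asarray(zetas) * w)]])
    B = np.vstack([np.zeros((n, phi_inputs.shape[0])), phi_inputs.T])
    C = np.hstack([phi_sensors, np.zeros((phi_sensors.shape[0], n))])
    return A, B, C

# Damping ratios from Table C1.1; frequencies and mode shapes are invented.
freqs = [15.0, 95.0, 265.0]                                # [Hz], assumed
zetas = [0.030, 0.037, 0.119]                              # from Table C1.1
phi_s = np.random.default_rng(0).standard_normal((3, 3))   # shapes at sensors
phi_u = np.random.default_rng(1).standard_normal((4, 3))   # shapes at inputs

A, B, C = modal_state_space(freqs, zetas, phi_s, phi_u)
# Each mode contributes a complex-conjugate eigenvalue pair whose modulus
# equals its natural pulsation:
print(np.sort(np.abs(np.linalg.eigvals(A)))[::2] / (2 * np.pi))
```

Each 2×2 modal block is decoupled, so the model scales trivially with the number of retained modes.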
The CS-MHE can directly estimate the location of an input only if the input is
applied at a sampling point. If this is not the case, the input energy is spread
among the neighbouring nodes. However, we can still accurately estimate the
exact input location by linear interpolation, provided that the input consists of
a single impulse (see Fig. B1.1 on page 97, bottom left graph) [63]. In this
context, CS outperforms the random walk model as far as robustness with
respect to the input location is concerned. In fact, an input applied at an
unexpected location may jeopardise the estimation, since the random walk
model does not account for such uncertainty [104].
NUMERICAL ESTIMATION OF MULTIPLE FORCE IMPACTS 125
Figure C1.3: Qdrift as a linear function of the time step k. Figure reproduced
from [104].
Fig. C1.3 shows that the drift term (Qdrift) follows a linear function of the time
step k within one window. This choice derives from the fact that the estimation
is expected to be more accurate if both past and future data take part in the
estimation (cf. definition A1.18), and this happens close to k = T−N+1 [173].
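The interpolation idea can be sketched by reading the spread input energy as weights for the neighbouring node positions. This is a plausible reconstruction of the linear interpolation step; the exact scheme in [63] may differ.

```python
import numpy as np

def interpolate_impact_location(x_nodes, alpha):
    """Estimate the location of a single impulse whose energy is spread over
    neighbouring spatial sampling points, via an amplitude-weighted average
    of the node positions (sketch; the scheme of [63] may differ)."""
    alpha = np.abs(np.asarray(alpha, dtype=float))
    return float(np.sum(np.asarray(x_nodes) * alpha) / np.sum(alpha))

# A single impulse between two nodes splits its amplitude between them;
# an equal split points at the midpoint (illustrative numbers):
x_nodes = np.array([0.25, 0.35])     # neighbouring sampling points [m]
alpha = np.array([5.0, 5.0])         # estimated amplitudes at those nodes
print(interpolate_impact_location(x_nodes, alpha))   # -> 0.3
```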
In order to investigate the influence of the modelling error, we simulated a
model mismatch by varying the beam thickness (h). Table C1.3 shows the first
3 eigenfrequencies of the beam for the reference test case (h = 0.003 m) as well
as 3 other scenarios involving a thinner beam, which results in a frequency
mismatch indicated by δ% . The choice of parameters εR , εQ and λ for each case
will become clear in section C1.1.1. We chose a sampling period of 2.5·10−3 s
(400 Hz), which satisfies the Nyquist-Shannon sampling theorem for the highest
eigenfrequency. Note that the CS-MHE exploits compressive sampling for the
observation of a large amount of input positions, and here we are not considering
its ability to acquire and reconstruct an undersampled signal in time [104].
It is worth noticing that, according to the observability matrix, the numerical
example is not observable, i.e., a random walk model applied to each sampling
point renders the system unobservable (cf. section B2.2.1). This is true even if
we know a priori where the 4 inputs are applied (they would require 4 random
walks), since more than 3 random walk models are not admitted. On the other
hand, the observability criterion for the CS-MHE requires that at most 3 nonzero
components are active at the same time within an estimation window. For
the case under examination (force impacts with a Dirac delta dictionary), the
number of active components corresponds to the number of force impulses
within a window (cf. sections A1.8 and B2.1).
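The observability bookkeeping behind this argument can be reproduced on a toy system: augment an LTI model with one random-walk state per input location and check the rank of the observability matrix. The example below is an illustrative 1-DOF oscillator, not the beam model; two walks driving the same DOF are indistinguishable, so the augmented system loses observability.

```python
import numpy as np

def obsv_rank(A, C):
    """Rank of the observability matrix [C; CA; ...; CA^(n-1)]."""
    n = A.shape[0]
    rows = [C @ np.linalg.matrix_power(A, k) for k in range(n)]
    return int(np.linalg.matrix_rank(np.vstack(rows)))

def augment_random_walks(A, B_cols, C):
    """Append one random-walk input state per column of B (u_dot = 0)."""
    n, nw = A.shape[0], B_cols.shape[1]
    A_aug = np.block([[A, B_cols],
                      [np.zeros((nw, n)), np.zeros((nw, nw))]])
    C_aug = np.hstack([C, np.zeros((C.shape[0], nw))])
    return A_aug, C_aug

# Toy 1-DOF oscillator with one displacement sensor (illustrative numbers):
A = np.array([[0.0, 1.0], [-4.0, -0.2]])
C = np.array([[1.0, 0.0]])
b = np.array([[0.0], [1.0]])

A1, C1 = augment_random_walks(A, b, C)                  # one walk
A2, C2 = augment_random_walks(A, np.hstack([b, b]), C)  # two walks, same DOF
print(obsv_rank(A1, C1), A1.shape[0])   # 3 3 -> observable
print(obsv_rank(A2, C2), A2.shape[0])   # 3 4 -> rank deficient
```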
Table C1.3: First 3 eigenfrequencies of the beam and CS-MHE tuning parameters
for the numerical test cases.
In chapter B1 we pointed out the key role of the weight λ and its dependency
on the model and measurement covariance matrices (cf. section B1.2). In this
section we discuss the choice of this crucial CS-MHE parameter. Fig. C1.4
shows the mean square error (MSE) of the input estimation of the reference
case for different values of λ and constant arbitrary covariances Q and R. We
note that the MSE drops within a region of λ, corresponding to an accurate
input estimation. If λ is too small, the optimisation gives more weight to the
minimisation of the model and measurement errors, while the input sparsity
cannot be exploited. On the other hand, too high a value of λ would promote
sparsity within a system that does not minimise any model and measurement
errors, resulting in a higher MSE [104]. In Fig. C1.4 we note a few further
aspects. First, the interval of λ for which the MSE drops is rather wide,
meaning that we can admit some uncertainty in the choice of λ without
compromising the estimation results. Next, we see that the curve is non-smooth
in the neighbourhood of the minimum MSE. This can be due to the fact that
the graph comes from a numerical investigation that involves discrete values
of λ (the solution is suboptimal, as it is governed by the choice of discrete
values, equidistant on a logarithmic scale), and it may also be linked to the
sampling scheme (a higher sampling rate may reduce the non-smoothness [38]).
Furthermore, if we look at the MSE outside the optimal area, we note that
choosing too small a λ results in a smaller error than choosing too high a
value. We know that the model is very good for the reference case, and it then
makes sense to rely on it rather than putting emphasis on the input sparsity.
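The trade-off governed by λ can be reproduced on a toy problem. The sketch below is an illustration, not the CS-MHE itself: it solves the ℓ1-regularised least-squares problem with plain iterative soft-thresholding (ISTA) over a logarithmic sweep of λ, and all matrices, dimensions and noise levels are synthetic assumptions.

```python
import numpy as np

def ista(H, y, lam, n_iter=500):
    """Iterative soft-thresholding for min_a ||H a - y||^2 + lam * ||a||_1."""
    s = np.linalg.norm(H, 2) ** 2            # largest squared singular value
    a = np.zeros(H.shape[1])
    for _ in range(n_iter):
        g = a - H.T @ (H @ a - y) / s        # gradient step, step size 1/(2s)
        a = np.sign(g) * np.maximum(np.abs(g) - lam / (2.0 * s), 0.0)
    return a

# Synthetic 3-sparse input observed through a random operator:
rng = np.random.default_rng(0)
H = rng.standard_normal((40, 80))
a_true = np.zeros(80)
a_true[[5, 33, 60]] = [3.0, -2.0, 1.5]
y = H @ a_true + 0.01 * rng.standard_normal(40)

lams = np.logspace(-4, 2, 13)                # discrete values, log-equidistant
mses = [np.mean((ista(H, y, lam) - a_true) ** 2) for lam in lams]
# The MSE drops inside an intermediate region of lambda, as in Fig. C1.4:
# too small -> sparsity not exploited; too large -> everything thresholded.
```

As in the text, the discrete λ grid makes the resulting MSE curve non-smooth around its minimum.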
Figure C1.4: Choice of λ for the numerical test case δ% = 0%, given arbitrary
constant Q and R. MSE of the input estimation (blue line) and location of its
minimum (green circle). Figure reproduced from [104].
[Figures C1.5–C1.6: optimal λ and MSE of the input estimation as functions of
log10(εR) and log10(εQ), one panel per model-mismatch scenario.]
In this section we present the results of the numerical investigation. Fig. C1.7
shows the estimation window i = 7 for δ% = 0% (the whole simulation is available
as supplementary material of [104]). The left graphs display the states, divided
into position (top) and velocity (bottom) MPFs. Their confidence intervals
are also shown (MPFn ± 3σn, i.e., 99.7% of the normal distribution, where
n = 1, 2, 3 identifies the first 3 eigenmodes of the structure). The right graph
shows the input estimation. The time axis in Fig. C1.7 is relative to the current
window. Fig. C1.8 shows the input estimation of the full simulation. We
obtained the graph by keeping the elements with |α| ≥ εα that correspond to the
best estimation time step of each window, i.e., k = T−N+1 (cf. definition A1.18).
[Figure C1.7: estimation window i = 7; position MPF (top left), velocity MPF
(bottom left) and input estimation (right).] The thick green lines are the
reference, the thick blue lines are the CS-MHE estimation, confined between
two thin blue lines that represent the confidence level (MPFn ± 3σn). Legend
(right graph): reference values (—–×); CS-MHE estimation (—–◦). The time
is relative to the current window. Figure reproduced from [104].
We can follow this approach if the last estimate (k = T−1) is not required for
specific real-time applications. Moreover, we can calculate the energy discarded
due to εα and add it to each of the nonzero components proportionally
to their magnitude, obtaining the solid dots in Fig. C1.8, which reveal the high
accuracy of the input estimation.
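The threshold-and-redistribute correction can be sketched as follows. The "energy" measure used here is the summed absolute magnitude, which is an assumption: [104] may define the discarded energy differently.

```python
import numpy as np

def threshold_and_redistribute(alpha, eps_alpha):
    """Keep components with |alpha_j| >= eps_alpha and redistribute the
    discarded magnitude over the survivors, proportionally to their own
    magnitude (sketch of the correction described in the text)."""
    alpha = np.asarray(alpha, dtype=float)
    keep = np.abs(alpha) >= eps_alpha
    out = np.zeros_like(alpha)
    mags = np.abs(alpha[keep])
    if mags.size == 0:                       # nothing survives the threshold
        return out
    discarded = np.sum(np.abs(alpha[~keep]))
    out[keep] = alpha[keep] + np.sign(alpha[keep]) * discarded * mags / mags.sum()
    return out

# Illustrative estimate: two true impacts plus small spurious components;
# eps_alpha = 1 N as in Table C1.1.
alpha = np.array([9.0, 0.4, -0.3, 3.0, 0.2])
corrected = threshold_and_redistribute(alpha, eps_alpha=1.0)
print(corrected)   # total |magnitude| is preserved by construction
```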
Fig. C1.9 presents three further aspects regarding the input estimation within
each iteration. First, the top graph shows the MSE of every window. We
notice that the MSE grows with the model mismatch. The higher error of the
simulation with δ% = −10% is not only due to the higher model mismatch, but
it is also due to a poor tuning. In fact, by comparing the values of εR , εQ and
λ in Table C1.3 for the case with δ% = −10% to the values for a smaller model
error, we note that the tuning encountered some issues. A finer tuning may
improve the solution, and we will discuss this aspect further in chapter C2. The
different MSE among the input locations is connected to the sensor positioning,
which is linked to the mode shapes of the beam. A simple way to compare
different locations involves the condition number of the observability matrix, or
checking all singular values of the PBH observability matrix (cf. section A1.8).
Furthermore, there exist techniques for optimally placing the transducers for
improved performance [77]. However, this is not a simple task when the location
of the input is not known. For this reason, we adopted a series of equidistant
sensors. Next, the central graph shows nα∗ i|i for each window, which depends
[Figure C1.8: input estimation of the full simulation, F [N] over x [m] and t [s].]
on the choice of εα . The dashed green line (recognisable by the crosses) is the
reference and corresponds to the sparse signal of Fig. C1.2. We notice a link
between the number of inputs and the MSE, i.e., detecting more components
increases the MSE. Finally, the bottom graph in Fig. C1.9 shows the sum of all
elements αi|i . We observe that the magnitudes of F2 and F4 do not converge to
their expected values, whereas F1 and F3 do converge. The reason for this offset
lies in the fact that part of the model error is seen as input by the CS-MHE.
We can limit this error with a better model or a different sensor positioning that
improves local observability. In general, we can expect better results if the
input enters the system next to a transducer. In Fig. C1.1 we note that the
location of F1 corresponds to a sensor (s3 ), both F2 and F3 are surrounded by
all sensors (with the difference that F2 enters the system not far from a node of
the third mode), and F4 is located further away from the sensors. This results
in a lower estimation accuracy of F4 and, to a minor extent, also of F2 .
As an example, let us have a look at iteration 45 of the test case with δ% = −10%
(marked with a red diamond in Fig. C1.9). Fig. C1.10 shows the results of the
input estimation. We notice that most of the nonzero components are located
around the impulse, while some others are further away and are due to the
model mismatch. Unfortunately, it is not possible to filter out those components
a priori. However, we see that the results are accurate even in the case of a
high model error.
Figure C1.9: MSE (top), nα∗ i|i (centre) and sum of all elements in αi|i (bottom).
Legend: reference values (- - -×), δ% = 0% (—–◦), δ% = −3.3% (- - -),
δ% = −6.7% (- · -), δ% = −10% (· · · ). Figure reproduced from [104].
Fig. C1.11 (left) shows the input estimation of the whole simulation with
δ% = −10%, to be compared to Fig. C1.8 for δ% = 0%. All peaks are well
estimated, together with some error in the form of unwanted peaks. A way to
filter those peaks out is to set a higher threshold εα, as shown in Fig. C1.11
(right). Both graphs in Fig. C1.11 do not include any energy correction, which
can be implemented as discussed for the reference test case with δ% = 0%.
However, due to the modelling error, the CS-MHE estimates a wrong impact.
Fig. C1.11 highlights the importance of εα for the accuracy of the input
estimation (cf. section B1.2.1). Its choice follows from our best knowledge of
the system noise and the expected input. The development of an automated
routine for choosing εα is certainly an interesting open point for future research.
To summarise, the reference test case (δ% = 0%) gave a very accurate input
estimation, whereas a small error resulted when we increased the model mismatch
[Figure C1.10: estimation window i = 45 for δ% = −10%; position and velocity
MPFs with confidence bands MPFn ± 3σn (left) and input estimation (right).]
[Figure C1.11: input estimation of the whole simulation with δ% = −10%, for
εα = 1 N (left) and εα = 2 N (right).]
Freq. ID   Freq. EMA [Hz]   Freq. MoUp [Hz]   Error [%] (δ%)
1          8.76             8.76              0.00
2          61.58            54.88             -3.71
3          180.52           153.66            -14.88
Figure C1.13: Experimental test case. Legend: see Fig. C1.1. Figure reproduced
from [104].
Figure C1.14: Reference input for the experimental test case. Legend: see
Fig. C1.2. Figure reproduced from [104].
Figure C1.15: Choice of λ for the experimental test case, given a constant Q
and R. MSE of the input estimation (blue line) and location of its minimum
λ = 2.14 (green circle). Figure reproduced from [104].
Figure C1.16: Optimal λ (left) and MSE (right) of the input estimation as
functions of εQ and εR . Legend: see Figs. C1.5–C1.6. The chosen values are
εR = 8.16 · 10−8 and εQ = 5.10 · 10−3 .
of the input. The values of εR, εQ and λ for the experimental test case follow
from Figs. C1.15–C1.16 and are given in the captions. If we compare Fig. C1.15
with Fig. C1.4, we see that the interval of minimum MSE is quite narrow. In
this case the model is not as accurate as it was for that specific numerical
investigation, resulting in the need for an accurate λ. This is reflected in the
whole MSE curve: if we rely on the model (small λ), the error is much higher
than what we can obtain by relying solely on the sparsity of the input (large λ).
This is the opposite of what we noted in Fig. C1.4.
Fig. C1.17 shows the estimation window i = 12, while the whole simulation is
available as supplementary material of [104]. For the notation and the legend
we refer to the numerical example in section C1.1.2. A few nonzero components
are located in the neighbourhood of the expected input position, and some of
them will be filtered out since their absolute value does not exceed εα.
[Figure C1.17: estimation window i = 12 for the experimental test case; position
and velocity MPFs with confidence bands MPFn ± 3σn (left) and input
estimation (right).]
[Figure C1.18: input estimation of the full simulation, F [N] over x [m] and t [s].]
Fig. C1.18 shows the input estimation of the full simulation. We evaluated
the discarded energy due to εα and distributed it to each of the three nonzero
components, proportionally to their magnitude. Two components are located
where we expect them, while the third one is on a neighbouring node, located in
time at the second nonzero component of the impact and in space at x = 0.365
m, i.e., in the direction of the tip of the beam. A closer look at the video of the
acquisition (the video is available as supplementary material of [104]) shows
Figure C1.19: MSE (top) and nα∗ i|i (bottom). The dashed green line is a
reference, and follows from the signal in Fig. C1.14. Figure reproduced from
[104].
that the hammer hits the beam in the direction of that node, which justifies
the presence of the third component.
Finally, Fig. C1.19 presents MSE and nα∗ i|i for the experimental test case. We
notice that the estimation gets more accurate when the input approaches the
end of the window, justifying the choice of Qdrift (cf. Fig. C1.3) as well as the
choice of recognizing the last time step of each window as the best estimate
[173]. A certain delay characterises the input detection, revealing the intrinsic
capability of a time window to detect an impulse, which is not a trivial task for
single step estimators (cf. section A1.4).
C1.3 Conclusions
Acknowledgements
The first part of this chapter expands reference [103], of which Matteo Kirchner
is first author. Thanks to Jan Croes and Francesco Cosco, who are co-authors
of [103]. Thanks to Eddy Smets for building the test set-up and to Daniele
Brandolisio for his help in the lab. Thanks to Karim Asrih, Ward Rottiers, Luca
Sangiuliano and Simon Vanpaemel for their hints regarding Siemens NX. Thanks
to Frank Naets and Jakob Fiszer for the practicalities regarding state-space
models. Thanks to Noé Geraldo Rocha de Melo Filho for sharing with me
his knowledge of LMS Test.Lab. Thanks to Francesco Cosco for the camera
measurements and to Tom Henskens for the circuit board to synchronise the
pictures with the acquisition system. Thanks to Florian Maurin for Fig. C2.7.
140 ESTIMATION OF PERIODIC LOADS DESCRIBED BY FOURIER COMPONENTS
Figure C2.1: Geometry of the numerical test case (left) and simulated sinusoidal
force applied at the beam tip (right). Figure reproduced from [103].
Figure C2.2: Results at iteration 1. Legend (state estimation): 1st mode (—–);
2nd mode (- - -); 3rd mode (· · ·). The thick green lines are the reference, the
thick blue lines are the CS-MHE estimation, confined into two thin blue lines
that represent the confidence level (MPFn ± 3σn ). Legend (input estimation):
reference values (- - -×); CS-MHE estimation (—–◦). Figure reproduced from
[103].
two graphs on the left hand side refer to the state estimation, and depict position
and velocity MPFs. Moving to the right, the next two graphs show the results
of the input estimation, expressed as <(α) and =(α) and arranged according
to the MATLAB® convention for the DFT, i.e., first the DC component, then
the half space of positive wave numbers and finally the negative half space.C1
The ±3σ confidence bands come from the covariance matrix of the constrained
optimisation problem [19], where we linearised the constraint Eq. (B11b) (cf.
Eq. B15). Finally, we obtained the right graph by applying the inverse Fourier
transform to the nonzero components of α, which are marked by a solid circle.
The input at iteration 1 is purely imaginary, since the signal is a sine with a
phase shift of π rad. Moreover, the DC component is null, since the force has
zero mean. A single Fourier component consists of one pair of complex conjugate
elements, scaled by the square root of the number of samples. The behaviour of
<(α) and =(α) with regard to the time shift of the optimisation window can be
spotted in an animated GIF of the simulation, which is available on YouTube [102].
We can see (predominantly on the velocity MPF of the 3rd eigenmode) that
the sizes of the confidence levels decrease throughout the simulation, thanks to
the propagation of the prior information through the covariance matrix of the
optimisation problem [104].
C1 For further details we refer to function fft on MATLAB® help.
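The ordering convention of the footnote can be checked numerically. NumPy's `fft` uses the same bin ordering as MATLAB (DC first, then the positive and the negative half spaces); note that `np.fft.fft` is unnormalised, so dividing by √N recovers the unitary convention mentioned in the text. The window length and wave number below are illustrative.

```python
import numpy as np

N = 64
k0 = 5                                  # wave number of the sine (assumed)
t = np.arange(N)
u = np.sin(2 * np.pi * k0 * t / N)      # zero-mean sine -> null DC component

U = np.fft.fft(u)                       # ordering: DC, positive half, negative half
nonzero = np.flatnonzero(np.abs(U) > 1e-9)

print(nonzero)                                  # -> [ 5 59]: one conjugate pair
print(np.allclose(U[N - k0], np.conj(U[k0])))   # -> True
```

Dividing `U` by `np.sqrt(N)` gives the unitary scaling of the sparse Fourier dictionary.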
Figure C2.3: Standard deviation as square root of the main diagonal of the
covariance matrix.
[Figure: |α| (top) and σ (bottom) over the Fourier components.]
Since the slack variable is part of the optimisation variables (and consequently
also part of the covariance matrix associated with it), a weighting number becomes
available (last block in Fig. C2.3). On the other hand, we can derive the same
quantity through the formula for propagating an uncertainty, assuming <(α)
and =(α) to be independent variables [115]. For our case, this yields Eq. (C2).
Fig. C2.5 displays the comparison between the standard deviations from the
covariance matrix and those calculated through Eq. (C2), showing an excellent match.
\sigma_{|\alpha|} = \sqrt{\frac{\Re(\alpha)^2}{\Re(\alpha)^2 + \Im(\alpha)^2}\,\sigma_{\Re(\alpha)}^2 + \frac{\Im(\alpha)^2}{\Re(\alpha)^2 + \Im(\alpha)^2}\,\sigma_{\Im(\alpha)}^2} \quad (C2)
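Eq. (C2) can be cross-checked against a Monte Carlo simulation, mirroring the comparison of Fig. C2.5. The component values and standard deviations below are illustrative.

```python
import numpy as np

def sigma_abs(re, im, s_re, s_im):
    """First-order uncertainty propagation for |alpha|, Eq. (C2)."""
    r2 = re**2 + im**2
    return np.sqrt((re**2 / r2) * s_re**2 + (im**2 / r2) * s_im**2)

# Illustrative Fourier component and its (independent) uncertainties:
re, im, s_re, s_im = 3.0, -4.0, 0.05, 0.08

rng = np.random.default_rng(1)
samples = np.abs((re + rng.normal(0.0, s_re, 200_000))
                 + 1j * (im + rng.normal(0.0, s_im, 200_000)))

print(sigma_abs(re, im, s_re, s_im), samples.std())   # both ~ 0.071
```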
Figure C2.6: Geometry of the numerical test case for the distributed load.
[Figure C2.7: simulated distributed load F [N] over x [m] and the time steps k.]
The test set-up consists of a 1 m long aluminium beam clamped on each side to
a vertical mount, which is fixed to the ground (Fig. C2.9). Nine steel masses
are attached to the beam every ∆x = 0.1 m with the aim of lowering the
eigenfrequencies and adding complexity to the system. We built a finite element
(FE) model and we updated it based on a series of experimental modal analyses
Figure C2.8: Results at iteration 1. Legend (state estimation): 1st mode (—–);
2nd mode (- - -); 3rd mode (· · ·). The thick green lines are the reference, the
thick blue lines are the CS-MHE estimation, confined into two thin blue lines
that represent the confidence level (MPFn ± 3σn ). Legend (input estimation):
reference values (- - -×); CS-MHE estimation (—–◦); confidence bands on
the Fourier components (—–). The nonzero components are marked by a
solid circle. Γ [x] and Γ [k] refer to the Fourier components in space and time,
respectively.
EXPERIMENTAL ESTIMATION OF A PERIODIC LOAD IN TIME 147
Figure C2.9: Beam set-up (with two uniaxial accelerometers at x = 0.750 m).
(EMAs). For the experiments that we present in this section, three eigenmodes
govern the dynamical behaviour of the system, which we report in Table C2.1
and Fig. C2.10. These modes dominate the structural response due to the
orientation of the external force. In fact, we attached a shaker to the structure
such that it applies a force along the z axis, and these modes belong to the xz
plane (the axis orientation follows the notation in Fig. C2.11). The ID numbers
1, 3, 8 in Table C2.1 and Fig. C2.10 derive from the whole mode set that we
present in appendix 2.
In order to run the CS-MHE, we need the model of the set-up to be available in
MATLAB®. We tackled this problem by extracting the FE mass and stiffness
matrices and projecting them onto modal coordinates [78]. This allows us to
build a reduced order state-space model that takes into account only the
eigenmodes under examination, and to operate in the same way as we did
throughout this dissertation. Besides the state-space representation, in this
section we employ camera-based measurements to test the CS-MHE. The
following list reports the main hardware of the experimental set-up (see
Fig. C2.12), and we refer again to appendix 2 for further details.
• One shaker, located at x = 0.750 m and acting along the z axis (THE
MODAL SHOP Miniature Inertial Shaker K2002E01 [188]).
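The modal projection described above can be sketched on a toy system. This is not the beam FE model, just an illustrative 4-DOF chain with unit masses (so the eigenvectors of K are already mass-normalised and the projection keeps the problem symmetric).

```python
import numpy as np

# Toy FE-like matrices: a 4-DOF spring chain with unit masses.
n = 4
K = 2 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)   # stiffness [assumed units]
M = np.eye(n)                                          # mass (identity here)

# Eigenpairs of M^{-1} K (symmetric since M = I):
w2, Phi = np.linalg.eigh(np.linalg.solve(M, K))
Phi = Phi[:, :2]                                       # keep the 2 lowest modes

Mr = Phi.T @ M @ Phi    # reduced mass -> identity (mass-normalised modes)
Kr = Phi.T @ K @ Phi    # reduced stiffness -> diag of the kept eigenvalues

# Undamped reduced state-space matrix, as used throughout the dissertation:
A = np.block([[np.zeros((2, 2)), np.eye(2)],
              [-np.linalg.solve(Mr, Kr), np.zeros((2, 2))]])
```

With damping identified from the EMAs, a modal damping block would be added to the lower-right of `A` in the same way as in the earlier numerical example.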
Table C2.1: Model update for the first three beam modes in plane xz.
Figure C2.10: Mode shapes of the first three beam eigenmodes in plane xz.
In this section we indicate the settings of the experiment as well as the values
of the CS-MHE tuning parameters. Next, we show the experimental results.
We will be rather concise, since we already discussed how to choose the tuning
parameters εQ, εR and λ in chapter C1, and we presented a numerical example
with a 1D Fourier dictionary in section C2.1. Table C2.2 includes the values
for a first experiment that involves the estimation of a force composed of a
single Fourier component. Later in this section we will consider an increasing
number of components. We begin by showing two cases that differ in the
number of measurements (nr) and consequently in other tuning parameters.
The first run uses all 53 markers available in the central strip, while for the
second run we employ only 3 markers, located at x = 0.370 m, x = 0.500 m
[Figure: measured marker displacements z [mm] at x = 370, 500 and 630 mm.]
of markers that had the most light (“BEAM STRIP 2” in Fig. C2.12). We
can see that the input estimation (solid blue) follows well the measurements
acquired through the impedance head (dashed green).C2 The third eigenmode
(mode ID 8 in Table 5) dominates the beam response, not surprisingly since this
eigenmode is the closest to the excitation frequency. Fig. C2.15 shows the MSE
of the input estimation with and without Qdrift . We notice that the presence
of Qdrift does not always lower the MSE, but it smooths out its variability.
Lower values of Qdrift led to a lower estimation accuracy, whereas in general
we expect Qdrift to help the CS-MHE by providing a better guess for the next
iteration. We believe that this behaviour is connected to all covariances that
take part in the optimisation as well as possible numerical issues, and future
research will investigate this aspect. Furthermore, we note that the MSE in
Fig. C2.15 oscillates quite a lot, and this is caused by the fact that the MSE
C2 For the experimental results in this chapter, the green curves that refer to the state
estimation (left hand side graphs) are not reference values. We obtained them by evaluating
the states with the model and the measured force, and thus they indicate the model accuracy
related to that specific load. As for the input estimation (middle and right hand side graphs),
the impedance head does not measure any static force. The graphs show the total force
Ftotal = Fstatic + Fdynamic applied by the shaker, where Fdynamic is given by the impedance
head and Fstatic is given by the total mass of the assembly that includes shaker, impedance
head and connection screws, multiplied by gravity, i.e., Fstatic = 0.296 kg · 9.81 m/s2 = 2.9 N.
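The static preload quoted in the footnote is a one-line computation:

```python
# Static preload of the shaker assembly (values from the footnote above).
m_assembly = 0.296   # kg: shaker + impedance head + connection screws
g = 9.81             # m/s^2
F_static = m_assembly * g
print(round(F_static, 1))   # -> 2.9 (N)
```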
[Figure C2.14: results at iteration 1 with 53 measurement points; position and
velocity MPFs with ±3σ bands (left), <(α) and =(α) of the Fourier components
(middle) and reconstructed input uk [N] (right).]
is very sensitive to small phase errors. This fact could also be related to the
choice of Qdrift .
Next, Figs. C2.16 and C2.18 show the results (first iteration and MSE,
respectively) for the run with three measurement points at x = 0.370 m,
x = 0.500 m, x = 0.630 m. From Fig. C2.16 we note wider confidence intervals
(±3σ bands) in comparison with Fig. C2.14, where we employed 53 measurement
points. The MSE in Fig. C2.18 is higher than in Fig. C2.15, but the force
estimation is still very accurate. This graph does not include any drift term
Figure C2.15: MSE of the input estimation, obtained by considering 53
displacement sensors. Legend: Qdrift = 150 N² and Qdrift = 0 N².
since it did not introduce any improvement in the results. As already mentioned,
this aspect requires further investigation.
After showing some results for the cases with 53 and 3 measurement points, let
us now generalise the discussion by investigating the influence of the number of
transducers (nr ) on the results. First, Fig. C2.17 shows how the value of the
balancing weight λ changes with nr . We note that this is particularly relevant if
we want to keep nr low. The curve is not smooth due to the discrete values of λ
during tuning (cf. section C1.1.1). Moreover, we assumed the same covariance
for every measurement point, which is an approximation (it is possible that
every point has a different uncertainty due to the quality of the marker detection,
and taking this into consideration could lead to a smoother curve).
Next, from top to bottom, Fig. C2.19 shows the averaged standard deviation (σ)
of the nonzero components of α, obtained as the square root of the diagonal
values of the covariance matrix that refers to the slack variable (cf. Fig. C2.5),
the standard deviation of the fifth element of the Fourier series (cf. Figs. C2.14
and C2.16) and the MSE of the input estimation, respectively. By comparing
the first two graphs we notice that the averaged values are lower than the single
component that refers to the active sinusoid. This happens because the DC
component has a lower uncertainty. Such behaviour is stronger for nr = 1, 2
due to the fact that there are more nonzero components (the system is not
observable, and extra regularisation is needed to compute the covariance matrix,
cf. section B2.1). This also explains why the first two points do not follow
the smooth curve of all other points. Furthermore, we note that σ decreases
as the number of transducers increases, but adding more than a certain
number of transducers (approximately 8) does not result in a strong decrease of
[Figure C2.16: results at iteration 1 with 3 measurement points; position and
velocity MPFs with ±3σ bands (left), <(α) and =(α) of the Fourier components
(middle) and reconstructed input uk [N] (right).]
[Figure C2.17: optimal log10(λ) as a function of the number of transducers nr.]
Figure C2.18: MSE of the input estimation, obtained by considering 3
displacement sensors.
[Figure C2.19: averaged σ of the nonzero components (top), σ of the single
active component (centre) and MSE of the input estimation (bottom), as
functions of nr, for iterations 1–3.]
uncertainty. In the bottom graph we see that this smooth increase in precision
does not translate into better results, since the MSE tends to stay constant
(apart from the non-observable cases with nr = 1, 2). Lastly, we note that
the σ corresponding to the first iteration is higher than for the second and third
iterations, due to the availability of the arrival cost in the latter cases. On
the other hand, we do not notice any similar trend for the MSE of the input
estimation, whose oscillating behaviour is mostly due to small phase errors. The
values of σ reported in Fig. C2.19 are scaled according to the chosen Fourier
transform convention (i.e., by $1/\sqrt{N-1}$, where N − 1 is the size of the
Fourier spectrum and differs by 1 from the window length N).
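The quantities above can be reproduced with a small, self-contained sketch (this is not the dissertation's code: the covariance matrix, the window length and the set of active components below are invented placeholders):

```python
import numpy as np

N = 17                                          # window length (placeholder)
# Hypothetical covariance matrix over the Fourier coefficients alpha;
# in the text this block comes from the covariance of the slack variable.
P_alpha = np.diag([4.0, 1.0, 1.0, 0.25, 0.25])

active = np.array([1, 2, 3, 4])                 # indices of the nonzero components
sigma = np.sqrt(np.diag(P_alpha))               # per-component standard deviation
sigma_scaled = sigma / np.sqrt(N - 1)           # Fourier-convention scaling 1/sqrt(N-1)
sigma_avg = float(sigma_scaled[active].mean())  # averaged over the active components
```

The averaging over the active set is what produces the top graph of Fig. C2.19; the scaling mirrors the $1/\sqrt{N-1}$ convention mentioned above.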
During a further measurement campaign, we investigated the influence of an
increasing number of sinusoids. Table C2.3 shows the different settings for
the tests. The notation “min(nr)” in the third column refers to the observability
requirement. We set the same amplitude for each sinusoidal component of the
input.C3 However, the dynamic response of the system composed of shaker
and structure acted as a filter, with a low-pass tendency. In comparison with
Fig. C2.16 (and previous analogous illustrations), the figures that follow contain
an extra graph (middle-right) with the amplitude of the Fourier components
(abs(α)), whose uncertainty bands are calculated following Eq. (C2). This
allows us to visualise the frequency-dependent filtering behaviour of the assembly
composed of the structure and the shaker in response to the same amplitude
for each Fourier component of the load (green crosses in the middle-right graphs
of Figs. C2.20–C2.30). From the fourth column of Table C2.3 we see that the
window length varies with the number of sinusoids. We made this choice a
posteriori, choosing the shortest window that generates sufficient sparsity and
guarantees that all frequencies are modelled, in order to avoid spectral leakage
that would degrade the sparsity level (i.e., S increases). For the case of four
sines we considered a few window lengths. We recall that a single sinusoid
consists of two complex conjugate peaks, and a further nonzero element is due
to the static weight of the shaker (DC component).
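The effect of the window length on sparsity can be illustrated with a minimal NumPy sketch (the sampling rate and frequency below are illustrative values, not the experimental ones): a window containing a whole number of periods yields a single active spectral line, while a mismatched window spreads energy over many bins, degrading sparsity.

```python
import numpy as np

fs = 512.0                      # sampling rate [Hz] (assumed for illustration)
f0 = 32.0                       # frequency of the single sinusoid [Hz]

def n_active(N, tol=1e-8):
    """Count spectral lines above tol for a length-N rectangular window."""
    t = np.arange(N) / fs
    x = np.sin(2 * np.pi * f0 * t)
    X = np.abs(np.fft.rfft(x)) / N
    return int(np.sum(X > tol))

# N = 16 contains exactly one period of f0, so the spectrum is maximally
# sparse; N = 17 does not, and leakage activates many extra bins.
sparse_count = n_active(16)
leaky_count = n_active(17)
```

This is the mechanism behind choosing N = 33 for two sinusoids: the shortest window in which all active frequencies fall exactly on a DFT bin.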
Let us now go through the estimation results. First, Fig. C2.20 refers to one
sinusoid, and the results are in line with the previous experiment (cf. Fig. C2.16).
Next, Fig. C2.21 shows two sinusoids. We did not manage to estimate both
components accurately while keeping the same window length as in the previous
case, due to the deteriorated sparsity resulting from the extra component.
Consequently, we chose N = 33, this being the next smallest number that includes
the two frequencies indicated in Table C2.3. To get an idea of the sparsity
levels we can look back at Eq. (A120), which suggests that S = 5 is quite high
in the case of N = 17. It is worth mentioning that in general every combination
C3 We built custom audio files (wav files, paying attention not to generate any signal clipping),
which LMS Test.Lab [181] passed to the shaker through an LMS SCADAS [180].
EXPERIMENTAL ESTIMATION OF A PERIODIC LOAD IN TIME 157
[Figure panels (cf. Figs. C2.20 and C2.21): modal participation factors MPFpos,n and MPFvel,n over the window T−N+1 … T, the Fourier components with amplitudes abs(α), and the estimated input uk [N].]
Finally, Figs. C2.24–C2.30 illustrate the results with four Fourier components
for different window lengths (N = 33, 49, 65, 81) and numbers of measurement
points (nr = 9, 15, 50). To obtain these graphs we lowered λ, so as not to run
into the problems already mentioned with regard to Fig. C2.22, using as starting
points the values that correspond to the minimum MSE (cf. section C1.1.1). We
are aware that the computational effort increases substantially going from
N = 33 to N = 81, but here we want to explore the possible benefits that
[Figure panels: modal participation factors MPFpos,n and MPFvel,n over the window T−N+1 … T, the Fourier components with amplitudes abs(α), and the estimated input uk [N].]
scaling the threshold on the active components (εα ) with the window length.
[Figure panels: modal participation factors MPFpos,n and MPFvel,n over the window T−N+1 … T, the Fourier components with amplitudes abs(α), and the estimated input uk [N].]
The tendency remains (to different extents) also in the case of a window of 49 time
steps (Figs. C2.26–C2.27) and of 65 time steps (Figs. C2.28–C2.29). A last example
involves 81 time steps and 50 measurement points (Fig. C2.30).
We performed the exercise of an increasing number of Fourier components in
order to investigate the requirements and limits of the CS-MHE in terms of
number of measurements, window length (and consequently number of basis
[Figure panels: modal participation factors MPFpos,n and MPFvel,n over the window T−N+1 … T, the Fourier components with amplitudes abs(α), and the estimated input uk [N], repeated for the different window lengths and sensor counts.]
the peak-to-peak amplitude that results from the sum of all active Fourier
components. It is not a rigorous metric, since a small phase discrepancy can
have a big influence, but it helps in recognising that once the window is long
enough for a certain sparsity it is not worth extending it further. Moreover, we
expect the error to decrease after a few iterations of the CS-MHE.
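The sensitivity of this peak-to-peak measure to phase can be seen with synthetic stand-ins (the amplitudes and frequencies below are invented, not the experimental values): two signals with identical Fourier amplitudes but a phase shift on one component have markedly different peak-to-peak amplitudes.

```python
import numpy as np

t = np.linspace(0.0, 1.0, 2000, endpoint=False)
# Hypothetical load: two active Fourier components with fixed amplitudes.
u_ref = 10 * np.sin(2 * np.pi * 5 * t) + 3 * np.sin(2 * np.pi * 15 * t)
# Same amplitudes, but the harmonic shifted in phase by pi/2.
u_shift = 10 * np.sin(2 * np.pi * 5 * t) + 3 * np.sin(2 * np.pi * 15 * t + np.pi / 2)

ptp_ref = float(u_ref.max() - u_ref.min())      # peak-to-peak of the reference
ptp_shift = float(u_shift.max() - u_shift.min())  # changes although amplitudes match
```

A phase change alone moves the peak-to-peak value, which is why the text calls this metric non-rigorous.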
The model error has a significant effect on the experimental results, especially
in relation to the input, i.e., the accuracy of the modal participation factors
[Figure panels: modal participation factors MPFpos,n and MPFvel,n over the window T−N+1 … T, the Fourier components with amplitudes abs(α), and the estimated input uk [N], repeated for the remaining combinations of window length and number of measurement points.]
this section we did not vary the window length in unit steps, so as not to introduce
spectral leakage, i.e., we always made sure that the frequencies of all active
Fourier components of the input were modelled within the window. Current
research is investigating a few approaches to cope with possible spectral leakage.
C2.4 Conclusions
172 CONCLUSIONS AND OUTLOOK
results if the window does not contain a whole number of sine periods. We tackled
this problem by using the autocorrelation of the estimated input in the time domain
[20], which allows us to detect the dominant periodic wave and consequently to
adapt the dictionary of the next iteration so that it matches the correct period,
thus minimising the spectral leakage. In doing this we changed neither the
sampling time nor the MHE window length (N), but we adapted the
Fourier shape functions such that they match the detected periodicity. It is
worth noting that this approach works best if N does not match the input
periodicity while the sampling scheme is set in accordance with it, and all sine
waves are higher orders of the dominant frequency. In case of more complex
dynamical behaviours (e.g., some combination of rotational speed and
structural eigenmodes), we could think of constructing an ad hoc dictionary
made of Fourier shape functions (multiples of the rotational frequency) as well
as a collection of shape functions that match the eigenmodes of the system.
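The period-detection step can be sketched with plain NumPy (the sampling rate, signal and the 80 %-of-r[0] peak heuristic are illustrative assumptions, not the dissertation's implementation): the lag of the first strong autocorrelation peak gives the dominant period, from which the Fourier shape functions of the next iteration can be re-tuned.

```python
import numpy as np

fs = 512.0                          # sampling rate [Hz] (illustrative value)
t = np.arange(2048) / fs
# Hypothetical periodic load: fundamental at 8 Hz plus a higher harmonic.
u = np.sin(2 * np.pi * 8 * t) + 0.4 * np.sin(2 * np.pi * 24 * t)

# Autocorrelation of the estimated input in the time domain (non-negative lags).
r = np.correlate(u, u, mode="full")[u.size - 1:]

# Dominant period = lag of the first local maximum above 80 % of r[0]
# (a simple, assumed heuristic for picking the first major peak).
lags = np.arange(1, r.size - 1)
peaks = lags[(r[lags] > r[lags - 1]) & (r[lags] > r[lags + 1]) & (r[lags] > 0.8 * r[0])]
period_samples = int(peaks[0])
f_detected = fs / period_samples    # dominant frequency of the input
```

With the detected period in hand, the dictionary frequencies can be set to multiples of `f_detected` so that all active components are modelled exactly within the window.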
In this context, it would be interesting to compare the performance of the
CS-MHE with other state-of-the-art estimators such as the EKF or UKF with a
random-walk model, under the hypothesis of an observable system. We expect
the CS-MHE to outperform those approaches especially for long windows and
relatively low sampling rates.
Observability
In chapter B2 we investigated rank and condition number of the matrices
that form the CS-MHE optimisation problem, as those metrics are strictly
related to observability. Since we limited the discussion to an LTI system
excited by an external load modelled by a Dirac delta dictionary, future research
could start from this evaluation to further expand the topic aiming at a more
FURTHER METHODOLOGY DEVELOPMENTS 175
formal observability assessment (cf. section B2.4, in particular Eq. B23). This
could include a nonlinear system as well as any rank and condition number
dependencies in relation to the type of dictionary. Further points of interest
may regard the influence of the sampling rate and of any possible correlation
between sparsity in time and space (cf. section B2.5).
Numerical stability
The matrices that result when setting up an estimation problem such as the
CS-MHE may be badly conditioned, and numerical issues may arise. We faced
some concerns related to this topic in part C of this dissertation, and we
tackled them by enforcing symmetry as well as by introducing weighting and
regularisation factors. This allowed us to obtain correct estimates, but
further issues may arise for other applications. These may derive from different
transducers, models, dictionaries and units, and the resulting ill-posedness may
require specific regularisation approaches (e.g., the ones reported in [70]).
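The two remedies mentioned above, enforcing symmetry and adding a regularisation factor, can be sketched as follows (the matrix, its conditioning and the size of the regularisation term are invented placeholders, not values from the beam experiments):

```python
import numpy as np

# Hypothetical ill-conditioned Hessian, as may arise when assembling the
# CS-MHE normal equations with mixed transducers/units.
rng = np.random.default_rng(0)
G = rng.standard_normal((6, 6))
H = G @ np.diag([1.0, 1.0, 1.0, 1e-10, 1e-10, 1e-12]) @ G.T
H = H + 1e-14 * rng.standard_normal((6, 6))   # small asymmetric round-off

H_sym = 0.5 * (H + H.T)                 # enforce symmetry
eps = 1e-8 * np.linalg.norm(H_sym, 2)   # Tikhonov-style regularisation factor
H_reg = H_sym + eps * np.eye(6)         # lift the smallest eigenvalues

cond_before = np.linalg.cond(H)
cond_after = np.linalg.cond(H_reg)      # substantially smaller than cond_before
```

The regularisation factor trades a small bias for a large improvement in conditioning, which is the compromise described in the paragraph above.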
Overcomplete dictionaries
For the examples in this dissertation we employed a Dirac delta dictionary and
a Fourier dictionary, which are both complete dictionaries, i.e., all basis
functions are orthogonal. In particular, whenever we dealt with a Fourier
dictionary, we limited the discussion to a DFT with regular sampling and the same
number of Fourier components as the length of the time-domain data, i.e., the
number of shape functions equals the window length. However, overcomplete
(non-orthogonal) dictionaries and different sampling schemes may allow for
higher accuracy, better sparsity and lower sampling rates (cf. chapter A3 and
section B1.4) [107, 110].
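The distinction between a complete and an overcomplete Fourier dictionary can be made concrete with a small sketch (a complex-exponential dictionary on a doubled frequency grid; this construction is an illustrative assumption, not one used in the dissertation): the overcomplete grid has nonzero mutual coherence, i.e., the atoms are no longer orthogonal.

```python
import numpy as np

N = 32                                    # window length
t = np.arange(N)

# Complete DFT dictionary: the N harmonics k/N are mutually orthogonal.
# Overcomplete variant: a frequency grid twice as fine, k/(2N), giving
# 2N atoms in an N-dimensional space.
k = np.arange(2 * N)
Psi = np.exp(2j * np.pi * np.outer(t, k) / (2 * N)) / np.sqrt(N)  # unit-norm atoms

gram = Psi.conj().T @ Psi
coherence = float(np.abs(gram - np.eye(2 * N)).max())  # > 0 => non-orthogonal
```

The even-indexed atoms recover the orthonormal DFT basis, while the half-bin atoms overlap with their neighbours (coherence close to 2/π), which is the price paid for the finer frequency resolution.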
In this appendix we give the matrix formulation of the CS-MHE with a limited
number of constraints, introduced in sections B1.3 and B1.4. We present the
matrices for an LTI system and window length N = 4. Let us start from Eq. (1),
which we presented in section B1.4.2 as Eq. (B9).¹ Let us also recall Eq. (B10)
and add a more explicit definition of b in Eq. (2), denoted as Eq. (2d).
$$z = \begin{bmatrix} x \\ \alpha \end{bmatrix} \qquad (2a)$$

$$H = \begin{bmatrix} H_{xx} & H_{x\alpha} \\ H_{\alpha x} & H_{\alpha\alpha} \end{bmatrix} \qquad (2b)$$

$$b = a_b^\top A_b\, a_b \qquad (2d)$$
1 We first present it in section B2.1 as Eq. (B17). Note that in this appendix we consider
178 APPENDIX
Element   Formula
z         Eq. (4)
Hxx       Eq. (7)
Hαx       Eq. (8)
Hxα       Hxα = Hαx^⊤

Table 1: Links between the elements of Eq. (2) and their formulas.
Table 1 links the different parts of Eq. (2) to their explicit formulations for
N = T = 4. Accordingly, the dictionary Ψ in Eq. (B5) results in Eq. (3).
$$\Psi = \begin{bmatrix} \psi_{11} & \psi_{12} & \psi_{13} \\ \psi_{21} & \psi_{22} & \psi_{23} \\ \psi_{31} & \psi_{32} & \psi_{33} \end{bmatrix} \qquad (3)$$
$$a_b^\top = \begin{bmatrix} \bar{x}_1^\top & y_1^\top & y_2^\top & y_3^\top & y_4^\top & \bar{\alpha}_1^\top & \bar{\alpha}_2^\top & \bar{\alpha}_3^\top \end{bmatrix} \qquad (5)$$

$$A_b = \operatorname{diag}\!\left(P_a^{-1},\, R_1^{-1},\, R_2^{-1},\, R_3^{-1},\, R_4^{-1},\, P_{\alpha_1}^{-1},\, P_{\alpha_2}^{-1},\, P_{\alpha_3}^{-1}\right) \qquad (6)$$
$$H_{xx} = \begin{bmatrix}
P_a^{-1} + A^\top Q_1^{-1} A + C^\top R_1^{-1} C & -A^\top Q_1^{-1} & 0 & 0 \\
-Q_1^{-1} A & Q_1^{-1} + A^\top Q_2^{-1} A + C^\top R_2^{-1} C & -A^\top Q_2^{-1} & 0 \\
0 & -Q_2^{-1} A & Q_2^{-1} + A^\top Q_3^{-1} A + C^\top R_3^{-1} C & -A^\top Q_3^{-1} \\
0 & 0 & -Q_3^{-1} A & Q_3^{-1} + C^\top R_4^{-1} C
\end{bmatrix} \qquad (7)$$
MATRIX IMPLEMENTATION OF THE CS-MHE
The j-th block row of $H_{\alpha x}$ (j = 1, 2, 3) reads

$$\left(H_{\alpha x}\right)_{j,:} = \begin{bmatrix}
\psi_{1j}^\top\!\left(B^\top Q_1^{-1} A + D^\top R_1^{-1} C\right) &
-\psi_{1j}^\top B^\top Q_1^{-1} + \psi_{2j}^\top\!\left(B^\top Q_2^{-1} A + D^\top R_2^{-1} C\right) &
-\psi_{2j}^\top B^\top Q_2^{-1} + \psi_{3j}^\top\!\left(B^\top Q_3^{-1} A + D^\top R_3^{-1} C\right) &
-\psi_{3j}^\top B^\top Q_3^{-1}
\end{bmatrix} \qquad (8)$$
$$q_x^\top = \begin{bmatrix} -2\bar{x}_1^\top P_a^{-1} - 2y_1^\top R_1^{-1} C & -2y_2^\top R_2^{-1} C & -2y_3^\top R_3^{-1} C & -2y_4^\top R_4^{-1} C \end{bmatrix} \qquad (9)$$
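The block-tridiagonal structure of Eq. (7) lends itself to programmatic assembly. A minimal sketch (the state/output dimensions and system matrices below are random placeholders, not the beam model):

```python
import numpy as np

rng = np.random.default_rng(1)
nx, ny = 2, 1                            # state / output dimensions (illustrative)
A = rng.standard_normal((nx, nx))
C = rng.standard_normal((ny, nx))
Pa_inv = np.eye(nx)                      # arrival-cost weight P_a^{-1}
Q_inv = [np.eye(nx) for _ in range(3)]   # Q_1^{-1} .. Q_3^{-1}
R_inv = [np.eye(ny) for _ in range(4)]   # R_1^{-1} .. R_4^{-1}

N = 4
Hxx = np.zeros((N * nx, N * nx))
for k in range(N):
    sl = slice(k * nx, (k + 1) * nx)
    # Diagonal block: arrival cost (k = 0) or previous Q^{-1}, plus C'R^{-1}C.
    diag = (Pa_inv if k == 0 else Q_inv[k - 1]).copy()
    diag += C.T @ R_inv[k] @ C
    if k < N - 1:
        diag += A.T @ Q_inv[k] @ A
        # Off-diagonal blocks -A'Q^{-1} and -Q^{-1}A of Eq. (7).
        sl_next = slice((k + 1) * nx, (k + 2) * nx)
        Hxx[sl, sl_next] = -A.T @ Q_inv[k]
        Hxx[sl_next, sl] = -Q_inv[k] @ A
    Hxx[sl, sl] = diag
```

With symmetric weights, the assembled matrix is symmetric, matching the structure of Eq. (7) block by block.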
The (j, l) block of $H_{\alpha\alpha}$ (j, l = 1, 2, 3) reads

$$\left(H_{\alpha\alpha}\right)_{jl} = \sum_{k=1}^{3} \psi_{kj}^\top \left( B^\top Q_k^{-1} B + D^\top R_k^{-1} D \right) \psi_{kl} + \delta_{jl}\, P_{\alpha_j}^{-1} \qquad (10)$$

where $\delta_{jl}$ is the Kronecker delta, so that the regularisation terms $P_{\alpha_1}^{-1}$, $P_{\alpha_2}^{-1}$, $P_{\alpha_3}^{-1}$ appear on the diagonal blocks only.
The j-th block of $q_\alpha^\top$ reads

$$\left(q_\alpha^\top\right)_j = -2y_1^\top R_1^{-1} D\psi_{1j} - 2y_2^\top R_2^{-1} D\psi_{2j} - 2y_3^\top R_3^{-1} D\psi_{3j} - 2\bar{\alpha}_j^\top P_{\alpha_j}^{-1}, \qquad j = 1, 2, 3 \qquad (11)$$
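The generic (j, l) expression of Eq. (10) can be evaluated directly in code. A sketch (dimensions, system matrices and the ψ values are random placeholders, not the experimental model):

```python
import numpy as np

rng = np.random.default_rng(2)
nx, ny, nu, na = 2, 1, 1, 1              # illustrative dimensions
B = rng.standard_normal((nx, nu))
D = rng.standard_normal((ny, nu))
Q_inv = [np.eye(nx) for _ in range(3)]   # Q_1^{-1} .. Q_3^{-1}
R_inv = [np.eye(ny) for _ in range(3)]   # R_1^{-1} .. R_3^{-1}
P_alpha_inv = [np.eye(na) for _ in range(3)]  # P_{alpha_1..3}^{-1}
psi = rng.standard_normal((3, 3, nu, na))     # psi[k, j] maps alpha_j to inputs

H_aa = np.zeros((3 * na, 3 * na))
for j in range(3):
    for l in range(3):
        # Sum over the time index k of psi' (B'Q^{-1}B + D'R^{-1}D) psi.
        block = sum(psi[k, j].T @ (B.T @ Q_inv[k] @ B + D.T @ R_inv[k] @ D) @ psi[k, l]
                    for k in range(3))
        if j == l:
            block = block + P_alpha_inv[j]   # regularisation on diagonal blocks only
        H_aa[j * na:(j + 1) * na, l * na:(l + 1) * na] = block
```

Since the inner weighting matrix is symmetric, the assembled $H_{\alpha\alpha}$ is symmetric as well, consistent with H in Eq. (2b).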
$$
\begin{aligned}
\underset{x_k,\,\alpha_k}{\operatorname{minimise}}\quad
& \left(x_{T-N+1} - \bar{x}_{T-N+1}\right)^\top P_a^{-1} \left(x_{T-N+1} - \bar{x}_{T-N+1}\right) \\
& + \sum_{k=T-N+1}^{T-1} \bigg( x_{k+1} - A x_k - B \!\!\sum_{j=T-N+1}^{T-1}\!\! \psi_{k,j}\,\alpha_j \bigg)^{\!\top} Q_k^{-1} \bigg( x_{k+1} - A x_k - B \!\!\sum_{j=T-N+1}^{T-1}\!\! \psi_{k,j}\,\alpha_j \bigg) \\
& + \sum_{k=T-N+1}^{T} \bigg( y_k - C x_k - D \!\!\sum_{j=T-N+1}^{T-1}\!\! \psi_{k,j}\,\alpha_j \bigg)^{\!\top} R_k^{-1} \bigg( y_k - C x_k - D \!\!\sum_{j=T-N+1}^{T-1}\!\! \psi_{k,j}\,\alpha_j \bigg) \\
& + \left(\alpha - \bar{\alpha}\right)^\top P_\alpha^{-1} \left(\alpha - \bar{\alpha}\right) + \lambda \sum_{k=T-N+1}^{T-1} \|\alpha_k\|_1 && (12a) \\
\text{subject to}\quad & x \in \left[x_{\mathrm{LB}}, x_{\mathrm{UB}}\right], \qquad \alpha \in \left[\alpha_{\mathrm{LB}}, \alpha_{\mathrm{UB}}\right] && (12b)
\end{aligned}
$$
Appendix 2
Physical set-up
Since the preliminary FE model was incomplete and not accurate enough, we
performed a series of model updating procedures based on the following three
experimental modal analyses (EMAs) [78], using three uniaxial accelerometers
at a fixed location and one impact hammer. We extracted the eigenmodes in
LMS Test.Lab [145].
We note that the first two EMAs are rather coarse. However, their purpose
was just a material characterisation, which turned out to be accurate enough. On
the other hand, the third EMA is finer and allows for a more detailed study of the
eigenmodes, involving the torsional and axial directions of the beam as well
as the flexural modes of the vertical mounts. Next, we imported the results
of the EMAs (i.e., eigenfrequencies and mode shapes) into the FE software as
target values for a model updating procedure [78], aiming at improving the
accuracy of the FE model. The model updating consisted of three main steps,
corresponding to the three EMAs:
Mode ID Plane Freq. EMA [Hz] Freq. MoUp [Hz] Error [%]
1 xz 35.70 34.99 -1.98
2 xy 53.08 52.49 -1.11
3 xz 93.55 96.41 3.05
4 xy 144.92 144.58 -0.24
5 xz 187.41 188.87 0.78
6 xy 286.03 283.12 -1.02
2. Determine the stiffness of the contact between the aluminium and the
steel masses: in general, modelling a mechanical contact is not a simple
task. However, for the beam set-up we chose to model the contact as a
uniform isotropic stiffness of the steel (Esteel ). This approximation does
not introduce relevant distortions since the structure is rather rigid and
each mass presents a single contact area. Another approximation regards
the mass distribution: in fact, the model assumes solid isotropic steel,
while in reality the presence of the screws changes the centre of mass.
However, we neglected this aspect. Table 4 shows the results of this second
model updating procedure.
Table 4: Model update aluminium beam with steel masses, free-free boundary
conditions.
Mode ID Plane Freq. EMA [Hz] Freq. MoUp [Hz] Error [%]
1 xz 26.23 25.74 -1.88
2 xy 39.16 38.69 -1.20
3 xz 65.62 65.98 0.55
4 xy 93.46 94.10 0.68
5 xz 120.38 120.32 -0.05
6 xy 173.01 172.51 -0.29
the top and bottom nodes of the aluminium beam (modelled by 2D elements).
We are aware that there are different ways to model the set-up (including 3D
FE models) and several ways to model the vertical mounts and the clamping,
which could result in a better eigenmode matching, especially for modes ID
5 and 7 (cf. Table 5). However, for our purposes we judged the model to be
accurate enough. Fig. 2 shows the mode shapes of the full set-up, linked to
Table 5. In the following list we give a few remarks about the eigenmodes of
the system:
• The third column of Table 5 indicates if a mode involves mainly the beam
(B) or the full set-up, including the vertical mounts (M). The six beam
modes (B) correspond well with our preliminary FE analysis that did
not take into account the vertical mounts. On the other hand, the three
modes that involve the whole set-up (M) are governed by the dynamical
behaviour of the vertical mounts.
• The second column of Table 5 indicates in which plane the mode is
predominantly located. This allows us to distinguish the following beam
(B) modes:
– Three beam bending modes in the vertical plane xz, i.e., modes 1,
3, 8. Excluding the clamping nodes, they have zero, one and two
nodal points, respectively. For the experimental example we chose
to excite the beam along z, such that these three modes govern the
whole dynamic response.
– Three beam bending modes in the horizontal plane xy, i.e., modes
2, 4, 9. Excluding the clamping nodes, they have zero, one and
two nodal points, respectively. Mode 9 also has a strong torsional
behaviour, which is why we used the notation “plane xy(z)”
in Table 5.
In order to run the CS-MHE, we need the model of the test set-up to be available
in MATLAB® . We tackled this problem by extracting the FE mass and stiffness
TEST CASE DESCRIPTION FOR SECTION C2.3 189
Figure 2: Mode shapes of the first 9 eigenmodes of the full FE model after the
model updating (cf. Table 5).
matrices and projecting them onto modal coordinates [78]. This allows us to build a
state-space model and to operate in the same way as we did throughout this
dissertation. Since for the experiments in chapter C2 we mounted the shaker
such that it applies a load in the xz plane along the z axis, we expected modes
1, 3 and 8 (cf. Table 5 and Fig. 2) to dominate the response. Accordingly, we
reduced the model by limiting it to those three eigenmodes. The literature offers a few
model order reduction (MOR) techniques [15, 174], but for the beam set-up we
selected the mode shapes based on the participation factors of the excitation
configuration. This improves the observability of the system at the
expense of a very small loss of modelling accuracy.
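The projection of the FE mass and stiffness matrices onto modal coordinates, and the construction of the reduced state-space model, can be sketched as follows (the toy M and K, the kept-mode set and the modal damping ratio are illustrative placeholders, not the exported beam matrices):

```python
import numpy as np

# Toy lumped-parameter M, K (stand-ins for the exported FE matrices).
M = np.diag([2.0, 1.0])
K = np.array([[300.0, -100.0],
              [-100.0, 100.0]])

# Generalized eigenproblem K v = w^2 M v, solved via M^(-1/2) scaling.
Mi = np.diag(1.0 / np.sqrt(np.diag(M)))
w2, V = np.linalg.eigh(Mi @ K @ Mi)
Phi = Mi @ V                       # mass-normalised mode shapes: Phi' M Phi = I

kept = [0]                         # keep only the dominant mode(s) (cf. modes 1, 3, 8)
wn = np.sqrt(w2[kept])             # natural frequencies [rad/s]
zeta = 0.01 * np.ones(len(kept))   # assumed modal damping ratio

# Modal state-space: states [q, qdot], with q the kept modal coordinates.
n = len(kept)
A_ss = np.block([[np.zeros((n, n)), np.eye(n)],
                 [-np.diag(wn**2), -np.diag(2 * zeta * wn)]])
```

Selecting `kept` by participation factors, as done for the beam, reduces the state dimension while keeping the modes that the excitation actually drives.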
Figure 3: Comparison between an FRF obtained from the updated and reduced
state-space model (—–) and an experimental FRF (- - -). The mode IDs follow
the notation in Table 5 (an additional experimental mode is present). The
FRFs refer to a force input at x = 0.750 m and an acceleration output at
x = 0.450 m, and are expressed as acceleration over force [mm/Ns2].
In this section we list the steps we went through in order to extract the
displacement measurements from a sequence of images. This procedure involves
standard digital image processing techniques, which can be found in references
such as [64, 149]. Their implementation in MATLAB® is documented in [65],
and many functions are now available in the image processing and computer
vision toolbox (MATLAB® R2017a). For this reason, we refer the interested
reader to the MATLAB® help and references therein. The experimental set-up
was instrumented with the following (see Fig. 4):
• Two adhesive paper strips on the vertical mounts, identical to the strips
on the beam.
• Two checkerboard patterns on two different planes, to reconstruct the
scene in 3D.
• One shaker (THE MODAL SHOP Miniature Inertial Shaker K2002E01
[188]).
• One impedance head (PCB ICP® 288D01 [144]), to measure the force
entering the system, to be used for validation purposes.
• One high-speed camera (Ximea MQ042CG-CM [201]).
• One photography lamp working with DC (rectified) current. The lamp
produces a spot light which is clearly visible in the centre of Fig. 4.
Consequently, the central strip on the beam (“BEAM STRIP 2” in Fig. 4)
receives more light.
• One data acquisition system (LMS SCADAS [180]).
• One synchronisation circuit.
• One personal computer.
Fig. 5 shows the schematic data flow when an experiment is running. The settings
for the LMS SCADAS [180] and for the camera are defined via two dedicated
programs. In LMS Test.Lab [181] we set the duration of the experiment, the
sampling rate, the signal for the shaker and the trigger for the camera. In
the camera software we set the camera such that each frame is activated by
the rising edge of the trigger signal and the exposure has a fixed duration.
Furthermore, a signal indicating the exposure time is sent back to the LMS
SCADAS for synchronisation purposes. The camera software also takes care of
saving the images on the PC. The LMS SCADAS is a data acquisition system
with multiple inputs and outputs. From top to bottom, four channels of the
LMS SCADAS are connected as follows:
1. INPUT from the impedance head: this channel acquires the dynamical
component of the force, which is generated by the shaker and captured
by the impedance head. We use this signal for validating the estimation
results.
2. OUTPUT to the shaker: this channel generates a signal that serves as
input for the shaker. This signal is amplified to reach the required force.
3. OUTPUT to the synchronisation box: this channel generates a sine wave
at the frequency corresponding to the desired frame rate. The camera
requires a square wave at a certain voltage level, and the purpose of the
When performing vision-based measurements, the first phase involves the
estimation of the parameters for correcting the lens distortions (typically barrel
or pincushion distortions) as well as the parameters to calibrate the
scene, i.e., to transform the pixel information into metric measurements. The
following list summarises these steps:
1. Determine the correction needed to cope with lens distortion. This step
requires several pictures of the measurement area containing a calibration
checkerboard pattern (Fig. 6). We covered the other two checkerboard
TEST CASE DESCRIPTION FOR SECTION C2.3 195
Figure 7: Tracked features between two images. Legend: points of image 1 (×),
points on image 2 (◦), link (· · ·).
patterns (cf. Fig. 4), since the MATLAB® toolbox for correcting lens
distortions can more easily detect a single checkerboard pattern.
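The radial distortion model underlying this correction can be sketched in a few lines (the distortion coefficients k1, k2 are invented illustrative values; in practice they are estimated from the checkerboard pictures, and MATLAB/OpenCV implement a richer model):

```python
import numpy as np

# Radial (barrel/pincushion) distortion in normalised image coordinates.
k1, k2 = -0.2, 0.05   # assumed coefficients, negative k1 -> barrel distortion

def distort(p):
    """Apply the radial model: p_d = p * (1 + k1 r^2 + k2 r^4)."""
    r2 = np.sum(p**2, axis=-1, keepdims=True)
    return p * (1 + k1 * r2 + k2 * r2**2)

def undistort(p_d, iters=20):
    """Invert the radial model by fixed-point iteration (as common practice)."""
    p = p_d.copy()
    for _ in range(iters):
        r2 = np.sum(p**2, axis=-1, keepdims=True)
        p = p_d / (1 + k1 * r2 + k2 * r2**2)
    return p

pts = np.array([[0.3, 0.2], [-0.5, 0.4]])
recovered = undistort(distort(pts))   # should return the original points
```

The fixed-point inversion converges quickly for mild distortions, which is why a handful of iterations suffices in typical calibration pipelines.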
Once all the parameters of the camera and of the scene are available, we can
apply the following procedure to every image (i.e., to every frame of a video
acquired during an experiment):
1. Make a first guess of the marker positions. This can be done manually
or automatically thanks to specific algorithms for pattern recognition.
Unfortunately, our images were quite dark and non-uniformly illuminated,
and the marker resolution was quite low, resulting in the failure of
the automatic pattern recognition routines. Consequently, we defined
the first guess of the marker positions manually.
2. Refine the marker positions up to sub-pixel resolution. This can be
done automatically by investigating the pixels in the neighbourhood of
the first guess. Fig. 10 shows the whole beam, while Fig. 11 contains
Figure 11: Marker detection for the central strip (top) and zoom on the central
markers (bottom). Legend: first guess (◦), sub-pixel refinement (+).
two zooms where we can notice the sub-pixel refinement of the marker
positions (green crosses).
3. Transform the pixel coordinates into metric values by applying the lens
distortion correction (point 1 of the previous list) and the matrices for
roto-translation and scaling (point 4 of the previous list).
4. Repeat the procedure for every picture acquired during an experiment,
using the refined position of the current frame as the first guess for the next
frame.
5. Correct the displacement measurements with the model. This step is not
a standard procedure in image processing, and we discuss it after this
list.
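The sub-pixel refinement of step 2 can be sketched with an intensity-weighted centroid (one simple technique among several; the synthetic image, marker position and window size below are illustrative assumptions, not the experimental data):

```python
import numpy as np

def refine_subpixel(img, guess, half=3):
    """Refine an integer marker guess to sub-pixel accuracy using an
    intensity-weighted centroid over a (2*half+1)^2 neighbourhood."""
    r0, c0 = guess
    patch = img[r0 - half:r0 + half + 1, c0 - half:c0 + half + 1].astype(float)
    w = patch - patch.min()                # crude background removal (assumed)
    rr, cc = np.mgrid[-half:half + 1, -half:half + 1]
    s = w.sum()
    return (r0 + (w * rr).sum() / s, c0 + (w * cc).sum() / s)

# Synthetic dark image with one bright Gaussian marker centred at (10.3, 14.6).
rows, cols = np.mgrid[0:32, 0:32]
img = np.exp(-((rows - 10.3) ** 2 + (cols - 14.6) ** 2) / 4.0)
r, c = refine_subpixel(img, (10, 15))      # integer first guess -> sub-pixel refine
```

Starting from the integer guess (10, 15), the centroid lands within a fraction of a pixel of the true marker centre, which is the accuracy gain described in step 2.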
Figure 12: Beam deformed under the static load of the shaker (F), obtained
through an FE analysis.
Figure 13: Displacement measurements at the first time step. From image
processing (top) and corrected with the FE deformed shape (bottom).
Figure 15: Overlap of the displacement measurements at all time steps, corrected
with the FE deformed shape.
and its eigenfrequency (142 Hz, cf. Table 5) is the closest to the excitation
frequency (128 Hz). Finally, Fig. 15 shows an overlap of the displacements
evaluated for each frame of a measurement run (approximately 50 s at 512 Hz),
where we can see the space spanned by the vibration as well as an issue related
to the illumination. In fact, we notice that the measurements become blurry
while approaching the vertical mounts of the set-up. The reason for this can be
understood by looking at the light in Fig. 10.
In the coming few lines we want to stress that building the model
and setting up the procedure for extracting displacement measurements from
images required a certain effort and some time. The CS-MHE is a model-based
estimator, and its performance is strictly linked to the model accuracy (cf.
the numerical example in section C1.1). Building an accurate model requires
knowledge of numerical and experimental methodologies, specific software
(including protocols to exchange information among different programs), and a
certain level of personal experience. Moreover, similar considerations apply to
Figure 16: Synchronisation signal. The chosen time step is marked with a black
triangle (time step ID 31 in Fig. 17).
the camera measurements. For example, the first FE model that we built did
not consider the vibration of the vertical mounts of the set-up. Furthermore,
the model updating procedure consisted of a single step in which we optimised
all material and contact parameters. In parallel, the measurements did
not include any lens correction and only considered the 2D plane in which
the beam vibrates. Under these circumstances, in our first trial run of the
CS-MHE we could notice that something was happening, but we did not manage
to find good values for the covariance matrices; consequently, it was not
possible to tune the balancing weight λ, and the results were not acceptable.
We cannot claim that the sequence of operations we applied to model and
measurements is the best possible, and there surely is room for improvement,
but we showed in section C2.3.2 that the results are satisfactory.
Before concluding this appendix, let us spend a few words on the
synchronisation signal used to identify the time step at which a picture was taken,
thus allowing us to compare the CS-MHE results with the force measurements
provided by the impedance head. We already mentioned a few details about
hardware connections (cf. Fig. 5), signal shapes and voltage levels, while in
Fig. 16 we show a portion of the synchronisation signal that goes from the
camera to the acquisition system. It is sampled at 16384 Hz (≈ 16 kHz) in
order to have 32 points describing one square wave at a frame rate of 512 frames
per second (fps). The high level of the signal indicates the time in which the
camera is in exposure-active mode, and its length was set manually to 1.8 ms,
i.e., at every rising edge of the trigger signal (consisting of a square wave at
512 Hz) the camera takes a picture with an exposure time of 1.8 ms, and at
the same time it sends back the exposure status to the data acquisition system.
Then, the camera waits for the next rising edge of the trigger signal. From this
Figure 17: Phase difference between the force estimated by the CS-MHE and
measured by the impedance head, expressed in function of the synchronisation
time step. The choice of time step ID 31 (black triangles in Fig. 16) gives the
best alignment.
signal we had to choose the synchronisation strategy, i.e., the time step
within the exposure-active window to which we consider the picture to refer. We
did this by trying 40 time steps in a portion of signal that includes a full
exposure-active window, and computing the phase difference between the force
estimate given by the CS-MHE and the force measured by the impedance head
(had the impedance head not been present, we could have set an arbitrary delay
according to the values suggested by the camera manufacturer). Fig. 17 shows
this phase difference, and we can see that the synchronisation denoted by ID 31
leads to the lowest phase error. This time step corresponds to the black triangles
in Fig. 16, and it is located towards the end of the exposure-active signal.
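The delay-selection procedure can be sketched with synthetic stand-ins (the signals, the true delay of 9 sync steps, and the DFT-bin phase extraction are illustrative assumptions; the experiment used the measured force and found ID 31):

```python
import numpy as np

fs = 512.0                          # frame rate [Hz]
f0 = 128.0                          # excitation frequency [Hz] (cf. text)
fs_sync = 16384.0                   # sync-signal sampling rate (32 steps/frame)
t = np.arange(512) / fs

force_meas = np.sin(2 * np.pi * f0 * t)          # impedance-head reference
delay_true = 9                                   # hypothetical camera delay (sync steps)

def phase_at(x, f, fs):
    """Phase of the f-Hz component of x, read from the corresponding DFT bin."""
    k = int(round(f * len(x) / fs))
    return np.angle(np.fft.rfft(x)[k])

candidates = np.arange(40)          # candidate time-step IDs (cf. Fig. 17)
errs = []
for d in candidates:
    # Estimate re-referenced to candidate delay d, then compared in phase.
    shifted = np.sin(2 * np.pi * f0 * (t - delay_true / fs_sync + d / fs_sync))
    dphi = phase_at(shifted, f0, fs) - phase_at(force_meas, f0, fs)
    errs.append(abs((dphi + np.pi) % (2 * np.pi) - np.pi))   # wrap to (-pi, pi]
best = int(candidates[np.argmin(errs)])   # candidate with lowest phase error
```

Sweeping the candidate IDs and keeping the one with minimum phase error is exactly the selection shown in Fig. 17; here the minimum falls at the assumed delay.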
Bibliography
[9] Bao, S., Luo, L., Mao, J., and Tang, D. Improved fault detection
and diagnosis using sparse global-local preserving projections. Journal of
Process Control 47 (2016), 121–135.
[18] Bock, H., Körkel, S., Kostina, E., and Schlöder, J. Robustness
aspects in parameter estimation, optimal design of experiments and
optimal control. In Reactive Flows, Diffusion and Transport. Springer
Berlin Heidelberg, 2007, pp. 117–146.
[19] Bock, H. G., Kostina, E., and Kostyukova, O. Covariance matrices
for parameter estimates of constrained parameter estimation problems.
SIAM Journal on Matrix Analysis and Applications 29, 2 (2007), 626–642.
[27] Candes, E. J., Romberg, J. K., and Tao, T. Stable signal recovery
from incomplete and inaccurate measurements. Communications on Pure
and Applied Mathematics 59, 8 (2006), 1207–1223.
[28] Carin, L. On the relationship between compressive sensing and random
sensor arrays. Antennas and Propagation Magazine, IEEE 51, 5 (Oct
2009), 72–81.
[29] Carmi, A., Gurfil, P., and Kanevsky, D. Methods for sparse signal
recovery using Kalman filtering with embedded pseudo-measurement norms
and quasi-norms. Signal Processing, IEEE Transactions on 58, 4 (April
2010), 2405–2409.
[30] Chardon, G., Daudet, L., Peillot, A., Ollivier, F., Bertin,
N., and Gribonval, R. Near-field acoustic holography using sparse
regularization and compressive sampling principles. The Journal of the
Acoustical Society of America 132, 3 (2012), 1521–1534.
[31] Chardon, G., Leblanc, A., and Daudet, L. Plate impulse response
spatial interpolation with sub-Nyquist sampling. Journal of Sound and
Vibration 330, 23 (2011), 5678–5689.
206 BIBLIOGRAPHY
[32] Charles, A., Asif, M., Romberg, J., and Rozell, C. Sparsity
penalties in dynamical system estimation. In Information Sciences and
Systems (CISS), 2011 45th Annual Conference on (March 2011), pp. 1–6.
[33] Chen, C.-T. Linear system theory and design. Oxford series in electrical
and computer engineering. Oxford university press, 1999.
[34] Chen, C.-T. Linear system theory and design, 3rd ed. Oxford University
Press, New York, 1999.
[35] Chen, S., and Donoho, D. Basis pursuit. In Signals, Systems and
Computers, 1994. 1994 Conference Record of the Twenty-Eighth Asilomar
Conference on (Oct 1994), vol. 1, pp. 41–44.
[36] Chen, S. S., Donoho, D. L., and Saunders, M. A. Atomic
decomposition by basis pursuit. SIAM Rev. 43, 1 (Jan. 2001), 129–159.
[37] Chu, E., Keshavarz, A., Gorinevsky, D., and Boyd, S. Moving
horizon estimation for staged qp problems. In 2012 IEEE 51st IEEE
Conference on Decision and Control (CDC) (Dec 2012), pp. 3177–3182.
[38] Croes, J. Virtual sensing in mechatronic drivelines – Bridging between
advanced methods and industrial applications. PhD thesis, KU Leuven,
September 2017.
[39] D’Elia, G., Cocconcelli, M., Mucchi, E., Rubini, R., and
Dalpiaz, G. Step-by-step algorithm for the simulation of faulted bearings
in non-stationary conditions. In Proceedings of ISMA2016 including USD
2016, Leuven, Belgium (2016), P. Sas, D. Moens, and A. van de Walle,
Eds., pp. 2393–2408.
[44] Donoho, D., and Tanner, J. Thresholds for the recovery of sparse
solutions via l1 minimization. In Information Sciences and Systems, 2006
40th Annual Conference on (March 2006), pp. 202–206.
[45] Donoho, D. L., and Elad, M. On the stability of the basis pursuit in
the presence of noise. Signal Processing 86, 3 (2006), 511–532.
[46] Doucet, A., de Freitas, N., and Gordon, N., Eds. Sequential Monte
Carlo Methods in Practice. Springer-Verlag New York, 2001.
[47] Duarte-Carvajalino, J., and Sapiro, G. Learning to sense
sparse signals: Simultaneous sensing matrix and sparsifying dictionary
optimization. Image Processing, IEEE Transactions on 18, 7 (July 2009),
1395–1408.
[48] Elad, M. Optimized projections for compressed sensing. IEEE
Transactions on Signal Processing 55, 12 (Dec 2007), 5695–5702.
[49] Feng, D., and Feng, M. Q. Identification of structural stiffness
and excitation forces in time domain using noncontact vision-based
displacement measurement. Journal of Sound and Vibration 406,
Supplement C (2017), 15–28.
[50] Ferreau, H. J., Kirches, C., Potschka, A., Bock, H. G., and
Diehl, M. qpOASES: a parametric active-set algorithm for quadratic
programming. Mathematical Programming Computation 6, 4 (2014),
327–363.
[51] Floudas, C., Akrotirianakis, I., Caratzoulas, S., Meyer, C.,
and Kallrath, J. Global optimization in the 21st century: Advances and
challenges. Computers & Chemical Engineering 29, 6 (2005), 1185–1202.
[52] Forrier, B., Naets, F., and Desmet, W. Virtual sensing on
mechatronic drivetrains using multiphysical models. In ECCOMAS
Thematic Conference on Multibody Dynamics (2015).
[53] Franklin, G. F., Powell, D. J., and Emami-Naeini, A. Feedback
Control of Dynamic Systems, 5th ed. Prentice Hall PTR, Upper Saddle
River, NJ, USA, 2006.
[54] Frasch, J. Parallel Algorithms for Optimization of Dynamic Systems in
Real-Time. PhD thesis, KU Leuven and OVGU Magdeburg, September
2014.
[55] Frasch, J. V., Sager, S., and Diehl, M. A Parallel Quadratic
Programming Method for Dynamic Optimization Problems. Mathematical
Programming Computations 7, 3 (2015), 289–329.
208 BIBLIOGRAPHY
[56] Frison, G., Sorensen, H., Dammann, B., and Jorgensen, J. High-
performance small-scale solvers for linear model predictive control. In
Proc. 2014 European Control Conference (ECC) (June 2014), pp. 128–133.
[63] Ginsberg, D., and Fritzen, C.-P. New approach for impact detection
by finding sparse solution. In Proceedings of ISMA2014 including
USD2014, Leuven, Belgium (15-17 September 2014), P. Sas, D. Moens,
and H. Denayer, Eds., pp. 2043–2056.
[64] Gonzalez, R., and Woods, R. Digital Image Processing (2nd Edition).
International Edition. Prentice Hall, 2002.
[65] Gonzalez, R., Woods, R., and Eddins, S. Digital Image Processing
Using MATLAB. Pearson Prentice Hall, 2004.
[66] Gordon, N. J., Salmond, D. J., and Smith, A. F. M. Novel approach
to nonlinear/non-gaussian bayesian state estimation. IEE Proceedings F -
Radar and Signal Processing 140, 2 (April 1993), 107–113.
BIBLIOGRAPHY 209
[68] Grant, M., and Boyd, S. CVX: Matlab software for disciplined convex
programming, version 2.1, March 2014.
[69] Ha, Q., and Trinh, H. State and input simultaneous estimation for a
class of nonlinear systems. Automatica 40, 10 (2004), 1779 – 1785.
[73] Hayes, B. The best bits. American Scientist 97, 4 (July-August 2009),
276–280.
[74] Haykin, S. Kalman filtering and neural networks, vol. 47. John Wiley &
Sons, 2004.
[75] Haykin, S. Adaptive Filter Theory. Low price edition. Pearson Education,
2008.
[76] Hermann, M., Pentek, T., and Otto, B. Design principles for
industrie 4.0 scenarios. In 2016 49th Hawaii International Conference on
System Sciences (HICSS) (Jan 2016), pp. 3928–3937.
[77] Hernandez, E. M. Efficient sensor placement for state estimation in
structural dynamics. Mechanical Systems and Signal Processing 85 (2017),
789–800.
[78] Heylen, W., Lammens, S., and Sas, P. Modal Analysis Theory
and Testing. KU Leuven, Department of Mechanical Engineering, PMA
Section, 2007.
[80] Hong, S., Lee, C., Borrelli, F., and Hedrick, J. K. A novel
approach for vehicle inertial parameter identification using a dual kalman
filter. IEEE Transactions on Intelligent Transportation Systems 16, 1
(Feb 2015), 151–161.
[81] Hou, M., and Patton, R. J. Optimal filtering for systems with
unknown inputs. IEEE Transactions on Automatic Control 43, 3 (Mar
1998), 445–449.
[82] Hsieh, C.-S. Robust two-stage kalman filters for systems with unknown
inputs. IEEE Transactions on Automatic Control 45, 12 (Dec 2000),
2374–2378.
[83] Hsieh, C.-S. Extension of unbiased minimum-variance input and state
estimation for systems with unknown inputs. Automatica 45, 9 (2009),
2149–2153.
[84] JAI. User Manual SP-12000M-CXP4 – SP-12000C-CXP4 – 12M Digital
Progressive Scan Monochrome and Color Camera. Version 1.0, September
2016.
[85] Javh, J., Slavič, J., and Boltežar, M. The subpixel resolution
of optical-flow-based modal analysis. Mechanical Systems and Signal
Processing 88 (2017), 89–99.
[86] Javh, J., Slavič, J., and Boltežar, M. High frequency modal
identification on noisy high-speed camera data. Mechanical Systems and
Signal Processing 98, Supplement C (2018), 344–351.
[87] Javh, J., Slavič, J., and Boltežar, M. Measuring full-field
displacement spectral components using photographs taken with a dslr
camera via an analogue fourier integral. Mechanical Systems and Signal
Processing 100, Supplement C (2018), 17–27.
[88] Jayawardhana, M., Zhu, X., Liyanapathirana, R., and
Gunawardana, U. Compressive sensing for efficient health monitoring
and effective damage detection of structures. Mechanical Systems and
Signal Processing 84, Part A (2017), 414–430.
[89] Johnson, K. L. Contact Mechanics. Cambridge University Press, 1985.
[90] Julier, S. J. The spherical simplex unscented transformation. In
Proceedings of the 2003 American Control Conference (June 2003), vol. 3,
pp. 2430–2434.
[91] Julier, S. J., and Uhlmann, J. K. Unscented filtering and nonlinear
estimation. Proceedings of the IEEE 92, 3 (Mar 2004), 401–422.
BIBLIOGRAPHY 211
[104] Kirchner, M., Croes, J., Cosco, F., and Desmet, W. Exploiting
input sparsity for joint state/input moving horizon estimation. Mechanical
Systems and Signal Processing 101 (2018), 237–253.
[105] Kirchner, M., Croes, J., Cosco, F., Pluymers, B., and
Desmet, W. Compressive sensing-moving horizon estimator for combined
state/input estimation: an observability study. In Proceedings of
ISMA2016 including USD2016, Leuven, Belgium (19-21 September 2016),
P. Sas, D. Moens, and A. van de Walle, Eds., pp. 2947–2962.
[106] Kirchner, M., Croes, J., Cosco, F., Pluymers, B., and
Desmet, W. Compressive sensing-moving horizon estimator for periodic
loads: experimental validation in structiral dynamics with video-based
measurements. In Proceedings of ISMA2018 and USD2018, Leuven,
Belgium (17-19 September 2018), W. Desmet, B. Pluymers, D. Moens,
and W. Rottiers, Eds.
[108] Kirchner, M., and Nijman, E. Nearfield acoustical holography for the
characterization of cylindrical sources: practical aspects. SAE Technical
Paper 2014-01-2094, SAE International (2014).
[109] Kirchner, M., and Nijman, E. Cylindrical nearfield acoustical
holography. In eLiQuiD – Best Engineering Training in Electric,
Lightweight and Quiet Driving (2016), W. Desmet, B. Pluymers, and
M. Kirchner, Eds., KU Leuven, pp. 23–44.
[110] Kirchner, M., and Nijman, E. Cylindrical nearfield acoustical
holography: Practical aspects and possible improvements. In
SpringerBriefs in Applied Sciences and Technology, Automotive NVH
Technology (2016), A. Fuchs, E. Nijman, and H. Priebsch, Eds., Springer
International Publishing, pp. 47–62.
[111] Kitanidis, P. K. Unbiased minimum-variance linear state estimation.
Automatica 23, 6 (1987), 775–778.
[112] Klinikov, M., and Fritzen, C.-P. An updated comparison of the
force reconstruction methods. In Damage Assessment of Structures VII
(2007), vol. 347 of Key Engineering Materials, Trans Tech Publications,
pp. 461–466.
BIBLIOGRAPHY 213
[114] Korovin, S., and Fomichev, V. State Observers for Linear Systems
with Uncertainty. De Gruyter Expositions in Mathematics, No. 51. De
Gruyter, Berlin, Boston, 2009.
[115] Ku, H. H. Notes on the use of propagation of error formulas. Journal
of Research of the National Bureau of Standards – C. Engineering and
instrumentation 70C, 4 (October-December 1966), 263–273.
[116] Liao, Y., Xiao, Q., Ding, X., and Guo, D. A novel dictionary design
algorithm for sparse representations. In International Joint Conference
on Computational Sciences and Optimization (CSO 2009) (April 2009),
vol. 1, pp. 831–834.
[117] Liu, B., Ling, S., and Gribonval, R. Bearing failure detection using
matching pursuit. NDT & E International 35, 4 (2002), 255–262.
[118] Liu, C., Wu, X., Mao, J., and Liu, X. Acoustic emission signal
processing for rolling bearing running state assessment using compressive
sensing. Mechanical Systems and Signal Processing 91 (2017), 395–406.
[126] Lu, P., van Kampen, E.-J., de Visser, C. C., and Chu, Q.
Framework for state and unknown input estimation of linear time-varying
systems. Automatica 73 (2016), 145–154.
[127] Maes, K., Gillijns, S., and Lombaert, G. A smoothing algorithm for
joint input-state estimation in structural dynamics. Mechanical Systems
and Signal Processing 98 (2018), 292–309.
[128] Maes, K., Smyth, A., Roeck, G. D., and Lombaert, G. Joint
input-state estimation in structural dynamics. Mechanical Systems and
Signal Processing 70-71 (2016), 445–466.
[134] Naets, F., Cuadrado, J., and Desmet, W. Stable force identification
in structural dynamics using kalman filtering and dummy-measurements.
Mechanical Systems and Signal Processing 50-51 (2015), 235–248.
BIBLIOGRAPHY 215
[151] Qiao, B., Mao, Z., and Chen, X. Sparse representation for the inverse
problem of force identification. In Proceedings of ISMA2016 including
USD2016, Leuven, Belgium (19-21 September 2016), P. Sas, D. Moens,
and A. van de Walle, Eds., pp. 1685–1696.
[152] Qiao, B., Zhang, X., Gao, J., and Chen, X. Impact-force sparse
reconstruction from highly incomplete and inaccurate measurements.
Journal of Sound and Vibration 376 (2016), 72–94.
[153] Qiao, B., Zhang, X., Gao, J., Liu, R., and Chen, X. Sparse
deconvolution for the large-scale ill-posed inverse problem of impact force
reconstruction. Mechanical Systems and Signal Processing 83 (2017),
93–115.
[154] Qiao, B., Zhang, X., Wang, C., Zhang, H., and Chen, X. Sparse
regularization for force identification using dictionaries. Journal of Sound
and Vibration 368 (2016), 71–86.
[155] Qu, C. C., and Hahn, J. Computation of arrival cost for moving horizon
estimation via unscented kalman filtering. Journal of Process Control 19,
2 (2009), 358–363.
[156] Quirynen, R. Numerical Simulation Methods for Embedded Optimization.
PhD thesis, KU Leuven and University of Freiburg, January 2017.
[157] Quirynen, R., Gros, S., and Diehl, M. Efficient NMPC for
nonlinear models with linear subsystems. In Proceedings of the 52nd
IEEE Conference on Decision and Control (2013), pp. 5101–5106.
[158] Quirynen, R., Gros, S., and Diehl, M. Fast auto generated ACADO
integrators and application to MHE with multi-rate measurements. In
Proceedings of the European Control Conference (2013), pp. 3077–3082.
[159] Quirynen, R., Gros, S., and Diehl, M. Inexact Newton based
Lifted Implicit Integrators for fast Nonlinear MPC. In Proceedings of
BIBLIOGRAPHY 217
[170] Ray, L. R. Nonlinear tire force estimation and road friction identification:
Simulation and experiments1,2. Automatica 33, 10 (1997), 1819–1833.
218 BIBLIOGRAPHY
[171] Rezayat, A., Nassiri, V., Pauw, B. D., Ertveldt, J., Vanlanduit,
S., and Guillaume, P. Identification of dynamic forces using group-
sparsity in frequency domain. Mechanical Systems and Signal Processing
70-71 (2016), 756–768.
[172] Ristic, B., Arulampalam, S., and Gordon, N. Beyond the Kalman
filter : particle filters for tracking applications. Artech House, Boston,
London, 2004.
[173] Sain, M., and Massey, J. Invertibility of linear time-invariant dynamical
systems. Automatic Control, IEEE Transactions on 14, 2 (Apr 1969),
141–149.
[174] Sanchez, R. R., Buchschmid, m., and Müller, G. Model order
reduction techniques in structural dynamics. In ECCOMAS Congress
2016 – VII European Congress on Computational Methods in Applied
Sciences and Engineering (June 2016), M. Papadrakakis, V. Papadopoulos,
G. Stefanou, and V. Plevris, Eds.
[175] Santosa, F., and Symes, W. W. Linear inversion of band-limited
reflection seismograms. SIAM Journal on Scientific and Statistical
Computing 7, 4 (1986), 1307–1330.
[176] Särkkä, S. Bayesian Filtering and Smoothing. Cambridge University
Press, 2013.
[177] Sawalhi, N., and Randall, R. Simulation of vibrations produced
by localized faults in rolling elements of bearings in gearboxes. In 5th
Australasian Congress on Applied Mechanics (ACAM), Brisbane, Australia
(2007).
[178] Schmidt, P. Improvements in localization of planar acoustic holography.
Master’s thesis, Institute of Electronic Music and Acoustics, University of
Music and Performing Arts, Graz, Austria, May 2012.
[179] Sen, D., Aghazadeh, A., Mousavi, A., Nagarajaiah, S., and
Baraniuk, R. Sparsity-based approaches for damage detection in plates.
Mechanical Systems and Signal Processing 117 (2019), 333 – 346.
[180] Siemens PLM Software. https://www.plm.automation.siemens.com/en/
products/lms/testing/scadas/.
[181] Siemens PLM Software. https://www.plm.automation.siemens.com/en/
products/lms/testing/test-lab/.
[182] Siemens PLM Software. https://www.plm.automation.siemens.com/en/
products/nx/.
BIBLIOGRAPHY 219
List of publications
Journal articles
Book chapters
Books
Misc.