
Universidade de Aveiro
Departamento de Electrónica, Telecomunicações e Informática
2016

Programa de Doutoramento em Informática das Universidades de Aveiro, Minho e Porto

Carlos Eduardo da Silva Pereira

Avaliação Dinâmica para Cenários Reactivos
Dynamic Evaluation for Reactive Scenarios

Thesis submitted to the Universities of Aveiro, Minho and Porto in fulfilment of the
requirements for the degree of Doctor in Informatics under the MAP-i Doctoral
Programme, carried out under the scientific supervision of Doctor António Joaquim da
Silva Teixeira, Associate Professor at the Departamento de Electrónica, Telecomunicações
e Informática of the Universidade de Aveiro, and of Doctor Miguel Augusto Mendes
Oliveira e Silva, Assistant Professor at the Departamento de Electrónica, Telecomunicações
e Informática of the Universidade de Aveiro.
o júri / the jury

presidente / president Vitor Brás de Sequeira Amaral

Full Professor, Universidade de Aveiro

vogais / examiners committee João Manuel Pereira Barroso

Associate Professor with Habilitation, Universidade de Trás-os-Montes e Alto Douro

Luis Manuel Dias Coelho Soares Barbosa

Associate Professor, Escola de Engenharia, Universidade do Minho

José Miguel de Oliveira Monteiro Sales Dias

Invited Associate Professor, Instituto Superior de Ciências do Trabalho e da Empresa,
Instituto Universitário de Lisboa. Director of the Microsoft Language Development
Center, Portugal

António Joaquim da Silva Teixeira (Supervisor)

Associate Professor, Universidade de Aveiro

Joaquim Manuel Henriques de Sousa Pinto

Assistant Professor, Universidade de Aveiro

Carlos Jorge da Conceição Teixeira

Assistant Professor, Universidade de Trás-os-Montes e Alto Douro

Hugo Alexandre Paredes Guedes da Silva

Assistant Professor, Escola de Ciências e Tecnologia, Universidade de Trás-os-Montes e
Alto Douro
acknowledgments In carrying out this work, many people encouraged and helped me,
and without them its completion would most likely have remained no more than a
mirage.
First of all, I must thank my parents, an inexhaustible source of support, for having
done everything within their reach to provide me with opportunities they themselves
never had, and for standing by me through the good and bad moments that this entire
academic period comprised.
I thank Professor António Teixeira and Professor Miguel Oliveira e Silva for the
opportunity, the patience and the trust throughout these years. Without them, this
work would not have been possible. The relaxed way in which we could discuss ideas
and concepts is something I treasure and will remember fondly. Better supervisors
would, in my opinion, have been impossible.
I thank Artur for the support and for having opened the academic path for my
generation. From this period I will always keep the heated discussions on the way from
the university to the station and the late-afternoon beach tennis sessions.
To Nuno Luz, my thanks not only for the collaboration carried out in the context of
this work but also for a friendship that I am sure will last for many long years.
My thanks to IEETA and its collaborators, in particular Ana Isabel, Nuno Almeida and
Samuel Silva, for the support in the activities arising from this work and for their
constant good mood.
Finally, my thanks to the long list of people who helped, encouraged and supported me
during this long period and whom it would be impossible to name individually. Thank
you all very much.
Palavras-Chave Avaliação, Cenários Reactivos, Dinamismo, Adaptação ao contexto,
AAL, Arquitectura Orientada a Serviços, Eventos

Resumo A natureza dinâmica de cenários como Ambient Assisted Living e


ambientes pervasivos e ubíquos cria contextos de avaliação exigentes
que não são completamente considerados pelos métodos existentes.
Esta tese defende que são possíveis avaliações que tenham em consid-
eração a natureza dinâmica e heterogénea de ambientes reactivos, in-
tegrando aspectos como percepção e dependência de contexto, adapt-
abilidade ao utilizador, gestão de eventos complexos e diversidade de
ambientes.
O principal objectivo deste trabalho foi desenvolver uma solução que
forneça aos avaliadores a possibilidade de definir e aplicar avaliações a
utilizadores suportadas por um modelo de avaliação flexível, permitindo
a criação e reutilização de instrumentos e especificações de avaliação
sem modificar a infraestrutura geral.
Para atingir este objectivo foi seguida uma abordagem de engenharia
envolvendo: a) definição de requisitos; b) conceptualização de uma
solução geral contendo um paradigma, uma metodologia, um modelo
e uma arquitectura; c) implementação dos componentes nucleares; d)
desenvolvimento e teste de provas de conceito.
Como resultado principal obteve-se uma solução de avaliação dinâmica
para ambientes reactivos integrando três partes essenciais: um
paradigma, uma metodologia e uma arquitectura de suporte. No seu
conjunto, esta solução permite a criação de sistemas de avaliação es-
caláveis, flexíveis e modulares para concepção de avaliações e aplicação
em ambientes reactivos.
Keywords Evaluation, Reactive Scenarios, Dynamic, Context-awareness, AAL,
Service Oriented Architecture, Event-awareness

Abstract The dynamic nature of scenarios such as Ambient Assisted Living and
Ubiquitous and Pervasive environments turns them into challenging
evaluation contexts not properly addressed by existing methods. We
argue that it is possible to have evaluations that take into considera-
tion the dynamic and heterogeneous nature of reactive environments
by integrating aspects such as context-awareness, user adaptability,
complex event handling, and environment diversity.
In this context, the main objective of this work was to develop a solution
providing evaluators with the ability to define and apply evaluation tests
to end-users supported by a flexible evaluation model allowing them
to create or reuse evaluation instruments and specifications without
changing the infrastructure or requiring additional logistical arrangements.
To pursue this goal, we adopted an engineering approach encompass-
ing: a) requirements definition; b) conceptualization of a general so-
lution comprising paradigm, methodology, model, and architecture; c)
implementation of its core components; and d) development and de-
ployment of a proof of concept.
The result was a dynamic evaluation solution for reactive environments
based on three major parts: a paradigm, a methodology and its model,
and a support architecture. Altogether, they enable the creation of
scalable, flexible and modular evaluation systems for evaluation design
and application in reactive environments.
Overall, we consider that the proposed approach, due to its flexibility
and scope, widely surpasses the goals considered at the onset of this
work. With a broad range of features it establishes itself as a general
purpose evaluation solution, potentially applicable to a wider range
of scenarios, and fostering the creation of ubiquitous and continuous
evaluation systems.
Contents

Contents i

List of Figures v

1 Introduction 1
1.1 Motivation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1
1.2 Thesis Statement . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3
1.3 Objectives . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4
1.4 Main Contributions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4
1.5 Structure . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5
1.6 Published Results . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5

2 Background/Related Work 7
2.1 Common Evaluation Methodologies . . . . . . . . . . . . . . . . . . . . . . 7
2.1.1 Test Methods . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7
2.1.2 Enquiry Methods . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8
2.2 Experience Sampling Methodology . . . . . . . . . . . . . . . . . . . . . . 9
2.3 Support Technologies . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 14
2.3.1 Service Oriented Architecture . . . . . . . . . . . . . . . . . . . . . 14
2.3.2 Context Modeling . . . . . . . . . . . . . . . . . . . . . . . . . . . . 16
2.3.3 User Modeling . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 19
2.3.4 Ontologies and the Semantic Web . . . . . . . . . . . . . . . . . . . 23
2.4 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 24

3 A New Evaluation Paradigm 25


3.1 Requirements for a dynamic evaluation paradigm . . . . . . . . . . . . . . 26
3.1.1 Introducing context into an evaluation . . . . . . . . . . . . . . . . 27
3.1.2 Supporting environment heterogeneity . . . . . . . . . . . . . . . . 27
3.1.3 Providing adaptability and redundancy in interaction . . . . . . . . 28
3.1.4 Allowing reusability by introducing semantic value . . . . . . . . . 29
3.1.5 Simplifying the distribution and execution of evaluations . . . . . . 29
3.2 A proposal for Dynamic Evaluation . . . . . . . . . . . . . . . . . . . . . . 30
3.2.1 Focusing on the user . . . . . . . . . . . . . . . . . . . . . . . . . . 30

3.2.2 Considering the environment . . . . . . . . . . . . . . . . . . . . . . 31
3.2.3 Enhancing evaluation definitions . . . . . . . . . . . . . . . . . . . . 32
3.2.4 Automating the execution . . . . . . . . . . . . . . . . . . . . . . . 34
3.3 The Conceptual Architecture . . . . . . . . . . . . . . . . . . . . . . . . . 35
3.3.1 Evaluation Domains . . . . . . . . . . . . . . . . . . . . . . . . . . 36
3.3.2 Evaluation Nodes . . . . . . . . . . . . . . . . . . . . . . . . . . . . 39
3.3.3 Support Infrastructure . . . . . . . . . . . . . . . . . . . . . . . . . 40
3.4 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 42

4 A Methodology and a Model for Evaluation Definition 43


4.1 Requirements . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 43
4.2 A Methodology Proposal for Dynamic Evaluation . . . . . . . . . . . . . . 44
4.3 Generic Domain Language . . . . . . . . . . . . . . . . . . . . . . . . . . . 47
4.3.1 Enquiries . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 48
4.3.2 Events . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 49
4.3.3 Event Processing Rules . . . . . . . . . . . . . . . . . . . . . . . . . 50
4.3.4 Evaluation Flow Control . . . . . . . . . . . . . . . . . . . . . . . . 53
4.4 Domain Language . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 55
4.4.1 Creating a domain - Schema extension . . . . . . . . . . . . . . . . 56
4.4.2 Creating a domain - Association Process . . . . . . . . . . . . . . . 61
4.5 Evaluation Specification . . . . . . . . . . . . . . . . . . . . . . . . . . . . 66
4.5.1 Creating an evaluation - Evaluation Ontologies . . . . . . . . . . . 67
4.5.2 Creating an evaluation - Creating instances of the Enquiry and EPR
Specifications . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 69
4.5.3 Creating an evaluation - Defining the Evaluation Assessments . . . 72
4.6 Execution Specification . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 73
4.6.1 Applying an evaluation - Instantiating an evaluation . . . . . . . . 75
4.6.2 Applying an evaluation - Creating execution specifications . . . . . 76
4.7 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 78

5 Dynamic Evaluation Architecture 81


5.1 Requirements . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 81
5.2 Architectural Proposal . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 82
5.3 Virtual Domain . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 84
5.3.1 Domain Manager . . . . . . . . . . . . . . . . . . . . . . . . . . . . 86
5.3.2 Evaluation Module . . . . . . . . . . . . . . . . . . . . . . . . . . . 87
5.3.3 Data Persistence Unit . . . . . . . . . . . . . . . . . . . . . . . . . 90
5.3.4 Domain Interfaces and Producers . . . . . . . . . . . . . . . . . . . 90
5.3.5 Extending a Virtual Domain . . . . . . . . . . . . . . . . . . . . . . 93
5.4 Node . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 94
5.4.1 Node Manager . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 96
5.4.2 Evaluation Engine . . . . . . . . . . . . . . . . . . . . . . . . . . . 97
5.4.3 Interface Manager . . . . . . . . . . . . . . . . . . . . . . . . . . . . 100

5.4.4 EPR Engine and Event Logger+Dispatcher . . . . . . . . . . . . . 101
5.4.5 User and Context Models . . . . . . . . . . . . . . . . . . . . . . . 104
5.5 Support Unit . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 107
5.5.1 Node Registry . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 107
5.5.2 Domain Registry . . . . . . . . . . . . . . . . . . . . . . . . . . . . 108
5.5.3 Attribute Service . . . . . . . . . . . . . . . . . . . . . . . . . . . . 109
5.5.4 Interface and Event Producer Repository . . . . . . . . . . . . . . . 110
5.5.5 Association Service . . . . . . . . . . . . . . . . . . . . . . . . . . . 111
5.5.6 Evaluation Mediation Service . . . . . . . . . . . . . . . . . . . . . 112
5.6 Evaluation Hub . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 113
5.7 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 115

6 Proof of Concept 117


6.1 A First Instantiation of the Architecture:
Dynamic Evaluation as a Service Platform . . . . . . . . . . . . . . . . . . 117
6.1.1 Node Framework . . . . . . . . . . . . . . . . . . . . . . . . . . . . 118
6.1.2 Virtual Domain Framework . . . . . . . . . . . . . . . . . . . . . . 121
6.1.3 Support Unit and the Evaluation Hub . . . . . . . . . . . . . . . . 126
6.2 Creating an Evaluation for a concrete scenario: The TeleRehabilitation Eval-
uation Test . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 127
6.2.1 The evaluation scenario . . . . . . . . . . . . . . . . . . . . . . . . 127
6.2.2 Defining the domain language . . . . . . . . . . . . . . . . . . . . . 129
6.2.3 Implementing the Virtual Domain . . . . . . . . . . . . . . . . . . . 133
6.2.4 Creating the TeleRehabilitation evaluation test using the Virtual Do-
main . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 136
6.3 Applying the TeleRehabilitation Evaluation Test . . . . . . . . . . . . . . . 138
6.3.1 Starting and applying the test . . . . . . . . . . . . . . . . . . . . . 140
6.3.2 Performing a second iteration of the evaluation . . . . . . . . . . . 141
6.3.3 Evaluation results . . . . . . . . . . . . . . . . . . . . . . . . . . . . 142
6.4 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 145

7 Conclusions 147
7.1 Developed Work . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 147
7.2 Main Results . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 148
7.3 Future Work . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 150
7.4 Epilogue . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 151

Acronyms 153

Bibliography 155

Annexes 165

A General Evaluation Language 165
A.1 Enquiry Ontology . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 165
A.2 Event Ontology . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 166
A.3 Event Processing Rules Ontology . . . . . . . . . . . . . . . . . . . . . . . 167
A.4 Evaluation Ontology . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 169
A.5 Evaluation Assessment Ontology . . . . . . . . . . . . . . . . . . . . . . . 170
A.6 Evaluation Control Flow Ontology . . . . . . . . . . . . . . . . . . . . . . 172

B ESM Tool Comparison Table 176

List of Figures

2.1 Screenshots of a multiple choice question screen and a tutorial screen for an
audio note sample using CAES . . . . . . . . . . . . . . . . . . . . . . . . 10
2.2 CAESSA Visual Editor Screenshot . . . . . . . . . . . . . . . . . . . . . . 11
2.3 MyExperience ESM Tool Screenshot . . . . . . . . . . . . . . . . . . . . . 12
2.4 MovisensXS Tool Sampling Screenshot . . . . . . . . . . . . . . . . . . . . 13
2.5 Maestro Tool Sampling Screenshot . . . . . . . . . . . . . . . . . . . . . . 14
2.6 CSCP Profile Example . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 18
2.7 COBRA-Ont List of Classes and properties [Chen et al., 2003] . . . . . . . 19

3.1 User’s Circle of Information . . . . . . . . . . . . . . . . . . . . . . . . . . 32


3.2 Evaluation Assessment Example . . . . . . . . . . . . . . . . . . . . . . . . 33
3.3 Evaluation Flow . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 34
3.4 User Characterization . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 35
3.5 User Selection . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 37
3.6 Domain User Selection Phase . . . . . . . . . . . . . . . . . . . . . . . . . 39
3.7 Node Conceptual Listing . . . . . . . . . . . . . . . . . . . . . . . . . . . . 40
3.8 Conceptual Architecture . . . . . . . . . . . . . . . . . . . . . . . . . . . . 41

4.1 Dynamic Evaluation Lifecycle . . . . . . . . . . . . . . . . . . . . . . . . . 44


4.2 Instantiating an Evaluation within a Domain . . . . . . . . . . . . . . . . . 46
4.3 Applying the methodology to the conceptual architecture . . . . . . . . . . 47
4.4 Enquiry Conceptual Specification . . . . . . . . . . . . . . . . . . . . . . . 49
4.5 Event Specification . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 50
4.6 EPR Conceptual Specification . . . . . . . . . . . . . . . . . . . . . . . . . 52
4.7 Task Specification . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 54
4.8 Evaluation Flow Control Ontology . . . . . . . . . . . . . . . . . . . . . . 55
4.9 Domain Language Creation Process . . . . . . . . . . . . . . . . . . . . . . 56
4.10 Domain Language Creation - Phase One . . . . . . . . . . . . . . . . . . . 57
4.11 Enquiry Specification: Question extension example . . . . . . . . . . . . . 58
4.12 Enquiry Specification: Answer extension example . . . . . . . . . . . . . . 59
4.13 Event Specification: Extension example . . . . . . . . . . . . . . . . . . . . 60
4.14 Domain Language Creation - Phase Two . . . . . . . . . . . . . . . . . . . 62
4.15 Domain Language Creation: Task Association . . . . . . . . . . . . . . . . 63

4.16 Domain Language Creation: Interface Association . . . . . . . . . . . . . . 64
4.17 Domain Language Creation: EPR Ontology . . . . . . . . . . . . . . . . . 65
4.18 Evaluation Creation Process Flow . . . . . . . . . . . . . . . . . . . . . . . 67
4.19 Evaluation Specification . . . . . . . . . . . . . . . . . . . . . . . . . . . . 68
4.20 Evaluation Assessment Specification . . . . . . . . . . . . . . . . . . . . . . 69
4.21 Enquiry Instantiation Example . . . . . . . . . . . . . . . . . . . . . . . . 70
4.22 EPR Instantiation Example . . . . . . . . . . . . . . . . . . . . . . . . . . 71
4.23 Evaluation Assessment Example . . . . . . . . . . . . . . . . . . . . . . . . 73
4.24 Execution Specification Transformation Process . . . . . . . . . . . . . . . 74
4.25 Evaluation Instance Example . . . . . . . . . . . . . . . . . . . . . . . . . 75
4.26 Execution Specification - Step One . . . . . . . . . . . . . . . . . . . . . . 77
4.27 Execution Specification - Step Two . . . . . . . . . . . . . . . . . . . . . . 78
4.28 Execution Specification - Step Three . . . . . . . . . . . . . . . . . . . . . 79

5.1 Dynamic Evaluation Architecture Overview . . . . . . . . . . . . . . . . . 82


5.2 Virtual Domain Component Overview . . . . . . . . . . . . . . . . . . . . 86
5.3 Virtual Domain Internal Overview . . . . . . . . . . . . . . . . . . . . . . . 88
5.4 Task Interface Processing Example . . . . . . . . . . . . . . . . . . . . . . 92
5.5 Event Interface Example . . . . . . . . . . . . . . . . . . . . . . . . . . . . 92
5.6 Event Interface Example . . . . . . . . . . . . . . . . . . . . . . . . . . . . 95
5.7 Evaluation Module Input . . . . . . . . . . . . . . . . . . . . . . . . . . . . 98
5.8 Evaluation Module Output Example . . . . . . . . . . . . . . . . . . . . . 99
5.9 Interface Manager Overview . . . . . . . . . . . . . . . . . . . . . . . . . . 100
5.10 EPR Engine Integration Overview . . . . . . . . . . . . . . . . . . . . . . . 102
5.11 EPR Engine Execution Demonstration . . . . . . . . . . . . . . . . . . . . 103
5.12 EPR Engine Result Example . . . . . . . . . . . . . . . . . . . . . . . . . . 105
5.13 Support Unit Overview . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 107
5.14 Criteria Handling in the Node . . . . . . . . . . . . . . . . . . . . . . . . . 112
5.15 Sequence Diagram regarding the initialization of an evaluation . . . . . . . 114
5.16 Evaluation Hub UI Overview . . . . . . . . . . . . . . . . . . . . . . . . . 115

6.1 DynEaaS Platform: EPR Creation User Interface . . . . . . . . . . . . . . 123


6.2 DynEaaS Platform: Evaluation Assessment Results in a Timeline . . . . . 124
6.3 DynEaaS Platform: SPARQL UI . . . . . . . . . . . . . . . . . . . . . . . 125
6.4 DynEaaS Platform: Node Registration UI . . . . . . . . . . . . . . . . . . 126
6.5 The TeleRehabiliation Application User Interfaces . . . . . . . . . . . . . . 127
6.6 TeleRehabilitation Domain - Creating the Enquiry Extended Specification 130
6.7 TeleRehabilitation Domain - Associating the Enquiry Extended Specifica-
tion with control flow elements . . . . . . . . . . . . . . . . . . . . . . . . . 132
6.8 TeleRehabilitation Domain - Enquiry final domain language . . . . . . . . 133
6.9 TeleRehabilitation Domain - Event final domain language . . . . . . . . . . 133
6.10 TeleRehabilitation Domain - Enquiry Creation UI . . . . . . . . . . . . . . 134
6.11 Application Screenshot with the Question UI . . . . . . . . . . . . . . . . . 135

6.12 Defining an Evaluation Assessment using DynEaaS . . . . . . . . . . . . . 138
6.13 Distribution of the software components for the execution of the TeleReha-
bilitation evaluation test . . . . . . . . . . . . . . . . . . . . . . . . . . . . 139
6.14 Conducting the evaluation test during a TeleRehabilitation session between
a user and a therapist . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 140
6.15 Example of the DynEaaS UI regarding a question in one of the evaluation
test’s assessments. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 143
6.16 Example of a SPARQL query applied to the evaluation results in DynEaaS 144

Chapter 1

Introduction

1.1 Motivation
Technology can be found in everyday objects such as phones, cars, clothes, or our own
homes. A major challenge, however, remains in creating technology that can integrate or
even cooperate with already existing technology. Technology should be as transparent as
possible to the user. Systems and services should not be required to compete for the user
and, if they are, the user should not be aware of it. To guarantee their success, it is
necessary to test these systems extensively in real-life situations with users and ensure
that they operate according to expectations. It is necessary to involve users in development,
particularly in tests and evaluations. Otherwise, due to the competitiveness of the market,
these systems and applications risk not being accepted, not being used, and soon being
forgotten by their users.
Evaluation processes are fundamental to assess the merits of a given piece of research or a
product. They are applied in many forms, from written enquiries [Bevan and Bruval, 2003] to
observation practice [Hanington and Martin, 2012], and provide evaluators with data from
which results and conclusions can be extracted, thus determining the success of the
research or its failure. Applying an evaluation is a demanding process. Independently of the
research area, evaluations require planning of both content and application. From creating
enquiries and preparing locations to instructing the users and analyzing the results, evaluations
cost time. It is often harder for evaluators to handle logistical aspects, such as the location,
than to design the actual evaluation test. These necessities vary from area to area: while
certain areas have hard restrictions with which evaluators must comply, others have
methodologies which do not fit their necessities.
Through our work in projects such as the Living Usability Lab [LUL Consortium, 2012]
and AAL4ALL [AAL4ALL Consortium, 2015], we came upon examples of both scenarios.
In one test, the objective was to apply an evaluation to a set of users who shared a common
disability. The test was to run for two months and consisted of a set of pre-validated
enquiries presented to the user through a browser [Martins et al., 2014]. The browser would
apply the test exactly as in its paper version and collect data that could only be analyzed
when the full test ended. In another test, we intended to analyze the usability of an
Ambient Assisted Living (AAL) application. Here, the objective was to test the application
in real-life conditions and analyze the user's reactions and feedback regarding the
application. We intended to check which options the user preferred and why, to analyze
results as the test was performed and conduct new tests on the fly if necessary, and to
verify whether the environment itself compromised the application and, if it did, how and why.
In these cases, we found very different realities. In the first, a typical methodology
based on enquiries proved sufficient: the enquiries were applied in a browser to users who
selected answers from a predefined set and submitted them to a database. At the end of
the trial the database was closed and the results became accessible to the evaluators.
In the second, we found a gap, as common methodologies were not capable of addressing the
entire set of constraints and requirements that environments such as AAL or pervasive
environments comprehend. Applying a set of enquiries would disregard the reactive nature
of the environment itself and leave questions open. Conducting an observational test
can prevent the user from performing his or her normal routines. Holding a focus
group after the test could result in cases where the users are either too agreeable or do
not remember what occurred. In addition, none of these methodologies provided an answer
as to how to assess the effect that the environment had on the user's performance, how to
detect extraordinary events and data related to them, how to apply consecutive tests
without disrupting the user's routines, or how to suit the test to the user's
characteristics, preferences or interests. In all, it was clear that common methodologies
were not suited to reactive environments such as AAL.
In an AAL environment, users are constantly interacting. Whether directed at an
application or at a sensor, users interact actively or passively. In some situations, these
interactions may interfere with one another, clashing and causing drastic changes in
usability. A sudden change caused by an application can directly influence the conditions of
the environment, making it highly reactive. Applications that operate simultaneously
or use the same medium (such as sound) may see their conditions deteriorate, which
can cause sharp drops in user experience. To really test an application in such a scenario it
is necessary to anticipate these situations and, if possible, extract data from them in order
to verify their importance. In such environments, another important factor is context.
For instance, an obtained result can be directly dependent on previous actions. Having
knowledge of these actions can lead to a better comprehension of the user's answers. Dis-
regarding them can lead evaluators to compare situations in which the users were not in
similar conditions, thus leading to incorrect conclusions.
In all, the differences between reactive environments, such as AAL, and other areas
hinted that these environments have different necessities, requirements and objectives. In
some cases, the evaluator may want to pose a question directly to the user, while in others
it might be better to analyze the user's behavior. In some cases, the test must be
performed exclusively at the user's home, but in others it may be necessary for the test
to be available at all times. A methodology for reactive environments should contemplate
alternative solutions from which evaluators may choose. They must be allowed to
simply build and apply their evaluations according to their realities and without constraints
imposed by the methodology, something that common solutions are not suited for.
In the literature, the Experience Sampling Methodology (ESM) [Larson and
Csikszentmihalyi, 1983] combined with context-awareness has been used by some researchers
as a way of evaluating the effect that other sources of information may have on the user
during an evaluation. This methodology mainly consists of allowing evaluators to link
events with specific enquiries which are triggered only when the event occurs, obtaining
information on site. Supporting tools are able to detect events that a device produces and
apply the enquiries via that device. While these solutions point in the right direction, most are
directed at mobile application testing [Consolvo and Walker, 2003] and only contemplate
a single device (such as a specific phone) rather than the whole environment [Froehlich,
2009, Intille et al., 2003]. Others are standalone solutions and are not prepared for multiple-
user evaluation or large-scale instantiations [Barrett and Barrett, 2001]. They are restrictive
in the types of answers that the user can provide and do not offer the flexibility to
change the procedure, maintaining instead a fixed methodological process on how to apply
the test. Finally, they disregard aspects such as multimodality and are often too complex for
evaluators, as they require programming knowledge in order to set up an evaluation.
Driven by the necessities of AAL and the limitations of ESM solutions, it became clear that
an evaluation solution fitting the dynamic nature of reactive environments was necessary.
The solution should focus on allowing evaluators to create, apply and analyze evaluations
in scenarios where the environment can directly influence the outcomes of a test. It should
allow evaluators to gather data that can assist them in interpreting unexpected situations
and the effect they have on the user's objectives, and in establishing more accurate
conclusions. It also requires a strong focus on dynamicity. Since environments differ from
one another, evaluations may not perform identically. The reactive nature of an environment
can lead one evaluation to unfold in a completely different way from another due to a sudden
change in context. Rather than considering that user an outlier, extracting data from that
particular user may yield valuable information. Following these thoughts, a dynamic
environment requires a dynamic evaluation methodology in which it is possible to evaluate
the user under specific conditions and to see the user as an individual rather than a simple
element of a set.

1.2 Thesis Statement


Evaluation in reactive and dynamic environments requires an alternative approach that
goes beyond contextual data. Aspects such as user adaptability, complex event handling or
environment diversity must be considered and integrated into an evaluation solution that
answers the challenges and difficulties of performing evaluations in reactive environments.
We consider that it is possible to develop a solution that provides evaluators with the ability
to create and apply evaluation scenarios, supported by a flexible evaluation model that
allows them to create or reuse evaluation instruments and specifications without changing
the infrastructure or requiring additional logistical arrangements.

1.3 Objectives
The main objective of this thesis is to provide evaluators with the necessary tools to
create and apply evaluations within reactive environments. This objective focuses on gath-
ering additional data that allows evaluators to analyze the implications of the environment
on the evaluation subject and extract better conclusions for the evaluation.
To realize this objective, this work focuses on the development of a solution that pro-
vides evaluators with the ability to define new evaluation practices or methodologies with-
out requiring a strong logistical effort in preparing the subjects for these changes. Simulta-
neously, it intends to prove that it is possible to create a scalable evaluation solution that
is not based on a simple methodology and that provides evaluators with the freedom to
change specifications, introduce new evaluation elements, use contextual data and apply
remote evaluations without implementing new systems for each new iteration.
Altogether, our approach is divided into three parts: an evaluation paradigm, an evalua-
tion methodology and a supporting architecture. These three elements are combined into
an evaluation solution that offers evaluators a set of tools with the purpose of facilitat-
ing the creation and application of evaluations. In addition to its focus on context, the
solution tackles aspects that often plague evaluators, such as user recruitment, user adaptation
to context, deployment and result gathering, while at the same time promoting concepts such as
evaluation reusability, flexibility, and multi-environment support. Together, they provide
evaluators with the ability to design evaluations in whatever manner they require,
without the limitations of a common methodology.

1.4 Main Contributions


The main contributions of this thesis comprise an evaluation solution that features:
• A new paradigm based on the concept of dynamic evaluation for applying evalua-
tions in reactive environments. The paradigm introduces the notion of domains and
nodes to represent evaluation networks comprised of evaluation instruments such as
enquiries or events while integrating concepts such as context awareness, user adapt-
ability, remote initialization and control for evaluation scenarios into evaluations.
• An evaluation methodology and a model featuring an incremental approach for eval-
uation specification. The methodology provides evaluators with the ability to design
their evaluations from the bottom up and to specify both the content of the
evaluation and the procedure that delivers the test to the user. The model fea-
tures an ontological approach to promote reusability and allow different evaluations
(and their methodologies) to be applied at a distributed level.
• A support architecture inspired by Service Oriented Architecture (SOA) principles
that facilitates recruitment, enables the creation of evaluation networks focused on
user characteristics and preferences, and allows the remote creation, deployment and
application of evaluation tests based on the evaluation model.

4
• A proof of concept that implements the evaluation solution and demonstrates its
applicability.

1.5 Structure
This thesis is structured as follows. In Chapter 2, we present related work regarding
common evaluation methodologies with a special focus on ESM solutions, as well as sup-
porting technologies that influenced our solution. In Chapter 3 we present the main issues
and ideas associated with the necessity of a dynamic evaluation paradigm as well as a
conceptual architecture for this purpose. In Chapter 4 we present an evaluation method-
ology and a model to allow the specification of dynamic evaluation tests. In Chapter 5, we
propose a service architecture that implements our evaluation methodology and allows the
creation, design and application of evaluations on a large scale. Chapter 6 demonstrates
the feasibility of the solution by featuring a proof of concept that implements it. Chapter 7
enumerates the main conclusions of the thesis and possible future work topics. Additional
aspects considered important are included in the Annexes.

1.6 Published Results


Throughout the development of this thesis, and directly associated with its subject,
we identify the following already published contributions:

• The dynamic evaluation methodology and its model ([Pereira et al., 2014] and [Luz
et al., 2014]);

• The architecture and its proof of concept scenario ([Pereira et al., 2015]).

Several contributions were also made in areas that, although not directly associated with
the thesis, were fundamental to its successful development, as they contributed to decisions,
requirements and conclusions reached during this research. From this work we highlight the
following publications: in multimodal interfaces ([Teixeira et al., 2011a] [Silva et al., 2015]);
in AAL architecture proposals ([Teixeira et al., 2015] [Teixeira et al., 2011b] [Pereira et al.,
2013b]); in application development ([Teixeira et al., 2013] [Teixeira et al., 2012]); and in
generic evaluations ([Pereira et al., 2013a] [Martins et al., 2014]).
Finally, it is important to note that this thesis was developed within the context of the
European projects Living Usability Lab [LUL Consortium, 2012] and AAL4ALL
[AAL4ALL Consortium, 2015], both of which were completed with high levels of success.

Chapter 2

Background/Related Work

The evaluation of users within reactive environments is normally made through general-
purpose evaluation methods such as enquiries, observation or interviews. For this reason, we
start this chapter by listing the most used evaluation methodologies and briefly describing
their characteristics and advantages. Next, we introduce ESM as an alternative
and describe existing frameworks and tools that support it. Finally, to better contextualize
this thesis, we present a brief description of key technologies used within the scope of this work,
namely SOA, user and context models, and ontologies.

2.1 Common Evaluation Methodologies


The roots of technology evaluation lie in the USA at the end of the 1960s, when large-
scale applications of technology began to dramatically affect the lives of citizens [Bakouros,
2000]. A thorough evaluation assesses the value of the technology and its devices from technical,
market and consumer perspectives and reconciles the results within a valid methodol-
ogy [Bakouros, 2000]. A wide number of methodologies have been proposed to evaluate
applications and systems. However, concerning the evaluation of user experience, it is
possible to divide the existing methodologies into two groups: test methods and enquiry
methods.

2.1.1 Test Methods


Test methods involve observing users while they perform predefined tasks [Nielsen,
1993]. They are able to measure the user interaction and consist of collecting mostly
quantitative data from users [Afonso et al., 2013]. Testing usually involves systematic
observations to determine how well participants can use the system or service [Mitchell,
2007]. They focus on people and their tasks, and seek empirical evidence about how
to improve the user interaction [Hanington and Martin, 2012]. Test method techniques
include:

• Rapid prototyping is a technique that uses a low-fidelity prototype (not implemented),
called a mock-up, to collect preliminary data about user interaction [Bernsen and
Dybkjær, 2009]. A mock-up can be quickly created and changed. Despite being
gathered at a preliminary stage of development, the collected information is valid
and reliable [Bernsen and Dybkjær, 2009].

• Performance evaluation is centered on the users and the tasks they perform, and
it involves the collection of quantitative data. The participants' performance is
evaluated by recording elements related to the execution of particular tasks (e.g.
execution time, success/failure, or number of errors) [Nielsen, 1993]. Log file analysis
is used to collect information about users' performance. The logs recorded by the
system are important to supplement data collected by observers, as they enable
triangulation.

• Observation is a research method that consists of the attentive viewing and system-
atic recording of a particular phenomenon, including people, artifacts, environments,
behaviors and interactions [Hanington and Martin, 2012]. Observation can be direct,
when the researcher is present during the task execution, or indirect, when the task
is observed through other means, such as video recording [Bevan and Bruval, 2003].

• Remote testing is a usability evaluation method in which evaluators are
separated in space and/or time from users. In a traditional usability evaluation,
users are directly observed by evaluators; in a remote usability test, however,
communication networks act as a bridge between evaluators and users, making it possible
to review the users' interaction in their natural conditions and environments [Castillo,
1997]. This approach also facilitates the quick collection of feedback from users who
are in remote areas, with reduced overall costs.

2.1.2 Enquiry Methods


Enquiry methods involve collecting qualitative data from users. They can provide valuable
information on what users feel and desire [Bevan and Bruval, 2003]. Qualitative data,
although subjective, may help to determine what users actually want, and for that reason
survey methods are often used for evaluating user experience and usability, particularly
interviews, questionnaires and focus groups [Shneiderman, 1997]:

• The focus group methodology consists of involving a small number of people in
an informal discussion group focused on a specific subject [Wilkinson, 2003]. A
moderator introduces the topic and guides the discussion. The goal is to extract the
participants' perceptions, feelings, attitudes and ideas about a given subject [Bevan
and Bruval, 2003].

• Interviewing is a method used in direct contact with the participants to gather
opinions, attitudes, perceptions and experiences [Hanington and Martin, 2012].
Interviews are usually conducted by an interviewer who holds a dialog with the
participant. Because interviews have a one-to-one nature, errors and misunderstand-
ings can be quickly identified and clarified [Bevan and Bruval, 2003].
• The questionnaire is a tool to collect self-reported information such as characteristics,
thoughts, feelings, perceptions, behaviors or attitudes, usually in written form [Han-
ington and Martin, 2012]. A questionnaire has the advantage of being cheap and of not
requiring test equipment, and the results reflect the users' opinions, namely about the
strengths and weaknesses of the user interaction.
• The diary study is a non-intrusive field method in which the users are in a different
location from the evaluators and can manage their own time and means of gathering
information [Brandt et al., 2007]. The data are recorded at the moment they occur,
which reduces the risk of false information [Tomitsch et al., 2010]. Participants record
specific events throughout the day, and the resulting data can then be used to guide
follow-up clarification interviews.

2.2 Experience Sampling Methodology


Experience Sampling Methodology [Larson and Csikszentmihalyi, 1983], or Event Sam-
pling Methodology (ESM), is a successful method originating in social psychology [Csik-
szentmihalyi and Larson, 2014] that has been adapted for evaluation in diverse fields such
as quality of life, the experience of work, the examination of cross-cultural differences
and clinical research questions [Hektner et al., 2007]. ESM allows evaluators to study
everyday experiences “in situ”. It can involve detailed descriptions of a person's
life, such as asking participants to relate feelings, thoughts or activities, or it can be used
to describe specific events whenever they occur [Reis and Gable, 2000]. Generally, the
methodology can be regarded as a self-observation method [Reis and Gable, 2000].
Several tools have been created to support the application of ESM. [Consolvo and
Walker, 2003] characterize ESM tools in three major parts: alerting, delivering and cap-
turing. Alerting concerns how to alert the participant in order to capture his or her attention,
and the authors divide it into type of alert (random, scheduled, event-based), scheduling
requirements (daily time period, number of alerts per day, number of alerts overall) and
delivery mechanism (audible or tactile). Delivering relates to how the question is posed to
the user, which the authors divide into delivery type and question design. Finally, capturing
regards how the answer is provided by the user, which the authors divide into record type
(written or spoken) and timing of responses (timestamp and timeout).
From the early approaches with programmed stopwatches and handwritten notes, sev-
eral tools have been developed to allow for electronic data collection:

ESP The Experience-Sampling Program version 2.0 (ESP) is a software package first reported
in 1999 [Barrett and Barrett, 2001, Barrett, 1999], containing a native application to trig-
ger and run ESM enquiries on a Palm Pilot PDA and a desktop application for structuring
the questionnaires. As an evolution of ESP, [Consolvo and Walker, 2003] introduce
iESP, a tool focused on ubicomp application evaluation.

CAES and CAESSA Intille et al. [Intille et al., 2003] introduced the concept of Context-
Aware Experience Sampling (CAES) by designing a tool capable of sampling the user di-
rectly via questioning as well as sampling from sensors on or near the user.
Their tool, developed for PocketPC, poses multiple-choice questions
only through the phone's interface (showcased in Figure 2.1). Additionally, the tool permits
some level of question chaining based upon particular question responses.

Figure 2.1: Screenshots of a multiple choice question screen and a tutorial screen for an
audio note sample using CAES [Intille et al., 2003]

In [Fetter et al., 2011], the authors proposed an extension of CAES by building CAESSA,
a toolkit enabling researchers to set up CAES studies through a visual editor. The toolkit
was developed for PDAs and supports a fixed set of question types as well as answers via
text input, microphone and camera. CAESSA is composed of three main parts: a daemon
for collecting sensor data, an editor for handling event streams, and a question actuator
daemon for presenting the questions to the user. CAESSA features a plug-in mechanism for
sensor inclusion, and its visual editor allows the creation of flows between sensors, engines
and actuators. Questions are restricted to free text, numerical text, rating scale and yes/no
questions and are defined in an XML file. Figure 2.2 illustrates the CAESSA Visual Editor.
The same authors proposed another version of their work with PRIMIExperience [Fetter
and Gross, 2011], using Instant Messaging (IM) as a cost-effective means for carrying
out ESM studies.

Figure 2.2: Screenshots of the CAESSA Visual Editor [Fetter et al., 2011]

MyExperience MyExperience is an open-source tool that runs on Windows Mobile
devices (including PDAs and mobile phones) [Consolvo et al., 2007]. It is based on a three-
tier architecture of sensors, triggers and actions, in which triggers use sensor event data
to start certain actions. Its interfaces are specified via XML and a lightweight scripting
language, similar to the HTML/JavaScript paradigm on the web [Froehlich et al., 2007].
Its latest release includes a set of built-in sensors, including support for GPS, GSM-based
motion sensors (based on cellular signals), and device usage information (e.g., button
presses, battery life information, etc.). The events can be used to trigger actions such
as initiating wireless database synchronization, sending SMS messages to the research team
and/or starting “in situ” self-report surveys. Additional sensors can be added via its plug-in
architecture.
The tool provides evaluators with the ability to pose questions in a set of fixed formats,
including closed-form and open-form data. In total, the latest version of MyExperience
provides fourteen separate survey response widgets (a selection of which is shown
in Figure 2.3), from radio button lists and text fields to widgets that allow the subject to
take pictures, record video, or even record their responses audibly. Regarding the usage of
the tool, MyExperience allows evaluators to define a test using an XML structure. Data is
stored on the phone using SQL Compact Edition and thus can only be consulted after the
experiment ends.
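As a minimal sketch of the general sensor-trigger-action pattern described above (the names and behaviour below are assumptions for illustration only and do not reflect MyExperience's actual XML schema or API):

    from typing import Callable, Dict, List

    # A trigger couples a predicate over sensor readings with an action to run.
    class Trigger:
        def __init__(self, condition: Callable[[Dict[str, object]], bool],
                     action: Callable[[], None]):
            self.condition = condition
            self.action = action

    class ExperienceSampler:
        """Evaluates registered triggers each time a sensor reading arrives."""
        def __init__(self) -> None:
            self.triggers: List[Trigger] = []

        def add_trigger(self, trigger: Trigger) -> None:
            self.triggers.append(trigger)

        def on_sensor_event(self, reading: Dict[str, object]) -> None:
            for trigger in self.triggers:
                if trigger.condition(reading):
                    trigger.action()

    # Example: prompt a survey when a (hypothetical) GSM motion sensor reports movement.
    sampler = ExperienceSampler()
    sampler.add_trigger(Trigger(
        condition=lambda r: r.get("sensor") == "gsm_motion" and r.get("moving") is True,
        action=lambda: print("Prompting 'in situ' self-report survey"),
    ))
    sampler.on_sensor_event({"sensor": "gsm_motion", "moving": True})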

Momento Momento is an ESM tool that provides integrated support for the situated evalua-
tion of ubicomp applications [Carter et al., 2007]. Momento can gather log data, experience
samples and user diaries using a client-server architecture. It features a desktop appli-
cation designed for experiment management and uses the SMS and MMS capabilities of
mobile devices to share information between end users and evaluators. Through
it, evaluators can answer end-user requests, ask participants to capture or
record data, or automatically gather data from the mobile device. To support multiple
evaluations, the tool includes a fixed server to store the gathered data.

Figure 2.3: Screenshots of the MyExperience tool regarding its possible response meth-
ods [Froehlich, 2009]

Communication is handled via text or multimedia messages through HTTP or
SMS/MMS. It is also possible to integrate the client with other applications using the Context
Toolkit [Dey et al., 2001] through the CTK event system and the CTK services system.
Test specification is made through configuration files that are read by the mobile client. A
specification includes the participants, the locations (via Bluetooth IDs) and a set of rules.
The rules can be used to automate certain actions and follow an if [conditional and/or
conditional] then send [content] to [recipients] structure.
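Purely as an illustrative instance of this rule structure (the condition values and content names are hypothetical and not taken from Momento's documentation):

    if [location = home-bluetooth-id and time > 18:00] then send [evening diary prompt] to [participant, evaluator]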

movisensXS MovisensXS is a commercial ESM software package supporting self-reports,
behavior records, or physiological measurements [movisens GmbH, 2015]. The software operates
on Android devices and is coordinated through the web. It claims to support
multiple questionnaire item types such as Likert scales, open input items, geomapping and multi-
media (through video, pictures and audio). The software includes a workflow methodology
for setting up the ESM test (see Figure 2.4) as well as a form editor to design the enquiries.
Answer types are restricted to dates, decimals, geopoints, numbers, text, radio selection
and visual analog scales. At the time of writing, the software only operates through time trig-
gering and does not include other sensors or event selection options.

MetricWire MetricWire [MetricWire, 2015] is a commercial ESM software package similar to
movisensXS. It allows evaluators to design studies using its website and then deploy
them to a set of users on a mobile device. Study creation includes several types of question
formats, allows the specification of the dates on which questions are to be triggered by the
smartphone, and offers the ability to record GPS coordinates.

Figure 2.4: MovisensXS Tool Sampling Screenshot [movisens GmbH, 2015] for specifying
sampling schemas

Maestro Maestro is an ESM tool that aims to enhance ESM solutions by ex-
ploiting long-term user behaviour and usage patterns for ESM triggering [Meschtscherjakov
et al., 2010]. According to the authors, user-behavior- and context-triggered ESM enables
the evaluator to trigger ESM questions based on past user behaviour and to dynamically adapt
the ESM questions whenever the user changes his or her behaviour. At the same time, it also
allows evaluators to comprehensively and flexibly log and monitor user behavior in real time,
together with meaningful context information [Meschtscherjakov et al., 2010].
The Maestro tool follows a client-server architecture. Events are stored at the central
server, thus requiring a constant connection between the client and the server. A rule-based
mechanism allows evaluators to specify event triggers for ESM questions. Communication
between the server and the client is made through XML (for configuration files) and
HTTP/XML (for enquiry data). Configuration can be performed remotely through
the server's web application. Events are captured at the client, transferred to the server
through HTTP via GPRS/EDGE, and evaluated there. If the evaluation of the rule is
positive, the server contacts the client with the corresponding question(s). The client was
developed for BlackBerry and uses its internal web browser. Figure 2.5 showcases some
ESM questions displayed on the BlackBerry device.

Figure 2.5: Example questions on the Blackberry device created by the Maestro ESM
Tool [Meschtscherjakov et al., 2010]

In [Fischer, 2009], Fischer makes a critical review of some ESM tools and indicates
several principles that should be considered when designing an ESM tool. The author
claims that: (1) ESM tools should follow a client-server logic to make use of the growing
connectivity of today's mobile devices and the internet; (2) evaluation creation interfaces
should be accessible to evaluators and not require in-depth programming knowledge; (3)
users have their own devices, and these should be preferred for ESM evaluations; (4) ESM
tools should include different configuration options for the ESM study and not be tied to
a specific area; (5) logging and enquiries should be set up separately on devices;
(6) implementers should be aware of the limitations of client-side devices and their interfaces;
(7) ESM evaluations should be accessible from the server side to allow evaluators to monitor
the progress of the study.

2.3 Support Technologies


2.3.1 Service Oriented Architecture
Due to the heterogeneity of hardware, communication interfaces, operating systems and,
mainly, vendors, the main challenge for today's architectures lies in interoperability. This
concern is particularly addressed by SOA [Huhns and Singh, 2005, Papazoglou and van den
Heuvel, 2007].
Given the popularity of web services, a logical step was the concept of a dis-
tributed, service-only architecture, later called “Service Oriented Architecture”.
In the last few years, SOA evolved from a single concept into a widely accepted
architectural style. In summary, SOA can be described as an architectural model where
applications are “encapsulated” as services, and where communication is provided via a
self-contained communication system inherent to the architecture.
As an architectural style, SOA pursues several objectives. One of them is its inherent
interoperability, which allows any component within the architecture to be remotely
invoked by any potential client. This capability is assured because every component involved
in the architecture must provide a standard interface through which it can be invoked
(using a protocol known by all clients). With it, new components can easily be discovered
and included in the architecture, either on their own or for the construction of new software
systems. These can later also be published and made available as new services, entering an
“infinite” cycle limited only by hardware capabilities.
Another main objective is the aggregation and abstraction of complex business logic
under standardized service interfaces, allowing simpler integration of complex services into
novel applications. This way, business processes are hidden and become irrelevant to the
development of newer applications.
At a conceptual level, a typical SOA architecture is composed of four basic layers
[Thies and Vossen, 2008]:

• Applicational Layer - which may include legacy systems, Customer Relationship Man-
agement (CRM) software, Enterprise Resource Planning (ERP) systems or additional
databases.

• Service Layer - where services are provided on top of the applicational layer. An
important note is that services are normally described using the Web Services
Description Language (WSDL).

• Processing Layer - where services are orchestrated into processes using, for instance,
the Business Process Execution Language (BPEL).

• Presentation Layer - where functionalities are made available to users via desktop or
web applications.
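
To make the layering more tangible, the following minimal Python sketch mirrors the four
layers within a single process. The class and method names (CustomerDatabase, CustomerService,
BillingProcess) and the sample data are assumptions made only for illustration and are not
taken from any cited work.

# Illustrative sketch of the four SOA layers; names and data are invented for the example.
class CustomerDatabase:                      # application layer (e.g., a legacy system)
    def __init__(self):
        self._records = {1: {"name": "Alice", "balance": 120.0}}

    def fetch(self, customer_id):
        return self._records.get(customer_id)

class CustomerService:                       # service layer: a standard, self-contained interface
    # In a real deployment this interface would be described in WSDL and exposed as a web service.
    def __init__(self, backend):
        self._backend = backend

    def get_customer(self, customer_id):
        return self._backend.fetch(customer_id) or {}

class BillingProcess:                        # processing layer: orchestrates services (cf. BPEL)
    def __init__(self, customers):
        self._customers = customers

    def invoice_total(self, customer_id):
        return self._customers.get_customer(customer_id).get("balance", 0.0)

if __name__ == "__main__":                   # presentation layer: a trivial client application
    process = BillingProcess(CustomerService(CustomerDatabase()))
    print(process.invoice_total(1))          # -> 120.0

In a real SOA deployment, the service layer would be invoked over the network through its
published interface rather than called directly, but the separation of concerns is the same.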

The implementation of SOA generally implies a set of good-practice principles
[Bieberstein et al., 2005]:

• Reusability, granularity, modularity, composition - reusable services due to the ex-
istence of other scenarios; individual and autonomous modules regarding the pro-
cessing of a certain instruction; creation of complex services either by composition
or orchestration of other services.

• Standards compliance - to assure the interoperability of services.

• Identification, categorization, provisioning and monitoring of services - through which
searching for services and detecting their anomalies become easier.

According to the same author, other more concrete principles should also be considered,
such as the separation of business logic from base technology, the reuse of business logic
whenever necessary, lifecycle management or the efficient usage of system resources.

2.3.2 Context Modeling


Context-awareness has become an important concept within computer science since
Weiser first presented the term "pervasive" in 1991 as "the seamless integration of devices
in our everyday life" [Weiser, 1991]. Similarly to user modeling systems, context-aware
systems focus on adaptability. They base their operation on the current surroundings and
can change their modus operandi without explicit user intervention. Their objective is
to increase usability for the user and the system's effectiveness [Strang and Linnhoff-Popien,
2004]. A number of definitions for context exist in the literature. Some point out specific
terms such as location, nearby people, objects, identity or temperature. Others mention
date and time, emotional state or focus of attention. Schilit et al. [Schilit et al., 1994]
claimed that the most important aspects of context are: where you are, who you are with
and what resources are nearby. A perhaps more general and accurate definition is provided
by Abowd [Abowd et al., 1999]:

“Context is any information that can be used to characterize the situation


of entities (i.e., whether a person, place or object) that are considered relevant
to the interaction between a user and an application, including the user and
the application themselves.”

Later on, in the same article, Abowd would consider a context-aware system to be a
system "that uses context to provide relevant information and/or services to the user, where
relevancy depends on the user's task" [Abowd et al., 1999]. In this regard, the definition of
Fickas et al. [Fickas et al., 1997] is perhaps more practical, defining context-aware systems
as "applications that monitor changes in the environment and adapt their operations
according to predefined or user-defined guidelines".
The first contextual application dates back to 1992, when [Want et al., 1992] created the
Active Badge Location system. The system was based on infrared technology to determine
the user's location and had the objective of forwarding telephone calls to a phone near
the user. Since then, a wide number of contextual applications have been researched
and developed. However, most can be catalogued under a small number of architectural
patterns.

2.3.2.1 Architectural Principles


While context-aware systems can be implemented in many ways, they differ according
to the location of sensors, the number of users or even the future planning for the system.
The method of data acquisition, however, normally falls into one of three types [Chen, 2004]:

• Direct Sensor Access - client software gathers the information directly from sensors.
This means that sensors become "embedded" in the application and can only be
accessed by it. Concurrency is very hard to achieve due to this fact, which makes
this an improper choice for distributed systems.

• Middleware Infrastructure - a middleware layer is used on top of the sensors, separating
business logic from data-acquisition mechanisms. This facilitates modifications of
sensor acquisition properties without changing clients and vice-versa, while also
allowing concurrent access.

• Context Server - the use of a server implies a client-server paradigm. Contextual
data is sent to the server and consulted on demand by the clients (by polling or
subscription). Additionally, this method allows for high levels of concurrency and
makes modifications easier. A minimal sketch of this approach follows the list.
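
The following minimal Python sketch illustrates the context-server style with both access
methods mentioned above, polling and subscription. The ContextServer class and its method
names are assumptions made for this example and do not correspond to any cited system.

# Minimal sketch of the context-server acquisition style (single in-process server).
from collections import defaultdict

class ContextServer:
    def __init__(self):
        self._values = {}                              # latest value per context type
        self._subscribers = defaultdict(list)          # callbacks per context type

    def publish(self, context_type, value):
        # Called by sensors (or sensor middleware) to push new contextual data.
        self._values[context_type] = value
        for callback in self._subscribers[context_type]:
            callback(context_type, value)              # notify subscribed clients

    def poll(self, context_type):
        # Clients check the current value on demand.
        return self._values.get(context_type)

    def subscribe(self, context_type, callback):
        # Clients register to be notified whenever a value changes.
        self._subscribers[context_type].append(callback)

server = ContextServer()
server.subscribe("room.temperature", lambda t, v: print("update:", t, "=", v))
server.publish("room.temperature", 21.5)               # triggers the subscription callback
print(server.poll("room.temperature"))                 # 21.5, obtained by polling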

A common trait within distributed contextual architectures is the implementation of
layered systems, separating components by stages [Baldauf et al., 2007]. An abstract view
of these layered systems is given by Ailisto [Ailisto et al., 2002], who described contextual
systems as typically comprising five layers.
Layer one represents the Sensing Layer, that is, the obtainers of data. Note that sensors
can be more than hardware devices, possibly being other applications or services that pro-
vide information on demand. Sensor examples include devices such as cameras, microphones,
accelerometers, motion detectors, thermometers and biosensors, among others.
The second layer, the Data Retrieval Layer, represents the first software component of the
framework. Its objective centers on obtaining raw data from the sensors. If the sensors
are applications or services, then this layer establishes itself as a client for those.
The third layer, Pre-Processing, depends mainly on the granularity of the information.
Data from sensors may sometimes come with irrelevant information which may need
some "parsing" before being made available. Also within the responsibility of this layer are
cases where data is minimal and needs to be compiled or joined for application purposes.
Finally, the pre-processing layer may also be used to attach extra information (e.g., tags)
to the data, which may be necessary to avoid future conflicts with other sources.
The fourth layer, Storage and Management, prepares the data for consumption by mak-
ing it available via an interface to the client. Two main methods are commonly imple-
mented to allow access to information: by polling - clients regularly check for updates -
or by subscription - clients subscribe to a certain resource and are notified when updates
occur.
Finally, the fifth layer represents the Application Layer, where the context data is
actually consumed by applications or services.
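
As an illustration of this five-layer view, the small Python sketch below pushes one
(simulated) temperature reading through the layers; the sensor format, field names and
the Storage class are assumptions invented for this example.

# Illustrative pipeline following Ailisto's five layers; everything here is made up for the example.
def sensing():                              # layer 1: a (simulated) hardware sensor
    return "temp=21.73;unit=C;raw=0x5A"

def retrieval(raw):                         # layer 2: obtain raw data from the sensor
    return dict(item.split("=") for item in raw.split(";"))

def preprocessing(record):                  # layer 3: drop irrelevant fields, round, tag the source
    return {"temperature": round(float(record["temp"]), 1),
            "unit": record["unit"], "source": "living-room-thermometer"}

class Storage:                              # layer 4: storage and management (polling interface)
    def __init__(self):
        self._latest = None
    def update(self, value):
        self._latest = value
    def poll(self):
        return self._latest

storage = Storage()
storage.update(preprocessing(retrieval(sensing())))
print(storage.poll())                       # layer 5: the application consumes the context data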

2.3.2.2 Representation Models


A representation model for context is used to describe and store context data in a pro-
cessable form. The development of flexible and usable context ontologies is a challenging
process [Baldauf et al., 2007]. However, it is possible to summarize the most relevant con-
text modeling approaches based on their data structures and exchange of information
[Strang and Popien, 2004]:

• Key-Value Models. These models represent the simplest data structure for modeling
contextual information. They are often used to describe the capabilities of services
within distributed frameworks. Matching algorithms are used by service discovery
methods to find key-value pairs (e.g., CAPEUS [Samulowitz et al., 2001]). Their main
advantages are a simple representation and easy maintenance. A minimal sketch of
this approach is given after this list.
• Markup scheme models. All markup scheme modeling approaches use a hierarchi-
cal data structure comprising markup tags with attributes and content. Profiles
represent typical markup-scheme models. Some are defined as extensions of the Com-
posite Capabilities / Preferences Profile (CC/PP) [W3C, 2007] standard, possessing
Resource Description Framework (RDF) encoding and eXtensible Markup Language
(XML) serialization. One such example is the Comprehensive Structured Context
Profile (CSCP) [Indulska et al., 2003] shown in Figure 2.6. Other examples can be
found in [Strang and Popien, 2004].

Figure 2.6: CSCP Profile Example [Indulska et al., 2003]

• Graphical Models. The Unified Modeling Language (UML) is a type of graphical
representation suitable for representing context due to its generic structure. Several
approaches exist where UML is used to model contextual aspects, such as [Sheng and
Benatallah, 2005]. Another graphical modeling example is based on the extension of
the Object-Role Modeling (ORM) format.
• Object oriented models. The use of object-oriented techniques to model context in-
stantly provides a number of powerful features such as encapsulation, reusability or
inheritance. Objects are used to represent context types (such as location or noise),
and the details of context processing are encapsulated at the object level and hence
hidden from other components. One such example is Hydrogen [Hofer et al.,
2003], specialized for mobile devices and inspired by a decentralized approach where
context is divided into local (context values relative to the client itself) and remote
(context values from other devices) types.

• Logic based models. Logic-based models rely on facts, expressions and rules to
define the context model. Normally, a logic system is used to manage these terms,
adding, removing or changing the existing facts. Existing systems are usually built
upon Prolog (to facilitate reasoning on the facts), but more "theoretical" examples
also exist, based for instance on first-order logic.

• Ontology-based models. Ontologies are a common method for specifying concepts and
the relationships between them. Due to their formal inclination, they are a good
method for modeling contextual information. Several context-aware frameworks and
systems use ontologies as their representation model, such as COBRA [Chen et al.,
2003], which uses an OWL-based ontology, named COBRA-Ont, exemplified in
Figure 2.7.

Figure 2.7: COBRA-Ont List of Classes and properties [Chen et al., 2003]
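
The sketch referred to in the key-value bullet above is the following minimal Python
example of key-value context descriptions and the exact-match lookup typically used by
service discovery. The registry contents and key names are invented for illustration.

# Minimal key-value context model with exact-match discovery; data is made up for the example.
service_registry = [
    {"service": "print", "location": "room-101", "color": "yes"},
    {"service": "print", "location": "room-204", "color": "no"},
]

def matches(description, query):
    # True if every requested key-value pair is present in the description.
    return all(description.get(key) == value for key, value in query.items())

def discover(query):
    return [d for d in service_registry if matches(d, query)]

print(discover({"service": "print", "color": "yes"}))   # -> the room-101 printer only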

2.3.3 User Modeling


User modeling techniques are mainly used to provide user adaptation in applications
[Kay, 2000]. Despite all research, there is no uniform definition of what a user model
really is. Nonetheless, Kobsa's [Kobsa, 2007] definition of a user model is probably the
most widely accepted:

“A user model is a knowledge source in a natural-language dialog system which
contains explicit assumptions on all aspects of the user that may be relevant to the
dialog behavior of the system. These assumptions must be separable by the
system from the rest of the system’s knowledge.”

There is today a large number of systems which use some level of adaptation in or-
der to increase usability, normally supported by some kind of user analysis and reasoning.
Kay [Kay, 2000] enumerates a broad range of systems, from advisors or consultants to
fully-fledged systems that adapt the interaction to the user's preferences, goals, tasks,
needs and knowledge.
In order to compose a user model, several techniques exist and are applied at different
stages of development. In her work, Kay [Kay, 2000] enumerates four main techniques for
this purpose.

Elicitation of User Modeling Information This is the most direct way of obtaining
user information. Whenever the system aims to obtain information about the user, it
simply asks him. For this purpose, several methods can be used, such as direct question
and answer, forms or a simple wizard tool. In a sense, this method is better suited for
applications which contain an exclusive user model, where the questionnaires are aimed
at that specific application.

Modeling Based Upon Observing Another method is to compose the model through
user observation. The main advantage of this approach is the ability to collect large
quantities of data without disturbing the user. One can argue that capturing data while
the user is interacting with the application leads to a more genuine evaluation, in the
sense that the user will not stop the interaction to answer a questionnaire. On the other
hand, the evaluation may also be biased by the person who is analyzing the interaction,
later leading to errors. However, this can be countered by using multiple observers and
later merging their impressions into a more solid and trustworthy report.

Stereotypes It is expected that increasingly individualized adaptation of interaction will
require detailed and sophisticated user modeling [Kay, 2000]. Creating models from scratch
is an expensive investment. Stereotyping is a valuable solution applied by known systems
like GUMS [Finin and Drager, 1986], BGP-MS [Kobsa and Pohl, 1994] or the um toolkit.
Due to their relevance, some of these will be described later. In truth, the concept of
stereotyping dates back to Rich's work [Rich, 1979], which used people's descriptions of
themselves to deduce the characteristics of books that they would probably like. Its
execution is based on the principle that, depending on the applied stereotypes, the
application changes its behaviour.

20
A simple example may be given by conceptualizing the stereotype "bad sight". If
a user has difficulties in seeing, then he is "labeled" with this stereotype. Now, imagine
that an application possesses a routine that increases the font size if the user belongs to
that stereotype. When interacting with the application, the user will now experience a
larger font compared to other users who do not share this stereotype, leading to
"user-adaptation".
Stereotyping is perhaps the most used technique when developing a user model for ap-
plications. This technique involves establishing specific groups in which users are placed
depending on certain characteristics. These groups are based on categories (or events)
established by the developer, to which applications will react differently.
In her work, Kay [Kay, 2000] divides the essential elements of stereotyping into three
parts (a minimal sketch of these mechanisms follows the list):

• triggers - activate the stereotype; if the user has "bad sight" then it triggers that
stereotype;

• inferences - the consequences of a stereotype; in the case of "bad sight" this leads to
a bigger font or a smaller resolution;

• retraction - in case a stereotype is no longer valid, this mechanism is capable of
disabling the applied inferences for that user. Returning to the case of the "bad
sight" stereotype, simply imagine a scenario where the user puts on glasses. In
this case, the application should return to the normal font size and/or resolution.
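
The following minimal Python sketch illustrates the three mechanisms listed above using
the "bad sight" example from the text; the data structures and default values are
assumptions of this illustration, not taken from any of the cited systems.

# Sketch of trigger / inference / retraction; structures and values are invented for illustration.
STEREOTYPES = {
    "bad sight": {
        "trigger": lambda user: user.get("vision") == "impaired",
        "inferences": {"font_size": 20, "resolution": "low"},
    },
}
DEFAULTS = {"font_size": 12, "resolution": "high"}

def apply_stereotypes(user):
    # Evaluate every trigger and apply the inferences of the stereotypes that fire.
    settings, active = dict(DEFAULTS), []
    for name, stereotype in STEREOTYPES.items():
        if stereotype["trigger"](user):                 # trigger
            settings.update(stereotype["inferences"])   # inferences
            active.append(name)
    return settings, active

user = {"name": "Maria", "vision": "impaired"}
print(apply_stereotypes(user))      # larger font while "bad sight" is active

user["vision"] = "corrected"        # retraction: the user puts on glasses
print(apply_stereotypes(user))      # settings fall back to the defaults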

Normally, when stereotyping is applied to provide user-adaptation in an application,
the starting information is minimal and based on default assumptions. As the user interacts
with the application, starting assumptions may become invalid or simply be complemented
by new assumptions, which implies a dynamic process. In addition, the process of
identifying stereotypes is also dynamic, using collected data together with machine learning
techniques or statistical tools to identify new ones [Kay, 2000].

Knowledge-Based Reasoning This technique is often applied in conjunction with
stereotyping. Based on collected data about the user, reasoning algorithms can be
applied in order to extrapolate new information. For instance, if a user indicates that he
does not have speakers, then it is possible to infer that producing sound is ineffective, since
having speakers is a prerequisite for producing any sound.
Some of these techniques are often used together when establishing a new user model
system. Initially, the user may be asked some questions to establish a default "ground" for
the system. Based on this questionnaire, reasoning algorithms may be applied in order to
extract new inferences about the users. Finally, stereotyping is used to group users
according to specific characteristics or application changes.
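
A minimal sketch of such reasoning, using the speakers example above, could look as
follows in Python; the fact names and the rule format are assumptions of this illustration.

# Tiny rule-based inference over elicited facts; fact and rule names are made up for the example.
facts = {"has_speakers": False, "has_screen": True}

rules = [
    # (precondition over the facts, inferred fact, inferred value)
    (lambda f: f.get("has_speakers") is False, "audio_output_effective", False),
    (lambda f: f.get("has_screen") is True, "visual_output_effective", True),
]

def infer(facts, rules):
    inferred = dict(facts)
    for condition, key, value in rules:
        if condition(facts):
            inferred[key] = value
    return inferred

print(infer(facts, rules))   # audio output marked ineffective, visual output effective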

2.3.3.1 Related Systems


According to the work of [Kobsa, 2007], systems that include user data modeling are
generally divided into two types, shell systems and server systems:

Shell User Model Examples For Kobsa [Kobsa, 2007], "user model shell systems"
are systems that are integrated into applications, dependent on them and, in most cases,
embedded into the very logic of the corresponding application, so that a clear distinction
between the two does not exist.
Examples of "shell" modeling systems can be found in the works of [Huang et al.,
1991], [Kono et al., 1994], [Brajnik and Tasso, 1994] or [Kay, 1994], the last two being
examples of user modeling based on stereotyping - matching users to previously defined
profiles [Kay, 2000]. Brajnik's UMT [Brajnik and Tasso, 1994] allows the user model
developer to define hierarchically ordered stereotypes as well as rules for user model
inference and contradiction detection. When new information about the user is received,
it is classified by UMT based on a set of premises or assumptions. Based on these,
stereotypes may be triggered and their contents added to the user model. Thanks to these
stereotypes, the application will be able to adapt itself to this particular type of user.
PROTUM [Vergara, 1994] is an approach similar to UMT, including more sophisticated
stereotypes.
While not being a "shell" user model system in a strict sense, um [Kay, 1994] is a toolkit
for user modeling that represents assumptions about the user using attribute-value pairs
that may describe the user's preferences, beliefs or other information considered relevant.
In order to evaluate assumptions, um tags each pair with a list of evidence for its truth
and falsehood. At runtime, interpreters are used to evaluate the evidence and compose
conclusions.
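
The idea of evidence-tagged attribute-value pairs can be sketched as follows; the evidence
sources, weights and the naive resolution rule are assumptions made for illustration and do
not reproduce um's actual interpreters.

# Sketch of attribute-value components tagged with evidence for and against them.
user_model = {
    "prefers_speech_output": {
        "for": [("questionnaire", 0.9)],            # (source, weight) pairs
        "against": [("observed_keyboard_use", 0.4)],
    },
}

def resolve(component):
    # A naive interpreter: believe the component if supporting evidence outweighs the rest.
    support = sum(weight for _, weight in component["for"])
    opposition = sum(weight for _, weight in component["against"])
    return support > opposition

print(resolve(user_model["prefers_speech_output"]))   # -> True under this evidence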

Server User Model Examples In truth, the work by [Finin and Drager, 1986] is
considered to be the starting point for application-independent user modeling systems.
GUMS is a Prolog-based system aiming to compose long-term models of individual users.
Its objective is to provide a well-defined set of services for an application system interacting
with various users. As the application system interacts with the user, it simultaneously
acquires information on the user and maps it to a user model maintenance system for
incorporation. The application provides new facts about the user, which are then checked
against existing assumptions, generating new assumptions about the user that may later
be queried by the application.
Server models are nowadays much more common than their alternative. In fact, given
their success, numerous commercial user model servers are currently available to develop-
ers. This comes as no surprise given the numerous advantages this model offers [Kobsa,
2007]:

• User information is at the disposal of more than one application at a time;

• Applications may use information acquired by other applications, leading to a sub-
sequently better perception of the user;

• Information about the users is stored in a non-redundant manner;

• Other information dispersed across the enterprise (past purchases, demographic
data) can be more easily inserted into the user model repository.

On the other hand, a major disadvantage comes from the fact that this model is based
on a central server, accessible only through the network which may lead to availability or
performance issues. In order to diminish these problems, redundancy is often used.
The server system “Doppelganger” [Orwant, 1994] gets all its information by means
of sensors (either software or hardware). In order to analyze this information, developers
are able to use several techniques such as beta distributions, linear prediction or Markov
models. Individual user models are collected into so-called "communities", a concept
similar to stereotyping. It differs from traditional stereotyping in that it is, in a sense,
dynamic, since it is based on probabilities [Kobsa, 2007].
Personis [Kay et al., 2002] is an "extended server" version of um, using the same base
components residing in an object data layer over a database. User models are hierarchically
ordered contexts structured over the object database. The authors distinguish two basic
operations upon this representation: accretion - collecting evidence about the users - and
resolution - interpreting the collected evidence.
A more concrete example is KnowledgeTree [Brusilovsky, 2004], an adaptive education
system for students which includes user model functionality by collecting evidence from
students obtained through their interaction with multiple servers. The activities performed
by students are stored and processed by agents which analyze the flow of events and update
the model. Each agent is responsible for a specific property such as motivation level or
level of knowledge for a specific course [Kobsa, 2007].
BGP-MS [Kobsa and Pohl, 1994] is another user model server system, representing
assumptions about the user via stereotyping techniques based on first-order predicate logic.
Different assumption types such as beliefs and goals are represented in different partitions
which are hierarchically ordered to exploit inheritance [Kobsa, 2007].

2.3.4 Ontologies and the Semantic Web


According to [Studer et al., 1998], an ontology can be defined as "a formal, explicit
specification of a shared conceptualization", composed of a set of entities (e.g., objects,
concepts, relations) that are assumed to exist in a particular domain. It is formal because
it is supported by unambiguous formal logics, explicit because it makes domain assumptions
explicit for reasoning and understanding, and shared because it reflects a consensus.
An ontology can be characterized according to several dimensions, two of which are
particularly relevant: formality and granularity. Regarding formality, ontologies can be
informal, structurally informal, semi-formal or formal [Silva, 2004]. Informal knowledge
representation mechanisms, like natural language, are normally associated with human
readability. They might, however, become highly ambiguous. At the other end, with a more
formal approach, knowledge representation mechanisms become less ambiguous and more
machine readable, but less human readable. The more common languages used in semi-
formal and formal ontologies are based on description logics and first-order logics.

Regarding granularity, ontologies can be fine-grained (or offline) or coarse-grained (or
online). Fine-grained ontologies include a more detailed description of the knowledge do-
main, while coarse-grained ontologies tend to be more abstract [Silva, 2004]. To choose
one, it is necessary to counterbalance accuracy and computational complexity. When accu-
racy is required, fine-grained ontologies are better suited but require more computational
resources. When usability is more important, coarse-grained ontologies are often chosen,
the Semantic Web being one of those cases.

2.3.4.1 The Semantic Web


From its birth, the Web evolved into a human-oriented technology [Tijerino and Al-
muhammed, 2004]. Its original vision, however, regarded the Web as a machine-oriented
technology as well. In the last decade, for practical, commercial and technological rea-
sons, the need for machine-oriented data on the Web has increased significantly, leading
to the appearance of the Semantic Web. Presently, the Semantic Web is home to
a massive amount of unstructured and semi-structured data, in the form of documents
from the human-oriented Web represented through an expressive and extensible structured
vocabulary. The most popular ontologies are supported by description logics languages
and vocabularies. Vocabulary examples include Resource Description Framework Schema
(RDFS), Ontology Inference Layer (OIL), DARPA Agent Markup Language (DAML), Web
Ontology Language (OWL), OWL 2 and the Semantic Web Rule Language (SWRL).
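
To give a concrete flavour of ontology-style representation, the following hand-rolled Python
sketch stores a tiny set of triples and applies one subsumption rule; the vocabulary mimics
RDFS naming (rdf:type, rdfs:subClassOf), but the code is only an illustration and not a
Semantic Web library.

# Minimal triple store with one subsumption rule; entities and classes are invented.
triples = {
    ("Thermometer", "rdfs:subClassOf", "Sensor"),
    ("kitchen-thermo-1", "rdf:type", "Thermometer"),
}

def types_of(entity, triples):
    # Asserted and inferred classes: if x is a Thermometer and Thermometer is a
    # subclass of Sensor, then x is also a Sensor.
    found = {o for s, p, o in triples if s == entity and p == "rdf:type"}
    changed = True
    while changed:                      # propagate subClassOf until a fixed point
        changed = False
        for s, p, o in triples:
            if p == "rdfs:subClassOf" and s in found and o not in found:
                found.add(o)
                changed = True
    return found

print(types_of("kitchen-thermo-1", triples))   # -> {'Thermometer', 'Sensor'}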

2.4 Summary
In this chapter, we presented the related work on evaluation methodologies, with a
special focus on ESM tools and their characteristics. Additionally, and for a better un-
derstanding of the thesis, a brief description of key technologies and concepts within the
scope of this work was also included.

Chapter 3

A New Evaluation Paradigm

What is an evaluation? The Merriam-Webster on-line dictionary [Merriam-Webster,
2015] describes the act of evaluating as "to judge the value or condition of (someone
or something) in a careful and thoughtful way". But the act of judging is sometimes
subjective. To judge someone or something objectively requires evidence, proof or facts.
The fewer facts exist about something, the more subjective the judging becomes. More
subjectivity and less objectivity implies a higher probability of the judgment being wrong.
An evaluation procedure can be seen in the same light. The more information you possess
regarding a situation, someone or something, the more accurate the evaluation can be.
In certain evaluation scenarios, the amount of gathered data is simply not sufficient.
Evaluators are often left with nothing but a set of predefined answers without any contex-
tual knowledge. They are left with a simple answer to a predefined question and with no
answer as to why the user chose that answer: where he chose it, under which conditions,
after doing what, alone or accompanied, with his full attention or multitasking...
Answers to such aspects can often provide a much deeper insight and leave evaluators
with much more data when reaching conclusions. Each of these situations can in some
cases compromise a simple answer to a simple question. If the user is not focused,
he may choose randomly or without much thought. If the same happens to several users,
not only does the question become compromised, but the whole evaluation results can be
questioned.
The fact that context interferes with an evaluation is undeniable. A human is prone
to being distracted, and a distraction, whether conscious or not, may alter his state of
mind. The longer the evaluation period, the higher the probability of a distraction
occurring. It is impossible to prevent the occurrence of distractions. But knowing which
distraction it was and when it occurred may provide some answers. If the evaluation
is performed on a set of users and similar distractions occur, evaluators can draw
parallels between these occurrences and reach additional information regarding the
evaluation scenario.
Long evaluations not only increase the possibility of distractions. Studies show that
compliance becomes an issue when conducting evaluations for longer periods [Kelly et al.,
2008]. The repetitiveness of a procedure such as an enquiry leads to a lack of enthusiasm
on the part of the user, which can lead to him giving up. However, cases may also happen
where the user wants to comply and cannot, because he does not have access to the
evaluation itself, he is away from the evaluation data gathering point, or because of a
malfunction in which data could not be gathered. The existence of alternatives could in
both cases be a possible solution. Note that many evaluations are performed in the user's
home. Pervasive environments such as the user's home can be used to perform "custom
evaluations" which do not derive from the original objectives but provide more comfort by
means of adaptability to the user.
Issues do not only exist from the user's standpoint but also from the evaluator's stand-
point. Evaluators are normally limited to simple enquiries for most evaluations, despite
wanting more information, which is often insufficient to perform a thorough analysis. In
these cases, the existence of more data, by means of contextual information, could help
evaluators when assessing more specific and less common results. Real-time data can help
provide a "visual picture" which can lead to new and interesting results or at least indicate
the need for further research.
Another issue concerns recruitment. The selection process in regard to the users re-
quires the evaluator to analyze each one personally and provide an assessment. When
contemplating large user groups, this task can take weeks or even months to execute.
It is necessary to provide a unified answer to these issues and allow evaluators to
conduct evaluations in a more dynamic and flexible manner. The next sections detail a set
of requirements that we feel are important for a concept of dynamic evaluation for reactive
environments.

3.1 Requirements for a dynamic evaluation paradigm


What is a dynamic evaluation? What comprises a dynamic evaluation? Contrary to
the term "evaluation", there is no concrete dictionary definition of dynamic evaluation.
But the word "dynamic" implies something that adapts itself, that evolves, that better
suits its objective. In this sense, a dynamic evaluation consists of an evaluation procedure
which adapts itself according to its current status. It may adapt itself based on the
user, his environment, time, evaluator decisions or other factors. In sum, it is flexible and
has the main objective of gathering more and better-structured data and, consequently,
information.
The comprehension of dynamic evaluation in the scope of this thesis involves a number
of subjects: it spans from the creation of the evaluation procedure itself to its conduction;
it involves the adaptability and the flexibility of interaction as well as the evaluation's
semantic domains; it manages the selection of evaluation targets and the distribution of
the evaluation. The main objective is to offer an alternative approach where evaluators
are able to customize the evaluation to what suits them better and create evaluation tests
without constraints imposed by the methodology. We believe that this can be achieved by
providing evaluators with the ability to make choices in regard to their method and to
allow them to apply evaluations that suit the conditions and necessities of the associated
research. As a result, we present a dynamic paradigm that offers customization as its
main statement. In the next sections, we detail the aspects that allow the necessary
customization in the scope of a dynamic evaluation paradigm.

3.1.1 Introducing context into an evaluation


Contextual data is a reality these days. Producers of information are ubiquitous and
can be found almost anywhere. The success of terms like the Internet of Things (IoT) [Atzori
et al., 2010] and the increasing interest from software companies show the importance and
potential of contextual data. Disregarding this fact is to leave valuable and collectible
information out and thus diminish the possibilities of an evaluation procedure.
Contextual data can be materialized by a wide number of things. It can represent
aspects such as temperature, noise or brightness. It can represent location, such as GPS
coordinates or the room of a house. But it can also represent the number of people within
a room or the moment when the user left the house. Conceptually, its scope is limitless.
In this sense, making use of this information when performing an evaluation seems obvious.
Naturally, a number of technical difficulties may arise, but we will tackle them later on in
Chapter 5. For now, let us focus on what contextual data can provide to an evaluation test.
First of all, contextual data by itself is valuable. The mere analysis of certain data
can help evaluators when passing judgment on evaluation results. Knowing what the
user did before answering a given question can help explain his decision. In the case
that multiple users share similar circumstances, a pattern can be observed that can lead
to new research questions. Second, contextual data can be used to detect uncommon
situations. If the evaluator possessed some sort of control over contextual data, this data
could be used to interrogate the user on the spot. This would consist of linking contextual
data with interrogation practices, leading to real-time data collection. Third, contextual
data can be used for interaction purposes. Adapting the interaction to the user can be
performed by taking context into account. By doing so, the user could be more comfortable,
thus diminishing his resistance, especially during longer evaluation scenarios.
Focusing only on the user's answers and ignoring the entire decision-making process
that led to them can prove fruitless since, in certain research areas, the decision-making
process can be as important as the answer itself.

3.1.2 Supporting environment heterogeneity


With the appearance of concepts like Ambient Intelligence or Pervasive Computing,
common evaluation locations like homes or work environments are filled with information
producers. These environments are heterogeneous, differing from one another and often
appearing to be incompatible. While they might appear so, environments may simply
possess different methods to fulfill an evaluation.
Enabling an evaluation scenario to be applied to two or more different environments is
feasible. For instance, a question may be fulfilled differently by two environments in regard
to the interaction modality. The diversification that this causes provides evaluators with
new, comparable data, in which different users, in different settings, answered a question
that was posed differently. In some situations, an environment and its user might even
behave differently from another, which leads to a certain test element being posed to one
but not to the other. We believe that this is acceptable and desirable as options of a
dynamic evaluation paradigm.
a dynamic evaluation paradigm.
The conjunction of context and environment heterogeneity can help the evaluator to
reach new and interesting results since it allows him to test a given subject in different
conditions and observe changes and reactions of the subject. Conducting the same test on
different environments can provide valuable information in regard to the adaptability and
overall feel of the different conditions of the test.
Note that we do not intend to dispute the necessity of generic evaluation testing, as it
is still a major part of most evaluations performed. We simply intend to provide choice
for those who can go beyond those boundaries and conduct evaluations in a more diverse
manner.

3.1.3 Providing adaptability and redundancy in interaction


Evaluations are mostly performed through a single medium and via a single modality.
But solely focusing on one interaction point for gathering information can be a serious
mistake. It can provoke delays, skipped assessments and a lack of enthusiasm, and
consequently lead to quitting or, worse, a disinterested user. If the user becomes less
focused on the evaluation, he may resort to quick answering, which can lead to reliability
issues in the gathered information and the overall evaluation conclusions.
A dynamic evaluation should be able to make use of whatever exists and interact with
the user in the best possible manner. This does not necessarily imply something like a
brute-force method, but rather a more intelligent approach to interaction. With the
popularity of speech and the appearance of diverse interaction methods which range from
gestures to facial recognition, evaluators should be able to conduct diversified evaluations.
Providing different interaction methods can improve enthusiasm, thus keeping the user
more engaged in longer evaluations.
In addition, redundancy and diversity can be a strong support for usability, often an
issue in some research areas. Interaction could and should make use of the environment
itself and, with the usage of contextual data, adaptability can be built into the modalities
themselves. Such a feature would allow for a better assessment of the user according to the
current conditions and, in a sense, improve the user's comprehension and overall experience.
Note that adaptability can be applied not only in regard to context but also in regard to
the user himself. By focusing on the user's preferences, disabilities or limitations, interaction
can become much easier for specific evaluation sessions. In areas like Ambient Assisted
Living, where the focus is on elderly users, this adaptability can help the user cross the
technology barrier which is often present.
Note that it is the responsibility of the evaluator to judge if results from different
interaction modalities should be used. The importance here is once again choice. If the
evaluator intends to conduct an evaluation through a certain modality, he should be able to
do so. But if he intends to gain information in whatever manner, if he intends to provide
maximum adaptability or usability, or if he simply intends for diversity, the evaluation
paradigm should allow him to do so as well.

3.1.4 Allowing reusability by introducing semantic value


Every evaluation is applied to a user following a specification/procedure. In the case of
an enquiry, the specification consists of a series of questions, which by themselves possess
no extra meaning. In other words, they are merely questions with a certain content, like,
for instance, a Likert-scale question or an open-answer question. Note, however, that in
these examples their designation (e.g., open question) is related to the type of answer that
the question requires and not so much to what the question represents in terms of research.
Embedding semantic information opens a different perspective on the ability to reuse
specifications. By naming evaluation elements in accordance with their research areas, we
become able to create relationships between different elements and create specifications
that possess meaning by themselves, rather than being a mere representation of the
procedure. On a large scale, they can enable the creation of sets of evaluation elements
that are catalogued according to their research area. With them, evaluators become able
to search and analyze existing specifications and either integrate or expand them into their
own evaluations. This level of reusability can lead to the creation of evaluation domains
targeted at specific areas of research, where it might even be possible to extrapolate results
across multiple evaluations.
Simply introducing semantics is obviously not sufficient if strong technological support
does not exist. However, by looking at an evaluation the way we look at research, a lot
can be learnt. Research is a continuous process. It starts with a few axioms and evolves
into complex assessments and specifications. An evaluation specification should be similar.
Reusing evaluation specifications could help evaluators in conducting correct, peer-validated
evaluations. Evaluators still spend large amounts of time designing evaluation specifica-
tions, but if they become able to reuse or even extend an existing one, this effort could
be reduced.

3.1.5 Simplifying the distribution and execution of evaluations


Conducting geographically spread evaluations can also be fairly difficult. Due to the
heterogeneity of environments, setting up and preparing each environment to receive an
evaluation is a time-expensive task, one that grows with the number of participants.
Selecting the participants is another time-expensive task, one that grows if the evaluation
is highly restrictive regarding certain characteristics of the participants. And conducting
the evaluation itself often requires logistical effort in setting schedules that the user is
comfortable with. To facilitate the evaluator's job, and allow him more time to analyze and
design the evaluation itself, it is necessary to remove some of these logistical efforts from him.

The evaluation process should be based on a systematic approach; one that establishes
a distinction between the evaluator and the targets of evaluation; one that allows the
evaluator to specify the test and the method of execution and simply distribute the test to
a set of users. In a way, the evaluation should function much like a computer program,
something that is delivered and executed by the user remotely. By borrowing ideas from
concepts like Crowdsourcing [Brabham, 2008], it becomes possible to look at users as
resources to whom evaluations can be delivered automatically. Evaluators would become
able to define certain selection criteria as well as quickly define user groups to whom they
could apply the evaluation. The main challenge here is to allow a dynamic approach, since
every environment can be distinct.
So far, we have identified key issues associated with general evaluation practices. In the
next section we will further explore these issues by presenting the main cornerstones on
which our "dynamic evaluation" proposal is based.

3.2 A proposal for Dynamic Evaluation


An evaluation can be seen as a procedure applied to a user under certain conditions. In
this process, the user to whom the evaluation is being applied is subjected to a series of steps
in which he provides information regarding certain topics. A dynamic evaluation follows the
same rules, but adds a new dimension by introducing concepts like adaptability, user choice
and heterogeneity. The steps that compose the evaluation are no longer absolute: they
become flexible, they can be performed differently or they can even not be performed at
all. Evaluations cease to follow a generic pattern and adopt a dynamic nature in which the
evaluation test is reactive to the user and his associated environment.
Note that, while a more dynamic nature is our main proposal, the generic approach is
not put aside. Generic testing is still available in the proposal, but we offer more options
to the evaluator, which can materialize in more and better contextualized data. For this,
we anchor our approach on four key areas: focusing on the user, considering the environment,
enhancing the definition of evaluations and automating the process.

3.2.1 Focusing on the user


The user is the subject of an evaluation and commonly the main source of information.
Seeing each user as an indistinguishable element of a set is the same as applying the old
evaluation methods where the process, rather than the user, is the focus of the evaluation.
To conduct a more dynamic evaluation, it is necessary to change this focus by considering
each user as an individual. And, in order to do this, it is important to be able to characterize
the user based on his features, interests and preferences.
In our design, every user is singular and represents a person. Associated with each user
is a set of characteristics with no limitations: it can include information as diverse as user
demography, job history, travel log or patrimony. These aspects are associated with the
user and not with the evaluation. The user characterization can even grow as the user
participates in evaluations, where certain patterns or behaviors can lead to user modeling.
By characterizing the user, it becomes possible to conduct the same evaluation process
and obtain different executions. Evaluations can include specific evaluation items which
are only triggered if the user possesses certain characteristics. It becomes possible to
conduct user-friendly interactions by adapting the interaction to the user, and even to
facilitate the task of recruitment. By using information such as interests or characteristics,
it becomes possible to create and establish groups which share common features, thus
helping the evaluator when searching for participants.
It is important to note that we are disregarding ethical constraints here and assuming
the user's cooperation and permission in the process.

3.2.2 Considering the environment


The environment embodies the surroundings of the user. While the user is the main
source of information, he is not the only one. Around the user, other information producers
exist. By gathering that information, we enable a contextualization of the user's own
information while he participates in an evaluation. Especially in technologically advanced
environments, information can be obtained from a wide set of producers, from applications
and external systems to other people. Pervasive or ubiquitous environments are rich in
sensors or adapters which can easily provide information if necessary.
Information surrounds the user. It may not surround him at all times but, when it
does so, it can produce changes in his behaviour. When applying an evaluation test,
these changes can produce distinct and untraceable results. Associating this contextual
information with the trial can provide new valuable information. Methodologies often
label certain answers as outliers and remove them from the evaluation conclusions. With
contextual data, perhaps these outliers can become something more and either lead to new
conclusions or new research questions.
Based on these premises, we established the environment as a fundamental design
element. We do not look at the environment as a house, a workplace or any specific
location; we consider all of these as part of the environment. An environment represents
an abstract circle of information which surrounds a user (illustrated by Figure 3.1). It can
comprise a house, a workplace, a cellphone, a car, the temperature, a GUI's brightness, a
person, anything that is associated in some manner with the user. It must, however, have
at least one well-defined producer.
Limitations can and should be placed. While information surrounds the user at all
times, an evaluation is created by an evaluator, and it is logical to entrust him with the
responsibility of selecting what type of information should and should not be considered.
This circle of information is associated with the user, but a subset can be selected by the
evaluator for one evaluation. Two users represent two circles of information which may
have similar elements. We will see later on how to deal with this issue.
By now, we have two elements associated with one another, a user and its environment.
None of these have yet any association with an evaluation. Together, they represent a

possible recipient of an evaluation.

Figure 3.1: User's Circle of Information

3.2.3 Enhancing evaluation definitions


To obtain data from a user, two choices are possible: gathering it directly from him or
obtaining it from his surroundings. By questioning the user, the evaluator obtains concrete
data regarding a specific subject. By obtaining data about the user's surroundings, the
evaluator obtains data related to the user's context. In a sense, while one is primarily
focused on directly querying the user, the other is focused on a perceptual analysis of what
surrounds him.
Feedback represents the most direct type of assessment. It can be applied anywhere,
as long as there is a mechanism that can receive the data. In its most common format, it
is based on enquiries that pose a series of questions to the user, to which the user should
respond. Depending on the type of test, a question can be open, thus allowing the user
to answer whatever he wants, or it can be closed, with multiple choices or scales. Feedback
can also assume another format, like allowing the user to submit information on his own
without any questioning, much like a diary.
Contrary to feedback, ambient data is based on a passive type of interaction. In most
situations, the user produces data which is collected without any direct interaction with a
user interface, and often unconsciously. Examples of this method are, for instance, sensors
within pervasive environments.
In our opinion, an evaluation must contemplate both feedback and ambient techniques.
It must allow enquiries to be triggered at specific timings and it must also allow the
collection of data from the user’s environment. Above all, it must allow both to be mixed
together. By combining both, it becomes possible to directly question a user following
the detection of a certain event. This produces properly contextualized data supported by
actual feedback, which can be very valuable with regard to the conducted research. Note
that the same evaluation can be applied to different users, possibly resulting in different
applied evaluation items and different collected data from user to user.
The objective of an evaluation is information; not random but specific information
which provides answers to research questions. When and how to obtain information can
only be set by the evaluator, since the content of the test as well as its moments of
execution and overall scheduling are his responsibility. All of these aspects constitute the
evaluation's specification. In our approach, the specification of an evaluation plan is
composed of evaluation assessments. An evaluation assessment is an execution flow which
consists of a series of evaluation elements linked to one another in a specific order. The
flow is oriented, and its elements are executed one by one, based on their order of
completion. Each element of the assessment can be either an atomic operation, like an
event generated by a sensor, or a non-atomic operation, like a question within an enquiry.
Every assessment executes and terminates according to a given schedule.
There are many advantages to this approach, but first and foremost is the fact that
the evaluator is offered choice when designing the evaluation. He can either design an
evaluation with elements that are common to all involved users, or he can create a more
global set of assessments of which some will be executed and others will not. Assessments
allow him to design an evaluation which can range from fixed, timed enquiries to fully
ambient-dependent execution flows. Figure 3.2 illustrates an example of an evaluation
which includes two evaluation assessments. The first shows a flow in which an enquiry
(Enquiry 1) only occurs when an event (Event X) is detected, featuring, for instance, an
ambient-dependent example. The second links another enquiry (Enquiry 2) to a specific
time period. These examples show the ability to associate different evaluation instruments
to create fine-tuned data gathering situations.

Figure 3.2: Exemplification of an evaluation consisting of two evaluation assessments
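
The two assessments of Figure 3.2 can be sketched as simple execution flows, as in the
following Python illustration; the element classes, the event name "Event X" and the
triggering interface are assumptions of this example.

# Sketch of an event-triggered and a time-scheduled assessment; all names are illustrative.
import datetime

class Enquiry:
    def __init__(self, name, questions):
        self.name, self.questions = name, questions
    def run(self):
        print("presenting", self.name, ":", self.questions)

class EventTriggeredAssessment:            # Assessment A: Enquiry 1 only after Event X
    def __init__(self, event_name, enquiry):
        self.event_name, self.enquiry = event_name, enquiry
    def on_event(self, event_name):
        if event_name == self.event_name:
            self.enquiry.run()

class ScheduledAssessment:                 # Assessment B: Enquiry 2 at a specific time
    def __init__(self, when, enquiry):
        self.when, self.enquiry = when, enquiry
    def tick(self, now):
        if now >= self.when:
            self.enquiry.run()

assessment_a = EventTriggeredAssessment("Event X", Enquiry("Enquiry 1", ["Q1", "Q2"]))
assessment_b = ScheduledAssessment(datetime.datetime(2014, 4, 10, 14, 0),
                                   Enquiry("Enquiry 2", ["Q3"]))

assessment_a.on_event("Event X")                           # fires Enquiry 1
assessment_b.tick(datetime.datetime(2014, 4, 10, 14, 5))   # fires Enquiry 2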

Another aspect of the specification is the modularity involved. Each assessment consists
of a series of elements - questions, enquiries or events - which can be shared between
evaluations. By using semantics in their definition, each of these elements can represent a
research concept by itself instead of being just a plain description. By making this change,
we enable evaluation elements to be extended much like programming objects. For instance,
let us observe a simple example: an enquiry can be declared as an Ambient Assisted Living
(AAL) enquiry and another enquiry as a Usability enquiry. If we state that the latter
extends the former, then this implies that the Usability enquiry is also an AAL enquiry.
By providing this type of relationship between elements, evaluators become able to
establish domain areas that include evaluation elements associated with specific research
areas. It also becomes possible for evaluators to use and integrate evaluation elements that
were produced by others, thus creating a level of reusability that is important to facilitate
the creation of evaluation tests.
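
A minimal sketch of such "extends" relationships, reusing the AAL / Usability example,
is given below; the catalogue contents and function names are invented for illustration.

# Sketch of extension relations between evaluation element types; everything is illustrative.
extends = {"UsabilityEnquiry": "AALEnquiry"}        # UsabilityEnquiry extends AALEnquiry

def domains_of(element_type):
    # Walk the extension chain: a Usability enquiry is also an AAL enquiry.
    chain = [element_type]
    while chain[-1] in extends:
        chain.append(extends[chain[-1]])
    return chain

catalogue = [("SUS questionnaire", "UsabilityEnquiry"), ("Fall-risk enquiry", "AALEnquiry")]

def search(domain):
    return [name for name, etype in catalogue if domain in domains_of(etype)]

print(search("AALEnquiry"))        # -> both enquiries, via the extension relation
print(search("UsabilityEnquiry"))  # -> only the SUS questionnaire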
In addition to the forms of obtaining data and to modularity, an evaluation specification
must contemplate answers to issues such as ease of creation, privacy and other detailed
aspects regarding assessments, like priority, cooldowns or interaction with the user, among
others. We will address these at a later stage.

3.2.4 Automating the execution


The execution phase of an evaluation represents the data gathering step. It consists of
applying the previously created specification to the user for a specific amount of time,
during which data is collected for posterior analysis. Prior to it, some preparations must
be performed to check that all requirements are met. In order to apply the specification,
all infrastructural details must be set and the evaluation prepared to be applied to the user.
Technologically, certain assessments may require specific hardware which must be installed,
and finally the user must himself agree to the constraints of the test. In sum, all of these
aspects can be as time-expensive as the design of the specification itself or the posterior
analysis of the results. It is necessary to streamline this process, making it simpler and
more independent.

Figure 3.3 represents an evaluation's lifecycle divided into four phases, each depicting
steps essential to the successful creation and application of an evaluation test. The flow
integrates the main aspects and requirements presented previously as a systematic way to
achieve a dynamic evaluation.

Figure 3.3: Evaluation conceptual design and application flow (phases: create evaluation,
user selection, apply evaluation, gather results)

The first phase is linked to the creation and design of an evaluation by the evaluator.
It implies the creation of the evaluation assessments, composed of events or enquiries to
gather data from the user. The second phase is associated with user selection, that is,
the creation of the target group, and the verification that its members are ready to receive
the evaluation. The third phase is linked to the execution of the evaluation by the target
group, and the fourth and final phase is associated with the analysis of the resulting data.

3.3 The Conceptual Architecture
To support the proposed evaluation methodology a conceptual architecture was con-
ceived. It was envisioned with the major objective of facilitating the creation, management
and analysis of dynamic evaluation scenarios on a large scale.
Due to all of the requirements, there are many variables involved in designing the
architecture. We want to allow dynamic evaluations that respond to the user’s environment.
We want to simplify the creation, distribution, and collection of evaluation data. We want
to allow semantic data and the creation of evaluation areas. We want both general and
detailed, customizable evaluations. To achieve all of this, we needed to go step by step,
analyzing each requirement and each possible solution.
We started by analyzing the involved actors and their responsibilities. The evaluator
has the responsibility of designing and applying the evaluation tests while the user has the
responsibility of complying with the test. Based on this premise, our first decision was to
separate the architecture into two: a part dedicated to the evaluator and a part dedicated
to the user. This way, it becomes possible to address both the user and the evaluator’s
necessities.
In regard to the user, his objective is self-contained, that is, it concerns only himself.
The user is asked to participate in a test, to which he may agree or not. He has, however,
no responsibilities in designing the evaluation. On the other hand, the evaluator needs
users in order to apply the evaluation, as recipients of the test. In a sense, the user is
something like a resource to the evaluator, one of whom he asks a specific task and from
whom he expects results. From this perspective, we chose to look at the user as a resource:
an independent, autonomous resource that possesses properties like user characteristics,
interests and preferences as well as its own context. Figure 3.4 shows an example of a user
in this light.

Figure 3.4: User Characterization as a Resource

In this example, we are characterizing the user with a few aspects which are divided into
three areas: environment, interests and characteristics. The environment encompasses all
specifications of the user's surroundings. This particular user includes only a few hardware
items: two sensors, A and B, a laptop and a microphone. The interests area features topics
the user may be interested in. Topics can range from specific elements like AAL to more
general areas like women's fashion. The last area includes a more personal characterization
of the user himself, such as height, weight, birth date or the fact that he is shortsighted.
The characterization is not associated with any evaluation. The user exists beyond any
concrete evaluation and is a possible target for evaluations.

Simplifying user selection


By looking at users as resources, we open up a set of possibilities. The first advantage
of characterization is the ability to simplify user selection when setting up a new evaluation.
Each user now possesses a set of properties that can become accessible to the evaluator.
By analyzing them, the evaluator is able to verify whether the conditions of the user satisfy
his own. If so, then the user becomes a possible candidate for evaluation.
This verification is based on two aspects: inclusion criteria and exclusion criteria. Inclu-
sion criteria define aspects that the user must fulfill in order to be considered. Exclusion
criteria define aspects which the user must not have. Figure 3.5 illustrates a selection step
applied to two users considering only inclusion criteria. One of the users completely fulfills
the specifications while the other fails to do so. Note that some aspects depend on the
environment rather than on the user himself. The necessity of having "Sensor A" is one
such example.
In the example, the first user would be a valid choice for receiving the test while the
second user would not. The process relies on the usage of users as resources to simplify
the selection. We can, however, go a step further by using users' similarities to establish
specific user groups. For instance, both users one and two are interested in AAL. So, we
can design a group of users which share a common interest: AAL. By grouping
users, we are facilitating the evaluator's selection process by providing him with a set of
users which have a feature in common.
On the other hand user groups can go beyond common interests. Note that both
user one and two have a “Sensor B” element. This indicates that both users would be
prepared for an evaluation where “Sensor B” was a requirement. So, if a specific evaluation
assessment required “Sensor B”, both users could be the recipients of that evaluation. In
this sense, user groups can also group users according to their environment specifications.

3.3.1 Evaluation Domains


User groups define sets of users which share common properties. They facilitate the
evaluator’s job when searching for evaluation recipients. But much like users have specific
characteristics, evaluations themselves do too. Conceptually, an evaluation has a scope of
application like an area, or a topic. An evaluation can be about a certain scientific area

like AAL, or it can be about a human pathology like hypertension, both of which are fairly
different, a difference which can also translate to their terminology and content.

[Figure content: the evaluator's inclusion criteria (environment includes Sensor A; interests include smartphone apps; weight > 65 kg; gender = feminine) are matched against two users; the first user fulfills the specification while the second does not.]

Figure 3.5: Using criteria for user selection
In order to help the characterization of evaluation scenarios, we have introduced the
notion of evaluation domains. An evaluation domain is an architectural building block
that sets both a language to produce evaluations and the users these evaluations can be
applied to. Consequently, the evaluation domain defines an abstract evaluation network
where all associated users comply with a set of guidelines that the evaluation domain
enforces. Concretely, an evaluation domain is characterized by three main components:

• Evaluation Scope - States one or more topics which represent the domain. For in-
stance, AAL or Ambient Intelligence. These scopes are comparable to the user’s
interests and are used to distinguish different domains.

• Applicability Criteria - Represents the set of characteristics which all associated users
must fulfill. If a user fulfills the criteria, it can become part of the domain’s evaluation
network.

• Domain Language - Represents the set of evaluation elements - the Vocabulary - and
how they can be used to create evaluations.

A domain can be compared to a factory where evaluators are able to create new eval-
uations. Using applicability criteria and evaluation scopes, a domain defines a complete
network where the evaluator is given a set of users that are compatible with the type of
evaluation that he intends to perform. With these users, the evaluator is able to easily
apply an evaluation scenario without doubts about the users' willingness or compatibility.
The domain language guarantees that evaluation domains are in accordance with the
support architecture, and more importantly, that all evaluations are executable by the
users that belong to the domain's network. In addition, the domain language defines a
vocabulary composed of a set of evaluation elements that evaluators will be able to use
to create new evaluations. These elements are defined in the language using a semantic
approach. For instance, rather than specifying a question and its content, the specification
defines something like an AALQuestion. The element can then be instantiated within
future evaluations with actual questions (f.i. "Did you find this application accessible?").
These definitions allow domain languages to integrate elements from other specifications,
thus promoting reusability. We will fully tackle these concepts in Chapter 4.
Domains are used by evaluators to create and apply evaluation tests. The definition
of an evaluation domain however is not associated with an evaluator but with a designer.
The designer represents the owner of the evaluation domain and is responsible for creating
the domain specification, including the scope, the applicability criteria and the domain
language. While the evaluator simply accesses and uses the evaluation domain, it is the designer
who is responsible for guaranteeing its proper operation.
To establish an evaluation domain, it is necessary to undergo a set of phases that
validate its content and specifications. The diversity of user environments implies that
an evaluation element, such as an event, can be supported by one user and not supported
by another. When establishing a domain, this verification must be performed in order to
include the user within the domain's network. The verifications are based on the inclusion
criteria that define the domain, matched against the characteristics/environment
listings associated with each user. Figure 3.6 illustrates an example of associating two users
into a domain’s network.
In the figure, it is noticeable that the first domain is rather simple, composed of a
single criterion. In this case, both users are compatible and thus become part of the
domain. In the second domain, however, only user two fulfills the criteria, thus leaving user
one out of the domain's network. General or more abstract domains have fewer criteria
and include a higher number of users than more specific ones. Nonetheless, the same user can
be part of multiple domains at the same time as long as he fulfills their criteria.
Every domain has an owner/designer who is responsible for it. Evaluators are able to
use domains as they would use a service, but do not own them. This simplifies the evaluator's task
by providing him with a set of domains from which he may choose. If a domain fulfills
the evaluator's objectives, then the evaluator may simply use the domain to construct and
apply an evaluation scenario to a set of users which fulfill the domain's criteria. While
a domain belongs to an owner, it is the owner's decision whether a domain is private or public.
The domain may impose access privileges which only provide authorization to a set of
evaluators, or define an open policy which offers free access to all who might be interested.

[Figure content: Domain A (interests: AAL; characteristics: weight > 60 kg) and Domain B (environment: Sensor B, tablet; interests: smartphone apps) are matched against Users One and Two; both users fit Domain A, while only User Two fits Domain B.]
Figure 3.6: Applying criteria when establishing a new domain

3.3.2 Evaluation Nodes


Previously, we have introduced the term “user’s circle of information” to refer to infor-
mation that constantly surrounds a user. We stated that this information can be helpful
for evaluators and should be considered in evaluation practices whenever possible. How-
ever, for it to be part of a feasible evaluation solution, it is necessary to create a generic
method in which this information can be collected while taking into account the environ-
ment heterogeneity that occurs from user to user. In this sense, and to support the circle
of information notion, we introduced the concept of an evaluation node.
An evaluation node is an abstract location/area where evaluations are applied to a
single user. It is an entry point between a user and the architecture, responsible for
receiving new evaluations, applying those evaluations and gathering the extracted data.
Each node is singularly associated with a user and includes all possible information sources
for that specific user, which can range from devices to applications or systems. The node
also classifies the user regarding his characteristics, interests and environment, and uses
this to check for compatible domains in the architecture. Figure 3.7 illustrates an example
of a node.
[Figure content: Node One, associated to User One, includes the user's characteristics (weight, height, age), interests (AAL, smartphone apps) and environment, the latter covering locations (home, workplace), devices (Sensor A, laptop, phone), interaction modalities (GUI, speech), applications and systems (weather app, news app) and activities (jogging, working).]

Figure 3.7: Example of a node on a conceptual level

The figure includes different sources which can be used by evaluators to extract information.
The "environment" branch represents what we named ambient information producers.
By linking this type of information with feedback operations, it becomes possible
to comply with one of the initial objectives, that is, to gather data at specific timings. For
instance, by knowing that the user possesses "Sensor A" within the environment, the
evaluator can create an assessment which starts an enquiry whenever "Sensor A" produces
information. With examples such as this, it becomes possible to extract information in
very concrete situations and provide contextualized, specific data for analysis.
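As a rough illustration of this idea, the sketch below ties an ambient event ("Sensor A" producing a reading) to a feedback enquiry inside a node. The class, the callback mechanism and the ask method are assumptions made for the example only, not the node's actual design.

```python
# Hypothetical sketch of a node reacting to ambient information:
# when "Sensor A" produces data, the node launches a short enquiry.

class Node:
    def __init__(self, user):
        self.user = user
        self.handlers = {}          # event name -> list of callbacks
        self.collected = []         # gathered evaluation data

    def on(self, event_name, callback):
        self.handlers.setdefault(event_name, []).append(callback)

    def emit(self, event_name, payload):
        for callback in self.handlers.get(event_name, []):
            callback(payload)

    def ask(self, question):
        # In a real node this would go through an interaction modality
        # (GUI, speech, ...); here we simply simulate an answer.
        answer = f"answer from {self.user} to '{question}'"
        self.collected.append((question, answer))
        return answer

node = Node("User One")
node.on("SensorA.reading",
        lambda value: node.ask(f"Sensor A reported {value}. How do you feel?"))

node.emit("SensorA.reading", 42)    # ambient information triggers the enquiry
print(node.collected)
```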
Every node is independent. It applies the evaluation but it does so according to its own
conditions, especially when interacting with the user. Since a node may not be limited to
a simple interface, modalities may - depending on the evaluator's wishes - vary and adapt
themselves to the user and the environment (f.i. by changing their volume according to
noise). By using the node's own data, interaction modalities can determine if they are
the best suited to gather information at a specific timing, thus improving the chances of
gathering the data at that specific moment. This, of course, depends on the evaluator's
wishes, since he determines the evaluation itself.

3.3.3 Support Infrastructure


The description of our architecture has so far presented the concepts which are
closer to the users and the evaluators: the evaluation nodes and the evaluation
domains, respectively. To provide consistent support for both, the architecture must be
able to add new users by accepting and including new nodes in a rapid manner and with few
obstacles and, in the same way, allow the inclusion of new domains. Accepting a new user
requires certain verifications which must be supported by the architecture, such as checking
which domains the node is compatible with. In addition, it is necessary to provide support
for other administrative tasks, like checking the status of the nodes or inspecting changes in
their specification which can cause possible changes in their compatibility with domains.
To provide an answer to this, the architecture includes a middleware with services
that aid in some of these tasks and help link the nodes to the domains:
the Support Unit. The support unit acts as glue between the nodes and the domains
and allows new nodes to be rapidly included by updating a registry which it contains. The
support unit also includes other support services, which will be detailed in Chapter 5.

In order to manage evaluation domains, the architecture includes a final component,
called Evaluation Hub, composed of a single service to facilitate the registration of new
domains and offer a general view of existing ones. The objective here is to provide the
evaluators with a search unit where they can analyze existing domains and decide whether
a domain suits their objectives or not. Similarly, the evaluation hub also allows designers
to inspect existing domains and their vocabulary, and to extract elements which they
find relevant to new domains. By allowing this, the architecture reinforces the concept of
reusability, thus promoting a sort of involuntary cooperation between designers.
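A minimal sketch of what these two registries might look like is given below. The class and method names are purely illustrative assumptions and do not correspond to the services described in Chapter 5.

```python
# Illustrative sketch only: a node registry (support unit) and a domain
# registry with search (evaluation hub). Names and methods are assumptions.

class SupportUnit:
    def __init__(self):
        self.nodes = {}                      # node id -> characterization

    def register_node(self, node_id, characterization):
        self.nodes[node_id] = characterization

    def compatible_nodes(self, criteria):
        """Return the ids of nodes whose characterization satisfies
        every criterion of a given domain."""
        return [nid for nid, c in self.nodes.items()
                if all(rule(c) for rule in criteria)]

class EvaluationHub:
    def __init__(self):
        self.domains = {}                    # domain name -> evaluation scopes

    def register_domain(self, name, scopes):
        self.domains[name] = set(scopes)

    def search(self, topic):
        """Let an evaluator look up domains by evaluation scope."""
        return [name for name, scopes in self.domains.items()
                if topic in scopes]

hub = EvaluationHub()
hub.register_domain("Domain A", ["AAL"])
hub.register_domain("Domain B", ["Smartphone Apps"])
print(hub.search("AAL"))                     # -> ['Domain A']

unit = SupportUnit()
unit.register_node("node-1", {"interests": {"AAL"}, "weight": 75})
print(unit.compatible_nodes([lambda c: "AAL" in c["interests"]]))
```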
Figure 3.8 shows a conceptual view of the entire architecture. In the figure, it is possible
to observe that evaluators are able to find domains via the evaluation hub. For its part,
the hub is capable of accessing all domains and services of the architecture. Each domain is
linked to the support unit in order to search for compatible nodes, to deploy evaluations
and to retrieve evaluation data. Finally, the nodes apply the received evaluations while
being oblivious to any other architectural elements.

[Figure content: evaluators and designers access the Evaluation Hub, which reaches Domain A (interests: AAL; characteristics: weight > 60 kg) and Domain B (interests: smartphone apps; environment: Sensor A | Sensor B); both domains connect through the Support Unit to Nodes One to Four, each associated with a user.]
Figure 3.8: Conceptual view of the proposed evaluation architecture

The architecture relies on services not only as a support option but also as a methodology.
In other words, by allowing users and evaluators to look at evaluations as services,
we are abstracting many layers of complexity which they do not require. To the evaluator,
a domain constitutes a way for him to create and define evaluations to apply to a set of
users. For his part, the user simply knows that he can be subjected to an evaluation
at any time. It is important to note that domains and nodes are linked only if both
accept it.

3.4 Summary
In this chapter, we started by analyzing classical evaluation environments, pointing to
the relevance of dynamic evaluation. Then, we covered the requirements that the dynamic
evaluation paradigm should fulfill. Based on these requirements, the guidelines to
define a dynamic evaluation solution were presented. Finally, a conceptual architecture
for dynamic evaluations, based on evaluation domains, evaluation nodes and a support
infrastructure, was proposed and generally explained. This architecture separates the
evaluator from the user, allowing for the construction of flexible and expansible dynamic
evaluation systems.
In the next chapter, we will explain the evaluation methodology and model that support
the proposal, including their definition and specification processes.

Chapter 4

A Methodology and a Model for


Evaluation Definition

To support the dynamic evaluation conceptual architecture introduced in the previous


chapter, a model and a methodology, supporting the creation and execution of evaluations,
are required. The model should be general and flexible enough in order to accommodate
the diversity of domains it is intended to be applied to. The followed approach provides
a solid basis from which evaluations can be created, as well as means for its extensibility. The
proposed methodology thus tackles topics covering the definition of domains, the creation
of evaluations, the application of evaluations to sets of users, the storing of the data and
the retrieval of the results.

4.1 Requirements
In the previous chapter, domain was introduced as an evaluation instrument, defined
by a scope, some applicability criteria and a domain language. We claimed that to assure
the successful growth of a global evaluation solution, it is necessary to allow new domains
to come forth as time goes by and guarantee that they are compatible with the existing
architecture. We also claimed it to be necessary to guarantee the possibility of new users -
that is, nodes - being added and becoming compatible not only with the existing domains
but also with future ones that are yet to be formed. With the diversity of domains and the
consequent differences in evaluation methods and practices, it becomes necessary to develop
a way to specify evaluations that provides the necessary flexibility for designers and compatibility
for users/nodes. It is necessary to allow designers to create their domains at will,
without an endless number of restrictions, and at the same time make sure that nodes will
be able to interpret and execute evaluations without changing their whole infrastructure.
At the same time, understanding how data is collected and stored from evaluations is also
a common issue. While differences from domain to domain exist, establishing a software
solution that sustains this diversity can be hard. Results change from domain to domain
and are dependent on the domain's definition. To support these variations, it becomes
necessary to balance between the stability and rigidity of a closed data specification and
the lack of structure and flexibility of a more open one. The architecture must be taken
into account in this decision, as it configures the backbone of the entire solution. Above
all, the approach must support the creation and application of evaluations that uphold the
main objectives of the paradigm and establish a common ground between all evaluation
participants.

4.2 A Methodology Proposal for Dynamic Evaluation


Our evaluation solution takes into consideration the requirements and establishes an
evaluation model that makes it possible to implement the notions of dynamic evaluation.
In our proposal, a concrete evaluation test (i.e. an evaluation) is defined in a single domain
and can only be applied to the set of users that belong to that domain’s network. Since
users can be part of multiple evaluation networks, they might be subjected to evaluations
from multiple domains at the same time. Both users/nodes and domains are independent,
both have their own characteristics and cooperate to reach a common objective which is the
execution of evaluation scenarios. This independence is important to allow the appearance
of new users and new domains through time and enables the constant evolution of the
architecture, as existing users can become part of new domains and new users can
become part of existing domains. It guarantees that the architecture does not become
stalled.
While this independence is fundamental to the architecture's value, it also poses a
challenge to the overall compatibility between nodes and domains. Nodes can be subjected
to evaluations from different domains, domains which do not share any similarity in their
evaluation methodologies (since their definitions change) and might not be prepared to
execute them. Given the diversity of domains, expecting a node to be able to interpret
and support every type of evaluation is not feasible. Thus, means to identify compatibility
between nodes and domains must be part of the solution. This is accomplished by matching the
characterization of the nodes against the applicability criteria of the domains.
The proposed architecture unfolds into five stages, aligned with the involved
actors - designers, evaluators and users - as can be seen in Figure 4.1.

[Figure content: Generic Domain Language -(extends)-> Domain Language -(defines)-> Evaluation Domain -(defines)-> Evaluation Specification -(generates)-> Execution Specification; the designer defines the domain language and creates the domain, the evaluator uses the domain and designs the evaluation, and the user participates in the evaluation.]
Figure 4.1: Processing phases from the creation of an evaluation to its execution

As an innate characteristic, the model features a language that designers must follow
in the creation of domains. This language, referred to as generic domain language, assures
that domains are in accordance with the architectural rules.
In the second stage, domain languages are specified. They must be in accordance
with the generic domain language rules. As stated before (see Section 3.3.1), this mainly
corresponds to specifying the domain vocabulary (evaluation elements) and the rules to be
observed in its usage in the construction of evaluations. These domain languages will be
used afterwards, in the third stage, for the creation of evaluation domains in the dynamic
evaluation architecture. Recall that a domain is composed of a domain language plus a
scope and applicability criteria and, thus, the same domain language can be used in the
creation of different evaluation domains.
The specification of evaluations is the fourth stage of the evaluation model. An eval-
uation specification, defined in the scope of an evaluation domain, is composed of a set
of evaluation assessments, created with the purpose of applying a test to a set of users.
It constitutes a plan for the evaluation in regard to its methodology/procedure.
The specification, however, is abstract, since it does not include the actual set of users and
schedule. It is necessary to instantiate the specification in order to deploy and start the
actual evaluation.
This is the role of the final stage of the evaluation model, referred to as the execution
specification. Taking an evaluation assessment from a given evaluation specification, the
evaluator selects a set of target users and defines an application schedule, giving rise to an
execution specification. This execution specification is then deployed to the users’ nodes,
where the actual executions take place. Finally, the results of the executions are received
and attached to the evaluation assessments for posterior analysis. The separation between
the evaluation and execution specifications is justified by the diversity of domains. Every
evaluation is defined in a domain and is consequently dependent on its specification. This
approach allows nodes not to be dependent on any domain and to be accessible to all domains
whose criteria they match. As such, and also to simplify the distribution and execution
of evaluations, this is supported by a common language that all nodes understand, thus
making it possible to look at the evaluation as a program and distribute it remotely.
The process of converting the information of the fourth part of the model to the fifth
part of the model is similar to a transformation method. This transformation takes an eval-
uation specification and parses it into a set of execution specifications using the domain’s
specification from part two. Figure 4.2 exemplifies the process by illustrating the creation
of an evaluation in a domain by an evaluator and its subsequent deployment into two
nodes. The evaluation is stored within the domain as an evaluation specification. When
triggered to start, the specification is transformed into a set of execution specifications
(one per evaluation assessment) to be deployed to the nodes. The node is then responsible
for processing and applying the assessment to the user.
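The sketch below conveys the general shape of this transformation. The data structures are invented for illustration only and do not reflect the actual ontology-based specifications described later in this chapter.

```python
# Hypothetical sketch: turning an (abstract) evaluation specification into
# one execution specification per assessment, for a given set of target nodes.

def to_execution_specs(evaluation_spec, target_nodes, schedule):
    """One execution specification is produced per evaluation assessment;
    each one carries the target nodes and the application schedule."""
    return [
        {
            "assessment": assessment,
            "targets": list(target_nodes),
            "schedule": schedule,
        }
        for assessment in evaluation_spec["assessments"]
    ]

evaluation_spec = {
    "domain": "Domain A",
    "assessments": [
        {"id": "A1", "type": "enquiry"},
        {"id": "A2", "type": "event-triggered enquiry"},
    ],
}

execs = to_execution_specs(evaluation_spec,
                           target_nodes=["Node One", "Node Two"],
                           schedule="one week")
for spec in execs:
    print(spec["assessment"]["id"], "->", spec["targets"])
```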

The patent diversity of evaluation domains implies a different way of performing an


evaluation in each of them, mostly due to the different domain languages. However, defining
a data model that allows this diversity to occur within a fixed architecture is difficult. On
one hand, a rigid model poses an easier solution but would place limitations on designers
and evaluators. On the other hand, a flexible model is more desirable but can be harder
to support in actual implementations of the architecture. The selection and adoption of
an adequate data model is therefore critical to our evaluation model.

[Figure content: an evaluator creates an evaluation specification within Evaluation Domain A (interests: AAL; characteristics: weight > 60 kg); when triggered, the domain generates execution specifications which are deployed into Node One and Node Two, associated with Users One and Two.]

Figure 4.2: Instantiating an Evaluation within a Domain
The decision to use ontologies for model definition comes naturally from the previous
requirements. Ontologies are a flexible data representation method that supports the
inclusion of new classes and properties in existing specifications. In our scenario, this allows
us to include new domains without breaking the architecture's constraints and without
limiting the designers' objectives. A major advantage of using ontologies is the ability
to allow designers to bring existing ontologies and their knowledge into their domains
and apply them in their evaluations. By allowing this, we are enabling evaluators to use
specifications which are standards in certain areas or widely used by their peers in the
architecture without changing the existing infrastructure.
The inclusion of ontologies into our evaluation model leads to another advantage in
regard to the feasibility of the model and the architecture itself. Each specification is
defined as an ontology and built upon the other specifications. In other words, the domain
language is built on top of the general domain language, the evaluation specification on
top of the domain language, and the execution specifications on top of the evaluation
specification. This method configures an incremental approach which is assisted by the
architecture and guarantees that the rules for each layer are fulfilled. At the nodes, the
evaluation is performed by the users. Figure 4.3 illustrates this incremental approach from
top to bottom.
To explain other advantages of this approach and the details of each specification,
namely how they are built and associated with each other, the following sections feature
a full explanation of the model from a top to bottom perspective, starting from the gen-
eral domain language. The evaluation domain is not further explained, since it is not a
specification and the already given description is enough to understand its role.

[Figure content: the generic domain language supports the domain language created by the designer within an evaluation domain; evaluators build evaluation specifications on top of it, which generate execution specifications deployed to the nodes, where users perform the evaluations.]

Figure 4.3: Applying the methodology to the conceptual architecture

4.3 Generic Domain Language


The generic domain language includes a set of fundamental elements for the construc-
tion of domains. This language is the support of every specification of the architecture
and represents the basis of the evaluation model. The language itself is divided into several
ontologies, each with its own purpose, and tackles the main necessities of a dynamic evaluation
approach. To support the creation of direct assessment situations, the language provides
an enquiry ontology. To support the detection of event-related situations, the language
includes a complex event ontology. To support the definition of feedback-ambient data
collection scenarios, the language includes an assessment specification ontology. To repre-
sent the logistical support and execution guidelines, the language includes a control flow
ontology. These ontologies are independent and represent what our evaluation model is
based on.
Every ontology is simple and extensible by definition. While the base ontologies cannot
be changed in regard to the generic domain language, they can be extended into new
specifications and used in concrete domains. This is in fact an objective, because it allows
the model to include new evaluation methodologies and concepts which may arise over
time, thus creating specialized domains.
To define the ontologies we resorted to known elements of ontology representation from
standards like OWL [OWL Working Group, 2009], RDF [Group, 2014b] and RDFS [Group,
2014a]. The presented classes and properties follow common nomenclatures, being defined
by a namespace and a unique ID¹. This definition is important given the variety of domains
and nodes and allows us to guarantee that no element is repeated. It also opens
a door for data mining options by allowing algorithms to inspect multiple specifications
from domains, evaluations and executions and to extract new knowledge from them.
In our description of the ontologies, we will resort to a simple description rather than
full description logics. As such, ontology models are illustrated using a conceptual representation
composed of their classes and properties for easier comprehension. To complement
this information, Annex A presents a full listing of every ontology’s classes and properties.

4.3.1 Enquiries
The most common evaluation procedure involves questioning the user. Specifying a set
of questions and a set of possible answers is a typical procedure that offers low complexity
and short preparation periods. Despite its commonness, the procedure suffers in itself
from already stated issues like repetitiveness or rigidity. Nonetheless, the
importance of enquiries in an evaluation is consensual. The necessity of building a series
of questions and establishing an order for those questions is fundamental when gathering
information from the user, especially in feedback scenarios.
In our proposal, we opted to create an abstract structure which allows evaluators to
design enquiries easily and without limitations. Initially, we intended to create a language
capable of producing the most common question and answer types, such as multiple choice,
open answer or Likert scale questions, but decided otherwise, since it would impose a fixed
set of options on evaluators, thus restricting them. As such, we chose to design a simplistic
but extensible structure for direct assessment based on two elements: Question and Answer.
Figure 4.4 showcases the Enquiry conceptual specification.
The Question and Answer elements establish an action which is made to an actor where
the Question acts as the input and the Answer acts as the output. The actor can be a
human, but this is not mandatory, since it may be possible to use Question and Answer
elements when communicating with an external system, for instance. In order to allow
evaluators to establish multiple interrogations in some order, we introduced the Enquiry
element and an EnquiryGroup element to group enquiries. To establish the sequence
when applying the enquiry, the specification includes an enq:followedBy property between
question elements.
These elements allow evaluators to create assessment elements which will be applied to
an actor. The Question, Answer and Enquiry elements are only abstract elements. Their
value comes from extending them or from forming new relationships between them. It
is possible to enrich the type of interrogations made to the user by adding semantics to them,
or by simply adapting the elements to the evaluation's needs. With this flexibility, we are
guaranteeing that no limitations are placed when applying enquiries in different scenarios.
¹ Normally, the ID represents a URI. As an example, the URI "http://enquiryOntology#Question"
would refer to the "Question" element of an "enquiryOntology" specification.

[Figure content: Enquiry, Question and Answer are defined as rdfs:Class instances; an Enquiry points to its first Question via enq:hasFirstElement and is grouped via enq:hasEnquiry, questions are chained via enq:hasTransitionTo, and UsabilityQuestion is shown as an rdfs:subClassOf extension of Question.]

Figure 4.4: Enquiry Conceptual Specification

Concretizations of this specification extend the base elements. For instance, in an area
like AAL, evaluators can define concepts like AALQuestion or UsabilityRelatedQuestion
which would provide semantic information regarding the question itself. On the other hand,
elements like OpenQuestion or MultiChoiceQuestion could be created to specify a more
concrete type of question. Note that these elements can possess other relationships as well.
The MultiChoiceQuestion element requires a property entitled enq:hasPossibleAnswer
which defines the set of possible choices from which the actor must select.
The specification itself possesses no limits and a single rule: relationships with the
Enquiry, Question or Answer elements are allowed only via the rdfs:subClassOf property. From then
on, every added concept possesses no restrictions on its specification. In Figure 4.4, the
element UsabilityQuestion shows an example of an added concept which extends the
base specification.

4.3.2 Events
The utilization of events is crucial for a dynamic approach to an evaluation. An
environment is typically reactive and its influence on the user's behavior can be high,
since devices, applications or appliances normally share the same medium. As such, it
becomes highly important to allow a measurement of this influence and the possibility of
analyzing its consequences. To do that, the inclusion of events became mandatory.
An event can be seen as an action or an occurrence of something that follows a certain
typification. By detecting it, evaluators gain a much broader array of evaluation tools. In
its simplest form, it becomes possible to perceive whether a certain action occurs and how
many times it occurs in a certain period of time. But it also becomes possible to
trigger sequences consisting of inquiring the user regarding that event, or to analyze scenarios
in which multiple events occur simultaneously or within a span of time.
In our proposal, an event is seen as a conceptual element that represents something
that occurs at a precise timing. In that sense, the specification’s top element is an Event
class with a Timestamp property. New events can be created as subclasses of the Event
class. Figure 4.5 shows the Event specification.

[Figure content: the Event class holds an evt:hasTimestamp property pointing to a Timestamp; in the example, IncreaseVolume and DecreaseVolume are subclasses of Event, and an IncreaseVolume instance carries a concrete timestamp (15 Jan 2020, 18:00).]

Figure 4.5: Event Conceptual Specification and an example

As an example, Figure 4.5 shows a scheme consisting of two events: IncreaseVolume
and DecreaseVolume. Both of them are subclasses of the Event class and inherit the
evt:hasTimestamp property. The figure includes an instance of IncreaseVolume and
how it becomes associated with the representation. Note that the instance possesses an
rdf:type property connecting it to its class, as well as the necessary evt:hasTimestamp property
inherited from it.
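The same example can be written down with rdflib, under the assumption of illustrative namespace URIs:

```python
# Illustrative rdflib sketch of the event specification example
# (namespace URIs are assumptions).
from rdflib import Graph, Namespace, Literal
from rdflib.namespace import RDF, RDFS, XSD

EVT = Namespace("http://example.org/eventOntology#")
DOM = Namespace("http://example.org/myDomain#")

g = Graph()

# New event types are subclasses of the base Event class.
g.add((DOM.IncreaseVolume, RDFS.subClassOf, EVT.Event))
g.add((DOM.DecreaseVolume, RDFS.subClassOf, EVT.Event))

# A concrete occurrence: an IncreaseVolume instance with its timestamp.
occurrence = DOM["increaseVolume_1a1"]
g.add((occurrence, RDF.type, DOM.IncreaseVolume))
g.add((occurrence, EVT.hasTimestamp,
       Literal("2020-01-15T18:00:00", datatype=XSD.dateTime)))

print(g.serialize(format="turtle"))
```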

4.3.3 Event Processing Rules


Event processing is a common feature within event-driven architectures or event-aware
systems [Michelson, 2006]. It consists of analyzing event flows and extracting information
from them by applying Event Processing Rules (EPR) [Paschke and Kozlenkov, 2009].
Its usage in evaluation, however, is very limited, with existing proposals being oriented to
mobile rather than complete dynamic environments.

To allow evaluators to detect more specific situations, our proposal includes Event
Processing Rules (EPRs) as an extension to event support. By using the Event as the
base element, an EPR represents a tree-like structure associating multiple events. Within
the literature, EPR specifications can range from complete languages based on first-order logic
[Anicic et al., 2010] to query languages which extend query standards like Sparql [Anicic
et al., 2011]. ESM tools make use of EPRs to trigger data-gathering actions, but their
specifications offer low flexibility and do not allow the creation of new operations past their
original definition. Because of this, our approach includes a new EPR specification.
Our EPR specification is recursive and allows the creation of complex EPRs composed
of an unlimited number of events. Their base objective is to allow the evaluator to
predict, analyze and gather data regarding any situation within the environment that
is associated with events. At its basis, every EPR instance has a root element which
is specified via the epr:hasRootElement property. The root can be either a simple
Event or an EventOperation, both of which are abstracted in the specification by an
EventRuleElement class. If the root of an EPR is a simple event, then it amounts to a
single occurrence of that event. On the other hand, if it is an event operation, the rule will
be fulfilled only when all included conditions are satisfied.
The EventOperation element represents all event operations within the specification:
the And, Or and Not operations, which derive from first-order logic propositions, and the
Delay and Repetition operations, which provide a more functional approach. All of them
are able to receive other operations as arguments, thus allowing recursion. The specification
is also extensible to new operations, but dictates that every non-abstract operation
needs to implement the epr:hasRuleElement property, which specifies the cardinality
of every operation. For instance, both the And and Or operations require two
EventRuleElement elements as arguments while the Not operation requires only one, both
of which are manifested by the epr:hasRuleElement property. The first possesses cardi-
nality two while the second possesses cardinality one. A visual representation of the EPR
base specification is presented in Figure 4.6.
Less common operations were also added in the specification. One such example is the
Delay operator which allows evaluators to include waiting periods within EPRs. The delay
is imposed from the time when the condition is fulfilled to when the actual operation is
triggered. This operator allows the evaluator to, for instance, include an interval between
the execution of an EPR and the start of a subsequent enquiry. It is applicable in cases
where the evaluator intends to impose a waiting period. To specify the interval, the
operation includes an epr:hasInterval property.
Another less common function is the ActiveTimeInterval. This function mandates
that the operation is only active for a given period of time. During that time, the associated
element (either an operation or an event) is active and, if it is fulfilled, the operator will
trigger. After the interval is surpassed, however, the operation is considered terminated.
In order to allow the trigger to be performed only at the end of the interval, we included an
additional property, epr:evaluatesAtEnd. In this scenario, the associated
element may constantly change its status during the active interval and is only verified at
its end. Altogether, the ActiveTimeInterval operation allows the evaluator to analyze a
situation within a specific period of time. When linking EPRs with other elements, this
operation allows the evaluator to check if something happens during a span of time and,
if so, to trigger a subsequent action. Like the Delay operation, the ActiveTimeInterval
is also associated with an epr:hasInterval property to define the interval.

[Figure content: an Event Processing Rule has a root EventRuleElement (epr:hasRootElement), which is either an Event or an EventOperation; operations include Not and And/Or (one or two epr:hasRuleElement arguments, respectively), the boolean ActiveTimeInterval operation (epr:hasInterval, epr:evaluatesAtEnd), the Delay operation (epr:hasInterval) and function operations such as Repetition (epr:hasRepetitionTimes).]

Figure 4.6: EPR Conceptual Specification
Another less common operation which was added is the Repetition operation. The
Repetition operation allows evaluators to specify how many occurrences are needed
for a certain event or operation to be considered fulfilled. To specify this, the operation
possesses an epr:hasRepetitionTimes property as well as an epr:hasInterval property to define
the period of time in which these occurrences must occur. Together with the delay opera-
tion, these operations provide a helpful method to allow the detection of situations where
a number of events are detected within a specific time frame.
Both the delay and repetition operations are predicates. A predicate allows the inclusion
of variables which can be other than booleans but still produces a boolean result. In
our specification, we represent these types of functions as EventOperationFunctions.
Function operations make the EPR specification extensible and capable of receiving
new operations if needed. For example, in cases where it becomes necessary to inspect
Event parameters, new functions such as BiggerThan, SmallerThan or EquivalentTo
could be added to the EPR specification and used by the evaluator. Due to their specificity,
we chose to exclude these operations from the base EPR representation, thus including
only the most relevant and necessary ones.
When executing, an active EPR is the target of events which will constantly change
its state by fulfilling or not its operation’s conditions. Within an EPR, the occurrence
of an Event is treated like a True statement, while its non-occurrence represents a False
statement. Event operations represent propositions, that is, boolean expressions which
associate multiple events. When an event occurs, the operation is verified and assumes
a True or False value according to its condition. It is also important to note that all
operations are verified initially. For instance, the Not operation is True at the start. More
aspects regarding EPR execution practices will be fully described in Chapter 5.
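As a rough illustration of how such rules could be evaluated at a node, the sketch below implements a tiny, deliberately simplified interpreter for And, Or, Not and Repetition over a set of observed event occurrences. It ignores timestamps, delays and intervals, and it is not the specification's actual execution model (which is covered in Chapter 5).

```python
# Simplified, illustrative EPR interpreter: operators are evaluated against
# a multiset of observed event names. Time intervals and delays are ignored.
from collections import Counter

class EventOccurrence:
    def __init__(self, name):
        self.name = name
    def satisfied(self, observed):
        return observed[self.name] >= 1        # event occurred at least once

class And:
    def __init__(self, left, right):
        self.left, self.right = left, right
    def satisfied(self, observed):
        return self.left.satisfied(observed) and self.right.satisfied(observed)

class Or:
    def __init__(self, left, right):
        self.left, self.right = left, right
    def satisfied(self, observed):
        return self.left.satisfied(observed) or self.right.satisfied(observed)

class Not:
    def __init__(self, element):
        self.element = element
    def satisfied(self, observed):
        return not self.element.satisfied(observed)   # True while no occurrence

class Repetition:
    def __init__(self, event_name, times):
        self.event_name, self.times = event_name, times
    def satisfied(self, observed):
        return observed[self.event_name] >= self.times

# Rule: (IncreaseVolume occurred 3 times) AND (no DecreaseVolume so far)
rule = And(Repetition("IncreaseVolume", 3),
           Not(EventOccurrence("DecreaseVolume")))

observed = Counter(["IncreaseVolume", "IncreaseVolume", "IncreaseVolume"])
print(rule.satisfied(observed))     # -> True
```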

4.3.4 Evaluation Flow Control


The ability to create new domains, link them to a network of users and consequently
deploy contextual evaluations is one of the major traits of our proposal. In order to achieve
this level of functionality, it is necessary that evaluations created in the domains can
be interpreted by nodes, regardless of whether nodes or domains are created first. Domains are
independent of the nodes, making it necessary to establish a bridge between them in order
to ensure their compatibility.
Evaluations are created in an evaluation domain but need to be delivered to the node
and then executed. Evaluations change from domain to domain due to their semantic content,
but the procedure does not, since it amounts to applying a question, advancing to the next question,
waiting for an event and so on, like a state machine. Using this concept, we decided to
create a control flow specification that is able to represent an evaluation instantiation by
focusing on the procedure, abstracting the specification and thus allowing all nodes to be
able to understand and execute every evaluation.
The necessity for a control flow ontology is explained by a requirement which this separation
imposes. While representing the procedure is a step forward in assuring the compatibility
between nodes and domains, a domain language also contains non-procedural
aspects which must be taken into account when applying the evaluation to the user. When
a designer creates a domain language, he may insert elements that extend a Question or
an Event and that define a set of aspects which only the evaluator knows of. The issue here is
that it is impossible to know beforehand how to deal with new elements, since nodes are not
aware of how to interpret them. In this scenario, either the node must be constantly altered
to interpret a new domain, or the domain must be restricted to a series of elements that
the nodes know. Neither option is optimal, as they either force the node to know
every domain beforehand or require it to be changed to include new specifications.
As a solution to this, we designed the Compflow ontology, created in the scope of this thesis as well as of the work of Nuno Luz from ISEP [Luz, 2015]. Compflow [Luz et al., 2014]
is a workflow ontology designed for evaluation and crowdsourcing purposes. It is similar
to languages such as BPMN [Group, 2011], but takes a different approach to workflow
specifications by introducing a concept called Interface. In Compflow, an Interface
represents an element to which it is possible to deliver something whose specifics we might
not know, but which the Interface knows how to handle. For this, Compflow allows
the specification of execution elements, called Executables, and their linkage to the entities
responsible for executing them - the Interfaces.
Regarding the evaluation model, we use Compflow instances to represent execution
specifications. Despite the diversity that domains pose, by using a common representation
to all nodes, we can guarantee to evaluators that their evaluations are executable in every
node, independently of the domain that produced it. By parsing evaluation elements into
Executables (Tasks or Events), we are able to define a series of evaluation elements, and
through interfaces, to outsource their execution to a specific component (represented by
the interface) that applies it to the user.
A Task element represents the most basic execution element of a Compflow instanti-
ation. Each instance possesses an Input declared via cfw:hasInput, an Output declared
via cfw:hasOutput and one or more Interfaces declared via cfw:executedBy. Figure 4.7
illustrates this description.


Figure 4.7: Task Conceptual Specification

In our evaluation scenario, it is possible to use the Task element as a way of defining
how a Question element should be executed. By extending the Input and Output elements
as Question and Answer elements respectively, we are able to create a Task that receives
a specific type of question and produces a specific type of answer without changing the
node itself. To assure that the elements are interpreted as required, the Task is associated
with a subclass of the Interface element. By establishing the interfaces as outsourcing
components, we are able to create a single execution structure supported by a modular
mechanism based on interfaces. We will dwell on this process in more detail in Section 4.6.
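A minimal, purely illustrative sketch of this outsourcing idea follows. The interface names and the dispatch mechanism are assumptions for the example and do not reproduce the Compflow ontology itself.

```python
# Illustrative sketch of tasks being outsourced to interfaces at a node.
# Interface names (SpeechQuestionInterface, etc.) are hypothetical.

class TaskInterface:
    def execute(self, task_input):
        raise NotImplementedError

class SpeechQuestionInterface(TaskInterface):
    def execute(self, task_input):
        # A real implementation would synthesize speech and listen for a reply.
        return f"spoken answer to: {task_input}"

class GUIQuestionInterface(TaskInterface):
    def execute(self, task_input):
        return f"typed answer to: {task_input}"

class Task:
    def __init__(self, task_input, interface):
        self.task_input = task_input        # e.g. a Question instance
        self.interface = interface          # plays the role of cfw:executedBy

    def run(self):
        # The node does not interpret the input itself; the interface does.
        return self.interface.execute(self.task_input)

task = Task("Did you find this application accessible?",
            GUIQuestionInterface())
print(task.run())
```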
Figure 4.8 contains a broader view of the Compflow specification. Every instance
of the Compflow ontology is formed by a Job at its root. Besides the Executable elements,
Compflow is characterized by a Workflow element which groups Activity elements. A
Workflow represents a series of activities linked to one another in a deterministic manner.
An Activity represents an abstract class that can either be an Executable element or
a Gateway element. Gateway elements constitute decision points which can be applied to
both input - by waiting for two or more activities to be completed before executing - or to
output - by launching multiple activities. An optional element such as Priority can be
used to introduce a more direct control over the execution of the workflow, and the State
element is used to denote the status of an activity when in execution. The full Compflow
specification includes additional elements which we will not require and thus ignore.
Another important element of the Compflow specification is the Event class. It is
important to note that the Compflow Event is different from the Event class from before.
Unfortunately, we cannot change the naming of the class, as it is mentioned in the literature.
For this reason, we will refer to the control flow ontology event class as Compflow Event,
as opposed to the Event class of the event specification presented before. The Compflow Event
class represents an operation with a specific input used within Workflows, while the base
Event class represents an event occurrence used for EPR specifications.

[Figure content: a Job has a Workflow (cfw:hasWorkflow) composed of Activities (cfw:hasActivity, cfw:hasFirstActivity, cfw:hasCurrentActivity, cfw:transitionTo) with associated Priority and State; an Activity can be an Event, an Executable or a Gateway; Executables such as Tasks are executed by Interfaces (cfw:executedBy), with EventInterface and TaskInterface as subclasses.]

Figure 4.8: Evaluation Control Flow Ontology Specification

4.4 Domain Language


An evaluation domain represents a defined evaluation area with a scope, a set of users
and a domain language. The latter must be created
using the generic domain language as a basis and following a set of necessary rules. The
language will represent the “evaluation schema”, being the basis for all processes within
the domain, from the specification of evaluations to their execution.
The domain language is the second part of our evaluation model. It represents the
semantic component of the domain and its creation is split into two phases: schema ex-
tension and the association process. The first phase has the objective of embedding the
domain with the semantic content that is associated with the research area of the future
domain. In other words, it consists of extending the general language specification with
new content that reflects what an evaluation is from the domain's perspective. The result of
this process creates the vocabulary which can be later used by evaluators to define new
evaluation scenarios. The second phase has the objective of making the added content
usable in the architecture and subsequently, in evaluation scenarios. The result of this pro-
cess is a set of association rules that will allow evaluation specifications to be automatically
transformed into execution specifications. Figure 4.9 illustrates the processes associated
with creating a domain language.

[Figure content: within the domain specification, the designer's schema extension defines the vocabulary used by evaluators, while the association process defines the association rules used by the evaluation model generation process (phase 3 to 4).]

Figure 4.9: Domain Language Creation Process

By splitting the domain language creation into two phases - schema extension and the association process -
we are separating the domain into two parts as well. The first is directed at evaluators
and consists of enabling them to create new evaluations using a set of elements defined
in a vocabulary. The second is associated with the execution of the evaluations, and
consists of deploying evaluations by transforming them into a language that
nodes universally understand. The main advantage of this approach is that evaluators are
spared from dealing with the deployment phase themselves, as the domain assumes that
step. This removes a strong logistical burden from evaluators and allows them to focus
on the evaluation itself rather than on its distribution and execution. In addition, the process
itself is automatic and is performed using the domain language itself, which also simplifies the
designer's responsibilities.
On a different note, to easily comply with the following rules and create a correct domain
language, designers may resort to ontology creation programs like Protégé [Stanford Center,
2015], which verify ontology correctness, support linkage between multiple ontologies and facilitate
multiple export formats.

4.4.1 Creating a domain - Schema extension


Creating a domain requires careful planning on the part of the designer regarding how
he foresees evaluations taking place, what type of users he expects and what kind
of environments will be supported. It requires an understanding of the area of research
in order to embed the necessary concepts into the domain itself so that the domain truly
represents an evaluation ground for that area.
The first phase of the domain language creation process is the schema extension. This
phase consists of embedding the base specifications from the general domain language
with new evaluation elements that represent the domain’s scope and objectives. These
evaluation elements are concepts which can be individually introduced or be part of already
existing ontologies. By fusing the concepts with the base specifications (enquiry, event), the
designer is creating evaluation elements which will compose the vocabulary of the domain.
The result of this process establishes the elements which evaluators are able to use in order
to create evaluation scenarios in the domain. Figure 4.10 illustrates this process.

[Figure content: the designer analyzes the domain concepts and extends the enquiry, event and EPR specifications, producing extended, domain-specific versions of each.]
Figure 4.10: Domain Language Creation Process - Schema Extension

The extension phase consists of expanding specific elements from the base specifications
with concepts related to the domain's research areas. As an example, let us consider a
domain whose focus is Literature Habits. In this domain, the designer/evaluator's objective
is to obtain information regarding the users' habits about literature, such as what type of
books they read, where they prefer to read them, in what kind of conditions they read,
etc. For this, the designer introduces the concepts Author, Book and Genre. These elements,
however, do not yet represent evaluation elements. They must be inserted into the existing
evaluation element specifications: either the enquiry or the event specification.

4.4.1.1 Extending the enquiry specification


Without loss of generality, let us use the aforementioned example, literature habits, to
illustrate the process of extending the enquiry specification. To start, the designer must extend
the two main classes of the language - Question and Answer - in order to be able to create
direct assessment situations, that is, situations where the user is interacted with directly. Looking at the
envisioned domain, the designer creates two new elements: a RecommendQuestion and
a FeedbackQuestion, the first aimed at obtaining recommendations and the second for
obtaining feedback for a specific book, author or genre. Figure 4.11 shows the result of the
extension in regard to the Question element.
The introduced elements can also be extended as seen in Figure 4.11. The Feedback-
Question is linked to three elements: an AuthorFeedbackQuestion, a BookFeedbackQues-
tion and a GenreFeedbackQuestion. These three elements are also FeedbackQuestions
and consequently Questions via subclassing. Each of them is linked to different elements
which will be important for their future execution. For instance, the AuthorFeedbackQues-
tion is linked to the Author element via the hasAuthor property which indicates that
every instance of an AuthorFeedbackQuestion will possess an Author as an argument.
Similar connections are also represented in the BookFeedbackQuestion and GenreFeedbackQuestion
elements. Note that only elements which extend the main Question element
via subclassing become qualified as evaluation elements of the domain's vocabulary.

[Figure content: RecommendQuestion and FeedbackQuestion extend Question via rdfs:subClassOf; AuthorFeedbackQuestion, BookFeedbackQuestion and GenreFeedbackQuestion extend FeedbackQuestion and are linked to the Author, Book and Genre concepts via domain:hasAuthor, domain:hasBook and domain:hasGenre.]

Figure 4.11: Enquiry Specification: Question extension example
The extended Question specification describes the structure of our future questions.
While we chose to describe a simple specification as an example, other elements could
easily be included. The Book element could be linked to the Author via a hasAuthor
property or include elements like date of publishing, editor, number of pages, etc. Similarly,
the example follows a more semantic approach, but the designer could also add a set of
properties focusing on other aspects, like interaction. In that particular case, the designer
could include properties like hasContent, representing a text that should be presented to
the user, or hasReproductionSpeed, indicating, in the case of a speech modality, the speed at
which the question should be presented.
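A possible rdflib rendering of part of this extension is sketched below; the namespace URIs, the hasContent property and the concrete instances are assumptions, and only the AuthorFeedbackQuestion branch of the example is shown.

```python
# Illustrative rdflib sketch of the Literature Habits question extension
# (namespace URIs and the concrete instances are assumed).
from rdflib import Graph, Namespace, Literal
from rdflib.namespace import RDF, RDFS

ENQ = Namespace("http://example.org/enquiryOntology#")
LIT = Namespace("http://example.org/literatureHabits#")

g = Graph()

# Schema extension: FeedbackQuestion extends Question; the author-specific
# question extends FeedbackQuestion and will be linked to an Author.
g.add((LIT.FeedbackQuestion, RDFS.subClassOf, ENQ.Question))
g.add((LIT.AuthorFeedbackQuestion, RDFS.subClassOf, LIT.FeedbackQuestion))

# Instance: a concrete author feedback question used in an evaluation.
author = LIT["author1"]
g.add((author, RDF.type, LIT.Author))
q = LIT["authorQuestion1"]
g.add((q, RDF.type, LIT.AuthorFeedbackQuestion))
g.add((q, LIT.hasAuthor, author))
g.add((q, LIT.hasContent,
       Literal("What do you think of this author's books?")))

print(g.serialize(format="turtle"))
```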
While the Question extension describes the specification on which the evaluator will
create actual instances of enquiries for evaluations, the Answer extension describes the
structure which will store the content produced by the user, that is, how the received
data be organized and structured. Following similar principles to the question extension
example, we present in Figure 4.12 an example of the Answer extension for the Literature
Habits domain.
For this example, we choose to create elements that counterpose the created question
elements. This is optional since the designer can go as far as creating a single class who
represents the entire answer structure. Nonetheless, we choose this method due to its
higher organization and overall understandment. As such, the AuthorFeedbackAnswer,
GenreFeedbackAnswer and BookFeedbackAnswer directly counterpose the AuthorFeed-
backQuestion, GenreFeedbackQuestion and BookFeedbackQuestion. Both the Genre-
FeedbackAnswer and BookFeedbackAnswer elements are linked to a Description element
via a hasDescription property. The BookFeedbackAnswer however is linked to a Rating
element. Both the Description and Rating elements are focused on the actual content of
the user’s feedback answer and less on other factors, like interaction. Once again, this is

Figure 4.12: Enquiry Specification: Answer extension example

Once again, this is merely optional.
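Under the same assumptions (and reusing the prefixes declared in the previous sketch), the answer hierarchy could be written as:

    domain:RecommendAnswer       rdfs:subClassOf enq:Answer .
    domain:FeedbackAnswer        rdfs:subClassOf enq:Answer .
    domain:AuthorFeedbackAnswer  rdfs:subClassOf domain:FeedbackAnswer .
    domain:GenreFeedbackAnswer   rdfs:subClassOf domain:FeedbackAnswer .
    domain:BookFeedbackAnswer    rdfs:subClassOf domain:FeedbackAnswer .

    domain:Description a owl:Class .   domain:Rating a owl:Class .
    domain:hasDescription a owl:ObjectProperty ; rdfs:range domain:Description .   # textual feedback
    domain:hasRating      a owl:ObjectProperty ; rdfs:range domain:Rating .        # used on BookFeedbackAnswer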
Creating the enquiry specification does not place a limit on the created elements, as
long as they extend the main classes - Question and Answer - as subclasses or are
linked to elements that do so (either directly or transitively). The specification must also
be a correct ontology. Apart from these rules, the designer is free to include whichever
elements or concepts he desires.

4.4.1.2 Extending the event specification


While the enquiry specification focuses on direct assessments, the event specification
focuses on passive assessments or, in other words, situations where the user produces
information without being asked to. In our approach, events are atomic elements that
occur within the user's environment and which do not directly require the user's intervention.
Much like the enquiry specification in regard to questions and answers, the event
specification consists of a description of every event that may occur within the domain's
evaluations.
For its elaboration, the designer may start by analyzing the users' nodes and probing
their context for possible events which could be helpful for future evaluations. Such a
step can aid the designer in predicting the contextual situations to which the users will
be subjected and crossing them with his own objectives. This, however, is only possible in
situations where the designer has access to the nodes. In other situations, the designer
should envision his future evaluation scenarios and create events that are compatible with
that vision.
The specification is created by extending the Event element of the event language and,
to aid in the explanation, we will resort to the Literature Habits domain used previously. In
this scenario, the designer had the objective of gathering information regarding the user's
reading habits. In addition to asking questions regarding books and authors, the designer
now adds the objective of knowing every instant in which the user is reading.

For this purpose, the designer introduces two new events: StartedReadingEvent and
EndedReadingEvent. The first indicates the precise time at which the user started
reading, while the second indicates when the user finished reading. Remember that, in
our approach, events are atomic by definition and, as such, it is more correct to produce two
events than a single one with an interval property (although the latter is also possible). Figure 4.13
illustrates the full resulting specification.

Figure 4.13: Event Specification: Extension example

Both the StartedReadingEvent and EndedReadingEvent are placed as subclasses of
the Event class and inherit the Timestamp element by definition. Much like the FeedbackQuestion
elements from the Enquiry specification, these two events are also extended into
more specific events, namely the StartedReadingEbookEvent, StartedReadingPaperEvent,
EndedReadingEbookEvent and EndedReadingPaperEvent elements. By doing so,
we are creating two types of reading events, one associated with ebooks and the
other with paper books. While the paper book events are not linked to any additional properties,
the ebook events are linked to the Book element, clearly stating which book the event
refers to.
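A Turtle sketch of this extension, reusing the illustrative prefixes from the earlier sketches and assuming an evt: prefix for the base event language, could be:

    @prefix evt: <http://example.org/event#> .   # base event language (assumed URI)

    domain:StartedReadingEvent       rdfs:subClassOf evt:Event .
    domain:EndedReadingEvent         rdfs:subClassOf evt:Event .
    domain:StartedReadingEbookEvent  rdfs:subClassOf domain:StartedReadingEvent .
    domain:StartedReadingPaperEvent  rdfs:subClassOf domain:StartedReadingEvent .
    domain:EndedReadingEbookEvent    rdfs:subClassOf domain:EndedReadingEvent .
    domain:EndedReadingPaperEvent    rdfs:subClassOf domain:EndedReadingEvent .

    # The ebook events reuse domain:hasBook to reference the concrete Book being read;
    # the paper events carry no additional property.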
In the process of creating the event specification, the designer must already have in
mind which software or hardware elements will produce the events, since every event requires
one or more producers. While that specification is not made here but in the next step of
the domain language creation process, it is already embedded into this definition with the
paper and ebook differentiation. In regard to ebooks, it is easy to envision a process in
which the ebook reader detects that the user has opened an ebook and communicates the
event linked with a reference to the actual book. For paper books, however, it is hard
to envision a process capable of assessing which book the user is reading without
requiring the user's intervention.
The subclassing mechanism also has a different role here in comparison to the Enquiry
specification due to the existence of the EPRs. If the designer later creates an EPR linked
to an event like the StartedReadingEvent, both the StartedReadingPaperEvent and
the StartedReadingEbookEvent will fit the EPR and might trigger it according to its
dependencies.
Like the Enquiry specification, the Event specification does not impose any limitation on
which elements are linked to the added events, as long as they extend the main
Event class or are linked to elements that do so (directly or transitively). The specification
must also be a correct ontology.

4.4.1.3 Extending the EPR specification


An EPR allows evaluators to inspect complex situations by combining different events
with a set of predefined operations. By extending the EventOperation, the designer is
able to add new operations to the already existing set, as explained in Section 4.3.3.
In order to provide an example, we will use the Literature Habits domain and add an
EventOperationSchedule, which consists of detecting whether an event or an EPR is triggered within
a time interval. This operation can be helpful to the evaluator in situations where he wants
to trigger an enquiry only if a certain event is detected within a certain period of time.
The operation itself needs two arguments: a start timestamp and an end timestamp.
Since it requires additional variables for its instantiation, the operation becomes a predicate.
As such, and according to the base EPR specification, the operation must be added
as a subclass of the EventOperatorFunction.
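Under the same assumptions, the operation extension could be sketched in Turtle as follows; the epr: prefix stands for the base EPR language, and the exact names and namespaces of the argument properties are assumptions:

    @prefix epr: <http://example.org/epr#> .             # base EPR language (assumed URI)
    @prefix xsd: <http://www.w3.org/2001/XMLSchema#> .

    # The new operation is a predicate, hence a subclass of EventOperatorFunction
    domain:EventOperationSchedule rdfs:subClassOf epr:EventOperatorFunction .

    # Its two arguments: the boundaries of the time window
    epr:hasStartTimestamp a owl:DatatypeProperty ; rdfs:range xsd:dateTime .
    epr:hasEndTimestamp   a owl:DatatypeProperty ; rdfs:range xsd:dateTime .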
Contrary to the event and enquiry specifications, EPR extensions may also require
some modifications within the nodes. This requirement is described in Chapter 5. The
specification's requirements, however, are similar: the designer must produce a correct
ontology and respect the base specification in regard to properties and existing classes,
and new operations must be added according to their signature.

4.4.2 Creating a domain - Association Process


The first phase of domain creation resulted in the extension of a set of specifications -
the enquiry, event and EPR specifications - with elements from the domain's research area.
In doing so, the domain was endowed with a series of new assessment instruments which
make up its vocabulary and will become available for future evaluations in the form of new
events, new questions and new EPRs. To fully create a domain, however, it is necessary
to guarantee that these newly created assessment instruments can be executed within the
nodes, that is, that they are compatible with the nodes' architecture.
To achieve compatibility between all nodes and all domains, we have previously ex-
plained that it is necessary to create a bridge between them and identified the control
flow ontology as that bridge. As such, while nodes are not able to interpret the extended

specifications due to the diversity and independence of the domains, they are able to un-
derstand control flow instantiations natively. Phase two of the domain’s creation process
focuses on this connection by featuring an association between the extended specifications
and the control flow ontology.
The result of this association is the final domain language that defines evaluations in
the context of the domain, that is, which assessment elements it includes, in the form
of questions or events, and also how they will be interpreted and executed in the domain's
associated nodes. Figure 4.14 illustrates this process conceptually.

Figure 4.14: Domain Language Creation Process - Association Process

In the figure, the Enquiry, Event and EPR specifications are associated with the control
flow ontology, resulting in an enquiry domain ontology, an event domain ontology and an
EPR domain ontology. Together, they constitute the domain language of the future domain,
but while all three require an association with the control flow ontology, each follows a
different process.

4.4.2.1 Introducing control flow elements into the Enquiry extension specification
In order to better explain the association process, we will once again resort to the
Literature Habits domain and use the resulting specifications from phase one. The extended
Enquiry specification featured two new types of Questions: a FeedbackQuestion and
a RecommendQuestion. The former was also extended by three additional types:
an AuthorFeedbackQuestion, a BookFeedbackQuestion and a GenreFeedbackQuestion.
The association process requires the designer to associate each question type with a Task
element using the cfw:hasInput property. Figure 4.15 illustrates the result of this process
for our example.
The figure focuses on the FeedbackQuestion element and one of its descendants. In it, a
newly formed FeedbackQuestionTask receives the FeedbackQuestion as an input. A similar
process should be followed for the AuthorFeedbackQuestion, BookFeedbackQuestion and

Figure 4.15: Linking task elements with the Enquiry specification of the extension phase to compose the enquiry's domain language

the GenreFeedbackQuestion. Each of these processes represents an execution guideline
for future evaluations: it indicates to a node which types of tasks evaluations
from this domain can include and how to deal with them. In the figure, we have
linked not only the question elements but also the answer elements to Task subclasses. By doing
this, we are clearly stating that, for instance, the FeedbackQuestionTask receives a
FeedbackQuestion as its input and produces a FeedbackAnswer as its output. In the figure, we
have chosen to include only a few elements and omit the full structure of both the question
and answer elements which resulted from the extension specification phase. Note that
the full structure includes all those elements, but only the subclasses of the Question
and Answer elements can be linked to Task elements.
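In Turtle, and assuming a cfw: prefix for the control flow (Compflow) ontology, this association could be sketched as:

    @prefix cfw: <http://example.org/compflow#> .   # control flow ontology (assumed URI)

    domain:FeedbackQuestionTask rdfs:subClassOf cfw:Task ;
        cfw:hasInput  domain:FeedbackQuestion ;
        cfw:hasOutput domain:FeedbackAnswer .

    domain:AuthorFeedbackQuestionTask rdfs:subClassOf domain:FeedbackQuestionTask ;
        cfw:hasInput  domain:AuthorFeedbackQuestion ;
        cfw:hasOutput domain:AuthorFeedbackAnswer .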

Identifying interface elements In Section 4.3.4, we indicated that nodes include a
main component which interprets Compflow instantiations, leaving the task of interpreting
and executing content (inputs and outputs) to specific software components - the
interfaces. With this separation, we are defining a common aspect between all nodes - the
Compflow ontology - which all of them are capable of interpreting. On the other hand, we are also
allowing designers to include whichever aspects they want within their domains, by imposing
no boundaries on the event, enquiry and EPR specifications.
Since, in our approach, domains and nodes can be added incrementally, by designing
evaluations based on Compflow's specification we guarantee that all evaluations,
past, present or future, can be run inside every node. Since inputs and outputs are not
controllable, as they are limitless and depend on the designer, it is impossible to expect the
node to interpret all content natively; as a solution, we outsource their execution via
interfaces. For this purpose, in addition to the association with Compflow elements, the
designer must indicate in the domain language which interface is responsible for tackling
each task or event. Figure 4.16 adds this concept to the previous illustration.

Figure 4.16: Integrating interface elements within the enquiry's domain language

For each added Task element, the designer must explicitly indicate which interface
or interfaces are responsible for interpreting that task's input and output elements. In
the provided example, this is done by associating the FeedbackQuestionTask with the
FeedbackQuestionTaskInterface through the cfw:isExecutedBy property. This association
states that, within this domain, the FeedbackQuestionTaskInterface is responsible
for dealing with every task which is a FeedbackQuestionTask. Since the AuthorFeedbackQuestionTask
is a subclass of the FeedbackQuestionTask, it is necessarily compatible
with the FeedbackQuestionTaskInterface. Optionally, the designer may also include
a timeout indication stating how long a Task can be active, using the cfw:hasTimeout
property.
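A minimal Turtle sketch of this indication could be as follows; the interface class names follow the example, while the timeout value and its datatype are purely illustrative:

    domain:FeedbackQuestionTaskInterface rdfs:subClassOf cfw:TaskInterface .

    domain:FeedbackQuestionTask
        cfw:isExecutedBy domain:FeedbackQuestionTaskInterface ;
        cfw:hasTimeout   "PT10M"^^xsd:duration .   # optional; at most 10 minutes active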
When dealing with an evaluation, a node will receive the corresponding domain's specification
and use it to process the received tasks. For each Task, the node will check which
interface (or interfaces) are linked to it and make a request to that interface with the
corresponding input information. The interface then processes the request and replies with
an output structure, which must correspond to the Answer specification that the Task is
linked to as its output.

4.4.2.2 Introducing control flow elements into the EPR extension specification
EPRs follow a different approach from the Enquiry specification. Chapter 5 explains the
node's architecture in detail, but for comprehension purposes it is important to anticipate
that, within a node, EPRs are processed by an individual component called the EPR
Engine. As previously explained, the node's main component (which controls the node) is
only able to interpret Compflow representations, for compatibility purposes, relying on the
interface mechanism for input and output processing. Because of this, the EPR Engine
is seen by the main component as an interface and requires an explicit indication within
the domain's specification. Due to this, the designer must associate the extended EPR
ontology, as he did for the Enquiry, with an indication of the interface that interprets it.
Contrary to the Enquiry specification, the resulting EPR ontology is composed
of a single association and does not change from domain to domain unless changes are
made in the node's architecture. Because EPRs represent passive assessment instruments
within evaluation assessments (contrary to Enquiries, which represent direct assessment
instruments), they are linked with Compflow's Event element within the EPR domain
ontology. Figure 4.17 shows this ontology.

Figure 4.17: EPR's domain ontology

In the figure, the CompflowEvent class is extended into an EPREvent class and linked to
an EPRInterface. This association indicates to the node's main component that all EPRs
are processed by the EPRInterface, which is the EPR Engine. Despite being composed
of multiple events, the occurrence of the EPR is seen by the node's main component as
a single event, the EPREvent. As such, the EPR itself will be given to the node as the
input of the EPREvent class and eventually passed to the interface - the EPR Engine. It
is the responsibility of the EPR Engine to process all events within the node and create
an instance of the EPREvent when the EPR is considered completed. This instance is
then used by the node's main component to advance within workflows and process ongoing
evaluations.
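A Turtle sketch of this fixed association, with the assumed prefixes used so far (the exact class names within the cfw: namespace follow the text and Figure 4.17 and are assumptions), could be:

    domain:EPREvent rdfs:subClassOf cfw:CompflowEvent ;
        cfw:hasInput     epr:EventProcessingRule ;
        cfw:isExecutedBy domain:EPRInterface .

    domain:EPRInterface rdfs:subClassOf cfw:EventInterface .   # realised by the node's EPR Engine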

4.4.2.3 Introducing control flow elements into the Event extension specification

In our model, enquiries represent direct assessment instruments while EPRs represent
indirect/passive assessment instruments. EPRs are defined as a combination of events
and event operations, which can range from a single event to a complex combination of
elements. Because events are part of EPRs, we decided that events should always be indicated
within evaluation assessments as part of an EPR and never on their own. While
this has no consequence for the evaluator, this definition allows us to link events
with the EPR Engine on the nodes, thus concentrating the detection of events in a single
component.

In regard to the specification, this definition makes a major difference. While EPRs and
enquiries have inputs and outputs, events do not. By being specified as part of EPRs,
events are only considered when EPRs are active and, since they do not require an
input, interfaces are not appropriate. Nonetheless, it is necessary to specify which component creates
each event and, to do so, we introduced the notion of an atomic event producer, which
represents a software component that creates events.
In our approach, interfaces and producers have very different objectives. An interface
receives requests, which it interprets and processes, returning a result. It
possesses an API which others may use to contact it. EPRs and enquiries require this
in order to execute accordingly. A producer, however, is a mere content creator. It does
not need to receive requests and, as such, applies a unidirectional communication method.
This distinction will be important when discussing the architectural aspects of our solution,
as producers and interfaces have different methods of operation inside nodes.
In regard to the association process, the designer still has to associate each event of the
extended event specification with an event producer via the cfw:isExecutedBy property. In
this association, every event producer must be a descendant of the AtomicEventProducer
class (which belongs to the Event ontology). The resulting event domain ontology will
later be used to explicitly indicate to the node which producers it requires in order to
be compatible with the domain.
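For the Literature Habits example, this association could be sketched as follows; the producer class names are hypothetical, introduced only for illustration:

    # Hypothetical producers for the reading events
    domain:EbookReaderProducer rdfs:subClassOf evt:AtomicEventProducer .
    domain:PaperBookProducer   rdfs:subClassOf evt:AtomicEventProducer .

    domain:StartedReadingEbookEvent cfw:isExecutedBy domain:EbookReaderProducer .
    domain:EndedReadingEbookEvent   cfw:isExecutedBy domain:EbookReaderProducer .
    domain:StartedReadingPaperEvent cfw:isExecutedBy domain:PaperBookProducer .
    domain:EndedReadingPaperEvent   cfw:isExecutedBy domain:PaperBookProducer .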

4.5 Evaluation Specification


The previous part, the domain language, had the objective of allowing designers to build
evaluation domains. In this third part, the domain is already built. With it, evaluators can
now create evaluations using the definitions of the domain language. Before we explain
the necessary steps to create an evaluation and what an evaluation specification is, it is
important to note that the following steps will be automated by an architectural component,
described later in Section 5.3.2. As such, we do not expect evaluators to have to create their evaluations by
manipulating ontologies. To explain the underlying model, however, we will assume they do, and
ignore the automation process.
The third part of the evaluation model is based on the construction of evaluation
specifications. An evaluation specification represents an actual evaluation test definition
within a single domain. The definition comprises the elements (like EPRs or enquiries)
that will be applied to a set of users in order to obtain information according to a set of
objectives.
To assist the evaluator in the creation of this specification and to establish a common
representation of evaluations in our proposal, we defined two additional ontologies: the
evaluation and the evaluation assessment ontologies. These ontologies allow the evaluator
to define which elements compose the evaluation scenario and how they should be applied
to the user. In our proposal, evaluations are defined using evaluation assessments. Assessments
link and establish a sequential order between evaluation elements and can range
from complex examples that simulate cause-effect situations to simpler examples, like a
simple enquiry.
The process of creating an evaluation specification started in the previous section,
when the domain's vocabulary was defined. The elements in the vocabulary, however, are
abstract or, in other words, they are similar to classes in programming. To use them in
actual evaluations, evaluators must create instances of these elements and establish actual
questions and EPRs. Note that, much like with a class, instances may differ in content but will
follow the same procedure as their parent class when executing. The result of this process
is a set of instances which evaluators can utilize to build a set of evaluation assessments
following the assessment ontology specification. Figure 4.18 illustrates the steps involved
in creating an evaluation specification.

Figure 4.18: Evaluation Creation Process Flow

The figure represents the steps involved in creating an evaluation specification. To
better explain this process, we now present each step for creating and applying the evaluation
specification in detail, starting with the evaluation and evaluation assessment ontologies.

4.5.1 Creating an evaluation - Evaluation Ontologies


In our approach, evaluations are constructed and defined according to a domain language.
An evaluation specification includes a set of assessment instruments, like enquiries
or events, but it is not associated with its target audience, the users. This is because the
application of an evaluation can occur multiple times, each to a different target. By
separating the definition from its instances, we are able to apply the same evaluation definition
repeatedly. For instance, the evaluator may create an evaluation for a set of users
today and reuse that same evaluation specification tomorrow with a totally different set of
users. These aspects will be part of an instantiation. Figure 4.19 shows the specification
which embodies these concepts.
In the figure, the main Evaluation class is associated with another class called EvaluationInstantiation.
An evaluation's specification is made through the Evaluation element
but its executions are made through the EvaluationInstantiation element. This
Figure 4.19: Evaluations and Evaluation Instances Specification

specification also includes two additional properties to characterize each instantiation:
eval:hasUser and eval:hasEvaluation. The latter identifies to which evaluation an
instantiation belongs, while the former identifies the users who participate in it.
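A Turtle sketch of this part of the model, with an assumed eval: prefix (the date values are modelled here, for illustration only, as xsd:dateTime literals):

    @prefix eval: <http://example.org/evaluation#> .   # evaluation ontology (assumed URI)

    eval:Evaluation              a owl:Class .
    eval:EvaluationInstantiation a owl:Class .

    eval:hasCreator    a owl:ObjectProperty ;
        rdfs:domain eval:Evaluation ;              rdfs:range eval:Evaluator .
    eval:hasEvaluation a owl:ObjectProperty ;
        rdfs:domain eval:EvaluationInstantiation ; rdfs:range eval:Evaluation .
    eval:hasUser       a owl:ObjectProperty ;
        rdfs:domain eval:EvaluationInstantiation ; rdfs:range eval:User .

    eval:hasStartDate a owl:DatatypeProperty ;
        rdfs:domain eval:EvaluationInstantiation ; rdfs:range xsd:dateTime .
    eval:hasEndDate   a owl:DatatypeProperty ;
        rdfs:domain eval:EvaluationInstantiation ; rdfs:range xsd:dateTime .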

Evaluation Assessment Specification In order to apply evaluations to the users, it is
necessary to elaborate some sort of plan that applies a series of evaluation instruments
to the user. In other words, it is necessary to specify when an enquiry will be triggered
and presented to the user or when a given EPR is active.
To specify these aspects, we have already introduced the notion of evaluation assessments.
Rather than being a specification for the domain, the evaluation assessment specification
defines a structure for evaluation execution flows. Previously, we have spoken of nodes
as abstract locations where evaluations will take place. For that to be possible, that is, for
an evaluation to be created and deployed to a node, it is necessary to allow the evaluator
to specify the evaluation according to some structure. Assessments do this by being a
standard representation of evaluations for all nodes and all domains.
An evaluation assessment is a set of evaluation items - which can be enquiries, events, or
others added later - with a fixed order of execution. By definition, evaluation
assessments are conditional. If an assessment contains an event and that event never
occurs, the assessment will not be completed. Such situations are acceptable within the
concept of dynamic evaluation. When the event does occur, it allows the evaluator
to extract contextualized data about a certain scenario which he would not be able to
obtain otherwise.
The base element of the evaluation assessment ontology is the EvaluationAssessment
element. Every EvaluationAssessment element represents a single assessment and is composed
of a set of EvaluationAssessmentElement elements. In turn, each EvaluationAssessmentElement
represents a single EvaluationItem - an abstract element -
that can be an Enquiry or an EPR. Events can also be inserted into assessments, but always
under an EPR. The property ast:followedBy is used between EvaluationAssessmentElement
elements in order to establish an order of execution between them. Figure 4.20
illustrates the conceptual representation of the specification.
The element EvaluationItem is an abstract element with the purpose of allowing
the language to be extensible. If, in the future, new evaluation assessment methods are
Figure 4.20: Evaluation Assessment Specification

included, evaluators will be able to add them without changing the entire specification.
It is also important to note that assessments can be applied at a single moment of the
entire evaluation, periodically following a schedule, or depending on a certain action
or event.

4.5.2 Creating an evaluation - Creating instances of the Enquiry and EPR Specifications
To create an evaluation, the evaluator needs to start by creating instances of the do-
main’s Enquiry and EPR extended specifications. The Event specification is exempt from
this step because events are only used within EPR instances and never by themselves.
To explain this process, we resort to the Literature Habits domain from previous iter-
ations. In that example, the Enquiry extended specification included several Question
subclasses like the RecommendQuestion, FeedbackQuestion and its descendants,
AuthorFeedbackQuestion, BookFeedbackQuestion and GenreFeedbackQuestion. To in-
clude them in actual evaluations, we now need to create individuals/instances of each of
these classes.
We provide an example of an instantiation of the BookFeedbackQuestion element in

Figure 4.21. In it, the BookFeedbackQuestion is associated with a BookFeedbackQuestionA
and a BookFeedbackQuestionB via the rdf:type property, specifying them as instances. To
enrich the example, we also added a Title element to the extended Enquiry specification
and, much like the BookFeedbackQuestion instances, Book and Title instances were also
included.

Figure 4.21: Creating an instance of the Enquiry extended specification

In the example, we are creating two actual questions. One of the questions is associated
with the “Pride and Prejudice” novel while the second concerns the “Brave New World”
book. Both of them conform to the base BookFeedbackQuestion class and can be
used by evaluators within any evaluation belonging to this domain. Our base evaluation
assessment specification was not designed to receive individual questions but only EPRs or
enquiries; as such, the questions must be contained within an Enquiry element. This
decision was made to maintain a direct correspondence between the evaluation assessment
elements and the Compflow elements, but can be changed if necessary.
Since the extended Enquiry specification did not include any subclasses of
the Enquiry class, we created an instance of the base Enquiry class, which we called
EnquiryA. To adhere to the base Enquiry specification, the enq:hasFirstElement and
enq:hasTransitionTo properties were also included and linked to the question instances.
The enq:hasFirstElement property indicates which question is the first to be executed within the
Enquiry, while enq:hasTransitionTo links each Question element to the next, giving
them an execution order. In the example, BookFeedbackQuestionA is set as
the first element of EnquiryA and is followed by BookFeedbackQuestionB.
This example provides a complete enquiry - EnquiryA - which becomes available for
inclusion in every evaluation that belongs to the domain.
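An illustrative Turtle rendering of these instances (identifiers and literal values follow the example; the title is modelled as a plain literal for simplicity):

    domain:BookA a domain:Book ;  domain:hasTitle "Pride and Prejudice" .
    domain:BookB a domain:Book ;  domain:hasTitle "Brave New World" .

    domain:BookFeedbackQuestionA a domain:BookFeedbackQuestion ;
        domain:hasBook      domain:BookA ;
        enq:hasTransitionTo domain:BookFeedbackQuestionB .
    domain:BookFeedbackQuestionB a domain:BookFeedbackQuestion ;
        domain:hasBook      domain:BookB .

    domain:EnquiryA a enq:Enquiry ;
        enq:hasFirstElement domain:BookFeedbackQuestionA .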

4.5.2.1 Instantiating the EPR Specification

The creation of an EPR instance focuses on the association between operations from the
EPR's extended specification and Event classes from the Event specification. To exemplify,
we will create an EPR instance with the objective of determining whether a StartedReadingEbookEvent
or a StartedReadingPaperEvent is produced between 19:45 and 20:45
on the 12th of September of 2015. We chose this example in order to include operations
from the base specification, operations from the extended EPR specification and events
from the extended Event specification.
For this example, we instantiated the EPR class with an EPR A element to represent a
specific EPR instance. At the top of the EPR, we introduced an EventOperationSchedule
instance and specified the start and end timestamps as required by the operation's specification.
Then, we used the epr:hasRuleElement property to link the operation to its
argument. Since we want the EPR to be triggered if either the StartedReadingEbookEvent
or the StartedReadingPaperEvent occurs, the choice fell on the OR operation. Then,
once again using the epr:hasRuleElement property, we linked the OR operation to the base
events. The result of this process is shown in Figure 4.22. Despite not being represented
in the figure, all operation instances are associated with their respective classes via the
rdf:type property.

Figure 4.22: Example of an instance of the EPR extended specification

The construction of EPR instances is simple and mainly based on linking operations
to one another with the epr:hasRuleElement property. The only rule for EPR
creation is that all operations must be instantiated in accordance with their specification.
Events do not require instantiation and should be included directly in the EPR instantiation.
Also note that certain EPR instantiations can result in impossible situations. These
must be verified by the evaluator; an instantiation is still considered valid as long as it abides
by the EPR specification.
As with the Enquiry instantiation, the result of this process is an EPR instance
- EPR A - which can be used within every evaluation of this domain.
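A Turtle sketch of EPR A (instance identifiers use underscores in place of spaces; the timestamp literals and the name of the OR operation class are illustrative):

    domain:EPR_A a epr:EventProcessingRule ;
        epr:hasRootElement domain:EventOperationScheduleA .

    domain:EventOperationScheduleA a domain:EventOperationSchedule ;
        epr:hasStartTimestamp "2015-09-12T19:45:00"^^xsd:dateTime ;
        epr:hasEndTimestamp   "2015-09-12T20:45:00"^^xsd:dateTime ;
        epr:hasRuleElement    domain:EventOperationA .

    domain:EventOperationA a epr:OR ;   # the OR operation from the base specification
        epr:hasRuleElement domain:StartedReadingEbookEvent ,
                           domain:StartedReadingPaperEvent .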

4.5.3 Creating an evaluation - Defining the Evaluation Assessments
In the previous process, instances were created for both the Enquiry and EPR specifications
but, in order to apply them in the domain's evaluations, they must first be enclosed
within evaluation assessments. In our approach, an evaluation is defined by a set of evaluation
assessments. An evaluation assessment allows evaluators to link different assessment
instruments in a specific execution order. In Section 4.5.1 we explained the base specification
of an evaluation assessment and indicated that an assessment can be composed of
EPR and Enquiry elements.
To create an assessment, the evaluator should start by creating an instance of the
EvaluationAssessment class. Then, the evaluator should pick the elements which will
constitute the assessment, that is, the assessment instruments. For this step, the evaluator
can use every instance of the Enquiry and EPR specifications that belongs to the corresponding
domain. For each of the selected elements, the evaluator must create an instance
of the EvaluationAssessmentElement class and link it to the selected element with the
ast:represents property. These elements will be used to establish an execution order
between the assessment instruments. To do this, the EvaluationAssessmentElement
instances should be associated with one another using the ast:followedBy property. To
complete the assessment, the evaluator needs to associate the EvaluationAssessment
instance with the EvaluationAssessmentElement that represents the first element of the flow using the
ast:hasFirstElement property. In Figure 4.23, we illustrate an assessment example using
the EPR A and EnquiryA instances from the previous section. In the example, an assessment
instance entitled EvaluationAssessmentA includes two assessment instruments in a
strict order.
In regard to this assessment's execution, the assessment would start by enabling EPR
A. When EPR A is considered completed, the assessment would advance to the
EnquiryA instance. Concretely, and following the definition of these instances, the node
would wait for the interval between 19:45 and 20:45 and check whether either a StartedReadingEbookEvent
or a StartedReadingPaperEvent occurs. If so, it would advance and interrogate
the user regarding “Pride and Prejudice” and then regarding “Brave New World”.
After doing so, the assessment would terminate. If the EPR never triggered,
the assessment would be declared incomplete by the evaluation.
Since evaluation assessments are the basic evaluation elements of evaluation scenarios,
every EPR or Enquiry - even if single - must be encapsulated by them in order to be used
within an evaluation. The full specification of an evaluation is enclosed by an instance of
the Evaluation element, which is connected to all evaluation assessments that will take part in

Figure 4.23: Evaluation Assessment example specification

the evaluation.
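In Turtle, and assuming an ast: prefix for the assessment ontology, EvaluationAssessmentA could be sketched as:

    @prefix ast: <http://example.org/assessment#> .   # evaluation assessment ontology (assumed URI)

    domain:EvaluationA a eval:Evaluation .

    domain:EvaluationAssessmentA a ast:EvaluationAssessment ;
        ast:hasEvaluation   domain:EvaluationA ;
        ast:hasFirstElement domain:EvaluationAssessmentElementA1 .

    domain:EvaluationAssessmentElementA1 a ast:EvaluationAssessmentElement ;
        ast:represents domain:EPR_A ;
        ast:followedBy domain:EvaluationAssessmentElementA2 .

    domain:EvaluationAssessmentElementA2 a ast:EvaluationAssessmentElement ;
        ast:represents domain:EnquiryA .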

4.6 Execution Specification


The previous section described how to create an evaluation based on the specification
for a domain. When reaching this phase, the evaluator has already defined a full evaluation
but has not yet applied it. Note that, in our approach, an evaluation is a reusable concept
which does not contemplate its application. The same evaluation can be applied several times
to different groups of users with different scheduling options.
In Section 4.4, we explained that the process of creating a domain is divided into two
parts, the extension and association phases. When creating the evaluation specification, we
resorted to the extended specifications and created instances of EPRs and Enquiries to be
used within a domain's evaluation. To compose an evaluation, we established assessment
elements which provided an execution order to the instances, thus creating a full evaluation
specification. Now, it becomes possible to apply the evaluation by selecting a set of users
and a schedule and sending it to the users.
A node, however, is not able to interpret evaluation specifications due to their dependence
on the domain language (and its flexibility). As such, it is necessary to represent
the evaluation in a standard format understandable by all nodes.
The execution specification is this format and represents the fourth and last part of our
evaluation model. Contrary to the previous phases, this part of the model does not follow
the same incremental methodology but rather a generation process based on the domain language,
where each assessment in the evaluation specification results in a generated execution
specification. The full set of execution specifications represents the evaluation sent to the
node. Figure 4.24 illustrates the method.

Figure 4.24: Transforming an evaluation specification into a set of execution specifications

The figure illustrates the process of instantiating an evaluation for a set of users: the
evaluation specification is transformed into a set of execution specifications using the domain
language and the control flow ontology. Remember from the previous section that, besides
the vocabulary, the domain language also encompasses a set of association rules that link
enquiry and EPR elements to control flow classes. Those rules establish a direct correspondence
between elements like Question, Answer or EPR and control flow elements like Task,
Event or Workflow, which form the basis of what the node is able to interpret. The
transformation process uses these associations - as well as other control flow ontology
elements - to generate an execution specification from an evaluation assessment (in the
evaluation specification). The result is a specification that is independent of the domain
and understandable by all nodes in the infrastructure.
While an evaluation specification is a generic representation of an evaluation, each
execution specification targets a single user. The reason for this is that each execution
specification not only specifies the content of an evaluation assessment in a format
understandable by the node but also becomes the repository for gathered data. In
other words, nodes will not only read the specification to know how to apply the evaluation
but also write in it, thus placing the gathered data in the same structure. Once again, this
method guarantees generic treatment of gathered data in the nodes without breaking
the architectural support. During execution, results are added as each evaluation element is
filled, and evaluators become able to access the specification and analyze results from the
assessment without disrupting the evaluation. In addition to the actual results gathered
from interacting with the user, the usage of the control flow ontology as the basis for the
specification also becomes a source of new data, as information related to the application
of the evaluation itself is stored as well.
In this section, we will describe the process of creating the execution specification for
an evaluation. It is important to note that, within the architecture, this process will be
automatic, thus not requiring any human intervention.

4.6.1 Applying an evaluation - Instantiating an evaluation


To apply an evaluation, the evaluator first needs to select the evaluation's target.
For this part, the evaluator must use the evaluation specification from Section 4.5.1 and
start by creating an instantiation of a defined evaluation. In this step, we are creating an
instance of the evaluation which will correspond to a specific schedule and a specific group
of users. As such, the evaluator must then associate the participants with the created
instance via the eval:hasUser property. Finally, the evaluator must define the schedule of the
instantiation, that is, when it commences and when it ends. Resorting to the Literature
Habits domain used until now, we present in Figure 4.25 an instantiation example of the
EvaluationA produced in the last section.

Figure 4.25: Evaluation Instance example

The example includes two users who will participate in the evaluation, UserU1 and
UserU2. To identify this specific application of the evaluation, the element EvaluationInstantiationA1
is associated with EvaluationA via the eval:hasEvaluation property. Two timestamp
elements are also associated with the instance via the eval:hasStartDate and eval:hasEndDate
properties.
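A Turtle sketch of this instantiation (user identifiers and dates follow the example; the date literals are an illustration):

    domain:UserU1 a eval:User .
    domain:UserU2 a eval:User .

    domain:EvaluationInstantiationA1 a eval:EvaluationInstantiation ;
        eval:hasEvaluation domain:EvaluationA ;
        eval:hasUser       domain:UserU1 , domain:UserU2 ;
        eval:hasStartDate  "2015-08-15T19:00:00"^^xsd:dateTime ;
        eval:hasEndDate    "2015-08-15T22:00:00"^^xsd:dateTime .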

It is important to note that evaluation assessments from EvaluationA are also instantiated
and associated with EvaluationInstantiationA1, since they relate to that
particular occurrence of the evaluation and not to the evaluation's specification. This process,
however, no longer concerns the evaluation's specification but its execution
specification.

4.6.2 Applying an evaluation - Creating execution specifications


An execution specification is a set of control flow elements based on an evaluation
assessment. For each assessment belonging to an evaluation, an execution specification
is created and included in the request that is sent to the node. The node will use the
contained information to apply the assessment to the user, storing all gathered data
within the same structure.
Because the architecture uses the domain language as a basis for evaluation creation
and instantiation, it is necessary to associate evaluation specifications from domains with
the results from the nodes. Since the results from the nodes are stored within execution
specifications, we created a “linking structure” which establishes a correspondence between
each execution specification and the corresponding evaluation assessment. This way, it is
possible to analyze the execution specification by starting from the assessment that generated
it. As a whole, this method creates a direct correspondence between the evaluation
specification and the set of data that resulted from its application to a user.
The second step of the transformation process consists of transforming each assessment
of the evaluation specification into a Job instance of the control flow ontology. Each Job
element corresponds to an execution specification that can be applied to a user. In order to
describe these two steps, we will use the resulting specifications - EvaluationAssessmentA,
EvaluationInstantiationA1 and EvaluationA - from the Literature Habits domain example
of the previous sections.

4.6.2.1 Instantiate the Evaluation for each User


The first step of the process consists of creating the “linking structure” between the
evaluation specification and the future execution specifications. For each assessment within
the evaluation specification, an instance of the EvaluationAssessmentInstantiation element
should be created. This element will become the bridge between an assessment and
its execution within a specific node. In Figure 4.26 we extended the resulting specification
from Section 4.6.1 by including the element A_A1_U1.
In the example, A_A1_U1 represents the execution of EvaluationAssessmentA within
EvaluationInstantiationA1 for UserU1. This process must be repeated for each user
who will take part in the evaluation. To identify the user to whom it belongs, the
element should be associated with the user via the eval:hasUser property. Similarly,
to identify both the assessment and the evaluation instantiation, the element must also
be linked through the ast:hasEvaluationAssessment and ast:hasEvaluationInstantiation
properties, respectively.

Figure 4.26: Creating assessment instances for a single user within a single evaluation instantiation - EvaluationInstantiationA1

4.6.2.2 Create control flow representations for each Evaluation Assessment


The second step of the process consists of creating a control flow representation characterizing
the execution process of an assessment. The result of the process is a self-contained
ontology - the execution specification - that will be deployed onto a node and apply the
evaluation assessment to the user.
To start, it is necessary to create a Job element for each evaluation assessment in the
selected evaluation. The Job element represents the root element of the execution specification.
To preserve the association between each EvaluationAssessmentInstantiation
and its Job instance, the cfw:hasJob property can be used within the evaluation's specification.
According to the control flow ontology specification, every Job is composed of a
base Workflow. As such, it is necessary to create a Workflow instance which will contain all
elements of the execution process. The base Workflow operates as a wrapper for the assessment
and will later indicate its success or failure through properties like cfw:hasStatus,
cfw:hasCreationDate and cfw:hasExecutionDate, among others, which are created by the
node's main component. The result of this initial step can be seen in Figure 4.27.
After the creation of the Job and base Workflow elements, the process enters an iterative
cycle based on the creation of control flow elements for each corresponding element of the
assessment. This cycle is based on the domain's definitions set in the domain language
(Section 4.4). To explain it, we will resort to the EvaluationAssessmentA which was
instantiated for UserU1 in the first step. EvaluationAssessmentA is composed of two
elements: EPR A and EnquiryA.
The first element of EvaluationAssessmentA is EPR A. Within the domain language
we stated that an EPR is the input of the EPREvent class, which is executed by the
EPRInterface (see Figure 4.17). For the execution specification, we use this information
and create instances which replicate that information for EPR A. As a result, a new element
Figure 4.27: Creating a bridge between the evaluation assessment and the execution specifications

is introduced - EPREvent_A_A1_E1 - which receives EPR A as its input and is executed
by the EPRInterface.
The second element of the assessment is EnquiryA. The enquiry is seen as a container
element. An Enquiry element (or any of its subclasses) becomes, by definition, the input of an
EnquiryWorkflow instance. Despite not being in the domain language, this
association is fixed and will later permit evaluators to obtain data regarding the enquiry
itself (and not its questions) through the workflow's instance.
Question elements follow a process similar to EPRs. According to the class of the
Question, the corresponding Task element from the domain language is instantiated and
associated with the Question via the cfw:hasInput property. In the example, the enquiry's
first question is BookFeedbackQuestionA. As a result, a BookFeedbackQuestionTask
instance is produced and linked to the question and its associated interface, the
FeedbackQuestionTaskInterface. The same is done for the final element of the enquiry,
BookFeedbackQuestionB.
To adhere to the control flow ontology specification, two additional properties have
to be considered: 1) the first element of every Workflow must be associated with the workflow
through the cfw:hasFirstActivity property; 2) the elements within a Workflow must
be linked by the cfw:isFollowedBy property, in accordance with the association between
the corresponding elements of the assessment (represented by the ast:followedBy property).
The resulting execution representation for EvaluationAssessmentA is illustrated in
Figure 4.28.

4.7 Summary
In this chapter we explained the underlying specifications which are associated with
our proposal. We explained our evaluation model which ranges from a set of predefined
ontologies to a fully compatible and generic execution specification. Throughout the chap-

Figure 4.28: Creating control flow instances for the Evaluation Assessment A within a single evaluation instantiation - EvaluationInstantiationA1

ter, we explained the decisions which led to the creation of such a model as answers to
the constraints of dynamic and flexible evaluation support. The resulting model allows
evaluators to include new evaluation elements or simply reuse existing ones, and it allows
designers to create domains according to their ideas and objectives.
To provide the necessary flexibility and still guarantee its applicability, the model uses
ontologies for its specifications. The usage of ontologies not only safeguards aspects like
flexibility and diversity but also provides a unified data structure that spans the
entire architectural proposal. By using ontologies, the model also gains an unlimited
life span, as new evaluation methods can be incorporated into the model without changing
its core definitions.
In the next chapter, we will explain the architectural definitions that implement this
model and provide designers, evaluators and users with a ready environment for evaluation
development.

Chapter 5

Dynamic Evaluation Architecture

To support the concept of a general but flexible evaluation proposal, it is necessary
to provide evaluators with tools which facilitate the creation and execution of evaluation
scenarios. In this sense, we designed an evaluation support architecture which complements
the evaluation model and provides a stable but scalable infrastructure where new
evaluations can be safely planned, deployed and executed by users.

5.1 Requirements
In Chapter 3, we introduced the concepts of domains and nodes as the base
elements of a conceptual architecture for evaluation support. We characterized a node as a
resource centered on a single user, describing him/her according to interests, preferences
and environment definitions. We characterized a domain as a dedicated evaluation network
based on specific criteria, a domain language and an application scope. By joining them, we
described an evaluation approach where domains are linked with nodes that are compatible
with them, which thus become able to receive context-aware evaluations. To achieve this approach,
it becomes necessary to materialize it into a software architecture.
To create a global evaluation platform, the architecture must allow domains and nodes
to be included rapidly, without changing the existing infrastructure or compromising its integrity.
The creation of evaluations and evaluation networks should be facilitated by allowing
evaluators to focus on the design and content of the evaluation and delegate operations
regarding distribution and execution to the architecture. Results from ongoing evaluations
should be accessible on demand.
The architecture must allow evaluations to be performed on devices with low processing power,
such as smartphones, by having computational needs fulfilled in the cloud or in private
servers if necessary. It must allow a user to be associated with several locations and several
devices, which altogether form a single abstract environment where evaluations can be
performed. Domains should be able to reuse elements such as specifications or software
components from other domains. Evaluators must be given tools to find existing domains
which are ready to be used.

5.2 Architectural Proposal
Our architectural proposal is strongly coupled with the evaluation model specifications
explained in Chapter 4. Following the incremental approach of the model, domains and
nodes become software components in a global network supported by a set of specific
services which guarantee the functioning and safety of the entire model.
From an organizational standpoint, we propose a distributed architecture inspired by
Service Oriented Architecture (SOA) practices. Nodes and domains become software elements
which are seen by the architecture as services accessible through communication
APIs. To support them, the architecture is surrounded by a Support Unit which includes
auxiliary services to which nodes and domains can resort for several operations. In summary,
from a bottom-up perspective the architecture is composed, at the bottom, of nodes
representing users that can become subjects for evaluations. In the middle, the support
unit layer assures the connection of the nodes to the architecture and guarantees their
anonymity by storing their actual locations and other properties. Above this layer are
virtual domains, in which evaluators may create and apply evaluation scenarios following
the domain's criteria and specifications. Finally, and also linked to the support unit, an
Evaluation Hub provides designers and evaluators with access to certain functionalities related
to the infrastructure, such as registering new domains or searching for existing ones using
certain criteria. Figure 5.1 illustrates the design of the architecture.

Figure 5.1: Dynamic Evaluation Architecture Structural Overview

The choice of a SOA model is mainly based on its ability to easily include or exclude
elements without endangering the entire architecture. By guaranteeing the stability of the
support unit, nodes and virtual domains can be defined in any location, and be added to the architecture without much effort. In the case of virtual domains, they can even be
placed in the cloud since they do not require the physical presence of the node. Using
a middleware approach (with the Support Unit) guarantees that the extensibility of the
architecture is assured by the infrastructure itself and not by its components, thus removing complexity from users, evaluators and designers alike.
In our definition, nodes and virtual domains do not depend on the infrastructure. They are independent components that can be dynamically added to or removed from the architecture. In the case of a node, the corresponding user is responsible for its setup, while in the case of a virtual domain, that responsibility falls to the domain's designer/owner. To be added to the infrastructure, nodes and virtual domains must fulfill a set of specifications and be registered within the Support Unit registries. Once registered, they are considered elements of the architecture and become eligible for evaluation practices.
The stability of the infrastructure is assured by the support unit, which is composed of a set of services whose objective is to assure the correct operation of the entire infrastructure. The unit includes services such as the virtual domain registry and the node registry, which mask nodes and domains from each other and assure their anonymity in the infrastructure; services such as the evaluation mediation and association services, which establish communications between nodes and virtual domains; and services like the attribute service, which guarantees a unified approach to criteria. Altogether, the support unit interconnects the entire infrastructure.
Earlier, we stated that nodes can be part of a domain as elements of the domain's evaluation network. Following this premise, each virtual domain is linked to a set of nodes which are compatible with its definitions. The compatibility process is assured by the support unit - based on its registries - and consists of matching nodes and domains with regard to the domain's ontology, the criteria and the user's interests. If a node supports (or is able to support) a given domain, an installation process is started which consists of preparing the node for the domain. At the end of the process, the node becomes part of
the domain’s evaluation network and can be the target of evaluations from that domain.
Overall, the architecture is based on a separation between nodes, domains and the support unit, which results in a decoupled approach with advantages for all stakeholders. Evaluators can focus on evaluation definitions and scenarios. Users can focus on their context, their interests and participating in evaluations. The support unit focuses on facilitating both node and virtual domain inclusions, providing support services for them and dealing with infrastructural aspects like availability and extensibility.
In order to operate the architecture and ensure the proper functioning of the entire proposal, we identify four main stakeholders: the user, the evaluator, the designer and the administrator.
• User - The user participates in the evaluations and provides data that is used by
evaluators to establish conclusions. The user represents an evaluation resource per-
sonified by a node.
• Evaluator - The evaluator is responsible for creating and launching evaluations and later analyzing evaluation results. The virtual domain concept is primarily targeted at the evaluator.

• Designer/Owner - The designer is responsible for creating new virtual domains.
The main difference to an evaluator is that evaluators use virtual domains, while a
designer creates and maintains virtual domains by specifying the domain language,
its criteria and application scope.

• Administrator - The administrator supervises the entire architectural implementation, being responsible for operating the infrastructure and guaranteeing its proper behavior.

In the following sections, we will describe in more depth the elements which compose the architectural proposal and how they apply the evaluation methodology and its model. To simplify the presentation of these concepts, we will start by explaining the Virtual Domain element, since it depends less on the other elements.

5.3 Virtual Domain


A virtual domain (VD) is a software service that represents the evaluation domain
concept from Chapter 3. Each service can be viewed as an autonomous component administered by its designer and linked to the architecture via the support unit. Each virtual
domain operates as an evaluation service that allows multiple evaluators (depending on
authentication) to create and apply evaluation scenarios according to the domain’s defini-
tions for a specific research area and/or in certain evaluation conditions. Enquiries, events,
EPRs and assessments all constitute a set of evaluation instruments which can be used by
evaluators to create evaluation scenarios. Each evaluation is specific to a VD and can be
deployed onto a set of nodes within the VD’s evaluation network.
A VD materializes the domain’s conceptual definition by including a domain language
as its basis. Each VD uses the domain language - which results from the association process
(see Section 4.4) - as its data model, and every evaluation created in the VD is necessarily an instance of that specification. This usage results in a flexible data persistence that provides several advantages. On the one hand, it enables the incremental methodology described in Chapter 4 for representing and applying evaluations. On the other hand, it allows VDs to share concepts from their specification with other VDs, thus enabling the reuse of concepts from domain to domain.
In addition to the domain language, every VD possesses a list of criteria which is used to match domains with nodes. The criteria are divided into structural and non-structural criteria. Structural criteria refer to the software components that the node must possess in order to receive evaluations from the domain. If an evaluation instrument requires a certain interface, the node must possess that interface (or an alternative one) or it will not be compatible with the domain. This type of criteria is associated with the domain language, as it is extracted from the association rules. Non-structural criteria, on the other hand, are linked to the user and/or his/her surroundings. They may correspond to aspects such as the user's characteristics or preferences, or to the contextual conditions of the environment (such as location). The establishment of criteria is very useful for identifying nodes which may become subjects of evaluation for a given domain, thus facilitating the evaluator's job in finding them. Altogether, the criteria establish an evaluation domain's evaluation network.
In the infrastructure, VDs can assume public or private definitions according to their designers' objectives. In public profiles, VDs are accessible to all evaluators and their specifications can be reused freely by other designers. Users who find the VD interesting can request association with the VD and become part of the domain's network if they fulfill all requirements. In private profiles, the VD does not allow user associations to occur by user initiative, the specifications are not publicly available and evaluators can only access the VD with proper authentication. To establish a middle ground between these settings, VD designers may also declare the VD as an invite-only domain, thus being able to receive requests for either user associations or evaluator accesses. In order for the infrastructure to know these aspects and apply these constraints, every VD must register with the infrastructure using a specific registry service. The registration includes not only the indication of the domain's profile status but also its criteria and domain language. Based on the declared information, the support structure of the architecture will react and apply the necessary procedures to ensure the proper functioning of the VD.
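Purely as an illustration of the information involved, a registration request could carry a payload along the following lines; the field names, values and overall JSON-like shape are assumptions of ours and are not prescribed by the architecture, which only states that the profile status, the criteria and the domain language must be declared.

# Hypothetical sketch of a virtual domain registration payload.
# Field names are illustrative; the architecture only prescribes the kind of
# information carried (profile status, criteria and domain language).
registration_payload = {
    "domain_id": "http://domainA",              # unique domain identifier (URI prefix)
    "profile": "invite-only",                   # "public", "private" or "invite-only"
    "domain_language": "domainA-language.owl",  # reference to the domain language ontology
    "structural_criteria": [                    # interfaces/producers nodes must provide
        "http://domainA#FeedbackQuestionTaskInterface",
    ],
    "non_structural_criteria": {                # user/context requirements
        "context.location": "home",
    },
    "manager_endpoint": "https://domainA.example.org/api",  # external Domain Manager API
}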

Virtual Domain’s Module Overview While being based on a domain language, in


order to be accepted into the architecture, the VD must be implemented by the designer
in accordance with the architecture’s APIs. From a structural standpoint, a virtual do-
main is composed by three main components: a domain manager, an evaluation module
and a data persistence unit. The domain manager links the virtual domain to the archi-
tecture’s infrastructure using a well defined API, which later allows evaluators to access
architecture operations such as searching for matching nodes, deploy evaluation scenarios
or obtain gathered data from nodes. The evaluation module has the objective of facilitat-
ing evaluation definition, instantiation and analysis by evaluators, by being based on the
definitions of the evaluation model, namely the domain language. The data persistence
unit represents the storage unit of the virtual domain, where data from both evaluation
definitions and its executions are saved. Figure 5.2 shows the VD’s internal components
in regard to the architecture.
Due to the usage of ontologies for data specification purposes, most data within a VD is represented with Uniform Resource Identifiers (URIs). For every VD, URIs must be unique and follow a structure which combines the element name with the domain's unique ID. For example, an element BookFeedbackQuestion from domainA must be represented by the URI "http://domainA#BookFeedbackQuestion". In this example, the domainA prefix identifies the URI's domain, which is important within nodes. Since each node is able to receive evaluations from multiple domains simultaneously, this URI structure guarantees that concepts are not shared between evaluations during their execution phases and that evaluations are indeed independent.
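As a minimal illustration of this naming convention, the following snippet builds such a URI with rdflib; the choice of library is ours and is not prescribed by the architecture.

# Minimal sketch of the per-domain URI convention, using rdflib as an example
# library; any RDF toolkit with namespace support would serve equally well.
from rdflib import Namespace

DOMAIN_A = Namespace("http://domainA#")

# Every element of domainA is prefixed with the domain's unique ID, so the
# same concept name declared in another domain yields a distinct URI.
book_feedback_question = DOMAIN_A.BookFeedbackQuestion
print(book_feedback_question)  # http://domainA#BookFeedbackQuestion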

Figure 5.2: Virtual domain component overview

5.3.1 Domain Manager


The domain manager is the control unit of a domain. It represents the domain within
the architecture and provides a set of operations (via an API) which can be used to interact
with the architecture. The module acts as a bridge between the more business-oriented module of the domain - the evaluation module - and the support unit, and guarantees that calls to the infrastructure are performed in accordance with their requirements. It possesses two communication sides: one for the evaluation module and another for the support unit services. Technically, they represent two APIs: an internal API which is used only by VD
components and an external API which is used by support unit services.
Among the standard operations of the domain manager APIs are the ability to start evaluations, list existing evaluations, authenticate the domain on the architecture, add/remove nodes by their IDs, search for nodes by affinity or obtain/synchronize results from ongoing evaluations. Overall, the API's operations are split across three areas: evaluation control, domain authentication and domain network. The evaluation control operations in the API focus on allowing the VD to deploy and start evaluations at selected nodes as well as retrieve collected data from ongoing evaluations. The domain authentication area provides operations to give evaluators access to the VD. Despite having an owner (the designer), a VD can be accessed by multiple evaluators, who can use this API to register or log in to the VD - if the VD is public. The domain network area includes operations to add or remove nodes from the VD's evaluation network or to instruct the infrastructure to search for new nodes using the VD's criteria.
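A minimal sketch of these three operation areas as a Python interface follows; the method names and signatures are our assumptions, since the text prescribes only the responsibilities of each area, not a concrete API.

# Illustrative sketch of the Domain Manager API surface, grouped by the three
# operation areas described above. Names and types are assumptions.
from abc import ABC, abstractmethod
from typing import List

class DomainManagerAPI(ABC):
    # --- evaluation control ---
    @abstractmethod
    def start_evaluation(self, execution_spec: str, node_ids: List[str]) -> str: ...
    @abstractmethod
    def list_evaluations(self) -> List[str]: ...
    @abstractmethod
    def get_results(self, evaluation_id: str) -> str: ...

    # --- domain authentication ---
    @abstractmethod
    def register_evaluator(self, username: str, password: str) -> bool: ...
    @abstractmethod
    def login(self, username: str, password: str) -> str: ...

    # --- domain network ---
    @abstractmethod
    def add_node(self, node_id: str) -> None: ...
    @abstractmethod
    def remove_node(self, node_id: str) -> None: ...
    @abstractmethod
    def search_nodes_by_affinity(self) -> List[str]: ...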
To guarantee compatibility with all domains, the module itself is generic. In other words, the module does not change from domain to domain and, as such, does not use domain language elements in its operations, but only elements from the generic domain language. Note that, due to the incremental nature of the evaluation model, this does not pose an issue, as all domain elements extend generic domain language elements. By being generic, the module can be seen by designers as a ready-to-use black box, which facilitates the creation of the VD and, in a way, guarantees the interoperability between it and the infrastructure.

5.3.2 Evaluation Module


Ontology aspects can be complex. They increase the difficulty of creating an evaluation due to the need to understand the existing classes, subclasses, properties and subproperties of ontologies, which are not trivial. While the comprehension of these elements is necessary for the development of a domain, it is not necessary for evaluators who simply want to use one. In this sense, the abstraction of ontology aspects can become an important feature for more casual evaluators. Evaluators are able to create evaluations much more quickly without having to worry about the connections between ontology classes and the domain languages, especially by having specific User Interfaces (UIs) where evaluations can be designed using visual elements.
The evaluation module has the main objective of simplifying the design, instantiation
and analysis of evaluations of a VD. The module is based on the domain language and
features UIs that facilitate the processes represented in Chapter 4 when handling eval-
uation specifications and instantiations. The module abstracts ontology complexity by
focusing only on the necessary elements in regard to content and automatically filling the
required ontological aspects. Figure 5.3 shows the virtual domain internal structure and
its connection to the architecture.
The module features three units which represent the main processes of evaluation design
in a domain: the evaluation creation unit, the evaluation instantiation unit and the evalu-
ation results unit. The module is optional because the creation of evaluation specifications
can be performed manually using programs like Protégé [Stanford Center, 2015].
While the simplification of the evaluator's main tasks regarding the creation and initialization of evaluation scenarios is the primary objective of the evaluation module, it can also include other operations which the designer may deem necessary for the materialization of the VD. For instance, the module can include units which provide authentication support, the creation of evaluation groups, or data mining operations over retrieved information, among others. In a way, while the domain manager represents the core operations of
tion among others. In a way, while the domain manager represents the core operations of
the VD, the evaluation module can be perceived as the business module of the VD.

5.3.2.1 Evaluation Creation Unit

The creation unit has the objective of facilitating the creation and design of new eval-
uations for the domain. The unit is based on two main areas of operation: the creation
of instances for the EPR and Enquiry specifications, and the creation of evaluation as-
sessments. Both of these areas are based on the processes from Sections 4.5.2 and 4.5.3 of the evaluation model, respectively. In itself, the unit abstracts the ontological aspects of the specifications by focusing only on retrieving the necessary content from the evaluator using a UI.

Figure 5.3: Virtual domain internal overview

In order to exemplify the procedure regarding the Enquiry specification, we will resort
to the example of Section 4.5.2 based on the literature habits domain. In the enquiry
specification example, to create a question regarding the user’s feedback of a book, we
needed three elements: BookFeedbackQuestionA, BookA and a title, “Pride and Prejudice”.
In addition to the elements, we linked the elements using properties from the domain like
domain:hasBook, domain:hasTitle and enq:hasFirstElement. In all, we created an
instance for the BookFeedbackQuestion.
In order to simplify this process, we can divide it into what can be filled automatically and what must be filled by the evaluator. For this example, only the title of the book actually requires the evaluator's input. The other elements can easily be generated using the base specification as a set of rules for associating the necessary properties and a random ID generator for the classes' instances. Regarding the enquiry itself, the evaluator could specify the order of the questions through a drag-and-drop approach. The creation unit then automatically creates the associations between the question elements using the enq:hasTransitionTo property and links the first element to the enquiry instance using the enq:hasFirstElement property.
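A minimal sketch of the triples such a creation unit could generate automatically is given below, using rdflib and a random ID generator; the class and property names follow the example above, while the enq namespace URI and the code itself are merely illustrative.

# Illustrative sketch of how a creation unit could generate the ontology instance
# for a BookFeedbackQuestion, asking the evaluator only for the book title.
# rdflib is our choice of library here; the architecture does not mandate it.
import uuid
from rdflib import Graph, Namespace, Literal, RDF

DOMAIN = Namespace("http://domainA#")
ENQ = Namespace("http://enquiry#")  # assumed prefix for the Enquiry specification

def create_book_feedback_question(graph: Graph, title: str):
    """Creates BookFeedbackQuestion and Book instances linked by domain properties."""
    question = DOMAIN["BookFeedbackQuestion_" + uuid.uuid4().hex]  # random instance ID
    book = DOMAIN["Book_" + uuid.uuid4().hex]
    graph.add((question, RDF.type, DOMAIN.BookFeedbackQuestion))
    graph.add((book, RDF.type, DOMAIN.Book))
    graph.add((question, DOMAIN.hasBook, book))         # domain:hasBook
    graph.add((book, DOMAIN.hasTitle, Literal(title)))  # domain:hasTitle
    return question

g = Graph()
first_question = create_book_feedback_question(g, "Pride and Prejudice")
enquiry = DOMAIN["Enquiry_" + uuid.uuid4().hex]
g.add((enquiry, ENQ.hasFirstElement, first_question))   # enq:hasFirstElement
print(g.serialize(format="turtle"))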
Note that this example is merely a suggestion of how the domain designer/owner can approach the evaluation model's mandatory processes and create a UI which simplifies the evaluator's task. This is possible because every VD is based on a domain language that is immutable. A domain can create new evaluations, but the evaluations always follow the incremental nature of the evaluation model and thus automatically comply with the domain language. Similar steps can be taken in regard to the EPR specification and the assessment creation UIs. Later on, in Chapter 6, we will provide an example of a creation unit built for a specific domain using these recommendations (see Section 6.1.2.2).

5.3.2.2 Evaluation Instantiation Unit


The second unit of the module is associated with the instantiation of evaluations. Using
the evaluation specification that results from the creation unit, this unit allows evaluators
to select a group of users and a schedule and instantiate the evaluation automatically. In
regard to Chapter 4, this unit translates to the creation of the execution specifications as
described in Section 4.6.
Similar to the creation unit, the instantiation unit separates aspects which require the evaluator's direct intervention from those that can be automated. In this case, the unit asks the evaluator to select the group of nodes that will participate and the evaluation's start and end dates. For this, the unit features a UI which facilitates the process. After
the evaluator’s submission, the unit then deals with the ontology requirements of linking
the users to the EvaluationInstantiation instance (see Section 4.6.1) and initiates the
deployment phase.
In Section 4.6, we explained that nodes execute control flow instances and leave to
interfaces the interpretation and execution of actual content. For this, we explained that
it is necessary to transform evaluations into execution specifications which represent the same information but possess a structure which can be interpreted by all nodes. The instantiation unit automates this entire process by generating the execution specifications using the domain language rules and the respective evaluation specifications.
The process itself follows the steps from Section 4.6. Initially, the unit instantiates the
evaluation for each selected user. Then, for each evaluation instantiation, the unit executes
a generation algorithm which parses each assessment in the evaluation specification into
an execution specification. Finally, the unit groups the execution specifications into a set
and sends the evaluation to the selected nodes by using the domain manager’s API.
Once again, this step can be performed manually and initiated via the domain manager's API. Doing so, however, is time-consuming and particularly complicated.
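A high-level sketch of the generation loop described above is shown below in Python; parse_assessment and send_to_node are placeholders standing in for the Section 4.6 generation algorithm and for the Domain Manager API call, respectively.

# High-level sketch of the instantiation unit's generation loop.
# parse_assessment and send_to_node are placeholders for the Section 4.6
# algorithm and for the Domain Manager API, respectively.
from typing import Callable, List

def instantiate_evaluation(evaluation_spec: dict,
                           selected_nodes: List[str],
                           schedule: dict,
                           parse_assessment: Callable[[dict, str], dict],
                           send_to_node: Callable[[str, dict], None]) -> None:
    for node_id in selected_nodes:                     # one instantiation per selected user
        execution_specs = []
        for assessment in evaluation_spec["assessments"]:
            # generation algorithm: assessment -> execution specification
            execution_specs.append(parse_assessment(assessment, node_id))
        send_to_node(node_id, {                        # delivery via the Domain Manager API
            "evaluation": evaluation_spec["id"],
            "schedule": schedule,
            "execution_specifications": execution_specs,
        })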

5.3.2.3 Evaluation Results Unit


The third and final unit of the evaluation module is the results unit. Our evaluation
model is based on an incremental methodology which starts from general specifications
and ends at execution specifications. We explained this evolution in Chapter 4, but we
stopped at the execution building phase, since the execution phase itself is not carried out within the domain but at the nodes. The results unit focuses on the data produced in this last phase and has the objective of facilitating the consultation of results from the execution phase of the evaluation.
In Section 5.4.2, we will explain the process associated with the execution of evaluations
at the nodes in detail. In the context of the results unit, we will focus on the results of the
process. After an evaluation is executed at the node, the data is returned to the domain.
The data is mainly composed of instances of the EPR and Enquiry extended specifications and of execution data regarding the process itself, declared in control flow ontology elements. As said before, data is returned to domains inside the original execution specifications. During the execution process, gathered data is linked with the original structures that define the procedure, which facilitates both the persistence of the data and its analysis.
In regard to EPR data, the node uses the EPR specification and associates triggered
events with the corresponding Event class elements. In regard to the Enquiry specifications, gathered data is instantiated in accordance with the Answer specifications. In both situations, the results comply with the specifications of the domain. As such, the designer knows beforehand what type of results to expect and is able to create UIs that allow evaluators to visualize data either graphically or in a more processed form than the raw ontological format. In addition, the designer can provide analysis tools that compare
multiple execution specifications or apply analysis algorithms over them.
Once again, this unit is not mandatory as results are embedded in the execution spec-
ifications and can be analyzed simply through them.

5.3.3 Data Persistence Unit


The data persistence unit of a virtual domain does not have any limitations regarding its type or location. It should, however, be accessible to the different components of the virtual domain.

5.3.4 Domain Interfaces and Producers


In the last chapter, we explained how a domain language is built and what its characteristics are. One of its main features is the idea of separating execution flows from
content using software interfaces. In the domain language, execution elements like Tasks
are associated with specific interfaces responsible for interpreting the associated content
(as input) and producing new data (as output). In the case of Events, producers are
responsible for generating events that can trigger EPRs. The idea was to give the evaluation
model the flexibility to grow according to the designer’s objectives and allow new domain
languages to be created (and implemented) freely.
Interfaces can represent modalities like GUIs, speech or gestures, among others. Event producers are more indirect instruments and can comprise devices like sensors, actuators or image processing software, among others. In regard to the specifications, a task
interface is able to interpret Question elements from the Enquiry extended specification
and produce an output structure that complies with the corresponding Answer specifica-
tion. An event producer is able to create event instances based on the Event extended specification and in accordance with its own processing routines.
When constructing the VD, each declared interface or producer of the domain’s ontology
must exist as a software component. These components are fundamental to the execution
of evaluations as they interpret the actual content of evaluations and retrieve data from
the user. These components also represent the structural criteria of the VD and must be part of every node that belongs to the domain's evaluation network. By assuring
that the node possesses all interfaces/producers of the domain language, it is guaranteed
to evaluators that the node will be able to handle every evaluation from that domain.
Interface and event producer components are deployed into nodes during the association
process between a node and the VD (see Section 5.5.5). When installed, every interface has
to register itself in the node’s interface manager in order for the node to know the location
of the interface and communicate with it during evaluations. For its part, every producer
must register itself in the event logger. When doing so, the node uses the interface’s ID
from the domain language to identify and verify the component(s).
An interface or a producer can be implemented in any language, but can only be created by the domain's designer, since it is based on the domain language. The internal processing of the interface/producer is unknown to the architecture, as it operates like a black box. This may raise security concerns, but it is important to note that the component is only installed on the node by the user's choice.
Domain interfaces and producers are not a part of a VD implementation but are manda-
tory requirements for its successful operation. They are registered in the architecture’s do-
main registry along with the domain’s ontology and criteria. The full registration process
of a virtual domain will be explained later on. To illustrate the implementation of an interface/producer, the following are some examples using the literature habits domain.

5.3.4.1 Task Interfaces


As an example of a task interface, consider the FeedbackQuestionTaskInterface from
the literature habits domain of Chapter 4. The domain language states that every instance
of a FeedbackQuestionTask is interpreted by a FeedbackQuestionTaskInterface. This
implies that every subclass of FeedbackQuestionTask is also resolved by the Feedback-
QuestionTaskInterface (see Figure 4.16). Since the task specifies a FeedbackQuestion
as its input and a FeedbackAnswer as its output, the interface must adhere to this signa-
ture. Figure 5.4 illustrates an example of a task interface which receives a BookFeedback-
Question and produces a BookFeedbackAnswer.
Task interface components include the interpretation of the input, the interface’s exe-
cution and the consequent output. In order to be compatible with the architecture, a task
interface component must provide an endpoint which possesses the following operations:

• start(Question) - starts the execution of a task's input in the interface. The interface returns an ID to identify the process.

• cancel(ID) - stops an ongoing execution.

Figure 5.4: Task interface processing example

This endpoint allows the node to communicate with the interface and start or cancel
ongoing tasks. The communication itself should be performed using serializable formats.
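A minimal sketch of such a task interface component is given below as a plain Python class; a real component would expose start and cancel over some transport (HTTP, sockets, etc.), and the structures exchanged here are assumptions rather than prescribed formats.

# Minimal sketch of a task interface component exposing the two mandatory
# operations. Transport details are omitted on purpose; the question/answer
# structures implied here are assumptions, not prescribed formats.
import uuid

class FeedbackQuestionTaskInterface:
    def __init__(self):
        self.ongoing = {}  # process ID -> question currently being handled

    def start(self, question: dict) -> str:
        """Starts interpreting a FeedbackQuestion and returns a process ID."""
        process_id = uuid.uuid4().hex
        self.ongoing[process_id] = question
        # Here the component would render the question to the user and, once
        # answered, return a structure complying with the Answer specification.
        return process_id

    def cancel(self, process_id: str) -> None:
        """Stops an ongoing execution identified by its process ID."""
        self.ongoing.pop(process_id, None)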

5.3.4.2 Event Producers


Unlike task interfaces, event producers do not require an endpoint. They are constantly active and signal their location by registering themselves within the node. Because of their constant activity, event producers do not communicate directly with the other components but use an intermediate module called the Event Logger+Dispatcher.
The Event Logger+Dispatcher is a node component based on a message queue. It acts
as a postman within the node in regard to events, allowing every component in the node
to subscribe to them and be alerted upon their occurrence. By using this method, not
only can events be used by EPRs, but they can be used by other interfaces themselves, for
purposes like adaptability to the user or context analysis.
In regard to the design of an event producer component, the component must be able
to produce an instance of the Event class included in the Event Extended Specification.
After creating the instance, the producer must communicate the resulting structure to the
logger which will then propagate the instance to subscribers. As an example, Figure 5.5
illustrates an instance of the StartReadingEbookEvent built by the producer and sent to the Event Logger module, which then routes the instance to Subscribers A and B.

Figure 5.5: Example of an event producer communicating with the event logger

EPRs subscribe to the logger to receive associated events when active. When they become inactive, they simply unsubscribe from their respective events. The registration itself is made using the Event's full URI (e.g., http://domain1#StartReadingEbookEvent). It is also important to note that an event component can be associated with more than one event, and as such can produce more than one type of event if necessary, depending on the domain language. In our definition, however, an Event cannot be produced by multiple event producers, as this might bring inconsistency into the evaluation. If the same action were detected by two producers, it could result in two different events representing the same occurrence. A simple and compatible solution would be to synchronize all event producers and manifest the event only through one of them. In regard to communications, event producers connect with the logger using standardized messages.
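The sketch below illustrates, in Python, an event producer building a StartReadingEbookEvent instance and publishing it to the logger; the logger object, its method names and the payload fields follow the operations described in Section 5.4.4.1 but are otherwise our assumptions.

# Sketch of an event producer publishing a StartReadingEbookEvent instance to
# the Event Logger+Dispatcher. The logger object and its method names mirror
# the operations described in Section 5.4.4.1 but are assumptions here.
import datetime

EVENT_URI = "http://domain1#StartReadingEbookEvent"

class EbookReaderEventProducer:
    def __init__(self, logger):
        self.logger = logger
        self.logger.register_event_producer(EVENT_URI)  # announce the producer

    def on_book_opened(self, book_title: str) -> None:
        event_instance = {
            "type": EVENT_URI,                                # rdf:type of the instance
            "timestamp": datetime.datetime.now().isoformat(),
            "bookTitle": book_title,                          # assumed payload field
        }
        self.logger.publish(EVENT_URI, event_instance)        # dispatch to subscribers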

5.3.4.3 Event Interfaces

In Chapter 4, we explained the differences between producers and interfaces. We indicated that a producer represents a unidirectional component which produces data periodically but is not able to receive requests from the node. On the other hand, we stated that
interfaces are able to receive requests, interpret them and return new content. In practice,
interfaces differ from event producers as they include an API which other components can
use to send requests to them.
While events are created by event producers, EPREvents are not. By definition, we
associated EPREvents with EPRs (see Section 4.4.2.2), and have stated that evaluation
assessments can incorporate either EPRs or enquiries into their instances. An EPR instance
is designed by the evaluator at the evaluation specification level and not at a domain level
which means it is virtually impossible to predict every possible EPR, thus creating a need
for EPR interpretation. Because of this, nodes need a specific interface which is capable
of interpreting EPRs. In Chapter 4, we identified that interface as the EPRInterface and
in our architecture, this interface is represented by a specific node component called EPR
Engine. It is important to note that although we limited the domain language to the EPR
Engine, other interfaces can be added for EPR interpretation if necessary.
To comply with our proposal, every event interface (like the EPRInterface) must
provide an endpoint with the same operations as the task interface: start(EPR) and
cancel(ID). They will be used by the node to redirect EPRs to the EPR Engine, and/or
cancel ongoing EPR interpretations. Much like the task interface, communications must
also be performed in a serializable format.

5.3.5 Extending a Virtual Domain


By definition, domain languages are built from the generic domain language. Every
virtual domain is built on top of a domain language and the domain language can be used
to create multiple virtual domains (e.g., with different criteria or owners). A domain language, however, can also be built on top of another domain language. In these cases, if the original domain language is extended into a new domain language, the same can be done with the corresponding virtual domains.
This mechanism can be helpful for developers as every interface/producer that was im-
plemented for the original virtual domain can be reused in the new one. In such a scenario, the
developer of the new virtual domain only has to implement the new interfaces/producers
and the UIs that extend the original virtual domain. The mechanism can be important for
reducing the implementation times of VDs and facilitating their proliferation.

5.4 Node
A node is a software component that represents a user to whom evaluations can be
applied. The component is seen by the architecture as an autonomous service linked to the support unit and administered by a single user, its corresponding user. Conceptually, a node personifies a single user and his/her environment as an abstract location where evaluations can be executed while accounting for contextual changes. To virtual domains, nodes represent evaluation targets by being part of evaluation networks which establish them as suitable for receiving evaluations.
Each node can be part of multiple evaluation networks at the same time. To assure that
a node is compatible with a domain, the node features an extensible structure which is able
to include new components (interfaces and producers) as extensions of its core features.
Regarding evaluations, nodes apply execution specifications which result from instantiating
an evaluation specification at a VD, and interact with the user and the environment to
gather the necessary data. Resulting data is added to the specifications and returned to
the VDs for later analysis.
A node includes several modules that cooperate to apply evaluations to the user. At
its core, the node features an evaluation engine responsible for processing execution speci-
fications. To aid it, the node includes three assisting modules: the EPR Engine, the Event
Logger+Dispatcher and the Interface Manager. The EPR Engine and the Event Logger
are responsible for event-handling in the node while the Interface Manager is responsible
for managing third-party components - the interfaces. To store evaluation data, the node
features a data persistence unit and to store data regarding user preferences, characteristics
or other elements associated with criteria, the node includes two components: the user and
context models. Finally, to connect the node with the architecture, the node includes a
Node Manager which handles all communication between the node and the support unit.
Figure 5.6 illustrates the node design structure.
Evaluations are created in domains and sent through the support unit to the associated
nodes. They are received by the node manager, which verifies if the evaluation corresponds to an associated domain. If so, the node manager delivers the execution specification to the evaluation engine (according to its associated schedule). From this point on, the evaluation is considered to be in execution. The evaluation engine applies the received specification and delegates to the Interface Manager the routing of either Tasks (to Task Interfaces) or Compflow Events (to the EPR Engine).

Figure 5.6: Node Architectural Design

When all specifications are completed (or the scheduled period ends), the evaluation is considered terminated.


Throughout this process, data is saved in the data persistence unit in an ontological format. Due to the nature of the evaluation model (based on the Chapter 4 specifications), data is stored in an incremental format. This format allows data to be constantly uploaded and accessed by evaluators if required, using a pull-based approach accessible through the node manager's API. When requested, the node delivers the received execution specification with the annexed results (in accordance with its specifications) to be accessed in the corresponding VD.
Overall, the node is an unaltered component that does not change in regard to its core
modules. Nonetheless, to allow the execution of multiple domains, the node is able to
“receive” new software modules (as interfaces/event producers) that are introduced in the
node. Recall that interfaces and event producers represent the structural criteria of a VD
and must be part of a node in order for the node to adhere to the VD’s evaluation network.
As such, the modules must be downloaded and installed (manually or automatically) in
the node. The modules are represented in the figure as part of the infrastructure block
and operate as black boxes to the node.
Regarding its installation, a node does not require all components to be accessible locally. Modules can be split across several locations and still operate correctly as long as communications are assured. For example, the main node manager can be a
software module on the cloud, the evaluation module a component at the user’s home, the
persistence unit on a distributed server, and so on. The goal is for the architecture to be
sufficiently modular in order to be adaptable to the user’s circumstances and still be able
to guarantee the correct execution of evaluation scenarios. To achieve this, all modules
possess specific communication APIs.

5.4.1 Node Manager


The node manager is a software module responsible for linking the node to the evalua-
tion architecture. Its goal is to encapsulate the node by providing a set of public operations
which can be used by other services in order to contact the node. For this purpose, the
manager includes an external API which handles all communications between the nodes
and other elements of the architecture like support services or virtual domains. From an
operational standpoint, this API features operations for node authentication and mon-
itoring, operations for VD association requests, and operations for receiving evaluation
requests or uploading gathered data.
In our approach, nodes can be part of VD networks, thus marking them as possible
targets of evaluation scenarios. Since it is the user who decides with whom the node should be associated, and what the node's policy is in this regard, the manager includes an internal API which provides operations for this purpose. Via the API, the user is able to analyze VD association requests or set how the node should behave in regard to these requests: either public - accept all -, private - do not accept any - or invite-only - ask the user. The user may also indicate a VD to which he/she wishes the node to become connected. The full association
process is mediated by the association service on the Support Unit and will be described
in Section 5.5.5.
Since the node manager is the entry point of the node, evaluation requests are also
received by it. An evaluation request is composed of a set of execution specifications (one for each assessment), the evaluation specification and the schedule. Every received evaluation is verified by the manager in regard to its origin. If an evaluation was not created by an associated VD, it is discarded. If it was, the manager verifies the evaluation's schedule and starts the execution specification when the start date is reached.
The module does this by sending the specification to the evaluation engine so it can be
initialized.
During the execution, the manager is able to cancel the evaluation if a request is made
by the evaluation’s associated VD. When it occurs, the node groups all gathered data and
sends it to the VD. At the end of the evaluation, the node stores all gathered data, only
sending it to the VD when requested. For this purpose, the module maintains a connection
to the data persistence unit and the evaluation module. In addition to these tasks, the
manager has also the task of answering ping requests to the support unit to keep the
registry up-to-date.
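For illustration only, an incoming evaluation request and the origin check performed by the node manager could be modeled roughly as follows; the field and method names are assumptions, since the text only states that the request bundles the execution specifications, the evaluation specification and the schedule.

# Rough sketch of the evaluation request received by the node manager and of
# the origin verification it performs. Field and method names are illustrative.
from dataclasses import dataclass, field
from typing import List

@dataclass
class EvaluationRequest:
    domain_id: str                  # VD that created the evaluation
    evaluation_spec: str            # the evaluation specification (e.g., serialized RDF)
    execution_specs: List[str]      # one execution specification per assessment
    start_date: str
    end_date: str

@dataclass
class NodeManager:
    associated_domains: List[str] = field(default_factory=list)

    def receive_evaluation(self, request: EvaluationRequest) -> bool:
        # Discard evaluations that do not originate from an associated VD.
        if request.domain_id not in self.associated_domains:
            return False
        # Otherwise, hold the request until the start date is reached and then
        # hand the execution specifications to the evaluation engine (omitted).
        return True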

5.4.2 Evaluation Engine

The evaluation engine is the main execution module of the node, responsible for manag-
ing and applying evaluations to the user. The module is based on the control flow ontology
and uses the domain’s ontology to parse incoming evaluations. As seen in Chapter 4, each
evaluation is translated into an execution specification, and it is the evaluation engine's responsibility to apply the specification step by step in order to obtain data from the user. To do so, the module is linked to three other components: the persistence unit, the node manager and the interface manager.
The module is associated with the node manager in order to receive evaluations to
execute. Since the manager is the entry point of the node, it is the manager who makes
execution requests to the engine. In a similar manner, the engine can also receive requests
to cancel ongoing evaluations.
In the persistence unit, the engine stores all data that is gathered during the evaluation’s
execution. The data is saved in ontological format and follows an incremental approach, being attached to the corresponding execution specification which led to its gathering. In other words,
the engine stores obtained data as the output of the control flow element that produced
it, like a Task or an EPREvent.
The association between the interface manager and the engine is fundamental to the
execution of evaluations. When parsing a Task or a Compflow/EPREvent, the engine del-
egates the interpretation of its input to the corresponding interfaces. In this regard, the
engine contacts the interface manager, which is able to contact any interface in the node, and
requests the delivery of the input to the interface. In the same way, when the interface
completes its task, the interface manager alerts the engine so that it may proceed in the
workflow.
When a task or an event ends, the result is sent to the engine as output, which the engine associates with the execution specification using the cfw:hasOutput property. It is important to note that every action taken by the engine is stored using timestamps and other descriptive attributes. Timestamps are added to control flow instances - Task, Compflow/EPR Event, Workflow - and indicate when the element was started or executed. As the engine advances through each element, the process marks its State as 'In Progress', 'Finished', 'Canceled', 'With Error' or 'Timed Out', depending on its execution. All of these can be accessed by evaluators as they become part of the results.

5.4.2.1 Processing an execution specification


Figure 5.7 illustrates an evaluation assessment within an execution specification. This
assessment is based on an example from Chapter 4 and is sent by the node manager
to the evaluation engine. In this particular example, the engine receives a Job element -
Job A A1 U1 - representing an instance of EvaluationAssessmentA. Upon receiving it, the
engine starts the execution by analyzing the cfw:hasWorkflow property of the Job element.
In this case, the engine finds the Workflow A A1 U1 element. Then, the engine inspects the workflow and finds its first element using the cfw:hasFirstActivity property, an
EPREvent, the EPREvent A A1 E1.

Figure 5.7: Execution specification received by the Evaluation Module

The entire process is iterative. When it finds an EPREvent, the engine knows that
according to the domain language, the element possesses an EPR as input. The engine
checks which interface resolves it and contacts the Interface Manager component in order
to redirect the EPR to the corresponding interface. Since the EPR Event is associated with
the EPRInterface, the node annexes EPR A - its input - as an argument to the request and
sends it to the Interface Manager.
When the engine delegates either a Task or an EPREvent to the interface manager, the element is considered 'In Progress' and the process is put on hold. The engine waits for the interface manager to answer the request and, when it does, the answer is associated with the corresponding element as an output. In some situations, an output may not exist: it may not arrive in time ('Timed Out'), it might be canceled ('Canceled') or a failure may happen ('With Error'). If any of these happen, the execution ends. If, on the contrary, all goes well, the resulting output is received by the engine, which associates it with the element using the cfw:hasOutput property, marks the element as successfully ended ('Finished') and continues the workflow.

In the example, after executing the EPREvent A A1 E1, the engine checks the cfw:is-
FollowedBy property and finds another Workflow. Once again, the engine checks its first
element, and finds a BFQTask1 associated with a FeedbackQuestionTaskInterface. The
engine repeats the previous process and sends a request to the interface manager with the
indication of the interface (or interfaces) and its input and waits for an answer. Upon
receiving it, the engine associates the answer with the task and continues the workflow. Figure 5.8
shows an example of the result from both BFQTask elements.

Figure 5.8: Output from Task elements in the Evaluation Engine

In the figure, it is possible to observe that two new answer elements, BookFeedbackAnswerA I1 and BookFeedbackAnswerB I1, are now associated with the Tasks via the cfw:hasOutput property. Each of them contains data which resulted from applying the task to the user. In this example, the data corresponds to two Rating elements linked to the base answer
elements. Note that the structure follows the Answer specification from Section 4.4.1 on
which this example was based.
The evaluation engine supports concurrent evaluations as it handles each execution
specification individually. The coordination between multiple active evaluations, however, is not handled by the engine itself and is left to the user. Overall, the evaluation engine is an iterative component which parses control flow executions. Its job is to process execution specifications and persist data results. The component is a fixed module which functions in an abstract manner, reducing the received elements to their base classes. In addition to the node manager, the engine is the only module with access to the data persistence unit.

1 Represents the BookFeedbackQuestionTask element in the figure, shortened for simplification purposes.

5.4.3 Interface Manager
In Chapter 4 we introduced interfaces as an answer to domain-node compatibility. With
interfaces, it is possible to create domains with new evaluation instruments and apply them
to already existing nodes by outsourcing their execution. Interfaces can represent devices,
software systems or user interfaces, and be local or remote. With them, we address an important requirement: the ability to apply evaluations anywhere. For example, it becomes possible to run an evaluation on a low-power device such as a smartphone by delegating all interaction needs to the interface and placing the entire node in the cloud or on a dedicated server. To the node, the interface is a black box which is able to interpret a specific type of content.
When linking a node to a domain, interfaces that are part of the domain’s ontology
are “placed” on the node. For its part, the node maintains its original structure and interfaces are seen as external content providers. To allow this, it is necessary to equip the node with a module that controls which interfaces are linked to it. The interface manager fulfills this role by maintaining a registry of linked interfaces and by acting as a communication bridge between the interfaces and the node. The manager itself is a software module divided into two components: a registry and a controller. The registry allows interfaces to register themselves and insert their access points (API endpoints).
The controller manages requests from the evaluation engine and redirects them to the
appropriate interfaces. Figure 5.9 illustrates the module’s structure.

Figure 5.9: Interface manager component organization and connection to the infrastructure

5.4.3.1 Registering and using an interface


In order to be contactable and receive interpretation requests, interfaces need to regis-
ter themselves in the interface registry. To do this, the interface must contact the registry and state the endpoints of its mandatory operations. In the case of a Task Interface, the interface must state two operations: start and cancel. For this, the manager
has a public method - registerTaskInterface(start endpoint, cancel endpoint,
interface uri). Events created by event producers are not handled by the manager but
by the Event Logger+Dispatcher. Event Interfaces however are, and need to declare their
API similarly to Task Interfaces using the registerEventInterface(start endpoint,
cancel endpoint) operation. It is important to remember that the standard event inter-
face of the node is the EPR Engine, but that the node is able to insert other interfaces if
necessary using this mechanism.
The start and cancel operations are used by the interface controller to make requests to interface components. When the evaluation engine asks the manager to route either a Task or an EPREvent to an interface for interpretation, the controller looks up the start endpoint in the registry and forwards the request to the interface. When the interface answers back, the controller forwards the answer to the engine. Note that if a request is made to multiple interfaces, the manager makes the request to all interfaces and waits for an answer from any of them. When one answers, the manager forwards the answer to the evaluation engine and contacts the other interfaces using the cancel endpoint so that they stop their operation. This policy is predefined at the node, but other policies involving cooperation or redundancy algorithms with multiple interfaces can also be included within the interface manager. By using endpoints and centralizing all communication between the node and the interfaces in the interface manager, it becomes possible to have remote interfaces and thus conduct evaluations which go beyond the user's home.
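A compact sketch of the registry/controller pair is given below in Python; the registration methods are named after the operations above, endpoints are reduced to plain callables, and the multi-interface fan-out/cancel policy is deliberately simplified.

# Compact sketch of the Interface Manager: a registry mapping interface URIs to
# their start/cancel endpoints, and a controller that forwards requests.
# Endpoints are reduced to callables for brevity.
from typing import Callable, Dict, List, Tuple

class InterfaceManager:
    def __init__(self):
        # interface URI -> (start endpoint, cancel endpoint)
        self._registry: Dict[str, Tuple[Callable, Callable]] = {}

    def register_task_interface(self, start: Callable, cancel: Callable,
                                interface_uri: str) -> None:
        self._registry[interface_uri] = (start, cancel)

    def register_event_interface(self, start: Callable, cancel: Callable,
                                 interface_uri: str) -> None:
        # Same mechanism; by default the registered event interface is the EPR Engine.
        self._registry[interface_uri] = (start, cancel)

    def route(self, interface_uris: List[str], payload) -> object:
        """Forwards a Task/EPREvent input to the first registered interface found.
        The real controller would fan the request out to all interfaces, keep the
        first answer and cancel the others via their cancel endpoints."""
        start, _cancel = self._registry[interface_uris[0]]
        return start(payload)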

5.4.4 EPR Engine and Event Logger+Dispatcher


Complex situations which require the combination of multiple events are tackled in
our approach by EPRs. These rules allow the evaluator to combine several events and
operations and view them as a single event that can trigger further evaluation actions.
In Chapter 4, we have provided a description of an EPR, which from an architectural
standpoint requires a specific unit capable of processing it. In the context of a node, that unit is called the EPR Engine. An EPR is sent to the node as part of an execution specification (namely as an EPREvent) received by the evaluation engine, and is consequently forwarded to the EPR Engine. This delivery, however, is not made directly but via the Interface Manager, because to the Evaluation Engine the EPR Engine is just another interface. Going back to the specification in Section 4.4.2.1, we identified the engine as the EPRInterface, which is a subclass of the EventInterface class. This association declares the EPR Engine as a mere interface, like any other, and is important to assure that, in the future, other engines can be added to the node, thus allowing alternative implementations of event processing.
When the node is set up, the EPR Engine must announce itself in the Interface Manager.
To do that, the manager provides an API operation similar to the task interface registration
process - registerEventInterface. In its registration, the module inserts its endpoints
(start and cancel) and its interface URI. The engine itself can be a remote module and be installed on a simple device, a server or in the cloud. After its registration, the engine is ready to execute incoming EPRs.
In order to receive new events, the engine is connected to the Event Logger+Dispatcher
module. This module acts as a postman unit where every event producer in the node is able to publish its events. Using a publish-subscribe mechanism, the module is fundamental
for assuring a decoupled communication method where all producers can place new events
regardless of their location or implementation. On the other end, the module allows every
component on the node to subscribe to certain producers and receive notifications when new
events occur. This mechanism not only allows the EPR Engine to subscribe to events which
are part of active rules but also allows interface components to receive event notifications
and use those notifications to adapt themselves to current conditions.

5.4.4.1 Processing an EPR


An EPR is an evaluation item in an evaluation assessment. When translating the evaluation specification into the execution specifications, the EPR becomes linked to the EPREvent as its input, and is sent to the Evaluation Engine when an evaluation scenario is initialized. Every EPR is subjected by the node to a pipeline which parses and processes it and annexes its results. Figure 5.10 illustrates the interaction between node components for the realization of this pipeline.

Figure 5.10: Sequence Diagram for processing an Event Processing Rule in the Node

According to the evaluation model, when an EPR is found in a workflow, it includes two properties indicating its input (cfw:hasInput) and its associated interfaces (cfw:isExecutedBy). When detected by the evaluation engine, the EPR originates a request, to which the input and interfaces are annexed, and which is sent to the Interface Manager. When the EPR is received by the manager, the interface URI(s) are translated into actual endpoints. Then, using the start endpoint, the manager forwards the request to the actual interface component(s) for handling. By default, that interface is the EPR Engine, which will receive the request and begin its processing.
Before describing the execution of the EPR by the EPR Engine, it is worth noting that all event producers on the node have (necessarily) registered themselves in the Event Logger+Dispatcher. The publish/subscribe methodology uses the same endpoint mechanism as the interfaces. Interested parties must make a request to the logger stating their interest - that is, the event - and how to notify them of new occurrences. To that end, the logger provides two operations: a subscribe and an unsubscribe method. The full logger API is completed by the registration operation and a publish operation, which is used by producers to indicate new event occurrences.
When an EPR is received by the EPR engine, the engine starts by parsing the received
EPR and checking which Events are associated with it. For each found Event, the engine
subscribes to it in the Event Logger+Dispatcher in order to receive new instances of it.
Whenever a new event is published in the logger, the logger forwards it to every subscriber.
The EPR Engine receives the notification and propagates the event to every active EPR
that is interested in the event.
If an EPR has the event in its specification, the engine starts an execution process.
The process checks its operations and conducts an internal evaluation. Given that opera-
tions are predicates, at the root level a true or false value will be formed. If the value
corresponds to true, the engine returns the gathered data to the interface manager, which
will forward it to the evaluation engine. There, the data is associated with the EPR Event
instance that started the EPR and stored in the data persistence unit.
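A minimal in-process sketch of the logger's publish/subscribe API described above (registration, subscribe, unsubscribe and publish) follows; callbacks stand in for the real notification endpoints, and concurrency and remote transport are deliberately left out.

# Minimal sketch of the Event Logger+Dispatcher publish/subscribe mechanism.
# Callbacks stand in for real notification endpoints; thread-safety, remote
# transport and error handling are intentionally omitted.
from collections import defaultdict
from typing import Callable, Dict, List

class EventLoggerDispatcher:
    def __init__(self):
        self._subscribers: Dict[str, List[Callable]] = defaultdict(list)
        self._producers: List[str] = []

    def register_event_producer(self, producer_uri: str) -> None:
        self._producers.append(producer_uri)

    def subscribe(self, event_uri: str, notify: Callable[[dict], None]) -> None:
        self._subscribers[event_uri].append(notify)

    def unsubscribe(self, event_uri: str, notify: Callable[[dict], None]) -> None:
        self._subscribers[event_uri].remove(notify)

    def publish(self, event_uri: str, event_instance: dict) -> None:
        # Forward the new occurrence to every subscriber of this event URI.
        for notify in self._subscribers[event_uri]:
            notify(event_instance)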
As an example of the EPR Engine execution process, observe the following EPR:

AND(NOT(SpeakerOffEvent), OR(IncreaseVolumeEvent, DecreaseVolumeEvent))

The above EPR represents a situation where the evaluator intends to check whenever
the user increases or decreases the volume of a speaker that must not be offline. Figure 5.11
demonstrates the evolution of the EPR when it receives an occurrence of the Increase
Volume event.
[Figure: three snapshots of the EPR tree AND(NOT(SpeakerOff), OR(IncreaseVolume, DecreaseVolume)) - the idle rule with all leaves false, the arrival of the 'Increase Volume' event, and the satisfied rule with the Or and And operations evaluating to true.]

Figure 5.11: EPR Engine execution example for the IncreaseVolume Event

In the figure, the engine receives an IncreaseVolume event and propagates it to the EPR's
leaves. The corresponding leaf returns a true value, which provokes a change in the Or
operation, which in turn provokes a change in the And operation. Since the And operation
represents the root element of the EPR, the EPR is considered fulfilled. The EPR
Engine then alerts the Evaluation Engine to the termination of the EPR, which consequently links
the result with the corresponding EPREvent.
Certain operations, like Delay or Not, can alter their state without incoming
events. The engine is aware of this and allows operations to self-trigger if necessary and
request an EPR evaluation which checks whether the rule has become fulfilled.
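To make the leaf-to-root propagation concrete, the following sketch illustrates in Java how such a rule tree could be evaluated; the class and method names are hypothetical and merely illustrate the mechanism, they are not the actual EPR Engine code.

import java.util.List;

// Minimal sketch of EPR evaluation: an event is propagated from the leaves
// towards the root and each operation recomputes its truth value.
interface RuleElement {
    boolean onEvent(String eventUri); // propagate an event and return the new value
    boolean value();                  // current truth value of this element
}

class EventLeaf implements RuleElement {
    private final String eventUri;
    private boolean occurred = false;
    EventLeaf(String eventUri) { this.eventUri = eventUri; }
    public boolean onEvent(String uri) {
        if (uri.equals(eventUri)) occurred = true; // the leaf becomes true when its event arrives
        return occurred;
    }
    public boolean value() { return occurred; }
}

class NotOperation implements RuleElement {
    private final RuleElement child;
    NotOperation(RuleElement child) { this.child = child; }
    public boolean onEvent(String uri) { return !child.onEvent(uri); }
    public boolean value() { return !child.value(); }
}

class AndOperation implements RuleElement {
    private final List<RuleElement> children;
    AndOperation(List<RuleElement> children) { this.children = children; }
    public boolean onEvent(String uri) {
        children.forEach(c -> c.onEvent(uri)); // propagate the event to every child
        return value();
    }
    public boolean value() { return children.stream().allMatch(RuleElement::value); }
}

class OrOperation implements RuleElement {
    private final List<RuleElement> children;
    OrOperation(List<RuleElement> children) { this.children = children; }
    public boolean onEvent(String uri) {
        children.forEach(c -> c.onEvent(uri));
        return value();
    }
    public boolean value() { return children.stream().anyMatch(RuleElement::value); }
}

public class EprExample {
    public static void main(String[] args) {
        // AND(NOT(SpeakerOffEvent), OR(IncreaseVolumeEvent, DecreaseVolumeEvent))
        RuleElement epr = new AndOperation(List.of(
                new NotOperation(new EventLeaf("SpeakerOffEvent")),
                new OrOperation(List.of(
                        new EventLeaf("IncreaseVolumeEvent"),
                        new EventLeaf("DecreaseVolumeEvent")))));
        System.out.println(epr.onEvent("IncreaseVolumeEvent")); // true: the rule is satisfied
    }
}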
Contrary to most complex event processing (CEP) engines, the EPR Engine evaluates
each EPR instantiation separately. Since evaluations can be started at any point in time,
multiple instantiations of the same EPR can be received at different moments in time. In
this scenario, an event that occurs when two instances of the same EPR are active can
provoke very different results (depending on the EPR's operations). Because of this, events
are considered individually for each EPR instance.

5.4.4.2 Storing EPR Data


When the EPR Engine receives an EPR, it parses and processes it. During
this processing, the engine gathers new data corresponding to the events and other
information created by event producers. For every EPR request, the engine creates an
EPR instance and stores all produced data in it.
At the end of the execution (or upon cancellation), the instance is returned to the
evaluation engine as the result of the EPR execution.
To construct the result, the engine uses the EPR specification as a reference. For
each event received from the event producers, the engine links it with the corresponding
Event class using the rdf:type property. Figure 5.12 illustrates the EPR processing result
for the previous example. In its execution, a single instance of either the IncreaseVolume
event or the DecreaseVolume event is sufficient to fulfill the EPR's condition, as long as a
SpeakerOff event does not occur. In the example, instances are created for every operation
and associated with the original EPR classes.
When the result is returned to the Evaluation engine, it is associated to the EPREvent
instance that triggered the request using the cfw:hasOutput property.
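As an illustration of how such a result graph can be assembled, the sketch below uses Jena to type an event occurrence with its Event class (rdf:type) and to attach the EPR instance to the triggering EPREvent through cfw:hasOutput; the namespace URIs and the simplified structure are placeholders and do not correspond to the actual domain language vocabulary.

import org.apache.jena.rdf.model.Model;
import org.apache.jena.rdf.model.ModelFactory;
import org.apache.jena.rdf.model.Resource;
import org.apache.jena.vocabulary.RDF;

public class EprResultSketch {
    public static void main(String[] args) {
        // Placeholder namespaces; the real domain language defines its own URIs.
        String epr = "http://example.org/epr#";
        String evt = "http://example.org/evt#";
        String cfw = "http://example.org/cfw#";

        Model model = ModelFactory.createDefaultModel();

        // The event occurrence is typed with its Event class.
        Resource occurrence = model.createResource(evt + "IncreaseVolumeEvent_I1")
                .addProperty(RDF.type, model.createResource(evt + "IncreaseVolumeEvent"))
                .addProperty(model.createProperty(evt, "hasTimestamp"), "2015-09-12T20:05:53");

        // The EPR instance is typed with its original EPR class...
        // (simplified: the real result nests the occurrence under the operation instances)
        Resource eprInstance = model.createResource(epr + "EventProcessingRuleA_I1")
                .addProperty(RDF.type, model.createResource(epr + "EventProcessingRuleA"))
                .addProperty(model.createProperty(epr, "hasRuleElement"), occurrence);

        // ...and the whole result is linked to the EPREvent that triggered the request.
        model.createResource(cfw + "EPREvent_I1")
                .addProperty(model.createProperty(cfw, "hasOutput"), eprInstance);

        model.write(System.out, "TURTLE");
    }
}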

5.4.5 User and Context Models


A node represents an evaluation resource in which the user is the only participant and
has characteristics that are related to him and to his environment. These characteristics
classify the node in the architecture as a subject that can become part of a domain network.
If a VD wants to include the node in its network, then the node has to fulfill the criteria
established by that VD. Criteria is specified at the domain level and verified at the node
level. As an example, consider a property called ‘hearing ability’. A virtual domain may
want to include in its network only nodes in which the user’s hearing ability is bigger than

[Figure: RDF instance graph of the processed EPR - an Event Processing Rule A_I1 instance typed (rdf:type) with Event Processing Rule A, linked through epr:hasRootElement to an AND operation instance, which is linked through epr:hasRuleElement to NOT and OR operation instances; the OR instance references an Increase Volume Event instance I1, typed with Increase Volume Event and carrying evt:hasTimestamp 12/09/2015 20h:05m:53s.]

Figure 5.12: EPR Engine - Result from processing an EPR

eighty percent. This requirement is what we call a non-structural criteria. It implies that
if a node is part of a domain in which this criteria was specified, then it necessarily fulfills
this condition.
As said before, criteria is split into two types: structural criteria and non-structural
criteria. Structural criteria is associated with the interface and event producer components
that nodes must possess in their infrastructure. Non-structural criteria is related to the
user and his environment properties. Taking the previous example, a non-structural criteria
element is always composed of an attribute - 'hearing disability', an operation - 'bigger
than', and a value - 'eighty'.
In regard to the node's architecture, it is necessary to store these attributes and provide a
way for users to add or update them. In this sense, the node includes two dedicated com-
ponents to store and categorize non-structural criteria: the user and context models. Both
models cooperate with the node in verifying node-domain compatibility. The existence
of both models allows a separation to be made between what is related to the user and
what is related to the environment. As such, each attribute is associated with either the
context or the user model. When a criteria is received for which the node has no information
to answer, the corresponding attribute is requested from the user by the node manager using
a dedicated user/context UI. The result is stored in the user or context model and used as
a reference for future requests.
Both models are accessible (by other node components) via a well-defined API com-

posed of the following operations:

• evaluateCriteria(attribute, operation, value)

• requestAttribute(attribute)

• addAttribute(attribute, value)

• changeAttribute(attribute, value)

• removeAttribute(attribute)

The first operation is used by the node to compare criteria in the event of an association
between the node and a VD. When it receives the association request, the node manager
uses this operation to query the user/context model and verify the criteria. Upon receiving
the query, the user/context module checks if it has the attribute in store and attempts to
match it with the received operation and value. As a result, there are three possible
scenarios: (1) the module states the criteria as valid; (2) the module states the criteria as
non-valid; (3) the module states that the attribute was not yet defined. Independently of
the outcome, the result is returned to the node manager.
The second operation is used in an internal manner. It has the objective of allowing
other modules in the node to use user/context data within their operation. For instance,
one of our requirements was linked with the need to adapt evaluations to the user. If the
interface which interacts with the user has knowledge regarding the user itself, for example,
using the hearing disability attribute, the interface can adapt its functionality to it and
increase the volume, thus better fulfilling its objectives [Teixeira et al., 2011a].
Finally, to enable the management of attributes and their data, the models include three
additional operations. With them, users are able to add, remove or change attributes and
their values in the models.
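A minimal sketch of how the node manager could consume this API is shown below; the Java types and the way the user is prompted are assumptions made purely for illustration.

// Hypothetical Java view of the user/context model API described above.
enum CriteriaResult { VALID, NON_VALID, UNDEFINED }

interface UserContextModel {
    CriteriaResult evaluateCriteria(String attribute, String operation, String value);
    String requestAttribute(String attribute);
    void addAttribute(String attribute, String value);
    void changeAttribute(String attribute, String value);
    void removeAttribute(String attribute);
}

class NodeManagerSketch {
    private final UserContextModel model;
    NodeManagerSketch(UserContextModel model) { this.model = model; }

    // Called when a VD association request carries a criteria element (attribute, operation, value).
    boolean verify(String attribute, String operation, String value) {
        CriteriaResult result = model.evaluateCriteria(attribute, operation, value);
        if (result == CriteriaResult.UNDEFINED) {
            // Attribute never filled in: prompt the user through the dedicated
            // user/context UI (not shown), store the answer and evaluate again.
            model.addAttribute(attribute, promptUser(attribute));
            result = model.evaluateCriteria(attribute, operation, value);
        }
        return result == CriteriaResult.VALID;
    }

    private String promptUser(String attribute) {
        return "90"; // placeholder for the user/context UI interaction
    }
}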

5.4.5.1 Attribute Nomenclature


For the previous aspects to be possible, it is necessary that designers have access to
standard attributes which do not change from node to node. For this, we propose an
attribute nomenclature at the architectural level. Attributes are "created" by virtual
domains - designers - using a specific service (Section 5.5.3) and follow a standard structure
which abstracts a tree representation supported by a key-value format. For instance, the
attribute “hearing disability” can be represented in the following format:

user.characteristics.physical.hearing.disabilityLevel

The selected nomenclature is global to the entire architecture. The format was chosen
due to its clarity and ease of implementation, but different formats can be used if nec-
essary. Each entry is unique, and the tree representation intends to resolve problems which
could arise with attribute naming, re-utilization, ownership, among others. The

representation is structured by levels/prefixes in which every level has an owner, and new
levels can only be added with the owner's consent. The management of new attributes is performed
at the support unit level and its functionalities are accessible via the designer's account on the
Evaluation Hub.
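The prefix/ownership rule can be sketched as follows; the data structures and the explicit consent flag are hypothetical simplifications of the attribute management behaviour.

import java.util.HashMap;
import java.util.Map;

// Sketch of the level-based attribute nomenclature: each registered attribute
// (full dotted path) has an owner, and a new level requires the consent of the
// owner of its immediate prefix.
class AttributeTree {
    private final Map<String, String> owners = new HashMap<>();

    AttributeTree() {
        // Prefixes provided at infrastructure initialization, owned by the administrator.
        owners.put("user", "admin");
        owners.put("user.characteristics", "admin");
    }

    boolean register(String attribute, String requester, boolean ownerAccepted) {
        int dot = attribute.lastIndexOf('.');
        String prefix = (dot < 0) ? null : attribute.substring(0, dot);
        if (prefix != null && !owners.containsKey(prefix)) return false; // parent level must exist
        if (prefix != null && !ownerAccepted) return false;              // owner consent required
        owners.put(attribute, requester);
        return true;
    }

    public static void main(String[] args) {
        AttributeTree tree = new AttributeTree();
        tree.register("user.characteristics.physical", "designerA", true);
        tree.register("user.characteristics.physical.hearing", "designerA", true);
        System.out.println(tree.register(
                "user.characteristics.physical.hearing.disabilityLevel", "designerB", true)); // true
    }
}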

5.5 Support Unit


The support unit is an abstract middleware that employs a set of services to assure the
proper functioning of the architectural implementation. Its existence provides a level of
encapsulation between nodes and domains by guaranteeing that they do not communicate
directly, and facilitates the inclusion of new nodes and virtual domains. This encapsulation
is important to guarantee the security of the nodes and virtual domains.
In itself, the unit includes a set of services which are fundamental to a SOA approach.
Figure 5.13 illustrates the support unit overview. The figure includes services like the
node and domain registries, the association service, the interface repository, the attribute
service, and the evaluation mediation service.

[Figure: the Support Unit, comprising the Node Registry, Domain Registry, Interface Repository, Evaluation Mediation, Association and Attribute services.]

Figure 5.13: Support Unit Overview

It is important to note that the support unit represents a middleware decoupled from
both nodes and domains. As such, the scalability of its services is the responsibility of the
architecture itself and not a concern for either end users, evaluators or designers.

5.5.1 Node Registry


The node registry service has the objective of maintaining a directory of all nodes which
are active in the infrastructure. It guarantees the anonymity of nodes by being the only
service that knows a node's IP address, and acts as a translator for support services which
may need it. For that, the registry provides an API accessible via the support
unit, which allows the registration of new nodes, the retrieval of node information and
the removal of existing nodes.
The registration of a node requires the following properties:

• ID - corresponds to the node’s reference. It is used to uniquely identify the node.

• IP Address - indicates the IP of the node’s node manager module which will be used
to communicate with the node.

• Invitation Profile - determines if the node has a public, private or invitation profile. On a
public profile, all VDs that want to include the node in their network are able to do
so (subject to further criteria verification). On a private profile, only the node is
able to pinpoint which VD it wants to associate with. On an invitation profile,
the node receives an invitation whenever a VD wants to link it to a network, which
the user can either accept or refuse.
• Interests - indication of a few keywords that declare the user’s interests.
Only after registration can a node be associated with a VD and subsequently, receive
evaluations.
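Assuming a REST endpoint for the node registry (the URL and the JSON field names below are illustrative assumptions, not part of the specification), a node could register itself roughly as follows:

import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

// Illustrative registration request; endpoint and payload structure are assumptions.
public class NodeRegistrationSketch {
    public static void main(String[] args) throws Exception {
        String payload = """
                {
                  "id": "node-example-home",
                  "ipAddress": "203.0.113.10",
                  "invitationProfile": "invitation",
                  "interests": ["rehabilitation", "AAL"]
                }
                """;
        HttpRequest request = HttpRequest.newBuilder()
                .uri(URI.create("http://support-unit.example.org/node-registry/nodes"))
                .header("Content-Type", "application/json")
                .POST(HttpRequest.BodyPublishers.ofString(payload))
                .build();
        HttpResponse<String> response = HttpClient.newHttpClient()
                .send(request, HttpResponse.BodyHandlers.ofString());
        System.out.println(response.statusCode()); // e.g. 201 when the registration is accepted
    }
}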

5.5.2 Domain Registry


The domain registry service is a directory of all available virtual domains. It keeps a
registry of every VD associated to its actual location. Similarly to the node registry, the
domain service offers an API which other services can use to register new domains, obtain
domain information or remove existing domains. In addition, the registry stores
the accounts of designers/owners and their credentials, which are used to register domains.
In order to register a VD, the following information is required:
• ID - identifies the virtual domain in the infrastructure. It is not changeable and must
be unique.
• Scope - identifies the research purpose of the VD using keywords. These keywords
will later be used to match the VDs with nodes (together with the user's interests).
• Domain Language - describes the domain’s language. It is inspected by the domain in
order to find the interface and producer elements that will be required in the registry
using the cfw:isExecutedBy property.
• IP Address - states the location of the VD (more specifically, the VD's domain manager
module). Used for internal purposes.
• Evaluation Address - states the location of the VD main UI. It is used to allow
evaluators to access the VD. It is optional, as it depends on the VD having an
evaluation module that is accessible via the web.
• Visibility - describes if the domain is public or private. If it is public, the domain’s
resources (domain language and interface/producer modules) are visible in the eval-
uation hub.
• Invitation Profile - determines if the domain possesses a public, private or invitation
profile. On a public profile, the VD accepts all nodes that wish to be part of its
network. On a private profile, the VD only accepts nodes that were invited by the VD. On an
invitation profile, the domain receives invitations from nodes, which it may either
accept or refuse.

• Non-Structural Criteria - defines a set of criteria that will be used by the association
service to verify if a node is within the VD’s research purposes. The definition is
made using a list of public attributes (via the attribute service).

• Interfaces and Event Producer components - for every interface/event producer in the
domain language, the designer must indicate a component, as well as a description of
its (software) requirements. The same component can be linked to several interfaces
if so specified.

• Owner - identifies the designer to whom the VD is associated.

Some of these elements are dependent on others. For instance, the interface and event
producer components are dependent on the domain language. As such, an implementa-
tion of the service should divide the registration into several stages (f.i. using a wizard
approach).

5.5.3 Attribute Service

The attribute service is an auxiliary service with the objective of maintaining a central-
ized registry of usable attributes. Attributes represent characteristics or properties that
designers may use to create non-structural criteria to filter domain-node associations.
As stated before, our solution proposes attributes as a key-value structure follow-
ing a tree-like representation. Besides its clarity, this specification is also a good
choice for guaranteeing that attributes are not built at random or without meaning for
stakeholders. Their creation logic follows a level-based composition, where each level
has an owner, either a designer or an administrator, and every new level can only be
created if the owner of the attribute accepts it. For example, creating the attribute
user.characteristics.hearing.disabilityLevel can only be done if the owner of the attribute
user.characteristics.hearing accepts it. This way, the responsibility for each attribute is
associated with an account (designer/admin) that allows or disallows the creation of new sub-attributes.
At the initialization of the infrastructure, a set of prefixes can be provided by the archi-
tecture itself under the responsibility of the administrator.
In this specification, each attribute also possesses several properties which must be
registered:

• Usable - An attribute is marked as usable or not. If an attribute is usable, it means
it can be used in criteria. If not, it means that the attribute is only representative of
other attributes which contain it as a prefix.

• Model Correspondence - Every usable attribute corresponds to either the user or the
context model of a node. The node uses this information at runtime to know where
to request the attribute value when verifying criteria.

• Private/Public - Every attribute is either private or public. If it is private, it can
only be used by its creator. If it is public, then it is accessible to every designer.

• Durability - Every attribute has a validity period. Since the attribute is part of a criteria,
its value is requested from the user. Depending on the durability, the value is reused as
a reference for future criteria requests. If the attribute has no durability, then its
value is requested from the user on every criteria verification.

• Owner - Every attribute has an owner, which can be an administrator or a designer.
The owner is solely responsible for the attribute and is able to accept extensions
to it.

• Type - Represents the attribute’s value type. This value influences the operations
which can be used when composing new criteria. Possible options are integer, long,
string, boolean and datetime.

Upon registration, attributes become accessible (if public) for usage within criteria
when registering new virtual domains.

5.5.4 Interface and Event Producer Repository


By definition, a node can be part of a domain's evaluation network if and only if
the node satisfies both structural and non-structural criteria. While attributes represent
the non-structural part of criteria, interfaces and producers represent the structural part.
Regarding the structural requirements, in order for a node to be linked to a domain,
that node must possess at least one executor (interface or producer) for every cfw:Task
element, evt:Event element and cfw:EPREvent represented in the domain’s ontology. This
rule guarantees that any evaluation from the associated domain will be executable in the
node.
Despite being essential to the association of nodes and domains, interfaces and produc-
ers are independent components that can be used in multiple nodes and domains. To store
them, the architecture features an interface/producer repository only accessible to designers
via the Evaluation Hub. The repository keeps a directory where all interfaces/producers
are persisted as well as a description of them. The interface/producer itself can be an
executable or an installable software component.
Every interface/producer is described in the registry using a set of properties. These
properties include common aspects like name, ID, description, owner and public/private
visibility. They also include the input and output (interface only) structures which must be
respected by designers when building their domain languages. These representations are
instances of the Question and Answer specifications which we described in Chapter 4.
After registration, both interfaces and producers become accessible to other designers (if
public). To use them in their domains, designers must reference the interface/producer in
the domain language using its unique ID.

5.5.5 Association Service
The association service is a support unit service with the objective of mediating the
association between nodes and domains, and consequently, the creation of evaluation net-
works. It provides a set of operations to both nodes and domains, allowing them to identify
targets to whom they wish to become connected to, or to request the service to find targets
that match both their criteria and their interests. On an architectural level, the service is
part of the Support Unit and is accessible by both nodes and domains.
The association process starts when either a node or a domain indicates that it wishes to
associate with a domain or node, respectively. When the target is
identified, the service is responsible for creating an association request to the node's user
or the domain's designer (depending on who started the process). The request is an invite
made to the owner of the domain/node asking whether or not he wishes
to accept/belong to the node/domain. It is important to note that in cases where either
the node or the domain is public, the invitation is not presented, and in cases where the
node or the domain is private, the invitation is automatically declined.
If an invite is answered positively, the association service starts the next phase
of the association process by applying the domain's criteria. It is debatable whether or
not the criteria should be applied prior to the invitation. Applying the criteria before the
invitation may lead to cases where the criteria requires user intervention because attributes
have no correspondence in the node's user or context models. This seems illogical, since
the user is being asked to provide an answer to attributes which only have relevance if he
accepts a future invitation. On the other hand, applying the criteria after the invitation
results in weaker matching between nodes and domains. In this case, the matching can
only be performed using the user's and domain's declared interests. Overall, there are
trade-offs in both options and perhaps a mixed solution could prove best. Because of this,
we decided to label this decision as an implementation option for whoever implements the
Support Unit services.
Whichever option is selected, the criteria verification is a fundamental step
of the association process. After receiving a positive invitation answer, the association
service fetches the domain's criteria from the domain registry and sends it to the node
for verification. The verification process is composed of two individual parts,
one applied to the structural criteria and the other applied to the non-structural
criteria. Figure 5.14 illustrates criteria handling in the node with respect to its modules.
As said before, the non-structural criteria is composed of attributes which depict user or
context model information. The node receives the verification request and, for each attribute in the
request, asks the user or context model for its value. If a value does not exist in the models,
the node either rejects the criteria or asks the user for the value (depending on the user/context implementation
policies). In either case, the attribute's value is matched against the criteria's operation and
value. If, globally, all criteria are fulfilled, the verification of non-structural criteria
is considered successful.
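The matching of a single non-structural criteria element can be sketched as follows; the operation names and the numeric comparison are illustrative assumptions.

// Illustrative matching of one criteria element against the value stored in the models.
class CriteriaMatcher {
    // attributeValue comes from the user/context model; operation and expected come from the VD.
    static boolean matches(String attributeValue, String operation, String expected) {
        if (attributeValue == null) return false; // value missing and the node chose to reject
        switch (operation) {
            case "equals":      return attributeValue.equals(expected);
            case "biggerThan":  return Double.parseDouble(attributeValue) > Double.parseDouble(expected);
            case "smallerThan": return Double.parseDouble(attributeValue) < Double.parseDouble(expected);
            default:            return false; // unknown operation: fail safely
        }
    }

    public static void main(String[] args) {
        // 'hearing ability bigger than eighty', with '90' stored in the user model.
        System.out.println(matches("90", "biggerThan", "80")); // true
    }
}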
If the verification of the non-structural criteria is successful, the node then starts the
verification of the structural criteria. Structural criteria is composed of the domain's

[Figure: the Association Service sends the criteria to the Node Manager, which verifies structural criteria against the Interface Manager and Event Logger+Dispatcher, and non-structural criteria against the User and Context Models.]

Figure 5.14: Criteria Handling in the node - Propagating structural and non-structural
criteria between the node’s modules

interfaces and event producers. The condition for fulfilling it states that all interfaces
and producers in the list sent by the association service must exist in the node. The list itself is based
on the domain's registration and its domain language. The node verifies the list by con-
sulting its own interface manager and Event Logger+Dispatcher modules. For each
missing interface, the node asks the user to install it (via the interface/producer reposi-
tory). At this point, the user can perform the installation or cancel it. If the user does
install all interfaces as requested, the verification of structural criteria is performed once
more. If successful, the node informs the association service that it is now compat-
ible with the domain. When the service receives confirmation, it alerts the corresponding
domain and persists this information.
To support all of these steps, the association service maintains an API providing multiple
operations. The process is not instantaneous and may take a long period
of time, since it may require user interaction. For this reason, the API may apply a
ticket methodology for identifying ongoing associations. The service itself is also able to
communicate with other services, like the domain or node registries, and includes auxiliary
operations which allow other services to make requests to it. Finally, in addition to
supporting the association between nodes and domains, the service also provides operations
responsible for their dissociation.

5.5.6 Evaluation Mediation Service


One of our initial requirements was based on the necessity of making nodes autonomous,
capable of executing evaluations which do not require a constant connection to their creator.
By definition, our architecture features a strong decoupling between nodes and virtual

domains that applies this principle. Nodes require a connection to the infrastructure to
receive evaluations, but afterwards they are able to apply the evaluation autonomously
and independently. They can be offline if necessary and still execute evaluations
as long as the node itself is functioning normally. The ability to be offline can be important
in some scenarios. While communications are almost always available, they may impose
costs on data plans or battery life that can be important to a user.
A different but equally important requirement was the ability to provide evaluators with
updated results on demand. In this case, the ability to go offline is counter-productive,
since a disconnected node is not able to send new data to the domain. Our archi-
tecture's flexibility enables implementations to address both requirements with more or less
importance. For instance, it is possible to ensure permanent connectivity by
dividing the node into several components, some of which need to be always online. These options
depend on the architecture's goals, and as such we do not impose "always on" communica-
tions at the architecture level, leaving it as an implementation option. On the other hand,
direct communication between nodes and domains is something that should be avoided for
security reasons. For this purpose, we decided to create an intermediary service that facili-
tates evaluation management by acting as a mediator between domains and nodes through
specific operations aimed at dealing with evaluation deployment, cancellation and data
gathering. This way, not only is direct communication avoided, but it also becomes possible to
apply fault-tolerance policies when a node or a domain goes offline.

5.5.6.1 Managing an evaluation


The architecture allows nodes and domains to reference each other using IDs but with-
out knowing what their actual location is. Domains start an evaluation by contacting the
mediator service, which forwards it to the selected nodes. Figure 5.15 illustrates the process
of starting an evaluation by sending it to a node. Upon receiving the request, the mediator
starts by verifying if the node and the VD are in fact associated. If the association is
confirmed, the mediator makes a request to the node registry in order to obtain the node’s
IP location and then sends the evaluation to the node. The evaluation is then received by
the node and the result of the operation is communicated to the VD. Operations regarding
cancellation and the retrieval of evaluation results follow a similar methodology.
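The mediation flow of Figure 5.15 can be summarised with the following sketch; the service interfaces are hypothetical simplifications (the ticket mechanism, for instance, is omitted).

// Sketch of the mediation flow; interfaces stand in for the actual Support Unit services.
interface AssociationService { boolean verifyAssociation(String nodeId, String domainId); }
interface NodeRegistry { String getNodeAddress(String nodeId); }
interface NodeClient { boolean startEvaluation(String nodeAddress, String domainId,
                                                String evalSpec, String execSpec); }

class EvaluationMediationSketch {
    private final AssociationService associations;
    private final NodeRegistry registry;
    private final NodeClient nodes;

    EvaluationMediationSketch(AssociationService a, NodeRegistry r, NodeClient n) {
        this.associations = a; this.registry = r; this.nodes = n;
    }

    // Returns true when the evaluation was accepted by the node, false otherwise.
    boolean startEvaluation(String domainId, String nodeId, String evalSpec, String execSpec) {
        if (!associations.verifyAssociation(nodeId, domainId)) {
            return false;                                   // node and VD are not associated
        }
        String address = registry.getNodeAddress(nodeId);   // only the registry knows the node's IP
        return nodes.startEvaluation(address, domainId, evalSpec, execSpec);
    }
}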

5.6 Evaluation Hub


The Evaluation Hub is a support service directed at all stakeholders. It represents an
entry point to the architecture that provides management operations such as registration
of new accounts, access to the interface repository, creation of new domains and nodes,
search for existing VDs, and registration of attributes. Figure 5.16 illustrates the Evaluation
Hub UIs and their correspondence with the Support Unit services. The Evaluation Hub
itself is a service, like any VD or node, permanently connected to the Support Unit, and is
managed by the infrastructure's administrator.

[Sequence diagram: the Virtual Domain calls startEvaluation(nodeID, evalSpec, execSpec) on the Evaluation Mediation Service and receives a ticketID; the mediator calls verifyAssociation(nodeID, domainID) on the Association Service; if the association holds, it obtains the node's specifications from the Node Registry via getNodeSpecs(nodeID), forwards startEvaluation(domainID, evalSpec, execSpec) to the Node, which acknowledges and executes the evaluation, and the mediator replies to the VD with notifyRequest(ticketID, 'true'); otherwise it replies with notifyRequest(ticketID, 'false').]

Figure 5.15: Sequence Diagram regarding the initialization of an evaluation and its medi-
ation through the Evaluation Mediation service

The existence of the Evaluation Hub is necessary to assure the correct operation of the
infrastructure by guaranteeing that all accesses to the support services on the Support Unit
are done with proper authorization. In this regard, the hub constitutes the first line of
interaction with the infrastructure by providing UIs which designers and users have access
to via accounts which they are also able to create in the hub. As such, the hub acts as a
presentation layer to the architecture.
The Evaluation Hub’s primary function for designers is the ability to register virtual
domains. It provides a UI in which the designer must indicate the VD’s location, interests,
interfaces, criteria and other properties which are fundamental to registering a VD. The
hub then assumes the responsibility of verifying the inserted information and either validating
the new virtual domain or declining its registration.
In addition to the registration of virtual domains, the designer is able to view (public)
existing virtual domains and their domain languages. The objective is to allow the designer
to verify if elements from existing domain languages can be useful in his own domains. With
a similar purpose, the designer is given access to the attribute and interface repositories via
specific UIs. In regard to the attribute service, the designer is able to create new attributes
by requesting them and to verify extension requests for his own attributes. In regard to the
interface/producer repository, the designer is able to analyze public interfaces/producers by
verifying their input/output structures and other properties. The access to both attributes
and interfaces/producers is provided to designers with the purpose of enabling their reuse
if the designers intend to do so.
Concerning the user, the hub allows him/her to register the respective node. As with
virtual domains, the hub validates the registration and inscribes it into the node registry
if successful. For end users, the hub also provides search functions over existing

[Figure: the Evaluation Hub, accessed by evaluators, users and designers, provides Account Authentication/Registration, Domain Search, Domain/Node Creation/Location, Attribute Registration and Interface Repository UIs on top of the Support Unit services.]

Figure 5.16: Evaluation Hub UI Overview

domains which they may use to find virtual domains that may interest them.
Evaluators have access to the infrastructure as well. Since virtual domains are inde-
pendent, evaluators must abide by each VD’s own rules in regard to authorizations and
authentications. To search for VDs, however, evaluators may resort to the Hub and
inspect existing VDs' conditions and endpoints.

5.7 Summary
In this chapter we explained our architectural proposal. The proposal is transversal
to our evaluation model and provides users, evaluators and designers with a scalable so-
lution based on SOA principles that embodies the fundamental principles of our dynamic
evaluation paradigm.
In summary, the architecture features a core represented by the Support Unit and
the Evaluation Hub to ensure the proper operation of the infrastructure. The flexible
nature of virtual domains allows designers to create evaluation methodologies on a stable
backbone, and gives evaluators the tools to build evaluations that take context into account,
conduct them with the target audience and obtain the necessary data to aid them in their
objectives. Nodes allow users to become part of evaluation areas that interest them and
to participate in evaluation tests without constant logistical intervention, and give evaluators
the ability to apply evaluations in multiple locations, using diverse hardware devices and
with adaptable interaction. Overall, the proposal is supported by an evaluation model which
guarantees the correct operation of the entire solution.

In the next chapter we will join the model, the methodology and the architecture and
describe the implementation of our solution and its usage in a proof of concept scenario.

Chapter 6

Proof of Concept

In order to demonstrate the validity and feasibility of our evaluation solution, we decided
to create a proof of concept to verify our main contributions and analyze its usage in real
scenarios [Pereira et al., 2015]. Based on this objective, our proof of concept is divided into
three major parts: (1) developing an instance of our architectural approach; (2) creating
a domain language and an evaluation test using the developed solution; and (3) applying
the evaluation test to a set of users using the developed solution.
The development was centered on the creation of two base frameworks, representative
of the node and the virtual domain. The domain and the evaluation were centered on a
TeleRehabilitation application. The evaluation was applied to a set of users while they
used the application in an AAL environment.
In this chapter, we will present the implemented solution, starting with
both frameworks. Then, we will explain the creation of the Rehabilitation virtual domain
and the details of the TeleRehabilitation evaluation test. Finally, we will describe the
application of the test and its results.

6.1 A First Instantiation of the Architecture:


Dynamic Evaluation as a Service Platform
As stressed before, our proposal has a strong architectural influence on SOA principles,
namely in the way that both nodes and virtual domains were designed (see Sections 5.3 and
5.4). Both cases are seen by the infrastructure as decoupled components that offer their
functionalities to others via well defined APIs. More than components, they are services
that have their own operation logics, abstracted by a generic API that allows them to be
part of the infrastructure. On its part, the infrastructure sees these services as content
providers to whom it may address requests.
Following this distributed logic and the concept of both nodes and VDs as services,
we have named our developed solution Dynamic Evaluation as a Service, or simply the
DynEaaS platform. The naming reflects the vision that we have of both the nodes and
the domains. Users and their nodes are resources that can be called to perform evaluation

scenarios in predefined conditions. Virtual domains are abstract locations that represent
specific objectives, conditions and evaluation methods that can be used by evaluators
to rapidly create and apply evaluation tests. In both cases, a logic of Software as a
Service [Ma, 2007] is present, or more concretely, a logic of Evaluation as a Service.
Evaluators resort to VDs to create evaluations. VDs resort to nodes to apply evaluations.
Behind them, a backbone structure also composed of services completes the assembly of a
decoupled, modular and distributed evaluation solution.
Given the specifications of our scenario, rather than fully implementing the architectural
proposal, the DynEaaS platform is formed by two frameworks that can be reused by
developers for the creation of new nodes and new virtual domains: the node and VD
frameworks. Regarding the support unit, to reduce implementation time, we chose to
embed part of its functionalities within the frameworks. Despite this, we still created
embryonic versions of the services but did not use them in this proof of concept.
In the case of the node, we have developed an ‘almost ready to use’ solution that can
be installed and rapidly initialized. This solution is a standard representation of a node
and follows the specifications of the evaluation definition model. It is composed of a series
of modules that were implemented according to the architectural specification of the node. Its
objective is for any user to be able to set up a simple node, be ready to become linked with
a virtual domain and, consequently, receive evaluation tests.
The design of the virtual domain is different. Since the VD depends on the domain
ontology and its specifications, it is not possible to create a unified component that repre-
sents all possible virtual domains. Architecturally, we accounted for this aspect by splitting
the VD into two main modules, one which was generic - the domain manager - and a
second which was auxiliary and dependent on the specification - the evaluation module.
To facilitate the design and incorporation of new VDs, we have developed a version of
the domain manager module and implemented a sample version of an evaluation module.
This sample version is associated with an example evaluation domain and can be used by
designers as a starting point to create or extend new VDs.

6.1.1 Node Framework


The node framework encompasses a set of modules designed according to the prin-
ciples of the node architecture presented in Section 5.4. Our implementation of the node
does not include all modules of the node architecture, as some were not fundamental
to our proof of concept, namely the node manager (embedded into the evaluation engine)
and the user/context modules (developed only as a key-value storage). All others were
implemented as part of the node framework.
The modules were mostly developed in Java and communicate with each other using web
services. The choice of web services supports both a centralized approach, where all modules
are near each other and communicate easily, and the setup of nodes which are spread
across several locations. This possibility can be
important in the sense that many evaluations may require the usage of mobile phones or
wearable devices that the user constantly has with him, but in which it is not possible

to install a full node given the low processing capabilities of those devices as well as the
short battery life. By using web services, it also becomes possible to place the modules
in a remote server (or the cloud) and place the interfaces and producers modules near the
user, thus creating a distributed node with the user at the center.
To install the node, the user can choose to install all modules on the same device. In
this case, the node will operate with a local logic. If, however, the user wants to install
modules in different locations, he will have to change the standard (localhost) addresses of
the modules. As already stated, the modules communicate via web services, specifically
using REST [Fielding, 2000]. The choice of REST instead of SOAP [W3C, 2000] was due
to its simpler nature, popularity and its basis on HTTP.

6.1.1.1 Evaluation Engine


The evaluation engine is one of the main modules of the node framework and has the
objective of parsing and managing incoming evaluation executions. The engine was devel-
oped using Jena [Foundation, 2015]. Jena is an open source Java framework for constructing
Semantic Web applications. It allows developers to programmatically handle languages
like RDF, RDFS and OWL, as well as SPARQL [W3C, 2008] for query-based specifica-
tions. Using it, we have created a generic engine that is capable of parsing Compflow
instantiations1 .
As seen before, evaluations are sent to the node as execution specifications. More
concretely, each specification is formed by a set of assessments mapped as Job instances.
The engine parses each Job element and applies the underlying assessment to the user using
the definitions from the evaluation specification ontology. The execution specification
is wrapped in a JSON element formed by the set of Job instances and the evaluation
specification. As it parses each Job instance, the engine delegates the execution of
Tasks and Compflow Events to the interface manager. To do so, the engine creates a
request consisting of the element in question as well as a callback address and waits for a
response. When it receives the answer via the callback address, it continues its execution
according to the specification.
The engine operates in a concurrent manner, creating threads to deal with each spec-
ification. To save the results of its operation, the node must be linked with the persistence
unit.
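For illustration, the sketch below shows how Jena can be used to list the Job instances of a received execution specification; the namespace and the file name are placeholders and do not reflect the exact Compflow vocabulary.

import org.apache.jena.rdf.model.Model;
import org.apache.jena.rdf.model.ModelFactory;
import org.apache.jena.rdf.model.ResIterator;
import org.apache.jena.rdf.model.Resource;
import org.apache.jena.vocabulary.RDF;

// Simplified sketch: enumerate the Job instances contained in an execution specification.
public class ExecutionSpecSketch {
    public static void main(String[] args) {
        String cfw = "http://example.org/cfw#"; // assumed namespace, for illustration only

        Model model = ModelFactory.createDefaultModel();
        model.read("execution-specification.ttl"); // RDF received from the virtual domain

        ResIterator jobs = model.listResourcesWithProperty(RDF.type, model.createResource(cfw + "Job"));
        while (jobs.hasNext()) {
            Resource job = jobs.nextResource();
            // Each Job maps to an evaluation assessment; its Tasks and Compflow Events are
            // delegated to the interface manager together with a callback address.
            System.out.println("Found job: " + job.getURI());
        }
    }
}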

6.1.1.2 Persistence Unit


The persistence unit is fundamental to the node's successful operation, as it rep-
resents the storage unit for evaluation results. In our framework, we chose not to include
a specific persistence unit and leave the decision to the user. The module's standard settings,
however, are based on a MySQL database. The schema of the database is automatically
created by the modules and is based on a triple store format following a "subject, predicate,
object" schema, commonly used by ontology definitions.
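For reference, a minimal version of such a triple-store table could be created as follows; the table and column names, database and credentials are assumptions of this sketch, since the framework generates its own schema.

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.Statement;

// Minimal "subject, predicate, object" table for storing evaluation results (illustrative).
public class TripleStoreSketch {
    public static void main(String[] args) throws Exception {
        try (Connection con = DriverManager.getConnection(
                "jdbc:mysql://localhost:3306/dyneaas", "user", "password");
             Statement st = con.createStatement()) {
            st.executeUpdate("""
                    CREATE TABLE IF NOT EXISTS triples (
                        subject   VARCHAR(255) NOT NULL,
                        predicate VARCHAR(255) NOT NULL,
                        object    TEXT         NOT NULL
                    )""");
            st.executeUpdate("INSERT INTO triples VALUES ("
                    + "'evt:IncreaseVolumeEvent_I1', 'rdf:type', 'evt:IncreaseVolumeEvent')");
        }
    }
}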
1 This module was developed in cooperation with Nuno Luz from ISEP [Luz, 2015].

6.1.1.3 Interface Manager

The interface manager is a Java module based on a publish-subscribe methodology.
The module contains a registry where it stores the endpoints of all of the node's interfaces.
To allow the registration of interfaces, the module has a REST API composed of two regis-
tration operations (one for the EventInterfaces and the other for the TaskInterfaces).
Each registration must contain the start and cancel endpoints which the manager uses to
communicate with the interfaces, as well as their URIs2.
Regarding its development, the module has a standard approach. When a request is
received by the module for the routing of a Task or an EPREvent, the module broadcasts
the request to all associated interfaces. Then, it expects an answer from one of them, or
a cancel request from the evaluation engine. When an answer occurs, the module notifies
all other interfaces to cancel the execution of the Task/Compflow Event. This policy
however can be changed if necessary by implementing multimodal practices like fusion of
information [Atrey et al., 2010] within the module.
To communicate with the engine, the interface controller has an internal API consisting
of two operations for each interface type: startTask, cancelTask, startEvent, endEvent.
The engine uses the start methods to send a Question/EPR element to the manager for
execution, adding an indication of which interfaces should handle it via their URIs.
Similarly, the cancel operations are used to terminate an ongoing Task/Compflow Event,
in which case the controller alerts all interfaces to stop their execution.
In the node framework, the interface manager was created as an internal module of the
Evaluation Engine.
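The broadcast-and-first-answer policy can be sketched as follows; the types below are hypothetical and stand in for the REST calls exchanged by the real module.

import java.util.List;
import java.util.concurrent.atomic.AtomicBoolean;

// Sketch of the interface manager's routing policy: broadcast a Task to every
// registered TaskInterface and cancel the others once the first answer arrives.
interface TaskInterface {
    void start(String taskUri, String question, Callback callback);
    void cancel(String taskUri);
}

interface Callback { void answered(String taskUri, String answer); }

class InterfaceManagerSketch {
    private final List<TaskInterface> interfaces;
    InterfaceManagerSketch(List<TaskInterface> interfaces) { this.interfaces = interfaces; }

    void startTask(String taskUri, String question, Callback engineCallback) {
        AtomicBoolean answered = new AtomicBoolean(false);
        Callback firstAnswerWins = (uri, answer) -> {
            if (answered.compareAndSet(false, true)) {   // keep only the first answer
                interfaces.forEach(i -> i.cancel(uri));  // tell the other interfaces to stop
                engineCallback.answered(uri, answer);    // forward the answer to the engine
            }
        };
        interfaces.forEach(i -> i.start(taskUri, question, firstAnswerWins));
    }
}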

6.1.1.4 EPR Engine and Event Logger+Dispatcher

The EPR Engine was implemented according to the base specification for EPRs. In this
sense, new operations which may be added (via the EPR Extended Specification) require
a revision in this module for it to operate accordingly. In the implemented version that is
part of the node framework, however, the engine includes additional operations, such as
EventOperation BiggerThan or EventOperation SmallerThan, both subclasses of
EventOperationFunction, which allow evaluators to compare parameters that may be linked
to the events.
In the node, the module operates as the sole EventInterface. As such, it receives
every EPR instance that every active evaluation may contain. The standard module was
developed in Java. As the module is an Event Interface, it possesses two main commu-
nication endpoints, the start and cancel operations. In the implemented version, these
operations were developed using a REST API and allow the module to communicate with
both the interface manager and the event logger+dispatcher.

2 In our implementation, all elements are identified by a URI which is unique and must be in accordance with the domain language.

Event Logger+Dispatcher The Event Logger+Dispatcher acts as an intermediary be-
tween all event producers that belong to the node and any party that is interested in them
(such as the EPR Engine). Because of this, the module is similar to the interface manager
as it is also based on a publish-subscribe methodology. Due to the possible issues in han-
dling a high number of publishers and their events, it was necessary to use an existing
solution that guaranteed a scalable implementation. In this sense, the module incorporates
the Rabbit Message Queue [Pivotal Software, 2015] software. This software provides a
safe and scalable solution that consists of placing received events in a message queue and
rapidly handling and routing them to interested parties after a quick analysis of the messages'
headers. The usage of a message queue like Rabbit MQ guarantees that a high number of
producers or subscribers will not constitute a bottleneck in the node.
To communicate with the module, event producers encapsulate their events in JSON
(in accordance with the Event Extended Specification). To identify components and route
the messages, all modules are identified via their URIs, which are unique according to the
evaluation definition model.
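As an illustration, an event producer could publish a JSON-encapsulated event through the RabbitMQ Java client roughly as follows; the exchange name and routing key are assumptions of this sketch.

import com.rabbitmq.client.Channel;
import com.rabbitmq.client.Connection;
import com.rabbitmq.client.ConnectionFactory;
import java.nio.charset.StandardCharsets;

// Sketch of an event producer publishing an event to the Event Logger+Dispatcher.
public class EventProducerSketch {
    public static void main(String[] args) throws Exception {
        ConnectionFactory factory = new ConnectionFactory();
        factory.setHost("localhost");
        try (Connection connection = factory.newConnection();
             Channel channel = connection.createChannel()) {
            channel.exchangeDeclare("dyneaas.events", "topic", true);
            String event = "{ \"uri\": \"evt:IncreaseVolumeEvent\","
                    + " \"timestamp\": \"2015-09-12T20:05:53\" }";
            channel.basicPublish("dyneaas.events", "evt.IncreaseVolumeEvent", null,
                    event.getBytes(StandardCharsets.UTF_8));
        }
    }
}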

6.1.2 Virtual Domain Framework


According to the architectural definition, a virtual domain is composed of three mod-
ules: a domain manager, an evaluation module and a data persistence unit (see Section 5.3).
Given the dependency that a VD has on its domain language, it is not possible to develop an
'out-of-the-box' solution for VD creation similar to what we did for nodes. With this in mind,
and in order to facilitate the creation of new VDs, we created a VD framework composed
of a generic domain manager that can be utilized for every VD implementation, and a
sample evaluation module based on the domain language for the evaluation scenario that
we will present later on. This sample facilitates the designer's task by allowing designers to
analyze an existing module and use a similar approach in their implementations.
The VD framework was developed in Java, more specifically using Enterprise JavaBeans
(EJB) [Oracle Corporation, 2014] technology and can be deployed in any Java application
server (such as Glassfish [Oracle Corporation, 2015] or WildFly [RedHat, 2013]). The
visual UIs were created using Java Server Faces (JSF) technology.

6.1.2.1 Domain Manager


The domain manager module provides a set of necessary operations for managing the
VD. It includes operations focused on handling evaluation deployment as well as some op-
erations to handle the VD’s network. Available to developers, the manager’s API includes
operations such as:

• startEvaluation(nodeUri, evaluationSpecification, executionSpecifications) - sends an evaluation to a node for execution;
• cancelEvaluation(nodeUri, evaluationUri) - cancels an ongoing evaluation on
a node;

• getEvaluationResults(nodeUri, evaluationUri) - gets the results of an evalua-
tion that either is in execution or has already ended in a node;
• addNodeToDomain(nodeUri) - requests the addition of a node into the domain’s
network. Subsequently creates an invitation that is received by the node;
• searchCompatibleNodes() - requests the association service to find compatible
nodes (in regard to criteria);
• removeNodeFromDomain(nodeUri) - requests the dissociation of a node from the
domain’s network;
• applySparqlQuery(sparqlQuery) - applies a SPARQL query to the persistence unit.

These operations allow developers to link their own evaluation modules, and other UIs,
to an implementation that provides them with a set of predefined operations.
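As an illustration, a custom evaluation module could rely on the manager roughly as follows; the DomainManager interface below mirrors the operations listed above, but the return types are assumptions made for the sake of the example.

// Illustrative usage of the domain manager API from a custom evaluation module.
interface DomainManager {
    String startEvaluation(String nodeUri, String evaluationSpecification, String executionSpecifications);
    void cancelEvaluation(String nodeUri, String evaluationUri);
    String getEvaluationResults(String nodeUri, String evaluationUri);
    String applySparqlQuery(String sparqlQuery);
}

class EvaluationModuleSketch {
    private final DomainManager manager;
    EvaluationModuleSketch(DomainManager manager) { this.manager = manager; }

    void runAndCollect(String nodeUri, String evalSpec, String execSpec) {
        String evaluationUri = manager.startEvaluation(nodeUri, evalSpec, execSpec);
        // ...later, triggered by the evaluator from the results UI:
        String results = manager.getEvaluationResults(nodeUri, evaluationUri);
        System.out.println(results);
    }
}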

6.1.2.2 Evaluation Module


As stated in Chapter 5, the evaluation module is an optional module of the VD. Its
operations can be achieved using the domain manager's API and through programs that
are able to handle ontologies, such as Protégé. Nonetheless, its existence is highly recom-
mended due to the complexities of ontology handling and the constraints of the domain
itself. With this module, evaluators are able to apply evaluations without handling any
details regarding the underlying specifications. However, due to the variations of the do-
main language, the VD framework does not contain a definitive implementation of it, but
a sample implementation for one domain. Despite this, a substantial part of the implemen-
tation, such as UIs, operations and algorithms, can be reused by designers as it is generic.
Regarding the UIs, the module contains generic UIs to:

• instantiate an evaluation;
• verify node status (f.i. to check if nodes are active before starting an evaluation);
• create evaluation assessments;
• create EPRs;
• retrieve evaluation results from a node;
• analyze assessments, enquiries and EPR results individually;
• analyze evaluation results from the procedure standpoint (using compflow elements);
• create and apply SPARQL queries to extract new data.

Regarding the algorithms, it is important to highlight the generation algorithm that
translates an evaluation specification into execution specifications.

EPR Creation UI Due to the complexity of the EPR specification, we created a visual
UI where evaluators are able to specify EPR instances. The UI is based on the generic
EPR specification and as such, includes the base operations (as well as a couple more that
were important for our test example). The UI abstracts the EPR instantiation step from
Section 4.5.2.1. Figure 6.1 showcases an example of the UI.

Figure 6.1: Defining an EPR using the DynEaaS EPR Creation UI

The UI presents a tree-like representation that can be built from the root to its leaves.
It allows the evaluator to select the operation and automatically creates the associated
structure for the operation. For instance, if the evaluator selects an And operation, then
the UI automatically places two elements that must be filled. The UI also verifies the
correctness of the EPR instance and only allows the evaluator to save the EPR when it is
valid.
Note that, according to the EPR specification, all leaf elements must be evt:Event
subclasses. Because the Event specification varies from domain to domain,
the DynEaaS sample evaluation module also includes a UI that allows evaluators to define
their events without handling the ontology. After doing so, it is then possible to include
those events within EPR instances. That UI, however, only allows the insertion of Events
in their raw
Due to its genericity, the UI can be reused by designers in their own evaluation module

implementations.

Evaluation Creation UIs The evaluation creation UIs allow evaluators to create eval-
uation specifications using a visual interface. By using the instances of the enquiry spec-
ification and the event/EPR specifications, this unit is split into two main parts: a UI
dedicated to the creation of evaluation assessments, and a second UI directed to the in-
stantiation of the specification.
Regarding the creation of assessments, the module includes a UI similar to the EPR
creation UI. The UI features a flow-like representation where evaluators can link either
Enquiry or EPR instances that have been previously created. When submitted, the soft-
ware automatically links the evaluation assessment with the corresponding evaluation, thus
removing the need for evaluators to deal with ontology details.
Regarding the second part of this feature, the UI allows evaluators to select a set
of target nodes and, according to a schedule, directly instantiate the evaluation. This
functionality uses the generation algorithm to generate the execution specifications from
the set of assessments on the evaluation specification and the domain manager’s API to
hide programming details from the evaluator when deploying evaluations to the nodes.

Evaluation Result Access UIs The evaluation result unit provides a series of UIs that
evaluators can use to access and analyze the results of an evaluation instantiation. Besides
allowing evaluators to view enquiry or EPR results, the unit includes UIs to access assess-
ment execution details, check previous evaluation instantiations, verify event occurrences
or analyze individual performances. Figure 6.2 illustrates a simple assessment displayed
in timeline format.

Figure 6.2: DynEaaS Timeline feature for representing an evaluation assessment’s results
through time

The UI allows evaluators to access all the instantiations of an evaluation, verify each
assessment in detail and each node to which it was applied, and even analyze details regarding
the procedure via control flow data. In addition to the viewing features of the unit, the
sample version of the evaluation module also includes a UI to retrieve the most recent
results of ongoing evaluations from the nodes.

SPARQL Query Viewer To allow more technical evaluators to query the data and
extract more precise information, we included a SPARQL UI
where evaluators can view the results in a raw format using XML. Figure 6.3 illustrates
the application of a sample query using the UI.

Figure 6.3: Using SPARQL queries to access data within the DynEaaS platform
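For reference, a query similar to the one applied in the figure could also be issued programmatically through Jena's SPARQL support; the prefixes below are placeholders for the actual domain language URIs.

import org.apache.jena.query.QueryExecution;
import org.apache.jena.query.QueryExecutionFactory;
import org.apache.jena.query.ResultSetFormatter;
import org.apache.jena.rdf.model.Model;
import org.apache.jena.rdf.model.ModelFactory;

// Sketch: extracting IncreaseVolume occurrences and their timestamps with SPARQL.
public class SparqlSketch {
    public static void main(String[] args) {
        Model model = ModelFactory.createDefaultModel();
        model.read("evaluation-results.ttl"); // results previously retrieved from a node

        String query = """
                PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
                PREFIX evt: <http://example.org/evt#>
                SELECT ?occurrence ?timestamp WHERE {
                    ?occurrence rdf:type evt:IncreaseVolumeEvent ;
                                evt:hasTimestamp ?timestamp .
                }""";

        try (QueryExecution exec = QueryExecutionFactory.create(query, model)) {
            ResultSetFormatter.out(System.out, exec.execSelect());
        }
    }
}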

6.1.2.3 Interfaces and Producers

As a virtual domain requires a set of interfaces and producers to interpret its domain
language, the VD framework includes sample versions of an interface component and a
producer component.

6.1.3 Support Unit and the Evaluation Hub
For our proof of concept, there was no need to implement the full Support Unit. To
reduce development time, we opted to embed fundamental aspects of its services within
the VD Framework and allow evaluators/designers to control those aspects via UIs that
the framework includes. Given this choice, services like the attribute service, the domain
registry or the interface repository were not fully developed, and thus not used in our proof
of concept.

Node Association The node registry was implemented as part of the VD framework and
holds the list of all nodes linked to the VD. Rather than being a centralized service, we
chose to register the nodes locally within the VD. The business logic is, however, similar.
Regarding the implementation, the VD framework includes a UI that evaluators can use
to register nodes, shown in Figure 6.4.

Figure 6.4: DynEaaS platform - Node registration UI

For our proof of concept, we considered that all nodes are registered in the VD and form
the VD's network. Given this and the fact that node registration is part of the
VD framework, the association service was not implemented, as most of its operations are
fulfilled by the node registry. As a consequence, the evaluation mediation service was also
implemented as part of the VD framework. The module abstracts three main operations:
starting evaluations, cancelling ongoing evaluations and retrieving evaluation results. These
operations are separated from the main implementation in order to allow their reuse if
necessary.

Finally, since our proof of concept involved a single domain, we also chose to embed
the evaluation hub as part of the VD framework. The implementation included account
authentication, registration and some search UIs regarding the domain language.

6.2 Creating an Evaluation for a concrete scenario:


The TeleRehabilitation Evaluation Test
To test the DynEaaS platform and, more importantly, to prove the feasibility of the pro-
posed evaluation solution, we conducted an evaluation test following a dynamic evaluation
perspective. This test was conducted in an AAL environment, and focused on the utiliza-
tion of a TeleRehabilitation [Teixeira et al., 2012] application by a set of users. Our goal
was to obtain data regarding its usage and measure the user experience of the application.

6.2.1 The evaluation scenario


Rather than applying a common evaluation test like a survey, we chose to evaluate
an application in an AAL environment. An AAL environment constitutes a reactive envi-
ronment where context has a fundamental influence on the user's actions and is thus
more suited to our proposal. The evaluation had the purpose of obtaining data regard-
ing a TeleRehabilitation application during its usage. TeleRehabilitation is an application
that allows users to perform a rehabilitation session in their homes under the supervision of
a remote physiotherapist. It was designed for an AAL environment in which users have
limited mobility and incorporates different communication methods to improve usability.
Due to its distributed nature, the application is divided into two parts, a patient side and a
therapist side. Figure 6.5 illustrates the two parts of the TeleRehabilitation system.

(a) Patient Interface (b) Physiotherapist Interface

Figure 6.5: The TeleRehabilitation Application User Interfaces

For our test, we targeted the patient side as the focus of the evaluation. The therapist side was contemplated only as an event producer. The objective was to obtain “in situ” data that could provide evaluators with information regarding the patient’s experience with the application. In addition, the results should also provide developers with indications that can assist them in future iterations of the application. With these premises, the evaluation focused on aspects such as:

• the user’s opinion of the application when using it and after using it;

• if all components were working as the user expected them to work (namely the
exercise demonstration, the communication windows, the voice commands and the
initialization stages);

• if certain components were not used by the user, why that was the case;

• if the interface was intuitive or not (during usage);

• the cause of unexpected interactions (pressing several buttons at once, or repeating the same voice commands).

Rather than a typical evaluation test in which users are prompted with dozens of questions, evaluators intended to use DynEaaS to obtain data which they would otherwise not be able to collect. As such, questions regarding the color of the interface, the overall feel of a component, or classifying the application using selective words, among others, were not included in the evaluation. Evaluators also did not want to overload the user by interrupting his usage of the application. Because of this, VD interfaces were embedded into the application itself as a way of removing the need for the user to focus elsewhere during the rehabilitation session.
The test was programmed to last half an hour and began when the user logged into the application. As the cooperation of evaluators and developers was necessary for the creation of the scenario, it was also possible for us to analyze their usage of DynEaaS throughout the entire preparation and execution process, regarding aspects such as:

• the node framework versatility and ease of installation;

• the node’s ability to integrate well with other applications;

• the developers’ difficulties in using the node’s functionalities to design and integrate new interfaces/producers;

• the designer’s ability when creating a domain language;

• the evaluator’s ability to successfully use a virtual domain;

• the virtual domain framework as an extendable software component;

• the evaluator’s ability to create an evaluation without assistance;

• the evaluator’s comprehension of the solution;

• the overall value of the results to the evaluators.

It is important to note that, before using DynEaaS, designers and evaluators attended a short presentation in which the dynamic evaluation paradigm was introduced, as well as the main features of our DynEaaS solution.

6.2.2 Defining the domain language


To conduct the evaluation test, it was necessary to design a domain focused on evaluating a TeleRehabilitation application for an AAL environment. In this sense, we started by creating a domain language that included the necessary evaluation elements, using the methodology proposed in Chapter 4.

6.2.2.1 Defining the Enquiry and Event Extended Specifications


The first step in creating the domain language was to extend the Enquiry, Event and EPR specifications with the new evaluation elements that would be required in the domain. In this sense, designers and evaluators were asked to define the domain language (with our assistance) by envisioning the evaluation elements that they might need when applying the evaluation test. Having these elements in mind, they were asked to start with the Enquiry specification.
Rather than creating Question subclasses with new semantic meaning, designers chose to create “question types” such as the OpenQuestion, the MultipleChoiceQuestion and the BooleanQuestion. While the created classes did not correspond to our expectations, they are still perfectly valid and in conformity with our solution. Figure 6.6 illustrates the visual representation of the Enquiry extended specification.
In this specification, designers included: open answer questions, in which the user can provide his answer freely; multiple choice questions, which have a set of predefined answers from which the user must choose; and boolean questions, a specific type of multiple choice question in which only two answers (‘true’ and ‘false’) are possible. At the same time, designers defined the Answer specification by including the MultipleChoiceAnswer, the OpenAnswer and the BooleanAnswer, in correspondence with the question subclass elements. All question types also included a domain:hasText property to state the actual content of the question (not represented in the figure due to its genericity).

Event Extended Specification Following this specification, the designers were asked to create the Event Extended specification with regard to the concept of a Rehabilitation domain. For this step, they chose to incorporate a set of events linked to the application’s execution from both the patient and the therapist sides. The resulting specification included events that provide feedback regarding the several phases of a rehabilitation session, as well as events that signal the usage of a certain component. In addition, the specification also included events to represent certain requests that can be made by the therapist
Figure 6.6: TeleRehabilitation Domain - Creating the Enquiry Extended Specification

and that could influence the user’s behaviour. Altogether, the full list of events ranged from the login stages to the exercises and the interaction components. In Table 6.1, we present the full list of events contemplated by the Rehabilitation domain, together with their respective producers.
It is important to note that some of these events incorporate parameters not shown in the table. For instance, the TR_ExerciseStatus event carries the percentage of exercises already completed by the user.

6.2.2.2 Performing the Association Process


After defining the base elements that formed the vocabulary of the future Rehabili-
tation domain (by defining the new elements of the Enquiry, Event and EPR extended
specifications), designers were asked to associate the extended specifications with the con-
trol flow ontology. This process, which we named association process (see Section 4.4.2),
links the newly defined elements (Questions, Events and EPRs) with control flow ontol-
ogy elements (Tasks and EPR Events) and defines a set of rules that will later allow the
automatic transformation of evaluation specifications into execution specifications that all
nodes are capable of interpreting.
As the EPR specification was unchanged, the base EPR extended specification presented in Section 4.4.2.1 was sufficient. This means that, regarding the VD framework, designers can simply use the EPR Creation UIs without any modification. Regarding the Enquiry extended specification, however, as changes were made to it, designers had to perform the association process. Using an example as a reference, the designers were able to create the
Event Designation         Description                                                  Producer
TR_NextExercise           Advancement to a new exercise.                               Patient
TR_newExerciseList        Activation of a new exercise list.                           Therapist
TR_SensorSelect           Selection of a sensor in the interface.                      Therapist
TR_SendExercise           Sending a message to the other user.                         Patient
TR_RemoveExercise         Elimination of an exercise from the exercise list.           Therapist
TR_Login                  Successful authentication.                                   Patient
TR_PreviousExercise       Selection of a previous exercise.                            Therapist
TR_TimeChangeExercise     Changing the time period of an exercise.                     Therapist
TR_SensorZoomReset        Resetting the sensor viewing component.                      Therapist
TR_ExerciseStatus         Indication of the status of the currently active exercise.   Patient
TR_SensorZoom             Performing zoom on the sensor component.                     Therapist
TR_ReceiveMsgChat         Receiving a message on the chat component.                   Patient
TR_SendMsgChat            Sending a message via chat component.                        Patient
TR_SelectExercise         Selecting an exercise.                                       Patient
TR_endExerciseList        Finishing an exercise list.                                  Patient
TR_SessionStart           Initiating an exercise session.                              Patient
TR_SessionEnd             End an exercise session.                                     Patient

Table 6.1: List of Events that compose the Event Extended Specification

required specification. Figure 6.7 illustrates the result of this process. Due to the difficulty of the process, however, this step required a high degree of assistance from us.
The result of this process defines three types of tasks that will form the future execution specifications: the OpenQuestionTask, the MultipleChoiceTask and the BooleanQuestionTask. Each of these tasks establishes the correspondence between the question elements and the answer elements and, more importantly, how the nodes should handle them.
Concerning the Event extended specification, it is important to remember that, since events are always part of EPRs, they do not require an association with control flow elements at this phase.
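To give an idea of what this association amounts to at the ontology level, the following sketch builds one of the three task classes using Apache Jena. The namespaces, property URIs and the use of direct class-to-class triples are simplifying assumptions made for the example; in the actual model, the cfw:hasInput/cfw:hasOutput links would more likely be expressed through OWL restrictions.

import org.apache.jena.ontology.ObjectProperty;
import org.apache.jena.ontology.OntClass;
import org.apache.jena.ontology.OntModel;
import org.apache.jena.rdf.model.ModelFactory;

public class AssociationSketch {
    public static void main(String[] args) {
        // Hypothetical namespaces; the real ontology URIs may differ.
        String cfw = "http://cfw#";
        String domain = "http://domain#";

        OntModel model = ModelFactory.createOntologyModel();

        OntClass task = model.createClass(cfw + "Task");
        OntClass openQuestion = model.createClass(domain + "OpenQuestion");
        OntClass openAnswer = model.createClass(domain + "OpenAnswer");

        // OpenQuestionTask is a Task whose input is an OpenQuestion
        // and whose output is an OpenAnswer.
        OntClass openQuestionTask = model.createClass(domain + "OpenQuestionTask");
        openQuestionTask.addSuperClass(task);

        ObjectProperty hasInput = model.createObjectProperty(cfw + "hasInput");
        ObjectProperty hasOutput = model.createObjectProperty(cfw + "hasOutput");
        openQuestionTask.addProperty(hasInput, openQuestion);
        openQuestionTask.addProperty(hasOutput, openAnswer);

        model.write(System.out, "TURTLE");
    }
}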

Associating the Interfaces Upon completing the association between the new elements and the control flow ontology, it was necessary to discuss how the user should be presented with the enquiries and by whom the events would be produced, in order to define the Interfaces and AtomicEventProducers of the specification. As a result, it was concluded that questions should be presented to the user inside the application itself. In a similar fashion, most events would also be produced by the application.
Since the newly created question types differed in their nature, it was necessary to create specific interfaces for each one of them. All question classes have their content represented by the domain:hasText property but differ in what they involve, as the
Figure 6.7: TeleRehabilitation Domain - Associating the Enquiry Extended Specification with control flow elements

OpenQuestion allows the user to provide any kind of answer, while the MultipleChoiceQuestion places restrictions on which answers are possible. Because of this, the interfaces that present each of these questions to the user could not be the same, even if embedded into the same application.
As a result, we defined two new interfaces: a TROpenQuestionInterface that would deal with OpenQuestionTask elements and a TRMultipleChoiceInterface that would deal with MultipleChoiceTask (and BooleanQuestionTask) elements. Figure 6.8 illustrates the end result of the domain language in regard to the Enquiry specification.

Associating the Event Producers Regarding the Events, it was necessary to define their producers. Since some of the Events are produced by the patient side of the application and others by the therapist side, the definition included two AtomicEventProducer subclasses: the TeleRehabilitation_PatientAEP and the TeleRehabilitation_TherapistAEP. As such, the resulting domain language (regarding the Events) explicitly associates the events with the corresponding event producer. To exemplify this, part of the specification is illustrated in Figure 6.9.
After performing all these steps, the complete domain language was then formed by
joining all specifications into a single ontology that was used as the model of the VD.

Figure 6.8: TeleRehabilitation Domain - Resulting domain language regarding the Enquiry specification

Figure 6.9: TeleRehabilitation Domain - Resulting domain language regarding the Event specification

6.2.3 Implementing the Virtual Domain


6.2.3.1 Creating the evaluation module for the VD
As the domain language was completed, the next step was to build the VD according to its content. As previously stated, a VD is composed of two main modules: a domain manager and an evaluation module. Since the domain manager does not change, the designers’ task centered on developing a version of the evaluation module that incorporated the changes of their domain language.
To facilitate the creation of new domains, the VD framework included a sample evaluation module that incorporated UIs and algorithms that could be reused by developers/designers if necessary. Since no modifications were made to the EPR specification, designers were able to use the EPR creation UIs that were part of the base VD framework. In the same way, UIs and back-end operations to create assessments or evaluations were also reused.
Since the Enquiry specification changed, it was necessary to develop a new UI in which evaluators would be able to create new Enquiries without handling ontology details. Figure 6.10 shows the Question creation UI based on the domain language from the previous section.

Figure 6.10: TeleRehabilitation Domain - Enquiry Creation UI

To develop this UI, designers had to write code for the new elements of the Enquiry specification. It is important to note that this step was the largest implementation requirement for the creation of the VD and was unavoidable. The front-end was developed using the sample VD framework Enquiry Creation UI as a base.
The resulting UI allows evaluators to create three types of questions, thus matching the specification. Since no changes were made to the other specifications, the VD required no further modifications. As such, after creating the Enquiry UI, the VD was complete and ready to be used.

6.2.3.2 Implementing the interface/event producer components
In order to apply an evaluation test based on the created VD, the respective interfaces and event producers within the domain language must be implemented as real components. As stated before, for the Rehabilitation scenario, designers (and evaluators) chose to embed the interaction into the actual TeleRehabilitation application. This way, the user would answer the questions while doing the rehabilitation session and thus would not have to direct his attention elsewhere. At the same time, the domain language also established that all events were produced either by the patient side or by the therapist side of the application. These choices mean that the patient application is both an interface and an event producer, and the therapist application is an event producer.
Starting with the interface, the specification pointed to three types of Tasks that the interface should be able to handle: the OpenQuestionTask, the MultipleChoiceTask and the BooleanQuestionTask. For each of them, the application developers created a specific GUI where the user could view the question and answer it within the application. To connect them with the node framework, the developer used the method specified in Section 5.4.3, thus creating a REST API composed of operations through which the application would receive the questions. Figure 6.11 illustrates the question GUI created in the patient TeleRehabilitation application.
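A minimal sketch of what such an endpoint might look like on the patient application side is shown below, using JAX-RS annotations. The resource path, payload handling and class names are assumptions for illustration only and do not reproduce the actual TeleRehabilitation code.

import java.util.Queue;
import java.util.concurrent.ConcurrentLinkedQueue;

import javax.ws.rs.Consumes;
import javax.ws.rs.POST;
import javax.ws.rs.Path;
import javax.ws.rs.core.MediaType;
import javax.ws.rs.core.Response;

// Hypothetical REST resource through which the node's interface manager
// delivers question tasks to the patient application.
@Path("/interface/questions")
public class QuestionTaskResource {

    // Tasks waiting to be rendered by the application's GUI layer.
    private static final Queue<String> pendingTasks = new ConcurrentLinkedQueue<>();

    @POST
    @Consumes(MediaType.APPLICATION_JSON)
    public Response receiveQuestionTask(String questionTaskJson) {
        // Hand the task (OpenQuestionTask, MultipleChoiceTask or BooleanQuestionTask)
        // over to the GUI, which renders it inside the rehabilitation screen.
        pendingTasks.add(questionTaskJson);
        return Response.ok().build();
    }
}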

Figure 6.11: Screenshot of the TeleRehabilitation application GUI for answering an eval-
uation question

Regarding the event producers, developers had to start by giving the applications the ability to detect the events through the user’s interaction. Then, in both the patient and the therapist applications, the developers implemented a module that creates the events according to their specification and sends them, in a JSON format, to the node framework using the API of the Event Logger+Dispatcher module. The following is an example of an event in this format:
{
  "eventType": "http://domain#TR_ExerciseStatus",
  "timestamp": "1431356665000",
  "data": {
    "percentage": "80"
  }
}
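The dispatch itself can be as simple as an HTTP POST of this JSON document to the node’s module. The sketch below illustrates the idea with java.net.HttpURLConnection; the endpoint address is an assumption, as the actual URL depends on the node’s configuration.

import java.io.OutputStream;
import java.net.HttpURLConnection;
import java.net.URL;
import java.nio.charset.StandardCharsets;

public class EventDispatchExample {
    public static void main(String[] args) throws Exception {
        String eventJson = "{ \"eventType\": \"http://domain#TR_ExerciseStatus\","
                + " \"timestamp\": \"1431356665000\","
                + " \"data\": { \"percentage\": \"80\" } }";

        // Hypothetical address of the node's event logging/dispatching endpoint.
        URL url = new URL("http://node.local:8080/events");
        HttpURLConnection connection = (HttpURLConnection) url.openConnection();
        connection.setRequestMethod("POST");
        connection.setRequestProperty("Content-Type", "application/json");
        connection.setDoOutput(true);

        try (OutputStream out = connection.getOutputStream()) {
            out.write(eventJson.getBytes(StandardCharsets.UTF_8));
        }
        // A 2xx status code indicates the node accepted and logged the event.
        System.out.println("Node responded with HTTP " + connection.getResponseCode());
        connection.disconnect();
    }
}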
After completing these steps, the interface and event producer components were finished, and the application was ready to be used within evaluations.

6.2.4 Creating the TeleRehabilitation evaluation test using the Virtual Domain
After creating the domain language, implementing the VD and changing the applications to include the interface and event producers, it was now possible to create the evaluation test using the developed solution. As said before, rather than an exhaustive test in which the user is constantly questioned, evaluators preferred a simpler scenario with well-defined situations in which certain events lead to an “in situ” interrogation. Based on this premise, evaluators were then asked to define their evaluation vision using the Rehabilitation VD.
To create an evaluation using the VD, evaluators were asked to define it by creating a set of evaluation assessments that represented their objectives. As such, evaluators started by defining this set outside the VD, from which we highlight the following assessments:
• an assessment intended to assess the overall opinion of the application after a certain period of usage. The assessment is composed of two elements, the first being an EPR which triggers five minutes after login, and the second a question with a number of possible answers.
• an assessment to detect a possible malfunction of the chat component. In case the user presses ‘Send chat message’ five times in a row within ten seconds, a question is triggered asking the user the cause of that event. Note that while the first assessment would always occur, being associated with time, the probability of this second assessment happening was very slim.
• an assessment to trigger a question when the user surpasses thirty percent of the exercise list. The question itself asks the user about the exercise demonstrations and their utility.
• an assessment to trigger if the user has not used the chat functionality at all after ten minutes. In this case, the user is asked why he did not use that functionality, whether because he did not notice it, did not need it, or felt it was not important.

The assessments ranged from general questions to conjunctural assessments based on situations in which the utilization of the application determines whether they are triggered or not. Since the assessments used both EPRs and Enquiries, to specify them in the VD it was necessary to first define their parts in their respective UIs. As such, for each assessment, evaluators were asked to use the EPR and Enquiry creation UIs and insert the necessary elements.

6.2.4.1 Creating the EPRs

The highlighted assessments were all directly linked to an EPR as the trigger of the entire assessment. Using the Events of the domain language, evaluators specified these EPRs using the EPR Creation UI. Based on the conceptual assessments, the resulting EPRs were:

• EPR should trigger five minutes after the login was performed - Defined by:
EventOperationDelay(TR_Login, interval=‘300’).

• EPR should trigger if a ’send chat message’ is performed five times under five seconds - Defined by:
EventOperationRepetition(TR_SendMsgChat, repetitionTimes=‘5’, interval=‘5’).

• EPR should trigger if the user surpasses thirty percent of the exercise list - Defined by:
EventOperationBiggerThan(TR_ExerciseStatus, value=‘30’, parameter=‘percentage’)

• EPR should trigger if the user did not use the chat component ten minutes after the login - Defined by the two EPRs:
TR_Login followed by EventOperationActiveInterval(EventOperationNot(TR_SendMsgChat), interval=‘600’, evaluatesAtEnd=‘true’).

6.2.4.2 Creating the Enquiries

After defining the EPRs within the VD, the next step was to define the enquiries that would be integrated into the evaluation assessments. Therefore, for each assessment, it was necessary to create the corresponding questions within the VD. In our implementation, it is not possible to include questions in an assessment without wrapping them in an enquiry. As such, evaluators had to create a set of enquiries, each containing one or more questions. Note that, since the created enquiries were directly associated with the objective of the assessment, most enquiries were composed of a single question.

6.2.4.3 Creating the Evaluation Assessments
After defining the enquiries and the EPRs, it was now possible to define the evaluation. To do so, evaluators defined their assessments using the Evaluation Assessment Creation UI by selecting EPRs or Enquiries from the list of evaluation elements. Figure 6.12 showcases one of those assessments, the example being the non-usage of the chat component by the user during the ten minutes after his login.

Figure 6.12: Defining an evaluation assessment using DynEaaS

The assessment possesses three elements: two EPRs and a subsequent enquiry. Sequentially, the example assessment is defined by: (1) an EPR that triggers when the login is performed - EPR(TR_Login); (2) an EPR that verifies whether no chat messages are sent within ten minutes - EPR(EventOperationActiveInterval(EventOperationNot(TR_SendMsgChat), interval=‘600’, evaluatesAtEnd=‘true’)); (3) an enquiry containing a single question - “As of now, you still have not used the chat component. Why is that?”.
After inserting all assessments in the VD, the evaluation was then ready to be instan-
tiated and applied to the users.

6.3 Applying the TeleRehabilitation Evaluation Test


The application of the test involved, first of all, the preparation of the location where the test would be applied. Since TeleRehabilitation is composed of two applications, one for the patient and another for the therapist, it was necessary to find two locations in which the test could be conducted. After consideration, we decided to place the patient in a controlled environment, our Living Lab environment3. We chose the Living Lab because we required a test in which we could not only verify the different aspects that form our dynamic evaluation solution, but also test the DynEaaS platform itself. Regarding the therapist application, we selected an empty room where the therapist could calmly administer the rehabilitation session.
After selecting the locations, the first step was to prepare the node in the AAL Living Lab and link the applications to it. Since the node covered the AAL Living Lab, all modules were installed on the same machine, as the laboratory includes a private network that can be accessed by all devices in the building. Since the therapist application could also create events, it was configured to access the Event Logger+Dispatcher module of the node. Regarding the virtual domain, we installed it on a server and added the node using the DynEaaS UI. Figure 6.13 illustrates the result of this process.

Figure 6.13: Distribution of the software components and their communication routines
for the execution of the TeleRehabilitation evaluation test

After setting up all components and configuring the IP addresses, the scenario was ready for the evaluation to start. Note that some components (such as the therapist application) were set up at a remote location, thus forming the kind of distributed environment that the solution supports.
3 The Living Usability Lab is an AAL laboratory that simulates a user’s house and in which we are able to test products or applications aimed at the AAL paradigm.

6.3.1 Starting and applying the test
Rather than choosing a high number of users, evaluators decided to conduct a small
evaluation as an experiment of the dynamic methodology. As such, the test was applied
to two users individually, one after the other within the Living Lab. The session was
coordinated by the therapist.
Before starting the test, the users were given a brief explanation of the application and alerted to the fact that they might receive some questions during the rehabilitation session. Also before the start of the session, the evaluators deployed the evaluation to the node using the DynEaaS platform. As everything was then ready, the users were placed in front of the application and started the test.
Throughout the test, the users performed the session as requested and answered the questions posed by the evaluation. The rehabilitation session was composed of a 12-step exercise list for the patient to complete. Figure 6.14 shows the user and the therapist using the TeleRehabilitation application as part of the evaluation test.

(a) User View (b) Physiotherapist View

Figure 6.14: Conducting the evaluation test during a TeleRehabilitation session between a
user and a therapist

During this period, the evaluators periodically updated the evaluation results on the
virtual domain by using DynEaaS to fetch the results from the node.

First impressions An early result of the evaluation test was that the application’s interface was not suited to the evaluation. Asking the user to answer the questions using the keyboard and mouse was uncomfortable. Since the rehabilitation session was performed a couple of meters away from the interface, answering evaluation questions forced the user to stop his current task, approach the monitor and then enter the information.
After this finding, evaluators decided that the interaction with the user should also be allowed via speech, rather than only through the keyboard and mouse. As such, it would be necessary to rewrite the evaluation test and the corresponding domain.

6.3.2 Performing a second iteration of the evaluation
The major modification the evaluators wanted was to allow the user to answer questions using speech as well, as a way of improving the user experience. To permit this, it was first necessary to change the applications in order to include speech technology. As a consequence, the underlying interfaces of the application (TeleRehabilitation_MultipleChoiceInterface and TeleRehabilitation_OpenQuestionInterface) would also have to be altered. As a result, developers implemented a new version of the application and its interfaces supporting speech.
Regarding the evaluation, since the interfaces were altered but no changes would be made to the actual domain language, the same VD could be used. However, after performing the modifications to the application, the evaluators thought it important to include in the test a new assessment regarding the usage of the newly included speech support and voice commands. This newly created assessment required an event that was not in the original specification, thus creating the need for alterations to the domain language.

6.3.2.1 Extending the domain


Since the modifications to the original evaluation were only based on a new event, the designers (with our feedback) decided to implement a second virtual domain by extending the already built domain. Using the original domain language as a basis, changes were made to the Event extended specification by including a new Event: the TR_SpeechCommand event. This event would trigger whenever the user issued a speech command and, as the event would be produced by the patient side of the application, the class was associated with the TeleRehabilitation_PatientAEP event producer (via the cfw:isExecutedBy property).
In order to maintain the correctness of the evaluation specification, we also advised the designers to change the names of the interfaces for proper identification. This way, it would be possible to easily compare results between the first and the second iteration of the test and to analyze the differences between their interfaces.

Changing the virtual domain implementation The only modification made to the domain language was the inclusion of the TR_SpeechCommand event; because of this, the VD required no modifications beyond the addition of the event. Since the domain language had changed due to the interfaces, the VD was also updated with the new version in order to correctly perform the generation process (from the evaluation to the execution specification). After these steps, the VD was updated and ready to be used.

6.3.2.2 Extending the evaluation


As the previous domain language was extended, all evaluation specifications from the first domain were necessarily compatible with the second domain. In this sense, as it was only necessary to add a new assessment, evaluators were able to take the previous evaluation specification and extend it.
To create the new assessment, the evaluators once again had to introduce the EPRs and the Enquiries prior to introducing the assessment. This particular assessment had the objective of questioning the user if he had not used any voice commands in the five minutes after the login was made. To trigger the question, the assessment included two EPRs to verify its condition: EPR(TR_Login) followed by EventOperationActiveInterval(EventOperationNot(TR_SpeechCommand), interval=‘300’, evaluatesAtEnd=‘true’).
After inserting both the enquiry and the EPRs, it was then possible to add the assess-
ment to the evaluation. Having completed it, the evaluation was once again ready to be
applied.

6.3.2.3 Applying the second iteration of the evaluation test


As the infrastructure was already set up, the same users were informed of the possibility of using speech to answer their questions. Similarly to the first test, the session took half an hour and consisted of a telerehabilitation session between the user and the therapist. The rehabilitation session was once again composed of a 12-step exercise list. As in the first iteration, two users were asked to participate in the test in addition to the therapist. The evaluation was applied to each user independently.

6.3.3 Evaluation results


We will look at the results of the proof of concept from two perspectives: the results that DynEaaS provides concerning the TeleRehabilitation evaluation test, and the results concerning the creation and administration of the evaluation test using our dynamic solution. It should be noted that it was not in the scope of the test to use the results to validate the TeleRehabilitation application.

6.3.3.1 Results from the evaluation test


Using the DynEaaS platform, evaluators were able to analyze data that would not be accessible with conventional methods. Starting with the evaluation assessments, DynEaaS showed that the number of triggered assessments was similar for both users. Regarding the overall feel of the application, both users described it as pleasant. Regarding the experience with the exercise demonstrations after surpassing 30% of the exercise list, both users also described them as useful. Other assessments, such as the possible malfunction of the chat component or not using the chat functionality within ten minutes after login, did not trigger, as their conditions were not fulfilled. This, however, is also a result, as it showed that the users had no difficulties in using the chat component. Figure 6.15 showcases the result of a question from one of the assessments.
In addition to the results from the assessments, evaluators were able to observe the total time it took each user to perform each exercise by inspecting the event logger.
Figure 6.15: Example of the DynEaaS UI displaying information about a question in one
of the evaluation test’s assessments.

Through events like TR_NextExercise, TR_ExerciseStatus or TR_TimeChangeExercise, it became possible to compare the performance of both users and conclude which user had more ease in performing the exercises. Through events like TR_SessionStart and TR_SessionEnd, evaluators were able to analyze the total time associated with each session. Note that events regarding the therapist application were also available and provided feedback regarding the therapist’s performance.
Regarding the questions, beyond the user’s answers, DynEaaS allowed evaluators to analyze the total time it took each user to answer them and, above all, to compare the differences between using the speech modality and the keyboard and mouse modalities. Comparing the total times between the first and second sessions of the same user, evaluators were able to perceive that the speech modality led to quicker answers. For example, with regard to the exercise percentage assessment, one of the users took 47 seconds to answer the question in the first iteration of the test and only 23 seconds in the second. Note that we are not concluding that one modality is better than the other, but only indicating what evaluators were able to analyze via DynEaaS.
To establish a full comparison between the test iterations or to analyze other data, evaluators were also able, although this is more complex, to access the database through DynEaaS and perform SPARQL queries to extract more information. These queries could also be applied to the execution specification itself, thus analyzing the procedure in more detail. Figure 6.16 showcases this functionality with an example.
The example query inspects which elements within an evaluation assessment are currently in progress. The result of the query indicated that an EPREvent was the currently active element within the assessment. Beyond this result, evaluators could apply other queries to inspect when it started, what events had already occurred, among others.
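As an illustration of the kind of query involved, the sketch below retrieves the elements of an assessment that are marked as being in progress, again using Apache Jena. The vocabulary used (cfw:hasElement, cfw:hasStatus and the ‘InProgress’ value) is an assumption; the actual execution specification follows the control flow ontology and may use different names.

import org.apache.jena.query.Query;
import org.apache.jena.query.QueryExecution;
import org.apache.jena.query.QueryExecutionFactory;
import org.apache.jena.query.QueryFactory;
import org.apache.jena.query.ResultSet;

public class InProgressQueryExample {
    public static void main(String[] args) {
        // Hypothetical vocabulary and endpoint; adjust to the actual VD deployment.
        String queryString =
            "PREFIX cfw: <http://cfw#> "
            + "SELECT ?element WHERE { "
            + "  ?assessment cfw:hasElement ?element . "
            + "  ?element cfw:hasStatus \"InProgress\" . "
            + "}";
        Query query = QueryFactory.create(queryString);
        QueryExecution exec =
            QueryExecutionFactory.sparqlService("http://localhost:8080/dyneaas/sparql", query);
        try {
            ResultSet results = exec.execSelect();
            while (results.hasNext()) {
                // Print the URI of each element currently in progress.
                System.out.println(results.next().get("element"));
            }
        } finally {
            exec.close();
        }
    }
}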

Figure 6.16: Example of a SPARQL query applied to the evaluation results in DynEaaS

Although most information is accessible via the DynEaaS platform UIs, by using SPARQL queries evaluators could associate different elements and even try to infer new knowledge.

6.3.3.2 Results from the usage of DynEaaS


From the creation of the domain to the design of the evaluation test and its application to the users, it was possible to observe the stakeholders’ reactions to our evaluation solution. From a general perspective, designers and evaluators pointed to the value of the solution, namely through its ability to assess specific situations that the user may incur or that could otherwise go unnoticed. Simultaneously, they also pointed to the DynEaaS platform as a valuable tool, namely for allowing quick access to ongoing evaluations and for the ability to define the tests using intuitive UIs.
At the start of the whole evaluation process, in addition to the evaluation’s objectives and general considerations, we identified a few aspects that we intended to observe regarding the application of our evaluation solution by both evaluators and designers. We now present these aspects, followed by a description based on their feedback and our observation:

• the node framework’s versatility and ease of installation - although it was not possible to assess whether a common user would be able to install it, due to its ‘out-of-the-box’ nature the node framework was installed without any modifications to its implementation (with the exception of the network configuration).

• the node’s ability to integrate well with other applications - the TeleRehabilitation application designers had to configure their application to connect with both the node’s interface manager and event logger+dispatcher modules. As the modules possessed a simple REST API, the integration was made with little effort.

• the developers’ difficulties in using the node’s functionalities to design and integrate new interfaces/producers - as stated in the previous point, the application was easily integrated. Additionally, when the interfaces and event producers had to be changed from the first iteration to the second, no changes were necessary to the modules that connected the application with the node.

• the designer’s ability to create a domain language - the designer was capable of building a domain language and, later on, extending it. The process, however, was not direct and required help from us, as some ontology concepts were complex for a non-specialist. A future work result of this process is the creation of an IDE that assists designers with the process.

• the virtual domain framework as an extendable software component - regarding the domain language, it was only necessary to extend the Enquiry UI due to the inclusion of new question types. We observed some difficulty from developers in understanding how to manage the more ontology-related aspects of the implementation. At the same time, their feedback pointed to the value of the examples that assisted them. After the initial experience, the alterations from the first VD to the second VD were made with more ease.

• the evaluator’s comprehension of the solution - via presentations and simple examples using DynEaaS, evaluators understood the concept of the solution. In addition, we sensed from the evaluators some expectation regarding the possibilities of the methodology.

• the evaluator’s ability to successfully use a virtual domain / the evaluator’s ability to
create an evaluation without assistance - through DynEaaS, evaluators were able to
create, deploy and analyze the evaluation test with a low level of assistance (mainly
due to some unknown terminology).

It is fair to point out some difficulties in using the solution without prior knowledge and experience. Ontologies can be complex, and even more so when software solutions have to be implemented on top of them. The existence of the VD and node frameworks, though, proved that it is possible for designers and developers to create new domains that include their own evaluation methods and properties. Regarding the evaluation, the existence of DynEaaS as a GUI for the VD was highly important and was praised by the evaluators. Overall, the ability to create, apply and analyze test results, joined with the stakeholders’ feedback, proved the feasibility and validity of dynamic evaluations.

6.4 Summary
In this chapter we described a proof of concept based on our evaluation solution. We divided the proof of concept into three areas, starting with the implementation of a software solution consisting of two frameworks that could be reused by designers to create and establish dynamic evaluation systems.
To test the solution, an evaluation test was created for a TeleRehabilitation application in an AAL environment. The test introduced the ability to pose questions to the user in specific situations, thus bringing context into the evaluation. Using the created software solution, designers were able to create an evaluation system consisting of one virtual domain and one node. Using the resulting system, they were also able to create and apply the evaluation test as intended.
Besides the TeleRehabilitation evaluation itself, the proof of concept intended to analyze the stakeholders’ ability to implement and use our evaluation solution. Results showed they were able to do so despite initial difficulties. Added to the positive results from the test itself, the proof of concept allowed us to verify the feasibility of the dynamic evaluation solution and to conclude its overall value.

Chapter 7

Conclusions

7.1 Developed Work


Motivated by the limitations of existing evaluation methodologies with respect to reactive environments, this thesis focused on the creation of an alternative approach to evaluation.
The initial part of the work focused on exploratory work, especially regarding evaluation practices. We researched common evaluation methodologies as well as others that make use of contextual data as a source of information, such as ESM solutions. In this field, we observed that most solutions are specific to a given area and do not possess the required depth to support diverse reactive environments. At the same time, to understand the requirements and specifics of a reactive environment such as AAL, we researched architectural implementation solutions, especially those using services as their basis. Finally, in order to understand how the user can be included in the evaluation as a dynamic element and not only as a source of information, we researched user and context model approaches.
The second part of this work focused on the conceptualization of a dynamic evaluation paradigm for reactive environments. We started by identifying the need to include context in the evaluation due to its possible effect on the user. We addressed the need to focus on the user by introducing adaptability in the interaction, thus countering compliance or lack-of-enthusiasm issues. Finally, we pointed to the lack of reusability of current evaluation practices, as well as to the weight that logistics still have in the application of evaluations. As a result, we proposed a dynamic evaluation paradigm focused on context-awareness, supporting multiple interaction modalities, enhancing evaluation definitions with semantic information and automating the distribution and execution of evaluations. To support these aspects, we introduced a conceptual architecture with novel concepts such as nodes and domains, representing the users and the evaluators respectively as self-contained elements of a generic distributed architecture capable of creating and applying evaluation tests to reactive environments.
Having gathered the main requirements and objectives for the creation of an evaluation solution for reactive environments, a substantial part of this work focused on the creation of a dynamic evaluation methodology supported by a flexible ontology model. The methodology had the objective of supporting the concepts of the dynamic evaluation paradigm, especially by enabling the contemplation of context within evaluations, thus providing evaluators with the ability to perform “in situ” assessments triggered by specific situations. Aided by concepts such as EPRs and enquiries, evaluators are able to define situations using complex event composition and to gather data in real time. Through a multiple-level approach, the methodology enables the creation of diverse domain specifications, providing designers with the ability to define new evaluation instruments applied according to their own wishes and objectives. Due to its systematic nature, it ensures reusability and guarantees the ability to bring semantics into the specifications. Finally, it contemplates the diverse nature of nodes by including a common specification for execution.
The fourth part of this work focused on the creation of a software architecture capable of supporting the creation, application and analysis of evaluations based on the dynamic evaluation paradigm. The architecture materializes concepts such as domains and nodes into services, aided by a common infrastructure based on a support unit and an evaluation hub. The architecture operates on a SOA logic and internally adopts the dynamic evaluation methodology and its model by configuring a set of operations based on their specification principles. In short, a domain is seen as an evaluation creation service based on a domain specification. A node is seen as an evaluation resource where evaluations can be executed and performed by its user. A support unit links them and enables the creation of evaluation networks using criteria associated with the domain specifications and the nodes’ characteristics. Through these networks, we provide evaluators with a set of users guaranteed to be compatible with the domain and to whom they can immediately send evaluation tests without any logistical effort. Altogether, these elements enable the creation of mass-scale systems for dynamic evaluations.
The final part of the work proved the feasibility of the dynamic evaluation solution. To do so, we designed a proof of concept based on three main parts: the implementation of a software solution based on the principles of our contribution, the creation of a test using the implemented solution and, finally, its application to users in a concrete evaluation scenario. The software solution consisted of developing two frameworks representing the most important parts of our solution: the node, the domain and the underlying dynamic evaluation methodology. With it, we created an evaluation test for a TeleRehabilitation application in an AAL environment and successfully applied it to a set of users. The subsequent results provided important evaluation data, and stakeholders pointed to the value of the dynamic evaluation proposal.

7.2 Main Results


At the start of this thesis, we pointed to the necessity of creating an alternative approach to evaluation in reactive environments, one that considers the innate dynamicity of those environments and addresses context as an important data source. The result is a dynamic evaluation solution for reactive environments based on three major parts: a paradigm, a methodology and its model, and a support architecture. Altogether, they configure a scalable, flexible and capable proposal for evaluation design and application in reactive environments.
In Chapter 1, we indicated some objectives that we felt were essential to the realization of this thesis. First of all, we pointed to the need to incorporate context when performing evaluations in reactive environments such as AAL. Our second objective concerned the low reusability of evaluation tests and the need to develop new software whenever the evaluation method changes. The third objective pointed to the necessity of facilitating the application of evaluations by simplifying logistics and allowing evaluators to focus on the test rather than on its deployment.
Regarding contextual data, we proposed an evaluation methodology in which contextual aspects can be part of evaluations. We introduced extensible specifications for elements such as Enquiries, Events and EPRs, assisted by concepts such as evaluation assessments, to allow evaluators to define specific evaluation situations and obtain “in situ” data about them. To tackle the low reusability of evaluation tests, our evaluation model was designed using ontologies. Through a systematic approach to specifying evaluations, we allow evaluators to bring concepts, specifications or even whole ontologies from their research areas into the evaluations. With this semantic approach, we open the way for future inter-evaluation inference and the possibility of obtaining additional knowledge.
To facilitate the creation and application of evaluation tests, we introduced a software architecture based on the concepts of domains and nodes. Domains represent evaluation creation areas where it is possible to create new evaluation tests using a set of predefined elements (according to a domain specification) and automatically send them to a set of users. Using criteria, domains are matched with nodes that meet their conditions and incorporate them within evaluation networks that become ready for evaluators to use. On the other hand, nodes encapsulate the user by defining him as an evaluation resource and feature an architecture capable of applying evaluations to the user.
Altogether, the methodology, the model and the architecture confirm the value of the solution and its ability to create dynamic systems and, subsequently, dynamic evaluation tests. To prove the feasibility and overall value of our proposal, we created a proof of concept scenario. Obviously, within the context of a PhD, the proof of concept could not be exhaustive given time and resource constraints. Our choice was to implement the main elements of our proposal and perform a detailed but minimalistic test. Despite this, the developed solution has already permitted interesting evaluations to be performed, and provided data that would otherwise be difficult to obtain.
In comparison with the existing ESM solutions presented in Chapter 2, our implemented solution is capable of offering similar features, such as event composition or applying an evaluation on a phone, as Momento [Carter et al., 2007] and Maestro [Meschtscherjakov et al., 2010] do. On the other hand, DynEaaS provides a set of features they do not, such as the ability to consider multiple event producers or multiple interfaces, enabling evaluators to consider more than the phone1. In [Fischer, 2009], the author made a critical review of some ESM tools and indicated several principles that should be considered when designing an ESM tool (see Section 2.2). Our solution meets these principles, as all seven are either fulfilled or surpassed.
1 Annex B shows a more in-depth analysis of our solution in comparison with the Momento and Maestro ESM tools.
Overall, we consider that our evaluation solution surpassed the objectives of this thesis. Due to its flexibility and range, the proposed solution goes beyond the requirements and establishes itself as a general-purpose evaluation solution.

7.3 Future Work


In the scope of this work, many and diverse directions can be pursued in the future. In the following, we point out a few of the most important and interesting ones.

Implementation of new Evaluation Systems In this PhD we chose to perform a small but detailed evaluation test. In the future, we intend to use the dynamic evaluation solution in new tests with a higher number of users and over longer periods of time. Similarly, we intend to perform tests that include geographically sparse users and that cover areas ranging from AAL to ubiquitous computing.

Handling concurrent evaluations In our evaluation solution, we allow nodes to receive and apply multiple evaluations at the same time. An interesting future work aspect is how to handle these evaluations with regard to their interaction with the user. While this aspect depends on the type of interfaces, interfaces that, for instance, use speech to question the user at the same time can create a race condition with the user at the center. Such an event can directly influence the user and the associated evaluations. In this thesis we did not tackle this aspect, but we took the first steps by including a Priority property in the control flow ontology. Nonetheless, it would be interesting to test other policies, for instance by categorizing interfaces by medium or resources and giving the node’s interface manager the responsibility of handling their execution.

IDE for Domain Language Creation The development of an IDE for domain language creation would be a helpful tool for designers. Using a visual interface with a drag-and-drop method, developers would be able to create the domain language more easily, as the IDE would verify the correctness of the language and guarantee the necessary restrictions of the domain language with regard to the generic domain language.
Although complex, the IDE could include features that automatically generate UIs or a generic code structure based on the Question and Answer extended specifications, thus aiding designers when implementing new virtual domains.

Dynamic Execution Solution Our dynamic evaluation solution goes beyond the scope
of evaluation. The generic nature of the solution (methodology, model and architecture)
makes it possible to envision its usage in additional areas. By expanding the evaluation assessment ontology with new items such as commands, and by using the Enquiry or Event specifications as inputs/outputs of the assessments, our approach can be used as a generic execution solution. Due to the outsourcing mechanism linked to the interface components, it is possible to include not only different users in a cooperative network but other systems as well, in a distributed ecosystem. The applicability of such a solution could range from handling a robotic device in a controlled environment to managing orchestration in a service-oriented architecture.

Introduce Enquiry Decision Points The enquiry specification presented in this PhD does not allow evaluators to create enquiries that evolve according to the answers of the user. We chose to leave this as future work, as it would either be very limited (for instance, by being based on multiple choice questions) or very complex for evaluators to use (as it would require coding to interpret the output of the user). Despite this, we took the initial steps to allow this feature (through the Gateway element of the control flow ontology) and expect to introduce it in the future.

Multi-Evaluation Inference through Domain Languages Because domain languages can be used to establish different evaluation domains, it becomes possible to apply algorithms to try to infer additional knowledge from performed evaluations.

7.4 Epilogue
Now this is not the end. It is not even the beginning of the end. But it is,
perhaps, the end of the beginning.

- Sir Winston Churchill

Acronyms

AAL Ambient Assisted Living. 2, 3

BPEL Business Process Execution Language. 15

CRM Customer Relationship Management. 15

DAML DARPA Agent Markup Language. 24

EJB Enterprise JavaBeans. 121

ERP Enterprise Resource Planning. 15

ESM Experience Sampling Methodology. 3

IM Instant Messaging. 10

JSF Java Server Faces. 121

OIL Ontology Inference Layer. 24

ORM Object-Role Modeling. 18

OWL Web Ontology Language. 24

RDF Resource Description Framework. 18

RDFS Resource Description Framework Schema. 24

SOA Service Oriented Architecture. 4, 14

SWRL Semantic Web Rule Language. 24

UIs User Interfaces. 87

UML Unified Modeling Language. 18

URI Uniform Resource Identifier. 85, 93

WSDL Web Service Definition Language. 15

XML eXtensible Markup Language. 18

Bibliography

[AAL4ALL Consortium, 2015] AAL4ALL Consortium (2012-2015). AAL4ALL - Ambient Assisted Living for All. http://www.aal4all.org/. [Online; accessed 20-September-2015].

[Abowd et al., 1999] Abowd, G. D., Dey, A. K., Brown, P. J., Davies, N., Smith, M., and
Steggles, P. (1999). Towards a better understanding of context and context-awareness. In
Proceedings of the 1st international symposium on Handheld and Ubiquitous Computing,
HUC ’99, pages 304–307, London, UK. Springer-Verlag.

[Afonso et al., 2013] Afonso, A., Lima, J., and Cota, M. P. (2013). Usability Assessment
of Web Interfaces: User Testing. In de Informação, S. e. T., editor, Conferência Ibérica
de Sistemas e Tecnologias de Informacão, volume 1, page 150, Lisboa, Portugal.

[Ailisto et al., 2002] Ailisto, H., Alahuhta, P., Haataja, V., and Kylloenen, V. (2002).
Structuring context aware applications: Five-layer model and example case. Proceedings
of the Workshop on Concepts and Models for Ubiquitous Computing.

[Anicic et al., 2011] Anicic, D., Fodor, P., Rudolph, S., and Stojanovic, N. (2011). Ep-
sparql: A unified language for event processing and stream reasoning. In Proceedings
of the 20th International Conference on World Wide Web, WWW ’11, pages 635–644,
New York, NY, USA. ACM.

[Anicic et al., 2010] Anicic, D., Fodor, P., Rudolph, S., Stühmer, R., Stojanovic, N., and
Studer, R. (2010). A rule-based language for complex event processing and reasoning.
In Hitzler, P. and Lukasiewicz, T., editors, Web Reasoning and Rule Systems, volume
6333 of Lecture Notes in Computer Science, pages 42–57. Springer Berlin Heidelberg.

[Atrey et al., 2010] Atrey, P. K., Hossain, M. A., El Saddik, A., and Kankanhalli, M. S.
(2010). Multimodal fusion for multimedia analysis: a survey. Multimedia systems,
16(6):345–379.

[Atzori et al., 2010] Atzori, L., Iera, A., and Morabito, G. (2010). The internet of things:
A survey. Comput. Netw., 54(15):2787–2805.

[Bakouros, 2000] Bakouros, Y. (2000). Technology Evaluation. Technical report, Univer-


sity of Thessaly.

[Baldauf et al., 2007] Baldauf, M., Dustdar, S., and Rosenberg, F. (2007). A survey on
context-aware systems. Int. J. Ad Hoc Ubiquitous Comput., 2:263–277.

[Barrett, 1999] Barrett, L. (1999). Esp: The experience sampling program. http://www.
experience-sampling.org/. [Online; accessed 20-September-2015].

[Barrett and Barrett, 2001] Barrett, L. F. and Barrett, D. J. (2001). An Introduction to


Computerized Experience Sampling in Psychology. Social Science Computer Review,
19(2):175–185.

[Bernsen and Dybkjær, 2009] Bernsen, N. O. and Dybkjær, L. (2009). Multimodal Usability.
Springer Publishing Company, Incorporated, 1st edition.

[Bevan and Bruval, 2003] Bevan, N. and Bruval, P. (2003). Usability Net:Tools &
Methods. http://www.usabilitynet.org/tools/list.htm. [Online; accessed 20-
September-2015].

[Bieberstein et al., 2005] Bieberstein, N., Bose, S., Fiammante, M., Jones, K., and Shah,
R. (2005). Service-Oriented Architecture Compass: Business Value, Planning, and En-
terprise Roadmap. Prentice Hall PTR, Upper Saddle River, NJ, USA.

[Brabham, 2008] Brabham, D. C. (2008). Crowdsourcing as a model for problem solving:


An introduction and cases. Convergence, 14(1):75.

[Brajnik and Tasso, 1994] Brajnik, G. and Tasso, C. (1994). A shell for developing non-
monotonic user modeling systems. Int. J. Hum.-Comput. Stud., 40:31–62.

[Brandt et al., 2007] Brandt, J., Weiss, N., and Klemmer, S. R. (2007). Txt 4 l8r: Lowering
the burden for diary studies under mobile conditions. In CHI ’07 Extended Abstracts on
Human Factors in Computing Systems, CHI EA ’07, pages 2303–2308, New York, NY,
USA. ACM.

[Brusilovsky, 2004] Brusilovsky, P. (2004). Knowledgetree: A distributed architecture for


adaptive e-learning. In Proceedings of the 13th International World Wide Web Confer-
ence on Alternate Track Papers & Posters, WWW Alt. ’04, pages 104–113, New
York, NY, USA. ACM.

[Carter et al., 2007] Carter, S., Mankoff, J., and Heer, J. (2007). Momento: Support for
situated ubicomp experimentation. In Proceedings of the SIGCHI Conference on Human
Factors in Computing Systems, CHI ’07, pages 125–134, New York, NY, USA. ACM.

[Castillo, 1997] Castillo, J. C. (1997). The User-Reported Critical Incident Method for Re-
mote Usability Evaluation. Master’s thesis, Faculty of the Virginia Polytechnic Institute
and State University.

[Chen, 2004] Chen, H. (2004). An Intelligent Broker Architecture for Pervasive Context-
Aware Systems. PhD thesis, University of Maryland, Baltimore County.

[Chen et al., 2003] Chen, H., Finin, T., and Joshi, A. (2003). An ontology for context-
aware pervasive computing environments. Knowl. Eng. Rev., 18:197–207.

[Consolvo et al., 2007] Consolvo, S., Harrison, B., Smith, I., Chen, M. Y., Everitt, K.,
Froehlich, J., and Landay, J. A. (2007). Conducting in situ evaluations for and with
ubiquitous computing technologies. International Journal of Human-Computer Interac-
tion, 22(1-2):103–118.

[Consolvo and Walker, 2003] Consolvo, S. and Walker, M. (2003). Using the experi-
ence sampling method to evaluate ubicomp applications. IEEE Pervasive Computing,
2(2):24–31.

[Csikszentmihalyi and Larson, 2014] Csikszentmihalyi, M. and Larson, R. (2014). Validity


and reliability of the experience-sampling method. In Flow and the Foundations of
Positive Psychology, pages 35–54. Springer Netherlands.

[Dey et al., 2001] Dey, A. K., Abowd, G. D., and Salber, D. (2001). A conceptual frame-
work and a toolkit for supporting the rapid prototyping of context-aware applications.
Hum.-Comput. Interact., 16(2):97–166.

[Fetter and Gross, 2011] Fetter, M. and Gross, T. (2011). Primiexperience: Experience
sampling via instant messaging. In Proceedings of the ACM 2011 Conference on Com-
puter Supported Cooperative Work, CSCW ’11, pages 629–632, New York, NY, USA.
ACM.

[Fetter et al., 2011] Fetter, M., Schirmer, M., and Gross, T. (2011). Caessa: Visual au-
thoring of context-aware experience sampling studies. In CHI ’11 Extended Abstracts on
Human Factors in Computing Systems, CHI EA ’11, pages 2341–2346, New York, NY,
USA. ACM.

[Fickas et al., 1997] Fickas, S., Kortuem, G., and Segall, Z. (1997). Software organization
for dynamic and adaptable wearable systems. In Proceedings of the First International
Symposium on Wearable Computers (ISWC ’97), pages 13–14.

[Fielding, 2000] Fielding, R. T. (2000). REST: Architectural Styles and the Design of
Network-based Software Architectures. Doctoral dissertation, University of California,
Irvine.

[Finin and Drager, 1986] Finin, T. and Drager, D. (1986). Gums: a general user modeling
system. In Proceedings of the workshop on Strategic computing natural language, HLT
’86, pages 224–230, Stroudsburg, PA, USA. Association for Computational Linguistics.

[Fischer, 2009] Fischer, J. (2009). Experience-sampling tools: a critical review. In Proceed-


ings of the 11th International Conference on Human-Computer Interaction with Mobile
Devices and Services. ACM.

[Foundation, 2015] Foundation, T. A. S. (2015). Apache Jena. https://jena.apache.
org/. [Online; accessed 20-September-2015].

[Froehlich, 2009] Froehlich, J. (2009). myexperience. http://myexperience.


sourceforge.net/index.html. [Online; accessed 20-September-2015].

[Froehlich et al., 2007] Froehlich, J., Chen, M. Y., Consolvo, S., Harrison, B., and Landay,
J. A. (2007). Myexperience: A system for in situ tracing and capturing of user feedback
on mobile phones. In Proceedings of the 5th International Conference on Mobile Systems,
Applications and Services, MobiSys ’07, pages 57–70, New York, NY, USA. ACM.

[Group, 2011] Group, O. M. (2011). Business Process Model and Notation (BPMN) Ver-
sion 2.0. Technical report.

[Group, 2014a] Group, R. W. (2014a). RDF Schema 1.1. http://www.w3.org/TR/


rdf-schema/. [Online; accessed 20-September-2015].

[Group, 2014b] Group, R. W. (2014b). Resource Description Framework (RDF). http:


//www.w3.org/RDF/. [Online; accessed 20-September-2015].

[Hanington and Martin, 2012] Hanington, B. and Martin, B. (2012). Universal Methods
of Design: 100 Ways to Research Complex Problems, Develop Innovative Ideas, and
Design Effective Solutions. Rockport Publishers, Beverly, MA.

[Hektner et al., 2007] Hektner, J. M., Schmidt, J. A., and Csikszentmihalyi, M. (2007).
Experience sampling method: measuring the quality of everyday life. SAGE Publications,
Thousand Oaks, CA, USA.

[Hofer et al., 2003] Hofer, T., Schwinger, W., Pichler, M., Leonhartsberger, G., Altmann,
J., and Retschitzegger, W. (2003). Context-awareness on mobile devices - the hydrogen
approach. In Proceedings of the 36th Annual Hawaii International Conference on System
Sciences (HICSS’03) - Track 9 - Volume 9, HICSS ’03, pages 292.1–, Washington, DC,
USA. IEEE Computer Society.

[Huang et al., 1991] Huang, X., McCalla, G. I., Greer, J. E., and Neufeld, E. (1991).
Revising deductive knowledge and stereotypical knowledge in a student model. User
Modeling and User-Adapted Interaction, 1(1):87–115.

[Huhns and Singh, 2005] Huhns, M. and Singh, M. (2005). Service-oriented computing:
key concepts and principles. Internet Computing, IEEE, 9(1):75–81.

[Indulska et al., 2003] Indulska, J., Robinson, R., Rakotonirainy, A., and Henricksen, K.
(2003). Experiences in using cc/pp in context-aware systems. In Proc. of the Intl.
Conf. on Mobile Data Management (MDM), pages 247–261. Springer.

[Intille et al., 2003] Intille, S. S., Rondoni, J., Kukla, C., Ancona, I., and Bao, L. (2003).
A context-aware experience sampling tool. In CHI ’03 Extended Abstracts on Human
Factors in Computing Systems, CHI EA ’03, pages 972–973, New York, NY, USA. ACM.

[Kay, 1994] Kay, J. (1994). The um toolkit for cooperative user modelling. User Modeling
and User-Adapted Interaction, 4:149–196. 10.1007/BF01100243.

[Kay, 2000] Kay, J. (2000). User modeling for adaptation. In User Interfaces for All:
Concepts, Methods, and Tools (Human Factors and Ergonomics), Human factors and
ergonomics. CRC Press, 1 edition.

[Kay et al., 2002] Kay, J., Kummerfeld, B., and Lauder, P. (2002). Personis: A server
for user models. In Proceedings of the Second International Conference on Adaptive
Hypermedia and Adaptive Web-Based Systems, AH ’02, pages 203–212, London, UK,
UK. Springer-Verlag.

[Kelly et al., 2008] Kelly, D., Harper, D. J., and Landau, B. (2008). Questionnaire mode
effects in interactive information retrieval experiments. Information Processing & Man-
agement, 44(1):122–141.

[Kobsa, 2007] Kobsa, A. (2007). Generic user modeling systems. In The adaptive web:
methods and strategies of web personalization, volume 4321 of Lecture Notes In Computer
Science, pages 136–154. Springer Verlag.

[Kobsa and Pohl, 1994] Kobsa, A. and Pohl, W. (1994). The user modeling shell system
bgp-ms. User Modeling and User-Adapted Interaction, 4:59–106. 10.1007/BF01099428.

[Kono et al., 1994] Kono, Y., Ikeda, M., and Mizoguchi, R. (1994). Themis: a nonmono-
tonic inductive student modeling system. J. Artif. Intell. Educ., 5:371–413.

[Larson and Csikszentmihalyi, 1983] Larson, R. and Csikszentmihalyi, M. (1983). The


experience sampling method. In Reis, H. T., editor, Naturalistic Approaches to Studying
Social Interaction, volume 15 of New Directions for Methodology of Social and Behavioral
Science, pages 41–56. Jossey-Bass, San Francisco, CA, USA.

[LUL Consortium, 2012] LUL Consortium (2010-2012). LUL - Living Usability Lab
for Next Generation Networks. http://www.livinglab.pt/. [Online; accessed 20-
September-2015].

[Luz, 2015] Luz, N. (2015). Ontology-based representation and generation of workflows for
micro-task human-machine computation. PhD thesis, University of Minho, Porto and
Aveiro.

[Luz et al., 2014] Luz, N., Pereira, C., Silva, N., Novais, P., Teixeira, A., and Oliveira e
Silva, M. (2014). An ontology for human-machine computation workflow specification.
In Polycarpou, M., de Carvalho, A. C., Pan, J.-S., Wozniak, M., Quintian, H., and

Corchado, E., editors, Hybrid Artificial Intelligence Systems, volume 8480 of Lecture
Notes in Computer Science, pages 49–60. Springer International Publishing.

[Ma, 2007] Ma, D. (2007). The business model of ”software-as-a-service”. In Services


Computing, 2007. SCC 2007. IEEE International Conference on, pages 701–702.

[Martins et al., 2014] Martins, J., Alves, M., Andrade, S., Pereira, C., Teixeira, A., and
Fale, I. (2014). Central auditory processing evaluation - normative data for Portuguese
pediatric population. In Proceedings of HealthIPLeiria2014 - Segundo Congresso
Internacional de Saúde do IPLeiria.

[Merriam-Webster, 2015] Merriam-Webster, I. (2015). Dictionary and Thesaurus —


Merriam-Webster. http://www.merriam-webster.com/. [Online; accessed 20-
September-2015].

[Meschtscherjakov et al., 2010] Meschtscherjakov, A., Reitberger, W., and Tscheligi, M.


(2010). Maestro: Orchestrating user behavior driven and context triggered experience
sampling. In Proceedings of the 7th International Conference on Methods and Techniques
in Behavioral Research, MB ’10, pages 29:1–29:4, New York, NY, USA. ACM.

[MetricWire, 2015] MetricWire (2015). MetricWire: Mobile Data Collection made easy.
https://metricwire.com/. [Online; accessed 20-September-2015].

[Michelson, 2006] Michelson, B. (2006). Event-driven architecture overview. Patricia Sey-


bold Group, Feb.

[Mitchell, 2007] Mitchell, P. (2007). A step-by-step Guide to Usability Testing. iUniverse,


USA.

[movisens GmbH, 2015] movisens GmbH (2015). movisensXS. https://xs.movisens.


com/. [Online; accessed 20-September-2015].

[Nielsen, 1993] Nielsen, J. (1993). Usability Engineering. Academic Press, Boston.

[Oracle Corporation, 2014] Oracle Corporation (2014). JSR-000220 Enterprise Jav-


aBeans 3.0. https://jcp.org/aboutJava/communityprocess/final/jsr220/index.
html. [Online; accessed 20-September-2015].

[Oracle Corporation, 2015] Oracle Corporation (2015). GlassFish - World’s first Java EE
7 Application Server. https://glassfish.java.net/. [Online; accessed 20-September-
2015].

[Orwant, 1994] Orwant, J. (1994). Heterogeneous learning in the doppelgänger


user modeling system. User Modeling and User-Adapted Interaction, 4:107–130.
10.1007/BF01099429.

[OWL Working Group, 2009] W3C OWL Working Group (2009). OWL 2 Web Ontology
Language: Document Overview. W3C Recommendation, 27 October 2009. Available at
http://www.w3.org/TR/owl2-overview/.

[Papazoglou and van den Heuvel, 2007] Papazoglou, M. and van den Heuvel, W.-J. (2007).
Service oriented architectures: approaches, technologies and research issues. The VLDB
Journal, 16(3):389–415.

[Paschke and Kozlenkov, 2009] Paschke, A. and Kozlenkov, A. (2009). Rule-based event
processing and reaction rules. In Governatori, G., Hall, J., and Paschke, A., editors,
Rule Interchange and Applications, volume 5858 of Lecture Notes in Computer Science,
pages 53–66. Springer Berlin Heidelberg.

[Pereira et al., 2015] Pereira, C., Almeida, N., Martins, A., Silva, S., Rosa, A., Oliveira e
Silva, M., and Teixeira, A. (2015). Evaluation of complex distributed multimodal appli-
cations: Evaluating a telerehabilitation system when it really matters. In Zhou, J. and
Salvendy, G., editors, Human Aspects of IT for the Aged Population. Design for Every-
day Life, volume 9194 of Lecture Notes in Computer Science, pages 146–157. Springer
International Publishing.

[Pereira et al., 2013a] Pereira, C., Teixeira, A., and e Silva, M. O. (2013a). Towards an
integrated view of reported outcomes. In CISTI’2013 (8th Iberian Conference on Infor-
mation Systems and Technologies), Lisbon.

[Pereira et al., 2014] Pereira, C., Teixeira, A., and e Silva, M. O. (2014). Live evaluation
within ambient assisted living scenarios. In Proceedings of the 7th International Confer-
ence on PErvasive Technologies Related to Assistive Environments, PETRA ’14, pages
14:1–14:6, New York, NY, USA. ACM.

[Pereira et al., 2013b] Pereira, C., Teixeira, A., Rocha, N., Oliveira e Silva, M., Ferreira,
F., and Oliveira, A. (2013b). Arquitectura de Desenvolvimento. In Laboratório Vivo de
Usabilidade (Living Usability Lab), pages 123–140. ARC Publishing.

[Pivotal Software, 2015] Pivotal Software, I. (2015). Rabbitmq. https://www.rabbitmq.


com/. [Online; accessed 20-September-2015].

[RedHat, 2013] RedHat, I. (2013). Wildfly homepage - wildfly. http://wildfly.org/.


[Online; accessed 20-September-2015].

[Reis and Gable, 2000] Reis, H. T. and Gable, S. L. (2000). Event-sampling and other
methods for studying everyday experience, chapter 8, pages 190–222. Cambridge Univer-
sity Press, Cambridge, UK.

[Rich, 1979] Rich, E. (1979). User modeling via stereotypes. Cognitive Science, 3(4):329–
354.

[Samulowitz et al., 2001] Samulowitz, M., Michahelles, F., and Linnhoff-Popien, C. (2001).
Capeus: An architecture for context-aware selection and execution of services. In Pro-
ceedings of the IFIP TC6 / WG6.1 Third International Working Conference on New
Developments in Distributed Applications and Interoperable Systems, pages 23–40, De-
venter, The Netherlands, The Netherlands. Kluwer, B.V.
[Schilit et al., 1994] Schilit, B. N., Adams, N., and Want, R. (1994). Context-aware com-
puting applications. In Proceedings of the workshop on mobile computing systems and
applications, pages 85–90. IEEE Computer Society.
[Sheng and Benatallah, 2005] Sheng, Q. Z. and Benatallah, B. (2005). Contextuml: a uml-
based modeling language for model-driven development of context-aware web services.
In The 4th International Conference on Mobile Business, pages 206–212.
[Shneiderman, 1997] Shneiderman, B. (1997). Designing the User Interface: Strategies for
Effective Human-Computer Interaction. Addison-Wesley Longman Publishing Co., Inc.
[Silva, 2004] Silva, N. (2004). Multi-Dimensional Service-Oriented Ontology Mapping. PhD
thesis, Universidade de Trás-os-Montes e Alto Douro.
[Silva et al., 2015] Silva, S., Almeida, N., Pereira, C., Martins, A. I., Rosa, A. F., Oliveira e
Silva, M., and Teixeira, A. (2015). Design and Development of Multimodal Applications:
A Vision on Key Issues and Methods. In Antona, M. and Stephanidis, C., editors, Uni-
versal Access in Human-Computer Interaction. Access to Today’s Technologies, volume
9175 of Lecture Notes in Computer Science, pages 109–120. Springer International Pub-
lishing.
[Stanford Center, 2015] Stanford Center, B. (2015). Protégé. http://protege.stanford.
edu/. [Online; accessed 20-September-2015].
[Strang and Linnhoff-Popien, 2004] Strang, T. and Linnhoff-Popien, C. (2004). A context
modeling survey. In Workshop on Advanced Context Modelling, Reasoning and Manage-
ment, UbiComp 2004 - The Sixth International Conference on Ubiquitous Computing,
Nottingham/England.
[Strang and Popien, 2004] Strang, T. and Popien, C. L. (2004). A context modeling survey.
In Workshop on Advanced Context Modelling, Reasoning and Management, UbiComp
2004 - The Sixth International Conference on Ubiquitous Computing.
[Studer et al., 1998] Studer, R., Benjamins, V. R., and Fensel, D. (1998). Knowledge
engineering: Principles and methods. Data Knowl. Eng., 25(1-2):161–197.
[Teixeira et al., 2012] Teixeira, A., Pereira, C., e Silva, M. O., Almeida, N., Pinto, J. S.,
Teixeira, C., Ferreira, F., and Mota, A. (2012). Health@home scenario: Creating a new
support system for home telerehabilitation. In AAL 2012 - 2nd International Living Us-
ability Lab Workshop on AAL Latest Solutions, Trends and Applications (In conjunction
with BIOSTEC 2012), Vilamoura, Portugal.

[Teixeira et al., 2013] Teixeira, A., Pereira, C., e Silva, M. O., Alvarelhao, J., Silva, A.,
Cerqueira, M., Martins, A. I., Pacheco, O., Almeida, N., Oliveira, C., Costa, R., Neves,
A. J. R., Queiros, A., and Rocha, N. (2013). New telerehabilitation services for the
elderly. In Miranda, I. M. and Cruz-Cunha, M. M., editors, Handbook of Research on ICTs
for Healthcare and Social Services: Developments and Applications. IGI Global.

[Teixeira et al., 2015] Teixeira, A. J. S., Rocha, N., Pereira, C., Pinto, J. S., Dias, M. S.,
Teixeira, C., e Silva, M. O., Queirós, A., Ferreira, F., and Oliveira, A. (2015). The
Living Usability Lab Architecture: Support for the Development and Evaluation of
New AAL Services for the Elderly. In Ambient Assisted Living: From Technology to
Intervention, pages 477–508. CRC Press.

[Teixeira et al., 2011a] Teixeira, A. J. S., Pereira, C., e Silva, M. O., Pacheco, O., Neves,
A. J. R., and Casimiro, J. (2011a). Adapto. adaptive multimodal output. In Benavente-
Peces, C. and Filipe, J., editors, Proceedings of the International Conference on Per-
vasive and Embedded Computing and Communication Systems (PECCS), pages 91–100.
SciTePress.

[Teixeira et al., 2011b] Teixeira, C., Pinto, J. S., Ferreira, F., Oliveira, A., Teixeira, A.,
and Pereira, C. (2011b). Cloud computing enhanced service development architecture for
the living usability lab. In Cruz-Cunha, M., Varajao, J., Powell, P., and Martinho, R.,
editors, ENTERprise Information Systems, volume 221 of Communications in Computer
and Information Science, pages 289–296. Springer Berlin Heidelberg.

[Thies and Vossen, 2008] Thies, G. and Vossen, G. (2008). Web-oriented architectures:
On the impact of web 2.0 on service-oriented architectures. In Asia-Pacific Services
Computing Conference, 2008. APSCC ’08. IEEE, pages 1075–1082.

[Tijerino and Al-muhammed, 2004] Tijerino, Y. A. and Al-muhammed, M. (2004). Toward


a flexible human-agent collaboration framework with mediating domain ontologies for
the semantic web. Meaning Coordination and Negotiation (MCN-04).

[Tomitsch et al., 2010] Tomitsch, M., Singh, N., and Javadian, G. (2010). Using diaries
for evaluating interactive products: The relevance of form and context. In Proceedings
of the 22Nd Conference of the Computer-Human Interaction Special Interest Group of
Australia on Computer-Human Interaction, OZCHI ’10, pages 204–207, New York, NY,
USA. ACM.

[Vergara, 1994] Vergara, H. (1994). Protum - a Prolog based tool for user modeling. Bericht
Nr. 55/94; WIS-Memo 10.

[W3C, 2000] W3C (2000). SOAP Version 1.2. http://www.w3.org/TR/soap/. [Online;


accessed 20-September-2015].

[W3C, 2007] W3C (2007). Composite Capabilities / Preferences Profile (CC/PP). http:
//www.w3.org/Mobile/CCPP/. [Online; accessed 20-September-2015].

[W3C, 2008] W3C (2008). SPARQL Query Language for RDF. http://www.w3.org/TR/
rdf-sparql-query/. [Online; accessed 20-September-2015].

[Want et al., 1992] Want, R., Hopper, A., Falcão, V., and Gibbons, J. (1992). The active
badge location system. ACM Trans. Inf. Syst., 10:91–102.

[Weiser, 1991] Weiser, M. (1991). The computer for the 21st century. Scientific American.

[Wilkinson, 2003] Wilkinson, S. (2003). Focus Groups in Qualitative Psychology - A Prac-


tical Guide to Research Methods. Sage Publications, London.

Appendix A

General Evaluation Language

A.1 Enquiry Ontology


Complete listing of ontology rules regarding the Enquiry specification:

Classes
Class: enq:Answer
– Class used to represent an Answer.
– SubClassOf: cfw:Output

Class: enq:Enquiry
– Class used to represent an Enquiry.

Class: enq:EnquiryGroup
– Class used to classify types of enquiries.

Class: enq:Question
– Class used to represent a Question.
– SubClassOf: cfw:Input

Properties
Property: enq:belongsToEnquiry
– Defines the enquiry to which a question belongs.
– Domain: enq:Question
– Range: enq:Enquiry

Property: enq:hasTransitionTo

– Identifies the next question in a sequence of questions inside an enquiry.
– Domain: enq:Question
– Range: enq:Question
Property: enq:hasEnquiry
– States a sub-enquiry’s owner.
– Domain: enq:Enquiry
– Range: enq:Enquiry
Property: enq:hasEnquiryGroup
– Identifies the enquiry group of an enquiry.
– Domain: enq:Enquiry
– Range: enq:EnquiryGroup
Property: enq:hasFirstElement
– Identifies the first element of an Enquiry.
– Domain: enq:Enquiry
– Range: enq:Question, enq:Enquiry
Property: enq:hasDescription
– Defines a description of an enquiry or an enquiry group.
– Domain: enq:Enquiry, enq:EnquiryGroup
– Range: xsd:string
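
As a minimal illustration of the vocabulary above, the following Turtle fragment describes a two-question enquiry. Only the class and property names come from the listing; the namespace URIs, instance names and descriptions are placeholders invented for the example.

    @prefix enq: <http://example.org/enquiry#> .   # placeholder namespace URIs
    @prefix ex:  <http://example.org/demo#> .
    @prefix xsd: <http://www.w3.org/2001/XMLSchema#> .

    ex:satisfaction  a enq:Enquiry ;
        enq:hasDescription "Post-session satisfaction enquiry"^^xsd:string ;
        enq:hasEnquiryGroup ex:weeklyEnquiries ;
        enq:hasFirstElement ex:q1 .

    ex:weeklyEnquiries  a enq:EnquiryGroup ;
        enq:hasDescription "Enquiries applied at the end of each week"^^xsd:string .

    ex:q1  a enq:Question ;
        enq:belongsToEnquiry ex:satisfaction ;
        enq:hasTransitionTo ex:q2 .

    ex:q2  a enq:Question ;
        enq:belongsToEnquiry ex:satisfaction .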

A.2 Event Ontology


Complete listing of ontology rules regarding the Event specification:

Classes
Class: evt:Event
– Represents an event that may occur in nodes and is extended in the virtual domain.
– SubClassOf: epr:EventProcessingRuleElement
Class: evt:Timestamp
– Represents the instant in time at which the event occurred.
– Is Equivalent To: xsd:dateTime, eval:Timestamp
Class: evt:AtomicEventProducer
– Represents a producer of an evt:Event.

Properties
Property: evt:hasTimestamp
– Indicates the date and time that the event has occurred.
– Domain: evt:Event
– Range: evt:Timestamp
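
A possible instance of this vocabulary is shown below: a single event stamped with the moment at which it occurred, together with the producer that emitted it. Namespace URIs and instance names are placeholders, and the timestamp is written directly as an xsd:dateTime literal, in line with the equivalence stated above. Note that the listing declares no property linking a producer to its events, so none is asserted here.

    @prefix evt: <http://example.org/event#> .   # placeholder namespace URIs
    @prefix ex:  <http://example.org/demo#> .
    @prefix xsd: <http://www.w3.org/2001/XMLSchema#> .

    # An event reported by a producer running on one of the nodes.
    ex:kitchenDoorOpened  a evt:Event ;
        evt:hasTimestamp "2015-09-20T10:30:00"^^xsd:dateTime .

    ex:kitchenDoorSensor  a evt:AtomicEventProducer .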

A.3 Event Processing Rules Ontology


Complete listing of ontology rules regarding the EPR specification:

Classes
Class: epr:EventProcessingRule
– Represents an event processing rule.

Class: epr:EventProcessingRuleElement
– Represents an element which belongs to the EPR.

Class: epr:EventOperation
– Represents an event operation of an EPR.
– SubClassOf: epr:EventProcessingRuleElement

Class: epr:EventOperationNOT
– Represents the event operation applying a ‘Negation’ logic to an event rule
element.
– SubClassOf: epr:EventOperation

Class: epr:EventOperationAND
– Represents the event operation applying an ‘And’ logic to two event rule ele-
ments.
– SubClassOf: epr:EventOperation

Class: epr:EventOperationOR
– Represents the event operation applying an ‘Or’ logic to two event rule elements.
– SubClassOf: epr:EventOperation

Class: epr:EventOperationFunction

– Represents the event operation ‘Function’ which abstracts operations that represent
predicates that have arguments.
– SubClassOf: epr:EventOperation

Class: epr:EventOperationDelay
– Represents the event operation ‘Delay’ which defines an operation that imposes
a waiting period of time on the EPR.
– SubClassOf: epr:EventOperationFunction

Class: epr:EventOperationActiveInterval
– Represents the event operation ‘ActiveInterval’ which defines an operation that
is only active during a period of time after the initialization of the EPR.
– SubClassOf: epr:EventOperationFunction

Class: epr:EventOperationRepetition
– Represents the event operation ‘Repetition’ which defines an operation that
imposes that its associated event rule element must occur a specific number of
times to be considered as fulfilled.
– SubClassOf: epr:EventOperationFunction

Properties
Property: epr:hasRootElement
– Identifies the root element of the EPR.
– Domain: epr:EventProcessingRule
– Range: epr:EventProcessingRuleElement

Property: epr:hasRuleElement
– Indicates the event processing rule element to which an event operation is applied.
– Domain: epr:EventRuleElement
– Range: epr:EventRuleElement

Property: epr:hasRepetitionTimes
– Indicates the number of repetitions for the event operation repetition.
– Domain: epr:EventOperationRepetition
– Range: xsd:integer

Property: epr:hasInterval

– Indicates the interval for the event operation delay.
– Domain: epr:EventOperationDelay, epr:EventOperationActiveInterval, epr:EventOperation
– Range: xsd:double

Property: epr:evaluatesAtEnd
– Indicates whether the event operation ‘ActiveInterval’ evaluates its associated
element during the interval or only when the interval terminates.
– Domain: epr:EventOperationActiveInterval
– Range: xsd:boolean

Property: epr:hasName
– Allows the specification of a name for an EPR.
– Domain: epr:EventProcessingRule
– Range: xsd:string
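
The fragment below sketches a small event processing rule built from the classes and properties above: the rule is fulfilled when its associated event occurs three times within an active interval counted from the initialization of the EPR. Namespace URIs, instance names and literal values are illustrative placeholders; in particular, the time unit of epr:hasInterval is not fixed by the listing, so seconds are assumed here.

    @prefix epr: <http://example.org/epr#> .   # placeholder namespace URIs
    @prefix evt: <http://example.org/event#> .
    @prefix ex:  <http://example.org/demo#> .
    @prefix xsd: <http://www.w3.org/2001/XMLSchema#> .

    ex:missedSessionRule  a epr:EventProcessingRule ;
        epr:hasName "Three missed sessions within one hour"^^xsd:string ;
        epr:hasRootElement ex:withinOneHour .

    # Active for 3600 seconds after initialization; evaluated only at the end.
    ex:withinOneHour  a epr:EventOperationActiveInterval ;
        epr:hasInterval "3600.0"^^xsd:double ;
        epr:evaluatesAtEnd true ;
        epr:hasRuleElement ex:threeMisses .

    # The wrapped element must occur three times to be considered fulfilled.
    ex:threeMisses  a epr:EventOperationRepetition ;
        epr:hasRepetitionTimes 3 ;
        epr:hasRuleElement ex:sessionMissedEvent .

    ex:sessionMissedEvent  a evt:Event .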

A.4 Evaluation Ontology


Complete listing of ontology rules regarding the Evaluation specification:

Classes
Class: eval:User
– Identifies a user (node) via its ID.

Class: eval:Evaluator
– Identifies an evaluator in a virtual domain.

Class: eval:Evaluation
– Identifies an evaluation scenario created in the scope of a virtual domain.

Class: eval:EvaluationInstantiation
– Identifies an instance of an evaluation in a virtual domain.

Class: eval:Timestamp
– Represents an instant in time used in the scope of an evaluation.
– Is Equivalent To: evt:Timestamp, xsd:dateTime

Properties
Property: eval:hasCreated
– Defines who was the creator of an evaluation in a virtual domain.
– Domain: eval:Evaluation
– Range: eval:Evaluator

Property: eval:hasEvaluation
– Defines the evaluation specification to which the instantiation corresponds.
– Domain: eval:EvaluationInstantiation
– Range: eval:Evaluation

Property: eval:hasUser
– Defines the users to whom the evaluation instantiation is or was applied.
– Domain: eval:EvaluationInstantiation
– Range: eval:User

Property: eval:hasStartDate
– Defines the start date and time for an evaluation instantiation.
– Domain: eval:EvaluationInstantiation
– Range: eval:Timestamp

Property: eval:hasEndTime
– Defines the end date and time for an evaluation instantiation.
– Domain: eval:EvaluationInstantiation
– Range: eval:Timestamp
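
Bringing these elements together, the fragment below describes an evaluation created by an evaluator and one instantiation of it, applied to a single user during June 2015. Namespace URIs, instance names and dates are placeholders; eval:hasCreated links the evaluation to its creator, following the domain and range declared above.

    @prefix eval: <http://example.org/evaluation#> .   # placeholder namespace URIs
    @prefix ex:   <http://example.org/demo#> .
    @prefix xsd:  <http://www.w3.org/2001/XMLSchema#> .

    ex:telerehabStudy  a eval:Evaluation ;
        eval:hasCreated ex:evaluatorSilva .

    ex:evaluatorSilva  a eval:Evaluator .

    ex:telerehabStudyRun1  a eval:EvaluationInstantiation ;
        eval:hasEvaluation ex:telerehabStudy ;
        eval:hasUser ex:participant42 ;
        eval:hasStartDate "2015-06-01T09:00:00"^^xsd:dateTime ;
        eval:hasEndTime "2015-06-30T18:00:00"^^xsd:dateTime .

    ex:participant42  a eval:User .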

A.5 Evaluation Assessment Ontology


Complete listing of ontology rules regarding the evaluation assessment specification:

Classes
Class: ast:EvaluationAssessment
– Represents an evaluation assessment which defines a set of evaluation elements
to be applied to the user.

Class: ast:EvaluationAssessmentElement

– Represents an evaluation assessment element which belongs to an assessment.

Class: ast:EvaluationAssessmentInstantiation
– Represents an instantiation of an evaluation assessment when an evaluation is
instantiated to a set of users.

Class: ast:EvaluationItem
– Represents evaluation instruments which can be applied to a user in the scope
of an evaluation assessment (e.g., enquiries and EPRs).

Properties
Property: ast:hasEvaluation
– Defines the evaluation to which the evaluation assessment belongs.
– Domain: ast:EvaluationAssessment
– Range: eval:Evaluation

Property: ast:hasEvaluationAssessment
– Defines the evaluation assessment to which the instantiation is related.
– Domain: ast:EvaluationAssessmentInstantiation
– Range: ast:EvaluationAssessment

Property: ast:hasEvaluationInstantiation
– Defines the evaluation instantiation with which the assessment instantiation is
associated.
– Domain: ast:EvaluationAssessmentInstantiation
– Range: eval:EvaluationInstantiation

Property: ast:hasFirstElement
– Defines the evaluation element which initiates the evaluation assessment.
– Domain: ast:EvaluationAssessment
– Range: ast:EvaluationAssessmentElement

Property: ast:represents
– Indicates the evaluation instrument/item to which the evaluation assessment
element corresponds.
– Domain: ast:EvaluationAssessmentElement
– Range: ast:EvaluationItem

Property: ast:followedBy
– Indicates the next evaluation assessment element within the evaluation assess-
ment.
– Domain: ast:EvaluationAssessmentElement
– Range: ast:EvaluationAssessmentElement
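
As an illustration, the fragment below defines an assessment belonging to the evaluation of the previous example, composed of two chained elements: the first represents an enquiry and the second an EPR. An instantiation then ties the assessment to the evaluation instantiation. As before, namespace URIs and instance names are hypothetical placeholders.

    @prefix ast: <http://example.org/assessment#> .   # placeholder namespace URIs
    @prefix ex:  <http://example.org/demo#> .

    ex:weeklyAssessment  a ast:EvaluationAssessment ;
        ast:hasEvaluation ex:telerehabStudy ;
        ast:hasFirstElement ex:element1 .

    ex:element1  a ast:EvaluationAssessmentElement ;
        ast:represents ex:satisfaction ;        # the enquiry of the A.1 example
        ast:followedBy ex:element2 .

    ex:element2  a ast:EvaluationAssessmentElement ;
        ast:represents ex:missedSessionRule .   # the EPR of the A.3 example

    ex:satisfaction       a ast:EvaluationItem .
    ex:missedSessionRule  a ast:EvaluationItem .

    ex:weeklyAssessmentRun1  a ast:EvaluationAssessmentInstantiation ;
        ast:hasEvaluationAssessment ex:weeklyAssessment ;
        ast:hasEvaluationInstantiation ex:telerehabStudyRun1 .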

A.6 Evaluation Control Flow Ontology


Complete listing of ontology rules regarding the evaluation control flow specification:

Classes
Class: cfw:Job
– Represents the top level of a compflow specification. Indicates the first workflow
of an execution specification.

Class: cfw:Workflow
– Defines a series of ’Activity’ elements in a sequential flow.
– SubClassOf: cfw:Activity

Class: cfw:Activity
– Represents the set of possible classes that can be part of a workflow.

Class: cfw:Executable
– Represents all executable elements that can be found in a workflow.
– SubClassOf: cfw:Activity

Class: cfw:Task
– Represents an executable element with an input and an output, executed via a
TaskInterface.
– SubClassOf: cfw:Executable

Class: cfw:Event
– Represents an executable element with an input, executed via an EventInterface.
– SubClassOf: cfw:Executable

Class: cfw:InstantiationEvent
– Extends the Event as an executable element. When it executes, it renews itself
and remains active.

– SubClassOf: cfw:Event

Class: cfw:Interface
– Executes an executable element.

Class: cfw:EventInterface
– Executes an Event element.
– SubClassOf: cfw:Interface

Class: cfw:TaskInterface
– Executes a Task element.
– SubClassOf: cfw:Interface

Class: cfw:Priority
– Determines the priority of an activity element.

Class: cfw:State
– Determines the execution state of an activity element.

Class: cfw:Gateway
– Indicates a decision point in the workflow that may route the execution one way
or another.
– SubClassOf: cfw:Activity

Class: cfw:Input
– Indicates the input of either a task or an event class.

Class: cfw:Output
– Indicates the output of a task.

Properties
Property: cfw:hasJob
– Indicates which Job corresponds to the execution of an evaluation assessment
instantiation.
– Domain: eval:EvaluationAssessmentInstantiation
– Range: cfw:Job

Property: cfw:hasWorkflow

– Indicates the first workflow of a Job element.
– Domain: cfw:Job
– Range: cfw:Workflow

Property: cfw:hasFirstActivity
– Indicates the first activity within a workflow.
– Domain: cfw:Workflow
– Range: cfw:Activity

Property: cfw:hasCurrentActivity
– Indicates the activity currently being executed within a workflow.
– Domain: cfw:Workflow
– Range: cfw:Activity

Property: cfw:hasActivity
– Indicates the activities that belong to the workflow.
– Domain: cfw:Workflow
– Range: cfw:Activity

Property: cfw:hasPriority
– Indicates the priority that an activity may have within a workflow. The priority
may be used in execution to decide which activity will run first.
– Domain: cfw:Activity
– Range: cfw:Priority

Property: cfw:hasState
– Indicates the current status of an activity when in execution.
– Domain: cfw:Activity
– Range: cfw:State

Property: cfw:executedBy
– Indicates who is responsible for executing an ’Executable’ element.
– Domain: cfw:Executable
– Range: cfw:Interface

Property: cfw:hasInput
– Indicates the input of a Task or Event element.

– Domain: cfw:Task, cfw:Event
– Range: cfw:Input

Property: cfw:hasOutput
– Indicates the output of a Task element.
– Domain: cfw:Task
– Range: cfw:Output

Property: cfw:transitionTo
– Indicates the following element within a workflow.
– Domain: cfw:Activity
– Range: cfw:Activity
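
Finally, the fragment below sketches a small job whose single workflow shows an enquiry task and then waits for the corresponding answer event, each executed through the appropriate interface; it also links the assessment instantiation of the previous example to this job through cfw:hasJob. Namespace URIs and instance names are again placeholders, and inputs, outputs, priorities and states are omitted for brevity.

    @prefix cfw: <http://example.org/cfw#> .   # placeholder namespace URIs
    @prefix ex:  <http://example.org/demo#> .

    ex:weeklyAssessmentRun1  cfw:hasJob ex:assessmentJob .

    ex:assessmentJob  a cfw:Job ;
        cfw:hasWorkflow ex:mainWorkflow .

    ex:mainWorkflow  a cfw:Workflow ;
        cfw:hasFirstActivity ex:showEnquiry ;
        cfw:hasActivity ex:showEnquiry , ex:waitForAnswer .

    ex:showEnquiry  a cfw:Task ;
        cfw:executedBy ex:enquiryTaskInterface ;
        cfw:transitionTo ex:waitForAnswer .

    ex:waitForAnswer  a cfw:Event ;
        cfw:executedBy ex:answerEventInterface .

    ex:enquiryTaskInterface  a cfw:TaskInterface .
    ex:answerEventInterface  a cfw:EventInterface .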

Appendix B

ESM Tool Comparison Table

Aspect                         | Maestro            | Momento                   | DynEaaS
User Selection                 | Handpicked         | Handpicked/Group Creation | Handpicked/Criteria-based
Evaluation Type Support        | Unknown            | ESM/Enquiries/Logging     | ESM/Enquiries/Logging
Evaluation Editing After Start | Yes                | Yes                       | No (only through re-deployment)
Database Type                  | Relational         | Relational                | Relational/Ontology-based
Data Storage                   | Centralized        | Centralized               | Locally/Centralized
Internet Connectivity Required | Yes                | Yes                       | No (depending on node setup)
Node Autonomy                  | No                 | No                        | Yes
EPR Support                    | Unknown            | Yes                       | Yes
Event Creation Producers       | Phone only         | Phone only                | Any computational device
Evaluation Specification       | Configuration File | UIs                       | UIs
Interface Support              | Phone only         | Phone only                | Any computational device
Evaluator Friendly             | No                 | Yes                       | Yes
Communication Type             | Unknown            | SMS                       | Web Services

Table B.1: Comparison between our solution and other ESM tools

