
IEEE TRANSACTIONS ON SYSTEMS, MAN, AND CYBERNETICS: SYSTEMS, VOL. 47, NO. 3, MARCH 2017

A Survey of Self-Organization Mechanisms


in Multiagent Systems
Dayong Ye, Minjie Zhang, and Athanasios V. Vasilakos

Abstract—This paper surveys the literature of the last decades in the field of self-organizing multiagent systems. Self-organization has been extensively studied and applied in multiagent systems and in other fields, e.g., sensor networks and grid systems. Self-organization mechanisms in other fields have been thoroughly surveyed. However, there has not been a survey of self-organization mechanisms developed for use in multiagent systems. In this paper, we provide a survey of the existing literature on self-organization mechanisms in multiagent systems. We also highlight future work on key research issues in multiagent systems. This paper can serve as a guide and a starting point for anyone who will conduct research on self-organization in multiagent systems. It also complements existing survey studies on self-organization in multiagent systems.

Index Terms—Distributed artificial intelligence, multiagent systems, self-organization.

Manuscript received March 7, 2015; revised August 7, 2015; accepted October 17, 2015. Date of publication January 14, 2016; date of current version February 23, 2017. This work was supported in part by two Australian Research Council Discovery Projects through the Australian Research Council, Australia, under Project DP150101775 and Project DP140100974, and in part by the School of Computer Science and Software Engineering, University of Wollongong. This paper was recommended by Associate Editor W. A. Gruver.

D. Ye is with the School of Software and Electrical Engineering, Swinburne University of Technology, Melbourne, VIC 3122, Australia (e-mail: dye@swin.edu.au).

M. Zhang is with the School of Computer Science and Software Engineering, University of Wollongong, Wollongong, NSW 2522, Australia (e-mail: minjie@uow.edu.au).

A. V. Vasilakos is with the Luleå University of Technology, Luleå SE-931 87, Sweden (e-mail: vasilako@ath.forthnet.gr).

Digital Object Identifier 10.1109/TSMC.2015.2504350

2168-2216 © 2016 IEEE. Personal use is permitted, but republication/redistribution requires IEEE permission. See http://www.ieee.org/publications_standards/publications/rights/index.html for more information.

I. INTRODUCTION

A. Multiagent Systems

MOST research in artificial intelligence to date has dealt with developing theories, techniques, and systems to study and understand the behavior and reasoning properties of a single cognitive entity, i.e., an agent [1]. Agent-based system technology has generated much excitement in recent years because of its promise as a new paradigm for conceptualizing, designing, and implementing software systems. The capacity of a single agent is limited by its knowledge, its computing resources, and its perspectives. This bounded rationality [2] is one of the underlying reasons for creating problem-solving organizations, which consist of more than one agent, namely multiagent systems. If a problem domain is quite complex, large, or unpredictable, then the only way it can reasonably be addressed is to develop a number of functionally specific and modular components (agents), each of which is specialized in solving a particular aspect of the problem. This decomposition allows each agent to use the most appropriate paradigm for solving its particular problems. When interdependent problems arise, the agents in the system must coordinate with one another to ensure that the interdependencies are properly managed.

In the multiagent system field, a key problem is the definition of an agent. There is still an ongoing debate, and little consensus, about the definition of an "agent." An increasing number of researchers and industrial practitioners have found that the following definition is widely acceptable.

"An agent is an encapsulated computational system that is situated in some environment and that is capable of flexible, autonomous action in that environment in order to meet its design objectives [3]."

This definition implies that an agent should exhibit pro-active, reactive, and social behavior. Thus, the following key properties of an agent are required [4], [5].

1) Autonomy: Agents are clearly identifiable, problem-solving entities. In addition, agents have well-defined boundaries and interfaces, and they have control both over their internal states and over their own behavior.

2) Reactivity: Agents are situated (or embedded) in a particular environment. They receive inputs related to the states of their environment through sensor interfaces. Agents then respond in a timely fashion and act on the environment through effectors to satisfy their design objectives.

3) Pro-Activeness: Agents do not simply act in response to their environment. They are designed to fulfill specific purposes, namely, they have particular objectives (goals) to achieve. Agents are therefore able to exhibit goal-directed behavior by taking the initiative and opportunistically adopting new goals.

4) Social Ability: Agents are able to cooperate with humans and other agents in order to achieve their design objectives.

To intuitively understand what an agent is, it is worthwhile to consider some examples of agents [3].

1) Any control system can be viewed as an agent. A simple example of such a system is a thermostat. A thermostat has a sensor for detecting room temperature. This sensor is directly embedded within the environment (i.e., the room), and it outputs one of two signals: one indicates that the temperature is too low, and the other indicates that the temperature is okay. The actions available to the thermostat are "heating on" or "heating off."

The action heating on will generally have the effect of raising the room temperature. The decision making component of the thermostat implements (usually in electro-mechanical hardware) the following two rules: a) if the room temperature is too low, the action heating on is taken and b) if the room temperature is okay, the action heating off is taken.

2) Most software daemons (such as background processes in the UNIX operating system), which monitor a software environment and perform actions to modify it, can be viewed as agents. An example is the X Windows program xbiff. This program continually monitors a user's incoming emails and indicates via a GUI icon whether the user has unread messages. Whereas the thermostat agent in the previous example inhabits a physical environment (the physical world), the xbiff program inhabits a software environment. The xbiff program agent obtains information about this environment by carrying out software functions (e.g., by executing system programs), and the actions it performs are software actions (changing an icon on the screen or executing a program). Its decision making component is just as simple as that of the thermostat.

Multiagent systems have been used in many industrial applications. The first multiagent system applications appeared in the mid-1980s [1]. Up to now, multiagent system applications have increasingly covered a variety of domains, ranging from manufacturing to process control [6], air-traffic control, and information management [7].

B. Overview of the Multiagent System Design and Development

A multiagent system is an extension of intelligent agent technology. In a multiagent system, a group of autonomous agents act in an environment to achieve a common goal or their individual goals. These agents may cooperate or compete with each other, and may or may not share knowledge with each other [8], [9]. Since the concept of multiagent systems was introduced, there have been several attempts to create methodologies to design and develop such systems [3], [10]–[16].

Developing a multiagent system is difficult. A multiagent system not only has all the features of traditional distributed and concurrent systems, but also poses unique difficulties due to the autonomy, flexibility, and complex interactions of individual agents. As stated by Sycara [1], there is a lack of a proven methodology for designers to construct multiagent systems for applications. Recently, Tran and Low [15] presented five stages of multiagent system development, which summarize the basic development process of a multiagent system.

1) Stage 1 (Goal Analysis): This stage aims to understand the target problem domain and to specify the functionalities that the target multiagent system should provide. The development should start with capturing system tasks, analyzing the conflicts among these tasks, and decomposing these tasks into small, easily handled subtasks.

2) Stage 2 (Organization Design): In this stage, the organizational structure of the target multiagent system is designed. Also, a set of agent classes which comprise the multiagent system should be defined. The organizational structure can be constructed by defining a role for each agent class and specifying the authority relationships between these roles. The organizational structure refers to the application domain which the multiagent system is developed to support, automate, or monitor.

3) Stage 3 (Agent Internal Activity Design): This stage focuses on the internal design of each agent class. The internal activities of each agent class include, for example, what goals an agent class is designed for, what knowledge this agent class has, and when and how to respond to an internal or external event. An agent goal is a state of the world which an agent class is designed to achieve or satisfy [5]. The knowledge of an agent class is an agent belief, which refers to the information that an agent holds about the world [17]. The responses of an agent to events are agent plans, which can be formed at run-time by planners or reasoners. An agent plan can be carried out based on some basic "if-then" rules which couple the states of the environment with the actions taken by agents.

4) Stage 4 (Agent Interaction Design): This stage defines the interactions between agent instances by designing a suitable interaction protocol or mechanism for the multiagent system. The interaction protocol should specify the communication message format and how communication messages are transmitted, e.g., directly or indirectly. For a direct interaction mechanism, a suitable agent interaction protocol should be defined. This interaction protocol should be able to resolve any conflicts between agents and to ensure that all coordination rules governing the interaction are enforced. For an indirect interaction mechanism, the interaction protocol should be able to resolve conflicts not only between agents but also between agents and the tuple-center. Moreover, the interaction protocol should also model the tuple-center's behavior.

5) Stage 5 (Architecture Design): This stage concerns various implementation issues relating to the agent architecture and the multiagent system architecture, e.g., selecting appropriate sensors to perceive the environment, selecting proper effectors to react to the environment, and selecting a suitable implementation platform for implementing the agents and the multiagent system. The characteristics of the agents' perception, effect, and communication should be specified at design time. The internal constructs of each agent class, e.g., belief conceptualization, agent goals and plans, should be mapped onto the architectural modules during implementation.

In this paper, the survey is delimited to stages 2 and 3, as most existing self-organization mechanisms in multiagent systems are developed in these two stages. The survey of the other stages is left as one of our future studies.
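The "if-then" coupling of environmental states to agent actions mentioned in Stage 3, and embodied by the thermostat example in Section I-A, can be sketched as a small set of condition-action rules. The sketch below is our own illustration; the class and rule names are hypothetical and not taken from the surveyed literature:

```python
# Minimal condition-action ("if-then") agent, in the spirit of the
# thermostat example: each rule couples a perceived state with an action.

class ReactiveAgent:
    def __init__(self, rules):
        # rules: list of (condition, action) pairs; both are callables.
        self.rules = rules

    def act(self, percept):
        # Fire the action of the first rule whose condition matches.
        for condition, action in self.rules:
            if condition(percept):
                return action()
        return None  # no rule applicable

# Thermostat instance: the percept is the room temperature in Celsius.
SETPOINT = 20.0
thermostat = ReactiveAgent([
    (lambda t: t < SETPOINT, lambda: "heating on"),
    (lambda t: t >= SETPOINT, lambda: "heating off"),
])

print(thermostat.act(17.5))  # -> heating on
print(thermostat.act(21.0))  # -> heating off
```

In a full multiagent setting, such rules would typically be generated or revised at run-time by a planner or reasoner, as Stage 3 describes, rather than fixed at design time.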

C. Self-Organization any external control. This property implies continuity of


The term “self-organization” was introduced by Ashby [18] the self-organization process.
in the 1960s, where self-organization meant that some pat- Due to the above properties, self-organization has been
tern was formed by the cooperative behavior of individual introduced into multiagent systems for a long time to solve
entities without any external control or influence in a system. various problems in multiagent systems [29]. Although many
Phenomena of self-organization can be found in natural biol- specific physical systems, such as multirobot systems, sen-
ogy. For example, there is no “leader fish” in a school of fish sor networks, and so on, can be represented by multiagent
but each individual fish has knowledge about its neighbors. systems, multiagent system itself is an independent research
Due to this localized and decentralized operation, the difficult field and the research of multiagent systems is independent of
task of forming and maintaining a scalable and highly adaptive specific physical systems. The research of self-organization
shoal can be achieved [19], [20]. in multiagent systems mainly focuses on theoretical study
The ideas behind self-organization have been widely used while overlooking the requirements or constraints of specific
and studied in many fields, such as multiagent systems [21], physical systems. This is because researchers aim to design
grid computing [22], sensor networks [23]–[25], and other general self-organizing multiagent systems which could be
industrial applications [26], [27]. Self-organization has been applied in various physical systems (with proper modifica-
proved to be an efficient way to deal with the dynamic tion if necessary). To the best of our knowledge, there is no
requirements in distributed systems. Currently, there is still survey of self-organization mechanisms in general multiagent
no commonly accepted exact definition of a self-organizing systems, although surveys of self-organization mechanisms in
system that holds across several scientific disciplines [20]. specific physical systems have been provided, e.g., the survey
In multiagent system field, Serugendo et al. [21] presented of self-organization in cellular networks [30], the survey of
a definition of self-organization. self-organization in ad hoc and sensor networks [31], [32],
Self-organization is defined as a mechanism or a the survey of self-organization for radio technologies [33],
process which enables a system to change its organi- the survey of self-organization in communications [34],
zation without explicit command during its execution and the survey of self-organization in manufacturing con-
time [21]. trol [35]. In this paper, a survey of self-organization mech-
Serugendo et al. [21] further presented the definitions of anisms in general multiagent systems is provided. This survey
strong self-organizing systems and weak self-organizing sys- classifies existing self-organization mechanisms in general
tems by distinguishing between systems where there is no multiagent systems, introduces their historical development,
internal and external explicit control from those where there summarizes and compares them, and points out future research
is an internal centralized control (e.g., a termite society where directions. This survey is claimed as the contribution of
the queen internally controls the behavior of termites in the this paper.
society). The rest of this paper is organized as follows. Section II
Strong self-organizing systems are those systems presents related studies of the introduction or survey of self-
where there is no explicit central control either organization in multiagent systems. Section III provides the
internal or external. classification of self-organization mechanisms. Section IV
Weak self-organizing systems are those systems surveys self-organization mechanisms in multiagent systems.
where, from an internal point of view, reorganiza- Section V presents some applications of self-organizing multi-
tion may be under an internal (central) control or agent systems. Section VI points out future research directions.
planning. Finally, Section VII concludes this paper.
In this paper, we consider only strong self-organizing
systems. Self-organization has the following three proper- II. R ELATED W ORK
ties [21], [28]. Although there is no survey of self-organization mecha-
1) The Absence of Explicit External Control: This property nisms in multiagent systems, some general introduction of
demonstrates that the system is autonomous. Adaptation self-organization in multiagent systems has been given. These
and change of the system are based only on decisions general introduction articles make readers clearly under-
of internal components without following any explicit stand what self-organization is, the benefits of using self-
external command. This property refers to the self-part organization in multiagent systems and the applications of
of the above self-organization definition. self-organizing multiagent systems in real world systems.
2) Decentralized Control: Self-organization process can be Serugendo et al. [21] concluded on a common definition
achieved through local interactions among components of the concepts of self-organization and emergence in mul-
without central control either internal or external. In tiagent systems. They also summarized the properties and
addition, access to global information is also limited by characteristics of self-organization. Additionally, they devel-
the locality of interactions. oped an approach for selecting self-organization mechanisms
3) Dynamic and Evolutionary Operation: A self-organizing using a number of case studies and a set of evaluation criteria.
system is able to evolve. When the environment changes, Serugendo et al.’s [21] work is the fundamental one which
the self-organizing system can evolve to adapt to the defines the concepts of self-organization from a multiagent
new environment and this evolution is independent of system point of view.
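The core idea running through these definitions, namely global order emerging from purely local interactions with no central controller, can be illustrated with a minimal sketch. The example below is our own illustration rather than a mechanism from the surveyed literature: each agent on a ring repeatedly averages its state with its two immediate neighbors only, and a coherent global pattern (consensus) emerges without any external or central control.

```python
# Decentralized averaging on a ring: each agent sees only its two
# neighbors, yet the population converges to a common value.

def step(states):
    n = len(states)
    # Each agent moves toward the average of itself and its neighbors.
    return [(states[(i - 1) % n] + states[i] + states[(i + 1) % n]) / 3.0
            for i in range(n)]

states = [0.0, 10.0, 2.0, 8.0, 5.0, 1.0]
for _ in range(200):          # purely local updates, no coordinator
    states = step(states)

spread = max(states) - min(states)
print(round(spread, 6))  # -> 0.0 (the agents have reached consensus)
```

If the environment later perturbs one agent's state, the same local rule absorbs the disturbance and re-converges, which mirrors the continuity property described above.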

Serugendo et al. [28] further discussed the concepts of self-organization and emergence. They then reviewed different classes of self-organization mechanisms developed for use in various fields and studied the implementation of these mechanisms in multiagent systems. The strengths and limits of these mechanisms were also examined. The self-organization mechanisms reviewed in their paper, however, were not developed for use in multiagent systems, while in this paper, the survey focuses on the self-organization mechanisms developed to address various issues in multiagent systems.

Tianfield and Unland [22] presented an overview of some interdisciplinary fields which have emerged in multiagent systems and grid computing, e.g., semantic grids, autonomic computing, and large-scale open multiagent systems. They demonstrated that it is highly desirable for large-scale complex systems to be self-organizing, and they also reviewed existing studies which implemented self-organization in those systems. However, only a small part of their review is on self-organization in multiagent systems, whereas in this paper, to give readers a clear understanding of the state-of-the-art research on self-organization in multiagent systems, all of the surveyed studies are done in multiagent systems.

Bernon et al. [36] described different mechanisms for generating self-organization in multiagent systems, which included self-organization by reactive multiagent systems, self-organization using cooperative information agents, self-organization by cooperation in adaptive multiagent systems, and self-organization by holons. They then provided several examples of the application of self-organizing multiagent systems to solve complex problems and discussed comparison criteria of self-organization between different applications. However, as with the work done in [22], only a small part of Bernon et al.'s [36] work is on reviewing self-organization mechanisms in multiagent systems.

Picard et al. [37] studied how to make multiagent organizations adapt to dynamics, openness, and large-scale environments. Specifically, they compared two research views in detail, i.e., the agent-centered point of view (ACPV) and the organization-centered point of view (OCPV), and studied how to apply these views to multiagent systems. ACPV studies organization from the point of view of emergent phenomena in complex systems, while OCPV focuses on designing the entire organization and coordination patterns on the one hand, and the agents' local behavior on the other hand. Their work, however, reviewed only the two views, i.e., ACPV and OCPV, and compared them, but did not describe other self-organization mechanisms.

Serugendo et al. [38] generally discussed the existence of self-organization in real-world systems, such as physical systems, biological systems, social systems, business and economic systems, and artificial systems. They then provided the applications of self-organization in software systems, e.g., multiagent systems, grid and peer-to-peer systems, network security, etc. Their work is a general introduction to self-organization and its applications, whereas this paper focuses on surveying self-organization mechanisms in multiagent systems.

Gorodetskii [29] analyzed the state of the art in the field of multiagent self-organizing systems. Their work consists of three parts. The first part introduces the basic concepts of self-organization and multiagent systems. The second part presents the classification of self-organization mechanisms in multiagent systems. The remaining part provides examples of self-organization mechanisms and their applications. All the examples of self-organization mechanisms given by Gorodetskii [29] are biology-based, e.g., swarm intelligence, nest building, Web weaving, etc. Compared to Gorodetskii's [29] work, this paper will survey various self-organization mechanisms developed for use in multiagent systems.

According to the review of related work, it can be seen that current survey work on self-organization mainly focuses on the concepts and the applications of self-organization. Although some studies survey self-organization along with multiagent systems, they still mainly focus on the general introduction of self-organization, multiagent systems, and their applications, with only a small part on reviewing specific self-organization mechanisms. Also, some of these specific self-organization mechanisms were not developed for use in multiagent systems. In this paper, we intend to complement current related survey work by surveying existing self-organization mechanisms which were developed to address specific issues in multiagent systems.

III. CLASSIFICATION FOR SELF-ORGANIZATION

As stated in [29], there is currently no conventional classification of self-organization mechanisms, and different researchers use different features to make a classification. Based on the summarization from [29] and [30], there are generally three classification methods for self-organization mechanisms.

1) Objective-Based Classification: This classification focuses on the question of what the self-organization mechanism is designed for. A self-organization mechanism may be designed for task allocation, relation adaptation, etc. Also, a self-organization mechanism can be designed for multiple purposes, e.g., a self-organization mechanism that aims for load-balancing can optimize the capacity as well as the quality of service.

2) Method-Based Classification: This classification focuses on the question of which method or technique is used to realize a self-organization mechanism. A self-organization mechanism may be designed based on reinforcement learning, where the driving force of the self-organization process is a utility function and agents try to modify their behavior so as to maximize their utilities. A self-organization mechanism may also be designed based on cooperation among agents, where self-organization is achieved through local interactions between agents in a cooperative way.

3) Environment-Based Classification: This classification focuses on the question of which environment the self-organization mechanism is designed in. A self-organization mechanism may be designed in a

multiagent system, a sensor network, or a grid system. Self-organization mechanisms designed in different environments have to take specific requirements and constraints into account. If a self-organization mechanism is designed in a wireless sensor network, due to the battery energy limitation of wireless sensors, interactions between sensors should be as few as possible, whereas such a constraint can be properly relaxed if the mechanism is designed in a general multiagent system.

In this paper, we use the first classification method, objective-based classification, to classify self-organization mechanisms. Because our survey is conducted in multiagent system environments only, the third classification method, environment-based classification, cannot be used in this paper. If we used the second classification method, method-based classification, there would be a large amount of technical content, which would harm the readability of this paper to some extent, especially for beginners. For a survey paper, good readability is the first priority. Thus, the second classification method, method-based classification, is not very suitable for this paper. By using the first classification method, objective-based classification, readers can gain a clear picture regarding not only the current important research issues in multiagent systems, but also the advantages of using self-organization to address these issues compared to methods which do not use self-organization.

IV. SURVEY OF SELF-ORGANIZATION MECHANISMS

As described in Section I, the development of a multiagent system consists of five stages. In the last decade, research on multiagent systems has been thoroughly carried out in each stage. The five-stage development process is a top-down development process which begins with requirement and goal analysis and proceeds to the design of the conceptual architecture and the development of specific agent classes. Such a development process, however, is infeasible for designing self-organizing multiagent systems, because: 1) self-organizing multiagent systems are based on autonomous agents and their local interactions and 2) the global goal or behavior of the agents cannot be specified or predicted in advance [29], [39]. Therefore, the development of self-organizing multiagent systems has to be carried out in a bottom-up way. Unfortunately, there is currently a lack of a mature methodology or tool for developing self-organizing multiagent systems [39]. In this paper, the survey is conducted by reviewing the basic and important research issues in multiagent systems so as to follow the bottom-up design process. According to objective-based classification, and based on our investigation, there are six important research issues in multiagent systems which use self-organization techniques. The six research issues are as follows.1

1) Task/resource allocation.
2) Relation adaptation.
3) Organizational design.
4) Reinforcement learning.
5) Enhancing software quality.
6) Collective decision making.

1 The two research issues of reinforcement learning and collective decision making have been studied in both the multiagent system field and the machine learning field. In this paper, we delimit the discussion to multiagent systems, i.e., multiagent learning and multiagent collective decision making.

Actually, the six research issues overlap to some extent. For example, reinforcement learning is often used as a tool to study other issues, e.g., task/resource allocation and relation adaptation. Also, task/resource allocation is often used as a platform for the study of other research issues, e.g., relation adaptation. Therefore, these research issues are not isolated but are closely related to each other. In this paper, the six research issues are selected for review because: 1) self-organization techniques have been introduced into them and 2) they are the basic and important research issues in multiagent systems. These two reasons make their review match the topic of this paper. There are some other important research issues in multiagent systems, e.g., negotiation, coordination, planning, and reasoning. However, because introducing self-organization into these research issues has received little or no attention, they are not reviewed in this paper. A discussion of how to introduce self-organization into them will be given in the future research directions section (Section VI). Moreover, in order to demonstrate the historical development of these self-organization mechanisms, we will also review a few representative non-self-organization mechanisms, because the development of each self-organization mechanism is usually based on previous non-self-organization mechanisms. Researchers studied and summarized the limitations of non-self-organization mechanisms and then proposed self-organization mechanisms to overcome these limitations.

A. Task/Resource Allocation

Task allocation and resource allocation are very important research issues in multiagent systems, as many real-world problems can be modeled as task/resource allocation in multiagent systems. Task allocation can be briefly described as follows: an agent has a task (or tasks) that it cannot finish by itself, so the agent has to allocate the task (or tasks) to other agents to carry out. How to efficiently and economically allocate tasks to other agents is then the problem that task allocation mechanisms have to deal with. Resource allocation has a similar meaning to task allocation, where resource allocation focuses on how to efficiently allocate resources to agents so as to help them achieve their goals. In the following, we first review self-organizing task allocation mechanisms and then self-organizing resource allocation mechanisms.

Task allocation in multiagent systems has been thoroughly studied and has a wide range of applications, e.g., target tracking in sensor networks [40] and labor division in robot systems [41]. Task allocation mechanisms in multiagent systems can be classified into two categories: 1) centralized and 2) decentralized. Centralized task allocation mechanisms (see [42], [43]) have a single point of failure and do not consider the change of tasks and agents. To overcome these drawbacks, decentralized task allocation mechanisms were developed (see [44]–[48]). These decentralized mechanisms can avoid the single point of failure, but they still have some

limitations. Scerri et al.'s [44] approach needs a large amount of communication to remove conflicts, so it does not work well in large-scale multiagent systems. Abdallah and Lesser [45] studied task allocation on the basis of game theory. Abdallah and Lesser's study considered only two agents, and no discussion was given about how to extend their study to handle three or more agents. De Weerdt et al. [46] used the contract-net protocol [49], an auction-based approach, for task allocation in multiagent systems. In de Weerdt et al.'s [46] method, an agent allocates a task only to neighbors. Then, if an agent has few neighbors, its tasks may be difficult to allocate. Chapman et al.'s [47] approach is based on a distributed stochastic algorithm which is fast and needs few communication messages, but it may get stuck in local minima. Wang et al.'s [48] mechanism is based on an ant colony algorithm which requires a global pheromone matrix to achieve optimal solutions.

Self-organizing task allocation mechanisms were also developed in multiagent systems [50], [51]. The self-organizing mechanisms are decentralized as well. Compared to centralized task allocation mechanisms, self-organizing mechanisms can avoid the single point of failure. Compared to nonself-organizing decentralized task allocation mechanisms, self-organizing mechanisms have good scalability and enable each agent to self-adapt its behavior, without global information, for efficient task allocation in open and dynamic systems, where the set of tasks and agents may constantly change over time.

Macarthur et al. [50] proposed a distributed anytime algorithm for task allocation in open and dynamic multiagent systems. Their algorithm is based on the fast-max-sum algorithm [52]. Macarthur et al. [50] improved the fast-max-sum algorithm by presenting a pruning algorithm to reduce the number of potential solutions that need to be considered and by involving branch-and-bound search trees to reduce the execution time of fast-max-sum. Macarthur et al.'s [50] algorithm is an online and anytime algorithm and it can self-adapt in dynamic environments. Thus, their algorithm has the self-organization property, i.e., dynamically adapting itself without explicit control.

Dos Santos and Bazzan [51] proposed a swarm-intelligence-based clustering algorithm for task allocation in dynamic multiagent systems. Their algorithm is inspired by the behavior of forager bees, where a bee is considered an agent. During the clustering process, agents need to make several decisions: whether to abandon an agent, whether to change to the group of the visited agent, whether to continue dancing to recruit other agents for a group, and whether to visit a dancer. The authors set a number of thresholds for agents to make these decisions. In their algorithm, each agent can autonomously and dynamically make decisions based on current situations. Thus, their algorithm also has the self-organization property.

Macarthur et al.'s [50] algorithm is based on the fast-max-sum algorithm. Their algorithm is an anytime algorithm, so it can return a valid solution to a problem even if it is interrupted at any time before it ends. Also, the longer the algorithm keeps running, the better the solutions it is expected to find. Dos Santos and Bazzan's [51] algorithm is based on a bee colony, which has a well-balanced exploration and exploitation ability.

Like task allocation, resource allocation in multiagent systems has also been thoroughly studied and is relevant to a range of applications, e.g., network routing [53], manufacturing scheduling [54], and cloud computing [55], [56]. Resource allocation mechanisms can be either centralized or decentralized [57]. In centralized mechanisms, there is a single entity that decides on the allocation of resources among agents based on the constraints and preferences of each agent in the system. Typical examples of the centralized mechanism are combinatorial auctions [58], where the auctioneer is the central entity. In combinatorial auctions [59], [60], agents report their constraints and preferences to the auctioneer, and the auctioneer makes the allocation of resources to the agents. The act of reporting constraints and preferences is called "bidding." An agent's bidding may be private or public to other agents based on the requirements of the system. The bidding process may be conducted in one round or multiple rounds. Based on the bids, the auctioneer makes a decision on which resource is allocated to which agent. Typical decentralized mechanisms usually operate through local interaction, such as the contract-net protocol [49], which consists of four interaction phases: 1) announcement phase; 2) bidding phase; 3) assignment phase; and 4) confirmation phase. Many extensions to this protocol have been proposed. Sandholm [61] developed the TRACONET system, which uses a variant of the contract-net protocol to enable negotiation over the exchange of bundles of resources. Sandholm and Lesser [62] also extended the contract-net protocol by enabling decommitment from agreed contracts during the negotiation process, with penalties applied, which gave agents more opportunities to find desirable partners. Aknine et al. [63] studied a concurrent contract-net protocol which allowed many managers to negotiate simultaneously with many contractors. They added to the contract-net protocol a prebidding phase and a preassignment phase, where agents proposed temporary bids and managers temporarily accepted or rejected these bids. In addition to negotiation, reinforcement learning is also an efficient approach for resource allocation. Schaerf et al. [64] proposed a resource allocation method based on reinforcement learning. In their method, when jobs arrive at agents, each agent independently decides which resources are used to execute each job via reinforcement learning, without interaction with other agents. Resources are dedicated to specific agents, who do not make decisions during resource allocation. Only those agents who have jobs to execute make decisions. Tesauro [65] developed a model similar to Schaerf et al.'s [64] work. There is a resource arbiter in Tesauro's [65] model to dynamically decide resource allocation based on agents' value functions, which are learned independently. Zhang et al. [66] developed a multiagent learning algorithm for online resource allocation in a network of clusters. In their algorithm, learning is distributed to each cluster, using local information only, without access to the global system reward. The common limitation of these nonself-organizing resource allocation mechanisms is that it is difficult for them to handle resource allocation in open and dynamic multiagent systems. Therefore, resource allocation
mechanisms, which have self-organization properties, are also proposed.

Fatima and Wooldridge [67] presented an adaptive organizational policy, TRACE, for multiagent systems. TRACE enables multiagent organizations to dynamically and adaptively allocate tasks and resources between themselves to efficiently process an incoming stream of task requests. TRACE consists of two components: 1) a task allocation protocol and 2) a resource allocation protocol. The task allocation protocol, based on the contract-net protocol [49], allows agents to cooperatively and efficiently allocate tasks to other agents which have the suitable capability and opportunity to carry these tasks out. The resource allocation protocol, based on computational market systems, enables resources to be adaptively and dynamically allocated to organizations to minimize the number of lost requests caused by an overload.

Schlegel and Kowalczyk [68] devised a distributed algorithm to solve the resource allocation problem in distributed multiagent systems based on self-organization of the resource consumers. In their algorithm, each resource consumer has several predictors to predict the resource consumption of each server, and uses this predictive result to allocate tasks to servers. Then, based on the servers' feedback, each resource consumer evaluates the performance of its predictors and adjusts its predictors against each server.

An et al. [69] proposed an efficient negotiation method for resource allocation. In their method, negotiation agents can dynamically and autonomously adjust the number of tentative agreements for each resource and the amount of concession they are willing to make based on the situation of the agents' vicinity. In addition, their method allows agents to decommit from agreements by paying penalties and to dynamically modify the reserve price of each resource. Thus, agents have very high autonomy in their method.

Pitt et al. [70] complemented current principles of a resource allocation method by introducing the canons of distributive justice. The canons of distributive justice are represented as legitimate claims, which are implemented as voting functions that determine the order in which resource requests are satisfied. They then presented a formal model of a self-organizing institution, where agents voted on the weight attached to the scoring functions. As a result, they unified principles of enduring self-organizing institutions with canons of distributive justice to provide a basis for designing mechanisms to address the resource allocation problem in open systems.

Kash et al. [71] developed a dynamic model to fairly divide resources between agents. They proposed desirable axiomatic properties for dynamic resource allocation mechanisms. They also designed two novel mechanisms which satisfied some of these properties. Their work is the first which expands the scope of fair division theory from static settings to dynamic settings and which introduces self-adaptation into fair division theory.

The work done in [68] and [69] aims to efficiently distribute resources to agents that make requests. Schlegel and Kowalczyk's [68] method is based on multiagent learning, whereas An et al.'s [69] method is based on multiagent negotiation. Negotiation techniques usually incur more communication overhead than learning techniques and usually require more time to obtain a solution. However, negotiation techniques are more flexible and give agents more autonomy than learning techniques. The work done in [67] takes agents as resources and studies how to allocate and reallocate agents to organizations in accordance with the organizations' demands. Thus, the aim of [67] is different from that of [68] and [69]. The work done in [70] and [71] studies resource allocation using game-theoretical approaches. The aim of [70] and [71] is to find an equilibrium such that no agent has an incentive to deviate from the allocation results. Thus, although all of these studies are about resource allocation, they focus on different aims and are suitable in different environments.

Summary: In self-organized task/resource allocation, there is no complex coordination mechanism among agents. Instead, the allocation process is a self-organized process that originates from local decisions made by each individual agent. Compared to nonself-organization allocation mechanisms (see [46], [48], [58], [63]), self-organization mechanisms [50], [51], [69], [70] are able to properly handle open and dynamic environments, where agents and tasks may be added or removed dynamically. In addition, self-organized allocation is robust to failures in communication and has good scalability in the number of agents. In particular, compared to multiagent reinforcement learning allocation methods (see [66]), self-organization mechanisms do not need a time-consuming convergence period. Table I summarizes the characteristics of the aforementioned task/resource allocation approaches. In Table I, it can be seen that Macarthur et al.'s [50] self-organizing task allocation approach is based on the fast-max-sum algorithm, while dos Santos and Bazzan's [51] approach is based on swarm intelligence. Both approaches are decentralized and have good scalability. The fast-max-sum algorithm can exploit a particular formulation of task allocation environments to greatly reduce the communication message size and computation required when applying max-sum in dynamic environments. In [51], swarm intelligence is used to form agent groups for task allocation, given that an individual agent does not have enough resources to complete a task. In the swarm-intelligence-based approach, agents use only local information and follow simple rules to derive intelligent global behavior. Thus, such an approach is very suitable in environments where each individual agent has only incomplete information about the environment. For self-organizing resource allocation approaches, auction-based approaches [67] and negotiation-based approaches [69] can achieve optimal results, because results are obtained through the bargaining of both parties, unlike other approaches [68], [70], [71] that derive results by using only a specific algorithm or a specific set of algorithms. However, during the bargaining process, heavy communication overhead cannot be avoided. Thus, such auction- and negotiation-based approaches are not suitable in environments where communication resources are limited, e.g., wireless sensor networks.
TABLE I
CHARACTERISTICS OF THE TASK/RESOURCE ALLOCATION APPROACHES
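As Table I indicates, the swarm-based approach of [51] has agents follow simple threshold rules over local information only. A minimal sketch of that style of local decision making follows; the response-threshold formula is a standard swarm-intelligence rule, and the particular quality values and threshold are invented here, not taken from [51].

```python
import random

# Sketch of threshold-based local decisions: an agent joins the group of a
# "dancing" (recruiting) neighbor with a probability that rises with the
# observed group quality. Quality values and the threshold are invented.

def decide_to_join(group_quality, threshold):
    """Standard response-threshold rule: q^2 / (q^2 + theta^2)."""
    p = group_quality**2 / (group_quality**2 + threshold**2)
    return random.random() < p

random.seed(0)
joins = sum(decide_to_join(0.9, 0.3) for _ in range(1000))  # good group
stays = sum(decide_to_join(0.1, 0.3) for _ in range(1000))  # poor group
print(joins > stays)  # high-quality groups recruit far more often
```

Each agent evaluates only the neighbor it is currently visiting, so no global information is needed; the intelligent global behavior (well-formed groups) emerges from many such local decisions.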
B. Relation Adaptation

The term relation adaptation has different meanings in different fields. In Web-based systems, relation adaptation means extracting new types of relations that exist between entities in a system [78]. Here, in multiagent systems, relation adaptation, also known as relation modification, is a subfield of self-organization which studies how to modify relations between agents to achieve an efficient agent network structure. A relation adaptation mechanism enables agents to arrange and rearrange the structure of a multiagent system in order to adapt to changing requirements and environmental conditions [76]. As relation adaptation is a subfield of self-organization rather than a field which is independent of self-organization, there are no "nonself-organizing" relation adaptation mechanisms. Therefore, here, we directly review the work done on relation adaptation in multiagent systems.

Gaston and desJardins [72] developed two network structural adaptation strategies for dynamic team formation. Their first strategy was a structure-based approach, where an agent prefers to form a connection with another agent which has more neighbors. Their second strategy was a performance-based approach, where an agent prefers to form a connection with the agent that has better performance. The two strategies are suitable in different situations.

Glinton et al. [73] analyzed the drawback of the structure-based strategy proposed in [72] empirically, and then designed a new network adaptation strategy to limit the maximum number of links that an agent could have.

Abdallah and Lesser [74] did further research into relation adaptation of agent networks and creatively used reinforcement learning to adapt the network structure. Their method enables agents not only to adapt the underlying network structure during the learning process but also to use information from learning to guide the adaptation process.

Griffiths and Luck [75] presented a tag-based mechanism for supporting cooperation in the presence of cheaters by enabling individual agents to change their neighborhoods with other agents. Griffiths and Luck's [75] mechanism is very suitable in particular dynamic environments where trust or reputation among agents is difficult to establish.

Kota et al. [76] devised a relation adaptation mechanism. Their work is the first which takes multiple relations and relation management cost into account. The relation adaptation algorithm in their mechanism is based on meta-reasoning and enables agents to take the actions which can maximize their utilities at each step.

Ye et al. [77] proposed a composite relation adaptation mechanism. Their mechanism consists of three elements. The first is a trust model that enables agents to use not only their own experience but also other agents' opinions to select candidates, which allows agents to select the most valuable candidates for adapting relations. The second is a multiagent Q-learning algorithm that enables two agents to independently evaluate their rewards for adapting relations and to balance exploitation and exploration. The third is the introduction of weighted relations into the relation adaptation mechanism. The introduction of weighted relations can improve the performance of the mechanism and make the mechanism more suitable in dynamic environments.

Summary: The work done in [72]–[75] assumed that only one type of relation existed in the network and that the number of neighbors possessed by an agent had no effect on its local load. These assumptions are impractical in some cases, where multiple relations exist among agents in a network and agents have to expend resources to manage their relations with other agents. Kota et al.'s [76] work took multiple relations and relation management load into account. All of these studies, however, considered only crisp relations between agents and oversimplified candidate selection, while Ye et al.'s [77] work considered weighted relations, where there is a relation strength, ranging over [0, 1], to indicate how strong the relation is between two agents, and employed a trust model to select candidates for adapting relations. However, as Ye et al.'s [77] work is based on both trust modeling and reinforcement learning, the computation overhead is large. Moreover, during the trust-building process, agents have to communicate with one another, so the communication overhead is also heavy. Table II summarizes the characteristics of the aforementioned relation adaptation approaches.

C. Organizational Design

The research of organizational self-design can be traced back to 1977. Weick [79] discussed the application of the concept of self-designing systems in social organizations. At that time, the concept of self-design was so new that concrete illustrations of this new concept in business organizations were rare. However, the benefits of self-design were already revealed then. In the face of swift changes in the environment, organizations
TABLE II
CHARACTERISTICS OF THE RELATION ADAPTATION APPROACHES
would do too little, too late, and would even fail. Also, organizations have to avoid having someone from the outside come in to rewire the organizations; rather, organizations have to do the rewiring themselves. Therefore, self-design becomes the only choice for organizations.

Organizational design generally refers to how members of a society act and relate with one another [80]. It can be used to design and manage participants' interactions in multiagent systems. Specifically, organizational design includes assigning agents different roles, responsibilities, and peers, and also assigning the coordination between the roles and the number of resources to the individual agents. Different designs applied to the same problem will have different performance characteristics. Thus, it is important to understand the features of different designs. Organizational self-design has been introduced, which allows agents to self-design, i.e., to self-assign roles, responsibilities, and peers between agents. Like relation adaptation, organizational self-design is also a subfield of self-organization in multiagent systems, so there are no nonself-organizing organizational self-design mechanisms. Here, we directly review the work done on organizational self-design in multiagent systems.

Decker et al. [81] developed a multiagent system in which agents can adapt at the organizational, planning, scheduling, and execution levels. Specifically, their work focused on agent cloning for execution-time adaptation toward load-balancing, when an agent recognizes, via self-reflection, that it is becoming overloaded.

Shehory et al. [82] proposed an agent cloning mechanism which subsumed task transfer and agent mobility. To perform cloning, an agent has to reason about its current and future loads and its host's load, as well as the capabilities and loads of other machines and agents. Then, the agent may decide to create a clone, transfer tasks to other agents, or migrate to another host. Their work discusses in detail when and how an agent makes a clone for task allocation in a distributed multiagent environment.

Ishida et al. [83] studied organizational self-design as an adaptive approach to work allocation and load-balancing. Their approach allows two agents to combine into one agent if these two agents are idle and allows one agent to divide into two agents if that agent is overloaded. However, their approach does not consider agent self-extinction. Moreover, their approach is designed only for a specific problem: work allocation and load-balancing in distributed production systems.

Kamboj and Decker [84] extended Ishida et al.'s [83] work by including worth-oriented domains, modeling other resources in addition to processor resources, and incorporating robustness into the organizational structures. Later, Kamboj [85] analyzed the tradeoffs between cloning and spawning in the context of organizational self-design and found that combining both cloning and spawning could generate more suitable organizations than mechanisms which use only a single approach.

Ye et al. [86] provided an organizational self-design mechanism which enabled agents to clone and spawn new agents. These cloned and spawned agents can merge in the future if necessary. For an individual agent, spawning is triggered when it cannot finish the assigned tasks on time. If one or several tasks in its list cannot be completed before the expiry time, an agent will spawn one or several apprentice agent(s), each of which has a corresponding resource to complete a task. Cloning happens when an agent has too many neighbors, which means that the agent has a heavy overhead for managing relations with other agents. Spawned agents self-extinguish if no more tasks have to be carried out, and cloned agents merge with the original agents if the number of neighbors decreases.

Summary: The work done in [81] and [83] focused on specific systems, i.e., a financial portfolio management system and a distributed production system, respectively, so they may not be suitable for other systems. Shehory et al.'s [82] work overlooks agent merging and self-extinction, and this oversight may yield a large number of redundant and idle agents. Kamboj and Decker's [84] work and Kamboj's [85] work are built on a particular computational framework, task analysis, environment modeling and simulation [87], where tasks are represented using extended hierarchical task structures. This binding may limit the usability of their approaches in other
TABLE III
CHARACTERISTICS OF THE ORGANIZATIONAL DESIGN APPROACHES
domains. Ye et al.’s [86] work does not focus on specific sys- outstanding survey papers of multiagent reinforcement learn-
tems and is not under an existing computational framework. ing (see [93], [95]). As the aim of this paper is to survey self-
In addition, Ye et al.’s [86] work takes agent merging and organization instead of learning, we just review some latest and
self-extinction into consideration. Thus, it can overcome the representative multiagent learning algorithms in this section.
limitation of the aforementioned work to some extend. The Zhang et al. [91] proposed a gradient-based learning algorithm
application of such cloning mechanisms, however, is limited that augmented a basic gradient ascent algorithm with policy
in physical systems, e.g., robot systems and sensor networks, prediction. Their algorithm removes some strong assumptions
as the components in these physical systems are hardware and from existing algorithms (see [98]–[100]) by enabling agents
cannot be cloned. Table III summarizes the characteristics of to predict others’ policies. Thus, their algorithm has better
the aforementioned organizational design approaches. scalability and is more suitable to real applications compared
to existing algorithms. Later, Zhang et al. [33] developed
a learning approach that generalized previous coordinated
D. Reinforcement Learning distributed constraint optimization based multiagent reinforce-
Reinforcement learning is the problem faced by an agent ment learning approaches (see [101]–[103]), which needed
that must learn behavior through trial-and-error interactions intensive computation and significant communication among
with a dynamic environment [92], [93]. At each step, the agents. In comparison, Zhang et al.’s [33] approach enables
agent perceives the state of the environment and takes an multiagent reinforcement learning to be conducted over a spec-
action which causes the environment to transit into a new trum from independent learning without communication by
state. The agent then receives a reward signal that evalu- enabling agents to compute their beneficial coordination set
ates the quality of this transition. As stated in [93], there are in different situations. Elidrisi et al. [104] proposed a fast
two main strategies for solving reinforcement-learning prob- adaptive learning algorithm for repeated stochastic games.
lems. The first strategy is to search in the space of behaviors Compared to some existing algorithms, which require a large
in order to find one that performs well in the environment. number of interactions among agents, their algorithm requires
This approach has been taken by work in genetic algorithms only a limited number of interactions. In Elidrisi et al.’s [104]
and genetic programming [94]. The second strategy is to use algorithm, they developed: 1) a meta-game model to abstract
statistical techniques and dynamic programming methods to a stochastic game into a lossy matrix game representation;
estimate the utility of taking actions in states of the world. 2) a prediction model to predict the opponents’ next action;
The research of reinforcement learning in multiagent systems and 3) a reasoning model to reason about the next action to
mainly focuses on the second strategy, because the second play given the abstracted game and the predicted opponents
strategy takes advantage of the special structure of reinforce- actions. Piliouras et al. [97] proposed an analytical framework
ment learning problems that is not available in optimization for multiagent learning. Unlike standard learning approaches,
problems in general [93]. Reinforcement learning in multia- Piliouras et al.’s [97] work did not focus on the convergence of
gent systems has three new challenges [95]. First, it is difficult an algorithm to an equilibrium or on payoff guarantees for the
to define a good learning goal for the multiple reinforcement agents. Instead, they focused on developing abstractions that
learning agents. Most of the times, each learning agent must can capture the details about the possible agents’ behaviors
keep track of the other learning agents. Only when the agent of multiagent systems, in which there are rich spatio-temporal
is able to coordinate its behavior with other agents’ behav- correlations amongst agents’ behaviors.
ior, a coherent joint behavior can be achieved. Second, as The introduction of self-organization into reinforcement
the learning process of agents is nonstationary, the conver- learning is presented recently,2 which aims to improve the
gence properties of most reinforcement learning algorithms learning performance by enabling agents to dynamically
are difficult to obtain. Third, the scalability of algorithms to
realistic problem sizes is a great concern, as most multia- 2 Please do not confuse this “self-organization” with “self-organizing map”

gent reinforcement algorithms focus on only two agents. There which is a type of artificial neural network. Self-organization mentioned in
this paper is a notion which can benefit other fields by enabling agents
are a large number of multiagent learning algorithms devel- to autonomously make decisions and dynamically adapt to environmental
oped in the last decade (see [96], [97]). There are also some changes.
TABLE IV
CHARACTERISTICS OF THE REINFORCEMENT LEARNING APPROACHES
and autonomously adjust their behaviors to suit changing Software systems, thus, require new and innovative approaches
environments. for building, running, and management so as to become more
Kiselev and Alhajj [88], [89] proposed a computation- versatile, flexible, robust, and self-optimizing by adapting
ally efficient market-based self-adaptive multiagent approach to changing operational environments or system characteris-
to continuous online learning of streaming data and pro- tics [111]. Agent-based software engineering has also been
vided a fast dynamic response with event-driven incremental studied for a long time [4], [112], which is concerned with how
improvement of optimization results. Based on the self- to effectively engineer agent systems, that is, how to specify,
adaptive approach, the performance of the continuous online design, implement, verify (including testing and debugging),
learning is improved and the continuous online learning can and maintain agent systems [113]. Strictly speaking, agent-
adapt to environmental variations. The approach is based based software engineering is not a traditional research issue in
on an asynchronous message-passing method of continuous multiagent systems, instead it is an application of agent tech-
agglomerative hierarchical clustering and a knowledge-based nology. However, the study of agent-based software engineer-
self-organizing multiagent system for implementation. ing is significant for developing and implementing multiagent
Zhang et al. [90] integrated organizational control into mul- systems. Thus, we still review the studies of self-organization
tiagent reinforcement learning to improve the learning speed, for agent-based software engineering in this paper.
quality, and likelihood of convergence. Then, they introduced self-organization into organizational control to further enhance the performance and reduce the complexity of multiagent reinforcement learning [91]. Their self-organization approach groups strongly interacting learning agents together, whose exploration strategies are coordinated by one supervisor. The supervisor of a group can buy/sell agents from/to other groups through negotiation with the supervisors of those groups.

Summary: Multiagent reinforcement learning is an efficient and scalable method for solving many real world problems, e.g., network packet routing and peer-to-peer information retrieval. However, due to factors including a nonstationary learning environment, partial observability, a large number of agents, and communication delay between agents, reinforcement learning may converge slowly, converge to inferior equilibria, or even diverge in realistic environments [91]. Self-organization can then be used to organize and coordinate the behaviors of agents based on their current states of learning. Thus, self-organization can not only improve the quality of agent learning but also enable agents to learn efficiently in dynamic environments. Table IV summarizes the characteristics of the aforementioned reinforcement learning approaches. Kiselev and Alhajj [89] focused on a specific problem: continuous online clustering of streaming data, while Zhang et al. [91] focused on a common problem in multiagent learning: convergence. Although the two studies have different focuses, both of them use the multiagent negotiation technique to realize the self-organization process. As both of their approaches are based on multiagent negotiation, both of them suffer from a large communication overhead.

E. Enhancing Software Quality

Current software systems have ultralarge scales due to the explosion of information and the complexity of technologies. To enhance agent-based software quality, several techniques have been proposed, e.g., agile techniques [114] and data mining techniques [115]. Agile techniques can handle unstable requirements throughout the development life cycle and can deliver products in shorter time frames and under budget constraints in comparison with traditional development methods. Data mining techniques can be used to discover and predict faults and errors in software systems. Self-organization can also be used in agent-based software systems to enhance software quality. Compared to those techniques, the self-organization technique enables agents to self-diagnose faults and errors in software systems. Thus, the self-organization technique has good scalability and can be used in large scale agent-based software systems. Self-organizing agent-based software systems, which are able to adjust their behaviors in response to the perception of the environment, have become an important research topic. Cheng et al. [111] and de Lemos et al. [116] provided a research roadmap regarding the state-of-the-art research progress and the research challenges of developing, deploying, and managing self-adaptive software systems. Based on their summary, there are four essential topics of self-adaptive software systems: 1) design space for self-adaptive solutions; 2) software engineering processes for self-adaptive solutions; 3) decentralization of control loops; and 4) practical run-time verification and validation.

Georgiadis et al. [105] studied the feasibility of using architectural constraints as the basis for the specification, design, and implementation of self-organizing architectures for distributed software systems. They developed a fully decentralized runtime system to support structural self-organization based on the Darwin component model [117] and showed that the required architectural styles could be expressed and subsequently analyzed in a simple set-based logical formalism.
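The flavor of such constraint-driven structural self-organization can be illustrated with a minimal Python sketch (all class and method names here are hypothetical and not taken from [105]): each component checks a simple architectural constraint — it must be bound to at least one live peer — and rebinds itself when the constraint is violated.

```python
import random


class Component:
    """A node in a decentralized, self-organizing architecture (sketch)."""

    def __init__(self, name):
        self.name = name
        self.peers = set()   # current structural bindings
        self.alive = True

    def constraint_ok(self):
        # Architectural constraint: at least one live peer binding.
        return any(p.alive for p in self.peers)

    def self_organize(self, registry):
        # On constraint violation: drop dead bindings, rebind to a live peer.
        if not self.constraint_ok():
            self.peers = {p for p in self.peers if p.alive}
            candidates = [c for c in registry if c is not self and c.alive]
            if candidates:
                self.peers.add(random.choice(candidates))


# Usage: three components in a chain; b fails and a repairs its bindings.
a, b, c = Component("a"), Component("b"), Component("c")
a.peers = {b}
b.peers = {a, c}
c.peers = {b}
b.alive = False
a.self_organize([a, b, c])
assert a.constraint_ok() and c in a.peers
```

Each component decides locally, using only its own bindings and a view of reachable candidates, which mirrors the fully decentralized character of the approach; a real system would of course check richer, style-specific constraints.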
TABLE V
Characteristics of the Software Quality Enhancement Approaches
Malek et al. [106] presented a self-adaptive solution for the redeployment of a software system to increase the availability of the system. Their solution is based on a collaborative auctioning algorithm, where the auctioned items are software components. Each host is represented as an autonomous agent, and agents sell and buy software components among themselves through the auctioning algorithm. By redeploying software components, both the availability and robustness of the software system can be increased.

Iftikhar and Weyns [107] proposed a formalized architecture model of a self-adaptive software system and used model checking to verify behavioral properties of the software system. They also proved a number of self-adaptation properties for flexibility and robustness based on a case study, i.e., a decentralized traffic monitoring system. The traffic monitoring software system is conceived as an agent-based system consisting of two components, agent and organization middleware. The agent is responsible for monitoring the traffic and collaborating with other agents to report a possible traffic jam to clients. The organization middleware offers life cycle management services to set up and maintain organizations.

De la Iglesia and Weyns [108], [109] introduced a self-adaptive multiagent system, which is an architectural approach that integrates the functionalities provided by a multiagent system with software qualities offered by a self-adaptive solution. They then presented a reference model for the self-adaptive multiagent system and applied it to a mobile learning case. They also used a formal verification technique as an approach to guarantee the requirements of the self-adaptive multiagent system application. The reference model is a three-layered architecture where the bottom layer provides the communication infrastructure, which defines the means for communication between agents; the middle layer provides the multiagent system, which handles requirements of the domain; and the top layer provides self-adaptation, which can modify the multiagent system layer to cover system quality concerns.

Summary: Self-adaptation is a well-known approach for managing the complexity of modern software systems by separating logic that deals with particular runtime qualities [118], [119]. Self-adaptation enables a software system to adapt itself autonomously to internal dynamics and changing conditions in the environment to achieve particular quality goals. Self-adaptation in software systems includes a number of self-* properties, e.g., self-healing, self-protection, and self-optimization, to address changing operating conditions in the system. For example, self-healing enables a software system to automatically discover, diagnose, and correct faults; self-protection enables a software system to autonomously protect itself from both internal and external malicious attacks; and self-optimization enables a software system to monitor and adapt resource usage to ensure optimal functioning relative to defined requirements. Overall, self-adaptation is a promising approach for modern software systems. Table V summarizes the characteristics of the aforementioned software quality enhancement approaches. The similarity of these studies is that all of them aim to enhance software quality, while the difference is that they focus on different aspects of agent-based software engineering. Georgiadis et al. [105] focused on how to build a self-organizing architecture as a basis for distributed software systems development. Georgiadis et al.'s [105] architecture can work well in environments where components may suddenly fail without the opportunity to interact with the rest of the system. Their architecture, however, cannot handle dynamic environments, where events may dynamically rebind and system requirements may dynamically change. Malek et al. [106] focused on how to increase the availability of a system. Malek et al.'s [106] method is decentralized and does not need global knowledge of system properties. Thus, the method can scale to the exponentially complex nature of the redeployment problem. However, their method is based on an auction algorithm, so the communication overhead of their method is heavy. Iftikhar and Weyns [107] focused on how to check and verify the self-adaptation properties of a self-organizing software system. Iftikhar and Weyns's [107] model can enhance the validation of software system qualities by transferring formalization results over different phases of the software life cycle. However, their model is proposed and evaluated through a single case study, i.e., a traffic monitoring system. Thus, it is unclear how their model works in other systems. De la Iglesia and Weyns [108], [109] focused on how to design a general model to cover various concerns of system quality. By using behavioral models and formal methods, de la Iglesia and Weyns's [108], [109] approach can guarantee the correctness of system behavior and the quality properties of interest during the engineering of self-organizing multiagent systems. However, the implemented system, based
TABLE VI
Characteristics of the Collective Decision Making Approaches
on their approach, has not been evaluated in dynamic environments. Thus, it is unclear if the desired quality goals of the system can be met in undesired states.

F. Collective Decision Making

Collective decision making originates from social science. When a person is in a social context, her decisions are influenced by those of others. Collective decision making, then, is a process where the members of a group decide on a course of action by consensus [120]. Collective decision making has been studied by economists and sociologists since at least the 1970s [121], [122]. Later, it was studied by statistical physicists who developed models to quantitatively describe social and economic phenomena that involve large numbers of interacting people [123], [124]. Recently, collective decision making has also been investigated in multiagent systems [110]. Traditional solutions for collective decision making are based on centralized approaches [125]. Self-organization can provide a valuable alternative to the centralized solutions. However, introducing self-organization into collective decision making is a significant challenge because only local perception and local communication can be used [110]. Globally defined consensus time and decision accuracy are very difficult to predict and guarantee. Toward this end, several self-organized collective decision making algorithms have been proposed [110], [120], [126], [127]. Among these algorithms, only one was developed in multiagent systems [110].

Valentini et al. [110] presented a weighted voter model to implement a self-organized collective decision making process to solve the best-of-n decision problem in multiagent systems. They also provided an ordinary differential equations model and a master equation model to investigate the system behavior in the thermodynamic limit and to investigate finite-size effects due to random fluctuations. The weighted voter model is based on an extension of the classic voter model by: 1) considering the change of agents' neighborhood; 2) allowing agents to participate in the decision process at different rounds for a time proportional to the qualities of their opinions; and 3) allowing agents to temporarily leave the decision pool in order to survey the quality of their current opinion. Valentini et al. [110] used opinion-based approaches. Opinion-based approaches incur more communication overhead than the swarm intelligence technique. However, a consensus is easier and faster to achieve using opinion-based approaches than using the swarm intelligence technique. The advantages of their approach include that: 1) as the system size increases, the decision accuracy also increases; 2) as the system size increases, the consensus time increases only logarithmically; and 3) the approach is robust to noisy assessments of site qualities. However, as the approach is opinion-based, the generation and transmission of opinions in the system are computation and communication intensive.

Summary: The introduction of self-organization into collective decision making makes the decision making process decentralized and enables agents to dynamically make decisions based on environmental changes. However, as described above, collective decision making in self-organized systems is still challenging because it relies only on local perception and local communication. Table VI summarizes the characteristics of the work in [110].

G. Other Research Issues

In addition to the above issues, there are some other issues in multiagent systems which are also addressed using self-organization mechanisms.

1) Coalition Formation: In some real systems, e.g., distributed sensor networks, individual agents often need to form coalitions to accomplish complex tasks, as complex tasks cannot be performed by a single agent, or groups may perform more efficiently with respect to single agents' performance. Most existing coalition formation studies enable each individual agent to join only one coalition (see [128] for a survey of existing coalition formation mechanisms). To overcome this limitation, some researchers proposed overlapping coalition formation [129] and fuzzy coalition formation [130] to enable each agent to join multiple coalitions. Such studies, however, do not allow agents to dynamically adjust their degrees of involvement in different coalitions. Ye et al. [128] introduced self-adaptation into coalition formation by allowing agents to dynamically adjust degrees of involvement in different coalitions and to join new coalitions. Through the introduction of self-adaptation, the performance of the coalition formation mechanism is improved in terms of agents' profit and time consumption. Ye et al.'s [128] approach, however, is based on negotiation, so it suffers from a large communication overhead.

2) Evolution of Cooperation: The evolution of cooperation among selfish individuals is a fundamental issue in a number of disciplines, such as artificial intelligence [131], [132], physics [133], biology [134], sociology [135], and economics [136]. The aim of evolution of cooperation is to increase the proportion of cooperators in a group of agents, each of which is either a cooperator or a defector. Existing strategies for the evolution of cooperation have both strengths and limitations. For example, some strategies can increase the proportion of cooperators only if the initial proportion of cooperators is larger than a specific number, e.g., 0.5; some strategies can increase the proportion of cooperators only if they work in a specific network structure, e.g., a small-world network [137]. Ye and Zhang [138] developed a
self-adaptation based strategy for the evolution of cooperation by embodying existing strategies as each agent's knowledge and letting each agent dynamically select a strategy to update its action, i.e., cooperate or defect, according to different environmental situations. As a result, Ye and Zhang's [138] strategy can utilize the strengths of existing strategies and avoid their limitations. Ye and Zhang's [138] strategy, however, is based on a reinforcement learning algorithm. Thus, its performance highly depends on the performance of the learning algorithm. Such a dependency relationship may limit the applicability of their strategy, as a learning algorithm is suitable only in a limited number of situations.

3) Self-Checking Logical Agents: Certification and assurance of agent systems constitute crucial issues, as agents represent a particularly complex case of dynamic, adaptive, and reactive software systems [139]. Certification aims at producing evidence indicating that deploying a given system in a given context involves the lowest possible level of risk of adverse consequences. Assurance is related to dependability, i.e., to ensuring that system users can rely on the system. Costantini and De Gasperis [139] and Costantini [140], [141] have done significant work on self-checking agent systems. They presented a comprehensive framework for runtime self-monitoring and self-checking assurance of logical agents by means of temporal-logic-based special constraints to be dynamically checked at a certain (customizable) frequency. The constraints are based on a simple interval temporal logic, agent-oriented interval linear temporal logic. Based on Costantini and De Gasperis's [139] and Costantini's [140], [141] framework, agent systems are able to dynamically self-check violations of desired system properties. Moreover, in the case of a violation, agents can quickly restore a desired state by means of run-time self-repair. However, their framework mainly focuses on self-checking and self-repair while overlooking other self-* functionalities of agent systems, e.g., self-healing, self-optimization, etc.

V. Applications of Self-Organizing Multiagent Systems

In addition to theoretical studies, self-organizing multiagent systems can be used in many application domains [36], [39]. In this section, some examples of applications of self-organizing multiagent systems are provided.

Georgé et al. [142], [143] developed a self-organizing multiagent system for flood forecasting. The system consists of several stations installed over the river basin which forecast local variation in the water level. Each station has a two-level multiagent architecture. The lower level includes sensors which detect variations of the water level every hour and provide the data to upper level agents. Each upper level agent then makes its forecast based on the data provided by sensors and the assessment of the quality of its preceding forecasts. The self-organization process is carried out at the level of sensors, where each sensor dynamically modifies the weights of measurements taken at different times. Experiments demonstrated that the proposed self-organizing multiagent system is applicable to the actual evolution of the water level even at the early stage of the system operation, when only a small number of learning samples have been used.

Mamei et al. [144] and Camurri et al. [145] proposed a self-organizing multiagent system for controlling road traffic in a large city. Traffic participants, i.e., cars, are represented by car software agents. Traffic lights are represented by light agents. The aim of the system is to coordinate individual cars and to control traffic lights so as to minimize traffic jams. In the system, car software agents are coordinated using the information obtained from light agents. The basic idea of the self-organization paradigm is that car software agents dynamically select routes for cars to avoid current traffic jams based on the information obtained from light agents. Meanwhile, light agents implement a context-sensitive traffic lights control strategy to minimize traffic jams throughout the city.

In [146] and [147], a self-organizing multiagent system was proposed to control manufacturing resources. The self-organization mechanism is based on a swarm intelligence model which controls the production processes by predicting the resource utilization for a short period of time, evaluating the state of order execution, and finding the best further routing for the orders. The self-organizing multiagent system includes three types of agents: 1) product agents; 2) order agents; and 3) resource agents. The three types of agents indirectly coordinate to find variants of the step-by-step order execution using concrete resources and to generate an optimal product execution plan.

Dury et al. [148] described an application of a self-organizing multiagent system in land utilization. The system is used to optimally assign farming territories to various crops so as to obtain the maximum total profit by selling the crops yielded in the future. In this assignment problem, the resource to be assigned is the set of farming lots with characteristics such as area, soil type, distance to the nearest villages, and transportation infrastructure. The self-organizing multiagent system involves a set of agents which compete for capturing the lots. Agents are self-organized into groups. The agents in the same group want to get hold of the lots for the same crop. Each agent in each group competes for capturing a lot with the desired properties. If an agent wins, it makes a contribution to the utility function of its group.

Sohrabi et al. [149] presented three protocols/algorithms for the self-organization of wireless sensor networks, where each sensor is represented as an agent. The first protocol is the self-organizing medium access control protocol, which handles network start-up and link-layer organization and forms a flat topology for wireless sensor networks. It is a distributed protocol which enables sensor nodes to discover their neighbors and establish transmission/reception schedules for communicating with them without the need for local or global master nodes. The second algorithm is an eavesdrop-and-register algorithm which is used for seamless interconnection and mobility management of mobile nodes in wireless sensor networks. The third protocol consists of three algorithms: 1) the sequential assignment routing algorithm, which is for multihop routing; 2) the single winner election algorithm; and 3) the multiwinner
election algorithm, which handle the necessary signaling and data transfer tasks in local cooperative information processing.

In multirobot systems, self-organization can be used for the division of labor control [150]–[152]. For example, Liu et al. [151] presented a self-adaptation mechanism which could dynamically adjust the ratio of foragers to resters in a swarm of foraging robots in order to maximize the net energy income to the swarm. The self-adaptation mechanism is based only on local sensing and communications. By using this mechanism, robots can use internal information (e.g., successful food retrieval), environmental information (e.g., collisions with team mates while searching for food), and social information (e.g., team mate success in food retrieval) to dynamically vary the time spent on foraging and resting.

VI. Future Research Directions

The technology of self-organizing multiagent systems integrates the properties of self-organization, e.g., decentralization and dynamic and evolutionary operation, and the advantages of multiagent systems, e.g., autonomy and sociability. Self-organizing multiagent systems, therefore, have good scalability, are robust to failures of components, and can adapt to the dynamics of external environments and the changing of internal structures. The majority of current research on self-organizing multiagent systems is theoretical. The study of applications of self-organizing multiagent systems is still at an early stage. Thus, as a whole, the future study of self-organizing multiagent systems should focus more on real world systems by considering specific constraints and requirements. Moreover, since there is currently a lack of a mature methodology or tool for developing self-organizing multiagent systems, future research could also focus on devising an efficient methodology or tool for developing self-organizing multiagent systems.3

3 Although there have been some methodologies for developing self-organizing multiagent systems [36], [143], [153], [154], they are far from mature.

To design and develop self-organizing multiagent systems, various self-organization mechanisms have to be developed to address the basic research issues described in Section IV. These research issues are important not only in multiagent systems but also in specific physical systems, e.g., task allocation in multirobot systems, resource allocation in sensor networks, etc. Although a number of self-organization mechanisms in multiagent systems have been developed in the last decades, there is still room for them to be improved or extended. In this section, the future research directions of self-organization mechanisms for each important research issue in multiagent systems and some specific physical systems will be discussed.

A. Task/Resource Allocation

Task allocation and resource allocation are traditional and important research issues in multiagent systems. Existing self-organizing task/resource allocation approaches in multiagent systems mainly focus on how to efficiently allocate tasks and resources. Most of these approaches, however, overlook how to reallocate tasks and resources if such reallocation can bring more benefits to the focal agents. Thus, the future research on self-organizing task/resource allocation can be extended to self-organizing task/resource allocation and reallocation. Such reallocation could be based on the performance of existing agents and the capability of new agents. In addition, most existing self-organizing approaches do not consider the interdependencies among tasks and resources [155], [156]. Thus, future research can take such interdependencies into account. Also, most existing self-organizing approaches were developed in selfish environments, where every agent aims to maximize its own benefit. However, many real world systems, e.g., sensor networks, are cooperative environments, where agents aim to maximize the overall benefit of a system. Thus, it is also important to develop self-organizing task/resource allocation approaches in cooperative environments.

Self-organizing task allocation in general multiagent systems, however, has not attracted much attention. This is because task allocation is often used as a platform for other research issues, e.g., coalition formation [42] and relation adaptation [76]. Also, task allocation is often studied in conjunction with specific physical systems, e.g., sensor networks [157] and multirobot systems [151]. Thus, in this situation, future research on self-organizing task allocation should concentrate on specific physical systems by taking specific constraints and requirements into account. Such physical systems include sensor networks, multirobot systems, grid computing, manufacturing control, and so on, where self-organization is highly desirable to increase the autonomy, scalability, and robustness of these physical systems. Likewise, self-organizing resource allocation could also concentrate on specific physical systems, e.g., sensor networks [158] and the smart grid [159]. These physical systems may be open and dynamic and are difficult to manage or organize using existing nonself-organization techniques. However, it should be noted that in different physical systems, resources may have different properties. For example, resources may be continuous or discrete: in a smart grid, the resource (i.e., energy) is continuous, while in a fruit market, the resource (i.e., fruit) is discrete. Resources may be reusable or not: in a computer system, the resource (e.g., CPU or memory) is reusable, while in a smart grid, the resource (i.e., energy) is not reusable. All such properties must be taken into consideration when designing self-organizing resource allocation mechanisms in specific physical systems.

B. Relation Adaptation

Relation adaptation is actually researched as a subfield of self-organization. In [74], [77], and [160], the authors used the terms "relation adaptation" and "self-organization" interchangeably. Relation adaptation has not attracted much attention in multiagent systems compared to the popular research issues, e.g., task/resource allocation, coalition formation, and reinforcement learning. The research on relation adaptation usually has to deal with two problems: 1) with whom to modify relations and 2) how to modify relations. The first problem is about
selecting partners in a network and can usually be addressed using trust models. Existing approaches for partner selection, however, are based on agents' interaction histories without considering the dynamism of the environment. For example, suppose agent i and agent j have a good interaction history. Then, based on existing approaches, i will add j as one of its neighbors. However, due to the dynamism of the environment, j may leave the environment very soon. Thus, in this situation, if i takes environmental dynamism into account, it will not add j as one of its neighbors. The second problem is about selecting a proper relation, in the case that the number of relations is more than one, and can usually be addressed using reasoning and learning (either individual or collective depending on the settings). However, agents which use reasoning and learning techniques may become very subjective, as there is no interaction between agents during the reasoning and learning processes. Thus, the negotiation technique may be a good choice in this situation, because both parties can present their requirements and offers. Therefore, a negotiation result which can benefit both parties can be achieved.

In addition, relation adaptation can be used in real-world systems, e.g., social networks [161] and multirobot systems [162], to adapt relations among entities to achieve more efficient organizational structures. Therefore, it is also a future research direction to study relation adaptation in real-world systems.

C. Organizational Design

Organizational design was initially studied in social organizations for specific purposes. Then, the research was carried over to multiagent systems for general purposes. Originally, the research on organizational design focused on how to assign roles to different participants in an organization. When self-organization is introduced into organizational design, i.e., organizational self-design, the research includes other aspects, e.g., agent cloning/spawning, agent extinction, and agent mobility. However, there is still a lack of an organizational self-design framework which combines all these aspects together: self-assigning roles, self-cloning, self-spawning, self-extinction, etc. Therefore, future research on organizational self-design can be conducted in this direction.

Also, as the research on organizational design originated in social organizations, in the future, research on organizational self-design can be conducted in social organizations, e.g., enterprises [163]. Compared to existing organizational techniques in social organizations, introducing the organizational self-design technique into social organizations can increase the autonomy of each entity and avoid the centralization of authority to some extent.

D. Reinforcement Learning

Like task/resource allocation, reinforcement learning is also an important and popular research topic in multiagent systems. However, introducing self-organization into reinforcement learning has not attracted much attention. The milestone work in this field is Zhang et al.'s [90], [91] work, which introduced organizational control and self-organization into reinforcement learning. The self-organization approach used in Zhang et al.'s [90], [91] work is carried out in the management layer, i.e., between supervisors of groups. Future research may focus on designing a self-organization approach which is able to work not only in the management layer but also between agents in each group. In addition, the essence of reinforcement learning is how to adjust the probability distribution among available actions, and many adjustment approaches have been proposed. Thus, another future research direction may be that existing probability distribution adjustment approaches can be embodied as knowledge of each agent, and each agent autonomously selects an adjustment approach in each learning round. Also, in reinforcement learning, the setting of the values of learning parameters, e.g., learning rates, can affect the performance of a learning algorithm, and no set of parameter values is best across all domains [164]. However, in most existing learning algorithms, the values of parameters are hand-tuned. Thus, it should be interesting to develop a self-organization approach to self-adjust the values of learning parameters during the learning process in different situations.

Reinforcement learning has been employed in many physical systems. For example, reinforcement learning can be used for sleep/wake-up scheduling in wireless sensor networks to save sensors' energy [165], [166]. Reinforcement learning can also be used to learn pricing strategies for broker agents in smart grid markets [167]. Therefore, future research can also focus on applying self-organizing reinforcement learning in physical systems to improve the learning performance in these systems, e.g., increasing convergence speed or reducing communication and computation overhead.

E. Enhancing Software Quality

Traditionally, software quality is guaranteed by system managers. However, an error happening in a software system may cost system managers several hours, sometimes even several days, to find and fix. Therefore, self-organization has been introduced into software systems, which includes many self-* properties, e.g., self-configuration, self-checking, self-healing, self-protection, self-optimization, etc. Most existing studies include only part of these self-* properties. Thus, future research may design a self-organizing agent-based software system which includes all of these properties. This is certainly a very large project and would need a group of researchers to complete collaboratively.

F. Collective Decision Making

Like organizational design, collective decision making also originates from social science, where members of a group have to collectively make decisions to achieve a consensus. Most existing studies of self-organizing collective decision making were conducted in multirobot systems instead of general multiagent systems. Very recently, Valentini et al. [110] investigated self-organization for collective decision making in multiagent systems. Their work considered nearly every aspect of self-organizing collective decision making. Future research on self-organizing collective decision making may be conducted in open environments where new agents can join
YE et al.: SURVEY OF SELF-ORGANIZATION MECHANISMS IN MULTIAGENT SYSTEMS 457

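The flavor of such mechanisms can be sketched in a few lines. The following is our own illustrative code, not the algorithm of [110]: each agent repeatedly polls a small random sample of peers and adopts the locally most frequent opinion, so a consensus can emerge from purely local information. The function name and parameters are hypothetical, chosen only for this sketch.

```python
import random

def collective_decision(opinions, rounds=500, sample_size=3, seed=0):
    """Toy self-organized collective decision loop (illustrative sketch only).

    In each round, one randomly chosen agent polls `sample_size` random
    peers and adopts the most frequent opinion in that sample. No agent
    uses global information, yet the population typically drifts toward
    a consensus on the initially dominant opinion.
    """
    rng = random.Random(seed)
    opinions = list(opinions)  # copy; an open system could append/remove entries
    for _ in range(rounds):
        i = rng.randrange(len(opinions))
        sample = [opinions[rng.randrange(len(opinions))]
                  for _ in range(sample_size)]
        # Adopt the locally most frequent opinion (ties broken arbitrarily).
        opinions[i] = max(set(sample), key=sample.count)
    return opinions

final = collective_decision(["A"] * 8 + ["B"] * 2)
```

In an open environment, agents joining or leaving the group simply correspond to inserting or deleting entries of `opinions` between rounds, which is one reason purely local mechanisms of this kind are attractive there.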
In addition, as the problem of finding a collective agreement on the most favorable choice among a set of alternatives is the "best-of-n" decision problem, in a dynamic environment the "n" may change over time. Hence, it is necessary to develop a self-adaptive approach that enables agents to adjust their behavior in a timely manner so as to make the best decision in a dynamic environment.

Self-organizing collective decision making can also be applied in sensor networks for various purposes, e.g., clock synchronization. Most existing techniques for clock synchronization in sensor networks need global information or require all sensors to participate in the synchronization process [168], [169]. By using the self-organizing collective decision making technique for clock synchronization, only local information is needed and only some of the sensors are required to participate.

G. Other Research Issues

In addition to the above research issues, there are other important research issues in multiagent systems that have attracted little or no attention regarding how to introduce self-organization into them. These issues include coalition formation, evolution of cooperation, self-checking logical agents, negotiation, coordination, planning, and reasoning.

Coalition formation and evolution of cooperation have been studied for a very long time. However, very few studies have considered introducing self-organization into them, so research in this direction has great potential. As both coalition formation and evolution of cooperation consist of a number of steps, future research on introducing self-organization into these two topics can focus on different steps. For example, for coalition formation, an existing study [128] uses self-organization for agents to self-adapt their degrees of involvement in different coalitions, while future research could use self-organization for agents to autonomously and dynamically recruit or expel coalition members. For evolution of cooperation, an existing study [86] uses self-organization for agents to autonomously select an action update strategy in each round, while future research could use self-organization for agents to modify the relationships with their neighbors (e.g., strengthen or weaken them) in each round.

Most of the research on self-checking logical agents has been undertaken by Costantini and De Gasperis [139] and Costantini [140], [141]. The research on self-checking logical agents is akin to that on self-checking software agents, except that Costantini and De Gasperis [139] and Costantini [140], [141] focused more on logical agents. Therefore, similar to the issue of enhancing software quality, future research on self-checking logical agents could extend to other self-* properties of logical agents, e.g., self-healing, self-optimization, etc.

In multiagent systems, negotiation is a key tool for multiple autonomous agents to reach mutually beneficial agreements. The process of negotiation can take different forms, e.g., auctions, protocols, and bargains. In each of these forms, there is a set of rules which govern the interaction process among agents. Such rules indicate the allowable participants (e.g., which agents are allowed to join the negotiation), the negotiation states (e.g., bids or offers generated, accepted or rejected, negotiation started, and negotiation terminated), the events that cause state transitions (e.g., when a bid or an offer is accepted, the negotiation is terminated; or when the deadline is reached, the negotiation is terminated whether or not an agreement has been reached), and the valid actions of the participants in particular states (e.g., what can be sent by whom, to whom, and when) [170]. In the future, self-organization may be introduced into negotiation for agent decision making in the interaction process. For example, agents may self-adjust their strategies for bid or offer generation and dynamically decide when to generate bids or offers in different situations based on self-organization mechanisms.

In order to interact successfully in an environment, agents must be able to reason about their interactions with other heterogeneous agents which have different properties and capabilities [171]. During the reasoning process, an agent first observes the environment and its internal state. Then, the agent creates a new goal and generates a set of candidate plans. Finally, the agent selects the most suitable plan to execute in order to achieve the goal [172]. The plan generation is called planning. Multiagent planning is also known as multiagent sequential decision making, that is, a set of agents with complementary capabilities coordinate to generate efficient plans so as to achieve their respective goals [173]–[175]. These plans should not be in conflict with each other. Reasoning, planning, and coordination are closely related to one another, as planning is a step in a reasoning process and coordination is used to guarantee that individual agents' plans do not conflict with each other. In the future, self-organization may be introduced into planning and coordination. For example, self-organization mechanisms can be developed for adaptive generation and selection of plans. Also, as coordination can be carried out using learning [173], [175], [176], self-organization in coordination may be achieved by designing self-organizing learning algorithms such as the ones discussed in Section IV.

Moreover, there are delay phenomena, e.g., time delay and feedback delay, in practical self-organizing systems, e.g., biological systems [177], [178] and neural network systems [179], [180]. These delay phenomena, however, have not been considered in existing self-organizing multiagent systems, although they have been taken into account in general multiagent systems [181]. To make self-organizing multiagent systems applicable in practical settings, taking delay phenomena into account when designing them is also a topic for future research.

VII. CONCLUSION

In this paper, self-organization mechanisms in multiagent systems have been surveyed. The classification method used in this survey is objective-based classification, in order to provide good readability. Readers can thus gain a deep understanding of the benefits of using self-organization to address various multiagent system research issues. In this survey, we provided
the basic concepts of self-organization, highlighted the major research issues in multiagent systems, discussed how these issues can be addressed using self-organization approaches, and presented the important research results that have been achieved. We also identified other survey papers regarding self-organization in multiagent systems and pointed out the differences between their work and ours. Finally, this paper concluded with a discussion of future research directions for the surveyed research issues in multiagent systems. The research issues discussed in this paper have been broadly studied not only in multiagent systems but also in other specific systems, e.g., robot systems and sensor networks. Thus, each of the research issues deserves a separate survey, which is one of our future studies. Moreover, as described in Section I, the survey in this paper is delimited to stage 2: organization design, and stage 3: agent internal activity design. Thus, in the future, the survey could be extended to other stages of multiagent system development.

REFERENCES

[1] K. P. Sycara, “Multiagent systems,” AI Mag., vol. 19, no. 2, pp. 79–92, 1998.
[2] H. A. Simon, Models of Man: Social and Rational—Mathematical Essays on Rational Human Behavior in a Social Setting. New York, NY, USA: Wiley, 1957.
[3] M. J. Wooldridge and N. R. Jennings, “Intelligent agents: Theory and practice,” Knowl. Eng. Rev., vol. 10, no. 2, pp. 115–152, 1995.
[4] M. J. Wooldridge, “Agent-based software engineering,” IEE Proc. Softw. Eng., vol. 144, no. 1, pp. 26–37, 1997.
[5] M. J. Wooldridge, Multiagent Systems: A Modern Approach to Distributed Artificial Intelligence, G. Weiss, Ed. Cambridge, MA, USA: MIT Press, 1999.
[6] B. Chaib-Draa, “Industrial applications of distributed AI,” Commun. ACM, vol. 38, no. 11, pp. 49–53, 1995.
[7] N. R. Jennings, K. Sycara, and M. Wooldridge, “A roadmap of agent research and development,” J. Auton. Agents Multi-Agent Syst., vol. 1, no. 1, pp. 7–38, 1998.
[8] M. J. Wooldridge, An Introduction to Multiagent Systems, 2nd ed. Chichester, U.K.: Wiley, 2009.
[9] P. G. Balaji and D. Srinivasan, “An introduction to multi-agent systems,” in Innovations in Multi-Agent Systems and Applications—1. Berlin, Germany: Springer, 2010, pp. 1–27.
[10] D. Kinny, M. Georgeff, and A. Rao, “A methodology and modelling technique for systems of BDI agents,” in Proc. 7th Eur. Workshop Model. Auton. Agents Multi-Agent World, vol. 1038. Eindhoven, The Netherlands, 1996, pp. 56–71.
[11] H. S. Nwana, “Software agents: An overview,” Knowl. Eng. Rev., vol. 11, no. 3, pp. 205–244, 1996.
[12] C. A. Iglesia, M. Garijo, and J. C. González, A Survey of Agent-Oriented Methodologies (LNCS 1555). Berlin, Germany: Springer, 1998, pp. 185–198.
[13] M. Wooldridge, N. R. Jennings, and D. Kinny, “The Gaia methodology for agent-oriented analysis and design,” Auton. Agents Multi-Agent Syst., vol. 3, no. 3, pp. 285–312, 2000.
[14] M. F. Wood and S. A. DeLoach, “An overview of the multiagent systems engineering methodology,” in Proc. 1st Int. Workshop Agent-Orient. Softw. Eng., vol. 1957. Limerick, Ireland, 2001, pp. 207–221.
[15] Q. N. N. Tran and G. Low, “MOBMAS: A methodology for ontology-based multi-agent systems development,” Inf. Softw. Technol., vol. 50, nos. 7–8, pp. 697–722, 2008.
[16] O. Z. Akbari, “A survey of agent-oriented software engineering paradigm: Towards its industrial acceptance,” J. Comput. Eng. Res., vol. 1, no. 2, pp. 14–28, 2010.
[17] Y. Shoham and S. B. Cousins, “Logics of mental attitudes in AI,” in Foundations of Knowledge Representation and Reasoning. Berlin, Germany: Springer, 1994, pp. 296–309.
[18] W. R. Ashby, “Principles of the self-organizing system,” in Principles of Self-Organization. London, U.K.: Pergamon Press, 1962, pp. 255–278.
[19] C. W. Reynolds, “Flocks, herds and schools: A distributed behavioral model,” in Proc. Annu. Conf. Comput. Graph. Interact. Tech. (SIGGRAPH), Anaheim, CA, USA, 1987, pp. 25–34.
[20] W. Elmenreich, R. D’Souza, C. Bettstetter, and H. de Meer, “A survey of models and design methods for self-organizing networked systems,” in Self-Organizing Systems, vol. 5918. Berlin, Germany: Springer, 2009, pp. 37–49.
[21] G. D. M. Serugendo, M.-P. Gleizes, and A. Karageorgos, “Self-organization in multi-agent systems,” Knowl. Eng. Rev., vol. 20, no. 2, pp. 165–189, 2005.
[22] H. Tianfield and R. Unland, “Towards self-organization in multi-agent systems and grid computing,” Multiagent Grid Syst., vol. 1, no. 2, pp. 89–95, 2005.
[23] A. Rogers, E. David, and N. R. Jennings, “Self-organized routing for wireless microsensor networks,” IEEE Trans. Syst., Man, Cybern. A, Syst., Humans, vol. 35, no. 3, pp. 349–359, May 2005.
[24] Y. Song, L. Liu, H. Ma, and A. V. Vasilakos, “A biology-based algorithm to minimal exposure problem of wireless sensor networks,” IEEE Trans. Netw. Service Manag., vol. 11, no. 3, pp. 417–430, Sep. 2014.
[25] L. Liu, Y. Song, H. Zhang, H. Ma, and A. V. Vasilakos, “Physarum optimization: A biology-inspired algorithm for the Steiner tree problem in networks,” IEEE Trans. Comput., vol. 64, no. 3, pp. 818–831, Mar. 2015.
[26] R. Frei and G. D. M. Serugendo, “Self-organizing assembly systems,” IEEE Trans. Syst., Man, Cybern. C, Appl. Rev., vol. 41, no. 6, pp. 885–897, Nov. 2011.
[27] M. A. Khan, H. Tembine, and A. V. Vasilakos, “Game dynamics and cost of learning in heterogeneous 4G networks,” IEEE J. Sel. Areas Commun., vol. 30, no. 1, pp. 198–213, Jan. 2012.
[28] G. D. M. Serugendo, M.-P. Gleizes, and A. Karageorgos, “Self-organisation and emergence in MAS: An overview,” Informatica, vol. 30, no. 1, pp. 45–54, 2006.
[29] V. I. Gorodetskii, “Self-organization and multiagent systems: I. Models of multiagent self-organization,” J. Comput. Syst. Sci., vol. 51, no. 2, pp. 256–281, 2012.
[30] O. G. Aliu, A. Imran, M. A. Imran, and B. Evans, “A survey of self organisation in future cellular networks,” IEEE Commun. Surveys Tuts., vol. 15, no. 1, pp. 336–361, Mar. 2013.
[31] K. L. Mills, “A brief survey of self-organization in wireless sensor networks,” Wireless Commun. Mobile Comput., vol. 7, no. 7, pp. 823–834, 2007.
[32] F. Dressler, “A study of self-organization mechanisms in ad hoc and sensor networks,” Comput. Commun., vol. 31, no. 13, pp. 3018–3029, 2008.
[33] Z. Zhang, K. Long, and J. Wang, “Self-organization paradigms and optimization approaches for cognitive radio technologies: A survey,” IEEE Wireless Commun., vol. 20, no. 2, pp. 36–42, Apr. 2013.
[34] S. Dobson et al., “A survey of autonomic communications,” ACM Trans. Auton. Adapt. Syst., vol. 1, no. 2, pp. 223–259, 2006.
[35] S. Bousbia and D. Trentesaux, “Self-organization in distributed manufacturing control: State-of-the-art and future trends,” in Proc. IEEE Int. Conf. Syst. Man Cybern., Hammamet, Tunisia, 2002, Art. ID WA1L1.
[36] C. Bernon, V. Chevrier, V. Hilaire, and P. Marrow, “Applications of self-organising multi-agent systems: An initial framework for comparison,” Informatica, vol. 30, no. 1, pp. 73–82, 2006.
[37] G. Picard, J. F. Hübner, O. Boissier, and M.-P. Gleizes, “Reorganisation and self-organisation in multi-agent systems,” in Proc. Int. Workshop Organ. Model., Paris, France, 2009, pp. 66–80.
[38] G. D. M. Serugendo, M.-P. Gleizes, and A. Karageorgos, “Self-organisation and emergence in software systems,” in Self-Organising Systems. Berlin, Germany: Springer, 2011, pp. 7–32.
[39] V. I. Gorodetskii, “Self-organization and multiagent systems: II. Applications and the development technology,” J. Comput. Syst. Sci., vol. 51, no. 3, pp. 391–409, 2012.
[40] M. Sims, C. V. Goldman, and V. Lesser, “Self-organization through bottom-up coalition formation,” in Proc. AAMAS, Melbourne, VIC, Australia, 2003, pp. 867–874.
[41] P. R. J. Ferreira, F. S. Boffo, and A. L. C. Bazzan, “A self-organized algorithm for distributed task allocation in complex scenarios,” in Proc. Workshop Coord. Control Massively Multi-Agent Syst. (CCMMS), 2007, pp. 19–33.
[42] O. Shehory and S. Kraus, “Methods for task allocation via agent coalition formation,” Artif. Intell., vol. 101, nos. 1–2, pp. 165–200, 1998.
[43] X. Zheng and S. Koenig, “Reaction function for task allocation to cooperative agents,” in Proc. AAMAS, Estoril, Portugal, 2008, pp. 559–566.
[44] P. Scerri, A. Farinelli, S. Okamoto, and M. Tambe, “Allocating tasks in extreme teams,” in Proc. AAMAS, Utrecht, The Netherlands, 2005, pp. 727–734.
[45] S. Abdallah and V. Lesser, “Learning the task allocation game,” in Proc. AAMAS, Hakodate, Japan, 2006, pp. 850–857.
[46] M. de Weerdt, Y. Zhang, and T. Klos, “Distributed task allocation in social networks,” in Proc. AAMAS, Honolulu, HI, USA, 2007, pp. 500–507.
[47] A. C. Chapman, R. A. Micillo, R. Kota, and N. R. Jennings, “Decentralised dynamic task allocation: A practical game-theoretic approach,” in Proc. AAMAS, Budapest, Hungary, 2009, pp. 915–922.
[48] L. Wang, Z. Wang, S. Hu, and L. Liu, “Ant colony optimization for task allocation in multi-agent systems,” China Commun., vol. 10, no. 3, pp. 125–132, Mar. 2013.
[49] R. G. Smith, “The contract net protocol: High-level communication and control in a distributed problem solver,” IEEE Trans. Comput., vol. C-29, no. 12, pp. 1104–1113, Dec. 1980.
[50] K. S. Macarthur, R. Stranders, S. D. Ramchurn, and N. R. Jennings, “A distributed anytime algorithm for dynamic task allocation in multi-agent systems,” in Proc. AAAI, San Francisco, CA, USA, 2011, pp. 701–706.
[51] D. S. dos Santos and A. L. C. Bazzan, “Distributed clustering for group formation and task allocation in multiagent systems: A swarm intelligence approach,” Appl. Soft Comput., vol. 12, no. 8, pp. 2123–2131, 2012.
[52] S. D. Ramchurn, A. Farinelli, K. S. Macarthur, and N. R. Jennings, “Decentralised coordination in RoboCup rescue,” Comput. J., vol. 53, no. 9, pp. 1447–1461, 2010.
[53] R. Feldmann, M. Gairing, T. Lücking, B. Monien, and M. Rode, “Selfish routing in non-cooperative networks: A survey,” in Proc. 28th Int. Symp. Math. Found. Comput. Sci., Bratislava, Slovakia, 2003, pp. 21–45.
[54] P. Sousa, C. Ramos, and J. Neves, “The fabricare scheduling prototype suite: Agent interaction and knowledge base,” J. Intell. Manuf., vol. 14, no. 5, pp. 441–455, 2003.
[55] G. Wei, A. V. Vasilakos, Y. Zheng, and N. Xiong, “A game-theoretic method of fair resource allocation for cloud computing services,” J. Supercomput., vol. 54, no. 2, pp. 252–269, 2010.
[56] M. R. Rahimi, N. Venkatasubramanian, and A. V. Vasilakos, “MuSIC: Mobility-aware optimal service allocation in mobile cloud computing,” in Proc. IEEE CLOUD, Santa Clara, CA, USA, 2013, pp. 75–82.
[57] Y. Chevaleyre et al., “Issues in multiagent resource allocation,” Informatica, vol. 30, no. 1, pp. 3–31, 2006.
[58] P. C. Cramton, Y. Shosham, and R. Steinberg, Combinatorial Auctions. Cambridge, MA, USA: MIT Press, 2006.
[59] V. Krishna, Auction Theory. San Diego, CA, USA: Academic Press, 2002.
[60] P. R. Wurman, M. P. Wellman, and W. E. Walsh, “A parametrization of the auction design space,” Games Econ. Behav., vol. 35, nos. 1–2, pp. 304–338, 2001.
[61] T. W. Sandholm, “An implementation of the contract net protocol based on marginal cost calculations,” in Proc. AAAI, Washington, DC, USA, 1993, pp. 256–262.
[62] T. W. Sandholm and V. R. Lesser, “Leveled commitment contracts and strategic breach,” Games Econ. Behav., vol. 35, nos. 1–2, pp. 212–270, 2001.
[63] S. Aknine, S. Pinson, and M. F. Shakun, “An extended multi-agent negotiation protocol,” J. Auton. Agents Multi-Agent Syst., vol. 8, no. 1, pp. 5–45, 2004.
[64] A. Schaerf, Y. Shoham, and M. Tennenholtz, “Adaptive load balancing: A study in multi-agent learning,” J. Artif. Intell. Res., vol. 2, pp. 475–500, May 1995.
[65] G. Tesauro, “Online resource allocation using decompositional reinforcement learning,” in Proc. AAAI, vol. 2. Pittsburgh, PA, USA, 2005, pp. 886–891.
[66] C. Zhang, V. Lesser, and P. Shenoy, “A multi-agent learning approach to online distributed resource allocation,” in Proc. IJCAI, Pasadena, CA, USA, 2009, pp. 361–366.
[67] S. S. Fatima and M. Wooldridge, “Adaptive task and resource allocation in multi-agent systems,” in Proc. AGENTS, Montreal, QC, Canada, 2001, pp. 537–544.
[68] T. Schlegel and R. Kowalczyk, “Towards self-organising agent-based resource allocation in a multi-server environment,” in Proc. AAMAS, Honolulu, HI, USA, 2007, pp. 74–81.
[69] B. An, V. Lesser, and K. M. Sim, “Strategic agents for multi-resource negotiation,” Auton. Agents Multi-Agent Syst., vol. 23, no. 1, pp. 114–153, 2011.
[70] J. Pitt, J. Schaumeier, D. Busquets, and S. Macbeth, “Self-organising common-pool resource allocation and canons of distributive justice,” in Proc. IEEE 6th Int. Conf. Self-Adapt. Self-Organ. Syst., Lyon, France, 2012, pp. 119–128.
[71] I. Kash, A. D. Procaccia, and N. Shah, “No agent left behind: Dynamic fair division of multiple resources,” in Proc. AAMAS, St. Paul, MN, USA, 2013, pp. 351–358.
[72] M. E. Gaston and M. desJardins, “Agent-organized networks for dynamic team formation,” in Proc. AAMAS, Utrecht, The Netherlands, Jul. 2005, pp. 230–237.
[73] R. Glinton, K. Sycara, and P. Scerri, “Agent organized networks redux,” in Proc. AAAI, vol. 1. Chicago, IL, USA, 2008, pp. 83–88.
[74] S. Abdallah and V. Lesser, “Multiagent reinforcement learning and self-organization in a network of agents,” in Proc. AAMAS, Honolulu, HI, USA, May 2007, pp. 172–179.
[75] N. Griffiths and M. Luck, “Changing neighbours: Improving tag-based cooperation,” in Proc. AAMAS, Toronto, ON, Canada, May 2010, pp. 249–256.
[76] R. Kota, N. Gibbins, and N. R. Jennings, “Decentralized approaches for self-adaptation in agent organizations,” ACM Trans. Auton. Adapt. Syst., vol. 7, no. 1, pp. 1–28, 2012.
[77] D. Ye, M. Zhang, and D. Sutanto, “Self-organization in an agent network: A mechanism and a potential application,” Decis. Support Syst., vol. 53, no. 3, pp. 406–417, 2012.
[78] D. Bollegala, Y. Matsuo, and M. Ishizuka, “Relation adaptation: Learning to extract novel relations with minimum supervision,” in Proc. IJCAI, Barcelona, Spain, 2011, pp. 2205–2210.
[79] K. E. Weick, “Organization design: Organizations as self-designing systems,” Organ. Dyn., vol. 6, no. 2, pp. 31–46, 1977.
[80] B. Horling and V. Lesser, “Using quantitative models to search for appropriate organizational designs,” Auton. Agents Multi-Agent Syst., vol. 16, no. 2, pp. 95–149, 2008.
[81] K. Decker, K. Sycara, and M. Williamson, “Cloning for intelligent adaptive information agents,” in Proc. 2nd Aust. Workshop Distrib. Artif. Intell., Cairns, QLD, Australia, 1996, pp. 63–75.
[82] O. Shehory, K. Sycara, P. Chalasani, and S. Jha, “Agent cloning: An approach to agent mobility and resource allocation,” IEEE Commun. Mag., vol. 36, no. 7, pp. 58–67, Jul. 1998.
[83] T. Ishida, L. Gasser, and M. Yokoo, “Organization self-design of distributed production systems,” IEEE Trans. Knowl. Data Eng., vol. 4, no. 2, pp. 123–134, Apr. 1992.
[84] S. Kamboj and K. S. Decker, “Organizational self-design in semi-dynamic environments,” in Proc. AAMAS, Honolulu, HI, USA, May 2007, pp. 1228–1235.
[85] S. Kamboj, “Analyzing the tradeoffs between breakup and cloning in the context of organizational self-design,” in Proc. AAMAS, Budapest, Hungary, May 2009, pp. 829–836.
[86] D. Ye, M. Zhang, and D. Sutanto, “Cloning, resource exchange, and relation adaptation: An integrative self-organisation mechanism in a distributed agent network,” IEEE Trans. Parallel Distrib. Syst., vol. 25, no. 4, pp. 887–897, Apr. 2014.
[87] V. Lesser et al., “Evolution of the GPGP/TAEMS domain-independent coordination framework,” J. Auton. Agents Multi-Agent Syst., vol. 9, no. 1, pp. 87–143, 2004.
[88] I. Kiselev and R. Alhajj, “A self-organizing multi-agent system for adaptive continuous unsupervised learning in complex uncertain environments,” in Proc. AAAI, Chicago, IL, USA, 2008, pp. 1808–1809.
[89] I. Kiselev and R. Alhajj, “An adaptive multi-agent system for continuous learning of streaming data,” in Proc. WI-IAT, vol. 2. Sydney, NSW, Australia, 2008, pp. 148–153.
[90] C. Zhang, S. Adballah, and V. Lesser, “Integrating organizational control into multi-agent learning,” in Proc. AAMAS, Budapest, Hungary, 2009, pp. 757–764.
[91] C. Zhang, V. Lesser, and S. Adballah, “Self-organization for coordinating decentralized reinforcement learning,” in Proc. AAMAS, Toronto, ON, Canada, 2010, pp. 739–746.
[92] A. V. Vasilakos and G. I. Papadimitriou, “A new approach to the design of reinforcement schemes for learning automata: Stochastic estimator learning algorithm,” Neurocomputing, vol. 7, no. 3, pp. 275–297, 1995.
[93] L. P. Kaelbling, M. L. Littman, and A. W. Moore, “Reinforcement learning: A survey,” J. Artif. Intell. Res., vol. 4, no. 1, pp. 237–285, 1996.
[94] J. Schmidhuber, “A general method for incremental self-improvement and multi-agent learning in unrestricted environments,” in Evolutionary Computation: Theory and Applications. Singapore: World Scientific, 1996, pp. 81–123.
[95] L. Busoniu, R. Babuska, and B. De Schutter, “A comprehensive survey of multiagent reinforcement learning,” IEEE Trans. Syst., Man, Cybern. C, Appl. Rev., vol. 38, no. 2, pp. 156–172, Mar. 2008.
[96] C. Zhang and V. Lesser, “Coordinating multi-agent reinforcement learning with limited communication,” in Proc. AAMAS, St. Paul, MN, USA, 2013, pp. 1101–1108.
[97] G. Piliouras, C. Nieto-Granda, H. I. Christensen, and J. S. Shamma, “Persistent patterns: Multi-agent learning beyond equilibrium and utility,” in Proc. AAMAS, Paris, France, 2014, pp. 181–188.
[98] M. Bowling and M. Veloso, “Multiagent learning using a variable learning rate,” Artif. Intell., vol. 136, no. 2, pp. 215–250, 2002.
[99] B. Banerjee and J. Peng, “Generalized multiagent learning with performance bound,” Auton. Agents Multi-Agent Syst., vol. 15, no. 3, pp. 281–312, 2007.
[100] S. Abdallah and V. Lesser, “A multiagent reinforcement learning algorithm with non-linear dynamics,” J. Artif. Intell. Res., vol. 33, no. 1, pp. 521–549, 2008.
[101] C. Guestrin, M. G. Lagoudakis, and R. Parr, “Coordinated reinforcement learning,” in Proc. ICML, San Francisco, CA, USA, 2002, pp. 227–234.
[102] J. R. Kok and N. Vlassis, “Collaborative multiagent reinforcement learning by payoff propagation,” J. Mach. Learn. Res., vol. 7, pp. 1789–1828, Dec. 2006.
[103] C. Zhang and V. Lesser, “Coordinated multi-agent reinforcement learning in networked distributed POMDPs,” in Proc. AAAI, San Francisco, CA, USA, 2011, pp. 764–770.
[104] M. Elidrisi, N. Johnson, M. Gini, and J. Crandall, “Fast adaptive learning in repeated stochastic games by game abstraction,” in Proc. AAMAS, Paris, France, 2014, pp. 1141–1148.
[105] I. Georgiadis, J. Magee, and J. Kramer, “Self-organising software architectures for distributed systems,” in Proc. WOSS, Charleston, SC, USA, 2002, pp. 33–38.
[106] S. Malek, M. Mikic-Rakic, and N. Medvidovic, “A decentralized redeployment algorithm for improving the availability of distributed systems,” in Component Deployment. Berlin, Germany: Springer, 2005, pp. 99–114.
[107] M. U. Iftikhar and D. Weyns, “A case study on formal verification of self-adaptive behaviors in a decentralised system,” in Proc. Int. Workshop Found. Coordin. Lang. Self Adapt., Newcastle upon Tyne, U.K., 2012, pp. 45–62.
[108] D. G. de la Iglesia and D. Weyns, “Enhancing software qualities in multi-agent systems using self-adaptation,” in Proc. Eur. Workshop Multi-Agent Syst., Toulouse, France, 2012, pp. 1–15.
[109] D. G. de la Iglesia and D. Weyns, “SA-MAS: Self-adaptation to enhance software qualities in multi-agent systems,” in Proc. AAMAS, St. Paul, MN, USA, 2013, pp. 1159–1160.
[110] G. Valentini, H. Hamann, and M. Dorigo, “Self-organized collective decision making: The weighted voter model,” in Proc. AAMAS, Paris, France, 2014, pp. 45–52.
[111] B. H. C. Cheng et al., “Software engineering for self-adaptive systems: A research roadmap,” in Self-Adaptive Systems (LNCS 5525). Berlin, Germany: Springer, 2009, pp. 1–26.
[112] N. R. Jennings, “On agent-based software engineering,” Artif. Intell., vol. 117, no. 2, pp. 277–296, 2000.
[113] M. Winikoff, “Future directions for agent-based software engineering,” Int. J. Agent Orient. Softw. Eng., vol. 3, no. 4, pp. 402–410, 2009.
[114] A. Hossain, M. A. Kashem, and S. Sultana, “Enhancing software quality using agile techniques,” J. Comput. Eng., vol. 10, no. 2, pp. 87–93, 2013.
[115] Z. Liaghat, A. H. Rasekh, and A. R. Tabebordbar, “Enhance software quality using data mining algorithms,” in Proc. Spring Congr. Eng. Technol. (S-CET), Xi’an, China, 2012, pp. 1–5.
[116] R. de Lemos et al., “Software engineering for self-adaptive systems: A second research roadmap,” in Self-Adaptive Systems (LNCS 7475). Berlin, Germany: Springer, 2013, pp. 1–32.
[117] J. Magee, N. Dulay, S. Eisenbach, and J. Kramer, “Specifying distributed software architectures,” in Proc. 5th Eur. Softw. Eng. Conf., Sitges, Spain, 1995, pp. 137–153.
[118] J. O. Kephart and D. M. Chess, “The vision of autonomic computing,” Computer, vol. 36, no. 1, pp. 41–50, Jan. 2003.
[119] J. Kramer and J. Magee, “Self-managed systems: An architectural challenge,” in Proc. Future Softw. Eng., Minneapolis, MN, USA, 2007, pp. 259–268.
[120] M. A. Montes de Oca et al., “Majority-rule opinion dynamics with differential latency: A mechanism for self-organized collective decision-making,” Swarm Intell., vol. 5, no. 3, pp. 305–327, 2011.
[121] T. C. Schelling, Micromotives and Macrobehavior. New York, NY, USA: Norton, 1978.
[122] M. Granovetter, “Threshold models of collective behavior,” Amer. J. Sociol., vol. 83, no. 6, pp. 1420–1443, 1978.
[123] C. Castellano, S. Fortunato, and V. Loreto, “Statistical physics of social dynamics,” Rev. Modern Phys., vol. 81, no. 2, pp. 591–646, 2009.
[124] D. Helbing, Quantitative Sociodynamics: Stochastic Methods and Models of Social Interaction Processes. Berlin, Germany: Springer, 2010.
[125] C. Sueur, J.-L. Deneubourg, and O. Petit, “From social network (centralized vs. decentralized) to collective decision-making (unshared vs. shared consensus),” PLoS One, vol. 7, no. 2, 2012, Art. ID e32566.
[126] C. A. C. Parker and H. Zhang, “Cooperative decision-making in decentralized multiple-robot systems: The best-of-N problem,” IEEE/ASME Trans. Mechatronics, vol. 14, no. 2, pp. 240–251, Apr. 2009.
[127] A. Brutschy, A. Scheidler, E. Ferrante, M. Dorigo, and M. Birattari, “‘Can ants inspire robots?’ Self-organized decision making in robotic swarms,” in Proc. IEEE/RSJ Int. Conf. Intell. Robot. Syst., 2012, pp. 4272–4273.
[128] D. Ye, M. Zhang, and D. Sutanto, “Self-adaptation-based dynamic coalition formation in a distributed agent network: A mechanism and a brief survey,” IEEE Trans. Parallel Distrib. Syst., vol. 24, no. 5, pp. 1042–1051, May 2013.
[129] G. Chalkiadakis, E. Elkind, E. Markakis, M. Polukarov, and N. R. Jennings, “Cooperative games with overlapping coalitions,” J. Artif. Intell. Res., vol. 39, pp. 179–216, Sep. 2010.
[130] M. Mareš, “Fuzzy coalition structures,” Fuzzy Sets Syst., vol. 114, no. 1, pp. 23–33, 2000.
[131] L. M. Hofmann, N. Chakraborty, and K. Sycara, “The evolution of cooperation in self-interested agent societies: A critical study,” in Proc. AAMAS, Taipei, Taiwan, 2011, pp. 685–692.
[132] Y. Wang, A. Nakao, and A. V. Vasilakos, “On modeling of coevolution of strategies and structure in autonomous overlay networks,” ACM Trans. Auton. Adapt. Syst., vol. 7, no. 2, 2012, Art. ID 17.
[133] A. Szolnoki and M. Perc, “Group-size effects on the evolution of cooperation in the spatial public goods game,” Phys. Rev. E, vol. 84, Oct. 2011, Art. ID 047102.
[134] R. Axelrod and W. D. Hamilton, “The evolution of cooperation,” Science, vol. 211, no. 4489, pp. 1390–1396, 1981.
[135] E. Fehr and U. Fischbacher, “The nature of human altruism,” Nature, vol. 425, no. 6960, pp. 785–791, 2003.
[136] J. H. Kagel and A. E. Roth, The Handbook of Experimental Economics. Princeton, NJ, USA: Princeton Univ. Press, 1995.
[137] D. J. Watts and S. H. Strogatz, “Collective dynamics of ‘small-world’ networks,” Nature, vol. 393, pp. 440–442, Jun. 1998.
[138] D. Ye and M. Zhang, “A self-adaptive strategy for evolution of cooperation in distributed networks,” IEEE Trans. Comput., vol. 64, no. 4, pp. 899–911, Apr. 2015.
[139] S. Costantini and G. De Gasperis, “Runtime self-checking via temporal (meta-)axioms for assurance of logical agent systems,” in Proc. 7th Workshop Logical Aspects Multi-Agent Syst., Paris, France, 2014, pp. 241–255.
[140] S. Costantini, “Self-checking logical agents,” in Proc. LA-NMR, 2012, pp. 3–30.
[141] S. Costantini, “Self-checking logical agents,” in Proc. AAMAS, St. Paul, MN, USA, 2013, pp. 1329–1330.
[142] J.-P. Georgé, M.-P. Gleizes, P. Glize, and C. Régis, “Real-time simulation for flood forecast: An adaptive multi-agent system STAFF,” in Proc. Symp. Adapt. Agents Multi-Agent Syst., 2003, pp. 7–11.
[143] J.-P. Georgé, B. Edmonds, and P. Glize, “Making self-organizing adaptive multi-agent systems work—Towards the engineering of emergent multi-agent systems,” in Methodologies and Software Engineering for Agent Systems. New York, NY, USA: Kluwer Academic, 2004, pp. 319–338.
[144] M. Mamei, F. Zambonelli, and L. Leonardi, “Co-fields: A physically inspired approach to motion coordination,” IEEE Pervasive Comput., vol. 3, no. 2, pp. 52–61, Apr./Jun. 2004.
[145] M. Camurri, M. Mamei, and F. Zambonelli, Urban Traffic Control With Co-Fields (Lecture Notes in Artificial Intelligence), vol. 4389. Berlin, Germany: Springer, 2007, pp. 239–253.
[146] H. Karuna et al., Emergent Forecasting Using a Stigmergy Approach in Manufacturing Coordination and Control (Lecture Notes in Artificial Intelligence), vol. 3464. Berlin, Germany: Springer, 2005, pp. 210–226.
[147] H. Karuna, P. Valckenaers, P. Verstraete, B. S. Germain, and H. Van Brussel, A Study of System Nervousness in Multi-Agent Manufacturing Control System (Lecture Notes in Artificial Intelligence), vol. 3910. Berlin, Germany: Springer, 2006, pp. 232–243.
[148] A. Dury, F. L. Ber, and V. Chevrier, A Reactive Approach for Solving
Constraint Satisfaction Problems: Assigning Land Use to Farming
Territories (Lecture Notes in Artificial Intelligence), vol. 1555. Berlin,
Germany: Springer, 1998, pp. 397–412.
[149] K. Sohrabi, J. Gao, V. Ailawadhi, and G. J. Pottie, “Protocols for
self-organization of a wireless sensor network,” IEEE Pers. Commun.,
vol. 7, no. 5, pp. 16–27, Oct. 2000.
[150] T. H. Labella, M. Dorigo, and J.-L. Deneubourg, “Division of labor
in a group of robots inspired by ants’ foraging behavior,” ACM Trans.
Auton. Adapt. Syst., vol. 1, no. 1, pp. 4–25, 2006.
[151] W. Liu, A. F. T. Winfield, J. Sa, J. Chen, and L. Dou, “Towards energy
optimization: Emergent task allocation in a swarm of foraging robots,”
Adapt. Behav., vol. 15, no. 3, pp. 289–305, 2007.
[152] Y. Ikemoto, T. Miura, and H. Asama, “Adaptive division-of-labor
control algorithm for multi-robot systems,” J. Robot. Mechatronics,
vol. 22, no. 4, pp. 514–525, 2010.
[153] C. Bernon, M.-P. Gleizes, S. Peyruqueou, and G. Picard, ADELFE:
A Methodology for Adaptive Multi-Agent Systems Engineering
(LNCS 2577). Berlin, Germany: Springer, 2003, pp. 156–169.
[154] C. Bernon, V. Camps, M.-P. Gleizes, and G. Picard, Tools for
Self-Organizing Applications Engineering (Lecture Notes in Artificial
Intelligence), vol. 2977. Berlin, Germany: Springer, 2004, pp. 283–298.
[155] X. Zhang, V. Lesser, and S. Abdallah, “Efficient management of
multi-linked negotiation based on a formalized model,” Auton. Agents
Multi-Agent Syst., vol. 10, no. 2, pp. 165–205, 2005.
[156] X. Zhang, V. Lesser, and T. Wagner, “Integrative negotiation among
agents situated in organizations,” IEEE Trans. Syst., Man, Cybern. C,
Appl. Rev., vol. 36, no. 1, pp. 19–30, Jan. 2006.
[157] K. H. Low, W. K. Leow, and M. H. Ang, “Task allocation via
self-organizing swarm coalitions in distributed mobile sensor network,”
in Proc. AAAI, San Jose, CA, USA, 2004, pp. 28–33.
[158] W. Li et al., “Efficient allocation of resources in multiple heterogeneous
wireless sensor networks,” J. Parallel Distrib. Comput., vol. 74, no. 1,
pp. 1775–1788, 2014.
[159] M. G. Kallitsis, G. Michailidis, and M. Devetsikiotis, “A decentralized
algorithm for optimal resource allocation in smartgrids with
communication network externalities,” in Proc. IEEE SmartGridComm,
Brussels, Belgium, 2011, pp. 434–439.
[160] R. Kota, N. Gibbins, and N. R. Jennings, “Self-organising agent
organisations,” in Proc. AAMAS, Budapest, Hungary, 2009, pp. 797–804.
[161] R. Xiang, J. Neville, and M. Rogati, “Modeling relationship strength
in online social networks,” in Proc. WWW, Raleigh, NC, USA, 2010,
pp. 981–990.
[162] Z. Yan, N. Jouandeau, and A. A. Cherif, “A survey and analysis of
multi-robot coordination,” Int. J. Adv. Robot. Syst., vol. 10, no. 399,
pp. 1–18, 2013.
[163] J. R. Galbraith, “The evolution of enterprise organization designs,”
J. Organ. Design, vol. 1, no. 2, pp. 1–13, 2012.
[164] T. Hester, M. Lopes, and P. Stone, “Learning exploration strategies in
model-based reinforcement learning,” in Proc. AAMAS, St. Paul, MN,
USA, 2013, pp. 1069–1076.
[165] M. Mihaylov, Y.-A. L. Borgne, K. Tuyls, and A. Nowé, “Decentralised
reinforcement learning for energy-efficient scheduling in wireless
sensor networks,” Int. J. Commun. Netw. Distrib. Syst., vol. 9, nos. 3–4,
pp. 207–224, 2012.
[166] S. Sengupta, S. Das, M. Nasir, A. V. Vasilakos, and W. Pedrycz, “An
evolutionary multiobjective sleep-scheduling scheme for differentiated
coverage in wireless sensor networks,” IEEE Trans. Syst., Man, Cybern.
C, Appl. Rev., vol. 42, no. 6, pp. 1093–1102, Nov. 2012.
[167] P. P. Reddy and M. M. Veloso, “Strategy learning for autonomous
agents in smart grid markets,” in Proc. IJCAI, Barcelona, Spain, 2011,
pp. 1446–1451.
[168] Q. Li and D. Rus, “Global clock synchronization in sensor networks,”
in Proc. INFOCOM, Hong Kong, 2004, pp. 214–226.
[169] C.-H. Yu, J. Werfel, and R. Nagpal, “Collective decision-making in
multi-agent systems by implicit leadership,” in Proc. AAMAS, Toronto,
ON, Canada, 2010, pp. 1189–1196.
[170] M. Beer et al., “Negotiation in multi-agent systems,” Knowl. Eng. Rev.,
vol. 14, no. 3, pp. 285–289, 1999.
[171] P. Stone, “Learning and multiagent reasoning for autonomous agents,”
in Proc. IJCAI, Hyderabad, India, 2007, pp. 13–30.
[172] M. D’Inverno, M. Luck, M. Georgeff, D. Kinny, and M. Wooldridge,
“The dMARS architecture: A specification of the distributed multi-agent
reasoning system,” Auton. Agents Multi-Agent Syst., vol. 9, no. 1,
pp. 5–53, 2004.
[173] C. Boutilier, “Planning, learning and coordination in multiagent
decision processes,” in Proc. 6th Conf. Theoretical Aspects Ration.
Knowl., 1996, pp. 195–210.
[174] Y. Dimopoulos and P. Moraitis, “Multi-agent coordination and
cooperation through classical planning,” in Proc. IEEE/WIC/ACM Int.
Conf. Intell. Agent Technol., Hong Kong, 2006, pp. 398–402.
[175] M. Grzes and P. Poupart, “Incremental policy iteration with guaranteed
escape from local optima in POMDP planning,” in Proc. AAMAS,
Istanbul, Turkey, 2015, pp. 1249–1257.
[176] S. E. Page, “Self organization and coordination,” Comput. Econ.,
vol. 18, no. 1, pp. 25–48, 2001.
[177] J. F. Allard, G. O. Wasteneys, and E. N. Cytrynbaum, “Mechanisms
of self-organization of cortical microtubules in plants revealed by
computational simulations,” Mol. Biol. Cell, vol. 21, no. 2, pp. 278–286,
2010.
[178] E. A. Gaffney and S. S. Lee, “The sensitivity of Turing self-organization
to biological feedback delays: 2D models of fish pigmentation,” Math.
Med. Biol., vol. 32, no. 1, pp. 56–78, 2014.
[179] K.-I. Amemoria and S. Ishii, “Self-organization of delay lines by
spike-time-dependent learning,” Neurocomputing, vol. 61, pp. 291–316,
Oct. 2004.
[180] A.-H. Tan, N. Lu, and D. Xiao, “Integrating temporal difference
methods and self-organizing neural networks for reinforcement learning
with delayed evaluative feedback,” IEEE Trans. Neural Netw., vol. 19,
no. 2, pp. 230–244, Feb. 2008.
[181] Z. Meng, Z. Li, A. V. Vasilakos, and S. Chen, “Delay-induced
synchronization of identical linear multiagent systems,” IEEE Trans.
Cybern., vol. 43, no. 2, pp. 476–489, Apr. 2013.

Dayong Ye received the B.Eng. degree in mechatronic engineering from
the Hefei University of Technology, Hefei, China, in 2003, and the M.Sc.
and Ph.D. degrees in computer science from the University of Wollongong,
Wollongong, NSW, Australia, in 2009 and 2013, respectively.
He is currently a Research Fellow with the Swinburne University of
Technology, Melbourne, VIC, Australia. His current research interests include
self-organizing multiagent systems, service-oriented systems, and cloud
computing.

Minjie Zhang received the B.Sc. degree in computer science from Fudan
University, Shanghai, China, in 1982, and the Ph.D. degree in computer
science from the University of New England, Armidale, NSW, Australia,
in 1996.
She is currently a Professor of Computer Science with the University of
Wollongong, Wollongong, NSW, Australia. Her current research interests
include multiagent systems, agent-based simulation, and modeling in
complex domains.

Athanasios V. Vasilakos received the Ph.D. degree in computer engineering
from the University of Patras, Patras, Greece, in 1988.
He is a Professor with the Luleå University of Technology, Luleå, Sweden.
Prof. Vasilakos is serving as an Editor for several technical journals,
such as the IEEE Transactions on Network and Service Management, the
IEEE Transactions on Cloud Computing, the IEEE Transactions on
Information Forensics and Security, the IEEE Transactions on Cybernetics,
the IEEE Transactions on NanoBioscience, the IEEE Transactions on
Information Technology in Biomedicine, ACM Transactions on Autonomous
and Adaptive Systems, and the IEEE Journal on Selected Areas in
Communications. He is the General Chair of the European Alliances for
Innovation.