Article info

Article history:
Received 7 February 2011
Received in revised form 16 October 2011
Accepted 1 December 2011
Available online 14 December 2011

Keywords:
Wireless sensor and actor network
Fault recovery
Connectivity restoration
Node relocation

Abstract

In wireless sensor and actor networks maintaining inter-actor connectivity is very important in mission-critical applications where actors have to quickly plan optimal coordinated response to detected events. Failure of one or multiple actors may partition the inter-actor network into disjoint segments, and thus hinders the network operation. Autonomous detection and rapid recovery procedures are highly desirable in such a case. This paper presents DCR, a novel distributed partitioning detection and connectivity restoration algorithm to tolerate the failure of actors. DCR proactively identifies actors that are critical to the network connectivity based on local topological information, and designates appropriate, preferably non-critical, backup nodes. Upon failure detection, the backup actor initiates a recovery process that may involve coordinated relocation of multiple actors. We also present an extended version of DCR, named RAM, to handle one possible case of a multi-actor failure. The proposed algorithms strive to avoid procrastination, localize the scope of recovery and minimize the movement overhead. Simulation results validate the performance of the proposed algorithms.

© 2011 Elsevier Ltd. All rights reserved.

1084-8045/$ - see front matter © 2011 Elsevier Ltd. All rights reserved.
doi:10.1016/j.jnca.2011.12.002
M. Imran et al. / Journal of Network and Computer Applications 35 (2012) 844–856 845
In order to tolerate critical node failure, three methodologies can be identified: (i) proactive, (ii) reactive and (iii) hybrid. Proactive approaches establish and maintain a bi-connected topology in order to provide fault tolerance. This necessitates a large actor count, which leads to higher cost and becomes impractical. On the other hand, in reactive approaches the network responds only when a failure occurs. Therefore, reactive approaches might not be suitable for time-critical applications. In hybrid approaches, pre-failure planning is pursued in order to increase the efficiency of the recovery. We argue that a hybrid approach better suits autonomous WSANs that are deployed for time-critical applications due to the reduced recovery time and overhead. DCR uses a localized scheme to identify critical actors and designate backups for them. The backup actor detects the failure of the primary and pursues node relocation to repair the partitioned network topology. DCR considers one failure at a time and assumes that no other node fails during the recovery. We extend DCR to handle one possible scenario of multiple simultaneous actor failures, which is discussed later in Section 4.3.

3. Related work

The issue of fault tolerance in different WSAN contexts has been addressed in only a few studies. For instance, the fault-tolerant model presented in Ozaki et al. (2006) designates multiple actors to each sensor and multiple sensors to each actor in order to ensure guaranteed event notification even in case of either failure or inaccessibility. However, our fault-tolerant model is in the context of maintaining inter-actor connectivity rather than reliable event notification delivery. On the other hand, some research has also exploited node mobility as a means for performance optimization both in sensor networks and WSANs. For example, the movement of the base-station is employed in Akkaya et al. (2005) to increase sensor lifetime and throughput while minimizing latency. However, exploiting node mobility to mend severed topologies has just recently started to attract attention. The reader is referred to Younis and Akkaya (2008) for a comprehensive survey on node relocation strategies.

The existing work on using node mobility to recover from a failure can be categorized into block and cascaded movement. Block movement often requires a high pre-failure connectivity in order for the nodes to coordinate their response. An example of block movement based approaches is reported in Basu and Redi (2004), where the initial network is assumed to be 2-connected and the goal is to sustain such 2-connectivity even under link or node failure. The idea of movement of robots is similar to ours, but their approach is centralized in nature and does not fit autonomous WSANs. Das et al. studied a similar problem and presented a distributed approach to restore 2-connectivity in Orozco-Barbosa et al. (2007). Unlike Basu and Redi (2004) and Orozco-Barbosa et al. (2007), our algorithm focuses on providing 1-connectivity.

Block movement often becomes infeasible in the absence of a higher level of connectivity. Therefore, a few researchers have pursued cascaded node movement (Guiling et al., 2005) or shifted relocation (Li et al., 2007). The idea is to gradually replace intermediate nodes on the path instead of moving a node for a long distance. Although the idea of cascaded movement is similar to DCR and RAM, the prime objective of Guiling et al. (2005) is to mitigate holes in coverage introduced due to failure of sensors. Our objective is to restore inter-actor connectivity.

Strategies adopting cascaded relocations can be further categorized based on the network state information that nodes are assumed to maintain. Some approaches, like DARA (Abbasi et al., 2007) and PADRA (Akkaya et al., 2008), require each actor to maintain a list of its 2-hop neighbors. Others, such as RIM (Younis et al., 2010), C3R (Tamboli and Younis, 2009) and VCR (Imran et al., 2010), avoid the increased overhead of tracking 2-hop neighbors and require each node to maintain only its directly reachable nodes, i.e. 1-hop neighbors. Like our proposed DCR algorithm, DARA (Abbasi et al., 2007) strives to restore connectivity lost due to the failure of a cut-vertex. However, DARA requires more network state in order to ensure convergence. Meanwhile, in PADRA (Akkaya et al., 2008), Akkaya et al. identify a connected dominating set (CDS) of the whole network in order to detect cut-vertices. Since the CDS based method is not accurate for critical node detection, they perform a depth-first search (DFS) on each member of the CDS to confirm whether the node is really a cut-vertex. Although they use a distributed algorithm, their solution still requires 2-hop neighbor information, which increases messaging overhead. Another work proposed in Azadeh (2009) also uses 2-hop information to detect cut-vertices. The proposed DCR algorithm relies only on 1-hop information and reduces the communication overhead.

Although RIM (Younis et al., 2010), C3R (Tamboli and Younis, 2009) and VCR (Imran et al., 2010) use 1-hop neighbor information to restore connectivity, they are purely reactive and do not differentiate between critical and non-critical nodes. In contrast, DCR is a hybrid algorithm that proactively identifies critical nodes and designates appropriate backups for them. ACR (Imran et al., 2011) is a recently proposed hybrid algorithm that maintains 1-hop information and factors in application-level interests while restoring connectivity. However, ACR cannot handle simultaneous failure of multiple actors.

The very first work to handle multiple simultaneous node failures in the context of sensor networks was recently proposed in Lee and Younis (2010). The authors deploy additional relay nodes (RNs) to restore the overall connectivity using the least RN count. Unlike Lee and Younis (2010), our work relies on reconfiguring the existing topology instead of employing additional nodes. Akkaya et al. extended their work (Akkaya et al., 2008) by introducing a mutual exclusion mechanism called MPADRA (Akkaya et al., 2010) in order to handle multiple simultaneous failures in a localized manner. Our proposed approach differs from MPADRA in multiple aspects. First, MPADRA requires a mutual exclusion mechanism to avoid race conditions. Second, MPADRA reserves the nodes on the path in advance before actual relocation. On the other hand, RAM designates distinct backups and does not engage relocating nodes beforehand. Third, MPADRA maintains 2-hop network state information and requires primary and secondary failure handlers for each dominator. In contrast, our approach requires only 1-hop information and each critical node has only one backup to handle its failure. A variant of DCR was presented in Imran et al. (2010). This paper improves the backup selection criteria of Imran et al. (2010), provides detailed analysis and introduces a new mechanism to handle concurrent failure of multiple adjacent nodes.

4. Partition detection and connectivity restoration

As mentioned earlier, hybrid algorithms better suit time-sensitive applications that require a rapid recovery. The proposed DCR algorithm is hybrid in the sense that it consists of two parts, i.e. proactive and reactive. In the proactive part, critical actors are determined using a localized algorithm. Once critical nodes (primaries) are determined, they each select and designate an appropriate neighbor (backup) to handle their failure when such a contingency arises in the future. Each backup starts monitoring its primary through HEARTBEATS. In the reactive part, a backup initiates a recovery process when the primary fails. The backup replaces the primary and cascaded relocations are performed until the recovery is complete. The detailed algorithm is described in the balance of this section.

4.1. Identifying critical actors

4.2. Recovery from single critical actor failure

This subsection details the DCR algorithm that is designed to recover from the failure of a critical actor. The details of the algorithm are in the following subsection.
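The monitoring step outlined in Section 4, in which a backup declares its primary failed after successive HEARTBEAT messages are missed, can be sketched as follows. This is only an illustration under stated assumptions: the class name `HeartbeatMonitor`, the heartbeat period and the threshold of three misses are hypothetical, since the text does not specify them.

```python
from dataclasses import dataclass


@dataclass
class HeartbeatMonitor:
    """Tracks HEARTBEATs from one primary (all defaults are assumed values)."""

    period: float = 1.0     # assumed heartbeat interval, in seconds
    max_missed: int = 3     # assumed number of successive misses tolerated
    last_seen: float = 0.0  # time at which the last HEARTBEAT arrived

    def on_heartbeat(self, now: float) -> None:
        """Record the arrival of a HEARTBEAT from the primary."""
        self.last_seen = now

    def missed(self, now: float) -> int:
        """Number of whole heartbeat periods elapsed since the last HEARTBEAT."""
        return int((now - self.last_seen) // self.period)

    def primary_failed(self, now: float) -> bool:
        """True once the number of elapsed periods reaches the threshold."""
        return self.missed(now) >= self.max_missed
```

A backup would call `on_heartbeat` whenever a message arrives and poll `primary_failed` periodically; a true result is the point at which the reactive recovery would start.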
Fig. 3. A segment of an inter-actor network showing 1-hop positional: (a) critical and (b) non-critical actors.

Fig. 4. Critical actors designate their backup using DCR for network segment shown in Fig. 2: (a) backups start monitoring their primary and (b) B detect failure of primary F.
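Fig. 3 distinguishes critical from non-critical actors using 1-hop positional information. One plausible local test is sketched below; it is an assumption rather than the authors' procedure, because the body of Section 4.1 is not reproduced here. The function name `is_critical`, the coordinate representation and the rule "two neighbors are linked when their distance is at most r" are all hypothetical.

```python
import math
from typing import Dict, Tuple

Position = Tuple[float, float]


def is_critical(neighbors: Dict[str, Position], r: float) -> bool:
    """Return True if the 1-hop neighbors cannot reach each other without this actor.

    Assumed test: two neighbors are linked when their distance is at most r,
    and the actor is flagged critical when the neighbor-only graph is disconnected.
    """
    names = list(neighbors)
    if len(names) < 2:
        return False  # an actor with at most one neighbor cannot split the network
    # Depth-first search restricted to the 1-hop neighbors.
    seen = {names[0]}
    stack = [names[0]]
    while stack:
        u = stack.pop()
        for v in names:
            if v not in seen and math.dist(neighbors[u], neighbors[v]) <= r:
                seen.add(v)
                stack.append(v)
    return len(seen) < len(names)
```

Because only 1-hop information is used, a test of this kind is conservative: it can flag an actor whose neighbors are in fact connected through a longer path elsewhere in the network.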
optimal. Nonetheless, the local selection enables DCR to be applied in a distributed manner and to scale to large networks. Figure 4(a) shows the setup where critical actors appoint their backups. The arrowheads point towards the primary. Note that DCR does not require extra actors to serve as backups; it employs existing actors to take care of each other.

Failure detection: Once an actor receives a BACKUP notification, it starts monitoring the primary through HEARTBEAT messages. The failure of the primary is detected by the corresponding backup through successive missed HEARTBEATS. Figure 4(b) indicates that the backup node B detects the failure of primary F and triggers the recovery process, as detailed in the following section.

4.2.2. Recovery process

The reactive recovery process is initiated by the backup upon the detection of a primary failure. The scope of the recovery depends on the NAS. If the backup is a non-critical actor, it simply replaces the primary and the recovery is complete. However, if the backup is also a critical node, cascaded relocation is performed. Basically, the repositioning of actor Ai in response to the failure of Af will be interpreted by its backup Aj as if Ai is lost, and Aj will thus move to replace Ai. The recovery process consists of the following steps:

(a) Primary recovery: The backup actor immediately initiates a recovery process once it detects the failure of its primary. The scope of recovery depends on the position of the backup actor, which can be one of the following three scenarios. First, if the backup is a non-critical node, the scope of the recovery will be limited because it does not require further relocations. The backup actor moves to the position of the failed primary and exchanges heartbeat messages with its new neighbors. It selects and designates a new backup since it has become a critical node at the new position. This movement alerts the other primary nodes (if any) at the previous location to choose a new backup for themselves. An illustrative example is provided in Fig. 5, where non-critical backup B simply replaces its primary (i.e. F) and selects a backup for itself.

The second scenario is when the backup is also a critical node. In this case, the backup actor will notify its own backup so that the network stays connected. This scenario may trigger a series of cascaded repositioning of nodes, as explained below.

The third scenario is when the failed actor (primary) and its backup are both critical nodes and simultaneously serve as backup for each other. This scenario is illustrated in Fig. 6. Actor B detects the failure of F, as both are mutually serving as backup for each other, as shown in Fig. 6(a). Figure 6(b) shows that actor B selects another actor “A” as backup. Then B sends a movement notification message and moves to the position of F, as shown in Fig. 6(c). This movement triggers a series of cascaded relocations, as discussed below and shown in Fig. 6(d), with A replacing B and C replacing A.

(b) Cascaded relocation: As mentioned earlier, the position of the backup determines the scope of the recovery. In particular, the recovery process of the second scenario is repeated to handle the departure of a backup node. Basically, when the critical backup actor B moves to the location of the failed actor, it waits to receive heartbeat messages from its own backup BB. Once node B receives a heartbeat message from BB, it selects and designates a new backup based on the new neighborhood that it has joined. This process may be applied again by BB and so on until a non-critical backup replaces a primary. Figure 7(a) illustrates this scenario, where the backup actor is also critical and the recovery process continues in a cascaded manner. The failed actor B is replaced by another critical actor D (i.e. its backup). Figure 7(b) shows that moving critical actor D further partitions the network, so a cascaded relocation is triggered. The non-critical backup actor K replaces critical primary actor D and the connectivity is restored. Upon conclusion of the recovery, the backup designation will be updated to get the network ready for recovery in case a node fails in the future. The pseudo code of the DCR algorithm is presented in Appendix A.

Fig. 5. Recovery process when backup actor is non-critical.

Fig. 6. Applying the recovery process when two actors are simultaneously primary-backup of each other.

Fig. 7. Illustrating the recovery process when backup actor is critical: (a) the critical node D detects failure of primary B and (b) D replaces B and K replaces D.

A critical node can be chosen as a backup only if it is not already appointed by some other node. Moreover, two adjacent critical actors cannot serve each other as backup simultaneously. This will ensure that there will be some backup node to recover in case adjacent actors fail at the same time.

If a critical actor “A” picks a non-critical neighbor “B” as a backup, RAM requires “B” to also pick a backup “C” among its neighbors using the same criteria mentioned above. However, the status of node “B” is not changed to critical. This condition enables recovery when the primary and backup both fail at the same time. In addition, it prevents the scenario of Fig. 8.

Fig. 8. Illustrating the challenges in handling multiple simultaneous failures, where moving two non-critical nodes partitions the network.
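The backup-designation rules that RAM imposes (a critical backup serves one primary only, two adjacent actors never back each other up, and a non-critical backup of a critical actor picks a backup of its own) can be checked mechanically. The following is a minimal sketch under stated assumptions: the function name `ram_violations` and the dictionary `backup_of`, which maps each primary to its designated backup, are hypothetical, and neighborhood and distance criteria are ignored.

```python
from typing import Dict, List, Set


def ram_violations(backup_of: Dict[str, str], critical: Set[str]) -> List[str]:
    """List the designations that break the three RAM rules (hypothetical checker)."""
    problems: List[str] = []
    appointed_by: Dict[str, str] = {}
    for primary, backup in backup_of.items():
        # Rule 1: a critical node may be a backup only if no other node appointed it.
        if backup in critical and backup in appointed_by:
            problems.append(
                f"{backup} is critical and backs up both "
                f"{appointed_by[backup]} and {primary}"
            )
        appointed_by.setdefault(backup, primary)
        # Rule 2: two actors must not serve as backup for each other
        # (the ordering test reports each offending pair once).
        if backup_of.get(backup) == primary and primary < backup:
            problems.append(f"{primary} and {backup} back each other up")
        # Rule 3: a non-critical backup of a critical actor needs its own backup.
        if primary in critical and backup not in critical and backup not in backup_of:
            problems.append(
                f"non-critical backup {backup} of {primary} has no backup of its own"
            )
    return problems
```

For example, `ram_violations({"A": "B", "B": "C"}, {"A"})` returns an empty list, which matches the A, B, C designation described above.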
Fig. 10. Special case of failure detection and recovery: (a) A13 detect failure of A1 and A2; (b) A13 replaces A2 and appoints A9 as backup and (c) A13 moves to the place of A1, A9 and A7 follow it.
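The cascaded relocation used by DCR (Section 4.2.2), in which each departing backup is replaced by its own backup until a non-critical actor moves, can be sketched as a walk along the chain of backups. This is a simplified illustration, not the pseudo code of Appendix A: the names `cascade`, `backup_of` and `fallback` are hypothetical, positions and messages are not modeled, and `fallback` stands in for the re-selection of a backup in the third scenario (mutual primary-backup pair).

```python
from typing import Dict, List, Optional, Set, Tuple


def cascade(
    failed: str,
    backup_of: Dict[str, str],
    critical: Set[str],
    fallback: Optional[Dict[str, str]] = None,
) -> List[Tuple[str, str]]:
    """Return (mover, replaced) pairs triggered by the failure of `failed`."""
    fallback = fallback or {}
    moves: List[Tuple[str, str]] = []
    gone = {failed}            # actors that failed or already left their position
    vacated = failed
    mover = backup_of.get(vacated)
    while mover is not None and mover not in gone:
        moves.append((mover, vacated))
        gone.add(mover)
        if mover not in critical:
            break              # a non-critical backup ends the cascade
        vacated = mover
        nxt = backup_of.get(vacated)
        if nxt in gone:
            # The designated backup is unavailable, so use the re-selected one.
            nxt = fallback.get(vacated)
        mover = nxt
    return moves
```

With the designations of Fig. 7, `cascade("B", {"B": "D", "D": "K"}, {"B", "D"})` yields D replacing B and then K replacing D. With those of Fig. 6, passing `fallback={"B": "A"}` reproduces B replacing F, A replacing B and C replacing A.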
5. Algorithms analysis

In this section, we analyze the performance of DCR and RAM algorithms. We show that both algorithms converge and successfully

Theorem 4. The time it takes the DCR algorithm to converge while restoring inter-actor connectivity is proportional to N and r, where r is the communication range of actors and N is the number of actors.

Proof. Since DCR proactively (before failure) designates a backup for each critical actor, the maximum time it takes a backup to substitute a failed actor is proportional to r, as proved in Theorem 2. If moving a critical backup further triggers c relocations, the total recovery time will be proportional to (c + 1) × r because the relocations are sequential. Thus, in the worst case, the DCR convergence time to restore connectivity is (c + 1) × r, which does not exceed N × r.

Theorem 5. The total message complexity of DCR is O(N), where N is the number of actors in the WSAN.

Proof. Since DCR maintains only 1-hop neighbor information to restore inter-actor connectivity, this requires 1 message for each actor. Moreover, every critical node participating in the recovery has to send 1 movement notification message to its backup. DCR does not count the message exchange with neighbors at the new location and considers it a part of the regular status update for maintaining the 1-hop neighbor list. Thus, in the worst case, when all the (c − 1) critical nodes move, the total number of messages will be (N + c − 1). Therefore, DCR incurs a total message complexity of O(N).

5.2. RAM analysis

Proof. RAM strives to restore connectivity through multiple and independent invocations of DCR. Furthermore, when a series of critical nodes are engaged in a primary-backup relationship as a part of a ring, RAM handles this as an optimized implementation of multiple applications of DCR and would thus not introduce a new cut-vertex in the topology. Thus, based on Theorem 3, it can be concluded that RAM does not introduce new critical actors during the recovery process.

Theorem 9. The recovery process of RAM incurs a messaging overhead of O(N²), where N is the number of actors in a WSAN.

Proof. Like DCR, RAM also maintains 1-hop neighbor information to restore inter-actor connectivity; this requires 1 message for each actor. In addition, a node B that is picked as a backup by node A will need to inform its own backup C about A. In the worst case, when the topology is a ring, this will involve N more messages per node, i.e., a total of N². Furthermore, every critical actor involved in recovery has to send 1 movement notification message to the corresponding backup. If the number of adjacent primary actors that fail is f, RAM moves each critical node only once. Thus, in the worst case, when all (C − f) critical nodes move, the total number of messages will be (N² + N + C − f). Therefore, the messaging overhead in RAM is O(N²).
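The bounds stated in Theorems 4, 5 and 9 can be written out side by side. The symbols T and M for recovery time and message count are introduced here only for illustration and do not appear in the original; the inequalities assume that neither the number of relocations nor the number of critical nodes can exceed the number of actors N.

```latex
\begin{align*}
T_{\mathrm{DCR}} &\propto (c+1)\,r \;\le\; N\,r,
  && \text{since } c+1 \le N,\\
M_{\mathrm{DCR}} &= N + (c-1) \;\le\; 2N - 1 = O(N),
  && \text{since } c \le N,\\
M_{\mathrm{RAM}} &= N^{2} + N + (C-f) \;\le\; N^{2} + 2N = O(N^{2}),
  && \text{since } C - f \le N.
\end{align*}
```

As a purely numerical illustration, with N = 60 actors and c = 3, the DCR message count from the proof of Theorem 5 is 60 + 3 − 1 = 62.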
We compare the performance of DCR to that of DARA (Abbasi et al., 2007) and RIM (Younis et al., 2010). Like DCR, DARA and RIM are distributed algorithms and exploit node relocation to recover from node failure. However, their procedures differ. When a node F fails, DARA selects the best candidate A among the 1-hop neighbors of F to replace it. The algorithm is recursively applied to tolerate connectivity loss due to movement, i.e., A will be replaced with one of its neighbors and so on. On the other hand, RIM moves all the 1-hop neighbors towards F until they become connected. Like DARA, RIM is applied recursively to re-establish links affected by node movement. Both DARA and RIM are reactive approaches and do not provision for recovery ahead of time.

6.2. Performance evaluation of DCR

The experiments involve randomly generated topologies with varying actor counts and communication ranges. The number of actors has been set to 20, 40, 60, 80 and 100. The communication range of actors is changed among 50, 100, 150 and 200 m. When changing the node count, r is fixed at 100 m, and N is set to 60 while varying the communication range. The results of the individual experiments are averaged over 30 trials. All results are subject to 90% confidence interval analysis and stay within 10% of the sample mean.

Total distance moved: Fig. 12 shows the distance traveled by all nodes until the connectivity is restored. DCR significantly outperforms both DARA and RIM because it strives to move only non-critical nodes in order to avoid cascaded relocations. As both graphs in the figure indicate, the performance advantage of DCR remains almost consistent even with higher node densities and longer transmission ranges. This is because DCR strives to avoid moving critical nodes, which causes further partitioning and requires successive relocations. Furthermore, DCR performs cascaded relocations only when non-critical nodes are not available in the neighborhood of a failed actor. Even then, DCR strives to limit the scope of the relocations by moving critical actors that have high node degree, since they often have non-critical nodes in the neighborhood. Figure 12(a) indicates that the performance of DCR scales very well and is not affected by the node density because of choosing non-critical nodes as backups. A similar observation can be made for the communication range (Fig. 12(b)), where the connectivity-restoration overhead is significantly less than that of the baseline approaches.

Number of moved nodes: Fig. 13 shows the number of nodes that were involved in the recovery when DCR and the baseline approaches are applied. The performance graphs confirm the advantage of DCR, which moves fewer actors than RIM and DARA. This is because DCR limits the scope of the recovery and avoids successive cascaded relocations by choosing non-critical nodes as backups. Moreover, DCR moves high-degree critical nodes that often have non-critical nodes in the neighborhood. Furthermore, the performance of DCR remains almost constant while varying the number of nodes and their radio range, which indicates great scalability.

Number of exchanged messages: Fig. 14 reports the messaging overhead as a function of the network size and radio range. As the figure indicates, DCR incurs far less messaging overhead than DARA and RIM. This is because DCR limits the message exchange to a pair of primary and backup nodes instead of all 1-hop and 2-hop neighbors, as is the case in RIM and DARA, respectively. Moreover, unlike DARA and RIM, DCR strives to involve non-critical nodes in the recovery, which limits the need for cascaded relocation and thus reduces the number of notification messages. Furthermore, DCR limits the scope of the recovery by involving high-degree nodes that have non-critical nodes in the neighborhood. The average numbers of notification messages sent by DCR in Fig. 14(a) and (b) are 0.31–0.45 and 0–0.57, respectively. On the other hand, Fig. 14 indicates that the messaging overhead in RIM grows significantly for high actor densities and long communication ranges because the number of recovery participants increases in both cases.

Fig. 12. Distance traveled by all nodes during the recovery until restoring connectivity, as a function of N in (a) and r in (b).

Fig. 13. The number of nodes moved during the recovery, while varying the network size (a) and radio range (b).
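The statistical treatment stated in Section 6.2 (averaging over 30 trials, with a 90% confidence interval that stays within 10% of the sample mean) could be checked along the following lines. This is a sketch under assumptions: the text does not say which interval was used, so a Student-t interval is assumed here, and the function names `mean_with_ci` and `within_fraction` are hypothetical.

```python
import math
import statistics
from typing import Sequence, Tuple

# Two-sided 90% Student-t critical value for 29 degrees of freedom,
# i.e. valid only for the 30-trial setup (assumed choice of interval).
T_90_DF29 = 1.699


def mean_with_ci(samples: Sequence[float], t_crit: float = T_90_DF29) -> Tuple[float, float]:
    """Return the sample mean and the half-width of its confidence interval."""
    mean = statistics.fmean(samples)
    half_width = t_crit * statistics.stdev(samples) / math.sqrt(len(samples))
    return mean, half_width


def within_fraction(samples: Sequence[float], fraction: float = 0.10) -> bool:
    """True if the interval half-width is at most `fraction` of the sample mean."""
    mean, half_width = mean_with_ci(samples)
    return half_width <= fraction * abs(mean)
```

A per-configuration metric, such as the total distance moved over the 30 random topologies, would be passed in as `samples`; a different trial count requires a different critical value.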
Fig. 14. Effect of changing N (a) and r (b) on total number of messages exchanged by all nodes during the recovery.

Percentage of coverage reduction: Fig. 15 shows the impact on coverage, measured in terms of percentage of coverage reduction relative to the pre-failure level, while changing N and r. The action range is set to 50 m in these experiments. Overall, DCR limits the coverage loss and consistently outperforms the baseline approaches. Although increasing the node density helps, DARA and RIM still do not make up for the coverage loss and do not match DCR’s performance. The advantage of DCR in terms of coverage is due to the limited scope of node relocation, since relocation causes a coverage loss at the network periphery. Moreover, DCR engages strongly connected nodes in recovery that have more coverage overlap with neighbors. Hence, moving those actors only reduces the overlapped coverage. Figure 15(b) indicates that the performance of DCR in terms of coverage reduction is not much affected by increasing the communication range. On the other hand, the performance of RIM worsens significantly as the communication range grows. With the increased value of r, the network becomes more connected and the number of neighbors of F grows. RIM moves nodes inwards, making the area around F more crowded while leaving uncovered parts at the network periphery, and thus causes a significant loss of coverage.

Fig. 15. Coverage reduction after recovery, as a function of N in (a) and r in (b).

Average node degree: Fig. 16 shows the level of connectivity maintained by all approaches after the recovery is completed. As both figures indicate, DCR consistently maintains the same level of connectivity as the other approaches, despite the fact that DCR does not factor in connectivity as the baseline approaches do. This is due to moving high-degree non-critical nodes and limiting the scope of the relocation. Figures 12–15 confirm that DCR strikes a balance between the various objectives.

Fig. 16. Level of inter-actor connectivity after recovery, as a function of N in (a) and r in (b).

We use the same simulation setup to evaluate the performance of RAM. We identify critical actors and choose two adjacent cut-vertices at random to fail simultaneously. For RAM-I, the failed nodes have backups independent of each other, and thus it is like running DCR twice. As we have seen in the previous section, DCR significantly outperforms contemporary schemes found in the literature; therefore, the validation of RAM-I is based on DCR. The RAM-A curve reflects the results when one of the designated backups also fails along with its adjacent primary, as illustrated above in the special case. For RAM-I and RAM-A, we have performed experiments with 15 different topologies. The goal of comparing the performance of RAM-I and RAM-A is to capture the effect of failure scenarios for which a node C has to deal with the failure of its primary B as well as node A that B serves as a backup.

Total distance moved: Fig. 17 shows the total distance moved by all the nodes involved in the recovery. Both graphs show that RAM-A moves nodes a slightly longer distance than RAM-I. This is due to engaging additional nodes to recover from the failure of an adjacent node. Moreover, RAM-I has independent pre-designated backups for the failed actors that do not have to travel additional distance to recover from the failure of the grand primary. Figure 17(a) indicates that the performance of both algorithms improves with the increased actor density. Increasing the number of actors boosts the level of connectivity and consequently boosts the number of non-critical nodes. The availability of non-critical nodes reduces the scope of cascaded relocations. On the other hand, Fig. 17(b) shows that the travel distance grows with the increase in the transmission range. This is because nodes have to travel longer distances in order to restore connectivity.

Fig. 17. Distance traveled by all nodes during the recovery until restoring connectivity, as a function of N in (a) and r in (b).

Number of moved nodes: Fig. 18 reports on the number of nodes that get involved in the recovery. Again, both performance graphs indicate that RAM-I marginally outperforms RAM-A in terms of the scope of recovery. This is because independent execution of RAM offers more non-critical nodes on the recovery paths, which prevents unnecessary cascaded relocations. The performance of both variants of RAM improves with the increased actor density and the longer transmission range due to the increased degree of connectivity and the availability of non-critical nodes in the neighborhood. This limits the scope of recovery.

Fig. 18. Distance traveled by all nodes during the recovery until restoring connectivity, as a function of N in (a) and r in (b).

Fig. 19. Distance traveled by all nodes during the recovery until restoring connectivity, as a function of N in (a) and r in (b).

Number of messages exchanged: Fig. 19 shows the messaging overhead as a function of the network size and radio range. As expected, RAM-A incurs slightly more messaging overhead than RAM-I. The obvious reason is that RAM-A sends extra recovery coordination messages. Both figures suggest that the messaging overhead decreases with higher density and longer radio range since the network connectivity improves in both cases. This
Acknowledgment
Figure A1 shows the high level pseudo code of DCR which will
run on each actor ‘‘A’’ in a distributed manner. If an actor A is
critical, it will select an appropriate backup actor using the
AssignBackup() procedure (lines 1–3). While serving as backup
to node F, if actor A either detects the failure of F or receives a
Fig. B1. Pseudocode of RAM for backup selection and failure recovery.
movement notification message from F, it initiates a recovery
process (lines 4–6).
A critical actor A finds an appropriate backup among the backup or its pre-designated backup (lines 9–13). Now actor A can
neighbors. The AssignBackup() procedure preferably designate a move to replace F (line 14).
non-critical neighbor (either leaf or with highest degree) as
backup. In case non-critical node is not available, it chooses a
critical actor with highest degree and least distance to A (line 7). Appendix B Detailed Pseudo-Code for Ram
The recovery procedure is executed on backup actor A, if it either
detects the failure of primary F or receives a message from F. Figure B1 shows the pseudo code of RAM that each actor ‘‘B’’
While executing the recovery procedure, A checks whether it is would execute. The pre-failure steps resemble DCR. During the
critical or not (line 8). If it is critical, it checks the status of its network bootstrapping phase, each actor (either critical or
backup BackupStatus() before going to move. If the backup of A engaged as backup) will appoint an appropriate backup among
has failed, it selects another node as backup. It then sends a neighbor actors using the AppointDistinctBackup() procedure
movement notification message to inform the newly assigned (lines 1–3). If actor B either detects the failure of primary F or
856 M. Imran et al. / Journal of Network and Computer Applications 35 (2012) 844–856
receives a movement notification message from F, node B triggers a recovery procedure FailureRecovery() to recover from F (lines 4–6).

The AppointDistinctBackup() procedure is slightly different from its counterpart "AssignBackup" in DCR. The AppointDistinctBackup() procedure ensures that the picked backup node does not serve another primary and bases the selection on the criteria mentioned in Section-IV (C) (line 7). The procedure FailureRecovery() also differs from "Recovery" in DCR, since in RAM two adjacent actors are not allowed to choose each other as backup at the same time. If the backup B is a critical actor, it notifies its own backup so that connectivity can be maintained (lines 8–10). Since backup B is aware of the status of the failed primary F, it checks whether the failed primary was critical or not. If the failed node F was critical, B moves to replace F (lines 11–13). Otherwise, there is no need to replace F since it was non-critical. In other words, B will directly move to the location of grand primary G, as shown in Fig. 11 and discussed below.

If the backup node B also detects the failure of its grand primary G (i.e. the primary of its primary), then B executes the recovery procedure FailureRecovery() to recover from the grand primary, as mentioned in Figs. 10 and 11 earlier (lines 14–16).
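
The RAM steps described in this appendix can be illustrated with a short Python sketch. It is not the pseudo code of Figure B1. The `Actor` class and its fields, the returned event strings, and the ranking used in `appoint_distinct_backup` (non-critical neighbors first, then the nearest) are assumptions made for illustration, with the ranking standing in for the selection criteria of Section-IV (C). Failure detection and message passing are reduced to reading an `alive` flag.

```python
from dataclasses import dataclass, field
from math import dist
from typing import Dict, List, Optional, Tuple


@dataclass
class Actor:
    # Hypothetical per-actor state; the field names are illustrative only.
    name: str
    pos: Tuple[float, float]
    critical: bool = False
    alive: bool = True
    neighbors: List[str] = field(default_factory=list)
    primary: Optional[str] = None  # the actor this node serves as backup for
    backup: Optional[str] = None   # the actor appointed to back this node up


Network = Dict[str, Actor]


def appoint_distinct_backup(net: Network, name: str) -> Optional[str]:
    """Appoint a neighbor as backup that does not already serve another primary."""
    actor = net[name]
    candidates = [
        net[n] for n in actor.neighbors
        if net[n].alive
        and net[n].primary is None   # must not already back up someone else
        and actor.primary != n       # adjacent actors may not back each other up
    ]
    if not candidates:
        return None
    # Assumed ranking: prefer non-critical neighbors, then the closest one.
    chosen = min(candidates, key=lambda c: (c.critical, dist(c.pos, actor.pos)))
    actor.backup = chosen.name
    chosen.primary = name
    return chosen.name


def failure_recovery(net: Network, backup_name: str) -> List[str]:
    """Run on backup B once the failure of its primary F is detected."""
    b = net[backup_name]
    events: List[str] = []
    if b.primary is None or net[b.primary].alive:
        return events  # nothing to recover from
    f = net[b.primary]
    # A critical backup first tells its own backup that it is about to move.
    if b.critical and b.backup is not None:
        events.append(f"{b.name} notifies backup {b.backup}")
    # Only a critical primary needs to be replaced.
    if f.critical:
        b.pos = f.pos
        events.append(f"{b.name} replaces {f.name}")
    # If the grand primary (primary of the primary) has also failed,
    # B moves on to the grand primary's location.
    g = net.get(f.primary) if f.primary is not None else None
    if g is not None and not g.alive:
        b.pos = g.pos
        events.append(f"{b.name} replaces grand primary {g.name}")
    return events
```

In a three-actor chain G, F, B in which F backs up G and B backs up F, marking both G and F as failed and calling `failure_recovery(net, "B")` moves B to the position of G under these assumptions.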
Proceedings of the 8th IEEE/IFIP international conference on embedded and
ubiquitous computing (EUC 2010), Hong Kong, China; December 2010.
References Li, N Xu S, Stojmenovic, Ivan. Mesh-based sensor relocation for coverage main-
tenance in mobile sensor networks. In: Proceedings of the 4th international
conference on ubiquitous intelligence and computing (UIC 2007), Hong Kong,
Akyildiz IF, Kasimoglu IH. Wireless sensor and actor networks: research chal- China; July 2007.
lenges. Ad Hoc Networks 2004;2:351–67. Lee S, Younis M. Recovery from multiple simultaneous failures in wireless sensor
Abbasi, AA, Akkaya, K, Younis, M. A distributed connectivity restoration algorithm networks using minimum Steiner tree. The Journal Parallel and Distributed
in wireless sensor and actor networks. In: Proceedings of the 32nd IEEE Computing 2010;70(5):525–36.
conference on local computer networks (LCN 2007), Dublin, Ireland; October MilenkoJorgić, IS, Hauspie, Michaël, Simplot-ryl, David. Localized algorithms for
2007. detection of critical nodes and links for connectivity in ad hoc networks. In:
Akkaya, K, Younis, M. COLA: a coverage and latency aware actor placement for Proceedings of the 3rd annual IFIP mediterranean ad hoc networking work-
wireless sensor and actor networks. In Proceedings of the 64th IEEE vehicular shop, Med-Hoc-Net, Bodrum, Turkey; June 2004.
technology conference (VTC-Fall’ 06), Montreal, Canada; September 2006. Ozaki, K, Watanabe, K, Itaya, S, Hayashibara, N, Enokido, T, and Takizawa, M, A
Akkaya K, Younis M, Bangad M. Sink repositioning for enhanced performance in fault-tolerant model for wireless sensor-actor system. In: Proceedings of the
wireless sensor networks,. Computer Networks 2005;49:512–34. 20th international conference on advanced information networking and
Akkaya, K, Thimmapuram, A, Senel, F, Uludag, S. Distributed recovery of actor applications (AINA 2006), Vienna, Austria; April 2006.
failures in wireless sensor and actor networks. In Proceedings of the IEEE Orozco-Barbosa L, Olivares T, Casado R, Bermúdez A, Das S, Liu H, Kamath A, Nayak
wireless communications and networking conference (WCNC 2008), Las A, Stojmenović I. Localized movement control for fault tolerance of mobile
Vegas, NV; March 2008. robot networks. Wireless Sensor and Actor Networks 2007;248:1–12. Springer
Azadeh, Z. A hybrid approach to actor actor connectivity restoration in wireless Boston.
sensor and actor networks. In: Proceedings of the 8th IEEE international Tamboli, N and Younis, M. Coverage-aware connectivity restoration in mobile
conference on networks (ICN 2009), Cancun, Mexico; March 2009. sensor networks. In: Proceedings of the IEEE international conference on
Akkaya K, Senel F, Thimmapuram A, Uludag S. Distributed recovery from network communications (ICC 2009), Dresden, Germany; June 2009.
partitioning in movable sensor/actor networks via controlled mobility. IEEE Younis M, Lee S, Abbasi AA. A localized algorithm for restoring inter-node
Transactions on Computers 2010;59(2):258–71. connectivity in networks of moveable sensors. IEEE Transactions on Compu-
Batalin, MA, Sukhatme, GS. The analysis of an efficient algorithm for robot ters 2010;99(12).
coverage and exploration based on sensor network deployment. In: Proceed- Youssef, A, Ashok, A, Younis, M. Accurate anchor-free node localization in wireless
ings of the 2005 IEEE international conference on robotics and automation sensor networks. In: Proceedings of the 24th IEEE international performance,
(ICRA 2005), Barcelona, Spain; April, 2005. computing, and communications conference (IPCCC 2005). Phoenix, AZ; April
Bulusu N, Heidemann J, Estrin D. GPS-less low-cost outdoor localization for very 2005.
small devices. IEEE Personal Communications 2000;7(5):28–34. Younis M, Akkaya K. Strategies and techniques for node placement in wireless
Basu P, Redi J. Movement control algorithms for realization of fault-tolerant ad hoc sensor networks: a survey. The Journal of Ad-Hoc Networks 2008;6(4):
robot networks. IEEE Networks 2004;18(4):36–44. 621–55.