
International Journal of

Science and Engineering Investigations

vol. 1, issue 1, February 2012

The Branch & Bound Algorithm Improvement in Divisible Load Scheduling with Result Collection on Heterogeneous Systems by New Heuristic Function
Farzad Norouzi Fard1, Sasan Mohammadi2, Peyman Parvizi3
1,2,3 Islamic Azad University, South Tehran Branch
(1 st_f_norouzifard@azad.ac.ir, 2 s_mohammadi@azad.ac.ir, 3 st_p_parvizi@azad.ac.ir)

Abstract- In this paper we propose a new heuristic function for the Branch & Bound algorithm that increases its efficiency. Divisible loads represent computations that can be arbitrarily divided into parts and processed independently in parallel. The scheduling problem consists in distributing the load in a heterogeneous system, taking into account communication and computation times, so that the whole processing time is as short as possible. Since this scheduling problem is computationally hard, we propose a Branch & Bound algorithm. Simulation and comparison of the results show that this approach produces better answers than the other methods; that is, the Branch & Bound algorithm achieves a lower total average relative error percentage than the other heuristic methods considered.

Keywords- divisible load scheduling; heterogeneous system; Branch & Bound algorithm.

I. INTRODUCTION

Divisible loads form a special class of parallelizable applications which, if given a large enough volume, can be arbitrarily partitioned into any number of independently and identically processable load fractions. Divisible load theory (DLT) is the mathematical framework that has been established to study divisible load scheduling (DLS) [1, 2]. Scheduling work on heterogeneous systems is particularly important because the processors must be used efficiently and the scheduling algorithm itself must run in little time. In this paper we study divisible load scheduling with result collection on heterogeneous systems with a star network. In a star-connected network, where the center of the star acts as the master and holds the entire load to be distributed and the points of the star form the set of slave processors, the basic principle of DLT for determining an optimal schedule is the AFS (All nodes Finish Simultaneously) policy [3]. In a heterogeneous system, processor efficiency, communication network topology, and the speed of network links can all differ. Scheduling work in a heterogeneous system is computationally hard. One of the computation models is the divisible load model.

The divisible load model originated in the late 1980s [4, 5]. Surveys of divisible load theory, including applications, can be found in [1, 6]. DLT has proved to be a valuable tool for modeling the processing of big volumes of data [7, 8], including image processing [9], signal processing, data mining and database research [10], linear algebra computations [11], and multimedia functions [12]. Distributing the load causes inevitable communication delays. To shorten them, the load may be sent to processors in small chunks rather than in one long message, so that computations start earlier. Such multi-installment or multi-round divisible load processing was first proposed in [13]. Memory limitations for single-installment communications were studied in [14], where a fast heuristic was proposed. In [15] it was shown that this problem is NP-hard if a fixed startup time is required to initiate communications.

In this theory we use the master-slave model. The load is located on the master. The master computer divides the divisible load among the slaves; once a slave computer has received its entire load fraction, it starts processing. After finishing processing, each slave computer reports its result to the master. The problem consists in finding a communication sequence, that is, the schedule of communications from the originator to the workers, and the sizes of the transferred load pieces, so that the total response time is minimized. No algorithm with polynomial time complexity has yet been presented that produces the optimal answer in all cases, but existing heuristic methods include LifoC, FifoC [16, 17], ITERLP [18], Sport [19, 20], GA [21], and Branch & Bound LifoC [23]. Our aim is to propose a Branch and Bound algorithm for solving divisible load scheduling with result collection on heterogeneous systems.

The rest of this paper is organized as follows. In Section 2 the problem is formulated. Section 3 describes the Branch and Bound algorithm for solving the DLS problem. The results are presented in Section 4. The last section is dedicated to conclusions.


II. SYSTEM MODEL AND PROBLEM DEFINITION

The network model considered here consists of (M + 1) processors interconnected through M links in a single-level tree fashion, as shown in Fig. 1.

Fig. 1 A heterogeneous star network

In this paper we assume a star interconnection. A set of $m = M$ working processors $\{p_1, \ldots, p_m\}$ is connected to a central server $p_0$ called the master. A processor is a unit comprising a CPU, memory, and a hardware network interface. $E = \{E_1, \ldots, E_m\}$ is the set of computation parameters of the slave computers, and $C = \{C_1, \ldots, C_m\}$ is the set of communication parameters of the network links. $E_k$ is the reciprocal of the speed of processor $p_k$, and $C_k$ is the reciprocal of the bandwidth of link $l_k$. In this model, L is the whole divisible load that resides on the master computer. Without loss of generality, we suppose that L = 1. The source $p_0$ splits L into m parts and sends them to the respective processors for computation. Each such set of m parts is known as a load distribution $\alpha = \{\alpha_1, \ldots, \alpha_m\}$.

In principle the CPU and network interface can work in parallel, so that simultaneous communication and computation is possible; the model used here, however, assumes that all processors follow a single-port, no-overlap communication model, meaning that a processor can communicate with only one other processor at a time and that communication and computation cannot occur simultaneously. If the allocated load fraction is $\alpha_k$, then the returned result is equal to $\delta\alpha_k$, where $0 \le \delta \le 1$. The constant $\delta$ is application specific and is the same for all processors for a particular load L. For a load part $\alpha_k$, $\alpha_k C_k$ is the transmission time from $p_0$ to $p_k$, $\alpha_k E_k$ is the time it takes to perform the requisite processing on $\alpha_k$, and $\delta\alpha_k C_k$ is the time it takes to transmit the results back to $p_0$.

$\sigma_a$ and $\sigma_c$ are two permutations of order m that represent the allocation and collection sequences respectively; $\sigma_a[k]$ and $\sigma_c[k]$ denote the processor number that occurs at index $k \in \{1, \ldots, m\}$. $\sigma_a(k)$ and $\sigma_c(k)$ are two lookup functions that return the index of processor k in the allocation and collection sequences. The purpose of scheduling is to find the sequence pair $(\sigma_a, \sigma_c)$ and the load distribution $\alpha$ that minimize the total processing time T. The total processing time runs from the start of load distribution until the master has received the last result. The result collection phase of a processor begins only after its entire load fraction has been processed and is ready for transmission back to the source. This is known as a block-based system model, since each phase forms a block on the time line (Fig. 2).

Fig. 2 Schedule for M=3

Once $\sigma_a$ and $\sigma_c$ are determined, the load distribution can be found with the following linear program:

Minimize $T$ subject to

$$\sum_{j=1}^{\sigma_a(k)} \alpha_{\sigma_a[j]} C_{\sigma_a[j]} + \alpha_k E_k + \sum_{j=\sigma_c(k)}^{m} \delta\,\alpha_{\sigma_c[j]} C_{\sigma_c[j]} \le T, \quad k = 1, \ldots, m \qquad (1)$$

$$\sum_{j=1}^{m} \alpha_{\sigma_a[j]} C_{\sigma_a[j]} + \sum_{j=1}^{m} \delta\,\alpha_{\sigma_c[j]} C_{\sigma_c[j]} \le T \qquad (2)$$

$$\sum_{j=1}^{m} \alpha_j = 1 \qquad (3)$$

$$T \ge 0, \quad \alpha_k \ge 0, \quad k = 1, \ldots, m \qquad (4)$$

In the above formulation, for a pair $(\sigma_a, \sigma_c)$, (1) imposes the no-overlap constraint. The single-port communication model is enforced by (2). Constraint (3), known as the normalization equation, ensures that the entire load is distributed among the processors. The non-negativity of the decision variables is ensured by constraint (4) [22]. The Branch and Bound algorithm is used to find $\sigma_a$, $\sigma_c$, and $\alpha$. There are $m!$ possible permutations each of $\sigma_a$ and $\sigma_c$, and the linear program has to be evaluated $(m!)^2$ times to determine the globally optimal solution.
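As a rough illustration of what constraints (1) and (2) measure, the sketch below computes the smallest total time compatible with a fixed load split and a fixed pair of sequences. It is a minimal sketch under stated assumptions, not the authors' implementation: the function names `makespan` and `best_pair_for_fixed_split`, the 0-based processor indices, and the exact form assumed for the constraints (distribution up to and including a processor, then its computation, then collection from that processor onward) are choices made for this example. It does not solve the linear program; it only evaluates a given load split.

```python
from itertools import permutations


def makespan(alpha, C, E, delta, sigma_a, sigma_c):
    """Smallest T allowed by the no-overlap and single-port constraints.

    alpha: load fractions, C/E: communication/computation parameters,
    delta: result-to-load ratio, sigma_a/sigma_c: allocation and
    collection sequences as sequences of 0-based processor indices.
    """
    pos_a = {p: i for i, p in enumerate(sigma_a)}
    pos_c = {p: i for i, p in enumerate(sigma_c)}
    worst = 0.0
    for k in range(len(alpha)):
        # Distribution to every processor up to and including k.
        send = sum(alpha[p] * C[p] for p in sigma_a[: pos_a[k] + 1])
        compute = alpha[k] * E[k]
        # Result collection from k and every processor after it.
        collect = sum(delta * alpha[p] * C[p] for p in sigma_c[pos_c[k]:])
        worst = max(worst, send + compute + collect)
    # Single port: all transfers in both directions share the master's link.
    port = (1 + delta) * sum(a * c for a, c in zip(alpha, C))
    return max(worst, port)


def best_pair_for_fixed_split(alpha, C, E, delta):
    """Try all (m!)^2 sequence pairs for one fixed load split (assumption:
    the split is not re-optimised per pair, unlike the linear program)."""
    orders = list(permutations(range(len(alpha))))
    pairs = [(a, c) for a in orders for c in orders]
    best = min(pairs, key=lambda ac: makespan(alpha, C, E, delta, *ac))
    return best, len(pairs)
```

The exhaustive loop makes the growth visible: for m = 5 it already covers (5!)^2 = 14400 sequence pairs, which is why a guided search over the permutation tree is attractive.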

III. BRANCH & BOUND ALGORITHM FOR SOLVING DLS PROBLEM

Branch and Bound is a method for traversing and exploring trees and graphs. The algorithm consists of the following steps: tree traversal, heuristic function evaluation, and branch pruning.

International Journal of Science and Engineering Investigations, Volume 1, Issue 1, February 2012 www.IJSEI.com

Paper ID: 10112-13

At the beginning the root node is selected; once the root is selected, its children are created. The heuristic function is then applied to all children and their results are compared. The child with the best result is selected, and this action is repeated until a result is found. Many answers can probably be found for the DLT problem; the Branch and Bound algorithm ends when the first answer is found. The Branch and Bound algorithm traverses the tree in BFS order and uses heuristic functions for pruning branches. Fig. 3 shows how nodes are extended.
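The node-selection loop described in this section can be sketched as follows. This is a simplified, greedy reading of the description (expand a node, score every child, keep only the best child, stop at the first complete sequence); it is not the authors' code. The names `greedy_descent` and `score` are hypothetical, and the scoring function is left to the caller because the heuristic itself is not specified here.

```python
def greedy_descent(m, score):
    """Build one processor sequence by always keeping the best child.

    m: number of slave processors (indexed 0..m-1).
    score: callable taking a partial sequence (list of indices) and
           returning a number; lower is treated as better (assumption).
    """
    sequence = []                 # root node: nothing scheduled yet
    remaining = list(range(m))
    while remaining:
        # Children of the current node: append each unscheduled processor.
        children = [sequence + [p] for p in remaining]
        # Keep the best-scoring child; its siblings are pruned.
        sequence = min(children, key=score)
        remaining.remove(sequence[-1])
    return sequence               # first complete answer ends the search
```

With m processors this scores m + (m - 1) + ... + 1 children in total, instead of visiting all m! complete sequences.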

As displayed in Fig. 4, with 4 slave computers the Branch and Bound Copt algorithm has the lowest average relative error percentage for most parameter values. Considering that the running time of the Branch and Bound algorithm is also low, we can introduce it as the best algorithm.

Fig. 3 extending node in Branch and bound

In our proposed algorithm (Branch and Bound Copt), first the selected processor and its parent are placed in the allocation list; then the remaining slaves are placed in the allocation list in order of best C (bandwidth); after that, the heuristic function is called with this data.
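One way to read this construction step is sketched below: the processors already fixed by the current node stay at the front of the allocation list, and the remaining slaves are appended in order of their communication parameter. This is an assumption-laden sketch: the helper name `complete_allocation` is hypothetical, and "best C" is interpreted here as the smallest C, since C is the reciprocal of link bandwidth.

```python
def complete_allocation(fixed, C):
    """Extend a partial allocation list with the remaining processors.

    fixed: processor indices already chosen (e.g. parent, then child).
    C: communication parameter per processor; smaller means faster link.
    """
    chosen = set(fixed)
    rest = [p for p in range(len(C)) if p not in chosen]
    # Stable sort: ties keep their original index order.
    rest.sort(key=lambda p: C[p])
    return list(fixed) + rest
```

The completed list is what would then be handed to the heuristic function for evaluation.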

IV. COMPUTATIONAL EXPERIMENTS

Fig. 5 Average of relative error percentage for m=4, 5

In the experiments, we compared the efficiency of the Branch and Bound algorithm with the Sport, LifoC, and Genetic algorithms. We performed our tests on an AMD Athlon Dual 3.0 GHz machine with 2 GB of RAM in the MATLAB environment. To represent a heterogeneous system we consider 25 different cases of C and E. For each of the 25 cases, the m values of C and E were produced randomly. In all tests, we calculated the processing time for each algorithm. If $T_{opt}$ denotes the processing time of the optimal algorithm and $T_{alg}$ the processing time of another algorithm, the percentage of relative error $\Delta T$ is calculated as in (5):

$$\Delta T = \frac{T_{alg} - T_{opt}}{T_{opt}} \times 100 \qquad (5)$$

Since the Branch and Bound Copt algorithm, the Branch and Bound LifoC algorithm, and the Genetic algorithm are more efficient than the other two, we compare them in Fig. 5.

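Since we produce 25 different cases of heterogeneous systems, the average relative error percentage is calculated as in (6), where $\Delta T_i$ is the relative error percentage of case i:

$$\overline{\Delta T} = \frac{\sum_{i=1}^{25} \Delta T_i}{25} \qquad (6)$$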
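Formulas (5) and (6) amount to a few lines of arithmetic. The sketch below assumes the usual definition of relative error with respect to the optimal processing time; the function names are hypothetical and chosen only for illustration.

```python
def relative_error_percent(t_alg, t_opt):
    """Relative error of one algorithm against the optimal time, in percent."""
    return (t_alg - t_opt) / t_opt * 100.0


def average_relative_error(t_alg_cases, t_opt_cases):
    """Mean relative error percentage over paired test cases."""
    errors = [relative_error_percent(a, o)
              for a, o in zip(t_alg_cases, t_opt_cases)]
    return sum(errors) / len(errors)
```

In the setting described here, each list would hold 25 processing times, one per randomly generated case of C and E.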

In order to consider the effects of the parameters M and $\delta$ (the result ratio) on the mentioned algorithms, experiments were carried out for M = 4, 5 and $\delta$ = 0.1, 0.2, ..., 1, and the average relative errors are shown in Figs. 4 and 5. These figures show the average error percentage of the Genetic algorithm, Sport, LifoC, and Branch and Bound LifoC for 4 and 5 slave computers.
Fig. 5 average of relative error percentage for m=4, 5


For m=5 and $\delta$=0.7, the run time and average relative error percentage of all algorithms are shown in Table 1.

Table 1. RUN TIME & AVERAGE OF RELATIVE ERROR PERCENTAGE FOR m=5 & $\delta$=0.7

Algorithm | Run time | Average of relative error percentage
Optimal algorithm | 182.6719 | 0
Branch & Bound LifoC algorithm | 0.2 | 0.000299117
Branch & Bound Copt algorithm | 0.2125 | 0.000283476
Genetic algorithm | 30.5712 | 0.000637334
LifoC algorithm | 0.0125 | 0.0039602808
FifoC algorithm | 0.015 | 0.074704891
Sport algorithm | 0.0025 | 0.183.05

CONCLUSION

In this paper, a new heuristic algorithm based on Branch and Bound is presented for the scheduling of divisible loads on heterogeneous systems, taking the result collection phase into account. A large number of simulations were performed, and Branch and Bound was found to consistently deliver near-optimal performance. As future work, an algorithm with similar performance but better cost characteristics than Branch and Bound LifoC needs to be found. Another important direction would be to extend the results to multi-level processor trees.

REFERENCES

[1] V. Bharadwaj, D. Ghose, V. Mani, T.G. Robertazzi, Scheduling Divisible Loads in Parallel and Distributed Systems, IEEE Computer Society Press, Los Alamitos, CA (1996).
[2] C.H. Lee and K.G. Shin, Optimal task assignment in homogeneous networks, IEEE Trans. Parallel Distrib. Syst., vol. 8, no. 2, pp. 119-129, Feb. 1997.
[3] G.D. Barlas, Collection-aware optimum sequencing of operations and closed-form solutions for the distribution of divisible load on arbitrary processor trees, IEEE Trans. Parallel Distrib. Syst., vol. 9, no. 5, pp. 429-441, May 1998.
[4] R. Agrawal, H.V. Jagadish, Partitioning Techniques for Large-Grained Parallelism, IEEE Transactions on Computers 37(12), 1627-1634 (1988).
[5] Y.-C. Cheng, T.G. Robertazzi, Distributed computation with communication delay, IEEE Transactions on Aerospace and Electronic Systems 24, 700-712 (1988).
[6] T.G. Robertazzi, Ten reasons to use divisible load theory, IEEE Computer 36, 63-68 (2003).
[7] Bharadwaj, V., Ghose, D., Robertazzi, T. G., "Divisible Load Theory: A New Paradigm for Load Scheduling in Distributed Systems", Cluster Computing, vol. 6, no. 1, pp. 7-17, Jan. 2003.
[8] Jingxi, J., Bharadwaj, V., Ghose, D., "Adaptive Load Distribution Strategies for Divisible Load Processing on Resource Unaware Multilevel Tree Networks", IEEE Transactions on Computers, vol. 65, no. 7, pp. 99-1005, 2007.
[9] Li, X., Bharadwaj, V., Ko, C. C., "Distributed Image Processing on a Network of Workstations", Int'l J. Computers and Applications, vol. 25, no. 2, pp. 1-10, 2003.
[10] Blazewicz, J., Drozdowski, M., Markiewicz, M., "Divisible Task Scheduling: Concept and Verification", Parallel Computing, vol. 25, pp. 87-98, 1990.
[11] Chan, S., Bharadwaj, V., Ghose, D., "Large Matrix-Vector Products on Distributed Bus Networks with Communication Delays Using the Divisible Load Paradigm: Performance and Simulation", Math. and Computers in Simulation, vol. 58, pp. 71-92, 2001.
[12] Altilar, D., Paker, Y., "Optimal Scheduling Algorithms for Communication Constrained Parallel Processing", Proc. Eighth Int'l Euro-Par Conf., pp. 197-206, 2002.
[13] V. Bharadwaj, D. Ghose, V. Mani, Multi-installment Load Distribution in Tree Networks with Delays, IEEE Transactions on Aerospace and Electronic Systems 31, 555-567 (1995).
[14] X. Li, V. Bharadwaj, C.C. Ko, Processing divisible loads on single-level tree networks with buffer constraints, IEEE Transactions on Aerospace and Electronic Systems 36, 1298-1308 (2000).
[15] M. Drozdowski, P. Wolniewicz, Optimum divisible load scheduling on heterogeneous stars with limited memory, European Journal of Operational Research 172, 545-559 (2006).
[16] Rosenberg, A. L., "Sharing Partitionable Workloads in Heterogeneous NOWs: Greedier Is not Better", IEEE International Conf. on Cluster Computing, pp. 124-131, Newport Beach, CA, Oct. 2001.
[17] Beaumont, O., Marchal, L., Rehn, V., Robert, Y., "FIFO Scheduling of Divisible Loads with Return Messages Under the One Port Model", Proc. Heterogeneous Computing Workshop HCW'06, April 2006.
[18] Ghatpande, A., Nakazato, H., Watanabe, H., Beaumont, O., "Divisible Load Scheduling with Result Collection on Heterogeneous Systems", Proc. Heterogeneous Computing Workshop (HCP 2008), April 2008.
[19] Ghatpande, A., Nakazato, H., Beaumont, O., Watanabe, H., "SPORT: An Algorithm for Divisible Load Scheduling With Result Collection on Heterogeneous Systems", IEICE Transactions on Communications, vol. E91-B, no. 8, August 2008.
[20] Ghatpande, A., Nakazato, H., Beaumont, O., Watanabe, H., "Analysis of Divisible Load Scheduling with Result Collection on Heterogeneous Systems", IEICE Transactions on Communications, vol. E91-B, no. 7, July 2008.
[21] Suresh, S., Mani, V., Omkar, S. N., Kim, H. J., "Divisible Load Scheduling in Distributed Systems with Buffer Constraints: Genetic Algorithm and Linear Programming Approach", International Journal of Parallel, Emergent and Distributed Systems, vol. 21, no. 5, pp. 303-321, Oct. 2006.
[22] Vanderbei, R. J., Linear Programming: Foundations and Extensions, 2nd Ed., International Series in Operations Research & Management, vol. 37, Kluwer Academic Publishers, 2001.
[23] S. Mohammadi, F. Norouzi Fard, F. Norouzi Fard, Branch & bound: An Algorithm for Divisible Load Scheduling with Result Collection on Heterogeneous Systems, International Conference on Computer Science and Information Engineering (ICCSE 2011), pp. 2010-3778, Venice, April 2011.

Farzad Norouzi Fard is now with the Department of Mechatronics, Islamic Azad University, South Tehran Branch. (Email: st_f_norouzifard@azad.ac.ir)
