
Journal of Information Science (http://jis.sagepub.com), 2005; 31; 467
DOI: 10.1177/0165551505057007

The online version of this article can be found at: http://jis.sagepub.com/cgi/content/abstract/31/6/467

Published by: http://www.sagepublications.com

On behalf of: Chartered Institute of Library and Information Professionals


Downloaded from http://jis.sagepub.com at PENNSYLVANIA STATE UNIV on April 17, 2008. © 2005 Chartered Institute of Library and Information Professionals. All rights reserved. Not for commercial use or unauthorized distribution.

Formulation and analysis of in-place MSD radix sort algorithms

Nasir Al-Darwish
ICS Department, King Fahd University of Petroleum and Minerals, Dhahran, Saudi Arabia

Received 15 December 2004
Revised 25 April 2005

Abstract.
We present a unified treatment of a number of related in-place MSD radix sort algorithms with varying radices, collectively referred to here as 'Matesort' algorithms. These algorithms use the idea of in-place partitioning which is a considerable improvement over the traditional linked list implementation of radix sort that uses O(n) space. The binary Matesort algorithm is a recast of the classical radix-exchange sort, emphasizing the role of in-place partitioning and efficient implementation of bit processing operations. This algorithm is O(k) space and has O(kn) worst-case order of running time, where k is the number of bits needed to encode an element value and n is the number of elements to be sorted. The binary Matesort algorithm is evolved into a number of other algorithms including 'continuous Matesort' for handling floating point numbers, and a number of 'general radix Matesort' algorithms. We present formulation and analysis for three different approaches (sequential, divide-and-conquer and permutation-loop) for partitioning by the general radix Matesort algorithm. The divide-and-conquer approach leads to an elegantly coded algorithm with better performance than the permutation-loop-based American Flag Sort algorithm.

Keywords: sorting algorithms; radix sort; MSD radix sort; Matesort; general radix Matesort; Quicksort

Correspondence to: Nasir Al-Darwish, ICS Department, King Fahd University of Petroleum and Minerals, Dhahran, Saudi Arabia. E-mail: darwish@kfupm.edu.sa

1. Introduction

Sorting is a fundamental problem in computer science, with wide applications [1]. The moment the algorithm run-time and space complexity were formalized, sorting algorithms were at the forefront for analysis and improvement. Early recognition was given to Quicksort [2] and Heapsort [3] algorithms, because they improve on the $O(n^2)$ order of running time of other slow sorting algorithms, to O(n log n). Heapsort is O(n log n) in the worst case, while Quicksort is O(n log n) in the average case. In practice, Quicksort runs faster than Heapsort most of the time, although it exhibits $O(n^2)$ worst-case order of running time, which happens, for example, when the input data is nearly sorted.

In general, sorting algorithms can be divided into two categories: 'comparison-based' and 'distribution-based'. A comparison-based algorithm, like Heapsort or Quicksort, sorts by comparing two elements at a time. On the other hand, a distribution-based algorithm, like radix sort [4-7], works by distributing the elements into different piles based on their values. Radix sort algorithms fall into two classes: MSD (most significant digit) and LSD (least significant digit). Radix sort algorithms process the elements in stages, one digit at a time. A digit is a group of consecutive bits with the digit size (number of bits) set at the beginning of the algorithm. MSD radix sort starts with the most significant (leftmost) digit and moves toward the least significant digit. LSD radix sort does it the other way.

LSD distributes the elements into different groups, commonly known as 'buckets' and treated as queues (first-in-first-out data structure), according to the value of the least significant (rightmost) digit. Then the elements are re-collected from the buckets and the process continues with the next digit. On the other hand, MSD radix sort first distributes the elements according to their leftmost digit and then calls the algorithm recursively on each group. MSD needs only to scan distinguishing prefixes, while all digits are scanned in LSD. For example, for radix-2 MSD (i.e. digit size = 1 bit), two buckets are used and the elements are distributed into either bucket depending on the value of the most significant bit. Then the process continues with the next bit, considering only groups that have more than one element, until all the bits have been scanned. Clearly, such an algorithm uses O(n) space for the combined two buckets and is able to sort n non-negative integers in the range [0, m] in O(n log m) order of running time.

1.1. Previous related work

Historically, MSD and LSD radix sort algorithms have been implemented using O(n) space, either using a working array of size n or linked lists. A notable exception is 'radix exchange' sort [4] and further generalization by McIlroy et al. [7]. Radix exchange sort was first suggested for binary-alphabet but can be used with strings provided that bit-extraction and testing are done as low-level machine operations. The basic idea of radix exchange sort is to split in-place the data into two groups based on the most significant bit. This is done using two oppositely moving pointers; the left (right) pointer skips elements having 0-bit (1-bit); otherwise, it exchanges the elements pointed to by left and right pointers. Then the process is applied recursively to each group considering the next bit. Radix exchange sort is best thought of as a 'mating' of Radix sort and Quicksort, since in-place partitioning is a characteristic of Quicksort. Therefore, the author suggests that it be called 'Matesort'. Section 2 presents a formal treatment of radix exchange sort (i.e. binary Matesort) summarizing key theoretical and experimental results. Radix exchange sort has been unfairly upstaged by Quicksort despite the fact that it is simple to implement, runs fast, and has a worst-case order for running time lower than that of Quicksort. Worse still, many textbooks on algorithms fail to present it or even reference it. In Section 5, it is shown that binary Matesort algorithm works well for strings too.

To in-place implement a general radix (i.e. digit size larger than 1 bit) MSD radix sort, McIlroy et al. [7] proposed the 'American Flag' in-place partitioning method, named as such because the problem of in-place k-way partitioning is a generalization of the Dutch Flag (three-way in-place partitioning) problem [8, 9] proposed by Dijkstra. This uses the concept of a 'permutation loop': it uses a preprocessing step to determine the count of elements (based on the current digit) that should belong to a particular digit value and this information in turn is used to determine where the element should be placed in the 'sorted on the current digit' array. In [7] it was concluded that the American Flag sort is the fastest for sorting strings, a conclusion confirmed by others [10]. In Section 4.3, we present analysis and optimized implementation of the permutation loop concept. However, this algorithm is found to be inferior to our proposed divide-and-conquer partitioning algorithm.

Apparently the work of McIlroy et al. [7] on in-place radix sort has received little attention or was found incomprehensible by some. The work reported here, which can be viewed as an improvement on, and extension to, their work, counters prevailing beliefs that MSD is complex to implement and would require lots of space. To quote: 'Most significant digit (MSD) Radix sort takes a lot more book-keeping - the list must repeatedly be split into sublists for each value of the last digit processed' [11]; and 'The book-keeping would quickly get out of hand; pointers indicating where the various buckets begin and information needed to recombine the elements into one list would have to be stacked and unstacked often' [12, p. 202].

More recent work on the subject of in-place radix sorting is reported in [13] and [14]. The proposed algorithms appear to use the permutation-loop idea; however, the presentations of these algorithms are sketchy at best and the authors fail to recognize and relate to earlier work by McIlroy et al. In [13], it is suggested that the digit size be 'adaptive' and varied during the execution of the algorithm. However, we have found no significant differences in the running times as the digit size is varied.

This paper presents a comprehensive, unified and readable treatment of in-place MSD radix sort. The rest of the paper is organized as follows. Section 2 presents the binary Matesort algorithm and compares its performance to other popular sorting algorithms. Section 3 presents continuous Matesort for sorting real numbers. In Section 4 we present several general radix Matesort algorithms including the new GenMatesort_DC (using divide-and-conquer partitioning) and establish a number of performance-related lemmas. The analysis confirms that for random data, general radix Matesort is no better than binary Matesort. To our knowledge, such an analysis has not been published before. Section 5 presents comparison results of five Matesort algorithms used for sorting English text. In this paper, we give complete program code listings in C# (also, equivalent to Java). This should enable easy verification of the claimed results and avoid any implementation ambiguity; as McIlroy et al. [7] suggested, 'The troubles with radix sort are in implementation, not in conception'.
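To make the space argument of Section 1 concrete, the traditional bucket-based radix-2 MSD scheme can be sketched as below. This is only an illustrative baseline under stated assumptions (0-based arrays, non-negative integers, queue buckets as described in the Introduction); the class and method names are hypothetical and are not taken from the paper's listings. The two queues together hold all elements of the current group, which is the O(n) extra space that the in-place algorithms of the following sections avoid.

```csharp
using System;
using System.Collections.Generic;

public static class BucketMsdSketch
{
    // Sorts A[lo..hi] (0-based, inclusive) of non-negative ints on bits
    // bitloc..0, using two FIFO buckets that together hold the whole group.
    public static void BucketMsdSort(int[] A, int lo, int hi, int bitloc)
    {
        if (lo >= hi || bitloc < 0) return;   // groups of size <= 1 need no work

        var zeros = new Queue<int>();         // elements whose current bit is 0
        var ones = new Queue<int>();          // elements whose current bit is 1
        int mask = 1 << bitloc;
        for (int i = lo; i <= hi; i++)
        {
            if ((A[i] & mask) == 0) zeros.Enqueue(A[i]);
            else ones.Enqueue(A[i]);
        }

        int k = lo + zeros.Count - 1;         // last index of the 0-bit group
        int j = lo;
        while (zeros.Count > 0) A[j++] = zeros.Dequeue();
        while (ones.Count > 0) A[j++] = ones.Dequeue();

        // Recurse on each group with the next (less significant) bit.
        BucketMsdSort(A, lo, k, bitloc - 1);
        BucketMsdSort(A, k + 1, hi, bitloc - 1);
    }

    public static void Main()
    {
        int[] data = { 5, 3, 7, 0, 6, 3, 1 };            // all values fit in 3 bits
        BucketMsdSort(data, 0, data.Length - 1, 2);
        Console.WriteLine(string.Join(" ", data));       // prints "0 1 3 3 5 6 7"
    }
}
```

Replacing the two queues by an in-place exchange of elements is exactly the step that turns this scheme into radix exchange sort.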

2. Binary Matesort algorithm

The binary Matesort algorithm has some striking resemblance to Quicksort; the difference is in the extra 'bitloc' (bit location) input parameter and the partition method used. The idea for Matesort comes from 'algorithm design by induction' [15] using the following induction step.

Induction step: suppose any of the elements in the array A[1 . . n] is encoded using k bits ($b_{k-1} b_{k-2} \ldots b_0$) and that A[1 . . n] is partitioned based on the most significant bit ($b_{k-1}$) such that all elements with $b_{k-1} = 0$ appear before elements with $b_{k-1} = 1$. Then what remains to be done is to sort each group.

    // Initial call: bitloc = highest bit position (starting from 0)
    void Matesort(int[] A, int lo, int hi, int bitloc)
    {
        if ((lo < hi) && (bitloc >= 0))
        {
            int k = BitPartition(A, lo, hi, bitloc);
            Matesort(A, lo, k, bitloc - 1);
            Matesort(A, k + 1, hi, bitloc - 1);
        }
    }

    void Quicksort(int[] A, int lo, int hi)
    {
        if (lo < hi)
        {
            int k = Partition(A, lo, hi);
            Quicksort(A, lo, k - 1);
            Quicksort(A, k + 1, hi);
        }
    }

    int BitPartition(int[] A, int lo, int hi, int bitloc)
    {
        int pivotloc = lo - 1;
        int t;
        int Mask = 1 << bitloc;
        for (int i = lo; i <= hi; i++)
            // if (((A[i] >> bitloc) & 0x1) == 0)
            if ((A[i] & Mask) <= 0)
            {
                // swap with element at pivotloc+1 and update pivotloc
                pivotloc++;
                t = A[i]; A[i] = A[pivotloc]; A[pivotloc] = t;
            }
        return pivotloc;
    }

    int Partition(int[] A, int lo, int hi)
    {
        int t;
        int pivot = A[lo];
        int pivotloc = lo;
        for (int i = lo + 1; i <= hi; i++)
            if (A[i] <= pivot)
            {
                // swap with element at pivotloc+1 and update pivotloc
                pivotloc++;
                t = A[pivotloc]; A[pivotloc] = A[i]; A[i] = t;
            }
        // move pivot to its proper location
        A[lo] = A[pivotloc]; A[pivotloc] = pivot;
        return pivotloc;
    }

Listing 1. Binary Matesort algorithm in comparison with Quicksort algorithm.

Theorem 1: the binary Matesort algorithm has O(kn) worst-case order of running time, where n is the number of elements to be sorted and k is the number of bits needed to encode an element value. The algorithm is O(k) space.

Proof: suppose that any of the elements is encoded using k bits ($b_{k-1} b_{k-2} \ldots b_0$). The running time for the BitPartition algorithm for n elements is easily shown to be O(n) since it performs O(1) (constant time) work per element. The calls to Matesort can be depicted as a binary tree whose root is the initial call to Matesort. At any tree level, the array elements (n elements in total) are split disjointly among various calls to Matesort. The calls at level r (root at level 0 corresponds to bitloc = k - 1) are associated with calls to BitPartition with bitloc = (k - r - 1). Thus the processing associated with any tree level is O(n). The lowest tree level is for bitloc = 0 and thus the tree has a maximum of k levels. Thus the worst-case order of running time is O(kn). The space used is dominated by the stack space associated with recursive calls (i.e. return addresses, call parameters and local variables). Each such call uses O(1) space; note that BitPartition is O(1) space. In the worst case, there may be k calls pending. Thus the algorithm is O(k) space.

2.1. Partitioning

A key operation for Quicksort and Matesort algorithms is the rearrangement of the array elements using certain criteria, such as placement around a pre-selected value. We limit our attention to in-place partitioning (i.e. without using any additional working arrays). The O(n) partitioning algorithms presented here are not new; they are included for the presentation to be self-contained. For binary Matesort, the following problem needs to be solved.

BitPartition problem: given an integer array A[1 . . n] and a bit position i, return k such that bit $b_i$ in any of (A[1], A[2], . . . , A[k]) ≤ 0 < bit $b_i$ in any of (A[k + 1], A[k + 2], . . . , A[n]).

On the other hand, Quicksort requires a solution to the following problem.


Partition problem: given an integer array A[1 . . n] and a pivot value P, return k such that any of (A[1], A[2], . . . , A[k]) ≤ P < any of (A[k + 1], A[k + 2], . . . , A[n]).

For either problem, we refer to k as the pivot location. Induction can be easily employed to come up with good solutions to the above problems.

Partitioning_Method 1

Induction step: assume the array A[1 . . n - 1] is partitioned such that any of (A[1], A[2], . . . , A[k]) ≤ P < any of (A[k + 1], A[k + 2], . . . , A[n - 1]); k is the pivot location. Now consider A[n]. If A[n] > P then done (pivot location is unchanged); otherwise, since A[n] ≤ P implies A[n] < A[k + 1], we can swap A[n] with A[k + 1] and advance the pivot location to k + 1.

The solution to the BitPartition problem is similar to the above except that the criterion used to update the pivot location (A[n] ≤ P) is replaced by ($b_i$ of A[n] ≤ 0). The implementation of this algorithm is shown in Listing 1. Note that for BitPartition, we utilize a bit manipulation operation. To isolate the bit at location bitloc, one can use the expression (A[i] >> bitloc) & 0x1. It shifts the value A[i] right by bitloc bits and then ANDs it with the hexadecimal value '00 . . . 01'. However, a more efficient method is to use the expression (A[i] & Mask), where Mask = $2^{bitloc}$ is computed once as a loop invariant. For the partition method used by Quicksort, the first element is chosen as the pivot and then, at the end, the pivot is swapped with A[k] to put the pivot into its proper sorting position. Such a step is not warranted for BitPartition. Note that the initial call to Matesort is passed the location of the highest bit. This can be determined by a cumulative OR operation over all input elements and then determining the position of the leftmost bit value of 1.

Another solution to the Partition problem is based on the following induction step.

Partitioning_Method 2

Induction step: compare A[1] with A[n]. There are four cases to consider:

    A[1] ≤ P < A[n]:  k = Partition of A[2 . . n - 1].
    A[1] ≤ P ≥ A[n]:  k = Partition of A[2 . . n].
    A[1] > P < A[n]:  k = Partition of A[1 . . n - 1].
    A[1] > P ≥ A[n]:  swap A[1] with A[n], k = Partition of A[2 . . n - 1].

The iterative implementation of this method (see Partition_Cont2 in Listing 4) utilizes two pointers (left and right) that are set to first and last array locations, respectively. The left pointer moves toward the right skipping elements ≤ P, whereas the right pointer moves toward the left skipping elements > P. The two partitioning methods were found to have comparable running times for random data. Although Method 2 does n/4 expected number of swaps (vs. n/2 for Method 1), it makes two comparisons three-quarters of the time in order to eliminate one array element (vs. one comparison per element for Method 1).

2.2. Binary Matesort compared to Quicksort and Heapsort

First, we should note that all the reported experiments were run on a Pentium IV 2.6 GHz with 512 MB RAM running Windows XP 2002. The programs were written, compiled and run (Release Build) using Microsoft's C# VS.Net 2003. For the generation of random integer data for this experiment, we have used the fillarray function given in Listing 2. This function fills the array with integers in range [0, max] using uniform random distribution, where max = n/x. This scheme is simply referred to as (U/x) and x can be thought of as a repetition factor. Table 1 summarizes the results, where a given timing entry is an average of five runs. Note that (5FFFFFF)hex = 100,663,295 (i.e. about 100 million elements). The results show that Quicksort and Matesort run neck-and-neck and clearly outperform other algorithms including the Microsoft .Net built-in Array Sort.

    void fillarray(int[] A, int n)
    {
        int repfactor = 4;   // Vary Repetition factor as necessary
        Random r = new Random();
        int max = n / repfactor;
        for (int i = 1; i <= n; i++)
            A[i] = r.Next(max);   // Next(max) returns a value in [0, max)
    }

Listing 2. Generation of integer test data.

Table 1. Matesort compared to other sorting algorithms for random uniformly distributed integer data. Quicksort_Mod and Matesort_Mod are for modified versions that use Insertion sort for input size < 20. Entries are execution times (ms) for each array size (n), given in hex.

    Distribution   Algorithm        1FFFFFF   3FFFFFF   5FFFFFF
    U/1            .Net Sort          14750     30234     46375
                   Heapsort           57921    133178    218843
                   Quicksort           8039     16812     26343
                   Quicksort_Mod       7734     15906     24328
                   Matesort            7843     16203     25359
                   Matesort_Mod        6718     14125     22234
    U/2            .Net Sort          14296     29937     46031
                   Heapsort           57593    134078    218765
                   Quicksort           7968     16632     25351
                   Quicksort_Mod       6671     13984     21515
                   Matesort            7328     15296     24062
                   Matesort_Mod        6890     14468     22765
    U/4            .Net Sort          14203     29509     44828
                   Heapsort           57718    134140    220062
                   Quicksort           8062     16632     25742
                   Quicksort_Mod       6625     13843     21171
                   Matesort            6937     14531     22859
                   Matesort_Mod        7046     14765     23203

2.3. Binary Matesort compared to Quicksort

Quicksort is known to suffer from two problems that cause performance degradation (see Table 2), whilst Matesort seems to be free of these problems. The first problem is that the algorithm (i.e. repeated partitioning) is slow, in comparison with other sorting algorithms, when applied to small size data, especially for data that is nearly sorted or contains identical values. Table 2 clearly shows this behavior as the data repetition factor is increased. Altering the pivot selection strategy does not help. Also, we have observed that the strategy of coupling Quicksort with Insertion Sort (when input falls to about 20 to 30 elements) was of little value in these situations (i.e. ~0.001 redundancy).

The second problem is also apparent from Table 2, which shows that recursion depth and execution time increase as the data repetition factor is increased. In the worst case, the recursion depth (and associated stack space) can grow as bad as O(n). The Quicksort algorithm can be modified easily using 'tail-recursion elimination' to limit the recursion depth [2]. Tail-recursion elimination means replacing a recursive call located at the end of a procedure body block by a loop. The trick to limit recursion depth to O(log n) is to eliminate the recursive call that has more than half of the elements.
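By contrast, the recursion depth of binary Matesort is bounded by the number of bits regardless of the data (Theorem 1). The sketch below is one way to check this on a small input; it is not the paper's code. It assumes 0-based arrays and non-negative integers, its HighestBit helper follows the cumulative-OR idea described in Section 2.1, and the class name, method names and the depth-returning variant of Matesort are hypothetical additions for illustration.

```csharp
using System;

public static class MatesortDepthSketch
{
    // Position (counted from 0) of the leftmost 1 bit in the cumulative OR
    // of all elements; returns -1 when every element is 0.
    public static int HighestBit(int[] A)
    {
        uint all = 0;
        foreach (int v in A) all |= (uint)v;
        int bitloc = -1;
        while (all != 0) { bitloc++; all >>= 1; }
        return bitloc;
    }

    // Partitioning_Method 1 applied to a single bit: elements whose bit is 0
    // are moved in front of elements whose bit is 1.
    public static int BitPartition(int[] A, int lo, int hi, int bitloc)
    {
        int pivotloc = lo - 1;
        int mask = 1 << bitloc;
        for (int i = lo; i <= hi; i++)
        {
            if ((A[i] & mask) == 0)
            {
                pivotloc++;
                int t = A[i]; A[i] = A[pivotloc]; A[pivotloc] = t;
            }
        }
        return pivotloc;
    }

    // Binary Matesort that also returns the depth of nested partitioning
    // calls, which can never exceed bitloc + 1.
    public static int Matesort(int[] A, int lo, int hi, int bitloc)
    {
        if (lo >= hi || bitloc < 0) return 0;   // nothing left to partition
        int k = BitPartition(A, lo, hi, bitloc);
        int left = Matesort(A, lo, k, bitloc - 1);
        int right = Matesort(A, k + 1, hi, bitloc - 1);
        return 1 + Math.Max(left, right);
    }

    public static void Main()
    {
        int[] data = { 9, 4, 13, 0, 4, 7, 2 };
        int top = HighestBit(data);                        // OR of all values is 15
        int depth = Matesort(data, 0, data.Length - 1, top);
        Console.WriteLine(top);                            // prints 3
        Console.WriteLine(depth);                          // prints 4
        Console.WriteLine(string.Join(" ", data));         // prints "0 2 4 4 7 9 13"
    }
}
```

The duplicated value 4 forces the recursion down to the lowest bit, so the depth reaches its maximum of bitloc + 1 = 4 here, yet it cannot grow with n the way Quicksort's depth can.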

    void Quicksort_TRO(int[] A, int lo, int hi)
    {
        while (lo < hi)
        {
            int k = Partition(A, lo, hi);
            if ((k - lo) > (hi - k))
            {
                // left part is larger: recurse on the right part, loop on the left
                Quicksort_TRO(A, k + 1, hi);
                hi = k - 1;
            }
            else
            {
                // right part is larger: recurse on the left part, loop on the right
                Quicksort_TRO(A, lo, k - 1);
                lo = k + 1;
            }
        }
    }

Listing 3. Tail-Recursion Optimized Quicksort.

Table 2. Recursion depth and execution time for binary Matesort (MS), Quicksort-Tail_Recursion_Optimized (QSTRO) and three versions of Quicksort. Input: A[i] = random in [0, n/x], for n = 1FFFFF (hex). The Quicksort versions differ in the Partition used: Method 1 (QS1), Method 2 (QS2), and Method 2 with Pivot = median of A[lo], A[hi] and A[(lo+hi)/2] (QS3). This latter Partition method is used for QSTRO.

    Recursion depth
    x (U/x distribution)      MS     QS1     QS2     QS3   QSTRO
    1                         25      64      72      57      17
    125                       25     204     247     222      13
    250                       25     351     426     399      13
    500                       25     619     811     730      12
    1000                      25    1166    1567    1372      12

    Execution time (ms)
    x (U/x distribution)      MS     QS1     QS2     QS3   QSTRO
    1                       7843    8156    7953    8078    7890
    125                     6859   15437    9531    9421    9953
    250                     6812   24093   11984   11484   12812
    500                     6781   38656   16921   15750   18750
    1000                    6734   69890   27406   24796   30968

3. Using Matesort algorithm for continuous domain

A notable attempt to use the element value to determine the 'sorting' address is exemplified by 'proximity map' sort. It uses a hashing function to determine the vicinity where the element will be placed in the final sorting order. However, such algorithms are generally complex, and very often only suitable for certain kinds of data. Dobosiewicz [16] presented a sorting algorithm based on distributive partitioning. The algorithm sorts n real numbers by distributing them into n intervals of equal width. His algorithm runs in O(n) expected time and O(n log n) worst-case time. However, his algorithm does not avoid pair-wise element comparisons altogether, since it requires the median of the elements to be computed. The algorithm we present next is much simpler and runs fast all the time since it does reduction at the bit representation level.

Upon careful examination of the binary Matesort algorithm, it can be observed that the first call to BitPartition, which considers the most significant bit (kth bit), is equivalent to partitioning around a pivot value of $2^k$. Then, the call to BitPartition on the values ≤ $2^k$ uses a pivot value of $2^{k-1}$ and the call to BitPartition on the values > $2^k$ uses a pivot value of $2^k + 2^{k-1}$. This suggests using a version of Partition that is passed the pivot value as an input parameter and then recasting Matesort into Matesort_Cont (see Listing 4) with two extra input parameters: a lower limit (initially set to 0 for non-negative data) and a pivot (initially set to half the maximum value among all elements). When the pivot value reaches 1 the data is sorted based on the integer parts of the input numbers. If we continue to halve the pivot further, we will be able to sort numbers that fall within 0.5 of each other, and so on. The process should continue until some minimum pivot value (fraction) is reached that is smaller than the precision of the input data.

It may be considered annoying to set a minimum pivot value before calling continuous Matesort. Alternatively, the process of halving the pivot can continue until an identical elements condition is reached. Matesort_ContV2 implements this idea. The test for identical elements is incorporated within the Partition method (see commented lines in Partition_Cont2 in Listing 4) by having the method return -1 if all elements are equal. As shown in Table 3, this version of continuous Matesort runs slightly faster than the other version. The table shows the results of some tests (data generated using the fillarray method given in Listing 4 where an element has a fractional part with 0.001 accuracy) carried out to compare continuous Matesort, Quicksort and Microsoft .Net built-in Array Sort. The test data varied in size between 10 million and 50 million elements. A given timing entry is an average of five runs. Although the results show that Quicksort and Matesort have comparable speed, one has to remember that the given Matesort algorithms, when run for n numbers with a magnitude precision of k bits, have worst-case running time of O(kn) vs $O(n^2)$ for Quicksort. The results show that Microsoft .Net built-in Array Sort is much slower (50% slower) than either Matesort or Quicksort.

    void Matesort_Cont(double[] A, int lo, int hi, double fromval, double inc)
    {
        if ((lo < hi) && (inc >= minpivot))
        {
            int k = Partition_Cont2(A, lo, hi, fromval + inc);
            Matesort_Cont(A, lo, k, fromval, inc / 2.0);
            Matesort_Cont(A, k + 1, hi, fromval + inc, inc / 2.0);
        }
    }

    void Matesort_ContV2(double[] A, int lo, int hi, double fromval, double inc)
    {
        if (lo < hi)
        {
            int k = Partition_Cont2(A, lo, hi, fromval + inc);
            if (k == -1) return;
            Matesort_ContV2(A, lo, k, fromval, inc / 2.0);
            Matesort_ContV2(A, k + 1, hi, fromval + inc, inc / 2.0);
        }
    }

    int Partition_Cont2(double[] A, int lo, int hi, double pivot)
    {
        double t;
        int i, j;
        // return -1 if all elements are equal; enable for Matesort_ContV2
        // for (i = lo; i < hi; i++)
        //     if (A[i] != A[hi]) goto cont;
        // return -1;
        // cont:
        i = lo; j = hi;
        while (i < j)
        {
            while ((i <= hi) && (A[i] <= pivot)) i++;
            while ((j >= lo) && (A[j] > pivot)) j--;
            if (i < j) { t = A[i]; A[i] = A[j]; A[j] = t; }
        }
        return j;
    }

    void fillarray(double[] A, int n)
    {
        Random r = new Random();
        for (int i = 1; i <= n; i++)
            A[i] = r.Next(n) + r.Next(1000) / 1000.0;
    }

    //-- main
    int n = 40000000;
    double[] D = new double[n + 1];
    fillarray(D, n);
    minpivot = .0005;   // Set Minimum Pivot before calling Matesort_Cont.
    Matesort_Cont(D, 1, n, 0, n / 2);

Listing 4. Matesort algorithms for real numbers.

Table 3. Execution times for .Net built-in Sort, Quicksort and Continuous Matesort. Quicksort uses for Partition: Method 2 with Pivot = median of A[lo], A[hi] and A[(lo+hi)/2]. Entries in parentheses are for versions of these algorithms coupled with Insertion sort for input size < 20. Entries are execution times (ms).

    Array size (n) × 10^6   .Net Sort   Quicksort        Matesort_Cont    Matesort_ContV2
    10                           7362    3343  (2875)     4281  (2812)     4015  (2765)
    20                          14843    6984  (6046)     8781  (5843)     8296  (5796)
    40                          31109   14500 (12953)    17890 (12187)    17203 (12125)
    50                          39234   18343 (16355)    22609 (15593)    22362 (15437)
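The correspondence between bit positions and pivot values described in Section 3 can be made concrete by tabulating the pivots level by level. The following is only a small illustrative sketch: it assumes the initial call pattern of Listing 4 (lower limit 0, initial increment max/2), it lists the nominal pivots for every group whether or not that group is actually populated, and the class and method names are hypothetical.

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;

public static class PivotScheduleSketch
{
    // Nominal pivot values (fromval + inc) at a given recursion level when the
    // initial call uses fromval = 0 and inc = max / 2. At level L the increment
    // is max / 2^(L+1) and there are 2^L groups, each 2 * inc wide.
    public static List<double> PivotsAtLevel(double max, int level)
    {
        double inc = max / 2.0;
        for (int i = 0; i < level; i++) inc /= 2.0;

        var pivots = new List<double>();
        int groups = 1 << level;
        for (int g = 0; g < groups; g++)
        {
            double fromval = g * 2.0 * inc;   // lower limit of group g
            pivots.Add(fromval + inc);
        }
        return pivots;
    }

    public static void Main()
    {
        for (int level = 0; level <= 3; level++)
        {
            List<string> parts = PivotsAtLevel(8.0, level)
                .ConvertAll(p => p.ToString(CultureInfo.InvariantCulture));
            Console.WriteLine("level " + level + ": " + string.Join(" ", parts));
        }
    }
}
```

For max = 8, levels 0 to 2 give the pivots 4, then 2 and 6, then 1, 3, 5 and 7, which mirror the bit-by-bit view of binary Matesort; level 3 gives the half-integer pivots 0.5, 1.5, ..., 7.5, which is the point at which numbers within 0.5 of each other start to be separated.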

4. General radix Matesort algorithms

Whereas the basic Matesort algorithm processes the data one bit at a time, the general radix Matesort (see Listing 5) processes the data one digit (a group of bits) at a time. A radix value of r implies that the digit size = $\log_2 r$ bits, e.g. radix 16 uses a digit size of 4 bits. Thus, compared to binary Matesort, we observe the following. First, the BitPartition method is replaced by the DigitPartition method that is passed digitloc (digit location) and digitval (digit value) parameters. Second, for radix r and a given digitloc, the data must be split into r groups, one for each radix value. This process can be done sequentially, one digit value at a time (sequential partitioning), or through divide-and-conquer. For sequential partitioning (see GenMatesort_Seq in Listing 5), we need to issue exactly (r - 1) calls to DigitPartition to get the data split into r parts and then issue r recursive calls to GenMatesort_Seq, one for each part, to process the data for the next digit location (digitloc - 1).
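The radix-dependent constants can be isolated in a small helper. The sketch below is an illustration only, assuming non-negative 32-bit integers and a radix that is a power of two; DigitAt, DigitCount and bitsPerDigit are hypothetical names, not part of the paper's listings. With bitsPerDigit = 4 the helper reproduces the radix-16 shift-and-mask extraction, where the shift is 4 times the digit location and the mask is 0xF.

```csharp
using System;

public static class DigitSketch
{
    // Digit of value at position digitloc (0 = least significant digit)
    // for radix 2^bitsPerDigit.
    public static int DigitAt(int value, int digitloc, int bitsPerDigit)
    {
        int mask = (1 << bitsPerDigit) - 1;              // e.g. 0xF for radix 16
        return (value >> (bitsPerDigit * digitloc)) & mask;
    }

    // Number of digit positions needed to represent maxValue (0 for maxValue = 0),
    // so the most significant digit location is DigitCount - 1.
    public static int DigitCount(int maxValue, int bitsPerDigit)
    {
        int count = 0;
        while (maxValue != 0) { count++; maxValue >>= bitsPerDigit; }
        return count;
    }

    public static void Main()
    {
        int v = 0x2B7;
        Console.WriteLine(DigitAt(v, 2, 4));     // prints 2   (radix 16, leftmost digit)
        Console.WriteLine(DigitAt(v, 1, 4));     // prints 11  (hex digit B)
        Console.WriteLine(DigitAt(v, 0, 4));     // prints 7
        Console.WriteLine(DigitAt(v, 0, 8));     // prints 183 (radix 256, 0xB7)
        Console.WriteLine(DigitCount(v, 4));     // prints 3
    }
}
```

Keeping the digit size as a parameter makes it easy to try several radices, at the cost of recomputing the mask on each call; a production version would presumably hoist it out of the partitioning loop, as Listing 1 does with Mask.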

."!" An elegant /enMatesort)DC 4here is a p&te tial '&r i e''icie c% i se@)e tial partiti& i "* Namel%, it is p&ssible that a'ter sca i " all eleme ts '&r the p)rp&se &' partiti& i " ar&) d the 'irst di"it val)e, we 'i d & eleme t pivot, a d the pr&cess repeats & the wh&le set &' i p)t eleme ts 9n: '&r s)bse@)e t di"it val)es* 4h)s it is p&ssible that se@)e tial partiti& i " ma% de"e erate i t& 89rn: r) i " time per di"it l&cati& * A better partiti& i " strate"% is t& partiti& ar&) d the middle di"it val)e 9mid:* 4his wa% the eleme ts that 'all bel&w 9ab&ve: mid are &)t &' c& siderati& whe Di"it(artiti& is called '&r di"it val)es P mid 9di"it val)es mid:* I' the data is ra d&m a d ) i'&rml% distrib)ted the this

i t Di"it(artiti& 9i t23 A,i t l&,i t hi,i t di"itl&c,i t piv&t: M i t t, i t bitl&c F 4Ydi"itl&c. EE 0)ltiplier is Radi/ depe de t i t piv&tl&cFl&-1. '&r9i t iF l&. iNFhi . iRR: i' 9 99A2i3PP bitl&c: O 0/#: NFpiv&t : EE 0as- is Radi/ dep* M piv&tl&cRR. t F A2piv&tl&c3. A2piv&tl&c3 F A2i3. A2i3 F t. S ret)r piv&tl&c. S EE N&te: I itial call is /enMatesort)Se01A,!,n,digit ount%!2 v&id Je 0ates&rtKSe@9i t23 A,i t l&,i t hi,i t di"itl&c: M i t di"itval, -. i' 99l& N hi : OO 9di"itl&c PF0:: M '&r9di"itvalF0. di"itval N 15. di"itvalRR: EE Ra "e is Radi/ dep* M i' 9l&PFhi: brea-. EE &pti& all% added '&r &ptimi5ati& - F Di"it(artiti& 9A,l&,hi,di"itl&c,di"itval:. Je 0ates&rtKSe@9A,l&,-, di"itl&c-1:. l& F -R1. S Je 0ates&rtKSe@9A,l&,hi, di"itl&c-1:. S S EE N&te: I itial call is /enMatesort)DC1A,!,n,digit ount%!,3,!42 v&id Je 0ates&rtKDC9i t23 A,i t l&,i t hi,i t di"itl&c,i t 'r&mdi"itval, i t t&di"itval: M i t mid, -. i' 9 9l& N hi : OO 9di"itl&c PF0: : i' 9'r&mdi"itval N t&di"itval: M mid F 9'r&mdi"itval Rt&di"itval:E2. -F Di"it(artiti& 9A,l&,hi,di"itl&c,mid:. Je 0ates&rtKDC9A,l&,-, di"itl&c, 'r&mdi"itval, mid:. Je 0ates&rtKDC9A,-R1,hi, di"itl&c, midR1, t&di"itval:. S else Je 0ates&rtKDC9A,l&,hi, di"itl&c-1,0, 15:. EE Ra "e is Radi/ dep* S

?isti " 5* Je eral radi/ 0ates&rt )si " se@)e tial partiti& i " 9Je 0ates&rtKSe@: a d "e eral radi/ 0ates&rt )si " divide-a d-c& @)er (artiti& i " 9Je 0ates&rtKDC : = versi& s "ive are '&r Radi/ F 1B*


results in O(n log r) running time for the first digit (see Lemma 2 below).

Our original attempt to code the divide-and-conquer idea was to use a method that returns an array of pivot locations for pivot values ranging over all possible digit values. Upon reflection, we realized that such a method serves no purpose other than organizing and sequencing the calls to DigitPartition. Can we not simply use GenMatesort_DC itself for that? The answer is, of course, 'Yes', if we extend the parameters of GenMatesort_DC to include fromdigitval and todigitval parameters. Thus the six-line (literally a single if statement) algorithm given in Listing 5 was born. Note that, for a given digitloc, GenMatesort_DC is repeatedly called (first two recursive calls) until we reach a single digit value (fromdigitval < todigitval is false). At that point, we issue a call to GenMatesort_DC (resetting the fromdigitval, todigitval range) to continue with the next digit location (digitloc - 1).

4.2. Lemmas and performance results for general radix Matesort

For n data elements and radix r, assuming the data is random and uniformly distributed, the following lemmas hold when partitioning is performed using Partitioning_Method1 given in Section 2. Note that the element access count does not include element accesses during a swap operation.

Lemma 1. For random uniformly distributed data, the expected number of element accesses [swaps] performed by the general radix Matesort using sequential partitioning for the first digit location is (r - 1)n - (n/r)(r - 2)(r - 1)/2 [recurrence equations: C(n,1) = 0; C(n,r) = (r - 1)(n/r)/r + C(n - n/r, r - 1) for r > 1].

Proof. The first call to DigitPartition examines n elements (using the first radix value as pivot) and the data is split into two parts of length n/r and (n - n/r). DigitPartition is then called on the second part (using the second radix value as pivot), etc. Thus,
# of elements accessed = n + (n - n/r) + (n - 2n/r) + ... , for a total of r - 1 terms
= (r - 1)n - (n/r)(1 + 2 + ... + (r - 2))
= (r - 1)n - (n/r)(r - 2)(r - 1)/2.
For r = 256, # of elements accessed = 255n - (n/256)(127 × 255) ≈ 128n.

Now consider swaps. For the first call to DigitPartition, the following takes place. n/r elements are expected to fall within the first digit value, and each of the r parts contributes equally [i.e. each part contributes (n/r)/r elements]. Thus r - 1 of the parts will contribute a total of (r - 1) × (n/r^2) elements via swaps. Thus,
# of swaps = (r - 1) × (n/r)/r + (r - 2) × [(n - n/r)/(r - 1)]/(r - 1) + ... , for a total of r - 1 terms.
This is equivalent to the recurrence: C(n,1) = 0; C(n,r) = (r - 1)(n/r)/r + C(n - n/r, r - 1) for r > 1.

Lemma 2. For random uniformly distributed data, the expected number of element accesses [swaps] performed by the general radix Matesort using divide-and-conquer partitioning for the first digit location is n log r [(n/2) log r].

Proof. The first call to DigitPartition examines n elements and the data is split into two equal parts for two subsequent calls. Then these two parts are examined and each part generates two subsequent calls, etc. Thus,
# of elements accessed = n + 2 × (n/2) + 4 × (n/4) + ... , for a total of log r terms
= n log r.
For r = 256, # of elements accessed = 8n.
For swaps, it is expected that one half of the elements for a particular part comes from the other part via swaps. Thus during the first call to DigitPartition, (n/4 + n/4) swaps are made. Then each of these two parts is associated with two subsequent calls to DigitPartition, etc. Thus,
# of swaps = 2 × (n/4) + 4 × (n/8) + 8 × (n/16) + ... , for a total of log r terms
= (n/2) log r.
For r = 256, # of element swaps = 4n.

How about formulas for binary Matesort? Binary Matesort can be considered as sequential and divide-and-conquer at the same time, with a digit size of 1 bit (r = 2). Thus, we can substitute r = 2 in the formulas given in Lemma 1 or Lemma 2.

Lemma 3. For random uniformly distributed data, the expected number of element accesses [swaps] performed by binary Matesort for the first bit location is n [n/2].

Now, for random uniformly distributed data, following the partitioning on the first digit location, we end up with r sets, each of size n/r, that are to be partitioned on the second digit location. If we assume that the elements in each set are still random and uniformly distributed with respect to radix values, then Lemma 4 is justified.

Lemma 4. For random uniformly distributed data, let C(n,r) denote the count given by any of the above lemmas. Then the total count over all digit locations, TC(n,r,k), assuming each element is encoded using k digits, is C(n,r) + rC(n/r,r) + r^2 C(n/r^2,r) + ... + r^(m-1) C(n/r^(m-1),r), where m is the minimum of {k - 1, maximum j such that r^j ≤ n}. The per digit count is TC(n,r,k)/k.

Using Lemmas 3 and 4 to count the swaps made by binary Matesort for the leftmost 8 bits, we get n/2 + 2 × (n/4) + ... (8 terms) = 4n. This is consistent with the numbers obtained for binary Matesort given in Table 4. Likewise, the number of element accesses made by binary Matesort for the leftmost 8 bits is 8n. Comparing these numbers to those of divide-and-conquer partitioning confirms that these algorithms have comparable performance.

Table 4 shows a summary of results for an experiment that was run to gauge how the above formulas correlate with the computed (via embedded counting statements) counts. The numbers in parentheses are for (computed, formula), in that order, for the first digit location (i.e. 8-bit character). The results are consistent with the above lemmas. Table 4 shows the element access count per character location (computed total divided by character count) and the element swap count per character location. Also, note that, consistent with Lemma 4, the counts get smaller and smaller for subsequent character locations and lead to a per character average that is smaller than that computed for the first character location. For presentation clarity, we have omitted the per character location count as given by the formula in Lemma 4. It can be easily verified that the formula in Lemma 4 gives a good approximation. For example, evaluating TC(n,r,k)/k for n = 1000000, r = 256, k = 20 and C(n,r) given by Lemma 1 gives 19,274,414, which is a close approximation to the computed 19,656,858 (4th entry in the GR MS_Seq row).

4.3. General radix Matesort using multi-digit partitioning

In general, partitioning a data array on a range of digit values can be done in one of two ways. The first way is to organize a sequence of calls against a 'single value' digit partition algorithm. This is exactly what GenMateSort_Seq and GenMateSort_DC did. The second way is to generalize the digit partitioning algorithm itself to handle a set of multiple pivots and return an array of pivot locations. We refer to any such partitioning algorithm as multi-digit partitioning. We have found that generalization of Partitioning_Method1 to handle multiple pivots produces element access and swap counts comparable to sequential partitioning but

Table 4
Count of element accesses and swaps (per character and for leftmost character) for various Matesort algorithms. Input: random (byte value [0,255]) string arrays, string length = 20. Array sizes are given in units of 10^3 elements. Each count entry lists the per-character count, followed by (computed, formula) for the first character location.

Execution time (ms)
    Algorithm        Array size 100    Array size 1000
    Binary MS        109               1865
    GR MS_Seq        1031              14093
    GR MS_DC         109               1875
    GR MS_MDPLoop    234               2265

Count of element accesses per char loc, (computed, formula) for first char loc
    Algorithm        Array size 100                    Array size 1000
    Binary MS        89708, (800000, same)             1063240, (8000000, same)
    GR MS_Seq        1788803, (12858966, 12851211)     19656858, (128561007, 128496093)
    GR MS_DC         89747, (800000, same)             1063129, (8000000, same)
    GR MS_MDPLoop    410143, (199884, 199609)          1501925, (1996395, 1996093)

Count of element swaps per char loc, (computed, formula) for first char loc
    Algorithm        Array size 100                    Array size 1000
    Binary MS        44821, (399647, 400000)           531324, (4001032, 4000000)
    GR MS_Seq        13891, (99571, 97607)             152302, (996010, 976076)
    GR MS_DC         44872, (399611, 400000)           531520, (3998714, 4000000)
    GR MS_MDPLoop    11282, (99628, 99609)             139360, (996139, 996093)


with twice the running time, most probably due to access of the arrays used to keep information about each digit. A more efficient algorithm is to use a divide-and-conquer approach. Following a similar analysis to that of Lemma 1 and Lemma 2 shows that, to partition n elements around r pivots, the count of element accesses for these two approaches is nr/2 and n log r, respectively.

Next, we consider the 'permutation loop' algorithm introduced by McIlroy et al. [7] for the so-called American Flag sort algorithm. The 'permutation loop' algorithm uses a preprocessing step to determine the count of elements (based on the current digit) that belong to a particular digit value, and this information in turn is used to determine where each element should be placed in the 'sorted on the current digit' array. Then, looping through all elements, for each element x, let px = SortPositionOf(x); swap(x, A[px]). Now x is in its proper position but A[px] may not be, so the process continues with A[px]. The implementation of this idea is given by the MultiDigitPartition_PLoop method shown in Listing 6. It can easily be argued that MultiDigitPartition_PLoop on n elements performs at least 2n element accesses (n for the loop that does the counting, another n for the loop that puts the elements in their proper places) and at most n - 1 element swaps (because each swap puts at least one element in its place). The next lemma gives tighter limits for the first digit location.

Lemma 5. For random uniformly distributed data, the expected number of element accesses [swaps] performed by MultiDigitPartition_PLoop for the first digit location is n + n(r - 1)/r [n(r - 1)/r].

Proof. Consider element swaps first. A swap is needed for any element that is out of place. The probability of an element being out of place is (r - 1)/r. Thus # of swaps = n(r - 1)/r. Next, consider element accesses. Note that there are n accesses for the loop that does the counting. Then, for the loop that places the elements, the ith position may be accessed as many times as a swap is needed. This leads to n + n(r - 1)/r element accesses.

How good is the permutation-loop partitioning method? Consider the count of element accesses [swaps] for the multi-digit permutation-loop method for the first digit location. Using Lemma 5, which is also consistent with the experimental numbers reported in Table 4, we get, for r = 256, # of element accesses [swaps] ≈ 2n [n]. These counts are much smaller than those of divide-and-conquer using Partitioning_Method1 of Section 2, which were found to be 8n [4n]. But these numbers do not tell the whole story: we did not count other operations involving the arrays that keep information about each digit value. Accordingly, the running time of GenMateSort_MDPLoop was found to be larger than that of GenMateSort_DC (see Tables 5 and 6). This point can also be seen from another angle. If the permutation loop method is so good, why not use it for 'single value' partitioning (e.g. use it for radix = 2)? However, the number of element accesses [swaps] for the 'single value' version of the permutation-loop method for random uniformly distributed data is n + n/2 [n/2], obtained either by following through the algorithm logic or by substituting r = 2 in Lemma 5. This is larger than the counts for Partitioning_Method1 (see Lemma 3), where the count of element accesses [swaps] is n [n/2].
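The permutation-loop idea can be sketched in Python as follows. This is a simplified illustration rather than a transcription of Listing 6: the function name, the `digit_of` callback and the exclusive group end positions are assumptions made for the sketch. It keeps the optimization of skipping slots that already hold a correctly placed element, and it returns the swap count so the n - 1 bound can be checked.

```python
def permutation_loop_partition(keys, digit_of, r):
    """Group keys in place by digit value (0..r-1).

    Returns (end, swaps): end[d] is one past the last slot of group d,
    swaps is the number of element swaps performed.
    """
    count = [0] * r
    for x in keys:                       # counting pass: n element accesses
        count[digit_of(x)] += 1

    start, total = [0] * r, 0
    for d in range(r):                   # prefix sums give each group's first slot
        start[d] = total
        total += count[d]
    end = [start[d] + count[d] for d in range(r)]

    nxt = start[:]                       # next unsettled slot of each group
    swaps = 0
    for d in range(r):
        while nxt[d] < end[d]:
            g = digit_of(keys[nxt[d]])
            if g == d:                   # already in its own group
                nxt[d] += 1
                continue
            while digit_of(keys[nxt[g]]) == g:   # skip elements already in place
                nxt[g] += 1
            keys[nxt[d]], keys[nxt[g]] = keys[nxt[g]], keys[nxt[d]]
            nxt[g] += 1
            swaps += 1
    return end, swaps


sample = [0x21, 0x10, 0x2F, 0x05, 0x11, 0x00]
ends, n_swaps = permutation_loop_partition(sample, lambda x: x >> 4, 16)
print(sample, n_swaps)
```

Each swap settles one out-of-place element, so the swap count cannot exceed the number of elements that start outside their group, which is the quantity estimated by Lemma 5.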

5. Using Matesort algorithms for text data


The results of the previous section have established that, for random data, general radix Matesort is no better than binary Matesort. However, as demonstrated here, general radix Matesort is able to exploit data (and encoding) redundancy, something that binary Matesort is oblivious to. For example, if we know beforehand that the text is limited to English uppercase letters, then, rather than scanning the whole range of values (0-255) represented by a byte, the range should be limited to 65-90 (corresponding to the ASCII encoding of the letters 'A'-'Z'). Here we consider several Matesort algorithms: binary Matesort, GenMatesort_Seq, GenMatesort_MDPLoop, GenMatesort_DC and GenMatesort_DC_Opt. GenMatesort_MDPLoop is the same as given in Listing 6. Listing 7 shows how BitPartition (used by binary Matesort) and DigitPartition are adapted for text data. For BitPartition, the highest bit location (charcount*8-1) corresponds to the leftmost bit of the first string character. Similarly, for DigitPartition, the highest character location (charcount-1) corresponds to the first string character. We have observed that representing the data using an array of strings rather than an array of character arrays is faster. This is probably because array element assignment in the former case involves pointer movement, whereas, in the latter case, the statement 'char[] t = A[pivotloc]' is apparently doing character-by-character copying.

5.1. Exploiting data redundancy

As shown in Listing 7, GenMatesort_DC can use a modified division rule for the computation of mid,


    void GenMateSort_MDPLoop(string[] A, int lo, int hi, int digitloc)
    {   int digitval, k;
        if ((lo < hi) && (digitloc >= 0))
        {   int[] splitloc = MultiDigitPartition_PLoop(A, lo, hi, digitloc);
            for (digitval = 0; digitval < 255; digitval++)   // range depends on radix
            {   k = splitloc[digitval];
                if (k == -1) continue;
                GenMateSort_MDPLoop(A, lo, k, digitloc-1);
                lo = k+1;
            }
            GenMateSort_MDPLoop(A, lo, hi, digitloc-1);
        }
    }

    int[] MultiDigitPartition_PLoop(string[] A, int lo, int hi, int charloc)
    {   string t; int grp, sumcnt, pi, lastgrp = 0;
        int[] endloc = new int[256];
        int[] startloc = new int[256];
        int[] count = new int[256];       // overloaded: also used for currentloc in a group
        charloc = charcount-1-charloc;    // added for order fixing for text data
        for (int i = 0; i <= 255; i++) { endloc[i] = -1; count[i] = 0; }
        // distribute elements based on value of current digit (char)
        for (int i = lo; i <= hi; i++) count[A[i][charloc]]++;
        sumcnt = lo-1;
        for (int i = 0; i <= 255; i++)
            if (count[i] > 0)
            {   startloc[i] = sumcnt+1;
                sumcnt = sumcnt+count[i];
                endloc[i] = sumcnt;
                count[i] = startloc[i];   // reuse count for currentloc
                lastgrp = i;
            }
        // optimize: hi is reduced by size of last group
        hi = startloc[lastgrp]-1;
        for (int i = lo; i <= hi; i++)
            while (true)
            {   grp = A[i][charloc];
                if ((i >= startloc[grp]) && (i <= endloc[grp])) break;
                // get sort position for A[i]
                // pi = count[grp]; count[grp]++;   // optimize: replaced by next loop
                while (true)
                {   pi = count[grp]; count[grp]++;
                    if (A[pi][charloc] != grp) break;
                }
                // swap A[i] with A[pi]
                t = A[pi]; A[pi] = A[i]; A[i] = t;
            }
        return endloc;   // note: endloc of last group not needed by caller
    }

Listing 6. GenMateSort_MDPLoop, general radix Matesort using permutation loop partitioning. The version given is for radix 256, used for text data.


    int BitPartition(string[] A, int lo, int hi, int bitloc)
    {   string t;
        // next 2 lines: map bitloc to charloc and bitloc within char
        bitloc = charcount*8-1-bitloc;
        int charloc = bitloc / 8; bitloc = 7 - (bitloc % 8);
        int Mask = 1 << bitloc;
        int pivotloc = lo-1;
        for (int i = lo; i <= hi; i++)
            // if ( ((A[i][charloc] >> bitloc) & 0x1) == 0 )
            if ( (A[i][charloc] & Mask) <= 0 )
            {   pivotloc++; t = A[i]; A[i] = A[pivotloc]; A[pivotloc] = t; }
        return pivotloc;
    }

    int DigitPartition(string[] A, int lo, int hi, int charloc, int pivot)
    {   string t;
        charloc = charcount-1-charloc;    // order fix: remap charloc
        int pivotloc = lo-1;
        for (int i = lo; i <= hi; i++)
            if ( A[i][charloc] <= pivot )
            {   pivotloc++; t = A[pivotloc]; A[pivotloc] = A[i]; A[i] = t; }
        return pivotloc;
    }

    void GenMatesort_DC(string[] A, int lo, int hi, int digitloc, int fromdigitval, int todigitval)
    {   int mid, k;
        if ((lo < hi) && (digitloc >= 0))
            if (fromdigitval < todigitval)
            {   // mid = (fromdigitval + todigitval)/2;
                mid = (7*fromdigitval + 3*todigitval)/10;   // seven-three rule
                k = DigitPartition(A, lo, hi, digitloc, mid);   // pivot is mid
                GenMatesort_DC(A, lo, k, digitloc, fromdigitval, mid);
                GenMatesort_DC(A, k+1, hi, digitloc, mid+1, todigitval);
            }
            else GenMatesort_DC(A, lo, hi, digitloc-1, 64, 90);
            // else GenMatesort_DC(A, lo, hi, digitloc-1, 0, 255);
    }

Listing 7. BitPartition and DigitPartition adapted for text data (string array). Also shown is GenMatesort_DC using the seven-three rule and restricted range (64-90); commented lines are for the unrestricted range.
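The bit-location remapping at the top of BitPartition in Listing 7 can be checked in isolation with a short Python sketch. The helper name `locate_bit` and the explicit `charcount` argument are assumptions made for the sketch (the listing reads charcount from enclosing scope), and characters are assumed to be 8-bit codes.

```python
def locate_bit(bitloc, charcount):
    """Map a bit location (charcount*8 - 1 = most significant) to (charloc, mask)."""
    pos = charcount * 8 - 1 - bitloc     # 0 for the leftmost bit of the string
    charloc = pos // 8                   # which character holds the bit
    bit_in_char = 7 - (pos % 8)          # 7 = leftmost bit within that character
    return charloc, 1 << bit_in_char


def bit_partition(a, lo, hi, bitloc, charcount):
    """Move strings whose selected bit is 0 to the front of a[lo..hi]."""
    charloc, mask = locate_bit(bitloc, charcount)
    pivotloc = lo - 1
    for i in range(lo, hi + 1):
        if (ord(a[i][charloc]) & mask) == 0:
            pivotloc += 1
            a[i], a[pivotloc] = a[pivotloc], a[i]
    return pivotloc


print(locate_bit(159, 20))   # highest bit location of a 20-character string
print(locate_bit(0, 20))     # lowest bit location
```

With this mapping, decreasing bitloc walks the string from its first character to its last, which is what makes the binary sort order agree with lexicographic order.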

such as 'mid = (x*fromdigitval + (10 - x)*todigitval)/10' for some integer x between 1 and 9. In general, a good division rule is one that partitions the input data into two nearly equal-size halves. After some investigation, it was observed that the seven-three division rule produced timing results that were about 15% less than those produced by the normal (i.e. x = 5) division rule. However, we have omitted these figures because they remain larger (albeit slightly) than the figures obtained for GenMatesort_DC_Opt.

There is a simple explanation why the seven-three rule works better than the normal rule. Since the letters at the lower end of the English alphabet occur more frequently than the letters at the higher end (e.g. the letters 'A'-'D' are more frequent than the letters 'W'-'Z'), the division point should be moved closer toward the lower end in order to have an approximately 50-50 split of the data, and the seven-three rule does exactly that. With the seven-three rule applied to the '@'-'Z' range, we get the two groups '@'-'H' and 'I'-'Z'. Clearly, considering letter frequency again, the 'I'-'Z' group is expected to be more evenly split with the seven-three rule than with the normal rule.

Rather than using a fixed division rule, the mid digit may be determined based on actual character frequency, but this was found to be slow. We have opted for an approximate solution. Listing 8 shows GenMatesort_DC_Opt. This is basically GenMatesort_DC modified to compute the exact endpoints of the fromdigitval-todigitval range by a process of scanning for minimum and maximum digit (character) values. This is only done when a particular digit location is visited for the first time for a given lo-hi group of elements. During such a scan, the digit values are summed and the average is used as the mid value at that time. As shown in Table 5, this produced the fastest timing results.
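The two ways of choosing mid discussed above, a fixed division rule and a data-driven average, can be compared with a small Python sketch. The function names are illustrative; `fixed_mid` uses integer division as the listings do, and `scanned_range_and_mid` is a simplification that averages over every character in the group.

```python
def fixed_mid(from_val, to_val, x=5):
    """Division rule mid = (x*from + (10 - x)*to) / 10, with integer division."""
    return (x * from_val + (10 - x) * to_val) // 10


def scanned_range_and_mid(chars):
    """Scan one character column: exact (min, max) range and the average as mid."""
    codes = [ord(c) for c in chars]
    return min(codes), max(codes), sum(codes) // len(codes)


print(fixed_mid(64, 90))          # normal rule (x = 5) on the restricted range
print(fixed_mid(64, 90, x=7))     # seven-three rule on the restricted range
print(scanned_range_and_mid("ABCZ"))
```

A larger x moves mid toward fromdigitval, that is, toward the more frequent low letters. The scanned variant adapts to whatever letters are actually present in the group, at the cost of one extra pass over it.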

For the timing results shown in Table 5, the input data consisted of one million fixed-length strings (with lengths fixed at 20, 30, 40 and 50 characters) extracted from English documents. A string is formed from the upper-case letters 'A' to 'Z' plus the '@' character, which is used as a replacement for the blank character because its ASCII code (64) immediately precedes the ASCII code for the letter 'A'.

6. Concluding remarks
In this paper, we have presented and analyzed a number of in-place MSD radix sort algorithms, collectively referred to as Matesort algorithms. These algorithms are evolved from the classical radix exchange sort. Experiments have shown that binary Matesort is of comparable speed to Quicksort for random uniformly distributed integer data. Unlike Quicksort, which becomes slower as data redundancy increases

    void GenMateSort_DC_Opt(string[] A, int lo, int hi, int digitloc, int fromdigitval, int todigitval)
    {   int mid, k; int i, charloc; int c; int sum;
        if ((lo < hi) && (digitloc >= 0))
        {   sum = 0;
            if (fromdigitval == 0)   // set fromdigitval, todigitval range
            {   charloc = charcount-1-digitloc;   // added for order fixing for text
                fromdigitval = A[lo][charloc]; todigitval = fromdigitval;
                for (i = lo+1; i <= hi; i++)
                {   c = A[i][charloc]; sum = sum + c;
                    if (c < fromdigitval) fromdigitval = c;
                    else if (c > todigitval) todigitval = c;
                }
            }
            if (fromdigitval < todigitval)
            {   if (sum > 0)   // if range is set then set mid to average value
                    mid = sum/(hi-lo);
                else mid = (fromdigitval + todigitval)/2;
                // else mid = (7*fromdigitval + 3*todigitval)/10;   // seven-three rule
                k = DigitPartition(A, lo, hi, digitloc, mid);
                GenMateSort_DC_Opt(A, lo, k, digitloc, fromdigitval, mid);
                GenMateSort_DC_Opt(A, k+1, hi, digitloc, mid+1, todigitval);
            }
            else GenMateSort_DC_Opt(A, lo, hi, digitloc-1, 0, 255);
            // In the statement above, fromdigitval parameter (0) is used as a flag to tell
            // the start of examination of a digit location (digitloc) for this (lo,hi) group
        }
    }

Listing 8. GenMateSort_DC_Opt: GenMateSort_DC optimized to do range restriction automatically.
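The automatic range restriction of Listing 8 can be sketched in Python for equal-length strings. This is an approximation of the listing under stated assumptions, not a line-by-line port: `None` replaces the 0 flag that marks the first visit to a character location, character locations are counted from the left (0 is the first character), and the average is taken over all elements of the group.

```python
def digit_partition(a, lo, hi, charloc, pivot):
    """Move strings whose character code at charloc is <= pivot to the front."""
    pivotloc = lo - 1
    for i in range(lo, hi + 1):
        if ord(a[i][charloc]) <= pivot:
            pivotloc += 1
            a[pivotloc], a[i] = a[i], a[pivotloc]
    return pivotloc


def gen_matesort_dc_opt(a, lo, hi, charloc, charcount, from_val=None, to_val=None):
    """Sort equal-length strings a[lo..hi] in place, scanning each group's range."""
    if lo >= hi or charloc >= charcount:
        return
    total = None
    if from_val is None:                  # first visit: find exact range and sum
        codes = [ord(a[i][charloc]) for i in range(lo, hi + 1)]
        from_val, to_val, total = min(codes), max(codes), sum(codes)
    if from_val < to_val:
        if total is not None:
            mid = total // (hi - lo + 1)  # average value, always below to_val here
        else:
            mid = (from_val + to_val) // 2
        k = digit_partition(a, lo, hi, charloc, mid)
        gen_matesort_dc_opt(a, lo, k, charloc, charcount, from_val, mid)
        gen_matesort_dc_opt(a, k + 1, hi, charloc, charcount, mid + 1, to_val)
    else:
        # All strings share this character: continue with the next one and rescan.
        gen_matesort_dc_opt(a, lo, hi, charloc + 1, charcount)


words = ["PEAR@", "APPLE", "MANGO", "APPLY", "GRAPE", "APPLE", "LEMON"]
gen_matesort_dc_opt(words, 0, len(words) - 1, 0, 5)
print(words)
```

Because the recursion deepens by a few levels per character location, long strings may need an iterative formulation or a raised recursion limit in Python.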


Table 5
Execution times for English text data. The times in parentheses are for radix values restricted to 27 characters (ASCII codes 64-90). The second number in each GR MS_DC_OPT entry is for the seven-three rule. String array size (n) = 10^6. Execution time (msec).

    Algorithm          Length 20        Length 30        Length 40        Length 50
    .Net Sort          6578             6859             7125             7406
    QuickSort          3593             3718             4109             4531
    Binary Matesort    6375             9015             10703            12546
    GR MS_Seq          37859 (6656)     42859 (7640)     48973 (8765)     52593 (9703)
    GR MS_MDPLoop      7390 (6343)      11546 (10140)    14890 (12406)    19593 (15968)
    GR MS_DC           4921 (4828)      5890 (5781)      6953 (6890)      7890 (7718)
    GR MS_DC_OPT       3468, 3281       3640, 3421       3828, 3625       4046, 3828

and may degenerate into an O(n^2) algorithm, the binary Matesort algorithm is unaffected and remains bounded by O(kn), where k is the element size in bits. We have discussed three partitioning methods for use by the general radix Matesort algorithm: sequential, divide-and-conquer and permutation-loop. For English text, experiments have shown that the general radix Matesort using divide-and-conquer partitioning is the fastest. Moreover, the divide-and-conquer method can be optimized further to exploit data redundancy. Finally, the experimental results reported here show that Matesort (and also Quicksort) is much faster (twice as fast in many instances) than the built-in Microsoft .Net Array Sort method, which suggests that Microsoft ought to examine their implementation.

References
[1] D.E. Knuth, The Art of Computer Programming, Vol. 3: Sorting and Searching (Addison-Wesley, Reading, MA, 1973).
[2] C.A.R. Hoare, Quicksort, Computer Journal 5(1) (1962) 10-16.
[3] J.W. Williams, Algorithm 232: Heapsort, Communications of the ACM 7(6) (1964) 347-8.
[4] P. Hildebrandt and H. Isbitz, Radix exchange - an internal sorting method for digital computers, Journal of the ACM 6(2) (1959) 156-63.
[5] I.J. Davis, A fast radix sort, The Computer Journal 35(6) (1991) 636-42.
[6] R. Sedgewick, Algorithms (Addison-Wesley, Reading, MA, 1988).
[7] P.M. McIlroy, K. Bostic and M.D. McIlroy, Engineering radix sort, Computing Systems 6(1) (1993) 5-27.
[8] E.W. Dijkstra, A Discipline of Programming (Prentice Hall, Upper Saddle River, 1997).
[9] C.L. McMaster, An analysis of algorithms for the Dutch National Flag Problem, Communications of the ACM 21(10) (1978) 842-6.
[10] A. Andersson and S. Nilsson, Implementing radixsort, ACM Journal of Experimental Algorithmics 3 (1998) article 7. Available at http://wotan.liu.edu/docis/dbl/acmexa (accessed 18 July 2005).
[11] D. Rig, A Comparison of Sorting Algorithms (2003). Available at www.devx.com/vb2themax/Article/19900 (accessed September 2004).
[12] S. Baase and A.V. Gelder, Computer Algorithms (Addison-Wesley, Boston, MA, 2000).
[13] A. Maus, ARL - a faster in-place, cache friendly sorting algorithm. In: N. Stol et al. (eds), Norsk Informatikkonferanse NIK'2002, Kongsberg, November 2002 (2002) 85-95. Available at www.nik.no/2002/ (accessed 18 July 2005).
[14] A. Al-Badaraneh and F. Al-Aker, Efficient in-place radix sorting, Informatica 15(3) (2004) 295-302.
[15] U. Manber, Introduction to Algorithms: A Creative Approach (Addison-Wesley, Reading, MA, 1989).
[16] W. Dobosiewicz, Sorting by distributive partition, Information Processing Letters 7(1) (1978) 1-6.
