
Imperial Journal of Interdisciplinary Research (IJIR)

Vol-2, Issue-5, 2016


ISSN: 2454-1362, http://www.onlinejournal.in

Review to the Duckworth-Lewis Method Using Data Mining Techniques

Rohan Brahme1*, Roshan Birar2, Poonam Kadnar3 & Prof. Suruchi Malao4
1,2,3,4 Department of Computer Engineering, K. K. Wagh Institute of Engg. Education & Research, Savitribai Phule Pune University, India
Abstract- The Duckworth-Lewis (D/L) system is a mathematical formulation used to set a target score in cricket matches interrupted by bad weather. The D/L method considers only two factors when revising the target: the number of overs remaining and the number of wickets in hand, which together determine the runs that can be scored in the rest of the innings. We use the WEKA tool to find bias in the current D/L system and to illustrate it. The D/L system has been observed to be biased towards the team batting first and the team winning the toss, particularly in scenarios such as multiple interruptions in the same match and the fall of wickets in the death overs while batting second. Bias in this context means that the properties of a system such as the D/L method can be exploited to a team's advantage. We also show that exploiting the system in this way allows the match winner to be predicted at a rate better than chance. Using this analysis, we propose a modification to the existing D/L system that treats the patterns observed in the dataset as an additional resource, alongside the existing resources, to reduce the bias when predicting the target score.

Keywords- Cricket, Duckworth-Lewis, WEKA, C4.5, Decision Trees.

I. INTRODUCTION
A. The game of Cricket
As mentioned in [1], cricket is a bat-and-ball team sport that originated in England and is one of the most popular games in the world. Moving on from conventional Test cricket, it has gradually ventured into limited-overs formats such as ODI and T20 so that a definite result is obtained, making it more entertaining as a spectator sport. Sometimes, because of constraints such as bad weather (rain, sandstorms and bad light), floodlight failure or crowd trouble, a number of overs are lost and a definite result is not obtained. To overcome these obstacles, methods have been devised to revise target scores and/or declare a winner.

B. Duckworth-Lewis Method
Two British statisticians, Frank Duckworth and Tony Lewis, developed the Duckworth-Lewis (D/L) method, a statistical method used to set the target score for the team batting second in a limited-overs game that is interrupted by unavoidable circumstances. The D/L method, which is based on a mathematical model, considers only two resources: wickets left and overs remaining. When overs are lost, setting an adjusted target is not as simple as reducing the batting team's target proportionally, because a team batting second with wickets in hand can be expected to play more aggressively than one with the full 50 overs and hence can achieve a higher run rate. Duckworth and Lewis (1998) therefore started from the most common situation, in which two teams play a full-length game.

The above graph [1] shows the percentage of resources remaining for a team against the number of overs bowled. The curves are exponential, falling as more wickets are lost and coming down to almost zero once the 9th wicket has fallen. Duckworth and Lewis observed a close connection between the availability of these resources and a team's final score, which the algorithm tries to exploit.
In the table, the remaining overs are plotted against wickets lost.
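The exponential shape described above can be sketched in a few lines of Python. The functional form used here, Z(u, w) = Z0 * F(w) * (1 - exp(-B * u / F(w))), is a simplified exponential curve in the spirit of the Duckworth and Lewis (1998) model. The constants Z0 and B, the list F and the function names are illustrative assumptions chosen only to produce plausible-looking curves; they are not the official D/L parameters and do not reproduce the published resource table.

```python
import math

# Illustrative assumptions only: these are NOT the official D/L parameters.
Z0 = 250.0   # hypothetical average score with unlimited overs and no wickets lost
B = 0.035    # hypothetical exponential decay constant per over
# Hypothetical fraction of scoring capacity left after w wickets have fallen (w = 0..9).
F = [1.00, 0.90, 0.79, 0.67, 0.55, 0.43, 0.31, 0.20, 0.11, 0.04]


def expected_runs(overs_left: float, wickets_lost: int) -> float:
    """Average further runs expected with the given overs left and wickets lost."""
    if wickets_lost >= 10:
        return 0.0  # all out: no batting resources remain
    f = F[wickets_lost]
    return Z0 * f * (1.0 - math.exp(-B * overs_left / f))


def resources_pct(overs_left: float, wickets_lost: int) -> float:
    """Resources remaining, as a percentage of a full 50-over innings with no wickets lost."""
    return 100.0 * expected_runs(overs_left, wickets_lost) / expected_runs(50, 0)


if __name__ == "__main__":
    # Print a small resource table: rows are overs left, columns are wickets lost.
    for overs in (50, 40, 30, 20, 10, 0):
        row = " ".join(f"{resources_pct(overs, w):6.1f}" for w in (0, 3, 6, 9))
        print(f"{overs:2d} overs left: {row}")
```

With these made-up constants the percentages fall both as overs are used up and as wickets are lost, which is the qualitative behaviour the graph describes; the real percentages have to be read from the published table.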


Resource Percentage Table

The above table [3] gives the percentage of resources left. The percentages are calculated beforehand from the overs left and the wickets lost and are stored in the table, so that they are at hand when the revised target is calculated. This is the table that is consulted whenever the Duckworth-Lewis method comes into play.
However, factors such as the toss may play a crucial role in deciding the winner, since a good deal of speculation and research goes into choosing whether to bat or field first after winning the toss. For example, the pitch report, the previous history of the ground and the expected weather conditions all inform the decision. In rain-affected matches, batting first is the advantageous decision: after rain, the pitch becomes soft, the outfield becomes slow and the ball bounces unevenly, making it difficult to bat, as mentioned in [2].

C. Duckworth-Lewis Model
The objective of the D/L system was to find a method that satisfies the criteria given below.
1. It must maintain exact fairness to both sides.
2. It must give an appropriate result in all possible situations.
3. Team 1's scoring pattern should not affect the revised target for team 2 in an interrupted game.
Multiple interruptions during the same match and the fall of wickets in the death overs lead to unfair target predictions. Data mining should therefore be used to reduce such bias in these specific conditions.

D. Controversial D/L method decided matches [2]
Some actual scenarios from ODIs that highlight the shortcomings of the D/L method:

1. South Africa vs. New Zealand, Durban, November 2000
New Zealand, batting first, were 81 for 5 after 27.2 overs when rain reduced the game to 49 overs per side. Then, with New Zealand on 114 for 5 in 32.4 overs, their innings was stopped due to rain, and the second innings was shortened to 32 overs. South Africa's new target according to the D/L charts was 153, but the modified version suggests that the target would have been 156. At the time the game was interrupted, New Zealand's run rate was 3.48, with five wickets down and 17.2 overs to spare. According to the modified D/L calculations, South Africa's required run rate would be 4.87.

2. West Indies vs. New Zealand, Port-of-Spain, 2002
New Zealand, batting first, had made 212 for 5 in 44.2 overs when their innings was called off and West Indies' chase was truncated to 33 overs. D/L calculated their revised target at the time as 212. Again, a comparison of run rates raises a few questions. New Zealand's run rate at the end of their innings was 4.78; West Indies' required rate in 33 overs according to D/L is 6.36, an increase of 33%.

3. South Africa vs. New Zealand, Johannesburg, WC2003
Replying to South Africa's imposing 306 for 6 in 50 overs, New Zealand, riding on Stephen Fleming's outstanding century, were 182 for 1 in 30.2 overs when rain reduced the chase to 39 overs. According to the new D/L calculations, the revised target would have been 229 (it was 226 at the time). The point of contention is this: at the time of the interruption, New Zealand's required rate was 6.35 runs per over, stretching over a period of almost 20 overs. Going by the current D/L calculations, the required run rate on resumption is 5.42, over a period of just 8.4 overs. Obviously, the rain has simplified New Zealand's task enormously (though the D/L contention is that New Zealand are reaping the rewards of being well ahead of the par score at the point of interruption).
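The run rates quoted in the scenarios above can be checked with a little arithmetic. The sketch below is only an illustration: the function names are hypothetical, overs are passed as strings in cricket notation (where "30.2" means 30 overs and 2 balls, not 30.2 decimal overs), and the figure of 307 for New Zealand's original target is an assumption derived from South Africa's 306 plus one run to win.

```python
def overs_to_balls(overs: str) -> int:
    """Convert cricket overs notation ('30.2' = 30 overs and 2 balls) to balls."""
    whole, _, balls = overs.partition(".")
    extra = int(balls or 0)
    if not 0 <= extra < 6:
        raise ValueError(f"invalid overs notation: {overs!r}")
    return int(whole) * 6 + extra


def run_rate(runs: int, overs: str) -> float:
    """Runs per over scored so far."""
    return runs * 6 / overs_to_balls(overs)


def required_rate(target: int, runs_scored: int, overs_limit: str, overs_used: str) -> float:
    """Runs per over still needed to reach the target in the remaining balls."""
    balls_left = overs_to_balls(overs_limit) - overs_to_balls(overs_used)
    return (target - runs_scored) * 6 / balls_left


if __name__ == "__main__":
    # Scenario 2: New Zealand's scoring rate, 212 runs in 44.2 overs.
    print(f"{run_rate(212, '44.2'):.2f}")
    # Scenario 3, before the rain: chasing an assumed 307 in 50 overs, 182 after 30.2 overs.
    print(f"{required_rate(307, 182, '50', '30.2'):.2f}")
    # Scenario 3, on resumption: revised target 229 in 39 overs.
    print(f"{required_rate(229, 182, '39', '30.2'):.2f}")
```

Working in balls rather than decimal overs avoids the common mistake of treating 30.2 overs as 30.2 rather than 30 and a third. The three values come out at roughly 4.78, 6.36 and 5.42, in line with the figures quoted above; the source's 6.35 appears to be the same number truncated rather than rounded.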


E. Formula for D/L score calculation [6]
Let,
S: Team 1's score
R1: Resources % available to team 1 (from R. P. table)
R2: Resources % available to team 2 (from R. P. table)
T: Target score for team 2

Case 1:
If R1 > R2,
T = S × (R2/R1) + 1
Reduces team 1's score in proportion to the reduction in resources.

Case 2:
If R1 = R2,
T = S + 1
No adjustment required.

Case 3:
If R1 < R2,
T = S + [G50 × (R2 − R1)/100] + 1
where G50, for matches involving ICC full member nations, is at present 235.
Increases team 2's target score by the extra runs that are predicted in accordance with the extra resources.

F. Project Goals
The primary goal of this project is to evaluate the Duckworth-Lewis system and identify its limitations from scenarios and patterns. In addition to evaluating the system, we want to propose an extension that is capable of addressing these limitations. We do not need to propose an entirely new model; some additional resources alongside the existing ones will do the job. To identify the shortcomings of the existing D/L method, we use the data mining classification algorithm C4.5 in the WEKA tool and propose an extension to the D/L method. We would like to show that data mining techniques (classification using the WEKA tool) can be used as effective tools to evaluate systems such as the D/L method and to devise alternative models.

II. LITERATURE SURVEY

As mentioned above, in order to extend the present system, the WEKA tool was used to identify patterns in the data. WEKA tools are mainly used to deal with different predictors that capture different aspects of the game, such as pitch degradation, net run rate and the toss. They are applied to a sample dataset of matches that were affected by D/L, whose scores were re-evaluated.
In such matches, multiple interruptions in the same match give an unfair target prediction, and the fall of wickets in the death overs while batting second causes an unfair result if the innings gets truncated [4]. We are going to use these tools to extract such patterns and will try to minimise bias. This is the foundation for the evaluation part of the project and the base input for the extension part.
Following are some observed patterns [2]:
1. Pattern 1: The team winning the toss wins the match in 66% of cases.
2. Pattern 2: The team batting first wins the match in 64% of cases.
3. Pattern 3: 54% of teams winning the toss elect to field first in rain-affected matches.
4. Pattern 4: The average difference in run rate between the winning and losing teams is not significant.

A. Inferences from the above patterns [2]
1. The D/L method has been biased towards the team batting first.
2. The D/L method has been biased towards the team winning the toss.
3. The D/L method stresses wickets more than run rate and runs scored.

Based on the above observations, we have a heuristic model to predict the winner in a match decided by the D/L method. In this model, we calculate a weighted score for each of the two teams using the following formula [2]:

Weighted Score = (overall probability of winning) × [α (toss winner or toss loser) + β (batting or fielding) + γ (rainy season or no rainy season) × (home or away or neutral)]

Historically, in matches decided by Duckworth-Lewis [2]:
1. The team batting first won on 64% of the occasions involving two of the top nine teams.
2. In 66% of the occasions, the team winning the toss has won.
3. In 54% of the occasions, the team winning the toss decides to field.
4. In 82% of the occasions, matches were affected by rain.

Where:
α = 0.66 …from (2)
β = 0.64 …from (1)
γ = 0.82 × 0.54 × 0.36 …from (3) and (4)

III. MOTIVATION


The Duckworth-Lewis method was introduced in the late 1990s and has since been adopted by all major cricketing boards. No other sport uses a statistical method to select the winning target for a match. But the peculiarities of cricket and its susceptibility to bad weather have made it imperative to find such a solution for matches where a result is mandatory. Cricket is one of the most popular games in the world. With billions of followers around the world, the D/L method, together with extensions to it, is probably the greatest contribution to the sporting world from a mathematical, statistical and operational research perspective.

IV. EVALUATION OF THE DUCKWORTH LEWIS METHOD AND IDENTIFICATION OF ITS LIMITATIONS

B. Description of Dataset [4]
The dataset consists of information on all the matches that were affected by inclement weather, together with the location, i.e. the country and the stadium, where the Duckworth-Lewis method came into use. The dataset consists mainly of One Day International (ODI) matches of teams from India, Pakistan, England, West Indies, South Africa, New Zealand, Sri Lanka, Australia, Bangladesh & Afghanistan. [2]

V. EXTENSION TO THE DUCKWORTH LEWIS METHOD
In the analysis part of the project we will show the bias of D/L towards the team batting first and the team winning the toss using data mining techniques. We also have some sample examples to show the bias in the system. Several factors cause this bias, such as multiple interruptions during the same match and the fall of wickets in the death overs, which lead to unfair changes in the target score prediction. We will therefore extract 'n' such factors using the WEKA tool, and these patterns will be provided to the Duckworth-Lewis calculator as resources, along with its existing inputs: the number of overs to be played and the number of wickets.

Bias in the system makes it possible to guess the winner of the match in rain-affected games, which sometimes calls the D/L system into question.

Target score = Target score by D/L method × [p1 × p2 × … × pn]

Where:
pi is a bias-reducing parameter for the ith pattern.

VI. CONCLUSIONS
This paper presents a novel approach to evaluating the Duckworth-Lewis system, which is used to set the target score in rain-affected cricket matches when one or both teams have had their innings shortened. Using data mining techniques such as C4.5 with the help of the WEKA tool, we will discover the bias in the D/L method from different patterns extracted from a sample dataset. The observed bias is expected to favour the team batting first and the team winning the toss. The additional resources given as input to the system will be the patterns under which D/L gives an unfair prediction, such as multiple interruptions in the same match and the fall of wickets in the death overs; these will help to reduce the bias.

VII. ACKNOWLEDGMENT
Our thanks to Prof. Dr. S. S. Sane and Mr. Sameer Mainkar for guidance and support.

VIII. REFERENCES
[1] http://en.wikipedia.org/wiki/Cricket
[2] Phanse V. & Deorah S. (2011, December). Evaluation and Extension to the Duckworth Lewis Method: A Dual Application of Data Mining Techniques. In Data Mining Workshop (ICDMW), 2011 IEEE 11th International Conference on (pp. 763-770). IEEE.
[3] Duckworth F. (2008). The Duckworth/Lewis method: an exercise in Maths, Stats, OR and communications. MSOR Connections, Vol 8 No 3, August-October 2008.
[4] http://www.espncricinfo.com/
[5] Schall R. & Weatherall D. (2013). Accuracy and fairness of rain rules for interrupted one-day cricket matches. Journal of Applied Statistics, 40(11), 2462-2479.
[6] Harshil Shah, Jay Sampat, Rushabh & Kiran Bhowmick (2015). Review of Duckworth Lewis Method.
[7] Perera H. P. & Swartz T. B. (2013). Resource estimation in T20 cricket. IMA Journal of Management Mathematics, 24(3), 337-347.

