
bettingexpert Blog How To Build A Monte Carlo Simulation


How To Build A Monte Carlo Simulation
What is a Monte Carlo simulation? How can it help you project
end-of-season points totals and finishing positions? Today on the
blog Zach Slaton introduces Monte Carlo simulations and shows
us how to develop one.
By Zach Slaton
Published: $th June 201)
Updated: 2+th February 201-

This is the fourth post in Zach Slaton's series explaining how to use simple-but-effective
statistical concepts that can help provide a richer understanding of the data already at
your fingertips. The first post in the series dealt with how linear regression prediction
intervals can yield deeper insights; the second post explained how to use exponential
regression to quantify rare events like goal scoring totals; and the third post explained
how ordered logistic regression can be used to forecast individual match outcomes.
Today Zach explains how individual match outcome likelihoods can be used to simulate
the outcome of all the remaining fixtures in a season.
In my last post in this series I explained how an ordered logistic regression could be built
to explain soccer match outcomes, and even provided several examples of the types of
inputs I've included in the ordered logistic regression models I have built over time.
These models are highly useful in understanding the potential impact statistically
significant predictors may have on the likelihood of a match ending in a win, tie, or loss.
But how can those individual building blocks be assembled to form a comprehensive
forecast for how all of the teams in a league may sit relative to each other over the next
week, next month, or at the end of the season? There appears to be a nearly infinite
number of point combinations that could be realised given there are 380 matches in a
20-team league's season, each match could end in a loss, tie, or win for each team, and
no match has the odds of each outcome evenly split into thirds. How can an analyst
make sense of such a range of possible outcomes?
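To put a rough number on "nearly infinite", the short Python sketch below counts the distinct win/tie/loss sequences for a 380-match season. It counts result sequences rather than distinct point totals, which is a simplifying assumption made here for illustration.

```python
# Each of the 380 matches has three possible results, so the number of
# distinct win/tie/loss sequences for the season is 3 ** 380.
MATCHES = 380
OUTCOMES_PER_MATCH = 3

season_sequences = OUTCOMES_PER_MATCH ** MATCHES
digits = len(str(season_sequences))
print(f"3^{MATCHES} has {digits} digits")
```

Enumerating a space of that size is not practical, which is why sampling from it is attractive.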
Introducing Monte Carlo Simulation
One answer to this complexity is Monte Carlo simulation. As the name implies, Monte
Carlo simulation is essentially a "model of chance." Wikipedia describes it as:
"...a broad class of computational algorithms that rely on a repeated random sampling to
obtain numerical results, i.e. by running simulations many times over in order to calculate
those same probabilities heuristically just like actually playing and recording your results
in a real casino situation... Monte Carlo methods are mainly used for three distinct
problems: optimisation, numerical integration, and generation of samples from a
probability distribution."
The repeated random simulations of individual inputs can thus project the likelihood of
an aggregate outcome if one has the probability of outcome(s) for each event. Such an
approach may sound intimidating, but a solution can be found in the
much-maligned-but-infinitely-useful Microsoft Excel.
Simulating Individual Match Results
To start, assume that the analyst interested in the aggregate outcome has created a
model in their statistical tool of choice. In this case, it's a model that projects the
likelihood of winning, tying (drawing), or losing a match. The model is applied to each
match in a league season, in this case Major League Soccer in the United States.
The first order of business is to create a random outcome for each match, and the
method used within this example is Excel's RAND function that creates a random number
between 0 and 1. The output of the RAND function is then compared to the match
outcomes using the following logic:
IF RAND < Probability of Loss
THEN match outcome is a loss
ELSE
  IF RAND < (Probability of Loss + Probability of Tie/Draw)
  THEN match outcome = tie/draw
  ELSE match outcome = win
A screenshot of just such a setup is provided below.
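For readers who prefer code to spreadsheet formulas, the same comparison logic can be sketched in Python. This is only an illustrative equivalent, not the workbook from the post: the function name `simulate_match` and the probabilities 0.30 and 0.25 are hypothetical, and `random.random()` stands in for Excel's RAND.

```python
import random

def simulate_match(p_loss, p_draw, rng=random):
    """Return 'loss', 'draw' or 'win' for one match from a single random draw."""
    r = rng.random()  # uniform in [0, 1), playing the role of Excel's RAND
    if r < p_loss:
        return "loss"
    if r < p_loss + p_draw:
        return "draw"
    return "win"  # the win probability is whatever remains

# Hypothetical fixture: 30% loss, 25% draw, 45% win.
rng = random.Random(42)
results = [simulate_match(0.30, 0.25, rng) for _ in range(10_000)]
print({k: results.count(k) / len(results) for k in ("loss", "draw", "win")})
```

Note that one random number is drawn per match and reused in both comparisons; drawing a fresh number for the second comparison would distort the draw and win probabilities.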
Now that the analyst has a random outcome assigned to every match in a season, how
should they go about creating a Monte Carlo simulation and how many random
simulations of the season should they run?
Last things first: the answer is that "it depends". For a typical season most analysts run
10,000 simulations. This number is often found to offer the proper balance between a
simulation duration of a couple of hours and model resolution given the number of
interactions due to each individual match.
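One way to reason about "it depends" is the standard error of a simulated probability, which shrinks with the square root of the number of runs. The sketch below uses the textbook binomial formula and assumes independent runs; it is a general statistical result rather than something taken from the post, and the function name is hypothetical.

```python
import math

def proportion_standard_error(p, n_runs):
    """Standard error of an estimated probability p from n_runs independent runs."""
    return math.sqrt(p * (1.0 - p) / n_runs)

# Worst case is p = 0.5; show how the approximate 95% margin narrows.
for n_runs in (100, 1_000, 10_000, 100_000):
    se = proportion_standard_error(0.5, n_runs)
    print(f"{n_runs:>7} runs: +/- {1.96 * se:.4f}")
```

Under these assumptions, 10,000 runs pin a finishing-position likelihood down to roughly plus or minus one percentage point, and ten times as many runs buys only about a threefold improvement.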
Utilising Pivot Tables to Roll Up Match Results
Now first things last: Microsoft Excel offers a solution for running those 10,000
simulations. Pivot table functionality within Excel is the perfect way to roll up the results
from the individual matches in point total, goal differential, and win/draw/loss outcome
count. These totals are achieved by creating pivot tables with "team/club" on the rows
and either match outcome or points on the columns. In either case, the values within the
pivot table are the sums of either match outcome or points. See the example below.
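The roll-up a pivot table performs can be approximated in a few lines of Python. In this sketch the team labels, the three sample results, and the convention that each outcome is recorded from the home team's point of view are all assumptions made for illustration, as is the usual 3/1/0 points scheme.

```python
from collections import defaultdict

POINTS = {"win": 3, "draw": 1, "loss": 0}
FLIP = {"win": "loss", "draw": "draw", "loss": "win"}

# Hypothetical simulated results: (home team, away team, outcome for the home team).
simulated = [
    ("A", "B", "win"),
    ("B", "C", "draw"),
    ("C", "A", "loss"),
]

def roll_up(matches):
    """Sum points and count outcomes per team, like a pivot table with teams on rows."""
    table = defaultdict(lambda: {"points": 0, "win": 0, "draw": 0, "loss": 0})
    for home, away, home_outcome in matches:
        # Record the match once for each side, flipping the outcome for the away team.
        for team, outcome in ((home, home_outcome), (away, FLIP[home_outcome])):
            table[team][outcome] += 1
            table[team]["points"] += POINTS[outcome]
    return dict(table)

table = roll_up(simulated)
for team, row in sorted(table.items()):
    print(team, row)
```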
The other benefit of using a pivot table is that refreshing it is a "calculation" within Excel,
and the RAND function re-calculates each time there is a calculation elsewhere in an
Excel workbook. This means that 10,000 simulated seasons can be created with the
RAND function, a few linked pivot tables, and fewer than twenty lines of Visual BASIC
code that could be learned in a first-level computer science course and consists of do/while
loops of copy/paste commands of the projected table of each simulated season.
Doing so should produce results that look like this:
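Outside Excel, the refresh-and-copy loop amounts to repeating the season simulation and storing each projected table. The Python sketch below is a stand-in for that macro, not a translation of it: the fixture list, its probabilities, and the function names are hypothetical.

```python
import random
from collections import Counter

# Hypothetical remaining fixtures: (home, away, P(home loss), P(draw)).
FIXTURES = [
    ("A", "B", 0.30, 0.25),
    ("B", "C", 0.40, 0.30),
    ("C", "A", 0.35, 0.25),
]

def simulate_season(fixtures, rng):
    """Play every remaining fixture once and return projected points per team."""
    points = Counter({team: 0 for f in fixtures for team in f[:2]})
    for home, away, p_loss, p_draw in fixtures:
        r = rng.random()
        if r < p_loss:
            points[away] += 3
        elif r < p_loss + p_draw:
            points[home] += 1
            points[away] += 1
        else:
            points[home] += 3
    return dict(points)

def run_simulations(fixtures, n_runs=10_000, seed=1):
    """Repeat the season n_runs times, keeping one projected table per run."""
    rng = random.Random(seed)
    return [simulate_season(fixtures, rng) for _ in range(n_runs)]

runs = run_simulations(FIXTURES)
print(len(runs), runs[0])
```

Fixing the seed makes a batch of runs repeatable, which helps when comparing "what if" scenarios.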
The 10,000 simulations of the remaining fixtures now must be added to the point totals,
match outcomes, and goal differential to date. This can be done via Excel's VLOOKUP
command referencing another pivot table built using the results to date, and adding the
returned value to the value for the same attribute in the projected results. Auto-filling the
columns with VLOOKUP commands provides projected values for all of the variables,
and all that's left to do is sort the results by run, then point total, then by the league's tie
breakers.
Doing this sort ensures data stays within the respective run in which it was generated,
and it provides projected table positions within each season.
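The lookup-and-sort step can be sketched in Python as a dictionary lookup followed by a sort inside each run. The point totals below are hypothetical, and sorting tied teams alphabetically is only a placeholder for the league's real tie-breakers.

```python
# Hypothetical inputs: points already earned, and projected points from two runs.
POINTS_TO_DATE = {"A": 10, "B": 12, "C": 9}
PROJECTED = [
    {"A": 6, "B": 1, "C": 1},  # run 0
    {"A": 0, "B": 4, "C": 4},  # run 1
]

def final_tables(points_to_date, projected_runs):
    """Return one list of (position, team, total points) per simulated run."""
    tables = []
    for run in projected_runs:
        # Dictionary lookup plays the role of VLOOKUP against the results to date.
        totals = {t: points_to_date[t] + run.get(t, 0) for t in points_to_date}
        # Points descending; team name is a placeholder for real tie-breakers.
        ordered = sorted(totals.items(), key=lambda kv: (-kv[1], kv[0]))
        tables.append([(pos, team, pts)
                       for pos, (team, pts) in enumerate(ordered, start=1)])
    return tables

for run_id, table in enumerate(final_tables(POINTS_TO_DATE, PROJECTED)):
    print(run_id, table)
```

Because each run is ranked on its own, positions never mix results from different simulated seasons.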
All that's left to generate is a likelihood of each team's finish position, and another pivot
table of table position versus team can do this. In this case the pivot table plots teams on
the rows and table position in the columns and values. The pivot table's values will need
to be changed to a count rather than a sum (the model is measuring how many times a
team is projected to finish in a table position), and the "Show data as:" field should be
marked as "% of row".
The resultant pivot table should look like this:
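The count-then-percent-of-row calculation can be illustrated in Python as follows. The four finishing orders are made-up sample data, and the function name is hypothetical.

```python
from collections import Counter, defaultdict

# Hypothetical finishing orders from four simulated seasons (first place first).
FINISHES = [
    ["A", "B", "C"],
    ["B", "A", "C"],
    ["A", "C", "B"],
    ["A", "B", "C"],
]

def position_likelihoods(finishes):
    """Count finishes per team and position, then express each row as a share."""
    counts = defaultdict(Counter)
    for order in finishes:
        for position, team in enumerate(order, start=1):
            counts[team][position] += 1
    n_runs = len(finishes)
    return {team: {pos: c / n_runs for pos, c in sorted(by_pos.items())}
            for team, by_pos in counts.items()}

likelihoods = position_likelihoods(FINISHES)
for team in sorted(likelihoods):
    print(team, likelihoods[team])
```

Each team's row sums to 1, mirroring the "% of row" setting.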
That's it. That is all that is required to build a Monte Carlo simulation. Users of the
simulation can now update its inputs (matches played versus upcoming fixtures) as
frequently as they like, run "what if" studies for the next week's matches, and any other
variety of forecasts. The process can become highly automated and take less than ,-
minutes a week to update if special attention is paid to the Excel workbook's
construction. A person can automate even the process of combining prior matches and
future fixtures with VLOOKUP and sort functions with even the most basic programming
skills via Excel's "record macro" function.
Applications of Monte Carlo Simulation
Here are some examples of how this very basic approach can be utilised in competition
forecasting.
Transfer Price Index Simulations of the English Premier League Season
Transfer Price Index's mSq£ model, which utilises venue and relative squad costs as
inputs, was used to forecast the most likely final table positions of each club on a weekly
basis. This model quantified individual match outcomes' impacts on each team's likely
finish position (it wasn't just Manchester United's win over City in October that swung
the title their way), as well as just how much of an advantage a club might have surrendered
along the way (see Tottenham's 6-%7 likelihood of a Top 4 after beating Arsenal in early
March and how much it fell away over the final two-and-a-half months of the season).
MLS Eastwood Index
Blogger Martin Eastwood created the Eastwood Index as a way to know where teams
stand relative to each other, how results against clubs with various levels of quality
impact a team's rating, and how the ratings difference between two clubs can help
predict future match outcomes.
This model has been applied to MLS, and the Monte Carlo simulations have been used
to quantify things like the impact the Seattle Sounders' poor start had on the danger (or
lack thereof) of not making the league playoffs.
CONCACAF World Cup Qualification
Finally, Monte Carlo simulations can even be used to run a post-mortem "what if" using
others' forecast match outcomes after the matches are completed. One such source for
such match forecasts is bookmaker odds. Bookmakers are looking to maximise their
profit, so they often don't forecast more than one match in advance, or only a few
matches in advance if the schedule is compact. As an example, Monte Carlo methods
have been paired with bookmaker odds to help analyse the likelihood of current point
totals within CONCACAF's final round of World Cup qualifying.
While everyone knows Mexico has struggled from match to match, it turns out that
bookmakers only foresaw Mexico's current three points or less in ,-% of the aggregate
outcomes contained in their forecasts. Meanwhile, the United States' four points puts
them squarely within bookmaker expectations.
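A post-mortem of this kind boils down to asking how often a team's simulated points total lands at or below what actually happened. The Python sketch below shows the idea with made-up (win, draw, loss) probabilities; they are not the bookmaker figures behind the numbers above, and the function name is hypothetical.

```python
import random

# Hypothetical (win, draw, loss) probabilities for one team's completed matches.
MATCH_PROBS = [(0.60, 0.25, 0.15), (0.45, 0.30, 0.25), (0.70, 0.20, 0.10)]

def prob_points_at_most(match_probs, threshold, n_runs=10_000, seed=7):
    """Estimate the probability that total points are at or below threshold."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_runs):
        points = 0
        for p_win, p_draw, _p_loss in match_probs:
            r = rng.random()
            if r < p_win:
                points += 3
            elif r < p_win + p_draw:
                points += 1
        if points <= threshold:
            hits += 1
    return hits / n_runs

print(prob_points_at_most(MATCH_PROBS, threshold=3))
```

A small result would suggest the forecasts treated the observed total as unlikely, while a result near the middle of the distribution would suggest it was in line with expectations.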
Conclusion
Using Monte Carlo simulation methods allows analysts to properly measure and model
discrete events like soccer matches, and then roll the results of those discrete events up
to a bigger forecast over a season or more.
More importantly, Monte Carlo simulation methods provide a probabilistic outlook to
such forecasts, allowing the analyst to express their level of statistical certainty (or
uncertainty) in the forecast. This is key to thinking in a noisy, uncertain sport like soccer,
and as this post has attempted to explain, it's not too complex an analysis to set up. All
that's needed is a probabilistic model, a tool like Microsoft Excel for storing results, and a
bare minimum of programming capability.
