Feedless Data Science Application

Stefan Schneider
MSc Graduate
8/19/2016
Stefan871@gmail.com
Match Overview Summary

Model                        Prediction Accuracy
Simple Neural Network        72.4%
RBF Support Vector Machine   65.9%
K-Nearest Neighbors          58.8%
Random Forest                56.5%
Based on the draft alone, my Neural Network (NN) model will predict the correct outcome
72.4% of the time.
Figure 1 Summary statistics for each model: the number of true positives, false positives,
false negatives, and true negatives each model predicted on the test set.
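As a minimal sketch of how the four counts in Figure 1 relate to the accuracy figures in the table, the snippet below tallies them for a binary outcome and computes accuracy as (TP + TN) / total. The labels are made up for illustration, and the encoding (1 = Radiant win, 0 = Dire win) is an assumption; the original data and label scheme are not shown in this document.

```python
def confusion_counts(y_true, y_pred):
    """Return (tp, fp, fn, tn) for binary labels where 1 is the positive class."""
    tp = fp = fn = tn = 0
    for actual, predicted in zip(y_true, y_pred):
        if predicted == 1 and actual == 1:
            tp += 1
        elif predicted == 1 and actual == 0:
            fp += 1
        elif predicted == 0 and actual == 1:
            fn += 1
        else:
            tn += 1
    return tp, fp, fn, tn


def accuracy(tp, fp, fn, tn):
    """Fraction of test matches whose outcome was predicted correctly."""
    return (tp + tn) / (tp + fp + fn + tn)


# Made-up labels for illustration only (assumed: 1 = Radiant win, 0 = Dire win).
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 0, 1, 1, 0]
counts = confusion_counts(y_true, y_pred)
print(counts, accuracy(*counts))  # (3, 1, 1, 3) 0.75
```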
Figure 2 ROC (Receiver Operating Characteristic) curves plot each model's true positive rate
against its false positive rate as the decision threshold is varied. Models with the greatest
area under the ROC curve separate the two outcomes best.
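For readers unfamiliar with ROC curves, here is a small sketch of how one is conventionally built: sweep a decision threshold over the model's scores, record the false positive rate and true positive rate at each step, and integrate with the trapezoidal rule to get the area under the curve (AUC). The function names, labels, and scores are hypothetical; this is not the code used to produce Figure 2.

```python
def roc_points(y_true, scores):
    """Return (false positive rate, true positive rate) pairs, one per threshold."""
    positives = sum(y_true)
    negatives = len(y_true) - positives
    points = [(0.0, 0.0)]
    # Lower the threshold through each distinct score, highest first.
    for threshold in sorted(set(scores), reverse=True):
        tp = sum(1 for y, s in zip(y_true, scores) if s >= threshold and y == 1)
        fp = sum(1 for y, s in zip(y_true, scores) if s >= threshold and y == 0)
        points.append((fp / negatives, tp / positives))
    return points


def auc(points):
    """Area under the curve by the trapezoidal rule."""
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        area += (x1 - x0) * (y0 + y1) / 2
    return area


# Made-up labels and predicted win probabilities, for illustration only.
y_true = [1, 1, 0, 1, 0, 0]
scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.2]
print(round(auc(roc_points(y_true, scores)), 3))  # 0.889
```

An AUC of 0.5 corresponds to random guessing and 1.0 to a perfect ranking of positives above negatives.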
It is possible to run alternative neural networks that may improve upon the 72.4% accuracy,
including deep neural networks, bidirectional NNs, recurrent NNs, and long short-term memory
(LSTM) neural networks. Since you mentioned that the timing of results is important, I have not
done that here. If you would like to work together, I would gladly implement these models. I
would also caution against anyone who has not reported their statistics in a convincing rather
than technical way. Under the current submission format it would be quite easy to fabricate a
result.
Figure 3 Screen Capture of a Small Section of Code Illustrating my Talents
Match Details Summary

Since this data set was not a classification problem (i.e., it did not report wins / losses),
but rather an observational data set, I took the liberty of reporting statistics I found
personally interesting. Enjoy!
Caveat: There were 113 different heroes listed within the data; however, there are only 111
heroes currently playable. Two heroes had no listed entries (#24 and #108). I assume these are
for Pit Lord and Monkey King? I tried to assign hero names alphabetically, but based on the two
missing heroes and on the results below, I determined that the hero list is not alphabetical.
Please keep in mind that the statistics below are therefore listed by hero number, without hero
names beside them.
Figure 1 Average GPM (gold per minute) vs Time per Hero. Alchemist shows off his passive by
taking the lead.
Figure 2 Average XPM (experience per minute) vs Time per Hero.
Some Fun Statistics

                                                                 Hero Number (Statistic)
Statistic                                                        First              Runner Up          Last
Most Picked                                                      44 (6200)          104 (5569)         92 (283)
Highest Quit Rate Before Lvl 5                                   91 (4.47%)         105 (3.32%)        54 (0.0607%)
Highest Percent of Games > 45min                                 91 (14.6%)         38 (10.8%)         79 (4.06%)
Net Worth
Highest Average Net Worth 25min                                  97 (14013 gold)    35 (10275 gold)    77 (5489 gold)
Highest Average Net Worth > 45min                                97 (29663 gold)    44 (23107 gold)    77 (11934 gold)
Largest Average Increase in Rate of Gold Gain After 25 minutes   44 (333 gold/min)  48 (269 gold/min)  89 (45 gold/min)
XP
Highest Average XP 25min                                         65 (17469 xp)      97 (12008 xp)      37 (5173 xp)
Highest Average XP > 45 min                                      65 (26670 xp)      44 (26565 xp)      37 (12628 xp)
Largest Average Increase in Rate of XP Gain After 25 minutes     44 (408 xp/min)    48 (400 xp/min)    4 (128 xp/min)
Figure 3 K-Means Unsupervised Learning Algorithm. Unsupervised learning algorithms cluster
data based on similarities without any additional information from the user. These are useful
when a dataset requires classifications but a classification is not provided. The above figure
illustrates hero gold per minute and experience per minute at the 45-minute mark. The
algorithm above has classified each individual hero into one of three groups. These could be
considered as Supports in Green, Offlaners or greedy Supports in Blue, and Carries in Red.
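To illustrate the clustering idea behind Figure 3, here is a minimal sketch of k-means (Lloyd's algorithm) on two-dimensional (GPM, XPM) points with three clusters. The data points are invented, and the deterministic initialisation (evenly spaced points from the sorted data) is an assumption made so the example is reproducible; common implementations use random or k-means++ seeding instead. It is not the code that produced the figure.

```python
def kmeans(points, k, iterations=50):
    """Lloyd's algorithm: return (centroids, labels) for 2-D points."""
    # Assumed initialisation: k evenly spaced points from the sorted data.
    ordered = sorted(points)
    centroids = [ordered[int((i + 0.5) * len(ordered) / k)] for i in range(k)]
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest centroid.
        labels = [
            min(range(k),
                key=lambda c: (x - centroids[c][0]) ** 2 + (y - centroids[c][1]) ** 2)
            for x, y in points
        ]
        # Update step: move each centroid to the mean of its members.
        new_centroids = []
        for c in range(k):
            members = [p for p, label in zip(points, labels) if label == c]
            if members:
                new_centroids.append((
                    sum(p[0] for p in members) / len(members),
                    sum(p[1] for p in members) / len(members),
                ))
            else:
                new_centroids.append(centroids[c])  # keep an empty cluster in place
        if new_centroids == centroids:
            break  # converged: no centroid moved
        centroids = new_centroids
    return centroids, labels


# Invented (GPM, XPM) values at the 45-minute mark, for illustration only.
heroes = [(250, 300), (260, 310), (270, 290),
          (400, 450), (410, 440), (420, 460),
          (600, 620), (610, 640), (620, 630)]
centroids, labels = kmeans(heroes, k=3)
print(labels)  # [0, 0, 0, 1, 1, 1, 2, 2, 2]
```

The algorithm only returns numbered groups; reading them as Supports, Offlaners, or Carries is an interpretation applied afterwards.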
From the above, I hope I have convinced you of my data science skills and knowledge of Dota 2.
I have been looking for Data Science work specifically involving Dota 2 and would be a valuable
asset to your team. I understand the development process of your application and the
requirements to make it succeed. Combining machine learning with Dota 2 is a passion of mine
and I would be an energetic and enthusiastic member of your team. I look forward to hearing
back from you and wish you the best of luck.
Sincerely,
Stefan Schneider
Stefan871@gmail.com
