Table of Contents
Introduction
What is Data Science?
Different Roles within Data Science
How Different Companies Think About Data Science
1- Early stage startups (200 employees or fewer) looking to build a data product
2- Early stage startups (200 employees or fewer) looking to take advantage of their data
3- Mid-size and large Fortune 500 companies who are looking to take advantage of their data
4- Large technology companies with well-established data teams
Industries that employ Data Scientists
Getting a Data Science Interview
Nine Paths to a Data Science Interview
Traditional Paths to Job Interviews
1- Data Science Job Boards and Standard Job Applications
2- Work with a Recruiter
3- Go to Job Fairs
Proactive Paths to Job Interviews
4- Attend or Organize a Data Science Event
5- Freelance and Build a Portfolio
6- Get Involved in Open Data and Open Source
7- Participate in Data Science Competitions
8- Ask for Coffees, do Informational Interviews
9- Attend Data Hackathons
Working with Recruiters
How to Apply
CV vs LinkedIn
Cover Letter vs Email
www.springboard.com
How to get References and Your Network to Work for You
Preparing for the Interview
What to Expect
1- The Phone Screen
2- Take-home Assignment
3- Phone Call with a Hiring Manager
4- On-site Interview with a Hiring Manager
5- Technical Challenge
6- Interview with an Executive
What a data scientist is being evaluated on
The Categories of Data Science Questions
Behavioral Questions
Mathematics Questions
Statistics Questions
Scenario Questions
Tackling the Interview
Conclusion
What Hiring Managers are Looking For
Interview with Will Kurt (Quick Sprout)
Interview with Matt Fornito (OpsVision Solutions)
Interview with Andrew Maguire (PMC/Google/Accenture)
Interview with Hristo Gyoshev (MasterClass)
Conclusion
How Successful Interviewees Made It
Sara Weinstein
Niraj Sheth
Sdrjan Santic
Conclusion
7 Things to Do After The Interview
1- Send a follow-up thank you note
2- Send them thoughts on something they brought up in the interview
3- Send relevant work/homework to the employer
4- Keep in touch, the right way
5- Leverage connections
6- Accept any rejection with professionalism
7- Keep up hope
The Offer Process
Handling Offers
Company Culture
Team
Location
Negotiating Your Salary
Facts and Figures
Taking the Offer to the Best First Day
Templates
Reaching out to get a referral
Following up after an interview
Resources
Introduction
When we first wrote the Springboard Careers Guide to Data Science, we didn't expect the engagement it'd garner. Thousands of people signed up in a few days, confirming our belief that there was a scarcity of great advice on what is an exciting but nebulous field.
In speaking with more and more people, we found only a few great resources that explained how to break into a data science career. There were individual stories and collections of interview questions, but we couldn't find a full guide to cover everything about the data science interview process, from how to get an interview in the first place to how to deal with any offered positions.
I wanted a guide collecting perspectives from people on both sides of the table. I wanted to talk to recruiters who refer candidates, hiring managers who table offers, and candidates who had successfully made it through the data science interview, to demystify the process with insights from people who had previously gone through it.
I co-authored this book with Sri Kanajan, a senior data scientist in New York City at a major investment bank.
At Springboard, we've taught thousands of data science aspirants through our mentored workshops. We built large, engaged communities of mentors and alumni, which afford us a unique vantage point to deliver real-life perspectives on the data science interview process.
It was difficult collecting everything here, like it was difficult for many of the candidates who made it through the process. Some of the leaders in data science, including the Chief Data Scientist of the United States, had to go through six months of waiting before they got an offer! Most companies' data science interview processes are designed to weed out all but the most determined and skilled candidates. It can seem, at times, like a hurdle preventing any sane job seeker from entering. Yet, while the investment can seem immense, the return can be even greater.
Data science has been called the sexiest job of the 21st century. Data scientists don't just make good money; they drive significant social impact, from mapping world poverty to stopping pandemics before they even happen. Data scientists unearthed the identity of Banksy, and they mastered the art of predicting basketball scores in March Madness. Working in data science isn't about just having a good salary and good work-life balance; it's about solving big problems that matter.
We wrote this guide because we wanted you to go from being curious about data science to actively trying to get a job in the field. We wanted to unearth what it takes for you to make it through the data science interview process. We wrote this guide because we want you to rock your data science interview.
What is Data Science?
Before you look for data science interviews, you should know what the term means and what you're getting yourself into.
DJ Patil, the current Chief Data Scientist of the United States, first coined the term data science. A decade after it was first used, the term remains contested. There is a lot of debate among practitioners and academics about what data science means, and whether or not it's different from the data analytics companies have always used. When people talk about big data and using machine learning to solve data problems, they are venturing into a whole new field whose terms are being defined right now.
Different companies have differing definitions of what data science means. Individual hiring managers may differ about exactly what they're looking for; they will hire and interview accordingly.
This confusion makes the data science interview process difficult for a lot of candidates. Data science can have vastly different definitions depending on what role you're applying for and the company you're interviewing with.
Different Roles within Data Science
Let's go through a sample data science project to elaborate on the different roles you'll see in data science. A data science team might be assigned to use deep learning to classify images, like Yelp's team did.
Millions of photos are uploaded on Yelp every single day, but it can be hard to get the images you want for each restaurant. Sometimes, the photos uploaded are all of the same category: maybe they're all photos of the food or the outside of the restaurant. A holistic evaluation of a restaurant requires images of different kinds.
You can use machine learning to automatically categorize which images fall into what category. Computers can, with the help of a training set, tell you whether or not an image is the outside of the restaurant or of food.
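To make the idea of a training set concrete, here is a minimal sketch in Python. It is not Yelp's approach (their team used deep learning on the images themselves). As a simplifying assumption, it classifies short photo captions with a tiny naive Bayes model. The captions, the category names, and the function names are all hypothetical.

```python
from collections import Counter, defaultdict
import math

# Hypothetical, hand-labeled training set: (caption, category).
TRAINING_SET = [
    ("crispy pork belly tacos with salsa", "food"),
    ("chocolate cake and espresso", "food"),
    ("spicy ramen bowl with egg", "food"),
    ("front entrance and neon sign at night", "outside"),
    ("patio seating and street view", "outside"),
    ("storefront sign and parking lot", "outside"),
]


def train(examples):
    """Count how often each word appears in the captions of each category."""
    word_counts = defaultdict(Counter)
    label_counts = Counter()
    for caption, label in examples:
        label_counts[label] += 1
        word_counts[label].update(caption.lower().split())
    return word_counts, label_counts


def classify(caption, word_counts, label_counts):
    """Pick the most likely category using naive Bayes with add-one smoothing."""
    vocabulary = {word for counts in word_counts.values() for word in counts}
    total_examples = sum(label_counts.values())
    best_label, best_score = None, float("-inf")
    for label in sorted(label_counts):
        # Start from the log prior, then add one log likelihood per word.
        score = math.log(label_counts[label] / total_examples)
        total_words = sum(word_counts[label].values())
        for word in caption.lower().split():
            smoothed = (word_counts[label][word] + 1) / (total_words + len(vocabulary))
            score += math.log(smoothed)
        if score > best_score:
            best_label, best_score = label, score
    return best_label


word_counts, label_counts = train(TRAINING_SET)
print(classify("neon sign above the entrance", word_counts, label_counts))  # outside
print(classify("ramen with pork and egg", word_counts, label_counts))       # food
```

The same train-then-classify shape applies when the inputs are pixels instead of words. What changes is the model and the amount of labeled data needed.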
Data scientists create the model to help machines make those distinctions. They would be able to think through the types of data they need, from manually tagged photos to keywords in image captions. This tends to be a more senior-level role, as they often manage data products from end to end and deal with all facets of data science problems, from algorithm selection to engineering design.
Data engineers create systems to source all of the image data and store it, as well as implement at scale some of the algorithms chosen by data scientists. This tends to be a role for people who have strong technical chops but might not know as much about the theory of the algorithms they're implementing.
Data analysts query and present the business implications of the change. Did it please users? How much more traffic did Yelp generate due to the recent change? These are questions data analysts would ask. Then, they communicate the insights they found. This role tends to be filled by more entry-level people and people in business-facing roles learning to apply their insights on a technical basis.
There are more roles we'll cover in detail later. For now, you should know that the data science interview processes for these three general roles can be vastly different from one another, and in fact, they often are!
How Different Companies Think About Data Science
Not only are there different roles in data science, there are also different companies with vastly different interview processes!
In general, these companies can be split into four rough categories.
1- Early stage startups (200 employees or fewer) looking to build a data product
Welcome to the beating heartland of Silicon Valley. The early stage startup is a romantic notion, but one seeing a staggering amount of success in a short amount of time. If you join an early stage startup, be prepared to wear a lot of hats and potentially take on all three data science roles at the same time. You will never have the resources you need in full, so be prepared to be scrappy and tough.
The bar will be especially high if the startup in question deals with data as its product. A platform that optimizes other people's data or applies machine learning to different datasets will have much higher standards for how it thinks about data than companies trying to learn from their own data. The co-founders will likely be pioneers in the field of data science or have led large-scale data science teams. They will be looking for A-players who have significant experience in the field or tons of potential and drive. If you join an organization like this, be prepared for the learning experience of a lifetime, and be prepared to be held to the highest standard possible when it comes to data science.
Examples of this company type: Looker, Mode Analytics, RJMetrics
Sample job postings: Data Analyst (Looker), Senior Analyst (Mode Analytics)
Size of the company: 143 associated on LinkedIn (11-50 company size)
How to read this job description: Focus on communication and scripting languages for querying and visualizing data indicates this is a business-facing role where insights must be communicated to relevant teams.
2- Early stage startups (200 employees or fewer) looking to take advantage of their data
The bar will be lower if a startup is merely looking to take advantage of its data rather than selling a data product to other companies, but since the smart use of data is essential to the competitive advantage of a startup, you should still expect a relatively high bar.
Startups in the tech industry contain a lot of technical talent, but they need somebody to bridge the business and tech teams, especially if there are communication issues between the different teams on how data is used. Be prepared to work hard for the company to embrace being data-driven at all levels, and be prepared to be the one who brings in new tools and processes for collecting and using data at all levels of the organization.
Working for a company that deals with its own data but doesn't think about data at scale may be a unique challenge, as you'll be called upon to enforce and spread a data-driven culture throughout the organization. Be prepared to exercise your leadership and communication skills.
Lastly, B2B startups and B2C startups differ in the data they get. B2B startups are business-to-business: they sell software directly to large companies. Think Salesforce. B2C startups cater to many individual customers. Think Amazon. When you're dealing with B2B startups, you're likely going to be faced with data challenges that are small in volume but high in detail and features. Startups that sell directly to businesses don't have many customers, but they focus maniacally on the ones they do have, since each individual customer will bring in lots of revenue. B2C startups will have more data problems dealing with volume and scale, as they will have many more customers, but the focus on individual customers will be diluted to focus on groups of them. A B2B startup may deal with 1,000 customers, all of whom pay $1,000 a month. A B2C startup may deal with 100,000 users, but each user may only generate $1 in revenue a month!
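The arithmetic behind that comparison is worth spelling out. The short Python sketch below uses only the illustrative figures from the paragraph above; the function and variable names are hypothetical.

```python
def monthly_revenue(customers, revenue_per_customer):
    """Total monthly revenue in dollars for a given customer base."""
    return customers * revenue_per_customer


# Figures taken from the illustrative example in the text.
b2b = monthly_revenue(customers=1_000, revenue_per_customer=1_000)
b2c = monthly_revenue(customers=100_000, revenue_per_customer=1)

print(f"B2B: ${b2b:,} per month from 1,000 customers")
print(f"B2C: ${b2c:,} per month from 100,000 users")
print(f"B2B earns {b2b // b2c}x the revenue with 1/100th of the customers")
```

In this example the B2B startup has a hundredth of the rows in its customer table but ten times the revenue. That is why its data work tends toward depth per account rather than scale.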
Be familiar with the company you're applying to and the unique data challenges it faces. Research thoroughly, and make sure you're only applying to companies that fit your passions and skills.
Examples of this company type: Springboard, Branch, Rocksbox, Masterclass, Sprig
Sample job postings: Lead Data Scientist at Branch, Data Scientist (Research) at Rocksbox, Data Scientist at Masterclass
Size of the company: 37 associated on LinkedIn (11-50 company size)
How to read this job description: Looking for a generalist who can dive deeper and still communicate different insights indicates this is a data scientist role that will be very broad in terms of skill sets demanded. This role is going to be proactive and entrepreneurial.
3- Mid-size and large Fortune 500 companies who are looking to take advantage of their data
The largest companies in the world know that taking advantage of their data is a top priority. Some will have established data science teams that are well-funded, robust, and fed with lots of data. Some will have startup-like teams within the organization to help them translate their data into business insights. There are a lot of companies hiring data science teams upon realizing how important data is to remaining competitive. Use this to your advantage: it can be easier passing the data science interview for a large, prestigious brand.
While a lot of these companies will have established corporate cultures and bureaucracies that make it harder to innovate, they will also have data on millions of people. Imagine processing logistics data for Walmart: you will have millions of data points, and your insights will make a difference in the lives of millions of people.
While these companies are not traditionally seen as the ones building cutting-edge data science solutions, there is still a lot of good work available for those who want to work on challenging datasets with talented teammates.
Examples of this company type: Walmart, JPMorgan, Morgan Stanley, Coca-Cola, Capital One
Sample job postings: Data Scientist, Modeler at Morgan Stanley; Data Engineer at Capital One
Size of the company: ~30,000 associated on LinkedIn (10,000+ company size)
How to read this job description: Focus on Big Data tools indicates that this is going to be a fairly specialized role that looks into handling the immense amounts of data Capital One is holding.
4- Large technology companies with well-established data teams
Large technology companies are a breed in and of themselves. They're the continuation of the startup obsession with data, except now they have scaled to the point of dealing with millions of data points or more. Think of the Ubers, the Airbnbs, the Facebooks, and the Googles of the world. With large technical teams led by some of the most brilliant minds in the industry, data science roles here are heavily specialized, and you'll work on cutting-edge problems with data that requires ferociously innovative thinking.
Come here if you crave a challenge and if you want to learn a lot with a lot of data points. The upside isn't as good as at the earlier-stage startups, but you'll get good perks, a good salary, great teammates, and a great CV job description in case you ever want to move on.
Examples of this company type: Facebook, Google, Airbnb
Sample job postings: Data Scientist, Oculus; Data Scientist, Airbnb Machine Learning
Size of the company: ~16,715 associated on LinkedIn (10,000+ company size)
How to read this job description: Focus on a multifaceted, innovative skill set shows this is going to be an open-ended data science role that will be expected to think of new projects and lead them from end to end.
Industries that employ Data Scientists
Data science also varies depending on the industry. Industries have certain areas of knowledge specific to the industry itself, and they involve different types of data. A school will be focused on different metrics than a bank.
If you happen to have a passion for a certain industry, make sure it comes off with keywords on your CV and LinkedIn. Demonstrating why you love a certain industry and deep knowledge of the industry itself positively differentiates you as a candidate.
The three largest hiring industries for data science in O'Reilly's survey of the field are software companies, consulting companies, and banking/finance companies. Those three industries also tend to pay the most for data science professionals.
Different industries also vary in the types of roles they hire for. Software, medicine and telecommunications companies tend to be the largest hirers of data scientists. Software, aerospace, and information technology companies hire more data engineers. Lastly, data analysts tend to be hired by healthcare companies and consulting/banking organizations.
Be aware of the industry your potential employer is in, and infer what their data science needs are.
You have to be aware of the different roles, companies, and industries within data science to understand exactly how your data science interview process will go.
To do data science, you must be able to find and process large datasets. You'll often need to understand and use programming, math, and technical communication skills. You'll also need to tailor your skill set and how you present yourself to the different roles and hiring companies within the world of data science.
Most importantly, you need to have a sense of determination to understand the world through data and not be deterred easily by obstacles.
The data science interview process is designed to test for those skills and resilience. Be prepared to be challenged on every dimension.
Getting a Data Science Interview
The first step in the data science interview process isn't dealing with the interview; it's finding it in the first place, a process that in and of itself can take months of effort!
We surveyed about twenty people about the hardest parts of the data science interview process as part of the research for this book. The answer we got back had little to do with the technical questions we thought were the hardest. While technical questions ranked second, with 68% of respondents selecting them as one of the hardest parts of the interview process, a whopping 80% of respondents selected getting a data science interview!
Literature was scarce out there about how to get an interview, especially for people transitioning from different careers. We dived deeper, and looked through real-life case studies in addition to different resources we've curated for you.
Nine Paths to a Data Science Interview
We found traditional paths to job interviews that could work to a certain degree in data science. We also found new, proactive approaches, especially with emerging startups, where nontraditional tactics could get candidates to the forefront of the hiring race.
Traditional Paths to Job Interviews
If it ain't broke, don't fix it. While a lot of the new, proactive tactics we discuss can have a lot more efficacy, it's always good to know the basics.
1- Data Science Job Boards and Standard Job Applications
You can submit your resumes and cover letters to company careers sites. Then, you can wait and hope. We're not saying to avoid this route, but it shouldn't be the one you rely on.
Use Indeed and Careerbuilder to search for different data science postings. Then, there are specific job boards for the data science space, such as the Kaggle Jobs Board.
2- Work with a Recruiter
You can contact recruiters who can help put you in touch with the right employers. There are recruiters who specialize in data science and technology spaces. They are gatekeepers to jobs never listed in public outlets. A quick search on LinkedIn for data science recruiters near you will help you find the most relevant matches.
3- Go to Job Fairs
Job fairs in data science are few and far between, though Harvard and Stanford do host computer science job fairs that have plenty of data science jobs for their students. You're better off attending events or meetups with the local data science community rather than looking around for your traditional job fair.
Proactive Paths to Job Interviews
We've covered the traditional paths to job interviews, the options that have been the default of job seeking. These days, getting an offer sometimes requires hustle and grit outside of tried-and-true tactics. Startups provide a large number of new data science jobs. Their culture and hiring tactics trickled up to large companies that a decade ago were just startups as well. The result is a new hiring environment where oftentimes, one has to be proactive to reach decision makers who have known nothing but grit when they built their own companies.
4- Attend or Organize a Data Science Event
You need to find people interested in the data science community to find hidden opportunities and become proactive at integrating into the community. There are several events where you can do this, from larger conferences to smaller community meetups.
Conferences
Strata Conference
The Strata Conference is a big data science conference that takes place worldwide in different cities. Speakers come from academia and private industry; the themes orient around cutting-edge data science trends in action. The conference allows you to learn the technology behind data science, and there are plenty of networking events.
KDD (Knowledge Discovery and Data Mining)
KDD, or Knowledge Discovery and Data Mining, is another large data science conference. It's also an organization that seeks to lead discussion and teaching of the science behind data science. Membership and attendance at these conferences offer a marvelous way to contribute to growing trends in data science.
NIPS (Neural Information Processing Systems)
NIPS, or Neural Information Processing Systems, is a largely academic data science conference focused on evaluating cutting-edge science papers in the field. Attending will give you a sneak preview of what will shape data science in the future.
Meetups
We've listed the major conferences where the data science community assembles, but there are often smaller meetups that serve to connect the local data science community.
The San Francisco Bay Area tends to have the most data meetups, though every major city in America usually has one. You can look up data science meetups near you with Meetup.com. Some of the largest data science meetups, with more than 4,000 members, are SF Data Mining, Data Science DC, Data Science London, and the Bay Area R User Group.
You'll want to join the events, or create a meetup yourself if you cannot find a nearby event. Our director of data science education, Raj, got a job by becoming known as a data science connector. He hosted a local meetup in Atlanta and invited distinguished speakers in data science. Soon, he was known as a data science influencer, and as soon as there were open data science positions, he was tapped to apply.
5- Freelance and Build a Portfolio
Sundeep Pattem is a data innovation leader at the California Department of Justice. He's also mentored for several data science courses, and as a data scientist, he works on creating end-to-end solutions that extract value from data. He has personal websites with different data science projects.
His breakthrough into data science came when he found an unsolved problem in energy sustainability and worked to solve it. He was soon a published author at a prestigious academic conference, and shortly thereafter, he was hired to become a practicing data scientist.
If you're unsure of what data you want to analyze, we have a list of 19 free, open source public datasets you can explore.
If you freelance around data problems you love and build incredible solutions, keep a record of everything you do in an accessible portfolio that tells stories around your passions.
6- Get Involved in Open Data and Open Source
The most interesting projects in the world don't necessarily reside in secretive company databases anymore. They are often in open source repositories on GitHub. This includes the Natural Language Toolkit project, which helps deal with human language as a data source, and the various libraries that make up the Python data science and machine learning toolkit. The R community also hosts many of its packages on a consolidated public website.
Many leading CTOs will hire based on your contribution to open source projects, and may even find you through that route. It's easy to tell if somebody is able to work in a team and build marvelous things through the transparent glass of open source. Make sure you take advantage.
7- Participate in Data Science Competitions
If the broad confines of open source projects aren't your type of projects and your creativity thrives best in more confined situations, consider joining a data science competition.
Data science competition platforms like Kaggle, Datakind and Datadriven allow you to work with real corporate or social problems. By using your data science skills, you can show your ability to make a difference and create the strongest interview asset of all: a demonstrated bias to action.
One of our Springboard mentors, Sinan Ozdemir, competed his way to a data science job based on his work on problems on Kaggle. You can do the same.
8- Ask for Coffees, do Informational Interviews
At the end of the day, your network will get you the best chance at a new job. You should seek to know more people in the field you want to work in, if only to get an idea of the problems they have and which of them you can solve.
Legendary entrepreneur and strategist Steve Blank has a great framework for getting coffees with people too busy to see you, as most data scientists will be. You have to find a way to provide value of some kind and look to give them a fresh perspective on the problems they face.
This can culminate in an informational interview where you seek advice and information from data scientists in the field. If you do this right, you'll constantly grow your network of data science opportunities, and you'll understand more about how data science works in industry.
9- Data Hackathons
In line with the trend of seeing work in action, data hackathons offer you a unique opportunity to learn data skills with a motivated team. You will have to solve a data problem in a couple of days. An example of this kind of hackathon is the DataWeek hackathon in San Francisco. By teaming up with others to deliver real solutions, you'll differentiate yourself from other job candidates. Many employers lie in wait at hackathons as well, with some companies going as far as to sponsor hackathon prizes in the hopes of finding their next data scientist!
Working with Recruiters
For this section we worked with Andy Musick, an Atlanta-based recruiter: contact him at andy.musick@hotmail.com if you are looking for an Atlanta-area job. We also worked with Anna Meyer, a data science recruiter at Robert Walters, a recruitment agency specialized in data science. Feel free to contact her at anna.meyer@robertwalters.com.
How to Apply
CV vs LinkedIn
A lot of people out there may have a traditional view on what makes for a good job application. They're already missing a larger point: the traditional view is out.
There is a fundamental difference between academia and working in an industry, and it starts in how you present yourself.
We talked with recruiters, students, and hiring managers, and they all agreed that LinkedIn was the gold standard of recruitment. Having a well-optimized LinkedIn profile allows employers to size you up and recruiters to find you the right opportunity.
If you're not making sure you shine on LinkedIn, you're already losing out to candidates who are.
While a resume may be required to go through the process, it isn't the main draw that will get you in the door anymore. Recruiters will only look through your resume once it's presented in front of them, while a great LinkedIn could lead to inbound work opportunities on a constant basis.
Unlike in academia, where an impressive array of papers and academic work will win over everything else, applying for industry jobs involves being as succinct as possible and listing the impact you drive with your accomplishments. Resumes are not so much read as scanned. Keep that in mind if you're going to build one. A recruiter spends an average of thirty seconds on a resume.
Key Advice on Resumes
1) Keep them short, preferably under a page. Remember that people are scanning your resume for signs of interest before they ever consider doing a deep dive.
2) Make sure your skills stand out and are highlighted (consider bolding relevant skills). Recruiters and hiring managers will look to see if you're a technical fit before looking further.
3) Have clear job headings, and at most, three one-line points in each one of your job descriptions. You want to clearly mark how your experience ties in with the job requirements you've applied for.
4) Demonstrate your impact with numbers! Don't say that you did X. Tell the hiring manager what effects X had. You want to say you discovered something that helped thousands of people save hours of time, not that you simply discovered something. Write "created an automated sales email software that generated $400,000", not "created an automated sales email software".
Key Advice on LinkedIn
1) Don't be shy. Fill out as many details as you can; it makes a difference. Most hiring managers will want to see your LinkedIn before they ever interview you.
2) Make sure your job titles are clear and consistent with search terms that recruiters would use. Saying that you worked as a data scientist or as a data analyst is preferred to coming up with your own job title.
3) One way you can differentiate yourself from others is adding some personal flavor to your profile. Add some of your interests and passions, and make sure they are evident in your LinkedIn. Hiring managers like evaluating candidates for technical skills and cultural fit. Being able to show that you have your own unique take on the world will only add value to your job search and help you stand out.
4) While you might not want to tailor your profile for certain jobs or industries, make sure you know what you're looking for, and make sure that comes out on your LinkedIn. You want to be very deliberate at constructing your profile so that it gets you the position you want. Avoid listing data entry if you don't want to get entry-level offers. Mention specific industries if your heart is set on working on a particular type of problem.
Make sure you know what roles you're applying for, and apply industry keywords and skill keywords that match. Interested in a data science job in finance? Don't hesitate to put industry terminology all over your CV and LinkedIn. If you have a skill that you researched is in demand for the role you're looking for, add it liberally! You can research what technologies a company uses: companies like Yelp and Airbnb will often blog about their data projects. If you see that the role in question demands Python and R skills, make sure that your CV and LinkedIn mark those skills. Endorsements also play a positive role in this regard when it comes to LinkedIn, so don't be shy about asking people who have worked with you to endorse your skills and give recommendations.
More recruiters and hiring managers look through LinkedIn than resumes today. A recruiter will look at a CV for an average of 30 seconds before discarding it. Make sure the impact you've driven is fleshed out with strong action verbs, you've formatted your resume and LinkedIn to stand out, and you've filled them with the right keywords.
Keep in mind that this is a first step and that applying with just a CV or LinkedIn will get you considered at most places, but not with any particular enthusiasm. You'll have joined the queue of thousands of others applying the same way, and you'll probably need to do more to get your dream job. Regardless, make sure you optimize every step of your application, including the CV or LinkedIn that employers will inevitably look over.
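If you want to check your own keyword coverage before applying, a few lines of Python are enough. The sketch below is only an illustration: the resume text, the keyword list, and the function name are hypothetical, and it matches whole words so that a short keyword such as "R" is not found inside "retail".

```python
import re


def missing_keywords(resume_text, required_keywords):
    """Return the required keywords that never appear as whole words in the resume."""
    resume_lower = resume_text.lower()
    missing = []
    for keyword in required_keywords:
        # Lookarounds keep "r" from matching inside words such as "retail".
        pattern = r"(?<!\w)" + re.escape(keyword.lower()) + r"(?!\w)"
        if not re.search(pattern, resume_lower):
            missing.append(keyword)
    return missing


# Hypothetical resume snippet and keywords pulled from a job posting.
resume = "Data analyst with 3 years of Python and SQL experience in retail banking."
posting_keywords = ["Python", "R", "SQL", "machine learning"]

print(missing_keywords(resume, posting_keywords))  # ['R', 'machine learning']
```

Only add a missing keyword if you genuinely have the skill. The goal is to surface experience you forgot to mention, not to pad the page.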
Cover Letter vs Email
A cover letter was always the standard for academic advancement. Nowadays, the recruiters we talked to confirmed that companies seldom read them. If you want to differentiate who you are, you'll have to do it on your CV or your LinkedIn.
If you're going to be proactive, send a brief summary of what you've done in an email to a hiring manager. This serves as more of an explainer they can share with other people in the company. You'll want to keep it brief, no more than a few paragraphs at most, and you'll want to keep this email focused on the top three points that define the impact you've driven.
How to get References and Your Network to Work for You
Most people don't realize how critical it is to build and maintain your network to get your foot in the door with the data science interview process. The strongest signal hiring companies look for is strong referrals, especially from internal sources. If you have somebody advocating for you inside the organization you're applying to, that can ensure that your CV will be looked over, and it can even get you to skip steps in the interview process!
We informally surveyed some of our alums going through the hiring process. It turns out that a referral from an insider within the company led to an 85% chance of getting an interview with that particular application, while those who reached out cold and only applied with their CV or LinkedIn or through the standard format only had around a 10% chance of getting an interview. Pursuing the former can improve your job hunting process by an order of magnitude. Our alums also said that the referral doesn't even have to come from a friend; the fact that an application has been referred by an existing employee often guarantees at least a phone interview.
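To see what those two rates mean in practice, here is a small back-of-the-envelope calculation in Python. It uses only the 85% and 10% figures from the informal survey above. It assumes, for illustration only, that each application is an independent try with the same odds. The function name is hypothetical.

```python
# Interview rates from the informal alumni survey, in percent.
REFERRAL_RATE_PCT = 85
COLD_RATE_PCT = 10


def expected_applications_per_interview(rate_pct):
    """Average number of applications per interview, assuming independent tries."""
    return 100 / rate_pct


ratio = REFERRAL_RATE_PCT / COLD_RATE_PCT
print(f"A referred application is {ratio:.1f}x as likely to lead to an interview")
print(f"Referred: about {expected_applications_per_interview(REFERRAL_RATE_PCT):.1f} applications per interview")
print(f"Cold: about {expected_applications_per_interview(COLD_RATE_PCT):.0f} applications per interview")
```

Under that assumption, a cold applicant sends about ten applications for every interview. A referred applicant needs little more than one.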
Take a long-term view on this by adding value to different people in your network, whether that's being generous with advice when it's asked of you or being generous with introductions to other people in your network. Hopefully, by the time you're looking for a job, you'll have built up a strong network of people also interested in data science who can make the right introductions and give you the right referrals.
If that isn't the case, and you're looking to get those referrals right now, you can use what is called the informational interview technique. This entails reaching out to people who are working in the field to get a sense of what's going on and what their problems are. People, even complete strangers, can be very generous with their time if you show that you're genuinely interested in what they're doing and you offer to help as well.
Look for people at meetups, or specifically target people on networks such as LinkedIn, Angellist and FounderDating. Present your intentions honestly, but indicate that you're very interested in the company and data science in general. Ask for a coffee where you can add perspective to a problem they're solving or learn about their company.
A sample script might go as follows (where you can add somebody on LinkedIn as a friend or message them directly on FounderDating or Angellist):
Hi [name],
I was super interested in the problems Airbnb is facing in data science. I've been aspiring to break into the field, and being a passionate follower of the Airbnb Nerds blog, I noticed that building trust with data is an important part of what drives Airbnb. Based on my background in psychology and statistics, I might be able to help come up with some creative ideas on how to foster trust.
I'd love to take you out to coffee and get a greater sense of what problems Airbnb has; perhaps I can help! Would you have some time in the coming weeks?
Cheers,
[your name]
Links to your LinkedIn, resume, portfolio and/or a recent project
If you reach out to enough people and seek introductions to people through your network, you'll be able to find people in any company to talk with. Check out your second connections on LinkedIn and how they are connected to you, which you can easily do through any LinkedIn company page. Here's an example of a company page for Airbnb.
Once you're set for an informational interview, make sure you've researched the company and the person you'll be talking with by looking at the company website and any other resources you find. You should have a pretty good sense of what problems the company encounters on a day-to-day basis.
These informational interviews are a great chance to know exactly what is happening at a company and what their priorities are, which is greatly beneficial knowledge in an actual job interview. If you come in well-prepared and position yourself as someone who can help the company, the person you're having coffee with could become a strong internal advocate and help you jump through the usual recruiting hoops to get your first round interview.
Hopefullyalltheworkyouputintogettingthedatascienceinterviewpaysoff,andyougetthe
emailthatsignifiesthestartoftheinterviewprocessforyou:acompanyrepresentativebeckoning
foraninitialphonecall.Hereswhatwillhappenandhowyoushouldprepare.
WhattoExpect
The data science interview is a complex beast, with behavioral questions mixed with a bunch of
technical questions. You've gotten pretty far if you're able to get an interview in the first place, but
you still have further to go.
Let's start from the beginning: a data science interview will be vastly different depending on the
position you're applying for and the hiring organization. Certain organizations will be very
rigorous and make you go through several technical challenges. Others will look more at culture fit
and, especially if you have strong references, get you straight through to the final round.
The most rigorous process possible looks something like this:
1- The Phone Screen
This will typically be done by somebody in HR and acts as a filter to save hiring managers time.
Sometimes, there will be basic technical questions to screen out candidates who are grossly
unqualified, but most of the time, this phone screen involves establishing the beginnings of culture
fit and making sure that the candidate has good enough communication skills to come off well in
the interview.
In this call, you'll want to get a sense of what problems the data team is facing and the
organizational structure of the team you're applying to. Come prepared with thoughtful questions
that demonstrate a deep understanding of the business and the space they operate in, and be
prepared to ask them at the end.
2- Take-home Assignment
After the phone screen, companies often send a prepared assignment for candidates, with some
time pressure being applied. This is a good way to screen out candidates who may be technically
weak, or who may not be committed enough to invest a lot of time in the recruitment process.
Some companies dispense with this altogether, but those that do embrace the take-home assignment
often use it as a testing bar to save their hiring managers time.
An example of a take-home assignment is doing a deep analysis on a specific dataset provided for
you. When the assignment is designed well, it is also an opportunity to learn more
about the types of problems you would work on if you were to get the job. Here, you'd be expected
to tell a story around the insights you find in the data. Another example would be having a dataset with
significant errors in it that you'd be expected to clean. A final example would involve working with
a specific problem relevant to the business, such as building a job recommendation system for
applicants based on data from job descriptions.
Only those who pass the bar with good assignments will talk to a hiring manager face to face.
You'll get weeded out quickly if you refuse to do it altogether.
Take the time to do the assignment, and try to see how it relates to the problems the company is
facing. Using the assignment as a way to see what kind of skills you'll be tested on and how
the company in question is thinking about your role ensures that you make the most of your time. This is
where you can shine in a hiring process and show how you are different from other candidates.
3- Phone Call with a Hiring Manager
You may receive another phone screen that will be focused on either mathematics and
statistics questions or coding questions. This will be done by a hiring manager or a technical
person. This will likely be the final evaluation before a company invites you to an onsite interview.
The phone call will typically be split into three components. Sometimes, this is done in one long
call; other times, it is done in three short phone calls of about thirty minutes each.
Mathematical/Statistical Phone Call
You'll be evaluated on core mathematical and statistical concepts here, which will depend
somewhat on what role and what company you're applying for. Web companies will tend to focus
on your knowledge of A/B split testing, your understanding of how p-values are calculated, and
what statistical significance means. Energy companies might test you more heavily on regression
and linear algebra. No matter what type of interviewer you're talking with, you'll want to sketch
out the entire thought process behind your problem solving.
If you're asked about A/B split tests, describe the A/B split test process in detail, fleshing out what
pitfalls to watch out for and leaning on any experience you might have in the field. Treat the
question like a mathematical proof and a test of your ability to reason statistically, but don't
hesitate to bring your eye for detail and tell a coherent story about why this matters to the
company at hand.
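If you're asked how a p-value is calculated for an A/B split test, it helps to have one concrete recipe in mind. Here's a minimal sketch, assuming a two-sided two-proportion z-test with a pooled standard error (one common choice, not the only one); the function name and the conversion counts are made up for illustration.

```python
import math


def two_proportion_z_test(conv_a, n_a, conv_b, n_b):
    """Two-sided z-test for the difference between two conversion rates."""
    p_a = conv_a / n_a
    p_b = conv_b / n_b
    # Pooled rate under the null hypothesis that A and B convert equally.
    p_pool = (conv_a + conv_b) / (n_a + n_b)
    std_err = math.sqrt(p_pool * (1 - p_pool) * (1 / n_a + 1 / n_b))
    z = (p_b - p_a) / std_err
    # Standard normal CDF evaluated at |z|, written with the error function.
    cdf = 0.5 * (1 + math.erf(abs(z) / math.sqrt(2)))
    p_value = 2 * (1 - cdf)
    return z, p_value


# Hypothetical experiment: 200/2000 conversions on A, 250/2000 on B.
z, p_value = two_proportion_z_test(conv_a=200, n_a=2000, conv_b=250, n_b=2000)
```

In an interview, you'd pair this with the pitfalls: fixing the sample size in advance, not peeking at results early, and checking that the split was truly random.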
Coding Phone Call
This part of the interview process is fairly typical and is also the closest to other technical
interviews. You'll be evaluated on your ability, over the phone, to solve coding challenges by
presenting either pseudocode or, in harder interviews, compile-ready code. If you're applying for a
data analyst position, this will swing more to asking you how you'd think about querying data with
SQL. Otherwise, you'll be asked questions in the programming and scripting languages you've
claimed experience in, from Java to Python.
Your interviewer may also use tools like HackerRank or Collabedit to evaluate you live online. In
this case, your hiring manager will watch you as you type out your solution: be ready for
approaches such as this, and train with those tools if you can!
There are plenty of great resources out there for coding interviews, from Cracking the Coding
Interview to Interview Cake. Use them to your advantage.
Practice makes perfect here. Make sure you have a comfortable space and natural environment for
you to code. Be prepared to jot down code on paper and explain it on a phone call, or be prepared
to type in the code on a laptop.
You will often be asked about data structures more than anything else. Know hash maps, trees,
stacks, and queues very well. Prepare for this phone call like how software engineers would
prepare for a coding interview, and you'll pass with flying colors.
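To make those data structures concrete, here's a small warm-up sketch of the kind you might practice with. The two problems (bracket matching and a level-order tree walk) are illustrative picks, not questions taken from any particular interview, and the tree representation as a dict of child lists is an assumption made to keep the example short.

```python
from collections import deque


def is_balanced(text):
    """Stack + hash map: check that brackets in text are properly nested."""
    pairs = {")": "(", "]": "[", "}": "{"}  # hash map: closer -> opener
    stack = []                               # a list used as a LIFO stack
    for ch in text:
        if ch in pairs.values():
            stack.append(ch)
        elif ch in pairs:
            if not stack or stack.pop() != pairs[ch]:
                return False
    return not stack


def level_order(tree, root):
    """Queue: breadth-first walk of a tree stored as {node: [children]}."""
    order, queue = [], deque([root])
    while queue:
        node = queue.popleft()  # FIFO: oldest node first
        order.append(node)
        queue.extend(tree.get(node, []))
    return order
```

Being able to say why each structure fits (constant-time lookups in the hash map, last-in-first-out for nesting, first-in-first-out for level order) matters as much as the code itself.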
Call with the Hiring Manager
Finally, you'll be patched through to the hiring manager, who is now evaluating you on how well
you communicate and if you'd fit well on the team. This may be on a separate call from the
technical phone screens, or it can be the last part of a mega-call that encompasses all three. In this
call, the hiring manager is trying to get a feel for your character, your motivation, your fit with
their team, and your raw intelligence. Most hiring managers have a mental model for who they are
looking for. The closer you fit to it, the more likely you will pass to onsite interviews.
This is where your work with the recruiter beforehand will shine. The more you know about the
problems the hiring manager is facing and the kind of person they're looking for, the better you'll
be prepared to present yourself as the perfect fit. Tailor your communications to that goal, and be
confident and clear, and you'll make it to the next round. Try to pass the airplane test as well:
imagine the hiring manager evaluating whether they'd like to spend hours of time with you. The
workplace will force you to work together closely and spend a lot of time together. Make sure you
show that you can get along with your manager!
4- Onsite Interview with a Hiring Manager
Finally, if you've made it through the earlier calls, you'll meet your hiring manager face to face.
They'll be evaluating you from both a technical and a non-technical perspective. They're looking to
ascertain if you're a fit, and they may test you on your technical chops by having you whiteboard
different scenarios.
5- Technical Challenge
If this doesn't happen to you during the onsite interview, prepare to be challenged on your
technical skills in one form or another, especially for roles that lean more towards data
engineering. You'll often find that this is similar to a software engineering interview where you will
be asked to whiteboard and write down how you'd implement certain algorithms or solve certain
problems.
Here is where strong knowledge of software engineering concepts such as time complexity/Big O
notation and a strong grasp of the mathematics and statistics behind data algorithms can truly
shine.
6- Interview with an Executive
If you pass the bar for your hiring manager, you'll often do a final interview with a senior
executive. In a startup, this will often be the cofounder or the CEO themselves.
If you've made it this far, congratulations! Don't take it for granted, but this is a sign that a
company is leaning towards an offer for you. Normally, only candidates who have passed the technical
bar will get here, so now you need to emphasize exactly how you can drive impact with your
knowledge of the business itself and the specific problems it faces. At this point, you're not
looking to prove yourself so much as to avoid glaring errors.
Role | Mathematics/Statistics (e.g. p-value analysis, A/B testing) | Database Querying (SQL) | Algorithms (e.g. supervised learning, entity resolution) | Software Engineering (e.g. Python, Java, object-oriented) | Big Data/Systems Engineering [1] (e.g. Spark, HBase, Hadoop) | Soft Skills/Domain Expertise (e.g. public speaking, presentation skills)
Product Data Scientist [2] | Medium | Medium | Medium | High | High | Medium
Data Engineering | Low | Medium | Low | High | High | Low
Data Scientist | High | Medium | High | Low | Low | High
Business Intelligence Data Scientist | Medium | High | Medium | Low | Low | High
Data Analyst | Low | High | Low | Low | Low | High
Different data science roles will have vastly different expectations on different skill sets. While a
data engineer might not be expected to have many business presentation skills, they are expected
to dominate all types of programming challenges. Conversely, a data analyst will lean more on
their SQL skills and not be exposed to heavy technical problems, but they will be expected to be
top-notch presenters.
This table implies the industry demand and difficulty of the positions from top to bottom, with
Product Data Scientists being the most in demand for their specialized, difficult-to-acquire skills.
Know what role you're applying for. Scouting out exactly what a company needs and what role
they are trying to fit you into will help you navigate and predict their data science
interview process.
[1] This is more in line with setting up large-scale data engineering platforms and integrating
various technologies together.
[2] These data scientists typically build the algorithm and productionize it through the data
engineering infrastructure, e.g. they would build the recommendation system algorithm and
productionize the recommendation system live on the platform.
Here's a high-level overview of the specific roles:
Product Data Scientist: End-to-end data scientist with data engineering skills. Product data
scientists lead teams to build a data product. They tweak algorithms and have a strong say in how
the data is served to end users. They will often have the engineering ability to deliver on those
ideas.
Data Scientist: The unicorn mix of technical skills, business skills, and mathematical knowledge.
A data scientist understands how to create and optimize data algorithms, and how to explain their
findings. They may need to know less programming than their data engineer peers, but they'll
nevertheless need to understand enough to deal with data at scale.
Business Intelligence Data Scientists: Business Intelligence Data Scientists are focused on
getting business insights out of data. They will understand enough about statistical methods and
different machine learning algorithms to differentiate themselves from data analysts. They build
dashboards and complete various analytical studies to help the various teams make better
decisions.
Data Engineering: A data engineer isn't often counted on to have advanced knowledge of
statistics and mathematics, but they will have to ace every technical challenge out there to prove
they can deal with implementing algorithms on massive amounts of data.
Data Analyst: An entry-level role that relies heavily on making one-off reports by looking
through data and interpreting the results. This role typically requires a strong knowledge of SQL
and Excel.
Behavioral Questions
The data science interview process involves a lot of behavioral questions, similar to any other
interview. The interviewer intends to test for your soft skills and see if you fit in culturally with the
company.
1. Tell me about a data science project you have done in the past.
Intent: The intent of the question is to understand the depth of knowledge and
contributions you have from your past experiences. It tests your ability to tell a story
around your work and whether you can tie it to impact on the company you worked with.
How to Answer the Question:
- Try to describe a project that demonstrates both product and engineering
  experience, i.e. the project provided the analytical insight and productionized
  the insight to make it actionable. For example, if you identified key topics in a
  text dataset through topic extraction techniques, you should explain how
  these topics furthered company growth in a data product.
- Go into detail about your specific contribution and the outcome from a
  business goal perspective. The interviewer wants to know what you
  specifically did while trying to understand the overall goal of the project.
- Rehearse your experiences many times. This is a very common question, so
  have 2-3 go-to projects you can go into extreme detail about eloquently.
2. What have you liked and disliked about your previous position?
Intent: The intent of the question is to identify whether the role you're interviewing for is
suitable for you, and to identify why you're moving on from a previous position.
How to Answer the Question:
- Understand the role well. Use the HR contact to get insider information about
  the role and its challenges. The HR person can be a treasure trove of
  information about the role, team, history, and key immediate business goals.
- Avoid talking about issues with specific people, and be professional when
  talking about what you disliked. Introspect carefully and talk to what makes
  you passionate. For example, talk about deriving insights from data and
  conveying them to management in an actionable way as something you enjoy.
  You could also talk about learning new technologies that make data science
  more actionable throughout the organization. You could dislike how the
  organization is not placing data science at the center of its strategy or that the
  company has had significant attrition at top management level and the
  direction of the team is unclear. Keep it positive, points-oriented, and away
  from personal situations.
  Bad: I hated that data scientists were always put below the engineers
  and that management didn't have a clue what the company direction
  was!
  Good: I realized I wanted to work in a company where data science is
  part of its core strategy and the company has a clear direction.
3. Tell me about a situation in the past where you had to convince others about your position
on a specific matter. What was the outcome?
Intent: The intent is to find out how good you are at defending your position and your
ability to engender change within a team.
How to Answer the Question: Try to find an example where you were successful at
making the change and where the change is quantifiable in its impact. If possible, use a data
science example if you have one. It's important that you demonstrate your
communication and leadership skills here.
Mathematics Questions
Questions about the mathematics angle will come for data scientist roles where you are expected
not only to implement algorithms, but also to tweak them for specific purposes.
1 How does the Linear Regression algorithm figure out what the best coefficient values are?
(This was a question asked in C3 Energy's Data Scientist interview)
Rationale: The intent of the question is to see how deeply you understand linear regression,
which is critical because in many data science roles, you won't just work with algorithms in a black
box; you'll implement them in some way. This category of question (and you could see it for any
type of algorithm) tests how much you know about what is actually happening beyond the surface.
How to Answer the Question: Trace out every step of your thinking and write down the
equations. Describe your thought process as you're writing out the solution.
The Answer: At the highest level, the coefficients are chosen to minimize the sum of squares
of the residuals. Next, write down these equations while paying careful attention to what a
residual is. To go further, consider the following:
1. Write the minimization goal (ideally in linear algebraic (matrix) notation) of minimizing the
sum of squares of the residuals given a linear regression model.
2. Solve the minimization equation by illustrating that the sum of squares of the residuals is a
convex function, which can be differentiated, and that the coefficients can be derived by setting
the derivative to 0 and solving that equation.
3. Describe that the complexity of solving the linear algebra based solution in #2 is
polynomial time (roughly cubic in the number of features, since a matrix must be inverted), and
that because the objective is convex, numerical algorithms such as gradient descent are a common
alternative that may be much more efficient on large problems.
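The two routes in the answer above can be sketched side by side. This is a minimal illustration, assuming a single feature plus an intercept so that the closed form needs no matrix library; the learning rate, step count, and data points are arbitrary choices for the example, not recommended defaults.

```python
def fit_closed_form(xs, ys):
    """Least-squares intercept and slope from setting the derivatives to zero."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


def fit_gradient_descent(xs, ys, learning_rate=0.01, steps=20000):
    """Minimize the mean squared residual by repeatedly stepping downhill."""
    n = len(xs)
    intercept, slope = 0.0, 0.0
    for _ in range(steps):
        residuals = [y - (intercept + slope * x) for x, y in zip(xs, ys)]
        # Partial derivatives of mean((y - b0 - b1 * x) ** 2).
        grad_intercept = -2 / n * sum(residuals)
        grad_slope = -2 / n * sum(r * x for r, x in zip(residuals, xs))
        intercept -= learning_rate * grad_intercept
        slope -= learning_rate * grad_slope
    return intercept, slope


# Made-up data that lies close to y = 1 + 2x.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [3.1, 4.9, 7.2, 8.8, 11.0]
```

Because the objective is convex, both functions land on the same coefficients; the difference an interviewer cares about is cost, since the closed form solves a linear system once while gradient descent trades that for many cheap passes over the data.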
Statistics Questions
A grasp of statistics is important for solving different data science problems. You'll be tested on
your ability to reason statistically and your knowledge of the theory of statistics. Be prepared to
recite your knowledge about statistical concepts like Type I error and Type II error flawlessly, and
be prepared to demonstrate your grasp of different probability distributions.
1 What is the difference between Type I error and Type II error? (Our alumnus Niraj
encountered this question).
Rationale: Companies will want to test your grasp of different basic statistical concepts to see
how good you are with the fundamentals of statistics and how you communicate ideas
you may not often apply, using the sometimes technocratic language embedded in statistics.
How to Approach your Answer: Be no-nonsense, and communicate clearly whatever you are
asked to define.
The Answer: Type I error is what is referred to as a false positive, or the incorrect rejection of
a true null hypothesis. Type II error is what is referred to as a false negative, or the failure to
reject a false null hypothesis. You may want to communicate your grasp of the concepts with
an example and how it might be relevant to the business at hand. Type I error or a false positive
would be telling a man he was pregnant, while Type II error would be telling a pregnant woman
she wasn't. If you were running a fraud detection business, you might have a very high tolerance
for false positives (a client will not fuss about an email on the potential of fraud), but a false
negative (not detecting fraud when it is happening) could be disastrous to you.
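One way to show you've internalized the fraud example is to count both error types and weigh them by what they cost the business. The sketch below assumes boolean labels where True means fraud; the label lists and the two cost figures are invented for illustration.

```python
def error_counts(actual, predicted):
    """Count Type I (false positive) and Type II (false negative) errors."""
    type_1 = sum(1 for a, p in zip(actual, predicted) if not a and p)
    type_2 = sum(1 for a, p in zip(actual, predicted) if a and not p)
    return type_1, type_2


def total_cost(actual, predicted, cost_false_positive, cost_false_negative):
    """Weigh each error type by how much it hurts the business."""
    type_1, type_2 = error_counts(actual, predicted)
    return type_1 * cost_false_positive + type_2 * cost_false_negative


# Hypothetical fraud labels: True means the transaction really was fraud.
actual = [True, False, False, True, False, False]
predicted = [True, True, False, False, True, False]
```

With a cheap false positive (an extra email) and an expensive false negative (missed fraud), the single missed case dominates the total, which is exactly the asymmetry the answer describes.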
2 This was a question for a data scientist position at a big insurance company. Suppose a
population is divided into two groups: aggressive drivers and nonaggressive drivers. 40% of
the population are aggressive drivers while 60% are nonaggressive drivers. The probability of
an aggressive driver getting into 3 accidents in one year is 15%. The probability of a
nonaggressive driver getting into 3 accidents in one year is 5%. John is known to have had 3
accidents in the past year. What is the probability that he is (a) an aggressive driver, and (b) a
nonaggressive driver?
Rationale: A lot of companies will test your Bayesian inference skills as a primer for how you
think statistically. Bayesian probability contrasts with frequentist interpretations of statistics, and
your ability to reason through any Bayesian problem will show you have a quick grasp of statistical
concepts and the mental math needed for it. If you need a refresher, one of Springboard's mentors,
Will Kurt, runs a blog called Count Bayesie, and he has a wonderful guide to Bayesian
statistics.
How to Approach your Answer:
The intent of the question is to see your level of understanding of Bayesian probability. Sketch out all
of your assumptions and the calculations you're doing for your interviewer in a logical and
organized fashion.
The Answer: Write out what you know.
Probability of aggressive drivers in the population = 40% or 0.4
Probability of nonaggressive drivers in the population = 60% or 0.6
Probability of aggressive drivers getting into three accidents a year = 15% or 0.15
Probability of nonaggressive drivers getting into three accidents a year = 5% or 0.05
You'll want to understand the concept of priors and posteriors for Bayesian equations. A prior is
what you believe before seeing the evidence, and here it is given to you as part of the problem. The
probability that somebody is an aggressive driver in the population is a prior assumption given to
you that you cannot change. The posterior is the probability you derive from using Bayes' Rule on
these assumptions, P(A|B).
Bayes' Rule: P(A|B) = P(B|A) * P(A) / P(B)
The first question is: what is the chance John is an aggressive driver if he's been in 3 accidents a
year?
Visually, you're really trying to draw a Venn diagram of probabilities: of all of the people who have
been in 3 accidents a year, how many are aggressive drivers? How many are not?
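Plugging in the numbers: P(aggressive | 3 accidents) = (0.15 * 0.4) / (0.15 * 0.4 + 0.05 * 0.6)
= 0.06 / 0.09.
There is a 67% probability (really 66.66% repeating) that somebody who gets into
3 accidents a year is an aggressive driver. This is now your posterior.
The probability that somebody who gets into 3 accidents a year is nonaggressive is
just the flip side of that: 1 - 0.6666 = 0.3333 repeating, or 33% probability.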
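If you want to double-check the arithmetic, or handle a follow-up with a third group of drivers, the same calculation can be written generically. This is a small sketch; the function name and the dictionary layout are choices made for the example.

```python
def posterior(priors, likelihoods):
    """Apply Bayes' rule: P(group | evidence) for every group."""
    # Joint probability of belonging to the group AND showing the evidence.
    joint = {group: priors[group] * likelihoods[group] for group in priors}
    # P(evidence): here, the overall chance of 3 accidents in a year.
    evidence = sum(joint.values())
    return {group: joint[group] / evidence for group in joint}


priors = {"aggressive": 0.4, "nonaggressive": 0.6}
likelihoods = {"aggressive": 0.15, "nonaggressive": 0.05}
result = posterior(priors, likelihoods)
```

The denominator is the piece candidates most often forget: it sums over every group that could have produced the evidence, not just the one you're asked about.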
3 What probability distribution type (or show the derivation of the pdf) would you use to
describe the following random variables?
a. Probability of k customers arriving at a restaurant within a duration of t minutes
b. The probability of the height of a person in a crowd being at least X inches
c. The probability of the sum of two 6-sided fair dice being Y
d. The probability of having k heads thrown out of N coin throws
Rationale: This question tests your knowledge of probability distributions and tests whether or
not you know what models to use given how your data is organized.
How to Answer the Question: Explain your assumptions about the data and the details of how
the distribution in question fits the model. Be able to visualize distributions and explain to the
interviewer why the distribution you visualize follows the model.
Answer:
a. Poisson distribution. This is assuming that customer arrivals are entirely independent from
each other.
b. Normal distribution. Note that in a continuous distribution the likelihood of being exactly X
inches is zero.
c. P(x1 + x2 = Y) for Y in {0, 1, (2, 12), (3, 11), (4, 10), ...} is {0, 0, 1/36, 2/36, 3/36, ...}:
sums outside 2 to 12 have probability 0, and the probability peaks at 6/36 for a sum of 7. You can
plot this out where the x axis is the sum, and the y axis is the probability. Illustrate that this is a
probability mass function vs a continuous probability density function.
d. Binomial distribution. P(k is the number of heads in N throws) = C(N, k) * p^k * (1 - p)^(N - k),
with p = 0.5 for a fair coin.
Note that this visualization says there is roughly a 25% chance (252/1024) you will get 5 coins out
of 10 to be heads.
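Each of the four answers can be checked numerically with the standard library. The sketch below is illustrative: the function names and parameters (an arrival rate per minute, a mean and standard deviation for height) are assumptions introduced for the example, and exact fractions are used for the dice so the probability mass function is easy to inspect.

```python
import math
from collections import Counter
from fractions import Fraction


def poisson_pmf(k, rate, t):
    """a. P(k arrivals within t minutes), given `rate` arrivals per minute."""
    lam = rate * t
    return math.exp(-lam) * lam ** k / math.factorial(k)


def normal_sf(x, mean, std):
    """b. P(height >= x) under a normal distribution."""
    return 0.5 * math.erfc((x - mean) / (std * math.sqrt(2)))


def two_dice_pmf():
    """c. Probability mass function of the sum of two fair 6-sided dice."""
    counts = Counter(a + b for a in range(1, 7) for b in range(1, 7))
    return {total: Fraction(n, 36) for total, n in counts.items()}


def binomial_pmf(k, n, p=0.5):
    """d. P(k heads in n throws) for a coin with heads probability p."""
    return math.comb(n, k) * p ** k * (1 - p) ** (n - k)
```

For instance, `binomial_pmf(5, 10)` reproduces the roughly 25% figure for 5 heads in 10 throws, and `two_dice_pmf()[7]` gives the 6/36 peak of the dice distribution.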
Coding Questions
A large part of a data science role, especially if it is more focused on data engineering, is
programming to implement algorithms at scale. Be prepared to face something similar to a
software engineering interview where you'll be tested on your experience with the technical tools a
company uses and your overall knowledge of programming theory.
1 SQL: Given a table of transactions (Transaction_ID, Item_ID, quantity, purchase_date
(MM/DD/YY)) and another table of prices (item_ID, price), give the following information:
1. Total revenue
2. Total number/average/standard deviation of purchase quantities for the set of weekdays
(Monday-Friday) ordered by descending number of purchases.
3. Number of item_IDs that were NOT purchased on the weekdays.
Example table of transactions (defined as transactions):
Transaction_ID   Item_ID   Quantity   Purchase_Date
                                      06/28/2016
                                      06/27/2016
                                      06/27/2016
                                      06/26/2016
Example table of prices (defined as prices):
item_ID   Price
          $2
          $3
Rationale: The use of SQL to query databases is prevalent in larger startups and established
companies looking to leverage their data. If you are a data analyst, your technical interview
may exclusively be SQL questions. Understanding how to get data the right way can make the
difference between getting a job and not.
How to Answer the Question: You will often be asked to sketch out your code on paper or
work with a collaborative coding tool like HackerEarth, where you will be coding in the interpreter
and your code is seen live by your interviewer. Make sure you try for the most efficient solution
with as few errors as possible given a short time constraint. Use something like SQLFiddle if you
want to practice your SQL querying skills!
Answer:
1. SELECT SUM(a.quantity * b.price)
   FROM transactions AS a
   JOIN prices AS b ON a.item_ID = b.item_ID
This will join the price column from the prices table onto the transactions table, allowing you to
multiply the quantity of each item with its price and then to sum up that multiplication. This will
yield an answer of $37 for our two example tables.
2. SELECT DAYOFWEEK(purchase_date), SUM(quantity), AVG(quantity), STD(quantity)
   FROM transactions
   WHERE DAYOFWEEK(purchase_date) BETWEEN 2 AND 6
   GROUP BY DAYOFWEEK(purchase_date)
   ORDER BY 2 DESC
This query uses the DAYOFWEEK function in MySQL, which returns a numeric index for the day of
the week of a calendar date, from 1 to 7, with 1 corresponding to Sunday and 7
corresponding to Saturday. Filtering, selecting, and then ordering by descending quantities
satisfies part 2 of the question.
If you ran the query on the sample table, you'd get the following output, with 2 corresponding to
Monday (June 27th, 2016):
3. Two approaches (using Left Join vs. Group By):
a. SELECT COUNT(DISTINCT A.item_ID)
   FROM transactions AS A
   LEFT JOIN (
       -- items that have at least one weekday purchase
       SELECT DISTINCT item_ID
       FROM transactions
       WHERE DAYOFWEEK(purchase_date) BETWEEN 2 AND 6
   ) AS B ON A.item_ID = B.item_ID
   WHERE B.item_ID IS NULL
b. SELECT COUNT(*)
   FROM (
       SELECT item_ID
       FROM transactions
       GROUP BY item_ID
       -- keep only items with zero weekday purchases
       HAVING SUM(CASE WHEN DAYOFWEEK(purchase_date) BETWEEN 2 AND 6 THEN 1 ELSE 0 END) = 0
   ) AS weekend_only
Either approach narrows the transactions down to the items that were never purchased on a
weekday, then counts them.
Tips for SQL Questions:
1. Do small queries first instead of going to the subqueries. Break the problem down to
specific intermediate tables, and do the queries for those intermediate tables first.
2. Be careful of the column you do the join on. Ask whether you want to keep rows where
there wasn't a match (i.e. left join if needed).
3. If you don't know the exact transformation function, assume the existence of one, state the
input/output to the interviewer, and move on.
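If you'd like to practice the three parts of this question without setting up a database server, Python's built-in sqlite3 module is enough. The sketch below is an adaptation, not the MySQL answer itself: the rows are hypothetical, dates are stored as ISO strings, the weekday test uses SQLite's strftime('%w') (0 for Sunday through 6 for Saturday) in place of DAYOFWEEK, and the standard deviation is left out because SQLite has no built-in function for it.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE transactions (transaction_id INTEGER, item_id TEXT,
                           quantity INTEGER, purchase_date TEXT);
CREATE TABLE prices (item_id TEXT, price REAL);
-- Hypothetical rows; 2016-06-26 is a Sunday, 06-27 a Monday, 06-28 a Tuesday.
INSERT INTO transactions VALUES
    (1, 'A', 2, '2016-06-28'),
    (2, 'B', 1, '2016-06-27'),
    (3, 'A', 3, '2016-06-27'),
    (4, 'C', 5, '2016-06-26');
INSERT INTO prices VALUES ('A', 2), ('B', 3), ('C', 4);
""")

# SQLite's strftime('%w') returns '0' (Sunday) through '6' (Saturday).
WEEKDAY = "CAST(strftime('%w', purchase_date) AS INTEGER) BETWEEN 1 AND 5"

# Part 1: total revenue.
revenue = conn.execute("""
    SELECT SUM(t.quantity * p.price)
    FROM transactions AS t JOIN prices AS p ON t.item_id = p.item_id
""").fetchone()[0]

# Part 2: per-weekday totals and averages, largest total first.
weekday_stats = conn.execute(f"""
    SELECT strftime('%w', purchase_date) AS dow, SUM(quantity), AVG(quantity)
    FROM transactions
    WHERE {WEEKDAY}
    GROUP BY dow
    ORDER BY 2 DESC
""").fetchall()

# Part 3: number of items with no weekday purchase at all.
weekend_only_items = conn.execute(f"""
    SELECT COUNT(*) FROM (
        SELECT item_id FROM transactions
        GROUP BY item_id
        HAVING SUM(CASE WHEN {WEEKDAY} THEN 1 ELSE 0 END) = 0
    )
""").fetchone()[0]
```

Working through a tiny table by hand first, then confirming with a query, is also a good habit to narrate during the interview itself.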
2 Develop a K Nearest Neighbors algorithm from scratch (algorithm coding)
Rationale: Showing you can write out the thinking behind an algorithm and deploy it efficiently
under a given time constraint will be a critical way to evaluate data engineering skills. This kind of
question will be asked of data scientists who have knowledge of both algorithms and their
technical implementation, or of data engineers who are given context on what the algorithm is. This
question can be asked of any algorithm, but most of the time interviewers will use K nearest
neighbors, as it's relatively easy to come up with code that can work.
How to Answer the Question: First, clarify the question. Given a feature vector, find the
Euclidean distance from that vector to every other known vector, and take the class that is the
majority within the closest K vectors. This particular question tests your understanding of matrix
computation and how to deal with vectors and matrices. Start by going through a sample set of
inputs and outputs, and manually derive the answer. Also, keep an eye on the time/space
complexity. In a straightforward solution that computes every distance and then sorts them, each
prediction is of O(N log N) time complexity, where N is the number of rows of training data.
You will want to write down your solution. Syntax counts, and so do various faults that will stop
your code from compiling properly, but they don't count as much as expressing the logic behind the
algorithm and showing how you can apply algorithmic thinking to the plane of computer science.
Solution:
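Here's a minimal sketch of one way to write it, assuming plain Python lists for the feature vectors and a full sort of the distances. The function names, the default K of 3, and the toy training rows are all choices made for this example; in an interview, you'd agree on inputs and outputs with your interviewer first.

```python
import math
from collections import Counter


def euclidean(a, b):
    """Straight-line distance between two equal-length feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def knn_predict(train_X, train_y, query, k=3):
    """Return the majority label among the k training rows nearest to query."""
    # Computing N distances costs O(N * d); sorting them costs O(N log N).
    distances = sorted(
        (euclidean(row, query), label) for row, label in zip(train_X, train_y)
    )
    nearest_labels = [label for _, label in distances[:k]]
    return Counter(nearest_labels).most_common(1)[0][0]


# Toy training data: two well-separated clusters.
train_X = [[1.0, 1.0], [1.2, 0.8], [0.9, 1.1], [5.0, 5.0], [5.2, 4.9], [4.8, 5.1]]
train_y = ["a", "a", "a", "b", "b", "b"]
```

A natural follow-up to raise yourself is that a heap can pick out the K smallest distances without a full sort, which helps when N is large and K is small.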
Other coding questions can be more big data specific. For example, asking about MapReduce is a
typical question in the case that the position requires analysis of very large datasets. Questions
here ask how to take the average of a large dataset or find the most frequent event in an event
stream.
3 How does word count MapReduce work on Hadoop?
Rationale: You will get questions about Hadoop and big data tools if you indicate on your CV
that you have experience with them, or if the company in question deals with massive datasets.
Larger Fortune 500 companies and tech startups that have scaled beyond millions of users are
likely to challenge you on your use of big data. You should demonstrate a knowledge of
MapReduce, which can come from work experience or playing around with massive datasets on
your own. Hortonworks has resources dedicated to helping people learn MapReduce if you need to
brush up.
How to Approach the Answer: This question sees how deeply you understand the MapReduce
framework on Hadoop. This is typically done using Java. Although the word count problem is an
extremely commonly understood one, knowing how it's implemented within the Java Hadoop
framework is the important piece here.
Answer: The driver code would set up the job and configuration. If the data comes from HDFS
and output is written to HDFS, set the job's input/output paths to those directories. Then the
mapper would take each line in the file and emit a value of 1 for each word as the key. Note that
the data passed between mapper and reducer must use the Hadoop data structures such as Text
and IntWritable, since these are more efficient for byte array serialization vs. primitive types such
as Strings and Ints. The mapper output would then be collected in each executor, and then the
combiner task would be executed. The combiner is a local aggregator that is optionally set to
reduce the amount of data sent between the mappers and reducers.
The shuffle phase can only finish once all the mappers are complete. You might observe your
jobs stuck at 33% reducer progress, which typically means the reducers are still waiting on the
mappers to complete. Once all the keys are sent to the reducers based on this shuffle, the sort
phase begins on each reducer. After that, the reduce logic is executed, and the output can be
written to another HDFS file.
Common follow-up questions in interviews would be to estimate the time complexity of this
algorithm, and the amount of data the system writes or communicates between machines. Don't
forget to take redundancy into account, i.e. a Hadoop system usually makes multiple copies of data
in case a machine goes down.
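The flow described above (map, combine, shuffle, sort, reduce) can be simulated in a few lines to make sure you can narrate each phase. This is a single-process sketch in Python, not Hadoop code: it assumes each input split is a list of lines, and it stands in for the framework's partitioner with a simple hash of the key.

```python
from collections import defaultdict


def mapper(line):
    """Emit (word, 1) for every word in one line of input."""
    return [(word.lower(), 1) for word in line.split()]


def combiner(pairs):
    """Locally sum counts per word before anything crosses the network."""
    local = defaultdict(int)
    for word, count in pairs:
        local[word] += count
    return list(local.items())


def shuffle(mapper_outputs, num_reducers):
    """Route each key to one reducer partition and group its values."""
    partitions = [defaultdict(list) for _ in range(num_reducers)]
    for pairs in mapper_outputs:
        for word, count in pairs:
            partitions[hash(word) % num_reducers][word].append(count)
    return partitions


def reducer(partition):
    """Sort the keys, then sum the grouped counts for each word."""
    return {word: sum(counts) for word, counts in sorted(partition.items())}


def word_count(splits, num_reducers=2):
    """Run every phase in order over a list of input splits."""
    mapper_outputs = [
        combiner([pair for line in split for pair in mapper(line)])
        for split in splits
    ]
    result = {}
    for partition in shuffle(mapper_outputs, num_reducers):
        result.update(reducer(partition))
    return result


# Two hypothetical input splits, as if stored in separate HDFS blocks.
splits = [["the quick brown fox", "the lazy dog"], ["the fox jumps"]]
counts = word_count(splits)
```

Notice that the combiner shrinks the first split's output from seven pairs to six before the shuffle; on real data, that saving in network traffic is the reason to set one.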
Scenario Questions
1 If you were a data scientist at a web company that sells shoes, how would you build a system
that recommends shoes to a visitor? (Question asked in Verizon Data Scientist Interview)
Rationale: This question tests how you think about your work in terms of delivering products
from end to end. Scenario questions don't test for knowledge in every field; they are set to explore
a product from beginning to delivery and see what limits the candidate would have. While also
evaluating for holistic knowledge of what it takes to manage a team to deliver a final product, this
question is meant to see how the candidate would fit into a team situation.
Typically, data scientists will be asked this question, while data engineers or analysts might be
asked for specific parts of the scenario relevant to them. Data engineers might be asked to think of
how to implement a certain algorithm at scale without having to think of the algorithm itself, while
data analysts might be asked what data they'd query to determine users' historical preferences for
shoes.
How to Answer this Question: Be very honest as to where you can add a lot of value
(emphasize what parts you've had experience in), but don't be shy about where you expect to get a
little bit of help. Try to relate how your technical knowledge can help with business outcomes, and
always enumerate the thought process behind your choices and the assumptions that guide them.
Don't hesitate to ask questions that can better tailor your answer.
Answer: Break the answer into two components: data science and data engineering.
Let's discuss the data science element first. If it is a new company that does not have much
historical user data, go with item-item similarity. If the number of different items/shoes is
extremely large, consider using matrix factorization techniques to reduce the dimensions.
If you have historical data around user preferences (e.g. ratings of shoes), you can use a
collaborative filter type approach. Mention specifically the rows and columns of the matrix you
generate with either approach. Then discuss what kind of similarity metrics you would try, e.g.
Euclidean distance, Jaccard similarity, cosine distance.
After explaining the algorithmic aspect, you would discuss the data engineering side. Propose an
engineering infrastructure that scales to millions of users/shoes where recommendations are
generated in real time. As an example, you can stream the user data to an S3 bucket. You can
perform the matrix analysis on a nightly basis, precompute the entire set of recommendations on
a per-user basis, and store this in an in-memory database such as Redis. Then you could build a
REST API that would query the database and respond with the recommendations given a user id.
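When you mention the rows and columns of the matrix, a tiny worked example helps. The sketch below assumes a ratings matrix with users as rows and shoes as columns, and uses cosine similarity between item columns; the users, shoes, and ratings are invented, and the final dictionary stands in for the precomputed table you would store in something like Redis.

```python
import math

# Hypothetical ratings: rows are users, columns are shoes, 0 means unrated.
ratings = {
    "ann": {"runner": 5, "trainer": 4, "loafer": 0, "boot": 1},
    "bob": {"runner": 4, "trainer": 5, "loafer": 1, "boot": 0},
    "cara": {"runner": 0, "trainer": 1, "loafer": 5, "boot": 4},
    "dan": {"runner": 1, "trainer": 0, "loafer": 4, "boot": 5},
}
ITEMS = ["runner", "trainer", "loafer", "boot"]


def item_vector(item):
    """One column of the matrix: every user's rating for this shoe."""
    return [ratings[user][item] for user in sorted(ratings)]


def cosine_similarity(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def similar_items(item, top_n=1):
    """Rank the other shoes by how similarly users rated them."""
    scored = [
        (cosine_similarity(item_vector(item), item_vector(other)), other)
        for other in ITEMS
        if other != item
    ]
    return [name for _, name in sorted(scored, reverse=True)[:top_n]]


# Stand-in for the nightly batch job: precompute every item's neighbors.
precomputed = {item: similar_items(item) for item in ITEMS}
```

At serving time, the API only does a key lookup in `precomputed`, which is what keeps recommendations fast even though the similarity math ran hours earlier.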
2 How much is the monetary value of a share of a Change.org petition on Facebook?
(Change.org interview question)
Rationale: The intent of this question is to see how much you understand about the business and
how well you can break a fairly complex problem down to basic concepts and then convert these
concepts to analyzable chunks based on the available data. This is a good test to see how well you
can absorb a company's framework for the data and how well you can communicate business
insights derived from your data analysis.
How to Answer the Question: Make sure you research the companies you interview for
thoroughly, especially their revenue model. Get a sense for what important metrics the company
would use to track its performance, and get used to thinking about what actions a company must
drive to make revenue. Ask questions and state any assumptions you might have, which sketch out
how you're thinking about this problem, then answer with force and conviction as if you're
presenting to your supervisor.
The Answer: This question requires some basic understanding of the Change.org business. A
share of a petition can result in revenue generation in two different ways:
1. Another user clicking on an advertisement (i.e. signing a paid petition)
2. A new user signing up on the system who then goes on to click on a set of advertisements
during that user's lifetime
The first step is figuring out a methodology that would allow you to derive a value for both of these
ways. The trick is to start simple. You can simplify the value equation to the following:
Value of a share = Expected revenue from clicking an ad + Average number of new signups per
share event * Lifetime Value of a new signup
Expected revenue from clicking an ad = Likelihood of an advertisement click * Average cost per
click charged to publishers
Likelihood of an advertisement click can be derived by just looking at the historical data and
finding the average conversion rate over the course of a time window such as a month or year. A
similar value can be derived for the cost per click.
For the LTV, it's a little tricky. You need to look at users over the course of similar lifetimes and
derive their total revenue generated. One common method of doing this is called cohort
analysis or retention analysis. You can group users that signed up in a specific month and look at
how many of them clicked how many ads over the course of the next twelve months. Do this over
twelve different cohort months, and then take the average revenue over the lifetime. Now, the
lifetime to analyze can be set to be however long it takes for that cause-and-effect relationship to be
considered negligible, i.e. the user that signed up due to the initial share would have signed up
anyway beyond that time window, hence the revenue generated cannot be solely attributed to the
share.
Once you have the LTV, plug it into the original equation, and you have the value of a petition
share. There are deeper elements you can go into, such as the revenue generated by the newly
joining users sharing themselves, which causes other users to join. Make sure that if you are going
to include additional elements in your answer, they don't dilute your main message. Stay
laser-focused on answering the original question. If you have assorted thoughts on the situation,
leave them to the end.
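It can help to show the interviewer that the value equation is easy to turn into something computable. The sketch below mirrors the equations above term by term; every number in it is made up purely to exercise the formula and says nothing about Change.org's real figures.

```python
def expected_ad_revenue(click_rate, revenue_per_click):
    """Likelihood of an advertisement click times the average cost per click."""
    return click_rate * revenue_per_click


def lifetime_value(cohort_revenues, cohort_sizes):
    """Average revenue per new user across monthly signup cohorts."""
    per_user = [rev / size for rev, size in zip(cohort_revenues, cohort_sizes)]
    return sum(per_user) / len(per_user)


def value_of_share(click_rate, revenue_per_click, signups_per_share, ltv):
    """Expected ad revenue plus expected new signups times their LTV."""
    return expected_ad_revenue(click_rate, revenue_per_click) + signups_per_share * ltv


# Every number below is hypothetical.
ltv = lifetime_value(cohort_revenues=[1200.0, 900.0, 1500.0], cohort_sizes=[400, 300, 500])
share_value = value_of_share(
    click_rate=0.02, revenue_per_click=1.50, signups_per_share=0.1, ltv=ltv
)
```

Writing it this way also makes your assumptions explicit: each argument is a quantity you would need to estimate from historical data, and you can say so as you go.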
3 Given a set of historical news articles that have been classified into specific categories such as Sports, Politics, and World, how would you classify a new article?
Rationale: This question looks at how deeply you understand the data science methodology and your experience with unstructured text data, an important test of how comfortable you are with data formats that might be difficult to deal with.
How to Answer the Question: Specify how you would organize the text and how you think of classification systems.
Sample solution:
1. Explore the data and understand key elements of the data.
a. Plot the distribution of the various categories in your training set to determine if there is label imbalance.
b. Look at the text to identify anything strange, such as non-English text, heavy abbreviations, or misspellings.
c. Do topic extraction to identify keywords for specific latent topics and find their correlation to the labelled categories. This may give you a hint as to whether there are latent topics (keywords) that correlate better than just using all the words.
2. Derive the training set by cleaning up the text. Remove less informative elements such as punctuation, abbreviations, and unicode characters. Do further cleaning by lowercasing words and applying lemmatization/stemming.
3. Use a TF-IDF vectorizer to convert the data to a bag-of-words model with the TF-IDF metric. Set lower and upper bounds on TF-IDF to reduce the vocabulary size.
4. Build a pipeline where you can train various models and compare their performance on metrics such as AUC, F1 score, precision, and recall. You can do a grid search to automate the cross-validation aspect as well.
5. Once you get the optimal model, you can publish this model to production using a pickled model (in Python) or a POJO (in Java). This model can then be queried by applying the exact same cleaning process as in #2 and #3 to the new articles.
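Steps 2, 3, and 5 above can be sketched without any libraries. The example below is a deliberately small stand-in, not the pipeline itself: it uses a crude regex tokenizer instead of real cleaning and stemming, and a nearest-centroid cosine classifier instead of a tuned model. In practice you would more likely reach for a library such as scikit-learn. All function names and the four training sentences are invented for illustration.

```python
import math
import re
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase and keep alphabetic tokens only (a crude stand-in for step 2)."""
    return re.findall(r"[a-z]+", text.lower())


def fit_idf(documents):
    """Smoothed inverse document frequency for each term in the training set."""
    n = len(documents)
    doc_freq = Counter(term for doc in documents for term in set(tokenize(doc)))
    return {term: math.log((1 + n) / (1 + df)) + 1 for term, df in doc_freq.items()}


def tfidf(text, idf):
    """L2-normalised TF-IDF vector as a sparse dict; unseen terms are dropped."""
    counts = Counter(t for t in tokenize(text) if t in idf)
    vec = {t: c * idf[t] for t, c in counts.items()}
    norm = math.sqrt(sum(v * v for v in vec.values())) or 1.0
    return {t: v / norm for t, v in vec.items()}


def train(documents, labels):
    """Sum the TF-IDF vectors of each category into one centroid per label."""
    idf = fit_idf(documents)
    centroids = defaultdict(Counter)
    for doc, label in zip(documents, labels):
        centroids[label].update(tfidf(doc, idf))
    return idf, centroids


def predict(text, idf, centroids):
    """Clean the new article exactly like the training data, then pick the
    category whose centroid has the highest cosine similarity."""
    vec = tfidf(text, idf)

    def cosine(label):
        centroid = centroids[label]
        norm = math.sqrt(sum(v * v for v in centroid.values())) or 1.0
        return sum(w * centroid[t] for t, w in vec.items()) / norm

    return max(centroids, key=cosine)


docs = [
    "The team won the football match",
    "The striker scored a late goal in the match",
    "Parliament passed the election bill",
    "The senator won the election vote",
]
labels = ["Sports", "Sports", "Politics", "Politics"]
idf, centroids = train(docs, labels)
print(predict("A goal in the final match", idf, centroids))
print(predict("The election vote", idf, centroids))
```

The point worth making out loud in an interview is the last one in the list: `predict` reuses `tokenize` and the fitted `idf`, so new articles go through exactly the same transformation as the training set.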
4 Design an experiment to figure out which web design alternative to use. Assume there have been no other experiments done and there is no knowledge of the user behavior. Discuss potential issues that can occur with the conclusions and how to avoid them.
Rationale: Many web companies ask this question because it is their bread and butter to optimize their website for better business results. Think of Facebook constantly changing their homepage to get you to post more. The data scientist's role is often to help the product manager set up the experiment or interpret the experiment results. The goal of the question is to see the depth of the interviewee's knowledge of this topic.
Solution: Identify the nature of the change and the metric to use in deciding which version of the site to choose, for example click-through rate or average number of Facebook shares. Next, decide the number of samples/visits necessary to reach the required statistical significance (e.g. 95%). This can be done by using a chi-squared test (if we are using a binomial random variable of clicking vs. not clicking) or a z-test (if we are using a normally distributed random variable). You can then evaluate the p-value to identify whether the metric of the B test is statistically significantly different from the metric of the baseline A test. If it is, and the metric is better than the baseline, then the alternative site is the better way to go.
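As a rough sketch of the click vs. no-click case, the code below runs a two-proportion z-test (for a 2x2 table this matches the uncorrected chi-squared test) and estimates how many visits each variant needs. The sample-size formula is the standard normal approximation; the 80% power default and all the traffic numbers are assumptions added for the example, not values from the question.

```python
from math import ceil, sqrt
from statistics import NormalDist


def two_proportion_z_test(clicks_a, visits_a, clicks_b, visits_b):
    """Two-sided z-test for a difference in click-through rate between A and B."""
    p_a = clicks_a / visits_a
    p_b = clicks_b / visits_b
    pooled = (clicks_a + clicks_b) / (visits_a + visits_b)
    se = sqrt(pooled * (1 - pooled) * (1 / visits_a + 1 / visits_b))
    z = (p_b - p_a) / se
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_value


def visits_per_variant(baseline_rate, min_detectable_lift, alpha=0.05, power=0.8):
    """Approximate visits needed per variant to detect an absolute lift."""
    p1 = baseline_rate
    p2 = baseline_rate + min_detectable_lift
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_beta = NormalDist().inv_cdf(power)
    variance = p1 * (1 - p1) + p2 * (1 - p2)
    return ceil((z_alpha + z_beta) ** 2 * variance / (p2 - p1) ** 2)


# Hypothetical experiment: 2.0% baseline click-through vs. 2.6% on the new design.
needed = visits_per_variant(baseline_rate=0.02, min_detectable_lift=0.006)
z, p_value = two_proportion_z_test(200, 10_000, 260, 10_000)
print(f"visits needed per variant: {needed}")
print(f"z = {z:.2f}, p-value = {p_value:.4f}")
```

Deciding the sample size before the experiment starts, rather than stopping as soon as the p-value dips below 0.05, is one of the easiest ways to avoid a misleading conclusion.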
Some other issues you should consider in this answer:
1) Identify potential biases due to interactions across pages. Talk to the product manager and see if there are ways in which random sampling may not work for testing the change you're proposing for a web page.
2) Perform an A/A test, which means testing two random samples of visitors on the same version of the site, and check that the distribution and the metric of choice do not show a statistically significant difference. This will ensure the fairness of the A/B test. An A/A test ensures that your audience doesn't have a particular skew or bias and that a randomized selection for an A/B test will be statistically relevant.
3) What if the metric that we are evaluating has significant outliers that may cause the average to be a poor metric? The distribution may be highly skewed. We assume the average is a good metric of comparison since the central limit theorem holds. This may not be true. Hence, check the distribution of the metric to ensure that taking an average (e.g. conversion rate or average number of shares per user) is a reasonable metric when comparing alternatives. If one user has thousands of shares attributed to their account, for example, using share rate per user may not be the best performance metric.
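The outlier concern in point 3 is easy to check before trusting an average. The sketch below compares the mean with the median and measures how much of the total a single user contributes. The function name and the toy share counts are invented for illustration.

```python
from statistics import mean, median


def skew_report(shares_per_user):
    """Summarise how far a single heavy user can drag the average."""
    total = sum(shares_per_user)
    return {
        "mean": mean(shares_per_user),
        "median": median(shares_per_user),
        "top_user_share_of_total": max(shares_per_user) / total if total else 0.0,
    }


# Nine ordinary users and one account with thousands of shares.
shares = [0, 1, 1, 2, 2, 3, 3, 4, 4, 5000]
report = skew_report(shares)
print(report)
```

When the mean sits far above the median like this, a more robust metric (the median, a capped value per user, or the fraction of users who shared at least once) is usually a fairer basis for comparing the two designs.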
In summary, case questions are designed to test your experience and your knowledge in different fields of data science. They are designed to see if you have any limits to your ability. Demonstrate your knowledge thoroughly, and you'll come off well in any case analysis.
Tackling the Interview
1) Dress accordingly. If it's an interview for a startup, a dress shirt will suffice. If it's an interview with a bank, wear a suit and tie. If you're unsure of what to wear, ask.
2) Before you come into the interview, research your interviewer and the company. Come up with good questions to ask.
3) Be at the top of your game mentally. Eat well, be hydrated, exercise well, and do whatever you can to make sure you're prepared to handle an interview.
4) Answer questions in detail, and sketch out your thought process.
5) Smile, and be confident. Don't come in stressed. Meditate, stretch, or read; do whatever it takes to get you to your peak.
Conclusion
The data science interview process is a multifaceted beast. You'll be challenged to program and come up with technical algorithms on the spot. You'll be challenged on your statistical and mathematical knowledge. You'll be challenged on your ability to lead teams, communicate, persuade, and influence.
It can be hard to see how to pass this beast of an interview process. Thankfully, we condensed actionable insights from successful applicants and the hiring managers on the other side of the table.
Interview with Will Kurt (Quick Sprout)
Bio: Will Kurt is a Data Scientist with Quick Sprout. His main interests are probability, writing, and Haskell. He blogs at CountBayesie.com and can be found on Twitter as @willkurt
What do you look for when you're hiring candidates?
The biggest thing for me has always been a combination of creativity and genuine curiosity. In a startup environment, new problems come up every day in a wide range of areas. One month you may be helping the product team add new features. The next month, you'll help sales improve their process, and the month after, you'll be helping marketing restructure their testing setup. The most valuable candidates are the ones interested in all of the company's data-related problems and always thinking of new and interesting ways to solve them.
What's the best piece of advice you can give to people going through the data science interview process?
In my experience, all small companies and startups worth working for are excited about the idea of adding a new data scientist to the team. They hope your skills and experience will help them solve a range of problems they've been struggling with. Show up to the interview ready to listen to what they're trying to solve and get them excited about solving problems together. Every chance you get, ask people what they're working on and get them brainstorming with you about ways you could make their day better. There are thousands of candidates out there with superb quantitative skills, but candidates who care and are excited are very rare. Leave the interview with everyone wanting to work with you on a project, and they'll be the ones hoping you say yes.
What kind of interview questions do you like to ask? What are you trying to test?
All I care about is how your mind works once it's fixed itself on an interesting problem. At Kissmetrics, I gave out an open-ended homework assignment. There was an obvious approach to the problem (build a classifier), but I mentioned this and cautioned that part of the test was to see if you could come up with something interesting. The results of the assignment didn't have to be long or complicated. What mattered is that they started a conversation and showed that the candidate had genuine curiosity in finding something worth talking about. Given that a candidate can code and is comfortable with linear algebra, calculus, and probability, they have the basics to learn everything else. It is very hard to teach someone to think creatively or become passionate about problems.
What is different about how Kissmetrics and Quick Sprout hire data scientists?
Right now, Quick Sprout is a very small team in the early stages of product development, so we're not hiring new data scientists at the moment. One thing that aspiring data scientists should know is that many startups and small companies are looking for a data scientist but may have given up on finding one, as the search process can be exhausting. One of our best candidates at Kissmetrics showed up at our door and said, "I want to work here!" People coming from academia or other large organizations might not be aware of how flexible startups and small companies can be when it comes to hiring. If you think a company is doing cool work, connect with them. It's hard to make a better impression on a group of people excited about their work than telling them you love what they're doing and want to be a part of it. Even if that company isn't hiring, you'll be at the top of the list if/when they do start looking.
Interview with Matt Fornito (OpsVision Solutions)
Bio: Matt Fornito is a Data Scientist and Leader with over ten years of experience in the research, analytics, and management domains. A passion for learning and a devout work ethic continue to help him grow. This interview is transcribed from notes taken on a phone call with Matt.
What do you look for when you're hiring candidates?
I feel most comfortable hiring people with a strong quantitative background who can learn programming rather than the other way around. A Master's or a PhD is very important to me, as I feel that undergrad is not a strong signal of success; it's a relative breeze for most people. I prefer hiring people able to pick up programming and effective communication: knowing and understanding what the technical problems are in implementing a solution, and being able to communicate those concepts, is key. What differentiates data scientists from data analysts is the ability of data scientists to deeply understand data problems and how to solve them.
I like recruiting Master's and PhD graduates from math and statistics, chemistry, physics, bioinformatics, and engineering. A small handful of people from MBA programs have also worked out great for me. I actually have a PhD in organizational psychology, so though I tend to try to hire people with STEM backgrounds, it isn't a strict limitation.
What's the best piece of advice you can give to people going through the data science interview process?
Recruiters look at education level, the last two jobs on the CV, and their pedigree. HR only takes a very quick glance at CVs, so you have to stand out in a matter of seconds. One piece of advice: get yourself into a big company that has a pedigree, like Facebook, or go into a startup and take a high position so that you can stand out easily for advanced data science roles.
"Walk me through a project" questions, where a hiring manager will ask exactly how you built something in the past, are huge. Everything from what data was used and what tools were used to what the outcomes were is important to recount clearly. Successful interviewees have a comfortable grasp of what they've worked on and are ready to tell the story of that work and relate how it impacted the business they were working for.
What are you testing for?
Questions I ask involve working around a project to test problem-solving and communication skills across the interview. I am also assessing a candidate's passion for the company and data science. A drive for continuous learning and a love of problem solving are key differentiators. Then on the technical side, I am interested in seeing candidates work on how to optimize data with Hadoop and Spark and work through the tradeoffs between different data science solutions. Do they think like a data scientist? Have they done data science work? These are important questions I am looking to uncover with my interview process.
I will then go into math questions, such as asking how gradient descent, statistical techniques, and random forests work. A couple of situational questions, where the candidate is put through a hypothetical client situation, are deployed to see how the candidate would handle interfacing with clients. I have a strict requirement of ability to program in Python or R, but I am flexible with C++ and Java. I don't believe in HackerRank-like testing situations where you are expected to trace out a solution; I would rather test for adaptation to new programming languages and an ability to learn skills rapidly. Anybody hired is going to have to have the latent skill of adaptability, and that is the key thing I am testing for.
Interview with Andrew Maguire (PMC/Google/Accenture)
Bio: Andrew has been working in Analytics/Data Science for 7 years in various roles across many different industries. He is a Data Scientist at Penske Media Corporation focusing on both data engineering infrastructure as well as applied business analytics. Prior to this position, he worked at Google (marketing analytics, then local data quality), Accenture's Analytics Innovation Centre (consultancy), and Aon's Center for Innovation and Analytics (product development team).
What do you look for when you're hiring candidates?
Beyond meeting the basic requirements from a technical and experience point of view, I'd say enthusiasm, willingness, and ability to continually learn new things are key.
A good attitude is super important, so asking someone to tell me about their weaknesses as well as their strengths is a good way to draw this out (sometimes selling too hard is a bit off-putting; humility is much better).
Being approachable, open, and honest is something that's key on the 'team fit' side. You don't have to know the answer to everything, but being able to work with others to come up with a decent solution is crucial.
What's the best piece of advice you can give to people going through the data science interview process?
On the technical stuff, take your time, write stuff down, and ask clarifying questions. Also don't be afraid to tell them if it's an area you've not worked on before or an algorithm you're not that familiar with. Being able to admit when your knowledge is limited is super important as a data scientist; continually learning is one of the most important skills required.
Make sure you have two or three data science 'stories' you can chat about with an interviewer that touch on problem formulation, data wrangling, analysis and insights, visualization, and stakeholder communication. Try to get the balance right between cool nerdy technical stuff and showing business understanding and insights. These 'stories' can be projects from your previous roles, college assignments, or projects you did on your own time. Get good at spotting openings in interviewer questions to use your stories to show concrete examples and experience. I find that chatting (in detail) about projects the candidate has done in the past is the best way to get a proper feel for them (and the best place to probe deeper from), so make sure you make it easy for the interviewer to be interested and excited to ask you about some projects or examples from your CV.
What kind of interview questions do you like to ask? What are you trying to test?
What's the biggest or most complex dataset you have ever worked with? What problems did it create? (Trying to begin a discussion here that can lead into judging data wrangling skills and experience.)
Give me an example of a time when you analysed a dataset and communicated your findings back to the business. What was the problem faced? What did you find? How did this affect the business? (Touches on the extracting business insights and communicating back to stakeholders aspects.)
I ask questions very related to what is on the CV, so if it's a project from a previous role, for example, I want you to explain what the problem was, what sort of data you used, how you used it, what the insights were, and how this all fits into the wider business. Choose what you put on your CV very deliberately. If you find it hard to get it all on two pages, then maybe have different types of CVs you might use for different types of roles.
Finally, I ask candidates to give me an example of a time when they failed, then add what they think went wrong and what they would do differently in future. This is something that comes out of HR 101, but I like to hear what they have to say :)
What is different about how Google hires data scientists from the rest of the industry?
I'm not sure there is too much of a difference anymore. Generally it depends on the specific role. For very specialized positions that are often more like research or fellowship positions, you would get much more detailed technical questions and problems that dive into the relevant area of expertise in very fine detail. For more generalist or business-related roles, the focus is more on the right mix of technical skills, business understanding, working in teams, and communicating results to stakeholders.
The main difference at Google is that you have a lot more interviews and meet more people, so behind the scenes there are around 6+ people who have all met you and probed you from their own different angles. These people all have a different view of you and your strengths and weaknesses and must come to a decision and conclusion together, which typically involves tradeoffs. Being able to show a decent level of competence across the board, as opposed to being a rockstar in one area but letting yourself down in others, will generally serve you well. This is where attitude and being easy to get along with can be most important. Even if you fall a little short on one of the competencies, if they like you and feel you could easily get up to speed in that area in a few months, then it's less likely to be a dealbreaker.
Interview with Hristo Gyoshev (MasterClass)
Bio: Hristo Gyoshev is the Head of Business Operations & Strategy at MasterClass, a fast-growing startup that is democratizing access to genius and reimagining online education. He previously worked on corporate strategy, business operations, and product strategy at both consumer web (e.g. Yahoo!) and enterprise SaaS companies. MasterClass is looking to hire a Data Scientist & many other positions. Check out the details at careers.masterclass.com/
What do you look for when you're hiring candidates?
One of the main assets we look for is a desire to work on projects across a very broad range of analytic disciplines, from quantitative market research and/or designing, conducting, and analyzing user surveys, to statistical analysis, to business intelligence and analytics. We also look for candidates who are comfortable learning something new to remove bottlenecks and keep a project moving, when necessary.
In terms of educational background and experience, we're looking for an analytical background that combines 1. sufficient knowledge of statistics to determine what is or isn't a valid statistical inference, recognize & prevent biases, etc. and 2. the desire and ability to obtain and work with real-world data (which is always imperfect) and derive actionable insights.
Someone who has a very strong quantitative background and the ability to process and analyze data using Excel, SQL, and Python or R, who also has experience in social science research or market/user research (through either academic or industry work), and who has experience with business reporting/analytics, could be an ideal candidate for us.
What's the best piece of advice you can give to people going through the data science interview process?
Strive to understand and keep in mind the broader context of the problem you are being asked to solve or the problem behind the question you are being asked. Whenever you are asked to perform a certain analysis or build a model, someone at the company believes that this would help them solve a particular problem. Sometimes you can tell in advance that it won't, and sometimes you can suggest a better approach. Your analysis/model/other work product will always be better if you start from a good understanding of the core objectives of the clients of your analysis. (This applies as much to questions you are asked during the interview as it does to projects you are asked to work on once you get the job.)
What kind of interview questions do you like to ask? What are you trying to test?
We like to understand a candidate's previous experience with various types of work that we expect will be relevant to their role. Thus, we may ask for examples of specific types of projects they have worked on, and then ask them to walk us through their approach and thinking, the tools they used, the major challenges they encountered, and how they resolved them.
We may also ask candidates to complete a short project to see how they approach some specific problem and, yes, to be able to see the quality of a deliverable they produce.
What is different about how MasterClass hires data scientists?
Compared to most Data Science roles, the job with us involves very little machine learning or algorithms, and only minimal data wrangling, but a very wide variety of analyses that would inform a broad range of decisions about the products, business, and operations of the company. The work would, of course, involve some exporting, processing, and analyzing data from various systems, but would also involve building various predictive models; designing, conducting, and analyzing surveys or experiments; helping to define and set up reporting & metrics; and conducting one-off analyses related to various aspects of our business operations.
Correspondingly, we don't need candidates to be proficient in machine learning or algorithms, but we do need them to be highly versatile and familiar with a number of other aspects of data analysis. We also need them to be willing and able to learn tools or methods they may not have previously used.
Conclusion
Hiring managers across the board love it when you demonstrate:
1) Passion for the company and data science in general
2) An ability to get along well with everybody, which may even help you with weaknesses in your technical ability
3) Strong willingness to learn and demonstrated ability to rapidly do so
4) A strong record of previous projects and the ability to relate those projects to the impact they drove
5) Strong analytical ability
Now let's talk about the other side of the table: successful applicants who now work as data scientists.
Data Scientist at Boeing Canada AeroInfo, Springboard graduate
[...] difficult: How long the process took. I knew to expect several interviews, and in fact had three. With nearly a week between each, plus waiting for my background check to clear, the process from first contact to firm offer took a month. It was stressful, to say the least. Staying positive, confident, and prepared [...] right before interviews to channel calm and confidence.
[...] the data science interview process? In terms of preparation, I wish I spent more time thinking about analytics strategy. I prepped hard [...] higher-level analytics methods & strategies.
Niraj Sheth
Data Analyst at Reddit, Springboard graduate
You didn't ask this, but there were also some things I did that I think worked out well. One is to have a live project up somewhere with a neat visualization (i.e. more than a GitHub repo with a readme). It doesn't have to be fancy; just prove you can build something that works (mine was a fog prediction map, for example). It definitely helps get your foot in the door.
[...] found that for myself and other people who don't have a formal data background, it can be intimidating to work on a dataset on the [...] show what I could do that way.
[...] the data science interview process? I wish I had studied more fundamental statistics [...] Type I and Type II errors. Depending on the time [...] and at least becoming familiar with the terms out there.
Sdrjan Santic
Data Scientist at Feedzai | Data Science Mentor at Springboard
How did your interview process go?
[...] relate to these same affairs, but with having to break out the math on a whiteboard.
Conclusion
The common points for success these data scientists bring to the forefront are as follows:
1) Don't think questions about basic material won't be covered. Read up on statistical fundamentals before you go through the interview process.
2) Be prepared to do well on non-technical dimensions. Companies are testing you on your communication skills and your ability to get along with future coworkers as much as they are testing you on your statistical and programming knowledge.
3) Be prepared to tell the story of who you are and why your passions and skills are uniquely valuable for the company at hand. Having relevant projects and being very clear about what you contributed to them will mark you as a candidate worthy of passing to the next round.
4) Be patient. An interview process can take a long time. You'll want to be prepared to wait.
We've provided you all that we have on the actual data science interview process. Now we have to look at what happens after you've finished interviewing.
After you've finished your data science interview, you might think your work is finished. That's not necessarily the case. Here is a list of things you can do after the interview to maximize your chances of making the best lasting impression on your potential employers.
It is now customary to send a follow-up thank you note. Most recruiters now agree that it is mandatory to do so. With each office worker receiving an average of 110 emails a day, you won't want to just stick with a boilerplate "Thank you for the opportunity" email. How you follow up on an interview can make the difference between internal advocates fighting to get you in, and apathy.
Make sure you're remembered. You'll want to send an email at the very least. Candidates who take the extra step of sending handwritten notes or a list of thoughts after the interview will stand out from the other 109 emails.
One easy way to differentiate yourself is to go beyond saying thanks. Remember what happened in the interview and make a conscious effort to tease out exactly what pain points the employer is trying to solve. If sample problems within the interview are oriented towards a technical direction, or a question notes a disconnect between different teams, you'll want to make a note of it and send in-depth thoughts on any company problems that may have surfaced during your discussion.
After all, an interview isn't just a test; it's a discussion. If you listen carefully to the questions presented and ask the right questions yourself, you will know exactly what problems the company is facing. Make sure you send them thoughts on what solutions you'd pursue.
It can be difficult to see how your different skills apply to the office, especially for somebody who has just met you. The sharpest hiring organizations will often give you a sample problem to solve that is sourced from some real issue they are facing right now. This gives you the chance to demonstrate how your efforts can impact the business in a positive manner.
Organizations that don't do that will hesitate to hire the right candidate because the candidate hasn't sufficiently demonstrated how they'd drive impact for the company in question. However, you can be proactive and use what you learned in the interview to follow up. You don't have to stop at sending them thoughts that show you listened carefully; you can give them actual, tangible solutions.
The author of this post on Forbes was told that she didn't have enough of a portfolio to get a job as a freelance copywriter. After the interview, the hiring manager told her that they liked her spirit, but were hesitant due to the lack of a portfolio. Having listened carefully throughout the interview, the candidate knew that a major project (the redesign of a website) was just over the horizon.
Instead of accepting defeat, the candidate sent ten proposed headlines for the website banner, free of charge. This burst of initiative got her the job of doing the rest of the writing for the website, and the attention of a very busy employer.
You need to have a portfolio that shows the impact you can make, but sometimes that isn't enough. If you're astute and you ask the right questions, you can find a major data problem for the company. There always is something; that's why they're hiring in the first place! There's a data project out there that everybody would love to see done, or a thorny problem that no one can figure out.
Send them a plan for what you'd do, or play with some of the data they've divulged, and give some solid insights into how you work. Proactive initiative will go a long way to getting you an offer.
One of the most awkward parts of the post-interview process is waiting for a response. You don't want to come off as desperate by following up too many times, but companies take their time if you don't engage with them proactively.
It is possible to affect the post-interview decision from outside of the company, but you should keep in mind the appropriate channel to reach somebody. Make sure to ask before the interview ends how best to reach your interviewer. Everybody has a preferred mode of communication. If they specify short emails, or checking in once in a while in person, follow that rule and dispel some of the post-interview awkwardness.
5- Leverage connections
You should have come in with strong references from both external and internal sources. If you have been building your network and providing value to it, you should have strong advocates that can support your candidacy. Check in with people who have referred you internally every once in a while, and if needed, get them to convey how excited you would be to work at the company and how lucky the company would be to hire you.
Hiring is often network-driven, and the strongest signal you can send to a potential employer is a strong network of people who are willing to go to bat for you.
No matter what, you're often going to get rejected. Sometimes, you're not right for the role, or they might have found somebody who is a better fit. It's important at this point to maintain your composure, thank the employer for their time, and move on.
People in the industry talk amongst each other, and being unprofessional at this point will only be bad karma and might get you ignored at other companies. Being professional ensures the health of your network. More importantly, a no isn't always a no. Sometimes, companies do keep your profile on file and they will reach out for a job that is the perfect fit for you.
Perhaps Winston Churchill put it best when he said, "Success is the ability to go from one failure to another with no loss of enthusiasm."
J.K. Rowling, the author of the popular Harry Potter series, shared her rejection letters from publishers. Brian Chesky, the founder of Airbnb (now valued at more than 10 billion dollars), published seven rejection letters from potential investors. In order to achieve greatness, you will have to endure rejection. Everybody successful already has.
7- Keep up hope
The interview process can be one of great anxiety. Your future can be mapped out by deciding what company you can work for. An interview can mean the beginning of a career change. It can mean moving cities. It is a period in our lives where other people have a disproportionate control over our destinies.
Nevertheless, as seen in the previous steps, you control a lot more than you think. It's important to keep your head up and do what you can. The most important thing you can do during the interview process is to keep up hope. Interviews are lengthy. Companies take time to get back to you. There are lengthy internal checks and processes before a candidate gets accepted. You may go through multiple rounds of interviews with the same company and not seem any closer to a final offer.
You have to set expectations. DJ Patil, the Chief Data Scientist of the United States (a position created for him by President Obama), took six months to transition out of academia to a job in the industry.
You should never be disheartened during your own journey.
Your goal is to get as many interesting offers as possible that you can evaluate and negotiate. While the process itself is difficult, and may take longer than you expect, once you start getting offers, you'll have earned them.
It's key to emphasize how important it is to manage your expectations and keep your hopes up. Several of the data scientists we interviewed talked about waiting months to half a year to move from an advanced degree at a prestigious school to a secure job. A lot of them had to take entry-level positions to get their foot in the door.
You might have heard a lot of great things about data science, but you'll only experience them after a lot of hard work and waiting.
Make sure you weigh what is presented to you and choose the future you deserve once you've put in all the hard work to earn it.
Handling Offers
If you finish a process successfully, you might have one offer or multiple offers. Congratulations!
Accepting an offer is a commitment of significant amounts of your time to the company in question. Always keep that in consideration. There are several factors you can use to ascertain whether or not an offer is the right one for you.
Company Culture
This might be one of the most important factors in determining whether an offer is one you should accept. Make sure you ask about the kind of company culture you're going to be a part of. Look for signs that the company has individuals that genuinely enjoy spending time with one another, and run away from generic descriptors and companies that struggle to define their culture or wave that question away. Great companies invest tons of time and effort into making sure they have awesome people who love what they do. That'll come across in your questioning.
You should also check external and objective sources such as company reviews on Glassdoor.
Approach current employees as well as former ones that you can find on LinkedIn to get their side of the story. You'll often find candid tales that can give you a good preview of what working at your new job would be like.
Team
Company culture is an extension of the team that inhabits it, but you should be excited about coming to work every day and working with everybody else. Make sure that you're working with a team that you can learn from. You are the combination of the five people you spend the most time with, and you're going to be spending a lot of time with your office team.
Location
Make sure that you're comfortable where the company is located, especially if you're moving significant distances. You can't move without great difficulty, and it's important that you feel at ease with where you live. Matters like the weather and the transit system matter to a certain degree, especially if you're going to live with those conditions for years.
An astonishing 18% of people never negotiate their salary, despite the fact that those who do typically see their salary raised by 7%.
When you first get your offer, you're at a unique leverage point that you might not see again for several years. This is the time to test what you're worth. Reach out with an offer; a company won't fire you or cancel a contract offer because you were asserting your worth. Initial offers are sent with a buffer for slight negotiation. Take advantage.
During a salary negotiation,
1) Come with a well-researched number for what you want. Look to industry averages, and get a sense from people working in the field of what you should expect. Never come into a negotiation without knowing what you want out of it.
2) Stay positive and don't push hard for what you deserve. Instead, use this as a positive experience to assert your worth and the value you can create.
3) Negotiate a little bit higher than what you think you'll actually get. Anybody experienced at negotiation will come back to you with a counteroffer, and you'd best be prepared for it.
4) Most importantly, don't fear rejection! So long as you keep the process moving forward civilly and professionally, a company will appreciate you being frank and positive at what is often the most difficult part of the recruitment process for them.
Before you accept the offer, make sure you know how committed you are to the company, team, and money.
Negotiationisalwayseasierifyouhavesomeaveragesalariestogroundyou.Ifyouhavespecific
offerstopropose,youllbestrongeratthenegotiationtable.
Herearesomefactsandfiguresthatcanstartyourresearch.
Indeed.com cites an average salary of $65,000 for data analysts, an average salary of $100,000 for data engineers, and an average salary of $115,000 for data scientists. This varies from region to region, with the highest salaries tending to cluster in the tech-heavy Bay Area. California has the highest range and median of all regions when it comes to data science, according to O'Reilly Media.
Globally, the United States has the highest median and range of data science salaries, while the UK, New Zealand, Australia, and Canada aren't far behind. Asia and Africa tend to have the lowest medians.
The highest-paying industries tend to be technology and social networking companies, while the lowest-paying ones tend to be the education and not-for-profit sectors.
Salary also varies based on the skills and tools used. O'Reilly has a definitive survey of hundreds of respondents in the industry. The study is open, and its results indicate a variety of factors that lead to different average salaries. Just as an example, people who use the Scala language extensively, a specialized type of programming, receive above $100,000 in median salary, while those who use SPSS, a proprietary tool, earn significantly less.
Taking the Offer to the Best First Day
If you've accepted an offer, congratulations! You've accomplished the goal of this whole process and broken into the job you've sought, a job that promises good compensation and the ability to drive significant social impact.
You'll have to keep that momentum going forward if you want to learn as much as you can. Be aware that companies will work to make you as comfortable as possible. You should reach out to future teammates and figure out who they are and how you can help with their problems at work. Take the time to socialize and meet as many people as you can.
More importantly, if you have time between when you accepted the offer and when you start, relax and enjoy! Make sure you catch up with as many people as you can in your life, take the chance to rest, and be completely refreshed for your first day at work.
Conclusion
The data science interview process is one of the hardest recruitment processes to crack, and it's one of the most competitive. Your fellow interviewees will be advanced degree holders, and some of them will have extensive experience in data science.
While the field is attracting many talented people, remember that it has a slew of different industries, challenges, and teams to work with. If you think outside of the box and apply a few battle-tested tactics, you'll be able to get an interview and take it all the way to an offer you love.
Split the process into its component steps, and remember what it takes to succeed. Don't search for jobs like everybody else by applying to the standard job posts and sending out forlorn cover letters. Be innovative and solve company problems proactively. Reach out to people within the organization for informational interviews. Do something different from the hundreds of other candidates, and stand out as a great technical thinker and, above all else, a proficient communicator.
Go through the technical and non-technical parts of the data science interview. Once you've mastered the thinking behind the questions and what hiring managers are looking for, you'll have a good sense of how to excel throughout the process.
Finally, when you have an offer (or several) on the table, take the time to evaluate them with good judgement. Take the time after you accept an offer to relax, skill up, and bring the momentum forward to your first day at a data science job.
Final Thoughts
"Most of the world will make decisions by either guessing or using their gut. They will be either lucky or wrong." - Suhail Doshi, CEO, Mixpanel
"The whole enterprise of teaching managers is steeped in the ethic of data-driven analytical support. The problem is, the data is only available about the past. So the way we've taught managers to make decisions and consultants to analyze problems condemns them to taking action when it's too late." - Clayton M. Christensen, management professor at Harvard
"We're entering a new world in which data may be more important than software." - Tim O'Reilly, Founder, O'Reilly Media
"Web users ultimately want to get at data quickly and easily. They don't care as much about attractive sites and pretty design." - Tim Berners-Lee
"Data scientists are involved with gathering data, massaging it into a tractable form, making it tell its story, and presenting that story to others." - Mike Loukides, VP, O'Reilly Media
Checklist
1) Map out the role your skills fit
2) Map out the industries and types of companies you want to work for
3) Prepare your LinkedIn, CV, and email templates
4) Research each company and role you want to aim for thoroughly
5) Reach out proactively to individuals within companies with informational interviews
6) Build strong networks and referrals
7) Tackle the data science interview
8) Keep up hope
9) Negotiate your offer
Templates
Getting an informational interview
Hi [first name],
I was super interested in the problems AirBnB is facing in data science. I've been aspiring to break into the field, and being a passionate follower of the AirBnB Nerds blog, I noticed that building trust with data is an important part of what drives AirBnB. Based on my background in psychology and statistics, I might be able to help come up with some creative ideas on how to foster trust. I'd love to take you out to coffee and get a greater sense of what problems AirBnB has. Perhaps I can help!
Cheers,
[your name]
[Greeting],
[Why are you interested in the company], [something the company has done that you love], [how you can help].
Reaching out to get a referral
Hi [first name],
It was great seeing you at the potluck! I've been looking around, and I'm interested in the problems Uber is facing, specifically the ones faced by data scientists on the growth team. Would you mind introducing me to the hiring manager or somebody on the team so I could see if I could help?
Cheers,
[your name]
[Greeting],
[Talk about last point of contact], [talk about interest in company and problems faced by a specific role], [ask to be introduced to hiring manager to help solve those problems]
Following up after an interview
Hi [ask how your interviewer prefers to be addressed],
It was a pleasure talking with you about Google's data science problems. I think I can help with some of the problems you've enumerated, and I look forward to the next steps in the process!
Hello [Ask your interviewer how they prefer to be addressed during the interview],
[Talk about problems you can help solve], [State that you're looking forward to next steps]
Glossary
A/B split test
An A/B split test is the gold standard of experiment design for web companies, where two groups of users are subjected to different treatments and measured to see their conversion rate to a certain goal. Optimizely, a web company dedicated to helping run A/B split tests, has a good guide on the concept.
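As a rough illustration of how the two groups in an A/B split test might be compared, here is a minimal Python sketch of a two-proportion z-test. The test itself is a standard statistical technique, but its use here is an assumption and not something this guide prescribes; the function name and the visitor and conversion counts are hypothetical.

```python
import math

def two_proportion_z_test(conv_a, n_a, conv_b, n_b):
    """Return (z, two-sided p-value) for the difference in conversion rates."""
    rate_a = conv_a / n_a
    rate_b = conv_b / n_b
    # Pooled rate under the null hypothesis that both groups convert equally.
    pooled = (conv_a + conv_b) / (n_a + n_b)
    std_err = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (rate_b - rate_a) / std_err
    # Two-sided p-value from the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value

# Hypothetical counts: 5,000 visitors per group, 200 vs. 260 conversions.
z, p = two_proportion_z_test(conv_a=200, n_a=5000, conv_b=260, n_b=5000)
print(f"z = {z:.2f}, p = {p:.4f}")
```

A small p-value suggests the difference in conversion rate is unlikely to be due to chance alone, though real experiments also need care around sample size and stopping rules.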
Feature
A nugget of information about an object, usually stored as a column in tabular data. If you measure and store the height, weight, and gender of an individual, you are storing three features about them.
Lifetime Value
The amount of revenue a customer is expected to generate over the time they spend with a company. A software-as-a-service startup that sells software by the month can calculate this by multiplying the monthly price by the number of months the customer stays.
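The simple calculation described above can be sketched in Python as follows. The function name, the $50 monthly price, and the 18-month stay are hypothetical values chosen for illustration; real lifetime value models often also account for churn and discounting.

```python
def lifetime_value(monthly_price, months_subscribed):
    """Simple lifetime value: monthly price times months the customer stays."""
    if monthly_price < 0 or months_subscribed < 0:
        raise ValueError("price and months must be non-negative")
    return monthly_price * months_subscribed

# Hypothetical plan: $50 per month, customer stays for 18 months.
ltv = lifetime_value(50, 18)
print(ltv)  # 900
```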
MapReduce
A programming model that abstracts away the difficulty of working with massive data sets by letting you treat data split across multiple servers in a way as intuitive as handling it on one. MapReduce uses parallel, distributed logic to deal with massive data sets.
Overfitting
The tendency of a model to fit too closely to past data, overgeneralizing from those insights to make inaccurate predictions in the future, dragged down by the weight of the past.
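To make the idea concrete, here is a small Python sketch under stated assumptions: the data is synthetic (y = 2x plus random noise), and the "model" is a deliberately overfit memorizer that returns the label of the nearest training point. Both are hypothetical choices made only for illustration.

```python
import random

def mse(model, points):
    """Mean squared error of a model over (x, y) points."""
    return sum((y - model(x)) ** 2 for x, y in points) / len(points)

def make_data(n, rng):
    # Hypothetical process: y = 2x plus Gaussian noise.
    data = []
    for _ in range(n):
        x = rng.uniform(0, 10)
        data.append((x, 2 * x + rng.gauss(0, 1)))
    return data

rng = random.Random(0)
train = make_data(200, rng)
test = make_data(200, rng)

def memorizer(x):
    # Overfit model: return the label of the closest training point.
    return min(train, key=lambda point: abs(point[0] - x))[1]

train_error = mse(memorizer, train)
test_error = mse(memorizer, test)
print(f"train MSE = {train_error:.3f}, test MSE = {test_error:.3f}")
```

The memorizer reproduces the training data perfectly, noise included, so its training error is zero while its error on fresh data is not. That gap between past and future performance is the signature of overfitting.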
Type I Error
A false positive is the incorrect acceptance that something is happening, akin to telling a man that he is pregnant. In technical terms, it is the incorrect rejection of the null hypothesis.
Type II Error
A false negative is the incorrect acceptance that something isn't happening. It is akin to telling a pregnant woman she isn't pregnant. In technical terms, it is the failure to reject a null hypothesis that is actually false.
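Both error types can be counted directly when the true outcomes are known. The Python sketch below is a minimal illustration; the function name and the two label lists are hypothetical, with True meaning "something is happening" and False meaning the null hypothesis holds.

```python
def error_rates(actual, predicted):
    """Return (false_positive_rate, false_negative_rate) for boolean labels."""
    pairs = list(zip(actual, predicted))
    # Type I errors: predicted an effect where none exists.
    false_pos = sum(1 for a, p in pairs if not a and p)
    # Type II errors: missed an effect that does exist.
    false_neg = sum(1 for a, p in pairs if a and not p)
    negatives = sum(1 for a in actual if not a)
    positives = sum(1 for a in actual if a)
    fpr = false_pos / negatives if negatives else 0.0
    fnr = false_neg / positives if positives else 0.0
    return fpr, fnr

# Hypothetical outcomes for eight cases.
actual = [True, True, True, False, False, False, False, False]
predicted = [True, True, False, True, False, False, False, False]
fpr, fnr = error_rates(actual, predicted)
print(f"Type I rate = {fpr:.2f}, Type II rate = {fnr:.2f}")
```

Here one of the five true negatives is flagged incorrectly (a Type I error) and one of the three true positives is missed (a Type II error).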
For more glossary terms, consult this data science glossary.
Resources
A parody of the interview process that examines some hard truths from KDNuggets.
This book, called Data Science Interviews Exposed, offers more sample questions that you can tackle with your interview practice.
The Data Science Handbook offers real-life advice from data scientists, including some smart analysis on what makes for a great data scientist and what happens during the interview process to find those individuals. Its companion, the Data Science Interview Guide, offers 120 questions you might see in a data science interview.
Cracking the Coding Interview is a definitive resource for going through software engineering interviews and will help with the programming parts of the data science interview.
This Quora thread goes into how AirBnB hires for data scientists, an insightful look at the data science interview process from an established data science leader.
This exposé by Trey Causey explains how to ace the data science interview process and offers a critical and unvarnished look on how one should approach the data science interview.
Erin Shellman also talks about her experience getting a job in data science.
As I've gotten older and more experienced, I push back in interviews. I ask questions about what the purpose of a problem is or state that I don't think this is a good evaluation of my skills or abilities. Some people probably see this as me thinking I'm "too good" to answer the questions everyone else has to answer, but I see it as doing my part to be a critical thinker about evaluation, prediction, and hiring. Hopefully you'll do this too and as more of us are in a position where we are building teams and hiring, we'll think more carefully about what we're trying to accomplish and how we can get there instead of copying the same patterns that have been around for years.
This article is an insightful read about how data science at Twitter works and offers the inside perspective of somebody who is a data scientist in the industry.
If you find yourself thinking about probability, refer to this cheat sheet to make sure you're on top of any problem. This Quora thread will help as well.
Ellen Chisa writes about things she has screwed up on when it comes to technical interviews. You should make sure to avoid those mistakes!
Finally, First Round Review has a primer on how to hire exceptional data scientists. Read it to know how the people on the other side of the table think.
About the Co-Authors
Roger has always been inspired to learn more. He broke into a career in data by analyzing $700m worth of sales for a major pharmaceutical company. He has written for Entrepreneur, TechCrunch, The Next Web, VentureBeat, and Techvibes.
For this guide, he compiled insight from Springboard's network of hundreds of data science experts, including Sri Kanajan, his co-author.
Sri Kanajan is currently a senior data scientist in New York City at a major investment bank. He has 14 years of experience in various engineering and management capacities and made a career transition to be a data scientist in 2013. He completed a full-time data science bootcamp in San Francisco and progressed to become a data scientist at two startups and eventually a data science manager at Change.org before taking on his current role. Sri also teaches part-time as a lead instructor in General Assembly's Data Science course. He is passionate about helping others make the transition into data science.