
6/4/2016

UberAnalyticsExercise

Uber Analytics Exercise


Yao-Jen Kuo
Monday, September 07, 2015
Introduction
Load Data and Take a Quick Glimpse
Data Exploration
Distribution of 5 numeric variables
Correlation of 5 numeric variables
Exploration by Category
Most Completed Trips by Date and Time
The Gap between Demand and Supply
Zeroes in Popular Hours
Conclusion

Introduction
This document demonstrates my approach to analyzing the dataset in the Uber Analytics Exercise.

Load Data and Take a Quick Glimpse


To begin the analysis, we load the .csv file into the R workspace and check its structure and summary.
## 'data.frame':    336 obs. of  7 variables:
##  $ Date           : Date, format: "2012-09-10" "2012-09-10" ...
##  $ Time..Local.   : int  7 8 9 10 11 12 13 14 15 16 ...
##  $ Eyeballs       : int  5 6 8 9 11 12 9 12 11 11 ...
##  $ Zeroes         : int  0 0 3 2 1 0 1 1 2 2 ...
##  $ Completed.Trips: int  2 2 0 0 4 2 0 0 1 3 ...
##  $ Requests       : int  2 2 0 1 4 2 0 0 2 4 ...
##  $ Unique.Drivers : int  9 14 14 14 11 11 9 9 7 6 ...
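
The loading code itself is not shown here. The following is a minimal sketch of this step, assuming the column layout reported by str() above. The inline CSV text stands in for the real file, whose name is not given. The ISO date format and the assumption that all ten rows fall on the same day are mine. The ten rows are the leading values from the str() output.

```r
# Stand-in for the real .csv file: the first ten rows implied by the str() output.
# The file name and the raw date format are not given in the source (assumptions).
csvText <- "Date,Time (Local),Eyeballs,Zeroes,Completed Trips,Requests,Unique Drivers
2012-09-10,7,5,0,2,2,9
2012-09-10,8,6,0,2,2,14
2012-09-10,9,8,3,0,0,14
2012-09-10,10,9,2,0,1,14
2012-09-10,11,11,1,4,4,11
2012-09-10,12,12,0,2,2,11
2012-09-10,13,9,1,0,0,9
2012-09-10,14,12,1,0,0,9
2012-09-10,15,11,2,1,2,7
2012-09-10,16,11,2,3,4,6"

# read.csv() turns spaces and parentheses in headers into dots (Time..Local. etc.).
uber <- read.csv(text = csvText, stringsAsFactors = FALSE)

# Convert the date column from character to Date.
uber$Date <- as.Date(uber$Date)

# Check structure and summary, as described in the text.
str(uber)
summary(uber)
```

With the real file, the text argument would be replaced by the file path, and as.Date() may need a format argument matching the raw date strings.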

Data Exploration
The data can be viewed as a funnel from Eyeballs (where demand emerges) to Completed Trips, while Zeroes
result in potential drop-outs (users turning off the app). Visualization helps us discover the patterns in these
metrics more easily than a contingency table.

Distribution of 5 numeric variables


The graph confirms the funnel assumption: the distributions of Eyeballs, Requests, and
Completed Trips each sit slightly higher than that of the next stage.
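
The plotting code is not shown here, so the sketch below is only one plausible way to compare the distributions. It assumes a box plot and a data frame named uberNum holding the five numeric columns. The values are the leading ten from the str() output, not the full 336 rows.

```r
# Leading ten values of each variable, taken from the str() output earlier in this document.
uberNum <- data.frame(
  Eyeballs        = c(5, 6, 8, 9, 11, 12, 9, 12, 11, 11),
  Zeroes          = c(0, 0, 3, 2, 1, 0, 1, 1, 2, 2),
  Completed.Trips = c(2, 2, 0, 0, 4, 2, 0, 0, 1, 3),
  Requests        = c(2, 2, 0, 1, 4, 2, 0, 0, 2, 4),
  Unique.Drivers  = c(9, 14, 14, 14, 11, 11, 9, 9, 7, 6)
)

# Order the funnel stages left to right so the step-down is easy to see.
funnel <- c("Eyeballs", "Requests", "Completed.Trips")
boxplot(uberNum[, funnel], main = "Funnel stages", ylab = "Count per hour")

# Numeric counterpart of the plot: the median of each stage.
funnelMedians <- sapply(uberNum[, funnel], median)
print(funnelMedians)
```

A histogram or density plot per variable would serve the same purpose. The box plot is chosen here only because it puts all stages on one axis.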

http://yaojenkuo.github.io/uber.html#loaddataandtakeaquickglimpse


Correlation of 5 numeric variables


By drawing a scatter plot matrix, we can see whether there are any correlations among the numeric
variables.
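
A scatter plot matrix can be drawn with base R's pairs(). The sketch below assumes a data frame named uberNum with the five numeric columns, filled here with only the leading ten values from the str() output, so neither the plot nor the correlations will match those of the full dataset.

```r
# Leading ten values of each variable, taken from the str() output earlier in this document.
uberNum <- data.frame(
  Eyeballs        = c(5, 6, 8, 9, 11, 12, 9, 12, 11, 11),
  Zeroes          = c(0, 0, 3, 2, 1, 0, 1, 1, 2, 2),
  Completed.Trips = c(2, 2, 0, 0, 4, 2, 0, 0, 1, 3),
  Requests        = c(2, 2, 0, 1, 4, 2, 0, 0, 2, 4),
  Unique.Drivers  = c(9, 14, 14, 14, 11, 11, 9, 9, 7, 6)
)

# One scatter plot for every pair of variables.
pairs(uberNum, main = "Scatter plot matrix of the numeric variables")

# Numeric counterpart: pairwise Pearson correlations, rounded for readability.
corMatrix <- round(cor(uberNum), 2)
print(corMatrix)
```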

Based on the plot, there appears to be a positive correlation between demand (Eyeballs) and supply
(Unique Drivers), so we calculate the correlation coefficient. A coefficient of 0.79
suggests a strong positive relationship between supply and demand.
cor(uberNum$Eyeballs, uberNum$Unique.Drivers)
## [1] 0.7895826

Exploration by Category
After examining the numeric variables on their own, we analyze them by categorical group, which in this case
means Date and Time.

Most Completed Trips by Date and Time


According to the graph, completed trips clearly concentrate on weekends (Saturday and Sunday) and in peak hours (5pm to 3am).
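
The grouping behind such a graph can be sketched with aggregate(). The code that produced the figure is not shown here, so this is an assumption about the approach, and the rows below are hypothetical illustration values, not taken from the dataset. The weekday is derived with the "%u" format (1 = Monday, 7 = Sunday) to avoid locale-dependent day names.

```r
# Hypothetical illustration rows (NOT from the real dataset).
uber <- data.frame(
  Date            = as.Date(c("2012-09-14", "2012-09-14", "2012-09-15",
                              "2012-09-15", "2012-09-16")),
  Time..Local.    = c(17, 23, 0, 22, 2),
  Completed.Trips = c(6, 11, 9, 12, 7)
)

# ISO weekday number: 1 = Monday ... 5 = Friday, 6 = Saturday, 7 = Sunday.
uber$Weekday <- as.integer(format(uber$Date, "%u"))

# Total completed trips per weekday and hour, then per weekday only.
tripsByDayHour <- aggregate(Completed.Trips ~ Weekday + Time..Local., data = uber, FUN = sum)
tripsByDay     <- aggregate(Completed.Trips ~ Weekday, data = uber, FUN = sum)

print(tripsByDayHour)
print(tripsByDay)
```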


The Gap between Demand and Supply


The core of the business is balancing demand (Eyeballs) against supply (Unique Drivers).
So we are going to explore the pattern of the gap
(uber$Eyeballs - uber$Unique.Drivers) by date and time. Clearly there are still many positive
bars, meaning demand exceeds supply, especially on Friday night and Saturday.
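
As a small worked example of the gap, the sketch below uses the leading ten hours from the str() output shown earlier (hours 7 to 16 of the first day). The column name Gap and the bar colors are my own choices, not the author's.

```r
# Leading ten rows implied by the str() output earlier in this document.
uber <- data.frame(
  Time..Local.   = 7:16,
  Eyeballs       = c(5, 6, 8, 9, 11, 12, 9, 12, 11, 11),
  Unique.Drivers = c(9, 14, 14, 14, 11, 11, 9, 9, 7, 6)
)

# Positive gap: more Eyeballs than Unique Drivers in that hour.
uber$Gap <- uber$Eyeballs - uber$Unique.Drivers

# Highlight the hours where demand exceeds supply.
barplot(uber$Gap,
        names.arg = uber$Time..Local.,
        col  = ifelse(uber$Gap > 0, "firebrick", "grey70"),
        xlab = "Hour (local)",
        ylab = "Eyeballs - Unique Drivers")

print(uber$Gap)
```

In this sample the gap turns positive in four of the ten hours, all in the afternoon.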


Zeroes in Popular Hours


Zeroes might directly result in drop-outs (users turning off the app). Besides watching the patterns of Eyeballs and
Unique Drivers, we also have to pay attention to Zeroes, especially in Popular Hours. According to the
exercise, Popular Hours are defined as 5pm on Friday to 3am on Sunday. There are still plenty of
Zeroes during Popular Hours according to the graph.
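
The Popular Hours window can be expressed as a small helper. This is a sketch under two assumptions of mine: the function name isPopularHour is hypothetical, and the hour column marks the start of each hour, so the window ends with the hour labelled 2 on Sunday. The rows are hypothetical illustration values, not taken from the dataset.

```r
# TRUE for hours from Friday 5pm up to (but not including) Sunday 3am.
isPopularHour <- function(date, hour) {
  weekday <- as.integer(format(date, "%u"))  # 1 = Monday ... 7 = Sunday
  (weekday == 5 & hour >= 17) | weekday == 6 | (weekday == 7 & hour < 3)
}

# Hypothetical illustration rows (NOT from the real dataset).
uber <- data.frame(
  Date         = as.Date(c("2012-09-14", "2012-09-14", "2012-09-15",
                           "2012-09-16", "2012-09-16")),
  Time..Local. = c(16, 17, 12, 2, 3),
  Zeroes       = c(1, 4, 6, 5, 2)
)

# Flag each row, then total the Zeroes that fall inside the window.
uber$Popular <- isPopularHour(uber$Date, uber$Time..Local.)
zeroesInPopular <- sum(uber$Zeroes[uber$Popular])

print(uber)
print(zeroesInPopular)
```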


Conclusion
The most important target is to maximize Completed Trips.
A good balance between Eyeballs and Unique Drivers results in more Requests, and thus in more Completed Trips.
Zeroes result in fewer Requests, and hence in fewer Completed Trips.
Most of the Completed Trips come from Popular Hours (from 5pm on Friday to 3am on Sunday).
Yet there are still plenty of business opportunities lost in these hours.
Incentive plans or marketing campaigns should focus on how to reduce Zeroes during Popular Hours.
Dynamic pricing on riders might decrease Eyeballs during Popular Hours.
Therefore, if I were to make a suggestion, I would go for dynamic pricing on the drivers' share of revenue rather than on riders.
The definition of Popular Hours should take more into account than Date and Time alone, for example by adding dimensions such as national holidays, seasons, and weather. The more precisely Uber can predict popularity, the more Completed Trips can be achieved.
