
Descriptive Analytics:

It involves gathering, organizing, tabulating, and depicting data and then
describing the characteristics of what is being studied. This type of analytics
was historically called reporting. It can be very useful, but it does not tell
you anything about what might happen in the future.

Descriptive Analytics uses data aggregation and data mining to provide
insight into the past and answer the question: What has happened?

The vast majority of the analytics we use fall into this category (think basic
arithmetic like sums, averages, and percent changes). Usually, the underlying
data is a count or aggregate of a filtered column of data to which basic
math is applied. For all practical purposes, there is an infinite number of
these analytics. Descriptive Analytics is useful for showing things like total
stock in inventory, average dollars spent per customer, and year-over-year
change in sales. Common examples of descriptive analytics are reports that
provide historical insights into the company's production, financials,
operations, sales, inventory, and customers.
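
As a minimal sketch of the basic arithmetic described above, the following R snippet computes a total, a per-customer average, and a year-over-year percent change. The `sales` data frame, its column names, and its values are hypothetical and invented purely for illustration.

```r
# Hypothetical sales records (illustrative values only)
sales <- data.frame(
  year     = c(2022, 2022, 2023, 2023),
  customer = c("A", "B", "A", "B"),
  amount   = c(100, 200, 150, 210)
)

# Total sales per year (an aggregate of a grouped column)
total_by_year <- tapply(sales$amount, sales$year, sum)

# Average amount spent per customer
avg_per_customer <- tapply(sales$amount, sales$customer, mean)

# Year-over-year percent change in total sales
t22 <- as.numeric(total_by_year["2022"])
t23 <- as.numeric(total_by_year["2023"])
yoy_change <- (t23 - t22) / t22 * 100

print(total_by_year)
print(avg_per_customer)
print(yoy_change)
```

Each result is a plain summary of what already happened; nothing here forecasts or explains the change.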

Use Descriptive Analytics when you need to understand at an aggregate level
what is going on in your company, and when you want to summarize and
describe different aspects of your business.

Descriptive Analytics, the conventional form of Business Intelligence and data
analysis, seeks to provide a depiction or summary view of facts and figures in
an understandable format, to either inform or prepare data for further
analysis. It uses two primary techniques, namely data aggregation and data
mining, to report past events. It presents past data in an easily digestible
format for the benefit of a wide business audience.

A common example of Descriptive Analytics is company reports that simply
provide a historical review of an organization's operations, sales, financials,
customers, and stakeholders. It is relevant to note that in the Big Data
world, the simple nuggets of information provided by Descriptive Analytics
become prepared inputs for more advanced Predictive or Prescriptive
Analytics that deliver real-time insights for business decision making.
Descriptive Analytics helps to describe and present data in a format that
can be easily understood by a wide variety of business readers. Descriptive
Analytics rarely attempts to investigate or establish cause-and-effect
relationships. As this form of analytics doesn't usually probe beyond surface
analysis, the validity of results is more easily established. Some common
methods employed in Descriptive Analytics are observations, case studies,
and surveys. Thus, the collection and interpretation of large amounts of data
may be involved in this type of analytics.
In "Descriptive, Predictive, and Prescriptive Analytics Explained," the author
argues that in both Predictive and Prescriptive Analytics, the data analyst
has to investigate beyond surface data. While the predictive data analyst
uses investigation to understand the future, the prescriptive data analyst
uses investigation to suggest probable actions. In contrast to both, the
descriptive analyst simply offers the existing data in a more understandable
format without any further investigation. Thus, Descriptive Analytics is more
suited for a historical account or a summary of past data. Most statistical
calculations are generally applied to Descriptive Analytics.
In InformationWeek's "Big Data Analytics: Descriptive vs. Predictive vs.
Prescriptive," Dr. Michael Wu, Chief Scientist of Lithium Technologies in San
Francisco, describes Descriptive Analytics as the simplest form of Data
Analytics, which captures Big Data in small nuggets of information. As Wu
observes, 80% of Business Analytics falls within the ambit of Descriptive
Analytics. Also, review the article "3 Types of Analytics: Descriptive,
Predictive, and Prescriptive."
In this article on the different types of Data Analytics, the author hints that
any good Data Scientist may try to use the results of Descriptive Analytics
and further tweak the data, trends, or pattern analysis to forecast future
trends in business. The author notes that with the help of Big Data, all three
types of Data Analytics are now used to better understand the customer.
With huge amounts of multi-channel customer data coming in, data-driven
businesses are far better positioned to gauge the individual preferences of
customers and design appropriate personalized offerings. The majority of
industry literature echoes the sentiment that with Predictive and
Prescriptive Analytics, the business data that once simply described past
events can now yield nuggets of useful information, thanks to
Big Data-powered Descriptive Analytics.
Examples of Descriptive Analytics
Here are some common applications of Descriptive Analytics:

Summarizing past events such as regional sales, customer attrition, or success of marketing campaigns.
Tabulation of social metrics such as Facebook likes, Tweets, or followers.
Reporting of general trends like hot travel destinations or news trends.
According to "Four Types of Big Data Analytics and Examples of Their Use," as
soon as the volume, velocity, and variety of Big Data invade the limited
business data silos, the game changes. Now, powered by the hidden
intelligence of massive amounts of market data, Descriptive Analytics takes
on new meaning. Whenever Big Data intervenes, vanilla-form Descriptive
Analytics is combined with the extensive capabilities of Prescriptive and
Predictive Analytics to deliver highly focused insights into business issues
and accurate future predictions based on past data patterns. Descriptive
Analytics mines and prepares the data for use by Predictive or Prescriptive
Analytics. Big Data lends a wide context to the nuggets of information for
telling the whole story. Also view this presentation from Information
Builders on four popular types of Business Analytics.
According to a recent Forbes study titled "EY-Forbes-Insights: Data and
Analytics Impact Index," people and culture can influence the intelligence
gathered from Business Analytics. This study, conducted jointly by Forbes
Insights and EY, interviewed global executives and concluded that:
Every modern business needs to build its Data Analytics framework, where the latest data technologies like Big Data play a crucial role.
Data and technology should be made available at every corner of an enterprise to develop and nurture a widespread data-driven culture.
If data and analytics are aligned with overall business goals, then day-to-day business decisions will be driven more by data-based insights.
As people drive businesses, the manpower engaged in Data Analytics must be competent and adequately trained to support enterprise goals.
A centrally managed team must lead the analytics production and consumption efforts in the enterprise to bring behavioral change towards a data culture.
The concept of Data Analytics must be spread through both formal data centers and informal social networks for inclusive growth.
Descriptive Analytics: Industry Applications
In "McKinsey's 2016 Analytics Study Defines the Future of Machine Learning,"
you will find that the US retail industry (40%) and GPS-based services (60%) are
showing rapid adoption of Descriptive Analytics to track teams, customers,
and assets across locations to capture enhanced insights for operational
efficiency. McKinsey also claimed that in today's business climate, the three
most critical barriers to Data Analytics are lack of organizational strategy,
lack of involved management, and lack of available talent.
Another report suggests that Descriptive Analytics has made great strides
in supply chain mapping (SCM), manufacturing plant sensors, and GPS
vehicle tracking, to gather, organize, and view past events.
The Role of Descriptive Analytics in Future Data Analysis
As data-driven businesses continue to use the results from Descriptive
Analytics to optimize their supply chains and enhance their decision-making
powers, Data Analytics will move further away from Predictive Analytics
toward Prescriptive Analytics or rather towards a mash-up of predictions,
simulations, and optimization.
The future of Data Analytics lies not only in describing what has happened,
but also in accurately predicting what might happen in the future. This claim is
explained in the article titled "The Future of Analytics Is Prescriptive, Not
Predictive." This article cites a GPS navigation system, where Descriptive
Analytics is used to provide directional cues. However, such analysis is
reinforced by Predictive Analytics offering important details about the
journey like the time duration. Now, if the GPS system is further powered by
Prescriptive Analytics, then the navigation system will not only provide
directions and time, but also the quickest way to reach the destination. The
best part of such a super-charged navigation system is that it can even
compare several traveling routes and recommend the best solution.
As Data Mining and Machine Learning jointly offer solutions to predict
customer segments and marketing ROIs, the future Predictive Analytics
techniques will continue to evolve into Prescriptive Analytics, creating a
mash-up of predictions, simulations, and optimization.
Frequency tables:

The frequency (f) of a particular observation is the number of times the observation occurs
in the data. The distribution of a variable is the pattern of frequencies of its observations.
Frequency distributions are portrayed as frequency tables, histograms, or polygons.

Frequency distributions can show either the actual number of observations falling in each
range or the percentage of observations. In the latter instance, the distribution is called
a relative frequency distribution.

Frequency distribution tables can be used for both categorical and numeric variables.
Continuous variables should only be used with class intervals, which will be explained
shortly.
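
As a rough sketch of how class intervals work for a continuous variable, the snippet below groups `mpg` from R's built-in `mtcars` data with `cut( )` and then tabulates the classes. The interval boundaries (10 to 35 in steps of 5) are an assumption chosen for illustration, not a recommendation from the text.

```r
# Group the continuous variable mpg (built-in mtcars data) into class intervals
breaks <- seq(10, 35, by = 5)                    # assumed interval boundaries
mpg_class <- cut(mtcars$mpg, breaks = breaks)    # intervals are (lower, upper]

grouped  <- table(mpg_class)       # frequency per class interval
relative <- prop.table(grouped)    # relative frequency per class interval

# Present both as one grouped frequency table
freq_table <- data.frame(
  Class     = names(grouped),
  Frequency = as.vector(grouped),
  Relative  = round(as.vector(relative), 3)
)
print(freq_table)
```

By default `cut( )` closes each interval on the right, so a value of exactly 15 falls in (10,15] rather than (15,20].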

Generating frequency tables using R

R provides many methods for creating frequency and contingency tables. Two are
described below. In the following examples, assume that A, B, and C represent categorical
variables.

Table

You can generate frequency tables using the table( ) function, tables of proportions using
the prop.table( ) function, and marginal frequencies using margin.table( ).
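
A minimal sketch of these three functions, assuming the built-in `mtcars` data set and its `cyl` and `gear` columns as the categorical variables:

```r
# One-way frequency table of cylinder counts
cyl_freq <- table(mtcars$cyl)
print(cyl_freq)
print(prop.table(cyl_freq))         # proportions of the total

# Two-way table: cylinders (rows) by gears (columns)
cyl_gear <- table(cyl = mtcars$cyl, gear = mtcars$gear)
print(cyl_gear)

row_totals <- margin.table(cyl_gear, 1)   # marginal frequencies of cyl
col_totals <- margin.table(cyl_gear, 2)   # marginal frequencies of gear
print(row_totals)
print(col_totals)

row_props <- prop.table(cyl_gear, 1)      # row proportions: each row sums to 1
print(round(row_props, 3))
```

Passing 1 or 2 as the second argument selects rows or columns; omitting it in `prop.table( )` gives proportions of the grand total.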

Task 1:

Construct a frequency table for the variable cyl in mtcars and enhance the view. Write
your observations.
Task 2:

Construct a grouped frequency table for the variable Salary in Crew.data. Write your
observations.
CrossTable
The CrossTable( ) function in the gmodels package produces cross tabulations
modelled after PROC FREQ in SAS or CROSSTABS in SPSS. It has a wealth of
options.
There are options to report percentages (row, column, cell), specify decimal places,
produce Chi-square, Fisher, and McNemar tests of independence, report expected
and residual values (Pearson, standardized, adjusted standardized), include missing
values as valid, annotate with row and column titles, and format
as SAS or SPSS style output. See help(CrossTable) for details.
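
The following is a hedged sketch of a CrossTable( ) call on the built-in `mtcars` data. It assumes the gmodels package is installed; the guard skips the call otherwise, and the base-R `table( )` line is included only so the example still produces output without the package. The particular options chosen are illustrative.

```r
# Base-R cross tabulation, so the example runs without extra packages
tab <- table(cyl = mtcars$cyl, am = mtcars$am)
print(tab)

# CrossTable() lives in the gmodels package, which may need installing first
if (requireNamespace("gmodels", quietly = TRUE)) {
  gmodels::CrossTable(
    mtcars$cyl, mtcars$am,
    digits     = 2,              # decimal places for proportions
    prop.chisq = FALSE,          # hide per-cell chi-square contributions
    prop.t     = FALSE,          # hide cell proportions of the grand total
    format     = "SPSS",         # SPSS-style layout instead of the default SAS
    dnn        = c("cyl", "am")  # row and column titles
  )
}
```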

Descriptive measures: min, max, range, mean, sd, var, median, quantile, and summary.
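
A short sketch applying each of these base R functions to one numeric variable, assuming `mpg` from the built-in `mtcars` data as the example:

```r
x <- mtcars$mpg   # miles per gallon from the built-in mtcars data

x_min    <- min(x)
x_max    <- max(x)
x_range  <- range(x)     # returns c(min, max), not the difference
x_mean   <- mean(x)
x_median <- median(x)
x_sd     <- sd(x)        # sample standard deviation (n - 1 denominator)
x_var    <- var(x)       # sample variance, equal to sd(x)^2
x_quart  <- quantile(x, probs = c(0.25, 0.5, 0.75))

print(c(min = x_min, max = x_max, mean = x_mean, median = x_median,
        sd = x_sd, var = x_var))
print(x_range)
print(x_quart)
print(summary(x))        # min, quartiles, median, mean, and max in one call
```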

Skewness:
Intuitively, the skewness is a measure of symmetry. As a rule, negative skewness indicates that
the mean of the data values is less than the median, and the data distribution is left-skewed.
Positive skewness would indicate that the mean of the data values is larger than the median, and
the data distribution is right-skewed.
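
One way to compute skewness in base R is the moment-based form m3 / m2^(3/2), where mk is the k-th central moment. The helper `skewness_moment` below is a hypothetical name written for this sketch; add-on packages offer their own skewness functions, and their definitions can differ slightly in small samples.

```r
# Moment-based sample skewness: m3 / m2^(3/2)
skewness_moment <- function(x) {
  x <- x[!is.na(x)]          # drop missing values
  d <- x - mean(x)           # deviations from the mean
  mean(d^3) / mean(d^2)^1.5
}

x <- mtcars$mpg              # built-in data, used only as an example
skew <- skewness_moment(x)

print(skew)                  # positive for mpg, i.e. right-skewed
print(mean(x) > median(x))   # TRUE, consistent with the rule of thumb
```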

Kurtosis:
Intuitively, the kurtosis describes the tail shape of the data distribution. Measured as
excess kurtosis (kurtosis minus 3), the normal distribution has zero kurtosis and thus the
standard tail shape. It is said to be mesokurtic.
Negative kurtosis would indicate a thin-tailed data distribution, and is said to be platykurtic.
Positive kurtosis would indicate a fat-tailed distribution, and is said to be leptokurtic.
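
A minimal base R sketch of excess kurtosis, using the moment form m4 / m2^2 - 3 so that a normal distribution scores zero. The helper name `excess_kurtosis` and both sample vectors are assumptions made up for illustration.

```r
# Moment-based excess kurtosis: m4 / m2^2 - 3
excess_kurtosis <- function(x) {
  x <- x[!is.na(x)]          # drop missing values
  d <- x - mean(x)           # deviations from the mean
  mean(d^4) / mean(d^2)^2 - 3
}

flat  <- 1:100                     # evenly spread values, thin tails
heavy <- c(rep(0, 98), -10, 10)    # mostly zeros plus two extreme values

k_flat  <- excess_kurtosis(flat)
k_heavy <- excess_kurtosis(heavy)

print(k_flat)    # negative: platykurtic
print(k_heavy)   # positive: leptokurtic
```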

t.test:
The t.test( ) function produces a variety of t-tests. Unlike most statistical packages,
the default assumes unequal variance and applies the Welch df modification.
# independent 2-group t-test (formula interface)
t.test(y ~ x) # where y is numeric and x is a binary factor
# independent 2-group t-test (two numeric vectors)
t.test(y1, y2) # where y1 and y2 are numeric
# paired t-test
t.test(y1, y2, paired = TRUE) # where y1 and y2 are numeric
# one-sample t-test
t.test(y, mu = 3) # H0: mu = 3
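
As a runnable illustration of the templates above, this sketch compares `mpg` between automatic and manual cars in the built-in `mtcars` data (the `am` column is 0 or 1). The choice of variables is an assumption for demonstration only.

```r
# Welch two-sample t-test (the default: unequal variances assumed)
welch <- t.test(mpg ~ am, data = mtcars)

# Classical pooled-variance t-test, requested explicitly
pooled <- t.test(mpg ~ am, data = mtcars, var.equal = TRUE)

# One-sample t-test of H0: mean mpg = 20
one_sample <- t.test(mtcars$mpg, mu = 20)

print(welch)
print(welch$parameter)    # Welch-adjusted degrees of freedom (not an integer)
print(pooled$parameter)   # n1 + n2 - 2 degrees of freedom
print(one_sample$p.value)
```

Comparing the two `parameter` values shows the effect of the Welch adjustment: it lowers the degrees of freedom when the group variances differ.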

Chi-square test:
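
A minimal sketch of a chi-square test of independence using base R's chisq.test( ). The 2 x 2 table of counts is hypothetical, with values invented for illustration.

```r
# Hypothetical 2 x 2 table of counts (illustrative values only)
counts <- matrix(
  c(30, 20, 20, 30), nrow = 2,
  dimnames = list(group = c("A", "B"), outcome = c("yes", "no"))
)

# For 2 x 2 tables, Yates' continuity correction is applied by default
res <- chisq.test(counts)

print(res$expected)    # expected counts under independence
print(res$statistic)   # chi-square statistic
print(res$parameter)   # degrees of freedom
print(res$p.value)
```

Passing `correct = FALSE` turns the continuity correction off.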

Correlation:

Correlation can be easily understood as "co-relation". By definition, correlation is the
average relationship between two or more variables. When a change in one variable is
accompanied by a change in another variable, there is a correlation between the two variables.

These correlated variables can move in the same direction or in opposite directions.
There is not always a cause-and-effect relationship between the variables when
both change; the association might be due to chance.

Simple correlation is a correlation between two variables only, that is, the relationship
between one pair of variables. From an industry point of view, event correlation and
simple event correlation are the types of correlation mainly used.

Types of correlation:
In the research methodology of management, correlation is broadly classified into six types,
as follows:

(1) Positive Correlation
(2) Negative Correlation
(3) Perfectly Positive Correlation
(4) Perfectly Negative Correlation
(5) Zero Correlation
(6) Linear Correlation
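
Three of these types can be illustrated with a small sketch in R. The vectors are made-up values chosen so that the Pearson coefficient comes out at its extremes.

```r
x <- -2:2

r_pos  <- cor(x, 2 * x + 1)   # perfectly positive: y rises linearly with x
r_neg  <- cor(x, -3 * x)      # perfectly negative: y falls linearly with x
r_zero <- cor(x, x^2)         # zero linear correlation: symmetric curve

print(c(positive = r_pos, negative = r_neg, zero = r_zero))
```

The last case shows that zero correlation does not mean no relationship: here y is fully determined by x, but the relationship is not linear.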

Correlation using R

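You can use the cor( ) function to produce correlations and the cov( ) function to
produce covariances.
A simplified format is cor(x, use=, method= ) where
x is a numeric matrix or data frame,
use specifies the handling of missing data (options include all.obs, complete.obs, and pairwise.complete.obs), and
method specifies the type of correlation (pearson, spearman, or kendall).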
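
A brief sketch of cor( ) and cov( ) on three numeric columns of the built-in `mtcars` data. The column choice is an assumption for illustration.

```r
vars <- mtcars[, c("mpg", "wt", "hp")]   # numeric columns only

# Pearson correlation matrix, using only rows with no missing values
r_pearson <- cor(vars, use = "complete.obs", method = "pearson")

# Rank-based alternative
r_spearman <- cor(vars, use = "complete.obs", method = "spearman")

# Covariance matrix
v <- cov(vars)

print(round(r_pearson, 3))
print(round(r_spearman, 3))
print(round(v, 3))
```

The correlation matrix is the covariance matrix rescaled by the standard deviations, which `cov2cor(v)` reproduces.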

Kendall rank correlation: Kendall rank correlation is a non-parametric test that measures
the strength of dependence between two variables. If we consider two samples, a and
b, each of size n, the total number of pairings of a with b is n(n-1)/2. The following
formula is used to calculate the value of Kendall rank correlation:

$$\tau = \frac{N_c - N_d}{n(n-1)/2}$$

where
N_c = number of concordant pairs
N_d = number of discordant pairs
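
A minimal sketch of the Kendall calculation, assuming no tied values and the usual form tau = (Nc - Nd) / (n(n-1)/2). The two samples are made-up values; the result is checked against R's built-in `cor( )` with `method = "kendall"`.

```r
a <- c(1, 2, 3, 4, 5)   # illustrative sample a
b <- c(2, 1, 4, 3, 5)   # illustrative sample b
n <- length(a)

# All n(n-1)/2 index pairs (i, j) with i < j
pairs <- combn(n, 2)

# A pair is concordant when a and b move in the same direction,
# discordant when they move in opposite directions
s <- sign(a[pairs[1, ]] - a[pairs[2, ]]) * sign(b[pairs[1, ]] - b[pairs[2, ]])
Nc <- sum(s > 0)
Nd <- sum(s < 0)

tau <- (Nc - Nd) / (n * (n - 1) / 2)

print(c(Nc = Nc, Nd = Nd, tau = tau))
print(cor(a, b, method = "kendall"))   # built-in result for comparison
```

With tied values the built-in function applies a tie correction, so the two results can differ in that case.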