
# Business Analytics & Decision Making Assignment

Submitted to:

Submitted by:

Utsav Gahtori
80011314014
PGDM 2014-16

## 1) Descriptive statistics on different variables

Descriptive Statistics

| Variable | N | Minimum | Maximum | Mean | Std. Deviation | Skewness | Std. Error (Skewness) | Kurtosis | Std. Error (Kurtosis) |
|---|---|---|---|---|---|---|---|---|---|
| Gross domestic product, constant prices | 40 | -4.311 | 5.934 | 2.15145 | 1.929015 | -1.176 | .374 | 2.737 | .733 |
| Gross national savings | 40 | 12.154 | 21.872 | 17.58990 | 3.066949 | -.270 | .374 | -1.154 | .733 |
| Inflation, average consumer prices | 40 | .125 | 16.849 | 3.51630 | 3.185368 | 2.584 | .374 | 7.998 | .733 |
| Unemployment rate | 40 | 4.750 | 11.750 | 7.46792 | 2.217023 | .531 | .374 | -1.047 | .733 |
| Employment | 36 | 24 | 31 | 27.26 | 2.190 | .159 | .393 | -1.142 | .768 |
| Volume of exports of goods and services | 40 | -8.227 | 12.357 | 3.80470 | 3.705395 | -.462 | .374 | 1.942 | .733 |
| Valid N (listwise) | 36 | | | | | | | | |

In this case, I have taken six key economic variables to gain an insight into the country's performance. These variables are GDP growth rate,
Total Investment, Gross National Savings, Unemployment Rate, Inflation and Employment.

Skewness measures the asymmetry of the distribution of a real-valued random variable. As a rule of thumb, data are treated as
approximately normally distributed when the skewness lies between -1 and +1. Here, however, the skewness of Inflation (2.584) and of
GDP growth rate (-1.176) falls outside this range, which shows that these data are not symmetrical. For GDP growth rate, the negative
skewness indicates an asymmetrical distribution with a long tail to the left (lower values), whereas for Inflation the positive skewness
indicates an asymmetrical distribution with a long tail to the right (higher values). So, in both cases, the skewness is substantial and the
distribution is far from symmetrical.

Kurtosis is a measure of whether the data are peaked or flat relative to a normal distribution. That is, data sets with high kurtosis tend to
have a distinct peak near the mean, decline rather rapidly, and have heavy tails. From the kurtosis values given above, it can be seen that
every statistic is either greater than +1 or less than -1, i.e. outside the -1 to +1 range.
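The Std. Error columns in the descriptive statistics depend only on the sample size, so they can be checked by hand. The sketch below assumes the standard formulas for the standard errors of sample skewness and excess kurtosis; that the statistical package used these formulas is an assumption, and the function names are illustrative. It also applies the -1 to +1 rule of thumb used in this assignment.

```python
import math

def se_skewness(n: int) -> float:
    """Standard error of sample skewness for a sample of size n (n > 2)."""
    return math.sqrt(6.0 * n * (n - 1) / ((n - 2) * (n + 1) * (n + 3)))

def se_kurtosis(n: int) -> float:
    """Standard error of sample excess kurtosis for a sample of size n (n > 3)."""
    return 2.0 * se_skewness(n) * math.sqrt((n * n - 1) / ((n - 3) * (n + 5)))

def within_rule_of_thumb(statistic: float, limit: float = 1.0) -> bool:
    """Rule of thumb used here: the statistic should lie between -limit and +limit."""
    return -limit <= statistic <= limit

# N = 40 and N = 36 are the two sample sizes that appear in the table.
for n in (40, 36):
    print(n, round(se_skewness(n), 3), round(se_kurtosis(n), 3))

print(within_rule_of_thumb(-1.176))  # GDP growth rate skewness -> False
print(within_rule_of_thumb(2.584))   # Inflation skewness -> False
```

With N = 40 the two functions give .374 and .733, and with N = 36 they give .393 and .768, which agrees with the standard errors reported for the 40-observation variables and for Employment.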

So, before using the data set for further interpretation, we apply a transformation to normalize it. When that is done, we run the
descriptive statistics again to see whether the skewness and kurtosis are now within the range. If they are, further interpretations
can be made from the data set. If they are not, we create a boxplot to identify the outliers and then remove them, so that the data
become useful for further study and interpretation.
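The two steps described above can be sketched in a few lines. This is a minimal illustration, assuming a natural-log transformation and the common boxplot rule that flags points lying more than 1.5 times the interquartile range beyond the quartiles; statistical packages may compute quartiles slightly differently, so the flagged points can differ at the margin. The data and function names are hypothetical.

```python
import math
import statistics

def log_transform(values):
    """Natural-log transform; non-positive values have no logarithm and become None."""
    return [math.log(v) if v > 0 else None for v in values]

def boxplot_outliers(values, whisker=1.5):
    """Return the values lying beyond whisker * IQR from the quartiles."""
    q1, _median, q3 = statistics.quantiles(values, n=4)
    iqr = q3 - q1
    lower, upper = q1 - whisker * iqr, q3 + whisker * iqr
    return [v for v in values if v < lower or v > upper]

# Hypothetical growth rates (not the assignment's data).
growth = [2.1, 3.4, -0.5, 2.8, 1.9, 3.1, 2.5, 2.2, 45.0]

# Step 1: transform, dropping the cases that cannot be transformed.
ln_growth = [v for v in log_transform(growth) if v is not None]

# Step 2: flag and remove the boxplot outliers, then re-check the descriptives.
outliers = boxplot_outliers(ln_growth)
cleaned = [v for v in ln_growth if v not in outliers]
print(len(growth), len(ln_growth), len(cleaned))
```

Note that a log transformation silently loses every zero or negative observation, so the number of valid cases can shrink before any outlier is removed.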

## 2) Testing whether the GDP growth rate in 1980-2000 is the same as in 2001-2020, using an independent-samples t-test

As seen in the previous question, both the skewness and the kurtosis of GDP growth rate lie outside the -1 to +1 range, so we first
normalize the data by transformation. After normalizing the data set, we run the descriptive statistics again to measure the skewness and
kurtosis.
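Statistics

| Statistic | Ln_gdp |
|---|---|
| N (Valid) | 37 |
| N (Missing) | 6 |
| Mean | 1.0636 |
| Median | .9373 |
| Mode | .94 |
| Std. Deviation | 1.22916 |
| Variance | 1.511 |
| Skewness | 4.205 |
| Std. Error of Skewness | .388 |
| Kurtosis | 23.476 |
| Std. Error of Kurtosis | .759 |
| Minimum | -.81 |
| Maximum | 7.61 |
| Sum | 39.35 |
| Percentile 25 | .7749 |
| Percentile 50 | .9373 |
| Percentile 75 | 1.2032 |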

Here, we find that the values still lie outside the -1 to +1 range. So, in this case, we draw a boxplot and find the outliers.
Outliers are observations that lie far from the other observations and thus distort the statistics that are used for making any
decision. So, we identify the outliers, eliminate them, and run the descriptive statistics again.

Descriptive Statistics

| Variable | N | Minimum | Maximum | Mean | Std. Deviation | Skewness | Std. Error (Skewness) | Kurtosis | Std. Error (Kurtosis) |
|---|---|---|---|---|---|---|---|---|---|
| Ln_gdp | 32 | .50 | 1.71 | .9940 | .27642 | .573 | .414 | .362 | .809 |
| Valid N (listwise) | 32 | | | | | | | | |

Here, we find that skewness and kurtosis now both lie within the desired range, indicating that the data set is
normalized. So, we now run the independent-samples t-test, which compares the means of two
unrelated groups on the same continuous dependent variable.

## Creating a hypothesis for the above problem:

- H0: The GDP growth rate is the same in 1980-2000 and 2001-2020
- Ha: The GDP growth rate is not the same in 1980-2000 and 2001-2020

From the t-test performed, we see that the significance value is .006, which is less than .05, so we reject the null hypothesis and accept the alternative hypothesis.

This implies that the mean GDP growth rate differs between the two periods, which indicates that the GDP growth rate has not been
constant. The two-tailed test alone does not say which period is higher: the growth rate in one period can be either higher or lower than in the other.
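The statistic behind this test can be sketched as follows. This is a minimal illustration of the pooled (equal-variance) form of the independent-samples t-test; whether equal variances were assumed in the original run is not stated, and the data and function name are hypothetical. The significance value is then the two-tailed probability of the resulting t on the given degrees of freedom, and H0 is rejected when it is below .05.

```python
import math
import statistics

def pooled_t_statistic(group_a, group_b):
    """Student's t statistic and degrees of freedom, assuming equal variances."""
    n_a, n_b = len(group_a), len(group_b)
    mean_a, mean_b = statistics.fmean(group_a), statistics.fmean(group_b)
    var_a, var_b = statistics.variance(group_a), statistics.variance(group_b)
    df = n_a + n_b - 2
    # Pooled variance weights each sample variance by its degrees of freedom.
    pooled_var = ((n_a - 1) * var_a + (n_b - 1) * var_b) / df
    std_error = math.sqrt(pooled_var * (1 / n_a + 1 / n_b))
    return (mean_a - mean_b) / std_error, df

# Hypothetical transformed growth rates for the two periods (not the assignment's data).
period_1980_2000 = [1.10, 0.95, 1.30, 1.20, 1.05, 1.25]
period_2001_2020 = [0.80, 0.70, 0.90, 0.85, 0.75, 0.95]
t_stat, df = pooled_t_statistic(period_1980_2000, period_2001_2020)
print(f"t = {t_stat:.3f}, df = {df}")
```

The sign of t shows the direction of the difference (positive when the first group has the higher mean), which is the information the two-tailed significance value leaves out.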

## 3) Testing whether Total Investment, Gross National Savings, Employment %, Unemployment rate and Inflation change across three groups (1980 to 1991, 1992 to 2003 and 2004 to 2015), using ANOVA

## The groups formed

- Period 1: 1980-1991
- Period 2: 1992-2003
- Period 3: 2004-2015

Creating hypothesis:

- H0: There is no change in the variables across the three given periods of time (groups)
- Ha: There is a significant change in the variables across the three given periods of time (groups)

Here again, we first run the descriptive statistics to see whether the skewness or kurtosis of any variable lies outside the -1 to +1 range.

The test here will compare the various parameters among themselves and provide the significance value.

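Statistics

| Statistic | Inflation, average consumer prices | Gross national savings | Total investment | Unemployment rate | Employment |
|---|---|---|---|---|---|
| N (Valid) | 32 | 32 | 32 | 32 | 28 |
| N (Missing) | | | | | |
| Mean | 96.48012 | 17.54431 | 19.6458 | 7.29247 | 27.34 |
| Median | 94.79150 | 17.04200 | 19.5035 | 6.22500 | 27.33 |
| Mode | 48.701 | 12.154 | 16.29 | 5.400 | 24 |
| Std. Deviation | 27.620588 | 2.867211 | 2.00870 | 2.363304 | 2.230 |
| Variance | 762.897 | 8.221 | 4.035 | 5.585 | 4.974 |
| Skewness | -.106 | -.134 | 1.355 | .733 | .108 |
| Std. Error of Skewness | .414 | .414 | .414 | .414 | .441 |
| Kurtosis | -.918 | -.907 | 4.173 | -.972 | -1.021 |
| Std. Error of Kurtosis | .809 | .809 | .809 | .809 | .858 |
| Minimum | 48.701 | 12.154 | 16.29 | 4.750 | 24 |
| Maximum | 141.047 | 22.415 | 26.75 | 11.750 | 31 |
| Sum | 3087.364 | 561.418 | 628.67 | 233.359 | 766 |
| Percentile 25 | 82.53925 | 15.93300 | 18.4700 | 5.40000 | 25.35 |
| Percentile 50 | 94.79150 | 17.04200 | 19.5035 | 6.22500 | 27.33 |
| Percentile 75 | 124.49575 | 19.85600 | 20.3290 | 9.28125 | 29.21 |

a. Multiple modes exist. The smallest value is shown.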

So, here we observe that for Total Investment both the skewness (1.355) and the kurtosis (4.173) are greater than 1. So, we now normalize
this variable by applying a transformation, and then run the descriptive statistics again to see whether the correction has worked. If
not, we identify the outliers and eliminate them.

Descriptive Statistics

| Variable | N | Minimum | Maximum | Mean | Std. Deviation | Skewness | Std. Error (Skewness) | Kurtosis | Std. Error (Kurtosis) |
|---|---|---|---|---|---|---|---|---|---|
| Ln_TI | 40 | 2.71 | 3.24 | 2.9702 | .10891 | .185 | .374 | .886 | .733 |
| Valid N (listwise) | 40 | | | | | | | | |

Now that the data set is normalized, we run the ANOVA. The test results are below:
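ANOVA

| Variable | Source | Sum of Squares | df | Mean Square | F | Sig. |
|---|---|---|---|---|---|---|
| Ln_TI | Between Groups | .308 | 2 | .154 | 32.962 | .000 |
| | Within Groups | .149 | 32 | .005 | | |
| | Total | .457 | 34 | | | |
| Employment | Between Groups | 123.152 | 2 | 61.576 | 71.543 | .000 |
| | Within Groups | 27.542 | 32 | .861 | | |
| | Total | 150.695 | 34 | | | |
| Unemployment rate | Between Groups | 73.918 | 2 | 36.959 | 12.667 | .000 |
| | Within Groups | 93.367 | 32 | 2.918 | | |
| | Total | 167.285 | 34 | | | |
| Gross national savings | Between Groups | 279.936 | 2 | 139.968 | 79.545 | .000 |
| | Within Groups | 56.307 | 32 | 1.760 | | |
| | Total | 336.243 | 34 | | | |
| Inflation, average consumer prices | Between Groups | 18568.206 | 2 | 9284.103 | 102.443 | .000 |
| | Within Groups | 2900.076 | 32 | 90.627 | | |
| | Total | 21468.282 | 34 | | | |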

Multiple Comparisons (Tukey HSD)

| Dependent Variable | (I) Year_Group_ANOVA | (J) Year_Group_ANOVA | Mean Difference (I-J) | Std. Error | Sig. | 95% Confidence Interval, Lower Bound | 95% Confidence Interval, Upper Bound |
|---|---|---|---|---|---|---|---|
| Ln_TI | 1.00 | 2.00 | .09798* | .02852 | .005 | .0279 | .1681 |
| | 1.00 | 3.00 | .23014* | .02852 | .000 | .1600 | .3002 |
| | 2.00 | 1.00 | -.09798* | .02852 | .005 | -.1681 | -.0279 |
| | 2.00 | 3.00 | .13216* | .02790 | .000 | .0636 | .2007 |
| | 3.00 | 1.00 | -.23014* | .02852 | .000 | -.3002 | -.1600 |
| | 3.00 | 2.00 | -.13216* | .02790 | .000 | -.2007 | -.0636 |
| Employment | 1.00 | 2.00 | -1.654* | .387 | .000 | -2.61 | -.70 |
| | 1.00 | 3.00 | -4.554* | .387 | .000 | -5.51 | -3.60 |
| | 2.00 | 1.00 | 1.654* | .387 | .000 | .70 | 2.61 |
| | 2.00 | 3.00 | -2.900* | .379 | .000 | -3.83 | -1.97 |
| | 3.00 | 1.00 | 4.554* | .387 | .000 | 3.60 | 5.51 |
| | 3.00 | 2.00 | 2.900* | .379 | .000 | 1.97 | 3.83 |
| Unemployment rate | 1.00 | 2.00 | 2.646750* | .713015 | .002 | .89461 | 4.39889 |
| | 1.00 | 3.00 | 3.448583* | .713015 | .000 | 1.69644 | 5.20073 |
| | 2.00 | 1.00 | -2.646750* | .713015 | .002 | -4.39889 | -.89461 |
| | 2.00 | 3.00 | .801833 | .697343 | .491 | -.91180 | 2.51546 |
| | 3.00 | 1.00 | -3.448583* | .713015 | .000 | -5.20073 | -1.69644 |
| | 3.00 | 2.00 | -.801833 | .697343 | .491 | -2.51546 | .91180 |
| Gross national savings | 1.00 | 2.00 | 2.674591* | .553713 | .000 | 1.31391 | 4.03527 |
| | 1.00 | 3.00 | 6.899341* | .553713 | .000 | 5.53866 | 8.26002 |
| | 2.00 | 1.00 | -2.674591* | .553713 | .000 | -4.03527 | -1.31391 |
| | 2.00 | 3.00 | 4.224750* | .541542 | .000 | 2.89398 | 5.55552 |
| | 3.00 | 1.00 | -6.899341* | .553713 | .000 | -8.26002 | -5.53866 |
| | 3.00 | 2.00 | -4.224750* | .541542 | .000 | -5.55552 | -2.89398 |
| Inflation, average consumer prices | 1.00 | 2.00 | -32.524182* | 3.973806 | .000 | -42.28930 | -22.75906 |
| | 1.00 | 3.00 | -56.778598* | 3.973806 | .000 | -66.54372 | -47.01348 |
| | 2.00 | 1.00 | 32.524182* | 3.973806 | .000 | 22.75906 | 42.28930 |
| | 2.00 | 3.00 | -24.254417* | 3.886459 | .000 | -33.80489 | -14.70394 |
| | 3.00 | 1.00 | 56.778598* | 3.973806 | .000 | 47.01348 | 66.54372 |
| | 3.00 | 2.00 | 24.254417* | 3.886459 | .000 | 14.70394 | 33.80489 |

*. The mean difference is significant at the 0.05 level.

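Since the ANOVA significance value is less than .05 for every variable, we reject H0 and infer that there is a significant change in the
variables across the three given periods of time (groups). In the Tukey HSD comparisons, the only pair that does not differ significantly is the Unemployment rate between periods 2 and 3 (Sig. = .491).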
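The F value that drives this decision is the ratio of the between-groups mean square to the within-groups mean square. The sketch below computes it from raw groups; it is a minimal illustration with hypothetical data and an illustrative function name, not the procedure used by the statistical package. The last line applies the same ratio to the sums of squares 123.152 (between, 2 df) and 27.542 (within, 32 df) taken from the ANOVA output.

```python
import statistics

def one_way_anova(groups):
    """Return (SS between, SS within, df between, df within, F) for a list of groups."""
    all_values = [v for g in groups for v in g]
    grand_mean = statistics.fmean(all_values)
    # Between-groups variation: group means around the grand mean, weighted by group size.
    ss_between = sum(len(g) * (statistics.fmean(g) - grand_mean) ** 2 for g in groups)
    # Within-groups variation: observations around their own group mean.
    ss_within = sum((v - statistics.fmean(g)) ** 2 for g in groups for v in g)
    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    f_value = (ss_between / df_between) / (ss_within / df_within)
    return ss_between, ss_within, df_between, df_within, f_value

# Hypothetical values for three periods (not the assignment's data).
periods = [[1.0, 2.0, 3.0], [2.0, 3.0, 4.0], [5.0, 6.0, 7.0]]
print(one_way_anova(periods))

# The same ratio using sums of squares reported in the ANOVA output.
reported_f = (123.152 / 2) / (27.542 / 32)
print(round(reported_f, 2))
```

The recomputed ratio agrees with the reported F of 71.543 to within rounding, which also confirms that the between-groups degrees of freedom are 2 (three groups minus one).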

We now look at the mean differences in the Employment rate between the three periods. The difference between periods 1 and 2 is -1.654
and between periods 1 and 3 is -4.554. This means that period 1 has the lowest Employment rate. It can also be noted that the difference
between periods 2 and 3 is -2.900, implying that period 3 has a higher Employment rate than period 2.

Collating all of the above analysis, we can say that the Employment rate increases in the following order: Period
1 < Period 2 < Period 3.
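The standard errors attached to these mean differences can be cross-checked against the ANOVA output. The sketch below assumes the usual formula for the standard error of a difference between two group means, and it assumes group sizes of 11, 12 and 12 observations; the group sizes are not stated in the output and are inferred only from the total of 35 observations (Total df = 34).

```python
import math

def mean_difference_se(ms_within: float, n_i: int, n_j: int) -> float:
    """Standard error of the difference between two group means."""
    return math.sqrt(ms_within * (1 / n_i + 1 / n_j))

# Within Groups mean square for Employment, taken from the ANOVA output.
ms_within_employment = 0.861

# Assumed group sizes (hypothetical): period 1 = 11, periods 2 and 3 = 12 each.
print(round(mean_difference_se(ms_within_employment, 11, 12), 3))  # period 1 vs 2
print(round(mean_difference_se(ms_within_employment, 12, 12), 3))  # period 2 vs 3
```

Under these assumed group sizes the two values come out at .387 and .379, the same standard errors that appear for Employment in the Tukey HSD output.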

Correlations

| | Gross domestic product, constant prices | Implied PPP | Gross national savings | Inflation, average consumer prices | Volume of imports of goods and services | Unemployment rate | Employment |
|---|---|---|---|---|---|---|---|
| Gross domestic product, constant prices | 1 | -.017 (.916) | .241 (.133) | -.458** (.003) | .796** (.000) | | -.126 (.464) |
| Implied PPP | -.017 (.916) | 1 | -.759** (.000) | | | | .819** (.000) |
| Gross national savings | .241 (.133) | -.759** (.000) | 1 | | | .428** (.006) | -.910** (.000) |
| Inflation, average consumer prices | -.458** (.003) | | | 1 | | | -.516** (.001) |
| Volume of imports of goods and services | .796** (.000) | | | | 1 | .060 (.712) | -.196 (.251) |
| Unemployment rate | | | .428** (.006) | | .060 (.712) | 1 | -.758** (.000) |
| Employment | -.126 (.464) | .819** (.000) | -.910** (.000) | -.516** (.001) | -.196 (.251) | -.758** (.000) | 1 |

Each cell shows the Pearson Correlation with Sig. (2-tailed) in parentheses. N = 40 for every pair except those involving Employment, where N = 36.

**. Correlation is significant at the 0.01 level (2-tailed).

*. Correlation is significant at the 0.05 level (2-tailed).


For the correlation part, we have taken GDP, Implied PPP, Gross National Savings, Inflation, Volume of imports of goods and services,
Unemployment rate and Employment as the key economic variables.
From the descriptive statistics, we observe that all the variables are normally distributed, as the skewness and kurtosis lie within the
-1 to +1 range. So, we don't need to apply a transformation to normalize the data set.
Here, we compare the given economic variables with each other and infer our results.
We can observe from the correlations that GDP is positively correlated with Gross National Savings and Volume of imports of goods and
services, whereas it is negatively related to the rest of the variables. GDP is most highly correlated with Volume
of imports of goods and services, which is also what one would expect in practice.
Similarly, Implied PPP is negatively correlated with all the variables except Employment.
Also, Gross National Savings is positively correlated with GDP, Inflation, Volume of imports of goods and services and Unemployment rate, whereas it is
negatively correlated with the remaining factors. This suggests that National Savings moves together with the inflation rate and the unemployment rate, which again
is plausible, although a correlation on its own does not show which variable affects the other.
Similarly, the relation between the remaining variables can be found and commented upon.
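For readers who want to reproduce such a comparison, the sketch below shows how a Pearson coefficient is computed and how it is converted into the t statistic on which the two-tailed significance value is based. It is a minimal illustration with hypothetical data and illustrative function names; the last line applies the conversion to a coefficient of -.458 with N = 40, one of the values reported in the correlation output.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

def correlation_t(r: float, n: int) -> float:
    """t statistic for testing r = 0, on n - 2 degrees of freedom."""
    return r * math.sqrt((n - 2) / (1 - r * r))

# Hypothetical series (not the assignment's data).
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1]
print(round(pearson_r(x, y), 3))

# A reported coefficient of -.458 with N = 40.
print(round(correlation_t(-0.458, 40), 2))
```

A t value of roughly -3.18 on 38 degrees of freedom is well beyond the usual two-tailed 5% cut-off, which is consistent with the small significance value reported for that coefficient.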
