
BASIC STATISTICS

Assignment name: Concepts of statistics


Solutions

1.
Variance is the expected squared deviation of a random variable from its mean, and it
informally measures how far a set of (random) numbers is spread out from their mean.

Thus, from the statistical point of view, a zero variance means that every squared
deviation from the mean is zero, so every value is equal to the mean. Plotted against
the observation number, the values all lie on a single horizontal line.

When the variance = 0, the values in our column/variable are all the same (it is a
constant). Such a variable shows no variation, so it cannot help to distinguish one
observation from another, and there is little use in including it in our analysis.
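
As a minimal sketch of this point, the snippet below compares a constant column with a varying one. Both columns are hypothetical values invented for illustration, and `pvariance` from Python's standard `statistics` module is used on the assumption that the population variance (dividing by N) is the one intended here.

```python
from statistics import pvariance

# Hypothetical columns, invented purely for illustration.
constant_column = [4, 4, 4, 4, 4]   # every value equals the mean
varying_column = [2, 4, 4, 5, 5]    # values are spread around the mean

# Population variance: mean of the squared deviations from the mean.
constant_variance = pvariance(constant_column)
varying_variance = pvariance(varying_column)

print("constant column variance:", constant_variance)
print("varying column variance:", varying_variance)

# A column with zero variance is a constant and is a candidate for removal.
if constant_variance == 0:
    print("constant_column carries no variation")
```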

2.
Using the data from the given table in column A

Mean:
• The arithmetic average of all the scores

$$\mu = \frac{\sum X}{N} = \frac{7+6+7+7+8+5+8+7+7+5+5}{11} = \frac{72}{11} \approx 6.545$$

(6.54 when truncated to two decimal places)
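
The arithmetic can be checked with a short sketch. The list reproduces the eleven column A scores quoted in this solution; the variable names are illustrative choices.

```python
# Column A scores as listed in the solution.
scores_a = [7, 6, 7, 7, 8, 5, 8, 7, 7, 5, 5]

total = sum(scores_a)   # sum of all scores (ΣX)
n = len(scores_a)       # number of terms (N)
mean_a = total / n      # arithmetic mean

print("sum =", total, "N =", n, "mean =", round(mean_a, 3))
```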

Median:
The median is the value separating the higher half from the lower half of a data sample.
For a data set, it may be thought of as the "middle" value.

For column A in the given table, the data can be arranged in ascending order as

(5, 5, 5, 6, 7, 7, 7, 7, 7, 8, 8)

Since the total number of terms is 11 (odd), the median is the middle term, that is,
the (11 + 1)/2 = 6th term.

Hence the median is 7.
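
A minimal sketch of the same procedure follows. The branch for an even number of terms uses the usual convention of averaging the two middle values; that case does not arise here and is included only as an assumption for completeness.

```python
# Column A scores as listed in the solution.
scores_a = [7, 6, 7, 7, 8, 5, 8, 7, 7, 5, 5]

ordered = sorted(scores_a)   # arrange in ascending order
n = len(ordered)

if n % 2 == 1:
    # Odd count: zero-based index n // 2 is the ((n + 1) / 2)-th term.
    median_a = ordered[n // 2]
else:
    # Even count (not the case here): average the two middle terms.
    median_a = (ordered[n // 2 - 1] + ordered[n // 2]) / 2

print("ordered:", ordered)
print("median:", median_a)
```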


Mode:

The mode is the score that occurs most frequently in a set of data.

Score   Frequency
8       2
7       5
6       1
5       3

Hence, from the frequency table above for column A, it is clear that the mode of the
scores is 7, since it has the maximum frequency of occurrence (5).
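
The frequency table can be rebuilt with `collections.Counter` from the standard library. This is only a sketch for checking the counts, not part of the original solution.

```python
from collections import Counter

# Column A scores as listed in the solution.
scores_a = [7, 6, 7, 7, 8, 5, 8, 7, 7, 5, 5]

# Count how many times each score occurs.
frequency = Counter(scores_a)

# most_common(1) returns a list holding the (score, count) pair with the highest count.
mode_a, mode_count = frequency.most_common(1)[0]

for score in sorted(frequency, reverse=True):
    print(score, frequency[score])
print("mode:", mode_a, "occurring", mode_count, "times")
```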

Variance:

Variance is given by the following formula:

$$\sigma^2 = \frac{\sum (X - \mu)^2}{N}$$

where X = individual score

µ = mean value

N = number of terms

Column A contains the score 7 five times, 6 once, 8 twice and 5 three times, and
µ = 72/11 ≈ 6.545, so

$$\sigma^2 = \frac{5(7-6.545)^2 + (6-6.545)^2 + 2(8-6.545)^2 + 3(5-6.545)^2}{11}$$

$$\sigma^2 = \frac{12.727}{11} \approx 1.157$$

Standard deviation:

Standard deviation is the square root of the variance.

Hence $\sigma = \sqrt{\sigma^2} = \sqrt{1.157} \approx 1.076$
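
One way to check this arithmetic is to recompute it from the raw scores. The sketch below assumes the population formula (dividing by N), which matches the division by 11 used in this solution; the variable names are illustrative.

```python
from math import sqrt

# Column A scores as listed in the solution.
scores_a = [7, 6, 7, 7, 8, 5, 8, 7, 7, 5, 5]

n = len(scores_a)
mu = sum(scores_a) / n   # mean of the scores

# Squared deviation of each score from the mean.
squared_deviations = [(x - mu) ** 2 for x in scores_a]

# Population variance: sum of squared deviations divided by N.
variance_a = sum(squared_deviations) / n

# Standard deviation: square root of the variance.
std_dev_a = sqrt(variance_a)

print("sum of squared deviations:", round(sum(squared_deviations), 4))
print("variance:", round(variance_a, 4))
print("standard deviation:", round(std_dev_a, 4))
```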


3.
Let the mean of the 12 scores be µ.

So we can write

$$\frac{x_1 + \dots + x_{12}}{12} = \mu$$

$$\Rightarrow x_1 + \dots + x_{12} = 12\mu$$
Now, according to the question, the largest score is increased by 36. Let us assume x12
is the largest score, so its value now becomes x12 + 36. The total number of scores is
still 12.

Let’s say the new mean is µ'.

Then,

$$\mu' = \frac{x_1 + \dots + x_{12} + 36}{12}$$

$$\Rightarrow \mu' = \frac{x_1 + \dots + x_{12}}{12} + \frac{36}{12}$$

$$\Rightarrow \mu' = \mu + 3$$

So, we can clearly see that the mean (µ) increases by 3.
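
The result does not depend on the actual scores, which a quick numerical check can illustrate. The twelve scores below are hypothetical values invented for this sketch; any other twelve numbers would give the same shift.

```python
# Twelve hypothetical scores, invented for illustration only.
scores = [12, 15, 9, 20, 18, 11, 14, 16, 10, 13, 17, 19]

old_mean = sum(scores) / len(scores)

# Increase the largest score by 36, leaving the number of scores unchanged.
updated = scores.copy()
largest_index = updated.index(max(updated))
updated[largest_index] += 36

new_mean = sum(updated) / len(updated)
shift = new_mean - old_mean   # should equal 36 / 12

print("old mean:", old_mean)
print("new mean:", new_mean)
print("increase:", shift)
```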

4.
In data science, data (singular) is a single value of a variable: the value of the
variable associated with one element of a population or sample. Data (plural) is all
the values of a variable: the set of values collected for the variable from each of the
elements belonging to the sample.

For example:-

A 5, 6, 3, ....
B 4, 3, 9, ....

Here, Column A and Column B are the variables of the data set. The single value 5 of
variable A is data (singular), while all the values of variable A together are data
(plural).

5.
Inferential statistics allows us to make predictions (“inferences”) from data.
With inferential statistics, we take data from samples and make generalizations about
a population. For example, we might stand in a mall and ask a sample of 100 people if
they like shopping at Sears. We could make a bar chart of the yes or no answers (that
would be descriptive statistics), or we could use our research (and inferential
statistics) to reason that around 75-80% of the population (all shoppers in all malls)
like shopping at Sears.
There are two main areas of inferential statistics:

1. Estimating parameters. This means taking a statistic from our sample data (for
example the sample mean) and using it to say something about a population
parameter (i.e. the population mean).
2. Hypothesis tests. This is where we can use sample data to answer research
questions. For example, we might be interested in knowing if a new cancer drug is
effective. Or if breakfast helps children perform better in schools.
Let’s say we have some sample data about a potential new cancer drug. We could use
descriptive statistics to describe our sample, including:

• Sample mean
• Sample standard deviation
• Making a bar chart or box plot
• Describing the shape of the sample probability distribution

A bar graph is one way to summarize data in descriptive statistics.

With inferential statistics we take that sample data from a small number of people
and try to determine if the data can predict whether the drug will work for everyone (i.e.
the population).
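
To make the estimation idea concrete, here is a minimal sketch of a normal-approximation (Wald) confidence interval for a population proportion. The survey result of 78 "yes" answers out of 100 is a hypothetical figure invented for illustration, and the Wald interval is only one of several possible methods, chosen here for its simplicity.

```python
from math import sqrt
from statistics import NormalDist

# Hypothetical survey result, invented for illustration only.
sample_size = 100
yes_answers = 78

# Descriptive statistic: the proportion observed in the sample.
p_hat = yes_answers / sample_size

# Inferential step: 95% normal-approximation (Wald) interval for the
# population proportion.
z = NormalDist().inv_cdf(0.975)                      # about 1.96
standard_error = sqrt(p_hat * (1 - p_hat) / sample_size)
lower = p_hat - z * standard_error
upper = p_hat + z * standard_error

print("sample proportion:", p_hat)
print("95% interval:", round(lower, 3), "to", round(upper, 3))
```

The sample proportion describes only the 100 people asked; the interval is the inferential statement about the wider population, and it is valid only to the extent that the sample is representative.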
