Ø Statistics is the use of data to help decision makers reach better decisions.
Ø Stages in a statistical investigation:
1. Collection
2. Organization
3. Presentation
4. Analysis
5. Interpretation

Collection of data

Ø Data can be collected from three sources:
1. Secondary source
2. Internal records
3. Primary source

Collection of data (continued)

Ø Secondary source: journals, reports, etc.
Ø Internal records: routine business record keeping, such as accounting and sales.
Ø Primary source:
1. Questioning
2. Observation

Presentation of data

Presentation can take two basic forms:

Ø Statistical table: numbers in a logical arrangement, with a brief explanation to show what they are.
Ø Statistical chart: a pictorial device for presenting data.

Classification of data

Ø It is the grouping of related facts into different classes with respect to some characteristic, known as the basis of classification.
Ø E.g. sorting of letters in a post office.

Types of classification

Ø Geographical: area-wise, e.g. cities.
Ø Chronological: on the basis of time.
Ø Qualitative: according to some attribute.
Ø Quantitative: In terms of

Geographical Classification

Ø Data is classified on the basis of geographical or locational differences between the items.
Ø Two approaches to classification:
ü In alphabetical order.
ü By size.
Ø E.g. distribution of grain production across India.
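The two orderings mentioned for geographical classification can be sketched in a few lines of Python. The region names and production figures below are hypothetical placeholders, not real data.

```python
# Hypothetical grain production by region (illustrative figures only).
production = {"North": 120, "East": 95, "West": 150, "South": 80}

# Approach 1: regions listed in alphabetical order.
alphabetical = sorted(production.items())

# Approach 2: regions listed by size, largest first.
by_size = sorted(production.items(), key=lambda item: item[1], reverse=True)

print(alphabetical)
print(by_size)
```

Alphabetical order makes a region easy to look up, while ordering by size brings out the ranking of the regions.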

Chronological Classification

Ø Used when data is observed over a period of time.
Ø Approach: start with the earliest time period and proceed in chronological order.
Ø E.g. GDP analysis from 1991 onwards.

Qualitative Classification

Ø Data is classified on the basis of some attribute or quality, such as sex or blindness.
Ø The attributes under study cannot be measured.
Ø Different types:
1. Simple
2. Twofold or dichotomous
3. Manifold

Quantitative Classification

Ø Refers to the classification of data according to some characteristic that can be measured, such as height or weight.
Ø Two basic elements:
ü Variable
ü Frequency

What is a variable?

Ø A variable is a characteristic that varies in amount or magnitude in a frequency distribution. It can be continuous or discrete (discontinuous).
Ø A continuous variable is capable of taking every conceivable fractional value within the range of possibilities.
Ø A discrete variable can take only certain values, moving from one to the next in finite "jumps".
Ø In practice, almost every observation is a discrete one.

Frequency distribution

Refers to data classified on the basis of some variable that can be measured, such as price, wages or age.
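As a minimal sketch, a frequency distribution for a measured variable can be built by counting how often each value occurs. The ages below are hypothetical, and `frequency_distribution` is an illustrative helper name.

```python
from collections import Counter

# Hypothetical ages of ten employees (illustrative data only).
ages = [23, 25, 23, 31, 25, 23, 40, 31, 25, 23]


def frequency_distribution(values):
    """Return (value, frequency) pairs sorted by value."""
    counts = Counter(values)
    return sorted(counts.items())


# Print a simple two-column table: variable, then frequency.
print(f"{'Age':>5} {'Frequency':>9}")
for value, frequency in frequency_distribution(ages):
    print(f"{value:>5} {frequency:>9}")
```

The first column holds the variable (age) and the second the frequency, which are the two basic elements of a quantitative classification.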

**Classification according to class intervals**

Ø Class limits
Ø Class intervals
Ø Class frequency
Ø Class midpoint

Midpoint = (Upper limit + Lower limit) / 2
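The midpoint formula can be checked with a short Python sketch. The function names are illustrative, and taking the class interval as the upper limit minus the lower limit is an assumption that holds for classes such as 10-20, 20-30.

```python
def class_midpoint(lower_limit, upper_limit):
    """Midpoint = (upper limit + lower limit) / 2."""
    return (upper_limit + lower_limit) / 2


def class_interval(lower_limit, upper_limit):
    """Width of the class, assumed here to be upper limit minus lower limit."""
    return upper_limit - lower_limit


# Hypothetical class 10-20.
print(class_midpoint(10, 20))
print(class_interval(10, 20))
```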

Methods of classification

Ø Exclusive method: the upper limit of a class is not included in that class.
Ø Inclusive method: both the upper and the lower limits are included in the class.
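The difference between the two methods can be sketched as two membership tests. The marks and class limits below are hypothetical, and including the lower limit under the exclusive method is an assumption based on the usual convention.

```python
def in_class_exclusive(value, lower, upper):
    """Exclusive method: the upper limit is not part of the class."""
    return lower <= value < upper


def in_class_inclusive(value, lower, upper):
    """Inclusive method: both limits are part of the class."""
    return lower <= value <= upper


def tally(values, classes, belongs):
    """Count how many values fall in each (lower, upper) class."""
    return [sum(1 for v in values if belongs(v, lo, hi)) for lo, hi in classes]


# Hypothetical marks (illustrative data only).
marks = [10, 15, 20, 20, 29, 30]

exclusive_classes = [(10, 20), (20, 30), (30, 40)]  # 20 belongs to 20-30
inclusive_classes = [(10, 19), (20, 29), (30, 39)]  # limits do not overlap

print(tally(marks, exclusive_classes, in_class_exclusive))
print(tally(marks, inclusive_classes, in_class_inclusive))
```

With whole-number data both methods give the same tallies. Under the exclusive method a value equal to an upper limit, such as 20, is counted in the next class.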

Principles of classification

Ø The number of classes should be more than 5; otherwise the classification may not reveal the essential characteristics of the data.
Ø Sturges' formula: K = 1 + 3.322 log N, where K is the number of classes, N is the number of observations and the logarithm is to base 10.
Ø Prefer a class interval of five or a multiple of five.
Ø The starting point should be zero, five or a multiple of five.
Ø To ensure continuity, prefer exclusive
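The formula for the number of classes, K = 1 + 3.322 log N, can be evaluated with a short Python sketch. The function name is illustrative, the logarithm is taken to base 10, and rounding K up to a whole number of classes is an assumption, not something the slides state.

```python
import math


def sturges_classes(n_observations):
    """Suggested number of classes K = 1 + 3.322 * log10(N)."""
    if n_observations < 1:
        raise ValueError("need at least one observation")
    return 1 + 3.322 * math.log10(n_observations)


k = sturges_classes(100)  # hypothetical sample of 100 observations
print(round(k, 3))        # raw value of K
print(math.ceil(k))       # rounded up to a whole number of classes (assumption)
```

For 100 observations the formula suggests about eight classes, which also satisfies the guideline of using more than five.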

Tabulation of data

A statistical table is a logical listing of related quantitative data in vertical columns and horizontal rows of numbers, with sufficient explanatory and qualifying words, phrases and statements in the form of titles, headings and notes to make clear the full meaning of the data and their origin.

**Parts of the table**

Ø Table number
Ø Title of the table
Ø Caption
Ø Stub
Ø Body of the table
Ø Head notes
Ø Footnotes

Types of Table

Ø Simple and complex tables
ü Single or one-way table
ü Two-way table
ü Higher-order table
Ø General and specific purpose tables

Charting data

Ø Why charts?
ü Easy to analyze
ü Easier to remember
Ø Types of charts:
ü Diagrams
ü Graphs

**Rules for constructing a diagram**

Ø A title should be given to every diagram.
Ø The scale should be in even numbers.
Ø Footnotes and an index should be added to clarify points about the diagram.
Ø The diagram should be neat and clean.
Ø Simplicity.

Types of diagrams

Ø One-dimensional diagrams, e.g. bar diagram.
Ø Two-dimensional diagrams, e.g. rectangles.
Ø Pictograms and cartograms.

Bar Diagram

Ø A bar is a thick line whose width is shown merely for attention.
Ø Merits of bar diagrams:
ü Easily understood by those who are unaccustomed to reading charts
ü Simplest and easiest to make
ü Easy to compare no. of

**Rules for Bar Diagram**

Ø The width of the bars should be uniform.
Ø The gap between bars should be uniform throughout.
Ø Bars can be vertical or horizontal; however, vertical bars are easier to read.
Ø While constructing bars, it is preferable to write the respective figure so that the reader can know the

**Types of Bar Diagram**

Ø Simple bar diagram
Ø Subdivided bar diagram
Ø Multiple bar diagram
Ø Percentage bar diagram
Ø Deviation bars
Ø Broken bars

**Simple Bar Diagram**

Ø Used to represent only one variable.
Ø Only the length of the bar matters in this case.
Ø Its limitation is that it can present only one classification or one category of data.

**Subdivided Bar Diagram**

Ø Used to represent the various parts of a total.
Ø It cannot be used where the number of components is more than 12.
Ø Can be used to represent a percentage distribution in place of a pie chart.

**Multiple Bar Diagram**

Ø Two or more sets of interrelated data are represented.
Ø The technique is the same as that of the simple bar diagram.

Others

Ø Deviation bars
ü Used for representing net quantities: excess or deficit.
ü Can be positive, zero or negative.
Ø Broken bars
ü Used where the variation in values is very high.
ü To gain space for smaller bars, large bars may be broken.

**Two Dimensional Diagrams**

Ø Both length and width are considered in this type.
Ø Types of 2-D diagrams:
ü Rectangles
ü Squares
ü Circles

Rectangle

Ø Area is considered in this type, as it is the product of length and width.

Pie Chart

Ø Used to represent a total broken up into its components.
Ø Can be constructed on the basis of angle (360°) or percentage (100%).
Ø Limitation:
ü Less effective for reading and interpretation, particularly when series are divided into large no. of
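The two bases for constructing a pie chart, angle and percentage, can be sketched as follows. The cost figures are hypothetical and `pie_shares` is an illustrative helper name.

```python
def pie_shares(components):
    """Map each component to (percentage of total, angle in degrees)."""
    total = sum(components.values())
    if total <= 0:
        raise ValueError("total must be positive")
    return {
        name: (value * 100 / total, value * 360 / total)
        for name, value in components.items()
    }


# Hypothetical cost breakdown (illustrative figures only).
costs = {"materials": 50, "labour": 30, "overheads": 20}

for name, (percent, angle) in pie_shares(costs).items():
    print(f"{name}: {percent:.1f}% -> {angle:.1f} degrees")
```

The percentages add up to 100 and the angles to 360, so either basis gives the same sectors.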

Pictogram

Ø Data is represented through a carefully selected pictorial symbol.
Ø Merits:
ü Facts portrayed in pictorial form are generally remembered longer than facts presented in tables.
ü More attractive, and therefore used to draw the attention of the masses in exhibitions etc.
Ø Limitation:
ü It is difficult to construct.

Cartogram

Ø Used to give quantitative information on a geographical basis.
Ø The quantity on the map can be shown in many ways, such as through the use of colors or dots, by placing a pictogram in each geographical unit, or by placing the appropriate numerical figure in each geographical unit.

**How to select a diagram**

Ø Depends upon two factors:
1. Nature of the data.
2. Type of people for whom the diagram is meant.

**How to select a diagram (continued)**

Ø Simple bar diagram: when change in the total is to be shown.
Ø Component bar diagram: when change in the total as well as in the size of the components is to be shown; the number of components should not be more than 3 or 4.
Ø Percentage sub-divided bar chart: better suited when changes in the relative size of the component figures are to be exhibited.
Ø Multiple bar chart: when changes in the absolute values of the components are to

Continued

Ø A pie chart is useful when it is desirable to show the relative proportions of the figures that make up an overall total, but it cannot be used where a series of figures is involved, as comparison becomes very difficult.
Ø Pictograms and cartograms are more informative and more effective than other forms of presentation to

Graphs

Ø Graphs can be divided mainly under two heads:
1. Graphs of time series, or line graphs.
2. Graphs of frequency distributions.

Line graphs

Ø Time is taken on the X axis and the variable on the Y axis.
Ø If the unit of measurement is the same, two or more variables can be represented on the same graph.

Others

Ø Range chart: used to show the range of variation, i.e. the minimum and maximum values of a variable.
Ø Band graph: a type of line graph that shows the total for successive time periods broken up into subtotals for each of component

**Graphs of frequency distribution**

Ø Histogram
Ø Frequency polygon
Ø Smoothed frequency curve
Ø Cumulative frequency curves or ogives

Histogram

Ø It is a graph that represents the class frequencies in a frequency distribution by vertical adjacent rectangles.
Ø Observations are plotted on the horizontal axis and the corresponding frequencies on the vertical axis.
Ø Construction:
ü For distributions having equal class intervals.
ü For distribution having unequal class

Histogram (continued)

Ø Histogram with equal class intervals: the heights of the rectangles are proportional to the frequencies.
Ø Histogram with unequal class intervals: the heights are proportional to the ratios of the frequencies to the widths of the classes. E.g. for a class whose interval is twice the smallest class interval, the height of the rectangle is half the corresponding frequency.
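The height adjustment for unequal class intervals can be sketched in Python. The classes and frequencies below are hypothetical, and `histogram_heights` is an illustrative helper that scales each frequency by how many times wider its class is than the smallest class.

```python
def histogram_heights(classes):
    """classes: list of (lower limit, upper limit, frequency).

    Each height is the frequency divided by the ratio of the class width
    to the smallest class width, so wider classes get shorter rectangles.
    """
    widths = [upper - lower for lower, upper, _ in classes]
    smallest = min(widths)
    return [
        frequency / (width / smallest)
        for (_, _, frequency), width in zip(classes, widths)
    ]


# Hypothetical distribution: the last class is twice as wide as the others.
distribution = [(0, 10, 8), (10, 20, 12), (20, 40, 10)]

print(histogram_heights(distribution))
```

The class 20-40 is twice the smallest width, so its rectangle is drawn at half its frequency. When all intervals are equal the heights are simply the frequencies.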

Limitation of charts

Ø Easily misinterpreted.
Ø 2-D and 3-D diagrams cannot be accurately appraised visually and should therefore be avoided.
Ø Can present only a limited amount of information.
Ø Represent only approximate values.
