Newsletter

Researchers Corner

Volume 4 Issue 7 July 2012

Four Steps to Tabular Presentation of Data


I have twice mentioned the four steps to tabular presentation, and here they are for novice researchers who have had little exposure to basic statistics. To recapitulate, the March 2012 issue elaborated the preparatory work for tabulation, such as tally marking, needed to build a frequency table that displays data in a concise and logical order, with one-way, two-way or three-way classification depending on the number of characteristics involved. Note that the raw data itself can be classified broadly in four ways: qualitative, quantitative, temporal and spatial (see box for their definitions). Classification, by organizing similar things into groups or classes, brings order to the data, and the classified data can further be easily subjected to statistical analysis. Mutually exclusive but exhaustive classes (or groups) are created while tabulating, based on common characteristics. We use attributes (statistics of attributes) for qualitative data, and class intervals, class limits, magnitude and frequencies for quantitative data (statistics of variables). The four steps presented here refer to quantitative data only.

i) Qualitative classification is based on qualitative characteristics such as status, nationality, religion, marital status and gender.
ii) Quantitative classification is based on characteristics measured quantitatively, such as age, height and income. Quantitative variables can be continuous or discrete. A continuous variable can take any numerical value within a range, such as weight or height. A discrete variable can take only certain values, such as a number of books: it jumps from one value to the next and takes no intermediate value between them. For example, a person can weigh 71.5 kg, but we cannot have 2.5 persons.
iii) Temporal (or chronological) classification uses time, such as hours, days, weeks, months or years, as the classifying variable (data arranged in this way, for example by year, are called a time series).
iv) Spatial classification uses place, such as village, town, block, district, state or country, as the classifying variable.

1. Decide the number of classes: First, get to know the range and variation in the values of the variable. The range is the difference between the largest and the smallest value of the variable. (The classes together must cover the range, so the number of classes multiplied by the class interval is roughly equal to, and never less than, the range.) In the sample Table 2 of the March issue (see table), the price of elementary textbooks ranged from, say, 4 to 99, giving a range of 95. It was decided to have 10 classes, each with a size or class interval of 10.

2. Decide the size of each class: This decision is interlinked with the previous one, i.e., with the number of classes. The rule of thumb is to have 5 to 15 classes. The size of a class can be worked out mathematically with the formula i = R / (1 + 3.3 log N), where i is the size of the class interval, R is the range, N is the number of items to be grouped and log is the logarithm to base 10. In the table referred to above, as already mentioned, we chose a size of 10 for each class.

3. Determine the class limits: Choose a value less than the minimum value of the variable as the lower class limit of the first class, and a value greater than the maximum value of the variable as the upper class limit of the last class. In the example, we chose 1 as the lower class limit of the first class and 100 as the upper class limit of the last class. It is important to choose the class limits in such a way that the midpoint or class mark of each class coincides, as far as possible, with a value around which the data tend to be concentrated. That is, the class limits are chosen so that the midpoint of a class is close to the average of the values falling in it. Once the class limits are chosen, we have the class intervals. In other words, the class intervals become the various intervals of the variable chosen for classifying data. In the example we chose an equal class interval for all 10 classes. See the diagram showing how the midpoints of even and odd class intervals are determined. Further, the class intervals can be either exclusive or inclusive (see text box for further explanation).

(i) Exclusive method: When the upper class limit of one class equals the lower class limit of the next class, the interval is exclusive. This suits data from a continuous variable. While recording frequencies, the lower class limit of a class is included in the interval but the upper class limit is excluded.
(ii) Inclusive method: If both the lower and the upper class limits are part of the class interval, the interval is inclusive. If there is a gap or discontinuity between the upper limit of one class and the lower limit of the next, the class intervals are adjusted. The procedure is to divide the difference between the upper limit of the first class and the lower limit of the second class by 2, subtract the result from all lower class limits and add it to all upper class limits. This adjustment restores continuity of data in the frequency distribution. The class mark is then computed from the adjusted limits: Adjusted class mark = (Adjusted upper limit + Adjusted lower limit) / 2.
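The adjustment described for the inclusive method can be sketched in a few lines of Python. This is only an illustration: the function names adjust_to_continuous and class_mark are made up for this sketch, and the three classes are simply the first three of the 1 to 100 example with a class interval of 10.

```python
def adjust_to_continuous(classes):
    """Close the gaps between inclusive classes given as (lower, upper) pairs."""
    # Half the gap between the upper limit of the first class
    # and the lower limit of the second class.
    half_gap = (classes[1][0] - classes[0][1]) / 2
    # Subtract it from every lower limit and add it to every upper limit.
    return [(lower - half_gap, upper + half_gap) for lower, upper in classes]


def class_mark(lower, upper):
    """Midpoint of a class: (upper limit + lower limit) / 2."""
    return (upper + lower) / 2


# First three inclusive classes of the example (gap of 1 between classes).
inclusive = [(1, 10), (11, 20), (21, 30)]

adjusted = adjust_to_continuous(inclusive)
marks = [class_mark(lower, upper) for lower, upper in adjusted]

print(adjusted)  # [(0.5, 10.5), (10.5, 20.5), (20.5, 30.5)]
print(marks)     # [5.5, 15.5, 25.5]
```

After the adjustment, the upper limit of each class equals the lower limit of the next, so the classes can be read in the exclusive way.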

4. Find the frequency of each class: Count how many observations in the raw data fall within each class, placing each observation in its class by tally marking (see March 2012 issue).
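For readers who like to see the four steps worked through, here is a minimal Python sketch. The prices are invented for illustration and are not the data of the March 2012 table; they merely share its minimum of 4 and maximum of 99. The function names are likewise made up for this sketch. Note that the formula only suggests a class size; the article's example uses a rounder size of 10.

```python
import math


def suggested_class_size(values):
    """Steps 1 and 2: class size i = R / (1 + 3.3 log10 N)."""
    value_range = max(values) - min(values)  # R, the range
    n = len(values)                          # N, the number of items
    return value_range / (1 + 3.3 * math.log10(n))


def inclusive_classes(lowest, highest, size):
    """Step 3: inclusive integer classes such as (1, 10), (11, 20), ..."""
    return [(lower, lower + size - 1)
            for lower in range(lowest, highest + 1, size)]


def frequency_table(values, classes):
    """Step 4: count the values falling in each inclusive class."""
    counts = {c: 0 for c in classes}
    for value in values:
        for lower, upper in classes:
            if lower <= value <= upper:
                counts[(lower, upper)] += 1
                break
    return counts


# Hypothetical prices, invented for this sketch only.
prices = [4, 12, 18, 25, 25, 37, 41, 56, 63, 99]

data_range = max(prices) - min(prices)     # 99 - 4 = 95
suggested = suggested_class_size(prices)   # 95 / 4.3, about 22.1
classes = inclusive_classes(1, 100, 10)    # 10 classes from 1-10 to 91-100
table = frequency_table(prices, classes)

for (lower, upper), count in table.items():
    print(f"{lower:3d} - {upper:3d}: {count}")
```

With only ten invented values the formula suggests a class size of about 22, which would give five classes; with more observations it suggests narrower classes.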

Lastly, one may wonder why one should go through all these mind-boggling exercises when software provides a ready-to-use table. True, software removes much of the statistical drudgery, but the concepts and terms in these steps are required even to use the software. As an exercise, try the pivot table tool of Excel to generate a frequency table with five classes.

M S Sridhar
sridhar@informindia.co.in