
STATISTICS FOR ECONOMICS

CHAPTER 2
Collection of Data

Studying this chapter should enable you to:
• understand the meaning and purpose of data collection;
• distinguish between primary and secondary sources;
• know the mode of collection of data;
• distinguish between Census and Sample Surveys;
• be familiar with the techniques of sampling;
• know about some important sources of secondary data.

1. INTRODUCTION

In the previous chapter, you read about what economics is. You also studied the role and importance of statistics in economics. In this chapter, you will study the sources of data and the mode of data collection. The purpose of collecting data is to gather evidence for reaching a sound and clear solution to a problem.

In economics, you often come across a statement like,

“After many fluctuations the output of food grains rose to 176 million tonnes in 1990–91 and 199 million tonnes in 1996–97, but fell to 194 million tonnes in 1997–98. Production of food grains then rose continuously and touched 212 million tonnes in 2001–02.”

In this statement, you can observe that food grain production does not remain the same in different years. It varies from year to year and from crop to crop. As these values vary, they are called variables. Variables are generally represented by the letters X, Y or Z. The values of these variables are the observations. For example, suppose the food grain production in India varies from about 100 million tonnes in 1970–71 to about 220 million tonnes in 2001–02, as shown in the following table. The years are represented by variable X and the production of food grain in India (in million tonnes) is represented by variable Y:

TABLE 2.1
Production of Food Grain in India (Million Tonnes)

X          Y
1970–71    108
1978–79    132
1979–80    108
1990–91    176
1996–97    199
1997–98    194
2001–02    212

Here, these values of the variables X and Y are the ‘data’, from which we can obtain information about the trend of the production of food grains in India. To know the fluctuations in the output of food grains, we need the ‘data’ on the production of food grains in India. ‘Data’ is a tool, which helps in understanding problems by providing information.

You must be wondering where ‘data’ come from and how we collect them. In the following sections we will discuss the types of data, the methods and instruments of data collection and the sources of obtaining data.

2. WHAT ARE THE SOURCES OF DATA?

Statistical data can be obtained from two sources. The enumerator (the person who collects the data) may collect the data by conducting an enquiry or an investigation. Such data are called Primary Data, as they are based on first hand information. Suppose you want to know about the popularity of a film star among school students. For this, you will have to enquire from a large number of school students, asking them questions to collect the desired information. The data you get are an example of primary data.

If the data have been collected and processed (scrutinised and tabulated) by some other agency, they are called Secondary Data. Generally, published data are secondary data. They can be obtained either from published sources or from any other source, for example, a web site. Thus, the data are primary to the source that collects and processes them for the first time and secondary for all sources that later use such data. Use of secondary data saves time and cost. For example, after collecting the data on the popularity of the film star among students, you publish a report. If somebody uses the data collected by you for a similar study, it becomes secondary data.

3. HOW DO WE COLLECT THE DATA?

Do you know how a manufacturer decides about a product or how a political party decides about a candidate? They conduct a survey by
asking questions about a particular product or candidate from a large group of people. The purpose of surveys is to describe some characteristics like price, quality, usefulness (in case of the product) and popularity, honesty, loyalty (in case of the candidate). The purpose of the survey is to collect data. Survey is a method of gathering information from individuals.

Preparation of Instrument

The most common type of instrument used in surveys is the questionnaire/interview schedule. The questionnaire is either self administered by the respondent or administered by the researcher (enumerator) or a trained investigator. While preparing the questionnaire/interview schedule, you should keep in mind the following points:

• The questionnaire should not be too long. The number of questions should be as small as possible. Long questionnaires discourage people from completing them.

• The series of questions should move from general to specific. The questionnaire should start from general questions and proceed to more specific ones. This helps the respondents feel comfortable. For example:
Poor Q
(i) Is increase in electricity charges justified?
(ii) Is the electricity supply in your locality regular?
Good Q
(i) Is the electricity supply in your locality regular?
(ii) Is increase in electricity charges justified?

• The questions should be precise and clear. For example:
Poor Q
What percentage of your income do you spend on clothing in order to look presentable?
Good Q
What percentage of your income do you spend on clothing?

• The questions should not be ambiguous, to enable the respondents to answer quickly, correctly and clearly. For example:
Poor Q
Do you spend a lot of money on books in a month?
Good Q
How much do you spend on books in a month?
(i) Less than Rs 200
(ii) Between Rs 200–300
(iii) Between Rs 300–400
(iv) More than Rs 400

• The question should not use double negatives. The questions starting with “Wouldn’t you” or “Don’t you” should be avoided, as they may lead to biased responses. For example:
Poor Q
Don’t you think smoking should be prohibited?
Good Q
Do you think smoking should be prohibited?
• The question should not be a leading question, which gives a clue about how the respondent should answer. For example:
Poor Q
How do you like the flavour of this high-quality tea?
Good Q
How do you like the flavour of this tea?

• The question should not indicate alternatives to the answer. For example:
Poor Q
Would you like to do a job after college or be a housewife?
Good Q
Would you like to do a job, if possible?

The questionnaire may consist of closed-ended (or structured) questions or open-ended (or unstructured) questions.

Closed-ended or structured questions can either be a two-way question or a multiple choice question. When there are only two possible answers, ‘yes’ or ‘no’, it is called a two-way question.

When there is a possibility of more than two options of answers, multiple choice questions are more appropriate. Example,
Q. Why did you sell your land?
(i) To pay off the debts.
(ii) To finance children’s education.
(iii) To invest in another property.
(iv) Any other (please specify).

Closed-ended questions are easy to use, score and code for analysis, because all the respondents respond from the given options. But they are difficult to write as the alternatives should be clearly written to represent both sides of the issue. There is also a possibility that the individual’s true response is not present among the options given. For this, the choice of ‘Any Other’ is provided, where the respondent can write a response, which was not anticipated by the researcher. Moreover, another limitation of multiple-choice questions is that they tend to restrict the answers by providing alternatives, without which the respondents may have answered differently.

Open-ended questions allow for more individualised responses, but they are difficult to interpret and hard to score, since there are a lot of variations in the responses. Example,
Q. What is your view about globalisation?

Mode of Data Collection

Have you ever come across a television show in which reporters ask questions from children, housewives or general public regarding their examination performance or a brand of soap or a political party? The purpose of asking questions is to do a survey for collection of data. There are three basic ways of collecting data: (i) Personal Interviews, (ii) Mailing (questionnaire) Surveys, and (iii) Telephone Interviews.
Personal Interviews

This method is used when the researcher has access to all the members. The researcher (or investigator) conducts face to face interviews with the respondents.

Personal interviews are preferred due to various reasons. Personal contact is made between the respondent and the interviewer. The interviewer has the opportunity of explaining the study and answering any query of the respondents. The interviewer can request the respondent to expand on answers that are particularly important. Misinterpretation and misunderstanding can be avoided. Watching the reactions of the respondents can provide supplementary information.

Personal interview has some demerits too. It is expensive, as it requires trained interviewers. It takes longer time to complete the survey. Presence of the researcher may inhibit respondents from saying what they really think.

Mailing Questionnaire

When the data in a survey are collected by mail, the questionnaire is sent to each individual by mail with a request to complete and return it by a given date. The advantages of this method are that it is less expensive. It allows the researcher to have access to people in remote areas too, who might be difficult to reach in person or by telephone. It does not allow influencing of the respondents by the interviewer. It also permits the respondents to take sufficient time to give thoughtful answers to the questions. These days online surveys or surveys through short messaging service, i.e. SMS, have become popular. Do you know how an online survey is conducted?

The disadvantages of mail survey are that there is less opportunity to provide assistance in clarifying instructions, so there is a possibility of misinterpretation of questions. Mailing is also likely to produce low response rates due to certain factors such as returning the questionnaire without completing it, not returning the questionnaire at all, loss of questionnaire in the mail itself, etc.

Telephone Interviews

In a telephone interview, the investigator asks questions over the telephone. The advantages of telephone interviews are that they are cheaper than personal interviews and can be conducted in a shorter time. They allow the researcher to assist the respondent by clarifying the questions. Telephone interview is better in the cases where the respondents are reluctant to answer certain questions in personal interviews.
The disadvantage of telephone interviews is limited access to people, as many people may not own telephones. Telephone interviews also prevent the interviewer from seeing the visual reactions of the respondents, which are helpful in obtaining information on sensitive issues.

Advantages and disadvantages of the three modes of data collection

Personal Interviews
Advantages:
• Highest response rate
• Allows use of all types of questions
• Better for using open-ended questions
• Allows clarification of ambiguous questions
Disadvantages:
• Most expensive
• Possibility of influencing respondents
• More time taking

Mailing Questionnaire
Advantages:
• Least expensive
• Only method to reach remote areas
• No influence on respondents
• Maintains anonymity of respondents
• Best for sensitive questions
Disadvantages:
• Cannot be used by illiterates
• Long response time
• Does not allow explanation of ambiguous questions
• Reactions cannot be watched

Telephone Interviews
Advantages:
• Relatively low cost
• Relatively less influence on respondents
• Relatively high response rate
Disadvantages:
• Limited use
• Reactions cannot be watched
• Possibility of influencing respondents

Activities
• You have to collect information from a person, who lives in a remote village of India. Which mode of data collection will be the most appropriate for collecting information from him?
• You have to interview the parents about the quality of teaching in a school. If the principal of the school is present there, what types of problems can arise?

Pilot Survey

Once the questionnaire is ready, it is advisable to conduct a try-out with a small group which is known as Pilot Survey or Pre-Testing of the questionnaire. The pilot survey helps in providing a preliminary idea about the survey. It helps in pre-testing of the questionnaire, so as to know the shortcomings and drawbacks of the questions. Pilot survey also helps in assessing the suitability of questions, clarity of instructions, performance of enumerators and the cost and time involved in the actual survey.

4. CENSUS AND SAMPLE SURVEYS

Census or Complete Enumeration

A survey, which includes every element of the population, is known as Census or the Method of Complete Enumeration. If certain agencies are interested in studying the total population in India, they have to obtain information from all the households in rural and urban India.
The essential feature of this method is that this covers every individual unit in the entire population. You cannot select some and leave out others. You may be familiar with the Census of India, which is carried out every ten years. A house-to-house enquiry is carried out, covering all households in India. Demographic data on birth and death rates, literacy, workforce, life expectancy, size and composition of population, etc. are collected and published by the Registrar General of India. The last Census of India was held in February 2001.

[Figure. Source: Census of India, 2001.]

According to the Census 2001, population of India is 102.70 crore. It was 23.83 crore according to Census 1901. In a period of hundred years, the population of our country increased by 78.87 crore. Census 1981 indicated that the rate of population growth during 1960s and 1970s remained almost same. 1991 Census indicated that the annual growth rate of population during 1980s was 2.14 per cent, which came down to 1.93 per cent during 1990s according to Census 2001.

“At 00.00 hours of first March, 2001 the population of India stood at 1,027,015,247 comprising of 531,277,078 males and 495,738,169 females. Thus, India becomes the second country in the world after China to cross the one billion mark.”

Source: Census of India, 2001.

Sample Survey

Population or the Universe in statistics means totality of the items under study. Thus, the Population or the Universe is a group to which the results of the study are intended to apply. A population is always all the individuals/items who possess certain characteristics (or a set of characteristics), according to the purpose of the survey. The first task in selecting a sample is to identify the population. Once the population is identified, the researcher selects a Representative Sample, as it is difficult to study the entire population. A sample refers to a group or section of the population from which information is to be obtained. A good sample (representative sample) is generally smaller than the population and is capable of providing reasonably accurate information about the population at a much lower cost and shorter time.

Suppose you want to study the average income of people in a certain region. According to the Census method, you would be required to find out the income of every individual in the region, add them up and divide by number of individuals to get the average income of people in the region. This method would require huge expenditure, as a large number of enumerators have to be employed. Alternatively, you select a representative sample of a few individuals from the region and find out their income. The average income of the selected group of individuals is used as an estimate of average income of the individuals of the entire region.

Example
• Research problem: To study the economic condition of agricultural labourers in Churachandpur district of Manipur.
• Population: All agricultural labourers in Churachandpur district.
• Sample: Ten per cent of the agricultural labourers in Churachandpur district.

Most of the surveys are sample surveys. These are preferred in statistics because of a number of reasons. A sample can provide reasonably reliable and accurate information at a lower cost and shorter time. As samples are smaller than population, more detailed information can be collected by conducting intensive enquiries. As we need a smaller team of enumerators, it is easier to train them and supervise their work more effectively.

Now the question is how do you do the sampling? There are two main types of sampling, random and non-random. The following description will make their distinction clear.

Activities
• In which years will the next Census be held in India and China?
• If you have to study the opinion of students about the new economics textbook of class XI, what will be your population and sample?
• If a researcher wants to estimate the average yield of wheat in Punjab, what will be her/his population and sample?

Random Sampling

As the name suggests, random sampling is one where the individual units from the population (samples) are selected at random. The government wants to determine the
impact of the rise in petrol price on the household budget of a particular locality. For this, a representative (random) sample of 30 households has to be taken and studied. The names of all the 300 households of that area are written on pieces of paper and mixed well, then 30 names to be interviewed are selected one by one.

[Figure: A Population of 20 Kuchha and 20 Pucca Houses, shown with A Representative Sample and A non Representative Sample.]

In the random sampling, every individual has an equal chance of being selected and the individuals who are selected are just like the ones who are not selected. In the above example, all the 300 sampling units (also called sampling frame) of the population got an equal chance of being included in the sample of 30 units and hence the sample, such drawn, is a random sample. This is also called lottery method. The same could be done using a Random Number Table also.

Exit Polls
You must have seen that when an election takes place, the television networks provide election coverage. They also try to predict the results. This is done through exit polls, wherein a random sample of voters who exit the polling booths are asked whom they voted for. From the data of the sample of voters, the prediction is made.

How to use the Random Number Tables?

Do you know what are the Random Number Tables? Random number tables have been generated to guarantee equal probability of selection of every individual unit (by their listed serial number in the sampling frame) in the population. They are available either in a published form or can be generated by using appropriate software packages (See Appendix B). You can start using the table from anywhere, i.e., from any page, column, row or point. In the above example, you need to select a sample of 30 households out of 300 total households. Here, the largest serial number is 300, a three digit number and therefore we consult three digit random numbers in sequence. We will skip the random numbers greater than 300 since there is no household number greater than 300. Thus, the 30 selected households are with serial numbers: 149, 219, 111, 165, 230, 007, 089, 212, 051, 244, 300, 051, 244, 155, 300, 051, 152, 156, 205, 070, 015, 157, 040, 243, 479, 116, 122, 081, 160, 162.
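
The random number table procedure described in this section can be sketched in Python. This is only an illustration under stated assumptions: the function names draw_from_random_numbers and three_digit_stream are hypothetical, the three digit numbers come from Python's random module rather than from a published table, and skipping a serial number that has already been drawn is an assumption made here so that no household is selected twice.

```python
import random


def draw_from_random_numbers(stream, population_size, sample_size):
    """Pick serial numbers from a stream of random integers, as with a printed table."""
    chosen = []
    for number in stream:
        # Skip values that match no household (0, or above the largest serial number).
        if number < 1 or number > population_size:
            continue
        # Assumption: a serial number drawn earlier is skipped, so units are distinct.
        if number in chosen:
            continue
        chosen.append(number)
        if len(chosen) == sample_size:
            break
    return chosen


def three_digit_stream(seed):
    """Hypothetical stand-in for a random number table: endless numbers 000 to 999."""
    rng = random.Random(seed)
    while True:
        yield rng.randint(0, 999)


# Select 30 households out of 300, reading three digit numbers in sequence.
sample = draw_from_random_numbers(three_digit_stream(seed=2), 300, 30)
print(sorted(sample))

# Lottery method: draw 30 distinct slips out of 300 in one step.
lottery = random.Random(7).sample(range(1, 301), 30)
print(sorted(lottery))
```

Changing the seed plays the role of starting from a different page or row of the table; either way each serial number from 1 to 300 has the same chance of being drawn.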
Activity
• You have to analyse the trend of foodgrains production in India for the last fifty years. As it is difficult to include all the years, you have to select a sample of production of ten years. Using the Random Number Tables, how will you select your sample?

Non-Random Sampling

There may be a situation that you have to select 10 out of 100 households in a locality. You have to decide which household to select and which to reject. You may select the households conveniently situated or the households known to you or your friend. In this case, you are using your judgement (bias) in selecting 10 households. This way of selecting 10 out of 100 households is not a random selection. In a non-random sampling method, not all the units of the population have an equal chance of being selected, and convenience or judgement of the investigator plays an important role in selection of the sample. The units are mainly selected on the basis of judgement, purpose, convenience or quota and are non-random samples.

5. SAMPLING AND NON-SAMPLING ERRORS

Sampling Errors

The purpose of the sample is to obtain an estimate for the population. Sampling error refers to the difference between the sample estimate and the actual value of a characteristic of the population (that may be the average income, etc.). It is the error that occurs when you make an observation from the sample taken from the population. Thus, the difference between the actual value of a parameter of the population (which is not known) and its estimate (from the sample) is the sampling error. It is possible to reduce the magnitude of sampling error by taking a larger sample.

Example
Consider a case of incomes of 5 farmers of Manipur. The variable x (income of farmers) has measurements 500, 550, 600, 650, 700. We note that the population average is (500 + 550 + 600 + 650 + 700) ÷ 5 = 3000 ÷ 5 = 600.
Now, suppose we select a sample of two individuals where x has measurements of 500 and 600. The sample average is (500 + 600) ÷ 2 = 1100 ÷ 2 = 550.
Here, the sampling error of the estimate = 600 (true value) – 550 (estimate) = 50.

Non-Sampling Errors

Non-sampling errors are more serious than sampling errors because a sampling error can be minimised by taking a larger sample. It is difficult to minimise non-sampling error, even by taking a large sample. Even a Census can contain non-sampling errors. Some of the non-sampling errors are:
Errors in Data Acquisition

This type of error arises from recording of incorrect responses. Suppose the teacher asks the students to measure the length of the teacher’s table in the classroom. The measurements by the students may differ. The differences may occur due to differences in measuring tape, carelessness of the students etc. Similarly, suppose we want to collect data on prices of oranges. We know that prices vary from shop to shop and from market to market. Prices also vary according to the quality. Therefore, we can only consider the average prices. Recording mistakes can also take place as the enumerators or the respondents may commit errors in recording or transcribing the data, for example, he/she may record 13 instead of 31.

Non-Response Errors

Non-response occurs if an interviewer is unable to contact a person listed in the sample or a person from the sample refuses to respond. In this case, the sample observation may not be representative.

Sampling Bias

Sampling bias occurs when the sampling plan is such that some members of the target population could not possibly be included in the sample.

6. CENSUS OF INDIA AND NSSO

There are some agencies both at the national and state level, which collect, process and tabulate the statistical data. Some of the major agencies at the national level are Census of India, National Sample Survey Organisation (NSSO), Central Statistical Organisation (CSO), Registrar General of India (RGI), Directorate General of Commercial Intelligence and Statistics (DGCIS), Labour Bureau etc.

The Census of India provides the most complete and continuous demographic record of population. The Census is being regularly conducted every ten years since 1881. The first Census after Independence was held in 1951. The Census collects information on various aspects of population such as the size, density, sex ratio, literacy, migration, rural-urban distribution etc. Census in India is not merely a statistical operation; the data is interpreted and analysed in an interesting manner.

The NSSO was established by the government of India to conduct nation-wide surveys on socio-economic issues. The NSSO does continuous surveys in successive rounds. The data collected by NSSO surveys, on different socio-economic subjects, are released through reports and its quarterly journal Sarvekshana. NSSO provides periodic estimates of literacy, school enrolment, utilisation of educational services, employment, unemployment, manufacturing and service sector enterprises, morbidity, maternity, child care, utilisation of the public distribution system etc. The NSS 59th round survey (January–December
2003) was on land and livestock holdings, debt and investment. The NSS 60th round survey (January–June 2004) was on morbidity and health care. The NSSO also undertakes the fieldwork of Annual survey of industries, conducts crop estimation surveys, collects rural and urban retail prices for compilation of consumer price index numbers.

7. CONCLUSION

Economic facts, expressed in terms of numbers, are called data. The purpose of data collection is to understand, explain and analyse a problem and causes behind it. Primary data is obtained by conducting a survey. Survey includes various steps, which need to be planned carefully. There are various agencies which collect, process, tabulate and publish statistical data. These can be used as secondary data. However, the choice of source of data and mode of data collection depends on the objective of the study.

Recap
• Data is a tool which helps in reaching a sound conclusion on any problem by providing information.
• Primary data is based on first hand information.
• Survey can be done by personal interviews, mailing questionnaires and telephone interviews.
• Census covers every individual/unit belonging to the population.
• Sample is a smaller group selected from the population from which the relevant information would be sought.
• In a random sampling, every individual is given an equal chance of being selected for providing information.
• Sampling error is the difference between the actual value of a population parameter and its estimate from the sample.
• Non-sampling errors can arise in data acquisition, by non-response or by bias in selection.
• Census of India and National Sample Survey Organisation are two important agencies at the national level, which collect, process and tabulate data.
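
The sampling error example from Section 5 (five farmers with incomes 500, 550, 600, 650 and 700) can be checked with a short Python sketch. The first part repeats the arithmetic of the chapter. The helper largest_error is an illustrative addition and not part of the chapter: it looks at every possible sample of a given size and reports the largest gap between the sample average and the population average, which is one simple way, for this tiny population only, to see the error shrink as the sample grows.

```python
from itertools import combinations
from statistics import mean

# Population from the chapter's example: incomes of 5 farmers.
incomes = [500, 550, 600, 650, 700]
population_mean = mean(incomes)

# The sample used in the chapter: two farmers with incomes 500 and 600.
sample = [500, 600]
sample_mean = mean(sample)

# Sampling error = true value minus estimate.
sampling_error = population_mean - sample_mean
print(population_mean, sample_mean, sampling_error)


def largest_error(sample_size):
    """Largest absolute sampling error over all possible samples of this size."""
    return max(
        abs(population_mean - mean(s))
        for s in combinations(incomes, sample_size)
    )


for size in (2, 3, 4, 5):
    print(size, largest_error(size))
```

A sample of all five farmers is a census of this population, so its sampling error is zero. Non-sampling errors, such as recording 13 instead of 31, would remain even then.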

EXERCISES

1. Frame at least four appropriate multiple-choice options for following questions:
(i) Which of the following is the most important when you buy a new dress?
(ii) How often do you use computers?
(iii) Which of the newspapers do you read regularly?
(iv) Rise in the price of petrol is justified.
(v) What is the monthly income of your family?
2. Frame five two-way questions (with ‘Yes’ or ‘No’).
3. (i) There are many sources of data (true/false).
(ii) Telephone survey is the most suitable method of collecting data, when the population is literate and spread over a large area (true/false).
(iii) Data collected by investigator is called the secondary data (true/false).
(iv) There is a certain bias involved in the non-random selection of samples (true/false).
(v) Non-sampling errors can be minimised by taking large samples (true/false).
4. What do you think about the following questions. Do you find any problem with these questions? If yes, how?
(i) How far do you live from the closest market?
(ii) If plastic bags are only 5 percent of our garbage, should it be banned?
(iii) Wouldn’t you be opposed to increase in price of petrol?
(iv) (a) Do you agree with the use of chemical fertilizers?
(b) Do you use fertilizers in your fields?
(c) What is the yield per hectare in your field?
5. You want to research on the popularity of Vegetable Atta Noodles among children. Design a suitable questionnaire for collecting this information.
6. In a village of 200 farms, a study was conducted to find the cropping pattern. Out of the 50 farms surveyed, 50% grew only wheat. Identify the population and the sample here.
7. Give two examples each of sample, population and variable.
8. Which of the following methods give better results and why?
(a) Census (b) Sample
9. Which of the following errors is more serious and why?
(a) Sampling error (b) Non-Sampling error
10. Suppose there are 10 students in your class. You want to select three out of them. How many samples are possible?
11. Discuss how you would use the lottery method to select 3 students out of 10 in your class?
12. Does the lottery method always give you a random sample? Explain.
13. Explain the procedure of selecting a random sample of 3 students out of 10 in your class, by using random number tables.
14. Do samples provide better results than surveys? Give reasons for your answer.