QUE No 1

Many people, of all ages and education levels, think that reading an article or

book, taking notes, and writing an essay on their notes is research. Research is not

information gathering. It is not the transportation of facts. It is not rummaging to

find out something new, and it certainly is not used as an advertising mechanism.

According to the educational writers Leedy and Ormrod, research is “a systematic

process of collecting, analyzing, and interpreting information (data) in order to

increase our understanding of the phenomenon about which we are interested or

concerned.” A hypothesis offers a possible answer to a question or problem.

This question or problem can then be separated into more manageable

“subproblems”. For instance, let’s say the question is “What should one wear to

school today?” The next question is “what is the weather supposed to be like

today?” The question is easily answered, and will lead to a few hypotheses, like

“shorts and a tee-shirt” or “jeans and a sweater”.


Research is the systematic process of collecting and analyzing information to

increase our understanding of the phenomenon under study. It is the function of

the researcher to contribute to the understanding of the phenomenon and to

communicate that understanding to others.

The Steps of the Research Process

The following seven steps outline a simple and effective strategy for finding

information for a research paper and documenting the sources you find.

Depending on your topic and your familiarity with the library, you may need to

rearrange or recycle these steps. Adapt this outline to your needs. We are ready to

help you at every step in your research.


SUMMARY: State your topic as a question. For example, if you are interested in

finding out about use of alcoholic beverages by college students, you might pose

the question, "What effect does use of alcoholic beverages have on the health of

college students?" Identify the main concepts or keywords in your question.


SUMMARY: Look up your keywords in the indexes to subject encyclopedias.

Read articles in these encyclopedias to set the context for your research. Note any

relevant items in the bibliographies at the end of the encyclopedia articles.

Additional background information may be found in your lecture notes, textbooks,

and reserve readings.


SUMMARY: Use guided keyword searching to find materials by topic or subject.

Print or write down the citation (author, title, etc.) and the location information

(call number and library). Note the circulation status. When you pull the book

from the shelf, scan the bibliography for additional sources. Watch for

book-length bibliographies and annual reviews on your subject; they list citations to

hundreds of books and articles in one subject area.


SUMMARY: Use periodical indexes and abstracts to find citations to articles. The

indexes and abstracts may be in print or computer-based formats or both. Choose

the indexes and format best suited to your particular topic; ask at the reference

desk if you need help figuring out which index and format will be best. You can

find periodical articles by the article author, title, or keyword by using the online

databases found on the NMSU-Carlsbad Library Gateway.

If the full text is not linked in the index you are using, write down the citation

from the index and search for the title of the periodical in the EZ Journal Portal

found on the NMSU-Carlsbad Library Gateway.


SUMMARY: Use search engines, but evaluate your findings. Evaluating Web

pages requires two actions:

• be suspicious
• think critically about every page you find


SUMMARY: See How to Critically Analyze Information Sources and

Distinguishing Scholarly from Non-Scholarly Periodicals: A Checklist of Criteria

for suggestions on evaluating the authority and quality of the books and articles

you located. If you have found too many or too few sources, you may need to
narrow or broaden your topic. Check with a reference librarian or your instructor.


Give credit where credit is due; cite your sources.

Citing or documenting the sources used in your research serves two purposes: it

gives proper credit to the authors of the materials used, and it allows those who are

reading your work to duplicate your research and locate the sources that you have

listed as references.

Knowingly representing the work of others as your own is plagiarism. (See

NMSU’s Code of Academic Integrity). Use one of the styles listed below or

another style approved by your instructor. Handouts summarizing the APA and

MLA styles as well as sample pages of a research paper are available online and

they are useful for the beginning researcher to view and study.


Most students and beginning researchers do not fully understand what a research

proposal means, nor do they understand its importance. To put it bluntly, one's

research is only as good as one's proposal. An ill-conceived proposal dooms the

project even if it somehow gets through the Thesis Supervisory Committee. A

high quality proposal, on the other hand, not only promises success for the project,

but also impresses your Thesis Committee about your potential as a researcher.

A research proposal is intended to convince others that you have a worthwhile

research project and that you have the competence and the work-plan to complete

it. Generally, a research proposal should contain all the key elements involved in

the research process and include sufficient information for the readers to evaluate

the proposed study.

Regardless of your research area and the methodology you choose, all research

proposals must address the following questions: What you plan to accomplish,

why you want to do it and how you are going to do it.

The proposal should have sufficient information to convince your readers that you

have an important research idea, that you have a good grasp of the relevant

literature and the major issues, and that your methodology is sound.

The quality of your research proposal depends not only on the quality of your

proposed project, but also on the quality of your proposal writing. A good research

project may run the risk of rejection simply because the proposal is poorly written.
Therefore, it pays if your writing is coherent, clear and compelling.

Research can be defined as the search for knowledge or any systematic

investigation to establish facts. The primary purpose for applied research (as

opposed to basic research) is the discovery, interpretation, and development of

methods and systems for the advancement of human knowledge on a wide variety

of scientific matters of our world and the universe. Research can use the scientific

method but need not do so.

Types of research

Market research is the collection of information or data to better understand what

is happening in the market place. A firm's marketing department needs to know

about economic trends, as well as consumers' views. Based on this information,

they can put together a marketing plan, which will meet their own needs as well

as those of their consumers.

There are two general types of research:

• Exploratory research
• Conclusive research

Exploratory research

Many times a decision maker is grappling with broad and poorly defined

problems. Attempts to secure better definitions by analytic thinking may be the

wrong approach and may even be counterproductive, in the

sense that this approach may lead to a definitive answer to the wrong question.

Exploratory research uses a less formal approach. It pursues several possibilities

simultaneously, and in a sense it is not quite sure of its objective. Exploratory

research is designed to provide a background, to familiarize, and, as the word

implies, just to explore the general subject. A part of exploratory research is the

investigation of relationships among variables without knowing why they are

studied. It borders on an idle curiosity approach, differing from it only in that the

investigator thinks there may be a payoff in application somewhere in the forest

of questions. Three typical approaches in exploratory research are:

1 The literature survey,

2 The experience survey,

3 The analysis of insight stimulating examples.

The literature search is a fast, economical way for researchers to develop a

better understanding of a problem area in which they have limited experience. It

also familiarizes them with past research results, data sources, and the type of

data available.

The experience survey concentrates on persons who are particularly

knowledgeable in the particular area. Representative samples are not desired. A

covering of widely divergent views is better. Researchers are not looking for

conclusions; they are looking for ideas.

The analysis of specific examples is a sort of case study approach, but again

researchers are looking for fresh and possibly divergent views.

Conclusive research

Exploratory research gives rise to several hypotheses which will have to be

tested for drawing definite conclusions. These conclusions when tested for

validity lay the structure for decision making. Conclusive research is used for this

purpose of testing the hypotheses generated by exploratory research.

Conclusive research can be classified as either descriptive or experimental.

(a) Descriptive Research

Descriptive research, as the name suggests, is designed to describe something,

for example, the characteristics of users of a given product, the degree to which

product use varies with income, age, sex or other characteristics, or the number

who saw a specific television commercial. To be of maximum benefit, a descriptive

study must collect data for a definite purpose. Descriptive studies vary in the degree

to which a specific hypothesis is the guide. They allow both implicit and explicit

hypotheses to be tested, depending on the research problem. For example, a

cereal company may find its sales declining. On the basis of market feedback, the

company may hypothesize that teenage children do not eat its cereal for breakfast. A

descriptive study can then be designed to test this hypothesis.
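
As a rough illustration of how such a descriptive study might summarize its data, the sketch below tabulates the share of respondents in each age group who eat the cereal for breakfast. The survey records, the group labels and the function name share_by_group are all made up for this example; a real study would use its own sampling plan and variables.

```python
from collections import defaultdict

# Hypothetical survey responses: (age_group, eats_cereal_for_breakfast).
responses = [
    ("teen", False), ("teen", False), ("teen", True), ("teen", False),
    ("adult", True), ("adult", True), ("adult", False), ("adult", True),
]

def share_by_group(records):
    """Return, for each group, the share of respondents who answered True."""
    counts = defaultdict(lambda: [0, 0])  # group -> [yes answers, total answers]
    for group, answer in records:
        counts[group][0] += int(answer)
        counts[group][1] += 1
    return {group: yes / total for group, (yes, total) in counts.items()}

shares = share_by_group(responses)
for group, share in shares.items():
    print(f"{group}: {share:.0%} eat the cereal for breakfast")
```

A clearly lower share among teenagers in the real data would be consistent with the company's hypothesis, although a descriptive table of this kind describes the pattern without explaining its cause.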

(b) Experimental Research

Experimentation refers to the process of research in which one or more

variables are manipulated under conditions which permit the collection of data

which show the effects. Experiments create an artificial situation so that the

researcher can obtain the particular data needed and can measure the data

accurately. Experiments are artificial in the sense that the situations are usually

created for testing purposes. This artificiality is the essence of the experimental

method, since it gives researchers more control over the factors they are

studying. If they can control the factors which are present in a given situation,

they can obtain more conclusive evidence of cause and effect relationships

between any two of them. Thus the ability to set up a situation for the express

purpose of observing and recording accurately the effect on one factor when

another is deliberately changed permits researchers to accept or reject a

hypothesis beyond reasonable doubt. If the objective is to validate in a

resounding manner the cause and effect relationship among variables, then

undoubtedly experiments are much more effective than descriptive techniques.

QUE.NO 3 a


An hypothesis is a specific statement of prediction. It describes in concrete (rather

than theoretical) terms what you expect will happen in your study. Not all studies

have hypotheses. Sometimes a study is designed to be exploratory. There is no

formal hypothesis, and perhaps the purpose of the study is to explore some area

more thoroughly in order to develop some specific hypothesis or prediction that

can be tested in future research. A single study may have one or many hypotheses.

Actually, whenever I talk about an hypothesis, I am really thinking simultaneously

about two hypotheses. Let's say that you predict that there will be a relationship

between two variables in your study. The way we would formally set up the

hypothesis test is to formulate two hypothesis statements, one that describes your

prediction and one that describes all the other possible outcomes with respect to

the hypothesized relationship. Your prediction is that variable A and variable B

will be related (you don't care whether it's a positive or negative relationship).

Then the only other possible outcome would be that variable A and variable B are

not related. Usually, we call the hypothesis that you support (your prediction) the

alternative hypothesis, and we call the hypothesis that describes the remaining

possible outcomes the null hypothesis. Sometimes we use a notation like HA or H1

to represent the alternative hypothesis or your prediction, and HO or H0 to

represent the null case. You have to be careful here, though. In some studies, your

prediction might very well be that there will be no difference or change. In this
case, you are essentially trying to find support for the null hypothesis and you are

opposed to the alternative.

If your prediction specifies a direction, and the null therefore is the no difference

prediction and the prediction of the opposite direction, we call this a one-tailed

hypothesis. For instance, let's imagine that you are investigating the effects of a

new employee training program and that you believe one of the outcomes will be

that there will be less employee absenteeism. Your two hypotheses might be stated

something like this:

The null hypothesis for this study is:

HO: As a result of the XYZ company employee training program, there will either

be no significant difference in employee absenteeism or there will be a significant

increase.

which is tested against the alternative hypothesis:

HA: As a result of the XYZ company employee training program, there will be a

significant decrease in employee absenteeism.
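
To make the one-tailed logic concrete, here is a minimal sketch of an exact one-tailed permutation test. The absenteeism figures are invented, the function name one_tailed_permutation_p is hypothetical, and a permutation test is only one of several ways such a hypothesis could be tested; the text above does not prescribe a particular test.

```python
from itertools import combinations

def one_tailed_permutation_p(treated, control):
    """Exact one-tailed permutation test for 'treated is lower than control'.

    Under the null hypothesis the group labels are interchangeable, so every
    way of picking len(treated) values from the pooled data is equally likely.
    The p-value is the share of those relabelings whose treated-group total is
    at most the observed total. With fixed group sizes, a lower total is the
    same as a lower difference in means.
    """
    pooled = list(treated) + list(control)
    observed_total = sum(treated)
    as_extreme = 0
    relabelings = 0
    for group in combinations(pooled, len(treated)):
        relabelings += 1
        if sum(group) <= observed_total:
            as_extreme += 1
    return as_extreme / relabelings

# Hypothetical days absent per employee over one quarter (made-up numbers).
trained = [2, 3, 1, 2, 3, 2]
untrained = [4, 5, 3, 6, 4, 5]

p_value = one_tailed_permutation_p(trained, untrained)
print(f"one-tailed p-value: {p_value:.4f}")
```

Only outcomes in the predicted direction (less absenteeism among trained employees) count as evidence against the null hypothesis. With these made-up numbers the p-value is well below 0.05, so the null hypothesis would be rejected in favor of the alternative.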

Assumption Introduction

Research is built upon assumptions since not everything needed to move forward

is known. "One must assume something to learn something." "The more

assumptions or the stronger assumptions that one makes, the more one insures that

her analysis will yield clear-cut and interpretable results; at the same time, the

researcher, more than the empirical observations or records, is determining these


All research is built upon assumptions. We are limited in what we can test at

one time. Some variables may not be measurable until later.


An assumption is a realistic expectation. It is something that we believe to be true.

However, no adequate evidence exists to support this belief. An assumption is an

act of faith. It will not be tested in your research. If critics can dismiss your

assumptions, then your research is not likely to be taken seriously. Thus,

assumptions must be identified and considered with care.


Assumptions listed in research papers are often a good source of research topics.

Best Practice

Under the tradition/convention used by researchers, the researcher is responsible

for informing the reader of important assumptions made. If the reader cannot

accept these assumptions as reasonable, there is little point in reading the rest of

the research report.

Only important assumptions are identified and listed. For example, you would

not list as an assumption that your statistical software is accurate. Ordinarily,

assumptions are simply listed unless they are especially important,

controversial, or unusual. In that case, they may be discussed.


In a citation analysis study, you assume that the citation is evidence of use and that

use is reflected in the citation. In a user study, you might assume that respondents

are truthful and knowledgeable. Or is this so obvious and reasonable that you

don't state it?


One of the important strategies employed in this course is

the utilization of scientific propositions. A Proposition is a logical

statement of relationship between two or more variables which has,

generally, been confirmed by empirical research. (A proposition should be

distinguished from a hypothesis which is a logical statement of an assumed

relationship between two or more variables which must be empirically

tested, replicated and elaborated before being accepted as confirmed).

Throughout this course, particularly in the unit dealing with

theoretical framework, general principles regarding racial, ethnic and

minority group relations will be presented in the form of propositions.

These will serve to organize our thinking about complex relationships into

brief, cogent and manageable statements.

Proposition #1

Historically, when two or more racial (and/or ethnic) groups establish

and maintain contact over time within the framework of an imperatively

coordinated society, one group ultimately comes to dominate the other(s).

Proposition #2

Historically, the greater the power differential between dominant and

subordinate groups within imperatively coordinated societies, the greater

the degree of racial discrimination, domination and subjugation.

Proposition #3
The type of racial domination emerging out of the migration/ contact

setting generally leads to a racial stratification system that is highly

resistant to change and persists long after the conditions which produced

it have ceased to exist.

Proposition #4

The greater the perceived racial and/or cultural differences between

the host society (generally the dominant group) and the immigrant groups

(generally subordinate), the greater the difficulties immigrants will

experience in their attempts to assimilate into the host society.

Proposition #5

When a society's "moral boundaries" and collective self-concept are

being threatened, whether from external or internal sources and whether the

threat is real or imaginary, and particularly when the nature and origins

of the threat are not clearly understood, members of society are likely to

react in an irrational manner (K. Erikson).

Proposition #6

In paternalistic racial systems, resistance usually takes the form of

coercive resource mobilization directed against dominant group members in

general; in competitive systems, resistance (either coercion, inducement or

persuasion) is more likely to be directed against dominant group authority

structures, i.e., administrative branches of government.

Proposition #7

As race relations approach a fluid competitive state, beliefs

regarding the biological and/or cultural inferiority of the subordinate

group are seriously undermined and their debilitating effects upon

subordinate group members are significantly reduced (William J. Wilson).

Proposition #8

Racism may serve to heighten or diminish overt conflict depending on

critical situational variables. And, the reverse is true: conflict may

heighten or diminish racism in society.

Proposition #9

Cultural racism generally precedes biological racism in the historical

sequence and arises out of the shift from paternalistic to competitive race


Proposition #10

During periods of general optimism regarding meaningful assimilation,

minority group leaders will tend to promote integration; in times of

disillusionment and despair, they will tend to promote separatism.

Proposition #11

If an extended period of increased expectations and gratifications is

followed by a brief period during which the gap between expectations and

gratifications suddenly and dramatically widens, the probability of violent

protest increases significantly.

Q.No 3 b


Research forms a cycle. It starts with a problem and ends with a solution to the

problem. The problem statement is therefore the axis which the whole research

revolves around, because it explains in short the aim of the research.


A research problem is the situation that causes the researcher to feel apprehensive,

confused and ill at ease. It is the demarcation of a problem area within a certain

context involving the WHO or WHAT, the WHERE, the WHEN and the WHY of

the problem situation.

There are many problem situations that may give rise to research. Three sources

usually contribute to problem identification. One's own experience or the experience of

others may be a source of problem supply. A second source could be scientific

literature. You may read about certain findings and notice that a certain field was

not covered. This could lead to a research problem. Theories could be a third

source. Shortcomings in theories could be researched.

Research can thus be aimed at clarifying or substantiating an existing theory, at

clarifying contradictory findings, at correcting a faulty methodology, at correcting

the inadequate or unsuitable use of statistical techniques, at reconciling conflicting

opinions, or at solving existing practical problems.


The prospective researcher should think about what caused the need to do the

research (problem identification). The question that he/she should ask is: Are

there questions about this problem to which answers have not been found up to the present?


Research originates from a need that arises. A clear distinction between the

PROBLEM and the PURPOSE should be made. The problem is the aspect the

researcher worries about, thinks about, and wants to find a solution for. The purpose is

to solve the problem, i.e., find answers to the question(s). If there is no clear

problem formulation, the purpose and methods are meaningless.

Keep the following in mind:

Outline the general context of the problem area.

Highlight key theories, concepts and ideas current in this area.

What appear to be some of the underlying assumptions of this area?

Why are these issues identified important?

What needs to be solved?

Read round the area (subject) to get to know the background and to identify

unanswered questions or controversies, and/or to identify the most significant

issues for further exploration.

The research problem should be stated in such a way that it would lead to

analytical thinking on the part of the researcher with the aim of reaching possible

solutions to the stated problem. Research problems can be stated in

the form of either questions or statements.

The research problem should always be formulated in grammatically correct language and as

completely as possible. You should bear in mind the wording (expressions) you

use. Avoid meaningless words. There should be no doubt in the mind of the

reader what your intentions are.

Demarcating the research field into manageable parts by dividing the main

problem into sub problems is of the utmost importance.


Sub problems are problems related to the main problem identified. Sub problems

flow from the main problem and make up the main problem. They are the means to

reach the set goal in a manageable way and contribute to solving the problem.


The statement of the problem involves the demarcation and formulation of the

problem, i.e., the WHO/WHAT, WHERE, WHEN, WHY. It usually includes the

statement of the hypothesis.



1 Is the problem of current interest? Will the research results have social, educational or scientific value?
2 Will it be possible to apply the results in practice?
3 Does the research contribute to the science of education?
4 Will the research open up new problems and lead to further research?
5 Is the research problem important? Will you be proud of the result?
6 Is there enough scope left within the area of research (field of
7 Can you find an answer to the problem through research? Will you be able to handle the research problem?
8 Will it be practically possible to undertake the research?
9 Will it be possible for another researcher to repeat the research?
10 Is the research free of any ethical problems and limitations?
11 Will it have any value?
12 Do you have the necessary knowledge and skills to do the research? Are you qualified to undertake the research?
13 Is the problem important to you and are you motivated to undertake the research?
14 Is the research viable in your situation? Do you have enough time and energy to complete the project?
15 Do you have the necessary funds for the research?
16 Will you be able to complete the project within the time available?
17 Do you have access to the administrative, statistical and computer facilities the research necessitates?
Que.No 4 a

What is qualitative research?

Qualitative research seeks out the ‘why’, not the ‘how’ of its topic through the

analysis of unstructured information – things like interview transcripts, emails,

notes, feedback forms, photos and videos. It doesn’t just rely on statistics or

numbers, which are the domain of quantitative researchers.

Qualitative research is used to gain insight into people's attitudes, behaviours,

value systems, concerns, motivations, aspirations, culture or lifestyles. It’s used to

inform business decisions, policy formation, communication and research. Focus

groups, in-depth interviews, content analysis, ethnography, evaluation and

semiotics are among the many formal approaches that are used, but qualitative

research also involves the analysis of any unstructured material, including

customer feedback forms, reports or media clips.

Collecting and analyzing this unstructured information can be messy and time

consuming using manual methods. When faced with volumes of materials, finding

themes and extracting meaning can be a daunting task.

Qualitative research is a method of inquiry employed in many different

academic disciplines, traditionally in the social sciences, but also in market

research and further contexts.[1] Qualitative researchers aim to gather an in-depth

understanding of human behavior and the reasons that govern such behavior. The

qualitative method investigates the why and how of decision making, not just
what, where, when. Hence, smaller but focused samples are more often needed,

rather than large samples.

Qualitative methods produce information only on the particular cases studied, and

any more general conclusions are only hypotheses (informative guesses).

Quantitative methods can be used to verify which of these hypotheses are true.


In the social sciences, quantitative research refers to the systematic empirical

investigation of quantitative properties and phenomena and their relationships. The

objective of quantitative research is to develop and employ mathematical models,

theories and/or hypotheses pertaining to phenomena. The process of measurement

is central to quantitative research because it provides the fundamental connection

between empirical observation and mathematical expression of quantitative

relationships.

Quantitative research is used widely in social sciences such as sociology,

anthropology, and political science. Research in mathematical sciences such as

physics is also 'quantitative' by definition, though this use of the term differs in

context. In the social sciences, the term relates to empirical methods, originating in

both philosophical positivism and the history of statistics, which contrast with

qualitative research methods.

Examples of quantitative research

Research that consists of the percentage amounts of all the elements that make up

Earth's atmosphere.

Survey that concludes that the average patient has to wait two hours in the waiting

room of a certain doctor before being selected.

An experiment in which group x was given two tablets of Aspirin a day and group y

was given two tablets of a placebo a day where each participant is randomly

assigned to one or other of the groups.

The numerical factors such as two tablets, percent of elements and the time of

waiting make the situations and results quantitative.
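
The random assignment step in the Aspirin and placebo example can be sketched in a few lines. This is only an illustration under stated assumptions: the participant labels, the group size of ten each, the seed value and the function name assign_groups are all hypothetical.

```python
import random

def assign_groups(participants, seed=None):
    """Randomly split participants into two groups of (nearly) equal size."""
    rng = random.Random(seed)      # a seed makes the assignment reproducible
    shuffled = list(participants)  # copy so the caller's list is left untouched
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    return shuffled[:half], shuffled[half:]

# Hypothetical participant labels P01 to P20.
participants = [f"P{i:02d}" for i in range(1, 21)]
aspirin_group, placebo_group = assign_groups(participants, seed=42)

print("Aspirin group:", sorted(aspirin_group))
print("Placebo group:", sorted(placebo_group))
```

Because chance alone decides who lands in which group, differences between the groups that exist before treatment tend to balance out, which is what lets the later numerical comparison be attributed to the treatment.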

Difference between Qualitative & Quantitative research

Quantitative research is research which collects measurements that can be

analyzed mathematically. Generally speaking, this means data which has (or can

be interpreted as having) an interval or ratio scale of measurement. Quantitative

research can produce statistical results that are powerful - meaning that they can

make very fine distinctions between tested groups - but sometimes suffer from

over-specificity, producing results which (however accurate) are trivial, or

difficult to interpret in meaningful terms.

Qualitative research takes measurements that are difficult to analyze

mathematically. Statistical techniques for qualitative research are far less powerful

than quantitative techniques, but qualitative data usually contains much more

information, and is often easier to interpret and more meaningful to real-world

situations than quantitative results.

Example: if a researcher has a question about a nation's satisfaction with its

government, she can create a survey to produce a quantitative measure of citizen

satisfaction, or she can conduct a series of interviews (e.g., a small set of questions

which individuals can answer at length, usually with the researcher trying to dig

deeper into issues that arise) to produce a qualitative measure. The first will give an

overview of citizen opinions without much detail, but one which can be run through

statistical tests; the second will give a rich and precise understanding of what

citizens like and dislike about their nation, but will be far more difficult to analyze.

Qualitative | Quantitative
Methods include focus groups, in-depth interviews, and reviews of documents for types of themes | Surveys, structured interviews & observations, and reviews of records or documents for numeric information
Primarily inductive process used to formulate theory or hypotheses | Primarily deductive process used to test pre-specified concepts, constructs, and hypotheses that make up a theory
More subjective: describes a problem or condition from the point of view of those experiencing it | More objective: provides observed effects (interpreted by researchers) of a program on a problem or condition
Text-based | Number-based
More in-depth information on a few cases | Less in-depth but more breadth of information across a large number of cases
Unstructured or semi-structured response options | Fixed response options
No statistical tests | Statistical tests are used for analysis
Can be valid and reliable: largely depends on skill and rigor of the researcher | Can be valid and reliable: largely depends on the measurement device or instrument used
Time expenditure lighter on the planning end and heavier during the analysis phase | Time expenditure heavier on the planning phase and lighter on the analysis phase
Less generalizable | More generalizable

Que.NO 5

What Are Secondary Data?

In the fields of epidemiology and public health, the distinction between

primary and secondary data depends on the relationship between the

person or research team who collected a data set and the person who

is analyzing it. This is an important concept because the same data set

could be primary data in one analysis and secondary data in another.

If the data set in question was collected by the researcher (or a team of

which the researcher is a part) for the specific purpose or analysis under

consideration, it is primary data. If it was collected by someone else for

some other purpose, it is secondary data. Of course, there will always

be cases in which this distinction is less clear, but it may be useful to

conceptualize primary and secondary data by considering two extreme

cases. In the first, which is an example of primary data, a research team

conceives of and develops a research project, collects data designed to

address specific questions posed by the project, and performs and publishes

their own analyses of the data they have collected. In this case, the

people involved in analyzing the data have some involvement in, or at

least familiarity with, the research design and data collection process, and

the data were collected to answer the questions examined in the analysis.

In the second case, which is an example of secondary data, a researcher

poses questions that are addressed through analysis of data from the

Behavioral Risk Factor Surveillance System (BRFSS), a data set collected

annually in the United States through cooperation of the Centers

for Disease Control and Prevention and state health departments. In this

case, the person performing the analysis did not participate in either the

2 An Introduction to Secondary Data Analysis

research design or data collection process, and the data were not collected

to answer specific research questions.

As an example of the same data set serving as both primary and

secondary data, consider the increasingly common practice of one

researcher performing an analysis of data collected by a research team

with whom he or she has no connection. This type of analysis is facilitated

by the ease of sharing data stored electronically and the concomitant

creation of electronic data archives that allow access to secondary

users; some of these archives are discussed in Chapter 7. Such analyses

may serve a variety of purposes, such as addressing questions not considered

in the original analysis or examining how a different analytic

approach might change the conclusions reached from the first analysis.

In either case, the same data set serves as primary data for the original

research team and secondary data for the researcher performing the later analysis.


This book deals primarily with secondary data in the sense of data sets

that can be obtained and analyzed in detail by the individual researcher.

There is another type of secondary data, again not mutually exclusive

with the first, meaning statistical information about some geographic

region or other entity. This type of information is often useful to

researchers: when you place your research project in context by describing

the racial makeup or median house value in the metropolitan area

where you conduct your research, the data used to compute those statistics

were probably secondary data. Often these statistics are computed

on data collected by the federal government, and Chapter 7 discusses

several websites that were created specifically to permit easy access to

these types of statistics. In addition, many of the data sets described in

this book are accessible through an online interface that allows the quick

computation of basic statistics, without requiring the user to download

data and use a statistical program to analyze it. The availability of such

interfaces has been noted in the sections pertaining to each data set.

Most of the data sets discussed in this volume contain either data collected

through surveys or censuses, such as the National Health Interview

Survey and the U.S. Census, or administrative records such as the medical

claims records submitted to the Medicare system. There are other types

of secondary data, including diaries, video recordings, and transcripts of

interviews and focus groups: some of these are included in sources discussed

in Chapter 7. Data such as interview transcripts are often analyzed

using qualitative data methods rather than the quantitative techniques

appropriate for most of the data sets discussed in this volume. Secondary

analysis of qualitative data is a topic unto itself and is not discussed in

this volume. The interested reader is referred to references such as James

and Sorenson (2000) and Heaton (2004).

Advantages and Disadvantages of Secondary Data Analysis

The choice of primary or secondary data need not be an either/or question.

Most researchers in epidemiology and public health will work with

both types of data in the course of their careers, and many research

projects incorporate both types of data. A more useful approach to this

question is to focus on selecting data that are appropriate to the research

question being studied and the resources available to the researcher; the

latter include time, money, and personal expertise. In this spirit, we offer

a summary of the major advantages and disadvantages of working with

secondary, as opposed to primary, data.

The first major advantage of working with secondary data is economy:

because someone else has already collected the data, the researcher does

not have to devote resources to this phase of research. Even if the secondary

data set must be purchased, the cost is almost certainly lower

than the expense of salaries, transportation, and so forth that would be

required to collect and process a similar data set from scratch. There is also

a savings of time. Because the data are already collected, and frequently

also cleaned and stored in electronic format, the researcher can spend

the bulk of his or her time analyzing the data. There is also the influence of

preference: secondary data analysis is an ideal focus for researchers who

prefer to spend their working hours thinking of and testing hypotheses

using existing data sets, rather than writing grants to finance the data

collection process and supervising student interviewers and data entry clerks.

The second major advantage of using secondary data is the breadth

of data available. Few individual researchers would have the resources to

collect data from a representative sample of adults in every state in the

United States, let alone repeat this data collection process every year, but

the federal government conducts numerous surveys on that scale. Data

collected on a national basis are particularly important in epidemiology

and public health, fields that focus primarily on the health of populations

rather than of individuals. In addition, some of the data sets collect data using a

longitudinal design, and others are designed so certain questions are included

annually or at regular intervals, allowing researchers to examine the changes in

health status and health behaviors in the population over time.

The third advantage in using secondary data is that often the data collection

process is informed by expertise and professionalism that may not be

available to smaller research projects. For instance, many of the federal

health surveys discussed in this volume use a complex sample design and

system of weighting that allows the researcher to compute population-based

estimates of health conditions and behaviors. Although a local

data collection project could conceivably use similar techniques, more

often a convenience sample, whose generalizability is questionable, is

used instead. To take another example, data collection for many federal

data sets is often performed by staff members who specialize in that task
and who may have years of experience working on a particular survey.

This is in contrast to many smaller research projects, in which data are

collected by students working at a part-time, temporary job.

One major disadvantage to using secondary data is inherent in its

nature: because the data were not collected to answer your specific

research questions, particular information that you would like to have

may not have been collected. Or it may not have been collected in the

geographic region you want to study, in the years you would have chosen,

or on the specific population that is the focus of your interest. In any case,

you can only work with the data that exist, not what you wish had been

collected. A related problem is that variables may have been defined or

categorized differently than you would have chosen: for instance, a data

set may have collected age information in categories rather than as a continuous

variable, or race may have been defined as only White/Other. A

third difficulty is that data may have been collected but are not available to

the secondary researcher: for instance, address and phone number information

for survey respondents may have been recorded by the original

research team but will not be released to secondary researchers for confidentiality

reasons. If an analysis incorporating geographic information

was planned, such a restriction might make the data set unusable. For
these reasons, a secondary data set should be examined carefully to confirm

that it includes the necessary data, that the data are defined and

coded in a manner that allows for the desired analysis, and that the

researcher will be allowed to access the data required.
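The point about variable definitions can be made concrete with a minimal Python sketch. The function name and the three age categories are assumptions chosen for illustration and do not come from any particular data set. The sketch shows that an exact age can always be collapsed into a category, while a data set that stored only the category cannot be converted back into exact ages.

```python
def age_group(age):
    """Collapse an exact age into one of three assumed categories."""
    if age < 18:
        return "under 18"
    if age < 65:
        return "18-64"
    return "65 and over"

# Hypothetical exact ages, as a primary researcher might have recorded them.
ages = [17, 18, 40, 64, 65, 80]

# Recoding from continuous to categorical is always possible...
groups = [age_group(a) for a in ages]
print(groups)

# ...but the reverse is not: 18, 40, and 64 all become "18-64", so a
# secondary analyst who receives only the categories cannot recover them.
```

If the planned analysis needs age as a continuous variable, a data set coded like `groups` would have to be ruled out at this stage.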

A second major disadvantage of using secondary data is that because

the analyst did not participate in the planning and execution of the

data collection process, he or she does not know exactly how it was

done. More to the point, the analyst does not know how well it was

done and therefore how seriously the data are affected by problems such

as low response rate or respondent misunderstanding of specific survey

questions. Every data collection effort has its “dirty little secrets” that may

not invalidate the data but should be taken into account by the analyst. If

the analyst was not present during the data collection process, he or she

has to try to find this information through other means. Sometimes it is

readily available; for instance, many of the federal data sets have extensive

documentation of their data collection procedures, refusal rates, and

other technical information available on their websites or in published

reports. However, many other secondary data sets are not accompanied

by this type of information, and the analyst must learn to “read between

the lines” and consider what problems might have been encountered in

the data collection process.

Locating Appropriate Secondary Data

There is a vast quantity of secondary data in epidemiology and public

health that is available to the individual researcher. However, the sheer

quantity of data available, and the fact that the data are collected and

archived by many different governmental and private entities, means

that the process of locating appropriate secondary data is not always

straightforward. In fact, this book was written to ameliorate some of the

difficulties involved. There is no single process to be followed in every

case, but we offer two examples of the process of locating and analyzing

secondary data to address a specific research question or problem.

This section might have been better titled “achieving a fit between

your research question and the data you choose to analyze” because it is

often an iterative process in which a research question is posed, potential

data sets are considered, the question is refined in terms of the data

available, other sources of data are considered, the question is refined

again, and so on. The most typical way to use secondary data for research

is to begin with a research question and seek a data set that will allow

analysis of that question. An alternative method is to begin by selecting

from among the available secondary data sets, and then formulating a

research question that may be answered using the data chosen. Although
the first method conforms more to standard beliefs about how research is

done, the second approach is particularly useful in classroom instruction,

and both methods can produce quality research. If the researcher begins

with a question and then seeks out an appropriate data set, the following

generalized sequence of procedures may be useful:

1. Define the question you want to study; for instance, “How does the

experience of racism affect an individual’s health?”

2. Specify the population you want to study. Are you interested in children,

adults, or people of all ages? What races or ethnicities do you

want to study? Do you want to analyze a national sample or one confined

to a smaller area? What is the range of years you would consider

(e.g., you may only be interested in data collected over the last 5 years)?

3. Specify what other variables you want to include in your analysis. In

this example, you might believe that it was important to have information

about the respondents’ race, Hispanic ethnicity, age, gender,

income, and educational level in order to include those factors in

your analysis. If so, you must confirm that the data you desire are

contained in the data set that you choose and that they are recorded

in a manner that is useful to you. If you are interested in comparing

the experiences of Hispanic Blacks and non-Hispanic Blacks, information

about Hispanic ethnicity would need to be recorded in the

data set independently of information about race.

4. Specify what kind of data is most appropriate for your research

question: for instance, can it best be addressed through a national

survey, examination of hospital claims records, or transcriptions of

interviews? Also, specify if there are any specific data collection techniques

you believe are particularly appropriate or inappropriate for

your question. For instance, if you do not believe people would answer

questions about racism honestly in a personal interview, you would

not consider any data sets collected using that technique. However, if

you believe that a telephone survey would be the best way to collect

this information, you might begin your search by looking at surveys

that used this data collection method.

5. Create a list of data sets that include information related to your

research question and examine them to see if they meet your other

requirements (age range included, year of collection, etc.). This is

where the iterative process begins because you may have to revise

either your question or your data requirements, depending on the

data that are available to you.

6. Once you have chosen your data set, examine the variables you intend

to use in the analysis for problems such as missing data or out-of-range

values. Also, read whatever information you can find about the
data collection process, data cleaning procedures, and so on in order

to evaluate whether the data quality is sufficient to meet your needs.

If so, continue with the analysis; if not, either devise a way to work

around it (e.g., by imputing values for the missing data) or choose

another data set.
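The screening described in step 6 can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the records, the variable names (`age`, `days_active`), and the valid ranges are invented for the example, and in practice the ranges would come from the data set's codebook.

```python
# Hypothetical records; None marks a missing answer.
records = [
    {"age": 34, "days_active": 3},
    {"age": None, "days_active": 7},
    {"age": 212, "days_active": 2},   # implausible age
    {"age": 58, "days_active": 9},    # more than 7 days in a week
]

# Assumed valid ranges; a real analysis would take these from the codebook.
valid_ranges = {"age": (18, 110), "days_active": (0, 7)}

def screen(records, valid_ranges):
    """Count missing and out-of-range values for each variable."""
    report = {}
    for var, (low, high) in valid_ranges.items():
        values = [r.get(var) for r in records]
        missing = sum(v is None for v in values)
        out_of_range = sum(
            v is not None and not (low <= v <= high) for v in values
        )
        report[var] = {"missing": missing, "out_of_range": out_of_range}
    return report

report = screen(records, valid_ranges)
for var, counts in report.items():
    print(var, counts)
```

A report like this gives a quick basis for deciding whether to continue with the data set, work around the problems, or look for another source.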

How do you generate the list mentioned in step five? By any means

necessary, as the saying goes. Consider the data sets described in this

book, search Medline to see what data sets other researchers have used to

address your topic, search the web portals listed in Chapter 7, ask other

researchers for suggestions, query relevant email lists, and so forth.

If you take the approach of beginning with a data set and crafting

a research question that can be addressed using it, the process is similar,

but the order of events is different. In this case, you would begin

by looking at the variables contained in the data set and considering

how you might combine them to create an interesting question. The

process can begin with a germ of an idea, which may reflect your personal

interests or a question that has arisen in your work. For instance,

you might be interested in how disability affects the amount of physical

activity in which a person engages. You then need to operationalize this

question so it may be tested using the variables available in the data set:
how will you define disability, and how will you define physical activity?

At this point, a Medline search for related articles would be in order,

to see how others have addressed similar questions and whether they

have done so with the data set you will be using. This step will help

keep you from reinventing the wheel and will place your research in context.

Alternatively, you can begin by simply looking at the variables included

in the data set to see which of them interest you. For instance, if you were

planning to work with the BRFSS data from 2005, you might notice

that eleven states included questions on weight control procedures. You

would then look at the actual questions asked and confirm that the data

were actually available. This process

should help you refine your focus so you can craft a research question

that can be answered using BRFSS 2005 data and that would add to our

understanding of public health. Because the BRFSS includes racial and

ethnic data, you might decide to look at racial and ethnic differences

in weight control practices. Or, taking advantage of the fact that BRFSS

data are identified by state of residence, you could plan to conduct a

comparison of weight control practices in different states. You could

also plan a multilevel analysis that combined information about state-level

characteristics from the U.S. Census (e.g., racial makeup or poverty

level) with the individual-level data available in the BRFSS. When you
have selected the variables you will include in your analysis, confirm that

they are coded (or can be recoded by you) in a manner that will support

your intended analysis and that there are no major data quality issues

such as large quantities of missing data.
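As a rough illustration of the data preparation behind such a multilevel analysis, the following Python sketch attaches state-level characteristics to individual-level records by state of residence. Everything in it is hypothetical: the state labels, the `poverty_rate` values, the respondent records, and the helper `attach_state_data` are invented for the example and are not taken from the BRFSS or the U.S. Census.

```python
# Hypothetical state-level characteristics (placeholder values).
state_data = {
    "State A": {"poverty_rate": 0.10},
    "State B": {"poverty_rate": 0.20},
}

# Hypothetical individual-level survey records identified by state.
respondents = [
    {"id": 1, "state": "State A", "tried_to_lose_weight": True},
    {"id": 2, "state": "State B", "tried_to_lose_weight": False},
    {"id": 3, "state": "State C", "tried_to_lose_weight": True},
]

def attach_state_data(respondents, state_data):
    """Add state-level fields to each respondent; list ids with no match."""
    merged, unmatched = [], []
    for person in respondents:
        state_row = state_data.get(person["state"])
        if state_row is None:
            unmatched.append(person["id"])
            continue
        merged.append({**person, **state_row})
    return merged, unmatched

merged, unmatched = attach_state_data(respondents, state_data)
print(len(merged), "records merged; unmatched ids:", unmatched)
```

Reporting the unmatched records, rather than silently dropping them, is one way to see how much data is lost when the two sources do not line up.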

Questions to Ask About Any Secondary Data Set

Once you have located a secondary data set that you think is appropriate

for your analysis, you need to learn as much as you can about why and

how it was collected. In particular, you will want to answer the following

three questions:

1. What was the original purpose for which the data were collected?

2. What kind of data is it, and when and how were the data collected?

3. What cleaning and/or recoding procedures have been applied to the data?


Sources for this information include the website of the agency or other

entity responsible for collecting and/or making the data available, published

reports, research articles based on the data, and personal communications

with relevant individuals. For instance, many of the federal

agency websites include one or more contact people who are available to

answer questions about the data collected by that agency, and a Medline
search will often produce citations to reports and articles discussing the

procedures used to collect particular data sets.

The question of determining the original purpose of the project that

produced the data is important because its influence may be present in

other characteristics of the data, from the population targeted to the

specific wording of questions included in a survey. Because you were

not involved in planning phases for the project whose data you will analyze,

you need this information in order to place the data in context.

To take an extreme case, you would certainly want to know if a research

project on the health effects of smoking was sponsored by a tobacco company

or by a nonprofit dedicated to smoking prevention. You would also

like to know if there was any particular philosophy or model of health

behavior that shaped the project: for instance, was a smoking cessation

program structured using the Transtheoretical Model? Knowledge

of the core philosophical beliefs behind a research project can illuminate

the reasons for many choices made in the planning and execution

of the research and will be reflected in the end product, the data you are

proposing to analyze.

It is almost impossible to know too much about the data collection

process because it can influence the quality of the data in many ways,

some of them not obvious. To start with, you need to know when the data
were collected. A data set released in 2004 may have been collected in the

first 3 months of 2004 or over a 4-year period from 2000 to 2003. Second,

you want to know the process by which the data were collected: was it

via telephone interviews, in-person interviews, abstraction of hospital

records, or some other technique? Third, you want to know the details of

the data collection process. Questions in this regard include who actually

did the data collection, how extensive was their training, and how

carefully were they supervised. If the data were collected through chart

review, what specific instructions were given to the reviewers? If the data

were collected through a survey, what was the response rate? How many

efforts were made to collect data from nonresponders? If data were collected

through a telephone survey, how were numbers selected? Was there

any attempt to correct for the bias introduced because households without

a telephone are not a random sample of all households? The issues

of survey data quality are the same whether the data set is primary or

secondary. For a thorough discussion of these issues, consult a reference

such as de Vaus (2002) or Bulmer, Sturgis, and Allum (2006).

The third major question in working with any secondary data set is

what was done to the data after they were collected. For instance, almost

all data sets include some missing data. Were these data left as missing,

or were values imputed, and if so, how was the imputation done? Was
any data cleaning done to remove out-of-range values, and were those

cases assigned missing values or was some other procedure followed?

Were certain combinations of answers considered invalid, and if so, how

were they treated? A famous example of this last type of procedure was

the decision in the 1990 U.S. Census to recode to the opposite gender

one member of a same-gender couple who declared themselves to be

married. If any recoding has been done, is it possible to restore the

original values? You also need to find out if data can be weighted, and if

so, for what aggregations the weighting allows the production of accurate

estimates (e.g., at the national level alone or at both the national and the
state levels).
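
To show why weighting matters, here is a minimal Python sketch that compares an unweighted and a weighted estimate of the share of respondents with some characteristic. The records, the `smoker` field, and the weights are made up for illustration, and the sketch assumes each weight represents the number of people in the population that a respondent stands for. It produces point estimates only; standard errors for a complex sample design require specialized methods that are not shown here.

```python
# Hypothetical respondents with made-up survey weights.
respondents = [
    {"smoker": True, "weight": 150.0},
    {"smoker": False, "weight": 450.0},
    {"smoker": False, "weight": 300.0},
    {"smoker": True, "weight": 100.0},
]

def prevalence(respondents, weighted=True):
    """Share of respondents (or of the weighted population) who smoke."""
    def w(r):
        # With weighting off, every respondent counts once.
        return r["weight"] if weighted else 1.0

    total = sum(w(r) for r in respondents)
    cases = sum(w(r) for r in respondents if r["smoker"])
    return cases / total

unweighted = prevalence(respondents, weighted=False)
weighted = prevalence(respondents, weighted=True)
print("unweighted:", unweighted, "weighted:", weighted)
```

In this invented example the two smokers carry small weights, so the weighted estimate is lower than the simple proportion of respondents. Whether such an estimate is meaningful at the state level or only at the national level depends on how the weights were constructed, which is why the documentation must be checked.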
