
CHAPTER 16

Observing Exposures and Outcomes Concurrently


Reporting Surveys or Cross-Sectional Studies
The principal objectives [of a survey] should always be to collect reliable, valid, and unbiased data from a representative sample, in a timely manner and within given resource constraints.
E. MCCOLL, A. JACOBY, L. THOMAS, ET AL. (1)

Cross-sectional research includes several ways of gathering data at a single point in time, both from individuals (mail surveys, standardized tests, telephone interviews, structured face-to-face interviews) and about populations (periodic surveys of health status and regular chart reviews of databases for disease surveillance). We use the general term survey to refer to any cross-sectional study, as well as to the administration of clinically oriented questionnaires, such as those measuring quality of life, socioeconomic status, satisfaction with hospital stay, or the likelihood of mental illness or behavioral problems. We use the general term questionnaire to refer to a set of questions that are to be answered by respondents, either directly, through self-administered printed or computerized questionnaires, or indirectly, through an interviewer who contacts the respondent by telephone or in person or through a data collection form on which data from medical records are compiled. We use the term psychometric instrument to refer to sets of questions intended to assess specific traits or conditions associated with specific diagnoses. Psychometric instruments are measurement tools on which inferences are made for individuals, as opposed to more general questionnaires, which we see more as forms for collecting general descriptive data on groups of people. Finally, we use the term data collection form to refer to the form that guides data abstraction from patient records.

Many of the guidelines for reporting research designs and activities apply to all study designs. Here, we explain only those guidelines unique to cross-sectional studies. Annotations for other guidelines are given in Chapter 13: Reporting Randomized Controlled Trials.

GUIDELINES TO BE ADDRESSED IN THE INTRODUCTION

16.1 Document the background, nature, scope, and importance of the problem that led to the survey (2).
16.2 State the general purpose of the survey. Identify any theoretical or scientific approach taken to address the problem (2,3).
16.3 State who funded the survey and describe the role of the funding agency in the conduct of the survey and the publication of the results (4).
16.4 State how the protocol and original data may be obtained.

GUIDELINES TO BE ADDRESSED IN THE METHODS

16.5 Identify the institutional review board that approved the study.


Surveys generally involve far less risk than clinical research and thus raise less complex ethical considerations. Nevertheless, the questionnaire or survey instrument may raise controversial or unpleasant topics among respondents and so should probably be approved by the appropriate IRB.

16.6 State the specific objectives of the research, including any formally stated research questions or hypotheses (2,3).
Cross-sectional studies are descriptive. They can be used to collect information about the prevalence of attributes, behaviors, beliefs, knowledge, or opinions among members of a population (1), as well as to track changes over time, including routine epidemiologic surveillance, and to collect information useful in health care planning. By identifying associations between exposures and outcomes, they can also be used to generate hypotheses about causation that can be tested with other study designs (5). Because exposure and outcome are assessed at the same time, however, the temporal relationship between them can be uncertain (6).


16.7 Identify the research as a cross-sectional study and explain why this design was chosen (1-4).
Three forms of cross-sectional studies are common in health research:
- Self-administered questionnaires, such as mail surveys or computer-assisted surveys that are taken on-line (computer-assisted self-administration, or CASA)
- Interviewer-assisted surveys, such as face-to-face intercept surveys done at, say, shopping centers, or telephone interviews, especially computer-assisted telephone interviewing (CATI)
- Surveillance reports or chart reviews of databases or clinical registries

Surveys can generally be done faster and less expensively than other forms of research; however, they need to be designed and conducted as rigorously as any other research design. Also, unlike groups created with random assignment, groups created from survey responses (as opposed to those sampled for the survey) may differ on any number of unmeasured or unknown confounding factors, which can affect the conclusions that can be drawn from surveys.

16.8 Specify the observational unit(s) of interest.
16.9 Describe the target population of interest (2).
16.10 Define the source population from which the recipients or records were drawn (2).
For retrospective studies of registries or databases, it may be appropriate to include a brief description of the following:
- The original purpose of the registry and the dates of any major revisions to its structure or purpose (7)
- The scope of the registry, including the number of records, the extent of information in each record, and the inclusive dates of the data contained in the registry
- How the registry is managed: the personnel and procedures for collecting, screening, entering, and extracting the data contained in the registry
- The methods for ensuring the accuracy and completeness of the data
- If possible, the results of the most recent verification of the data, including the error rate


16.11 Describe how survey recipients or records were identified (2,3).


Survey recipients are often identified from membership lists or are sampled at random from telephone directories. These sources may or may not be representative of the population of interest, however. In a study of managed care outcomes, all patients of interest will be included in the registry. In a telephone survey, only households with telephones will be included, and the sample will be biased toward higher-income, urban, and more socially connected households. In some cases, potential respondents with unlisted telephone numbers can be sampled with random-digit dialing to improve the representativeness of a sample of households with telephones.

16.12 Report how survey recipients were approached for participation (2).


Recipients are commonly approached for participation by mail, by telephone, or by face-to-face contact in intercept surveys. In any case, the details of the approach should be reported because how respondents are approached can determine their likelihood of participating in the survey. For example, many face-to-face interviews are conducted by middle-aged women because they are thought to be less threatening than other people.

16.13 Report the eligibility criteria for being included in the survey (3,4).
16.14 Indicate whether the sample was stratified and, if so, on which characteristics.
16.15 Report the target sample size and how it was determined (2-4).
16.16 Identify the location(s) and setting(s) of survey recipients at the time of the survey (2).
16.17 Specify any demographic, clinical, and other baseline covariate data collected.
16.18 Describe the characteristics of the questionnaire or psychometric instrument (1,2).
A brief description of the physical appearance and characteristics of the questionnaire is helpful in understanding how data were collected. The graphic design of the questionnaire can affect response rates at several points in the process (1). In particular, it may be helpful to know:
- The number of questions to be answered
- The amount of time required by the typical respondent to complete the questionnaire or the interview
- The types of responses solicited, such as Likert scales (ranked responses), ordinal categories, and yes-no or open-ended responses, and whether the responses are forced-choice or allow "no opinion" or neutral alternatives
- The number of pages involved and what they look like (page size, type size, color, and so on)

Questionnaires can also be designed to prevent bias caused by what are called response sets, which occur when respondents answer questions in predictable ways, irrespective of the content of the questions. Acquiescence or yea-saying is the tendency to agree with or to answer yes to all or most questions; the nay-saying response set is also possible. To counter these response sets, questionnaires are often designed with questions framed both positively and negatively. Social desirability is the tendency to select the responses that will portray the respondent in the most positive light. Again, questions can often be worded to make all response options more acceptable. Faking bad occurs when respondents choose negative responses to call increased attention to themselves as individuals or to a problem that they want to emphasize.

16.19 Identify the variables assessed and, if necessary, explain how each was quantified (2-4).
The variables assessed in surveys may consist of:
- Demographic and clinical characteristics (age, sex, race, education, measures of socioeconomic status, health problems, contact with the health care delivery system, and so on)
- Preconditions, exposures, or risk factors for disease or disability (health behaviors, family occurrences of health problems, occupational exposures, and so on)
- Past and present health states (past and present occurrences of immunizations, diagnoses, hospitalizations, surgeries, and so on)
- Knowledge and opinions on various topics (warning signs of cancer, end-of-life decisions, willingness to stop smoking, and so on)
- Dimensions or constructs of health states or personality characteristics (self-esteem, depression, tolerance of ambiguity, risk-taking, and so on)


- Scales or scores created from two or more questions (such as risk scores for domestic violence, stress scales, or mood indexes)

Especially when measuring dimensions or constructs, it is important to state how the concept is operationalized or defined for measurement purposes. For example, risk-taking might be defined as the willingness to 1) engage in risky activities (sky-diving, bungee jumping, or motorcycle racing) or 2) forgo protective activities (not wearing a seatbelt, or not wearing sunblock when outside for prolonged periods). Thus, a respondent who engages in sky-diving is, by definition, a risk-taker for purposes of the study. Some dimensions are more easily or convincingly operationalized than others; thus, risk-taking or aggression may be more easily assessed than, say, pain or love.

Scales and scores should be described completely so that their results can be interpreted. For example: the American Urological Association's Symptom Index consists of 7 symptoms (straining, incomplete voiding, frequency, intermittency, stream force, urgency, and nocturia), each of which is graded on a severity scale of 0 to 5. Thus, total scores range from 0 to 35, where scores of 0 to 7 indicate mild symptoms; 8 to 19, moderate symptoms; and 20 to 35, severe symptoms.

It may also be necessary to indicate the range of scores associated with normal function or that is typical of healthy people. For example, the 21-item Beck Depression Inventory is scored from 0 to 63. Scores below 4 are unusually low among normal people and may indicate possible denial of depression or attempts to fake good emotional health. Scores of 5 to 9 are normal; 10 to 18 indicate mild-to-moderate depression; 19 to 29, moderate-to-severe depression; and 30 to 63, severe depression. However, scores above 40 are unusually high, even for depressed persons, and suggest possible exaggeration of depression or histrionic or borderline personality disorders.
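Scoring and categorization rules like those quoted above are easy to state explicitly. A minimal sketch for the urology symptom index, using the cut points given in the text (the function name and the example respondent are hypothetical; this is not the association's official scoring software):

```python
def aua_symptom_score(item_scores):
    """Total symptom score and severity category for the 7-item index.

    item_scores: the seven item grades, each 0 to 5.
    Cut points (0-7 mild, 8-19 moderate, 20-35 severe) follow the text.
    """
    if len(item_scores) != 7 or not all(0 <= s <= 5 for s in item_scores):
        raise ValueError("expected seven items graded 0-5")
    total = sum(item_scores)
    if total <= 7:
        category = "mild"
    elif total <= 19:
        category = "moderate"
    else:
        category = "severe"
    return total, category

# Hypothetical respondent: grades for the seven symptoms
print(aua_symptom_score([1, 2, 0, 3, 1, 2, 1]))
```

Reporting both the total and its category, as the function returns them, follows the guideline that scales be described completely enough for readers to interpret the scores.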
A special type of psychometric instrument is the standardized test. Standardized tests are usually 1) administered under uniform conditions, 2) scored objectively so that different evaluators will award the same scores for the same responses, and 3) interpreted relative to the results of a normative population. Examples are the Scholastic Aptitude Test (SAT) and many intelligence tests. The normative population should be identified, as well as the scores of this population on the test, such as the median and typical ranges.

Ordinal response categories should be analyzed with tests for ordinal data; the categories should not be described or analyzed as though they were continuous data. That is, responses to a question about satisfaction with hospital stay, measured as 1 (low) to 5 (high), should not be reported as a mean and standard deviation, and even the median and interquartile range may be uninformative with so few categories. The mode is always appropriate, however, and reporting the number or percentage of responses for each response category is often desirable.
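The recommended summary for such ordinal responses, the mode plus a count and percentage for each category, can be produced in a few lines; the response data here are hypothetical:

```python
from collections import Counter

# Hypothetical responses to "satisfaction with hospital stay",
# measured as 1 (low) to 5 (high)
responses = [5, 4, 4, 3, 5, 2, 4, 5, 1, 4, 3, 4]

counts = Counter(responses)
mode = counts.most_common(1)[0][0]        # most frequent category
print(f"mode = {mode}")
for category in range(1, 6):
    n = counts.get(category, 0)
    print(f"  {category}: {n} ({n / len(responses):.0%})")
```

Note that `most_common` breaks ties arbitrarily; if two categories tie for the mode, both should be reported.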


16.20 Report whether and on whom the questionnaire was pretested before being administered in the study; document the reliability and validity of psychometric instruments.

Questionnaires should almost always be pretested to make sure that they will provide the expected results. Pretesting may reveal problems that respondents have with understanding words or questions, completing the questionnaire, understanding why the survey is important, and so on. Pretesting also allows investigators to learn how long respondents take to complete and return the questionnaire and perhaps how long it will take to score or enter the data into the database.

A good questionnaire is one that is reliable, valid, unbiased, and capable of discriminating between groups (1). Reliability is the extent to which the questionnaire yields repeatable and consistent results when administered to similar populations in similar circumstances. The purpose of reliability testing is to determine how much of the variability in results should be attributed to measurement error and how much should be attributed to the expected variability in true scores. Reliability can be assessed in several ways:
- In test-retest reliability, the questionnaire is administered to the same group of people at least twice. The first set of scores is compared with the second set; the questionnaire is reliable if the scores are highly correlated.
- In split-half reliability, the results from half the respondents are compared with the results from the other half; again, the questionnaire is reliable if the scores are highly correlated.
- Internal consistency is a measure of whether respondents answer related questions in the same way. It is often assessed with Cronbach's alpha, a correlation coefficient (scored between a low of 0 and a high of 1). Generally, an alpha of 0.8 or higher reflects a reasonable degree of internal consistency, and an alpha of less than 0.6 is unacceptably low.
- Results from alternate forms of the questionnaire can also be compared to assess consistency. Here, each questionnaire asks the same questions but in different ways.
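Cronbach's alpha can be computed directly from a matrix of item responses with the standard formula, alpha = k/(k−1) × (1 − Σ item variances / variance of total scores). A minimal sketch with hypothetical, complete-case data:

```python
from statistics import pvariance

def cronbach_alpha(rows):
    """Cronbach's alpha for a complete-case response matrix.

    rows: one list of numeric item scores per respondent, items in the
    same order for everyone. Population variances are used, as in the
    standard formula.
    """
    k = len(rows[0])                      # number of items
    items = list(zip(*rows))              # transpose: one tuple per item
    item_var = sum(pvariance(col) for col in items)
    total_var = pvariance([sum(r) for r in rows])  # variance of totals
    return (k / (k - 1)) * (1 - item_var / total_var)

# Hypothetical 5-respondent, 3-item example
data = [[4, 5, 4], [3, 4, 3], [5, 5, 4], [2, 3, 2], [4, 4, 5]]
print(round(cronbach_alpha(data), 2))
```

Missing responses must be handled before this calculation (for example, by restricting to complete cases, as assumed here), and the result should be reported alongside the number of items and respondents.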
Validity (sometimes called internal validity) is the extent to which a questionnaire measures what it is supposed to measure. A valid questionnaire produces consistent results (that is, it is reliable) that are relatively free from bias and error. There are several types of validity:
- Face validity is the extent to which a questionnaire superficially appears to measure what it is supposed to measure. On a depression inventory, a question about sadness has high face validity because sadness is a well-known characteristic of depression. Face validity is often important in convincing respondents to take the questionnaire seriously. It is the least important type of validity, however, because validity still needs to be established through other methods.
- Content validity is concerned with whether the questions measure the full domain that is to be assessed. A questionnaire on muscle function that asks about strength and endurance but not about flexibility has omitted an important domain. Experts in the domain generally must judge the degree of content validity.
- Construct validity refers to the degree to which questions assess the underlying theoretical dimensions (constructs) that they are supposed to measure. A good construct is based on theory and is operationally defined by measurable indicators. Construct validity is the most important kind of validity, and establishing it is a long and complex process. For example, a question that asks respondents to estimate the number of submarines used in World War II actually discriminates reasonably well between normal and schizophrenic people; respondents with symptoms of paranoia consistently overestimate the actual number.
- Convergent, criterion, concurrent, or predictive validity is one aspect of construct validity and refers to the degree of agreement (convergence) between a questionnaire and other measures (criteria) of the same construct at the same time (concurrent validity) or at some future time (predictive validity). For example, if the scores from two questionnaires rank patients in the same order of disease severity, the questionnaires have high concurrent validity. If disease severity is highly associated with longer hospitalization, the questionnaire has high predictive validity.
- Divergent or discriminant validity is another aspect of construct validity and refers to the appropriate lack of agreement (divergence) of scores between two questionnaires measuring (discriminating between) different concepts. For example, the results from a questionnaire measuring quantitative reasoning should not be highly correlated with those of a questionnaire measuring reading comprehension, which is a different ability.
- External validity or generalizability refers to the wisdom of projecting the questionnaire results obtained from a study to other populations, to other settings, or to other time periods. For example, can the results of a survey of New Yorkers be generalized to Texans?
In contrast, internal validity refers to the validity of the questionnaire itself; that is, its ability to measure what it is designed to measure.

A test must be reliable to be valid, but reliability does not guarantee validity.
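Concurrent validity, as described above, is often checked by correlating two instruments' scores for the same patients; because questionnaire scores are frequently ordinal, a rank (Spearman) correlation is a common choice. A self-contained sketch with hypothetical severity scores:

```python
def ranks(xs):
    """Ranks 1..n; tied values receive the average of their ranks."""
    order = sorted(range(len(xs)), key=lambda i: xs[i])
    r = [0.0] * len(xs)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and xs[order[j + 1]] == xs[order[i]]:
            j += 1                        # extend over the tie group
        avg = (i + j) / 2 + 1             # average rank for the group
        for k in range(i, j + 1):
            r[order[k]] = avg
        i = j + 1
    return r

def spearman(x, y):
    """Spearman correlation = Pearson correlation of the ranks."""
    rx, ry = ranks(x), ranks(y)
    n = len(x)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = sum((a - mx) ** 2 for a in rx) ** 0.5
    sy = sum((b - my) ** 2 for b in ry) ** 0.5
    return cov / (sx * sy)

# Hypothetical severity scores for 5 patients on two instruments
instrument_a = [3, 1, 4, 2, 5]
instrument_b = [30, 12, 45, 20, 41]
print(round(spearman(instrument_a, instrument_b), 2))
```

A correlation near 1 would suggest the two instruments rank patients similarly (concurrent validity); a correlation near 0 between instruments intended to measure different concepts would suggest discriminant validity.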

16.21 Report how the survey was conducted: how the questionnaire was administered or how the records were accessed and the data abstracted (2).
16.22 Indicate whether and how responses were kept anonymous.
Anonymity can be ensured by not collecting identifying information. In other cases, all responses are combined so that only aggregated data are made available. In still other cases, each respondent or patient record may be assigned a code number. The key to the code is kept in a secure location, and respondents or patients are then referred to only by their code number.

16.23 Indicate any measures taken to ensure adequate response rates (1,2).


Response rates to surveys can be improved by:
- Telling prospective respondents to expect the questionnaire in the mail or the telephone contact
- Reminding respondents how important the survey is to them as individuals or as members of a group
- Using endorsements of opinion leaders to establish the importance of the survey
- Following up with respondents who have not yet returned completed questionnaires
- Providing payment or other incentives for returning a completed questionnaire
- Training interviewers in persuasive tactics

16.24 Report the criteria for accepting questionnaires or records as evaluable.


An evaluable questionnaire is typically one that is returned within a specified time and that has complete information for all questions or for discrete parts within the questionnaire. Thus, unanswered questions, illegible handwriting, multiple erasures that prevent identifying an interpretable response, and multiple responses to a question for which a single response is appropriate may all render all or part of the questionnaire unevaluable.

16.25 Identify possible sources of bias, confounding, and error, and the measures taken to control for them (1,4).
Surveys are subject to many unique biases. Some of the more common are listed below. Typical actions taken to minimize them are given in parentheses.
- Bias introduced by the order in which questions are posed (pretesting; in interviews, changing the order of the questions)
- Bias introduced by the wording or framing of questions (pretesting; multiple questions measuring the same characteristic)
- Bias introduced by the response options to questions (pretesting; multiple questions measuring the same characteristic)
- Bias introduced by nonresponses to specific questions (pretesting; multiple questions measuring the same characteristic)


- Bias introduced by nonresponses to the entire survey (multiple contacts with prospective respondents; large samples)
- Recall bias, or bias introduced by memory lapses or errors among respondents (memory prompts included in the question)
- Differential recall bias, or bias introduced by differential recall rates between affected and unaffected respondents (memory prompts included in the question)

See Appendix 5.

16.26 Describe any quality control methods used to ensure completeness and accuracy of data entry and management.
The completeness and accuracy of data entry and management can be improved by:
- Optical scanning to enter data automatically
- Double data entry and follow-up comparisons
- Computer software that prevents entering incompatible data
- Random checks of the database record against source documents (such as medical records)
- Contacting patients to obtain information missing from records
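The second item above, double data entry, amounts to comparing two independent keyings of the same records field by field and resolving disagreements against the source documents. A minimal sketch with hypothetical records (the field names are illustrative only):

```python
def double_entry_discrepancies(entry_a, entry_b):
    """Compare two independent keyings of the same records.

    entry_a, entry_b: lists of dicts, one dict per record, keyed by
    field name, in the same record order. Returns a tuple
    (record index, field, value_a, value_b) for every mismatch,
    for follow-up against the source documents.
    """
    problems = []
    for i, (a, b) in enumerate(zip(entry_a, entry_b)):
        for field in sorted(set(a) | set(b)):
            if a.get(field) != b.get(field):
                problems.append((i, field, a.get(field), b.get(field)))
    return problems

# Hypothetical example: the second keying disagrees on one age value
first = [{"id": 1, "age": 42}, {"id": 2, "age": 57}]
second = [{"id": 1, "age": 42}, {"id": 2, "age": 75}]
print(double_entry_discrepancies(first, second))
```

The discrepancy rate (mismatched fields divided by fields compared) is also a simple way to report the data-entry error rate mentioned in guideline 16.10's annotation.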

Statistical Methods

16.27 Indicate the minimum size, change, or difference in the outcome(s) considered to be clinically important.
16.28 Identify the relationships analyzed and the statistical techniques used to analyze them (3,4).
16.29 Confirm that the assumptions of the statistical analysis were met by the data.
16.30 Identify any planned subgroup or covariate analyses (4).
16.31 Identify any statistical adjustments made to control for confounding (4).
16.32 In database surveys, indicate how data extraction was assessed for consistency or agreement.


16.33 Describe any planned sensitivity analyses (4).
16.34 Specify any procedures used to control for multiple testing (3).
16.35 Specify the alpha level.
16.36 Report whether statistical tests were one-tailed or two-tailed. Justify the use of one-tailed tests.
16.37 Identify the statistical software package(s) used to analyze the data (3).

GUIDELINES TO BE ADDRESSED IN THE RESULTS

16.38 Identify the time frame of the survey: report the dates during which the survey was conducted or the records compiled.
16.39 Explain any deviations from the protocol in the conduct of the survey.
16.40 Provide a schematic summary of the study, identifying the number and disposition of surveys at each stage of the research (3,4).


The numbers reported may include:
- Number of eligible respondents or records in the source population
- Number of potential respondents who were approached or the number of records that were available for review
- Number of respondents who declined to participate in, or who were unavailable for, telephone surveys
- Number of completed interviews or questionnaires (the response rate) (2,3)
- Number of questionnaires or records assessed for eligibility
- Number of ineligible questionnaires or records
- Number of evaluable questionnaires or records
- Number of questionnaires or records analyzed


16.41 Characterize each group with the appropriate descriptive statistics.
16.42 Indicate the degree to which the respondents or records were representative of the target population.
16.43 Comment on the nature of nonresponders or on unevaluable records (3).
16.44 Report the results of the study, preferably in figures or tables (3).

It is customary to report the number of responses for each question, when practical (2,4).

16.45 At a minimum, report absolute values for all endpoints, including between-group differences (4).

16.46 Provide confidence intervals for all endpoints (3).
16.47 In database surveys, provide a measure of consistency or agreement among data abstractors or assessors.
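For an endpoint that is a simple proportion, as many survey endpoints are, a 95% confidence interval can be computed with, among other methods, the Wilson score interval. A sketch with hypothetical numbers:

```python
from math import sqrt

def wilson_ci(successes, n, z=1.96):
    """Wilson score interval for a proportion (z=1.96 gives ~95%)."""
    p = successes / n
    denom = 1 + z**2 / n
    center = (p + z**2 / (2 * n)) / denom
    half = (z / denom) * sqrt(p * (1 - p) / n + z**2 / (4 * n**2))
    return center - half, center + half

# Hypothetical endpoint: 120 of 400 respondents agreed with a statement
low, high = wilson_ci(120, 400)
print(f"30% (95% CI {low:.1%} to {high:.1%})")
```

The Wilson interval behaves better than the common normal-approximation interval when the proportion is near 0 or 1 or the sample is small, which is why it is often preferred for survey endpoints.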

16.48 Report any potential confounding or interactive effects.
16.49 Account for all observations and explain any missing data (2-4).
16.50 Describe the treatment of outlying values.
16.51 Report any anecdotal evidence or observations that may contribute to a more accurate or complete understanding of the study or its results.

GUIDELINES TO BE ADDRESSED IN THE DISCUSSION

16.52 Summarize the results (2,4).
16.53 Interpret the results and suggest an explanation for them (2).


Especially in surveys, which are descriptive and usually collect data at a single time point, three common fallacies should be kept in mind when interpreting the results:
- Association is not causation: Smoking and coffee drinking are highly associated, but one does not cause the other.
- The direction of possible causality may not be clear (5): It may not be possible to determine whether behaviors cause certain symptoms or whether people with these symptoms behave in certain ways. For example, amphetamine abuse can lead to depression, but depressed people may self-medicate with amphetamine to relieve their depression.
- Individuals will not necessarily exhibit the typical characteristics of the group at large (the ecological fallacy): Not all physicians are rich and not all nurses are women, for example, so generalizing on the basis of group means can be misleading.

16.54 Describe how the results compare with what else is known about the problem; review the literature and put the results in context (2).
16.55 Suggest how the results might be generalized (4).
16.56 Discuss the implications of the results (4).
16.57 Discuss the limitations of the study (2-4).
16.58 List the conclusions.

REFERENCES
1. McColl E, Jacoby A, Thomas L, et al. Design and use of questionnaires: a review of best practice applicable to surveys of health service staff and patients. Health Technol Assess. 2001;5:1-256.
2. Huston P. Reporting on surveys: information for authors and peer reviewers. Can Med Assoc J. 1996;154:1695-8.
3. Rushton L. Reporting of occupation and environmental research: use and misuse of statistical and epidemiological methods. Occup Environ Med. 2000;57:1-9.
4. STROBE statement. http://www.strobe-statement.org.
5. Grimes DA, Schulz KF. Descriptive studies: what they can and cannot do. Lancet. 2002;359:145-9.
6. Grimes DA, Schulz KF. An overview of clinical research: the lay of the land. Lancet. 2002;359:57-61.
7. Byar DP. The use of data bases and historical controls in treatment comparisons. Recent Results Cancer Res. 1988;111:95-8.
