An introduction to Epi-Info
Minsk, February 2000
Table of Contents
A brief introduction
Starting Epi-Info
Introduction to the Analysis program
Browsing the data
First look at the data
One-way frequency tables
Entering Commands and Variable names using Menus
Displaying Categorical Variables
Statistics on Continuous Variables
Displaying Continuous Variables
Saving variables and leaving Epi-Info
Analysis of Quantitative variables
Introduction
Using the Means Command
Linear Regression
Analysis of Categorical Data
Introduction
Using the Tables command
Using the Statcalc program for stratified analysis
Introduction
Linear trend in proportions
Reading a data file into Epi-info
Creating a questionnaire (.qes) File
Variable types
Creating a .rec file
Importing the data
A brief introduction
Epi-info is a multi-purpose computer package designed for use by epidemiological researchers. It
contains smaller programs for use with Survey Design (Epiaid), Questionnaire Design and Report
Writing (Eped), Data Entry (Enter), Data Checking (Check), Data Analysis (Analysis), Simple
Statistics (Statcalc), Importing and Exporting files (Import, Export). There is also a separate
package for mapping (EpiMap).
The package is made available by WHO and CDC as public domain software and can be
downloaded (free of charge) from http://www.cdc.gov/epo/epi/epiinfo.htm .
Starting Epi-Info
Epi-info is a DOS-based program that uses pull-down menus. It can be driven entirely from the
keyboard, although a mouse can also be used. The cursor keys (the up arrow and the down arrow)
move up and down the menus so that you can see the descriptions of the programs. Note that on a
colour display an alternative way of moving up and down is to press the highlighted letter for
the program you require.
Here we concentrate on the Analysis program.
Introduction to the Analysis program
Position the cursor bar on Analysis and read the description on the right hand side. Press ENTER
to select Analysis. The screen goes blank for a few seconds and then the Analysis screen appears.
The screen is split into two – the upper window is headed Output and the lower, smaller, window
is headed Commands. The cursor is on the Command window against the EPI> prompt. At the
top of the screen are two lines giving the status information:
Dataset: <None> Free memory: 262K

Use READ to choose a dataset
This indicates that we have not yet specified the name of the dataset to be analysed, and hints
how to do it. It also states the amount of free memory.
To load a dataset, we use the read command. For example, if the file is called
itpexamp we type
read itpexamp
The full name of an Epi-info file ends with .rec, so the actual name of the file is
itpexamp.rec, but Epi-info allows the extension to be omitted.
Note that you should enter the whole path as well as the file name, for example
a:\itpexamp.rec
The name of the file and the number of records appear at the top of the screen, indicating that the
file has been found and read. We also see that all records have been selected (as so far we have not
specified any criteria for selecting or rejecting records).
Browsing the data
You can browse the data by pressing F4. As you pass through the different columns, you can see
what type of variables they contain at the top of the screen.
If you press F4 again while browsing, Full screen mode is selected; this shows a single record in
its entirety. Pressing F5 will start Split mode, which combines both modes: browse in the top
window and Full screen in the bottom.
Note that although we entered browse by pressing F4 in Analysis, we could also have
typed browse at the prompt.
First look at the data
When starting to look at any new set of data, one of the first steps is to check that the values of
the variables are sensible and that they correspond to the codes defined in the coding schedule or
other documentation about the data. For the categorical variables, we might do one-way tables to
check that only the specified codes occur and to check for missing values, for example in the sex
field there should only be the values 1 and 2. For continuous variables, we need to obtain
summary statistics (mean, standard deviation, minimum, maximum) and to check that these are
what we expect.
One-way frequency tables
We start by producing one-way tables for the categorical variables. At the prompt type
tables sex
The resulting table appears in the Output window
This shows that there are 45 males (1’s) and 35 females (2’s) together with percentages, the total
number of records (80) and summary statistics (sum, mean and standard deviation). Ignore for
now the Student’s t-distribution.
Exercise:
Repeat the tables command for the observed ages (observeage). (Note that in many cases
age would have a large number of possible values and so a frequency table might be large and
unwieldy and so other commands would be used - however here we have a small range of ages
and the table can be quite useful)
What is the youngest age ?
What is the average (mean, mode and median ) age ?
How many 13 year olds are there ?
How many children are 13 years or younger ?
What percentage of the children are 13 years or younger ?
Entering Commands and Variable names using Menus
We have used the tables command by typing it at the command prompt. It is also possible to enter
commands by selecting them from a list of commands. Similarly, it is possible to select the
variable names from a list of variables.
Suppose, for example, that we wanted to construct a table of sex. F2 is the Commands key, which
brings up a list of possible commands. The tables command is in the General section; by
highlighting it and pressing ENTER, the command is ‘pasted’ into the command line. Now press the
Variables function key, F3, and a list of the variables will be shown. Highlighting sex and pressing
ENTER will paste it onto the command line, which can now be entered, giving the same results
as when we typed in the command by hand.
If you want to pick more than one variable in this way, as will be the case when we do two-way
tables, you can tag groups of variables using the plus (+) and minus (-) signs, i.e. select sex and
press + and then select observeage and press +. You will see that these two variables have
been tagged (marked) by a small sign. Press ENTER and both of them will appear in the
command line. This also works for more than 2 variables.
Displaying Categorical Variables

The distribution of each of the categorical variables can be displayed using either a bar chart or a
pie chart. At the command prompt type (or select)
pie sex
A pie chart should appear on the screen, showing the percentage of males and females. A bar
chart can be produced using the command bar
bar sex
Exercise:
Produce pie and bar charts for thyroid medication (THYRMEDICA).
Statistics on Continuous Variables

The command for obtaining summary statistics for continuous variables is means, for example
means height
The output is the same as for the tables command – a frequency table followed by summary
statistics. Because there are so many different values for height, the table is much longer. For
continuous variables a frequency table is not much use – except for checking for suspicious
values. The means command (unlike the tables command) allows us to suppress the
frequency table and print only the summary statistics. This is achieved as follows
means height /n
The full specification of the means command includes a grouping variable, but for now we are
dealing with all the data together. At this stage we do not want to subdivide the data into groups,
for example males and females separately. We need a way of forming one group of all the records
in it. This is done as follows:
let groupall = 1
This creates a new variable groupall which has the value 1 for every record. Thus to group the
data by groupall will cause all the records to be included in one group. If you browse the data
you can see the new variable. We can now use the means command
means height groupall /n
This produces an entirely different output – no frequency table and a total of 11 statistics.
Exercise:
What are the mean and standard deviation of the WEIGHTs of the 80 children ?
What are the median and interquartile range (75th percentile – 25th percentile) of the weights of
the 80 children ?
What is the range of the weights ? (minimum – maximum)
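As a cross-check on the means output, the same summary statistics can be reproduced in a few lines of code. The sketch below uses Python's standard statistics module on a short list of made-up weights (hypothetical values, not the itpexamp data). The percentile method used by Epi-info may differ from Python's default, so the quartiles need not agree exactly.

```python
import statistics

# Hypothetical weights in kg; these are NOT values from itpexamp.rec.
weights = [31, 35, 38, 40, 42, 45, 47, 50, 52, 58]

mean = statistics.mean(weights)
sd = statistics.stdev(weights)          # sample standard deviation (n - 1 divisor)
median = statistics.median(weights)

# quantiles(n=4) returns the three cut points: 25th, 50th and 75th percentiles.
q1, _, q3 = statistics.quantiles(weights, n=4)
iqr = q3 - q1                           # interquartile range

value_range = (min(weights), max(weights))

print(f"mean={mean:.1f} sd={sd:.2f} median={median} IQR={iqr:.2f} range={value_range}")
```

Replacing the list with the 80 recorded weights should give figures that can be compared with the Epi-info output.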
Displaying Continuous Variables
Neither bar charts nor pie charts are sensible ways of displaying continuous variables with a lot of
different values. Try one of the commands on height and you will see that the result is not very
useful.
Bar charts have individual separated bars and are used to display categorical variables for which
the order of the categories is irrelevant. Histograms are used for continuous variables.
Usually a continuous variable is grouped before the histogram is drawn. However, if the variable
has a relatively small number of distinct values a histogram can sometimes give a good
representation of the distribution.
histogram observeage
Exercise:
Are the observed ages of the children approximately Normally distributed ?
Try doing a histogram of the heights of the children
histogram height
If there are a lot of different values, then the resulting histogram can be less useful. We might
want to group the variable. To group the height variable we need to create a new variable,
which we will call htgp, which will have a grouping interval of 10 cm. To form the groups we use
the let statement to divide height by 10 and assign the result to htgp. Because we want the
new variable to have integer values rather than exact values with decimal places, we use the div
operator (this is the way that Epi-info does integer division; the traditional / will give the exact
answer)
let htgp = height div 10
Before looking at the histogram, see the effect of the let statement by getting a frequency
distribution for htgp
tables htgp
You will see that 8 height groups have been created. Now type
histogram htgp
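The effect of the div operator can be illustrated outside Epi-info. The following is a minimal sketch in Python, where // plays the role that the text above describes for div; the heights are hypothetical values chosen for illustration, not the itpexamp data.

```python
from collections import Counter

# Hypothetical heights in cm (not taken from itpexamp.rec).
heights = [118, 124, 131, 137, 139, 142, 145, 150, 156, 163]

# Integer division by 10 puts 130-139 cm into group 13, 140-149 cm into group 14, etc.
htgp = [h // 10 for h in heights]

# One-way frequency table of the grouped variable.
freq = Counter(htgp)
for group in sorted(freq):
    print(f"group {group} ({group * 10}-{group * 10 + 9} cm): {freq[group]}")
```

With ordinary division, 137 / 10 gives 13.7, so every distinct height would still form its own bar; integer division is what collapses the values into 10 cm classes.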
Exercise:
Are the heights Normally distributed ?
Repeat this process for weights. Create a new variable called wtgp again using the let
command and the div operator, choosing a sensible grouping interval.
Are the weights approximately Normally distributed ?
Saving variables and leaving Epi-Info
If you have created new variables, you might want to save them for use the next time you use
Epi-Info. You could overwrite the original data file, but it is recommended that you save to a new
file. To do this you first route the output to the file that is to hold the new dataset (including
both the old and new variables) and then write the data to it. If we wanted to save our new
dataset to a file called itpnew.rec we would type
route itpnew.rec
Again, it is important that you put in the full path for the file, e.g. a:\itpnew.rec
We then write the data to that file with
write recfile
To leave Epi-Info, press F10 to leave Analysis and return to the main Epi-Info menu, and then
press F10 (or select Quit) to leave Epi-Info.
Analysis of Quantitative variables
Introduction
Here, we are going to use Epi-info to analyse data in the form of continuous variables, i.e.
quantitative variables measured on a continuous scale. We shall use the means command to
compare continuous variables classified by categorical variables. We shall also see how two
continuous variables can be compared using scatter and regress.
Using the Means Command

In this section we use the ungrouped HEIGHT variable, and ask the hypothetical question of
whether a child’s weight varies according to its sex and height.
One of the advantages of using statistical packages is that it is easy to examine the data visually
before proceeding to formal statistical analysis. This is one way of checking whether the
assumptions made in the analysis are reasonable. We can examine a scatter plot of the data
scatter sex height
Exercise
Execute the above command. Note that the first variable is put on the x-axis and the second on
the y-axis.
Guess the mean height for each sex
Mean height for sex = 1
Mean height for sex = 2
Does this suggest an association between height and sex ?
Are there any outlying observations ?
These graphs are not really suitable for presenting the data, since it is difficult to discern the
distribution of height where the points are crowded together. An alternative graphical
presentation to illustrate the variation in height according to sex is to use histograms.
Exercise:
Type the following:
let hgtrp = height div 10

histogram hgtrp
This produces a histogram of all the values of height. Epi-info allows us to use subgroups of
the data with the select command.
Type:
select sex=1
histogram hgtrp
This shows a histogram of height for males only. Note that the result of the select command is
shown at the top left of the screen as Criteria: sex=1
Exercise:
Plot a histogram for the heights of females. Is there any difference ?
What happens if we forget to type select on its own before select sex=2 ?

(Remember to type select on its own again before the next section, so that all records are used!)
Recall that we used the means command to derive summary statistics for a variable. Remind
yourself of the reason for having to create a new variable with a single value to get the means
output for a single variable
let groupall = 1
means weight groupall /n
Make sure that you understand the output. Now calculate the statistics for each sex separately:
means height sex /n
The first part is the same summary as you have already seen, but subdivided by each level of sex.
Exercise
Do these results compare with what you guessed when you looked at the scatter plot ?
Linear Regression
If we are examining the relationship between two continuous variables, such as height and
weight we might start by drawing a scatter diagram before proceeding to formal statistical tests.
scatter height weight
Exercise :
Does there appear to be a straight line relationship between the two variables ? If so, guess the
best straight line. Now estimate the slope of the best straight line as follows:
Pick two points, A and B, towards the ends of your line (A at the bottom, B at the top). Write
down the values of height and weight for each point.
At point A height
weight
At point B height
weight
The slope of the line, expressed as the change in height per unit change in weight, is
(heightB – heightA)/(weightB – weightA)
What do you calculate the slope to be ?
We can use Epi-info to perform linear regression using the regress command. To use this to
perform a linear regression of height on weight type:
regress height weight
Note that the regress command requires the dependent (response) variable, which goes on the
y-axis, to be given first, followed by the independent (explanatory) variable, which goes on the
x-axis. We are given the correlation, together with 95% confidence limits.
Exercise:
What does this correlation tell you about the relationship between height and weight ?
The program then gives us output which tests the null hypothesis that the slope of the line is equal
to zero (i.e. no relationship between the two variables). The next part of the output is the
estimated regression coefficients. These are estimates of the parameters α and β in the formula:
height = α + β × weight
The estimate of the parameter β is labeled as the β-coefficient for variable weight and is the
slope of the line. The estimate of parameter α is labeled as the Y-intercept, i.e. the value of
height when weight=0.
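To see where the slope and intercept come from, the least-squares estimates can be computed directly. The sketch below is an illustration in Python, not Epi-info's own code; the function name fit_line and the five (weight, height) pairs are hypothetical and are not taken from the itpexamp data.

```python
def fit_line(x, y):
    """Least-squares fit of y = alpha + beta * x; also returns the correlation r."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    beta = sxy / sxx                  # slope
    alpha = mean_y - beta * mean_x    # Y-intercept
    r = sxy / (sxx * syy) ** 0.5      # Pearson correlation coefficient
    return alpha, beta, r

# Hypothetical data: weight is the explanatory variable, height the response.
weight = [25, 30, 35, 40, 45]
height = [125, 133, 139, 148, 155]

alpha, beta, r = fit_line(weight, height)
predicted = alpha + beta * 40         # predicted height when weight = 40
print(f"height = {alpha:.1f} + {beta:.2f} x weight, r = {r:.3f}")
```

The same function applied to the recorded weights and heights should reproduce the β-coefficient and Y-intercept reported by regress, up to rounding.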
Exercise:
What is the equation of the fitted line ? How does it compare to the estimate that you previously
calculated ?
Using the equation of the Epi-info fitted line, calculate the following
the predicted value of height for weight = 120
the predicted value of height for weight = 170
How do these values compare with what you would have got using your original equation ?
Note the value of the y-intercept, which is the predicted height in the absurd case that weight=0.
This apparently ridiculous result arises because the relationship between height and weight is
not linear over the entire range of possible weights, although it does look to be a reasonable
approximation over the range we are examining. One of the reasons for checking the data
graphically is to check whether the relationship might be linear, or whether a curve might be a
better description.
Epi-info will plot the regression line on the graph for you.
scatter height weight /r
Analysis of Categorical Data
Introduction
In this session, we aim to use Epi-info for the analysis of categorical data. In particular we will
construct and interpret two-way tables of categorical data, test the association between 2
categorical variables using the chi-squared test, analyse the association between two binary
variables in the presence of one or more confounding variables and test for a linear trend in
proportions.
Using the Tables command

We start by asking whether THYROIDILL (whether the child has ever had thyroid disease) is
associated with a child’s SEX. The variable THYROIDILL has the value 1 for yes, 2 for no,
and 3 for don’t know, and sex takes values 1 for male and 2 for female. For the purposes of this
teaching exercise, we will ignore the cases where thyroidill takes the value 3, as we require
a binary variable, i.e. one that can only take 2 values. In order to use just that subset of the data
we use the select statement
select thyroidill < 3
Since these are now two categorical variables, we can look at their associations using a two way
table, using the tables command.
tables thyroidill sex
Exercise:
The program prints out the required 2x2 table. Examine the table and decide which part of the
output indicates whether there is an association between the two variables ?
It would be easier to see the measure of association if the table had percentages on it. We can
request these using the set command
set percents=on
Now repeat the tables command
tables thyroidill sex
This time the table appears with row and column percentages in the cells. The row
percentages are printed first, with an arrow beside them pointing to the denominator on the right.
The column percentages are printed underneath the row percentages. However, the table has a
rather muddled appearance and you have to look at it quite carefully to see which figures are
percentages and which are cell counts. It would be better if there were more space between the
four cells, or if there were lines between them. This can be achieved using the lines option of the
set command.
set lines=on
We now concentrate on the statistics provided with the table. We are given, with confidence
limits, the odds ratio and relative risk, together with chi-squared statistics with and without
continuity correction (“Yates corrected”). There is also a “Mantel-Haenszel” chi-squared
statistic, which is really for stratified analyses and can be ignored for the time being.
Exercise:
Check that you can calculate
(i) the odds ratio
(ii) the relative risk
(iii) the chi-square statistics, with and without continuity correction
What is the response variable ?
Which is the explanatory variable ?
What do you conclude about the association between the two variables ?
Note that Epi-info assumes that the response variable is the column variable in the table, the one
listed second in the tables command.
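The hand calculations asked for in the exercise follow the usual textbook formulae for a 2x2 table. The sketch below shows them in Python; the function name two_by_two and the cell counts are hypothetical (they are not the thyroidill by sex table), and the confidence limits that Epi-info prints are not reproduced here.

```python
def two_by_two(a, b, c, d):
    """Statistics for a 2x2 table laid out as
                 outcome+  outcome-
       exposed+      a         b
       exposed-      c         d
    """
    n = a + b + c + d
    odds_ratio = (a * d) / (b * c)
    relative_risk = (a / (a + b)) / (c / (c + d))
    margins = (a + b) * (c + d) * (a + c) * (b + d)
    chi2 = n * (a * d - b * c) ** 2 / margins
    # Yates' continuity correction: shrink |ad - bc| by n/2 (never below zero).
    chi2_yates = n * max(0.0, abs(a * d - b * c) - n / 2) ** 2 / margins
    return odds_ratio, relative_risk, chi2, chi2_yates

# Hypothetical counts, for illustration only.
odds_ratio, relative_risk, chi2, chi2_yates = two_by_two(20, 30, 10, 40)
print(f"OR={odds_ratio:.2f} RR={relative_risk:.2f} "
      f"chi2={chi2:.2f} Yates chi2={chi2_yates:.2f}")
```

Both chi-squared statistics have 1 degree of freedom; the corrected value is always the smaller of the two.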
Using the Statcalc program for stratified analysis
Introduction
Often when we are investigating the relationship between two variables, we want to take into
account the effect of other variables that have associations with both the response and explanatory
variables we are interested in. Epi-info can be used to allow for such confounding variables, using
the tables command and also a separate module called statcalc.
To illustrate these methods, we use another dataset, on the use of bed nets and the presence of
enlarged spleens in two villages in Africa.
The data is as follows:

                      Village A                      Village B
                   Spleen enlarged                Spleen enlarged
                yes         no    Total        yes         no    Total
With nets       12 (50%)    12     24          15 (22%)    52     67
Without nets    42 (59%)    29     71           4 (25%)    12     16
Total           54 (57%)    41     95          19 (23%)    64     83

                Both villages combined
                   Spleen enlarged
                yes         no    Total
With nets       27 (30%)    64     91
Without nets    46 (53%)    41     87
Total           73 (41%)   105    178

A stratified analysis is necessary here, because village is a confounding factor, being related
both to the response variable (enlarged spleen) and the explanatory variable (bed-net use).
We can conduct this analysis using the statcalc module of Epi-info. We start this module from
the Epi-info menu (after exiting Analysis by pressing the F10 key).
You are given the choice of three options – choose the first option, Tables (2x2, 2xn). You are then
faced with the traditional, “Exposure by Disease” table:
                Disease
                +     -
Exposure   +
           -
Note that, once again the disease (response variable) must be the column variable and exposure
the row variable.
Exercise:
We now have to enter the data for the two villages combined, entering cell counts only and not
totals, as follows
Type 27 and press ENTER. Notice that the cursor automatically goes to the next cell
Type 64 and press ENTER
Type 46 and press ENTER
Type 41 and press ENTER
If you have entered the four cell counts correctly, press F4 to request the analysis of the table. A
set of statistics, similar to those produced by the tables command used earlier, is given.
Exercise:
What are the values of the
(i) Relative risk ?
(ii) Yates corrected chi-squared test?
Now we have to enter the data separately for the two villages. Press the ENTER key twice to
return to the blank table.
Exercise:
Enter the data for the first table (village A), press F4 to get the analysis. For village A,
(iii) Relative risk ?
(iv) Yates corrected chi-squared test?
To enter the data for village B, press the F2 key and proceed as before. For village B,
(v) Relative risk ?
(vi) Yates corrected chi-squared test?
To get the summary analysis, press the ENTER key.

What are the values of
(vii) The crude RR?
(viii) The summary RR ?
(ix) The Mantel-Haenszel summary chi-square ?
What are your conclusions about the relationship between bed-net use and presence of an
enlarged spleen ?
What was the effect of controlling for the confounding variable, village ?
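The crude and summary relative risks can be checked by hand from the two village tables. The sketch below uses the standard Mantel-Haenszel formulae in Python with the cell counts given above; the function and variable names are illustrative, and the summary chi-square is computed without a continuity correction, which is an assumption and may not match the figure Statcalc prints.

```python
# Cell counts per village: (nets & enlarged, nets & not, no nets & enlarged, no nets & not)
strata = {"A": (12, 12, 42, 29), "B": (15, 52, 4, 12)}

def risk_ratio(a, b, c, d):
    return (a / (a + b)) / (c / (c + d))

# Crude analysis: add the two tables cell by cell, then compute one relative risk.
crude = tuple(sum(cells) for cells in zip(*strata.values()))
crude_rr = risk_ratio(*crude)

# Mantel-Haenszel summary relative risk and chi-square across the strata.
num = den = diff = var = 0.0
for a, b, c, d in strata.values():
    n = a + b + c + d
    num += a * (c + d) / n
    den += c * (a + b) / n
    diff += a - (a + b) * (a + c) / n            # observed minus expected for cell a
    var += (a + b) * (c + d) * (a + c) * (b + d) / (n * n * (n - 1))

rr_mh = num / den
chi2_mh = diff ** 2 / var                        # 1 degree of freedom, uncorrected

print(f"crude RR={crude_rr:.2f} summary RR={rr_mh:.2f} MH chi2={chi2_mh:.2f}")
```

Comparing the crude relative risk with the summary relative risk shows how much of the apparent effect of bed nets is attributable to the difference between villages.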
Linear trend in proportions

Another technique in the analysis of tables is the statistical test for a linear trend in proportions,
and this can also be performed within Statcalc. This test may be used in the analysis of a 2xc
table (2 rows and c columns), where the column variable is an ordered categorical variable such
as age-group. It provides a more sensitive test for the association between the two variables. As
an example, we use the following data showing the proportions of women with early age at
menarche by their triceps skinfold thickness.
                                  Triceps Skinfold Group
                           Small       Intermediate   Large       Total
Age at       < 12 years    15 (9%)     29 (13%)       36 (19%)     80
menarche     12+ years     156         197            150          503
Total                      171         226            186          583
Note that there is an increasing trend in the proportion of women with early age at menarche as
skinfold thickness increases.
Exercise:
The usual 2x3 chi-square test can be carried out within Statcalc by selecting the (2x2,2xn) option,
as before. The table has 2 rows and 3 columns, but Epi-info requires the data to be entered as 2
columns and 3 rows. Starting with the blank table, enter the data (cell counts only, no totals), for
the ‘Small’ group (from column 1 of the table), remembering to press ENTER after each number
has been entered. Now enter the cell counts for the ‘Intermediate’ group (column 2 from the table
above). Now just continue typing to enter a third row of numbers for the ‘Large’ group. After the
third row, press F4 to get the analysis. Statistics for the table are displayed
(i) What is the value of the chi-square?

(ii) How many degrees of freedom are there?
(iii) What is the p-value ?
Now we will carry out a test for trend in proportions. To do this we have to assign a numerical
score to each column (the skinfold group from the table), namely 1, 2 and 3.
In order to perform a chi-squared test for trend, return to the Statcalc menu by pressing F10.
Notice that the third option is chi-squared test for trend.
You will be asked to enter the following:
Exposure score Cases Controls
For each column of the table above:

the exposure score is the value of the numerical score;
cases refers to the number of women with age at menarche <12 years;
controls refers to the number of women with age at menarche 12+ years;
Press ENTER after each entry. After entering the last number, press the F4 key to calculate the
statistics.
Exercise:
(i) What is the value of the chi-squared test for trend ?
(ii) How many degrees of freedom are there ?
(iii) What is the P-value ?
Note that the odds ratios are given, using the ‘Small’ category as the baseline. This confirms the
initial observation that there is an increasing proportion of women with early age at menarche.
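The trend statistic can also be worked out by hand. The sketch below applies the Mantel extension formula for a chi-squared test for linear trend (1 degree of freedom) to the skinfold table, in Python. The choice of this particular formula, with N - 1 in the variance, is an assumption about what Statcalc computes, so small differences from the printed value are possible; the function name chi2_trend is illustrative.

```python
scores = [1, 2, 3]              # Small, Intermediate, Large
cases = [15, 29, 36]            # age at menarche < 12 years
controls = [156, 197, 150]      # age at menarche 12+ years

def chi2_trend(scores, cases, controls):
    totals = [a + b for a, b in zip(cases, controls)]
    n = sum(totals)
    n_cases = sum(cases)
    sum_mx = sum(m * x for m, x in zip(totals, scores))
    sum_mx2 = sum(m * x * x for m, x in zip(totals, scores))
    # Observed minus expected sum of scores among the cases.
    t = sum(a * x for a, x in zip(cases, scores)) - n_cases * sum_mx / n
    variance = n_cases * (n - n_cases) * (n * sum_mx2 - sum_mx ** 2) / (n * n * (n - 1))
    return t * t / variance

chi2 = chi2_trend(scores, cases, controls)

# Odds ratios for each group relative to the first ('Small') category.
odds_ratios = [(a * controls[0]) / (b * cases[0]) for a, b in zip(cases, controls)]

print(f"chi2 for trend = {chi2:.2f}, odds ratios = {[round(o, 2) for o in odds_ratios]}")
```

Because the test concentrates on a single direction of departure from the null hypothesis, it uses 1 degree of freedom rather than the 2 of the ordinary 2x3 chi-square.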
Exercise:
Perform a similar analysis on the following data, which describes 128 children aged under 12
years who were followed up during the malaria season to record which of them experienced
clinical attacks of malaria. The results by age group were:
Age-group    Number getting malaria    Total    Percentage getting malaria
1-2          19                        30
3-4          16                        24
5-6          12                        26
7-8          11                        27
9-11          7                        21
(i) Calculate the percentage of children in each age group who contracted malaria
(ii) Conduct a significance test to assess if there is any evidence of age-related variation
in malarial morbidity
(iii) Carry out a test for trend in proportions
(iv) What are your conclusions ?
Reading a data file into Epi-info

There are three steps to entering your own data into Epi-Info. If you have a comma delimited file
of data they are:
(i) Create a questionnaire file detailing the variables within the dataset
(ii) Convert the questionnaire file to an empty .rec file ready to take the data
(iii) Read the data into the .rec file
Creating a questionnaire (qes) File

From the main menu, select the Eped program, which is a very simple word processor. We are
going to create a file layout for our data and save it in a text file with a .qes file extension. (This
step can also be done, probably with more ease, in a ‘proper’ word processor such as MS Word;
just make sure the file is saved as text with a .qes file extension and not as a Word document or
with a .txt file extension.)
The layout of the file will consist of names for all the variables in your data file and some
information to define them. An example of a file layout is given in itpexamp.qes:
PERSONCOD ##
SEX #
OBSERVEAGE ##
THYROIDILLIS #
THYRMEDICAMIS #
THYRFAMILYIS #
IODSALTIS #
SEAPRODUCTIS #
OBSERVEGORMON #
OBSERVEIOD #
OBSERVEVITAMINE #
HEIGHT ###
WEIGHT ##
RIGHTSHIRINA #.##
RIGHTTOLSHINA #.##
RIGHTDLINA #.##
LEFTSHIRINA #.##
LEFTTOLSHINA #.##
LEFTDLIN #.##
IODURINE #.##
Next to the variable names are special characters that define the type and length of the variable.
The first variable in this file is the person code (PERSONCOD), which is a number that can take up
to two digits, i.e. it can be from 0 to 99. The ## characters imply that the variable is of numeric
type and is made up of 2 digits.
Variable types
There are four basic variable types allowed within Epi-Info. Once the variable type is defined,
Epi-Info will only allow data of that type to be entered for that variable. The variable types are as
follows:
(i) Numeric variables are defined using the # character. The number of characters defines
the length, so that #### defines a variable with 4 digits. A decimal point can also be
included so that ###.## can hold numbers between –99.99 and 999.99
(ii) Text variables are defined using the _ (underline/underscore) character, with the number
of characters defining the length. Alternatively, a text variable can be defined as upper
case, using the ‘A’ character between less-than and greater-than signs, e.g. <A>.
(iii) Date variables are defined as a date format between less-than and greater-than signs, e.g.
<DD/MM/YY> defines a variable in the form day, month, year
(iv) Logical or yes/no variables are defined as <Y>.
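The rules above can be summarised as a small classifier. The sketch below is a hypothetical helper written in Python, not part of Epi-Info: describe_picture takes the string of special characters that follows a variable name in a .qes file and reports the variable type, together with the number of characters for the numeric and text types. It covers only the examples given in this section.

```python
import re

def describe_picture(picture):
    """Classify a .qes field definition such as '##', '#.##', '____' or '<Y>'."""
    if re.fullmatch(r"#+(\.#+)?", picture):
        return ("numeric", len(picture))     # length counts any decimal point too
    if re.fullmatch(r"_+", picture):
        return ("text", len(picture))
    if picture == "<Y>":
        return ("logical", None)
    if picture == "<A>":
        return ("upper-case text", None)
    if picture.startswith("<") and picture.endswith(">") and "/" in picture:
        return ("date", None)
    return ("unknown", None)

for name, picture in [("PERSONCOD", "##"), ("HEIGHT", "###"), ("IODURINE", "#.##")]:
    print(name, describe_picture(picture))
```

Running this over each line of a layout such as itpexamp.qes is a quick way to check that every variable has the type and length you intended before creating the .rec file.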
Creating a .rec file

When we have created a .qes file, for example itpexamp.qes, we need to convert it to a .rec
file, for example itpexamp.rec, which uses the definitions we have put in our questionnaire
file.
Open the Enter program from the menu. You are asked for the name of a .rec file, but since it
doesn’t exist yet you must put in the name you want to give it. You now want to create new data
from a .qes file, which is choice 2; this will ask you to enter the filename of your .qes file. You will
now see the questionnaire (file layout) that you have created, with spaces highlighted for each of
the variables. We need go no further within Enter as we have now created a .rec file into which
we can load our pre-prepared data.
Importing the data

From the menu select the Import program. You are presented with a screen asking for two file
names and for information on the layout of the data to be read.
In the first space enter the name of the .rec file that contains the file layout. The second filename
required is the name of the file that contains the data, for example itpexamp.csv which is
comma separated. In the case of comma separated data, choose the ‘delim’ option for delimited
(or separated by something). The other option ‘fixed’ is for when the data is in a very rigid
structure, which is unlikely to be the case if the data has been extracted from a common
spreadsheet or statistical package, such as Excel or S-plus.
The data will now have been loaded into the .rec file and is ready for use in Analysis.