
DESCRIPTIVE STATISTICS

Running Head: Descriptive Statistics

Descriptive Statistics with SPSS
Angela Chiasson
University of Calgary

Descriptive statistics summarize data obtained about a sample of individuals. The present data were taken from 504 participants in total, including 253 males and 251 females, with year of birth ranging from 1917 to 1987. Participants were selected equally from small and large communities, with 252 individuals in each group. Thirty-three participants were employed as health care professionals, while 471 were not. When deciding which descriptive statistics to use, it is important to know which variables are continuous and which are categorical. A continuous variable is capable of taking on an ordered set of values within a certain range (Kerlinger & Lee, 2000), whereas a categorical variable belongs to nominal measurement, in which the set of objects being measured is divided into two or more subsets. The data given by Nordstokke (2011) include four continuous variables: Calculated Age (CAGE), Visit 1, Visit 2 and Visit 3.

The purpose of representing data in graphical form is to view sets of ordered pairs as a two-dimensional representation of relations. According to Kerlinger and Lee (2000), the graph is one of the most powerful tools of analysis. A graph shows whether a relation exists in a set of data and whether that relation is positive or negative, linear or quadratic (Kerlinger & Lee, 2000). Figures 1, 2, 3 and 4 illustrate different ways to graph the variables. The histogram is an effective graphical technique for illustrating the variability of data because it shows the center, spread and skewness of the data, as well as the presence of outliers and of multiple modes. For these variables, the histogram is the most appropriate technique for representing the data accurately (see Figures 5, 6, 7 and 8).
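As a minimal sketch of what a histogram summarizes, the Python snippet below groups a small set of made-up visit counts into equal-width bins and prints a text histogram. The function name `bin_counts`, the bin width and the values are assumptions chosen for illustration; they are not the Nordstokke (2011) data or the SPSS procedure used for the figures.

```python
from collections import Counter

def bin_counts(values, width):
    """Count values per equal-width bin, keeping empty bins in between."""
    counts = Counter((v // width) * width for v in values)
    lowest, highest = min(counts), max(counts)
    return {lower: counts.get(lower, 0)
            for lower in range(lowest, highest + width, width)}

# Hypothetical visit counts, invented for illustration (not the study data).
visits = [0, 1, 1, 2, 2, 2, 3, 3, 4, 6, 7, 12, 25]
bin_width = 5

histogram = bin_counts(visits, bin_width)
for lower, count in histogram.items():
    # Each row covers the integer values lower .. lower + bin_width - 1.
    print(f"{lower:>2}-{lower + bin_width - 1:<2} {'#' * count}")
```

Most of the marks pile up in the lowest bin, two bins are empty, and a single value sits far to the right. This is the kind of center, spread, skew and outlier information the histogram makes visible.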

Figure 1. Mean Comparison of Visit 1, Visit 2, and Visit 3.

Figure 1 clearly illustrates that the mean number of visits to a doctor in the past twelve months is higher than the mean number of visits to a walk-in clinic or an emergency room. It is also possible to choose a categorical variable, such as gender, and compare the means of Visit 1, Visit 2, and Visit 3 within each category (see Figures 2 and 3). Figure 4 shows an alternative way to graph the data: the average number of visits to the doctor within twelve months for males and females, compared in the same graph.

Figure 2. Mean Comparison of Males with Visit 1, Visit 2 and Visit 3.

Figure 3. Mean Comparison of Females with Visit 1, Visit 2, and Visit 3.


Figure 4. Mean Comparisons of Male and Female Visits to the Doctor in the Past 12 Months.
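The gender comparisons in Figures 2 to 4 amount to computing a mean separately for each category of a categorical variable. The sketch below shows one way this could be done in Python. The records, the field names `gender` and `visit1`, and the helper `mean_by_group` are hypothetical and stand in for the real data set.

```python
from collections import defaultdict
from statistics import mean

def mean_by_group(records, group_key, value_key):
    """Return the mean of value_key for each distinct value of group_key."""
    groups = defaultdict(list)
    for record in records:
        groups[record[group_key]].append(record[value_key])
    return {group: mean(values) for group, values in groups.items()}

# Hypothetical records, invented for illustration (not the study data).
records = [
    {"gender": "female", "visit1": 4},
    {"gender": "female", "visit1": 8},
    {"gender": "male", "visit1": 2},
    {"gender": "male", "visit1": 3},
    {"gender": "male", "visit1": 7},
]

group_means = mean_by_group(records, "gender", "visit1")
print(group_means)
```

Each bar in a mean-comparison chart corresponds to one entry of the returned dictionary.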

Central tendency includes the mean, median and mode. There is little doubt that measures of central tendency and variability are the most important tools of behavioral data analysis (Kerlinger & Lee, 2000). Sets of measures can be too large to understand one value at a time, which is why central tendency matters: it tells us what a set of measures is like on average. It is also useful to compare an individual's score to the average. Through the use of descriptive statistics we gain knowledge about the variables being studied.

Table 1. Measures of Central Tendency

                 Calculated age    VISIT1    VISIT2    VISIT3
N    Valid            504            504       504       504
     Missing            0              0         0         0
Mean                53.89           6.66      2.03      1.36
Median              54.50           3.00       .00       .00
Mode                   57              1         0         0

Note. VISIT1 = How many visits to your doctor have you made in the past 12 months? VISIT2 = How many visits to a walk-in clinic have you made in the past 12 months? VISIT3 = How many visits to an emergency room have you made in the past 12 months?
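To make the three measures of central tendency concrete, the following sketch computes them with Python's standard `statistics` module. The visit counts are invented for illustration and the helper name `central_tendency` is an assumption; the values reported in Table 1 came from SPSS, not from this code.

```python
from statistics import mean, median, multimode

def central_tendency(values):
    """Return the mean, the median and every mode of the values."""
    return {
        "mean": mean(values),
        "median": median(values),
        "modes": multimode(values),  # a list, because ties are possible
    }

# Hypothetical visit counts, invented for illustration (not the study data).
visits = [0, 1, 1, 1, 2, 3, 3, 5, 9, 25]

summary = central_tendency(visits)
print(summary)
```

In this made-up sample a few large values pull the mean above the median, the same pattern Table 1 shows for the three visit variables.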

Equality of the median and the mean suggests a symmetric distribution, such as the bell-shaped normal curve. According to Kerlinger and Lee (2000), events in large numbers tend to distribute themselves in the form of this curve. For Visit 1, Visit 2 and Visit 3, the mean is well above the median (for example, 6.66 versus 3.00 for Visit 1), which may suggest that these data are skewed. For CAGE, the mean (53.89) and the median (54.50) are similar, which may suggest a normal curve. It is nevertheless important to graph the data and use the empirical distribution to check for normality rather than assume that the data are normally distributed (Kerlinger & Lee, 2000; Nordstokke, 2011).

The mean is the most commonly used measure of the average in research. According to the data set, the average age of participants is 53.89 years. Table 1 shows that, on average, individuals in the data set visited their doctor 6.66 times in the past twelve months. If random sampling was used to represent a larger population, we could infer that members of that population visited their doctor approximately six to seven times in the past year, on average. It is important to note that the method of drawing samples from a population can affect how well the sample represents that population. The most reliable method of sampling is random sampling, in which every possible sample of a particular size has an equal chance of being selected. The median can be used in tests of statistical significance where the mean is inappropriate. The mode is used mostly for descriptive purposes, but can be useful in research for studying characteristics of populations and relations (Kerlinger & Lee, 2000).

Table 2. Variability of Continuous Variables.

                    Calculated age    VISIT1     VISIT2     VISIT3
N    Valid               504            504        504        504
     Missing               0              0          0          0
Std. Deviation        15.036         13.632      9.134      9.476
Variance             226.088        185.826     83.425     89.790
Range                     69             97         97         98
Percentiles  25        44.00           1.00        .00        .00
             50        54.50           3.00        .00        .00
             75        64.00           6.00       1.00        .00

Note. VISIT1 = How many visits to your doctor have you made in the past 12 months? VISIT2 = How many visits to a walk-in clinic have you made in the past 12 months? VISIT3 = How many visits to an emergency room have you made in the past 12 months?

Understanding variability is valuable because it tells us how spread out the data values are around the center of the data. When the variability of a sample is zero, every value in the set is the same, so the mean, median and mode all equal that value and the standard deviation is zero. One example of a categorical variable with zero variability would be a gender category in which all participants were female. Another, unlikely, example would be every test taker earning exactly the same score on a test.
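The measures of variability in Table 2 can be sketched in Python as follows. The scores are invented, the helper name `variability` is an assumption, and the code uses the sample (n - 1) formulas from the standard `statistics` module. Quartile conventions differ between software packages, so the percentiles printed here may not match another package's output for the same data.

```python
from statistics import quantiles, stdev, variance

def variability(values):
    """Return sample standard deviation, variance, range and quartiles."""
    q1, q2, q3 = quantiles(values, n=4)  # three cut points: 25th, 50th, 75th
    return {
        "std_dev": stdev(values),
        "variance": variance(values),
        "range": max(values) - min(values),
        "quartiles": (q1, q2, q3),
    }

# Hypothetical scores, invented for illustration (not the study data).
scores = [2, 4, 4, 4, 5, 5, 7, 9]
spread = variability(scores)
print(spread)

# Zero variability: every test taker earns exactly the same score.
identical = variability([3, 3, 3, 3])
print(identical)
```

The variance is the square of the standard deviation, which can be checked against Table 2: 15.036 squared is approximately 226.08 for calculated age. The second call shows the zero-variability case described above, where the standard deviation, variance and range are all zero.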


Figure 5. Histogram of Calculated Age.


Figure 6. Histogram Representing the Amount of Visits to the Doctor in the Past 12 Months.


Figure 7. Histogram Representing the Amount of Visits to the Walk-In Clinic in the Past 12 Months.


Figure 8. Histogram Representing the Amount of Visits to the Emergency Room in the Past 12 Months.

It is important to study the distribution of variables in order to determine whether there are outliers that affect the central tendency. Figure 5 shows an approximately normal distribution, whereas Figures 6, 7, and 8 show positively skewed distributions. It is important to review the raw data as well as how the data were collected. One option would be to remove the outliers from the data set so that central tendency is not affected. Skewness is a measure of the lack of symmetry in a distribution. A lack of symmetry lets the researcher know that there may be data that need to be reviewed or removed from the data set. Kurtosis is a measure of whether the data are peaked or flat relative to the normal distribution. Figures 6, 7 and 8 show high kurtosis. The box plots below illustrate that there are outliers that lie more than 3 standard deviations from the mean, which affects the central tendency.
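Skewness and kurtosis can be illustrated with the simple moment-based formulas: the third central moment divided by the 1.5 power of the second, and the fourth central moment divided by the square of the second, minus 3. The sketch below is an assumption-laden illustration with invented data and hypothetical function names. Statistical packages usually apply small-sample corrections, so their reported values can differ somewhat from these.

```python
def central_moment(values, k):
    """k-th central moment: the average of (value - mean) ** k."""
    n = len(values)
    mu = sum(values) / n
    return sum((v - mu) ** k for v in values) / n

def skewness(values):
    """Moment-based skewness; 0 for a perfectly symmetric sample."""
    return central_moment(values, 3) / central_moment(values, 2) ** 1.5

def excess_kurtosis(values):
    """Moment-based kurtosis minus 3, so a normal curve scores about 0."""
    return central_moment(values, 4) / central_moment(values, 2) ** 2 - 3

# Hypothetical samples, invented for illustration (not the study data).
symmetric = [1, 2, 3, 4, 5]
right_skewed = [0, 0, 0, 1, 1, 2, 10]

print(skewness(symmetric), excess_kurtosis(symmetric))
print(skewness(right_skewed), excess_kurtosis(right_skewed))
```

The symmetric sample has zero skewness and negative excess kurtosis (flatter than normal). The sample with one large value has positive skewness and positive excess kurtosis, the pattern described for Figures 6, 7 and 8.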

The data given by Nordstokke (2011) indicate that in the past twelve months, females and males from both small and large communities visited the doctor more often than they visited the walk-in clinic or the emergency room. When looking at the boxplots, it is evident that some data must be removed from the data set to obtain an accurate representation.
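One hedged way to identify candidate values for review is to flag observations that lie more than 3 standard deviations from the mean, the criterion mentioned for the box plots. The sketch below uses invented visit counts and a hypothetical helper, `flag_outliers`. It is not the procedure SPSS uses to mark points on a boxplot, and a flagged value is a candidate for review, not a confirmed error.

```python
from statistics import mean, stdev

def flag_outliers(values, threshold=3.0):
    """Return the values lying more than `threshold` sample SDs from the mean."""
    centre = mean(values)
    spread = stdev(values)
    if spread == 0:
        return []  # identical values: nothing can be an outlier
    return [v for v in values if abs(v - centre) / spread > threshold]

# Hypothetical visit counts, invented for illustration (not the study data).
visits = [0, 1, 1, 2, 2, 2, 3, 3, 3, 3, 4, 4, 4, 5, 5, 6, 6, 7, 8, 40]

outliers = flag_outliers(visits)
trimmed = [v for v in visits if v not in outliers]

# Compare the mean with and without the flagged values.
print(outliers, mean(visits), mean(trimmed))
```

Comparing the mean before and after setting aside the flagged value shows how strongly a single extreme observation can pull the central tendency.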

References

Kerlinger, F. N., & Lee, H. B. (2000). Foundations of Behavioral Research (4th ed.). Harcourt
