
STATS MIDTERM REVIEW

LECTURE 1

Research problem: states the problem the researcher wants to solve or contribute to solving
Research question: states the actual question that the researchers want to answer

Types of Questions:
- Quantitative
- Qualitative

Quantitative Research Design

Experimental Design (RCT)
- Question: questions about the effectiveness of an intervention for treatment or prevention
- Randomization: randomization to groups (each participant has an equal chance of being assigned to any study group; reduces bias)
- Control / experimental group (example): sedentary activity / exercise
- Time direction: measure outcome forward in time (example outcome: mood), either before & after the intervention or after the intervention only

Quasi-experimental Design (Cohort Analytic Study)
- Question: questions about the effectiveness of an intervention for treatment or prevention
- Randomization: no randomization to groups
- Control / experimental group (example): usual nursing care / primary nursing
- Time direction: measure outcome forward in time (example outcome: staff morale), either before & after the intervention or after the intervention only

Case-control Design
- Question: questions about the cause of a problem
- Randomization: no randomization
- Groups: 2 groups (cases = have the problem, controls = do not have the problem)
- Intervention: no intervention
- Time direction: measured back in time (retrospective)

Cohort Design
- Question: questions about the course/cause of a problem
- Randomization: no randomization
- Groups: one group
- Intervention: no intervention
- Time direction: measures outcome forward in time (prospective)

Cross-sectional Design
- Question: questions about what is associated with a problem
- Randomization: no randomization
- Groups: may compare more than one group
- Intervention: no intervention
- Time direction: measures outcome at one point in time

Hypotheses
- Research develops, refines, and expands knowledge to answer questions or solve problems; turn the research question/problem into a statement about the difference or relationship that's expected
- Need to know that what is thought to have caused the difference/relationship is truly the cause of the difference/relationship
- Dependent variable: outcome of interest
- Independent variable: intervention or factor/thing being manipulated in experimental studies (stays the same)
- Explanatory variables: variables that explain why two variables are related to each other

Types of Hypotheses

Null
- Relationship/direction: there is no relationship/difference
- Example: There is no relationship between exercise (IV) and systolic BP (DV) in adult women.

Alternative
- Relationship/direction: there is a relationship/difference
- Example: There is a relationship between systolic BP (IV) and hypertension (DV) in adult women.

Non-directional
- Relationship/direction: does not express direction between IV & DV
- Example: There is a relationship between exercise (IV) and systolic BP (DV) in women.

Directional
- Relationship/direction: expresses direction between IV & DV
- Example: Women who exercise will have lower systolic BP (DV) than women who don't exercise (IV).

Population
- All of the people you want to be able to generalize findings to; the target population

Sampling

Probability Sampling
- Selection: random selection of people from the population (whereas random allocation occurs after the sample is selected)
- Selection chance: each person has an equal, independent chance of being selected for the sample (all members of the population must be known)
- Methods: simple random sampling, stratified random sampling, cluster sampling

Non-probability Sampling
- Selection: selection of people from the population by non-random methods
- Selection chance: each person in the population doesn't have an equal, independent chance of being selected for the sample
- Methods: convenience sampling, purposive sampling, quota sampling
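The difference between the sampling methods above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the population of 100 people, the "ward" strata, and the helper name stratified_sample are all made up for the example and are not from the lecture.

```python
import random

# Hypothetical population: 100 people, 60 in ward A and 40 in ward B.
population = [{"id": i, "ward": "A" if i < 60 else "B"} for i in range(100)]

rng = random.Random(42)  # fixed seed so the sketch is repeatable

# Simple random sampling: every person has an equal chance of selection.
simple_sample = rng.sample(population, k=10)


def stratified_sample(people, key, fraction, rng):
    """Randomly sample the same fraction of people within each stratum."""
    strata = {}
    for person in people:
        strata.setdefault(person[key], []).append(person)
    chosen = []
    for members in strata.values():
        k = round(len(members) * fraction)
        chosen.extend(rng.sample(members, k))
    return chosen


# Stratified random sampling: 10% drawn at random from each ward.
strat_sample = stratified_sample(population, "ward", 0.10, rng)

# Convenience sampling (non-probability): take whoever is easiest to reach,
# represented here by the first 10 people on the list.
convenience_sample = population[:10]
```

In the stratified sample each ward is represented in proportion to its size, while the convenience sample contains only ward A, which shows how non-random selection can miss part of the population.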

LECTURE 2

Levels of Measurement

Nominal
- Observation categories: observations fall into unordered categories (mutually exclusive: can only belong to one category)
- Difference between categories: usually use #s (or coded data) to distinguish between categories
- Value of 0: value of 1 does not mean more than 0
- Examples: gender (male/female), mortality (alive/dead), blood type (A, B, O, AB), Aboriginal (yes/no)

Ordinal
- Observation categories: observations fall into categories
- Ranking categories: categories can be ranked according to some criterion
- Difference between categories: not equal (don't know how much more/less one category is than another)
- Value of 0: value of 1 does mean more than 0
- Examples: cancer staging (Stage 1-4), pain scale (0-10), anxiety level (mild, moderate, severe)

Interval
- Ranking categories: can be ranked according to some criterion
- Difference between categories: equal
- Value of 0: value of 1 does mean more than 0 (0 is not necessarily a true 0; it doesn't have to indicate total absence of the variable being measured)
- Examples: temperature; sea level / above mountain levels

Ratio
- Ranking categories: can be ranked according to some criterion
- Difference between categories: equal
- Value of 0: value of 1 does mean more than 0; has a true 0 (0 is not arbitrary)
- Examples: income ($), length, time, weight

Graphs

Bar Graph
- Type of measured variable: nominal, ordinal
- Category representation and description: each category is represented by a bar; the bar represents the count of individuals in that category or their relative frequency
- X axis: indicates the categories
- Y axis: plots the frequencies
- Order: since nominal, it is arbitrary

Pie Chart
- Percentage of each category is converted into a proportion of the entire circle (pie)
- Size of a slice depends on the % of the whole that this category represents

Histograms
- Type of measured variable: interval, ratio
- Summary graph for a single variable
- Useful to understand the pattern of variability in data, especially for large data sets

Stem and Leaf Plot
- Type of measured variable: interval, ratio
- Combination of a frequency table and a histogram
- Visually describes data without losing the #s from the frequency table
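To make the stem and leaf plot concrete, here is a minimal Python sketch. The ages are made-up data, the function name stem_and_leaf is hypothetical, and the sketch assumes non-negative whole numbers with the tens digit as the stem and the ones digit as the leaf.

```python
from collections import defaultdict


def stem_and_leaf(values):
    """Return text rows such as '2 | 1 5 8' (stem = tens, leaf = ones)."""
    stems = defaultdict(list)
    for value in sorted(values):
        stems[value // 10].append(value % 10)
    return [
        f"{stem} | {' '.join(str(leaf) for leaf in stems[stem])}"
        for stem in sorted(stems)
    ]


ages = [21, 25, 28, 32, 34, 34, 41, 47, 53]  # hypothetical data
for row in stem_and_leaf(ages):
    print(row)
# 2 | 1 5 8
# 3 | 2 4 4
# 4 | 1 7
# 5 | 3
```

The length of each row works like a histogram bar, but every original number can still be read back from the plot.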

Statistic: descriptive measure computed from a sample of data
Parameter: descriptive measure computed from a population
Central tendency: typical value for the data; data tends to centralize around a common point

Measures of Central Tendency

Mode
- Description: # or category that occurs most frequently; may have no mode, one mode, or more than one mode
- Measured variables: nominal, ordinal, interval, ratio

Median
- Description: value that divides the data into 2 equal parts
- Measured variables: ordinal, interval, ratio
- Equation: $(n + 1)/2$ (position of the median in the ordered data)

Mean
- Description: sum of the set of values divided by the total # of participants; the average
- Measured variables: interval, ratio
- Equation: $\bar{X} = \frac{\sum X_i}{N}$

Notation
- $x$ = one piece of data
- $\bar{x}$ = mean of all the data
- $N$ = total # of participants in the sample (size of the whole sample)
- $n$ = subgroup of the total sample (sub-group)
- $X_i$ = data of a specific subject ($x_1, x_2, x_3$)
- $\sum$ = sum of a series of #s
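The three measures of central tendency can be checked with Python's standard statistics module. The scores below are made up for illustration.

```python
import statistics

scores = [2, 3, 3, 5, 7, 8, 9]  # hypothetical pain scores, already sorted

mode = statistics.mode(scores)      # most frequent value
median = statistics.median(scores)  # value that splits the data in half
mean = statistics.mean(scores)      # sum of the values divided by N

# Position of the median in the ordered data: (n + 1) / 2
median_position = (len(scores) + 1) / 2

# multimode returns every mode, which covers the "more than one mode" case.
two_modes = statistics.multimode([1, 1, 2, 2, 3])

print(mode, median, round(mean, 2), median_position, two_modes)
# 3 5 5.29 4.0 [1, 2]
```

Here the median is the 4th of the 7 ordered values, matching the (n + 1)/2 rule.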

Measures of Dispersion: how close the data are around the measure of central tendency
- Range: difference between the highest score and the lowest score in the data
- Quartiles (interquartile range): difference between the 1st & 3rd quartiles; comprises the middle 50% of the data (not influenced by extreme scores)
- Standard deviation: (covered in the next lecture)

Quartiles
- Because of issues with the range, data can be described as falling into quartiles
- Median divides the data into 2 equal halves
- Quartiles divide the data at the 1st quartile (25th percentile), 2nd quartile (50th percentile/median), and 3rd quartile (75th percentile)

Boxplots
- Picture of the data by interquartile range in relation to the median
- Box represents the middle 50%, or interquartile range
- Whiskers represent the dispersion (variability) of the outside 25% on each side
- Outliers are represented

Skewness
- Distribution of data can be classified as symmetric/asymmetric
- Symmetric: left half of the graph mirrors the right half
- Asymmetric: left half of the graph doesn't mirror the right half (skewed)
- Positively skewed if the graph has a long tail to the right (mean > median)
- Negatively skewed if the graph has a long tail to the left (mean < median)
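A short Python sketch of the range, the quartiles, and the mean-versus-median check for skewness. The data are made up, with one deliberately extreme score. Note that statistics.quantiles uses its default "exclusive" method here; other software may use a slightly different quartile rule and give slightly different cut points.

```python
import statistics

data = [1, 2, 2, 3, 3, 4, 5, 6, 7, 8, 30]  # hypothetical, one extreme score

data_range = max(data) - min(data)            # highest minus lowest
q1, q2, q3 = statistics.quantiles(data, n=4)  # 25th, 50th, 75th percentiles
iqr = q3 - q1                                 # spread of the middle 50%
mean = statistics.mean(data)

# A long right tail pulls the mean above the median (positive skew).
if mean > q2:
    skew = "positively skewed"
elif mean < q2:
    skew = "negatively skewed"
else:
    skew = "roughly symmetric"

print(data_range, q1, q2, q3, iqr, skew)
```

The extreme score of 30 inflates the range but barely moves the interquartile range, which is why the IQR is described as not influenced by extreme scores.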

LECTURE 3

Standard Deviation
- Most widely used and reported measure of variability (dispersion)
- Average amount by which each individual in the sample differs from the sample mean (how far each individual's observation is from the sample mean)
- What impacts SD: # of individuals in the sample; how different individuals are from one another
- Variance: $s^2 = \frac{\sum (x_i - \bar{x})^2}{n - 1}$ (sum of squared deviations of individual observations from the sample mean, divided by n - 1)
- SD: $s = \sqrt{s^2}$
- S or SD represents variation based on the original observations of data for each individual
- SD is based on the # of individuals; a sample with more individuals is more representative of the population
- The larger the standard deviation, the more varied (spread out) the distribution of observations for the sample
- Data with a small SD = scores closer to the mean
- Data with a large SD = scores scattered over a wider range around the mean
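The standard deviation calculation can be followed step by step in Python. The observations are made up; the last line compares the hand calculation with statistics.stdev, which also divides by n - 1.

```python
import math
import statistics

observations = [4, 8, 6, 5, 3, 7]  # hypothetical sample data

n = len(observations)
sample_mean = sum(observations) / n

# Squared deviation of each observation from the sample mean.
squared_deviations = [(x - sample_mean) ** 2 for x in observations]

variance = sum(squared_deviations) / (n - 1)  # divide by n - 1, not n
sd = math.sqrt(variance)                      # SD is the square root

print(sample_mean, variance, round(sd, 3))
# 5.5 3.5 1.871
print(math.isclose(sd, statistics.stdev(observations)))
# True
```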

Normal Distribution
- The variation of many variables tends to follow a symmetrical bell shape
- Determined by the mean & standard deviation; symmetrical
- Mean = median = mode (all the same)
- Area under the normal curve = 1 or 100% (always the same)
- About the graph: the area under the curve is always the same (despite height & width); height = # of observations; width = standard deviation; for the standard normal curve, mean = 0 and each standard deviation is 1 unit

Standardizing Scores
- Need to convert everything into one common unit; allows us to determine how many observations would be higher/lower
- Converting observations to those associated with the normal curve = normalizing/standardizing the scores
- z score = number of standard deviations a particular score is above/below the mean
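Since a z score is the number of standard deviations a score lies from the mean, it can be computed as (score - mean) / SD. The sketch below uses made-up blood pressure numbers and a hypothetical helper named z_score; statistics.NormalDist from the standard library gives the proportion of a standard normal curve that lies below a given z.

```python
from statistics import NormalDist


def z_score(x, mean, sd):
    """Number of standard deviations that x lies above (+) or below (-) the mean."""
    return (x - mean) / sd


# Hypothetical example: systolic BP of 130 in a sample with mean 120, SD 10.
z = z_score(130, mean=120, sd=10)

# Standard normal curve: mean 0, each standard deviation is 1 unit.
standard_normal = NormalDist(mu=0, sigma=1)
proportion_below = standard_normal.cdf(z)

print(z, round(proportion_below, 4))
# 1.0 0.8413
```

So a score one SD above the mean is higher than roughly 84% of observations, assuming the data are normally distributed.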

- Use the z score table to look up the value

The goal of research (including data analysis) is to produce results that we can generalize to the population. We need to decide whether results are statistically significant (did the results occur by chance?). Because researchers are never 100% sure, probability helps us understand how often they're sure/not sure.

Probability
- Mathematical way of predicting results (likelihood/chance that an event will/won't occur)
- Rules: ranges from 0 (0%) to 1.0 (100%); can never be less than 0 (negative); can never be greater than 1.0
- Probability of an event: mutually exclusive is similar to nominal measurement (one event can't happen if the other event does; has to be one or the other). Ex. either Type A, B, O, or AB
- Conditional probability: probability of one event occurring given that another event has already happened (Ex. probability that someone will be diagnosed with lung cancer given that they did/do currently smoke)
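A conditional probability can be worked out from a simple table of counts. The numbers below are invented purely for illustration and are not real epidemiological data.

```python
# Hypothetical counts for 1,000 people (made-up numbers).
counts = {
    ("smoker", "cancer"): 30,
    ("smoker", "no cancer"): 170,
    ("non-smoker", "cancer"): 10,
    ("non-smoker", "no cancer"): 790,
}

total = sum(counts.values())
smokers = counts[("smoker", "cancer")] + counts[("smoker", "no cancer")]
cancer_cases = counts[("smoker", "cancer")] + counts[("non-smoker", "cancer")]

# Probability of cancer in the whole group.
p_cancer = cancer_cases / total

# Conditional probability: cancer GIVEN that the person smokes.
# Only the smokers are counted in the denominator.
p_cancer_given_smoker = counts[("smoker", "cancer")] / smokers

print(p_cancer, p_cancer_given_smoker)
# 0.04 0.15
```

Both values stay between 0 and 1, and the conditional probability differs from the overall probability because the denominator is restricted to people for whom the first event (smoking) has already happened.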

LECTURE 4

Inferential Statistics
- Differences found in every statistical comparison we make depend on probability
- Used to draw conclusions about the population based on results in the sample; but any sample drawn from the population will have some associated error from the true results
- Inferential statistics are used to determine the probability that the results from analysis of the sample data occurred by chance; the lower this probability, the better the answer

Hypothesis Testing
- Compute test statistics to determine whether hypotheses should be accepted as true (statistically significant) or rejected as false (not statistically significant)
- Need to determine if results are true. Want to know: what's the probability that the difference in the DV between the groups (IV) occurred by chance (if the probability is small enough, we can assume the difference is not due to chance)

Significance Level
- Use the significance level to identify how sure we want to be
- Significance level = α
- Researchers select and identify the significance level before calculating the test statistic
- Most commonly, researchers decide they want to be 95% sure (α = 0.05): the probability that the difference occurred by chance or randomly is only 5%; we are 95% sure the difference is a true difference
- α = acceptable level of error; amount of error you're willing to accept; probability of making a mistake

Statistical Significance
- If the P value < the identified significance level (0.10 / 0.05 / 0.01), then we can be 90% / 95% / 99% sure that the results did not occur by chance; when the results don't occur by chance = statistically significant
- If the P value ≥ the identified significance level (0.10 / 0.05 / 0.01), then we can't be 90% / 95% / 99% sure that the results did not occur by chance = not statistically significant
- Treatment effect: difference between groups
- Reject or accept the null hypothesis; reject if P < significance level (α), meaning the result is statistically significant

Treatment effect and significance in words: We can be at least 95% sure that there's a true difference in the amount of oral rehydration fluid consumed in the population of children 6 months to 10 years being treated for gastroenteritis and dehydration who receive oral ondansetron compared to those who receive placebo.

Ex. If the P value is < α = 0.05: we are 95% sure that the results did not occur by chance; the results are statistically significant; we reject the null hypothesis (no difference) and accept the alternative hypothesis (there's a difference) = there's a true difference

Central Limit Theorem
- If we produced histograms of the results from different samples that are large enough, the distribution of the sample means would be approximately normal (this is true whether the samples come from a population that is normally distributed or non-normally distributed [skewed])
- About 95% of all possible estimates of the population mean (the sample means) lie within 1.96 standard deviations of the sample means (the standard error) of the population mean
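The central limit theorem can be checked with a small simulation. This is a sketch under stated assumptions: the skewed population is generated artificially from an exponential distribution, and the sample size of 50 and the 1,000 repeated samples are arbitrary choices for the example.

```python
import random
import statistics

rng = random.Random(0)  # fixed seed so the sketch is repeatable

# Hypothetical, strongly right-skewed population (exponential, mean near 1).
population = [rng.expovariate(1.0) for _ in range(20_000)]
pop_mean = statistics.mean(population)
pop_sd = statistics.pstdev(population)

# Draw many samples and record the mean of each one.
sample_size = 50
sample_means = [
    statistics.mean(rng.sample(population, sample_size))
    for _ in range(1_000)
]

# Spread of the sample means: population SD divided by sqrt(sample size).
se = pop_sd / sample_size ** 0.5

# Share of sample means that fall within 1.96 of these units of the
# population mean; the theorem suggests this should be close to 95%.
within = sum(abs(m - pop_mean) <= 1.96 * se for m in sample_means) / len(sample_means)

print(round(pop_mean, 3), round(statistics.mean(sample_means), 3), round(within, 3))
```

Even though the population is skewed, a histogram of sample_means would look approximately bell shaped, and the proportion printed last should come out near 0.95.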

Standard Error of the Mean
- SD = how close the observations from a sample are to the sample mean
- SE (standard error) = if we took a # of different samples from the population & recorded the mean for each sample, we could calculate the SD of all of their means
- Error = signifies that every time data is collected from a sample, the results may have some error relative to the population mean
- Refers to how close mean scores from repeated samples will be to the true population mean
- The larger the sample size, the smaller the SE; the smaller the SE, the more accurate the estimation of the population mean will be, because there's less variability in the sample means
- Use the SE to determine how confident we are that our estimation from the sample means reflects the population mean, otherwise known as the confidence interval

Confidence Interval
- The confidence interval tells us the range in which the true mean of the population lies
- If we are confident that the population mean lies within the confidence interval, then we can be confident that the sample is representative of the population (therefore, we are confident that the results we get from the sample are true for everyone in the population)

CI and Statistical Significance
- The CI can also tell us whether the difference is statistically significant
- Because the 95% CI represents the range of values in which the true population value falls, if the value which represents no difference (between 2 groups, IV & DV) falls within the CI, then the treatment effect is not statistically significant; for a statistically significant difference, the CI cannot include 0
- When comparing means, if the difference wasn't statistically significant, 0 would be included in the corresponding 95% CI; since no difference is then a possible value in the population, this indicates that the difference is not statistically significant and may have occurred by chance

CI and Clinical Significance
- Clinical significance tells us what we can expect in the population
- If you would implement the intervention at both ends of the CI, it is usually a clinically significant result

EIDM Model
