Week Two Journal
Course Reflection
Provide a reflection on this week, in 1-2 paragraphs, that addresses at least one of the following topics:
· Learning (i.e., information learned from course and/or content)
· Likes (i.e., liked most about the course and/or content)
· Dislikes (i.e., liked least about the course and/or content)
· Suggestions (i.e., suggestions for improvement about the course, content and/or assignments)
Chapter 5
The Role of Probability
Learning Objectives (1 of 3)
Define the terms “equally likely” and “at random”
Compute and interpret unconditional and conditional probabilities
Evaluate and interpret independence of events
Explain the key features of the binomial distribution model
Learning Objectives (2 of 3)
Calculate probabilities using the binomial formula
Explain the key features of the normal distribution model
Calculate probabilities using the standard normal distribution table
Compute and interpret percentiles of the normal distribution
Learning Objectives (3 of 3)
Define and interpret the standard error
Explain sampling variability
Apply and interpret the results of the Central Limit Theorem
Two Areas of Biostatistics
Goal: Statistical Inference
Descriptive Statistics
Sampling from a Population
Sampling:
Population Size = N, Sample Size = n (1 of 2)
Simple random sample
Enumerate all members of population N (sampling frame), select n individuals at random (each has same probability of being selected).
Systematic sample
Start with sampling frame; determine sampling interval (N/n); select first person at random from first (N/n) and every (N/n) thereafter.
Stratified sample
Organize population into mutually exclusive strata; select individuals at random within each stratum.
Convenience sample
Non-probability sample (not for inference)
Quota sample
Select a predetermined number of individuals into sample from groups of interest.
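The probability sampling schemes above can be sketched in a few lines. This is a minimal illustration, assuming a hypothetical sampling frame of 100 numbered individuals and two made-up strata; the function names are not from the course material.

```python
import random

def simple_random_sample(frame, n, rng):
    """Select n members so that each has the same chance of selection."""
    return rng.sample(frame, n)

def systematic_sample(frame, n, rng):
    """Pick a random start in the first interval, then every k-th member."""
    k = len(frame) // n          # sampling interval N/n (assumes N is a multiple of n)
    start = rng.randrange(k)     # random position within the first interval
    return [frame[start + i * k] for i in range(n)]

def stratified_sample(strata, n_per_stratum, rng):
    """Draw a simple random sample of fixed size within each stratum."""
    return {name: rng.sample(members, n_per_stratum)
            for name, members in strata.items()}

rng = random.Random(0)           # seeded only so the sketch is reproducible
frame = list(range(1, 101))      # hypothetical sampling frame, N = 100
srs = simple_random_sample(frame, 10, rng)
sys_sample = systematic_sample(frame, 10, rng)
strata = {"low": frame[:50], "high": frame[50:]}   # hypothetical strata
strat = stratified_sample(strata, 5, rng)
```

Convenience and quota samples are omitted because they are not probability samples.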
Sampling:
Population Size = N, Sample Size = n (2 of 2)
Basics
Probability reflects the likelihood that an outcome will occur.
0 ≤ Probability ≤ 1
P(Select any child) = 1/5290 = 0.0002
Example 5.1.
Basic Probability (1 of 2)
Example 5.1.
Basic Probability (2 of 2)
P(Select a boy) = 2560/5290 = 0.484
P(Select boy age 10) = 418/5290 = 0.079
P(Select child at least 8 years of age)
= (846 + 881 + 918)/5290
= 2645/5290 = 0.500
Conditional Probability
Probability of outcome in a specific subpopulation
Example 5.1.
P(Select 9-year-old from among girls)
= P(Select 9-year-old | girl)
= 461/2730 = 0.169
P(Select boy | 6 years of age)
= 379/892 = 0.425
Example 5.2.
Conditional Probability (1 of 2)
Example 5.2.
Conditional Probability (2 of 2)
P(Prostate cancer | Low PSA)
= 3/64 = 0.047
P(Prostate cancer | Moderate PSA)
= 13/41 = 0.317
P(Prostate cancer | High PSA)
= 12/15 = 0.80
Sensitivity and Specificity
Sensitivity = True positive fraction
= P(test+ | disease)
Specificity = True negative fraction
= P(test– | disease free)
False negative fraction = P(test– | disease)
False positive fraction = P(test+ | disease free)
Example 5.4.
Sensitivity and Specificity
Sensitivity and Specificity
Sensitivity = P(test+ | disease) = 9/10 = 0.90
Specificity = P(test– | disease free)
= 4449/4800 = 0.927
False negative fraction = P(test– | disease)
= 1/10 = 0.10
False positive fraction = P(test+ | disease free)
= 351/4800 = 0.073
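The four fractions in Example 5.4 can be reproduced from the underlying counts. The sketch below assumes the 2×2 counts implied by the fractions on the slide (9 and 1 among the 10 with disease, 4449 and 351 among the 4800 disease free); the variable names are illustrative.

```python
# Counts implied by the fractions in Example 5.4
true_pos, false_neg = 9, 1          # among the 10 with disease
true_neg, false_pos = 4449, 351     # among the 4800 disease free

sensitivity = true_pos / (true_pos + false_neg)      # P(test+ | disease)
specificity = true_neg / (true_neg + false_pos)      # P(test- | disease free)
false_negative_fraction = 1 - sensitivity            # P(test- | disease)
false_positive_fraction = 1 - specificity            # P(test+ | disease free)

print(round(sensitivity, 3), round(specificity, 3))
print(round(false_negative_fraction, 3), round(false_positive_fraction, 3))
```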
Independence
Two events, A and B, are independent if P(A | B) = P(A) or if P(B | A) = P(B)
Example 5.2.
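Is the screening test independent of prostate cancer diagnosis?
P(Prostate cancer) = 28/120 = 0.233
P(Prostate cancer | Low PSA) = 0.047
P(Prostate cancer | Moderate PSA) = 0.317
P(Prostate cancer | High PSA) = 0.80
The conditional probabilities differ from P(Prostate cancer), so the screening test result and prostate cancer diagnosis are not independent.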
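A short check of independence for Example 5.2, using the counts from the conditional probabilities above (3/64, 13/41, 12/15). The unconditional probability is 28/120, about 0.233. The 0.005 tolerance is an arbitrary allowance for rounding, not part of the definition.

```python
# Counts from Example 5.2: (men with prostate cancer, men in PSA group)
groups = {"low": (3, 64), "moderate": (13, 41), "high": (12, 15)}

total_cancer = sum(cancer for cancer, _ in groups.values())
total_men = sum(size for _, size in groups.values())
p_cancer = total_cancer / total_men            # unconditional probability

conditional = {name: cancer / size for name, (cancer, size) in groups.items()}

# Independence would require every conditional probability to equal p_cancer.
independent = all(abs(p - p_cancer) < 0.005 for p in conditional.values())
print(round(p_cancer, 3), independent)
```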
Bayes’ Theorem (1 of 2)
Using Bayes’ Theorem we revise or update a probability based on additional information.
Prior probability is an initial probability.
Posterior probability is a probability that is revised or updated based on additional information.
Bayes’ Theorem (2 of 2)
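The formula on this slide did not survive extraction. For reference, the standard statement of Bayes' Theorem for two events A and B, with A' denoting the complement of A (the same prime notation used for M' in the example that follows), is:

```latex
\begin{aligned}
P(A \mid B) &= \frac{P(B \mid A)\,P(A)}{P(B)} \\
            &= \frac{P(B \mid A)\,P(A)}{P(B \mid A)\,P(A) + P(B \mid A')\,P(A')}
\end{aligned}
```

Here P(A) is the prior probability and P(A | B) is the posterior probability.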
Example (1 of 2)
In Boston, 51% of adults are male.
One adult is randomly selected to participate in a study.
Prior probability of selecting a male = 0.51
Example (2 of 2)
Selected participant is a smoker.
9.5% of males in Boston smoke as compared to 1.7% of females.
Find the probability that we selected a male given he is a smoker.
Example: Find P(M | S)
P(M) = 0.51 P(M’) = 0.49
P(S | M) = 0.095 P(S | M’) = 0.017
Bayes’ Theorem
Knowing that the participant smokes increases the probability that the participant is male above the prior P(M) = 0.51.
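The posterior P(M | S) follows from Bayes' Theorem with the four values given above. A minimal sketch (variable names are illustrative):

```python
p_m, p_not_m = 0.51, 0.49            # prior: P(M), P(M')
p_s_given_m = 0.095                  # P(S | M)
p_s_given_not_m = 0.017              # P(S | M')

# Total probability of selecting a smoker
p_s = p_s_given_m * p_m + p_s_given_not_m * p_not_m

# Posterior probability of male given smoker
p_m_given_s = p_s_given_m * p_m / p_s
print(round(p_m_given_s, 3))
```

The posterior is about 0.853, well above the prior of 0.51.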
P(disease) = 0.002
Sensitivity = 0.85 = P(test+ | disease)
P(test+) = 0.08 and P(test–) = 0.92
What is P(disease | test+)?
Example 5.8.
Bayes’ Theorem (1 of 3)
Example 5.8.
Bayes’ Theorem (2 of 3)
Example 5.8.
Bayes’ Theorem (3 of 3)
P(disease) = 0.002
Sensitivity = 0.85 = P(test+ | disease)
P(test+) = 0.08 and P(test–) = 0.92
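The worked calculation for Example 5.8 is missing from the extracted slides. Applying Bayes' Theorem to the three givens yields the following (a reconstruction from the stated values, not the slide's own layout):

```latex
P(\text{disease} \mid \text{test}+)
  = \frac{P(\text{test}+ \mid \text{disease})\,P(\text{disease})}{P(\text{test}+)}
  = \frac{0.85 \times 0.002}{0.08}
  = 0.02125 \approx 0.021
```

A positive test raises the probability of disease from 0.002 to about 0.021.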
Model for discrete outcome
Process or experiment has two possible outcomes: success and failure.
Replications of process are independent.
P(success) is constant for each replication.
Binomial Distribution (1 of 2)
Binomial Distribution (2 of 2)
Notation
n = number of times process is replicated
p = P(success)
x = number of successes of interest
0 ≤ x ≤ n
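The binomial formula itself is missing from the extracted slides. In the notation just defined, the standard form is:

```latex
P(x \text{ successes}) = \frac{n!}{x!\,(n-x)!}\; p^{x}\,(1-p)^{\,n-x}
```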
Example 5.9.
Binomial Distribution
Medication for allergies is effective in reducing symptoms in 80% of patients. If medication is given to 10 patients, what is the probability it is effective in 7?
P(7 successes) = [10!/(7!3!)](0.8)⁷(0.2)³ = 120(0.2097)(0.008) = 0.2013
Antibiotic is claimed to be effective in 70% of the patients. If antibiotic is given to five patients, what is the probability it is effective on exactly three?
Success = Antibiotic is effective: n = 5, p = 0.7, x = 3
P(3 successes) = [5!/(3!2!)](0.7)³(0.3)² = 10(0.343)(0.09) = 0.3087
Binomial Distribution (1 of 4)
What is the probability that the antibiotic is effective on all five?
P(5 successes) = [5!/(5!0!)](0.7)⁵(0.3)⁰ = 0.1681
Binomial Distribution (2 of 4)
What is the probability that the antibiotic is effective on at least three?
P(X ≥ 3) = P(3) + P(4) + P(5)
= 0.3087 + 0.3601 + 0.1681 = 0.8369
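The binomial probabilities in Example 5.9 and the antibiotic example can be checked with a few lines of Python. This sketch uses only the standard library; `binomial_pmf` is an illustrative helper name.

```python
from math import comb

def binomial_pmf(x, n, p):
    """Probability of exactly x successes in n independent replications."""
    return comb(n, x) * p**x * (1 - p)**(n - x)

p_7_of_10 = binomial_pmf(7, 10, 0.8)       # allergy medication, Example 5.9
p_3_of_5 = binomial_pmf(3, 5, 0.7)         # antibiotic, exactly three
p_all_5 = binomial_pmf(5, 5, 0.7)          # antibiotic, all five
p_at_least_3 = sum(binomial_pmf(x, 5, 0.7) for x in range(3, 6))

print(round(p_7_of_10, 4), round(p_3_of_5, 4), round(p_at_least_3, 4))
```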
Binomial Distribution (3 of 4)
Binomial Distribution (4 of 4)
Mean and variance of the binomial distribution
μ = np
σ² = np(1 – p)
For example, the mean (or expected) number of patients in whom the antibiotic is effective is 5*0.7 = 3.5
Model for continuous outcome
Mean = median = mode
Normal Distribution (1 of 3)
Notation: μ = mean and σ = standard deviation
[Figure: normal curve with the horizontal axis marked at μ – 3σ, μ – 2σ, μ – σ, μ, μ + σ, μ + 2σ, μ + 3σ]
Normal Distribution (2 of 3)
Normal Distribution (3 of 3)
Properties of normal distribution
i) The normal distribution is symmetric about the mean (i.e., P(X > μ) = P(X < μ) = 0.5).
ii) The mean and variance, μ and σ², completely characterize the normal distribution.
iii) The mean = the median = the mode.
P(μ – σ < X < μ + σ) = 0.68
P(μ – 2σ < X < μ + 2σ) = 0.95
P(μ – 3σ < X < μ + 3σ) = 0.997
iv) P(a < X < b) = the area under the normal curve from a to b.
Body mass index (BMI) for men age 60 is normally distributed with a mean of 29 and standard deviation of 6.
What is the probability that a male has BMI less than 29?
Example 5.11.
Normal Distribution (1 of 10)
Example 5.11.
Normal Distribution (2 of 10)
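[Figure: normal curve of BMI with the horizontal axis marked at 11, 17, 23, 29, 35, 41, 47; the areas below and above 29 are each 0.5]
P(X < 29) = 0.5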
Example 5.11.
Normal Distribution (3 of 10)
Body mass index (BMI) for men age 60 is normally distributed with a mean of 29 and standard deviation of 6.
What is the probability that a male has BMI less than 35?
Example 5.11.
Normal Distribution (4 of 10)
[Figure: normal curve of BMI with the horizontal axis marked at 11, 17, 23, 29, 35, 41, 47]
P(X < 35) = ?
Example 5.11.
Normal Distribution (5 of 10)
Example 5.11.
Normal Distribution (6 of 10)
[Figure: normal curve of BMI; area 0.5 below 29 and area 0.34 between 29 and 35]
P(X < 35) = 0.5 + 0.34 = 0.84
Standard Normal Distribution Z
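Normal distribution with μ = 0 and σ = 1
[Figure: standard normal curve with the Z axis marked at –3, –2, –1, 0, 1, 2, 3, aligned with the BMI axis marked at 11, 17, 23, 29, 35, 41, 47]
P(X < 35) = P(Z < 1) = ?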
Example 5.11.
Normal Distribution (7 of 10)
Z = (X – μ)/σ = (35 – 29)/6 = 1, so P(X < 35) = P(Z < 1).
Using Table 1, P(Z < 1.00) = 0.8413
Table 1. Probabilities of Z
Table entries represent P(Z < Zi)
Zi     .00      .01      .02      .03      .04     …
0.0    0.5000   0.5040   0.5080   0.5120   0.5160  …
0.1    0.5398   0.5438   0.5478   0.5517   0.5557  …
.
.
1.0    0.8413   0.8438   0.8461   0.8485   0.8508  …
Example 5.11.
Normal Distribution (8 of 10)
[Figure: normal curve of BMI with the horizontal axis marked at 11, 17, 23, 29, 35, 41, 47]
P(X < 30) = ?
What is the probability that a male has BMI less than 30?
Example 5.11.
Normal Distribution (9 of 10)
Example 5.11.
Normal Distribution (10 of 10)
Z = (30 – 29)/6 = 0.17, so P(X < 30) = P(Z < 0.17) = 0.5675
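The table lookups in Example 5.11 can be cross-checked numerically. The sketch below standardizes each BMI value and evaluates the normal cumulative probability with the error function from the standard library; `normal_cdf` is an illustrative helper name. Because the code does not round Z to two decimals, P(X < 30) comes out slightly below the table value of 0.5675.

```python
from math import erf, sqrt

def normal_cdf(x, mean, sd):
    """P(X < x) for a normal distribution with the given mean and sd."""
    z = (x - mean) / sd                   # standardize: Z = (X - mean) / sd
    return 0.5 * (1 + erf(z / sqrt(2)))

p_below_29 = normal_cdf(29, 29, 6)
p_below_35 = normal_cdf(35, 29, 6)
p_below_30 = normal_cdf(30, 29, 6)        # Z = 0.1667, not rounded to 0.17

print(round(p_below_29, 4), round(p_below_35, 4), round(p_below_30, 4))
```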
Percentiles of the Normal Distribution
The kth percentile is defined as the score that holds k percent of the scores below it (e.g., the 90th percentile is the score that holds 90% of the scores below it).
Q1 = 25th percentile
Median = 50th percentile
Q3 = 75th percentile
Percentiles (1 of 2)
For the normal distribution, the following is used to compute percentiles:
X = μ + Zσ
where
μ = mean of the random variable X,
σ = standard deviation, and
Z = value from the standard normal distribution for the desired percentile (See Table 1A, next slide).
Percentiles of the standard normal distribution
(Table 1A)
Percentile Z
1st –2.326
2.5th –1.960
5th –1.645
10th –1.282
50th 0
90th 1.282
95th 1.645
97.5th 1.960
99th 2.326
Percentiles (2 of 2)
Example 5.12.
Percentiles of the Normal Distribution
BMI in men follows a normal distribution with μ = 29, σ = 6; BMI in women follows a normal distribution with μ = 28, σ = 7.
The 90th percentile of BMI for men
X = 29 + 1.282(6) = 36.69.
The 90th percentile of BMI for women
X = 28 + 1.282(7) = 36.97.
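The percentile formula can be checked against the standard library's `statistics.NormalDist`, which supplies the Z value directly instead of a table lookup. A minimal sketch using the BMI parameters from Example 5.12:

```python
from statistics import NormalDist

# Z value for the 90th percentile of the standard normal distribution
z90 = NormalDist().inv_cdf(0.90)

men_p90 = 29 + z90 * 6        # X = mean + Z * sd for men
women_p90 = 28 + z90 * 7      # X = mean + Z * sd for women

# Equivalent one-step form: NormalDist(mu=29, sigma=6).inv_cdf(0.90)
print(round(z90, 3), round(men_p90, 2), round(women_p90, 2))
```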
Central Limit Theorem
Suppose we have a population with known mean μ and standard deviation σ. If we take simple random samples of size n with replacement, then for large n, the sampling distribution of the sample means is approximately normal with mean μ and standard deviation σ/√n (the standard error).
Application
Non-normal population
Take samples of size n, as long as n is sufficiently large (usually n ≥ 30 suffices).
The distribution of the sample mean is approximately normal, so Z can be used to compute probabilities.
HDL cholesterol has a mean of 54 and standard deviation of 17 in patients over 50. A physician has 40 patients over age 50 and wants to know the probability that their mean cholesterol is above 60.
Example 5.18.
Central Limit Theorem (1 of 2)
Example 5.18.
Central Limit Theorem (2 of 2)
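The worked solution to Example 5.18 is not present in the extracted slides. Under the Central Limit Theorem the sample mean of n = 40 patients is approximately normal with mean 54 and standard error 17/√40, so the probability can be sketched as follows (a reconstruction from the stated values):

```python
from math import sqrt
from statistics import NormalDist

mu, sigma, n = 54, 17, 40               # HDL mean, sd, and sample size
standard_error = sigma / sqrt(n)        # sd of the sample mean
z = (60 - mu) / standard_error          # standardize the sample mean of 60
p_above_60 = 1 - NormalDist().cdf(z)    # P(sample mean > 60)

print(round(standard_error, 2), round(z, 2), round(p_above_60, 4))
```

The probability is small (roughly 0.013), so a mean above 60 would be unusual for a sample of 40 such patients.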
Example
Suppose we wish to estimate the mean of a population (μ) whose standard deviation is known and equal to 12, and a simple random sample of 100 individuals is selected from the population.
Find the probability that the sample mean is no more than 2 units from the population mean.
Sampling Distribution of Sample Mean
Central Limit Theorem
P(μ – 2 < X̄ < μ + 2) = ?
Z = {(μ – 2) – μ}/(12/√100) = –2/1.2 = –1.67
Z = {(μ + 2) – μ}/(12/√100) = 2/1.2 = 1.67
Then: P(–1.67 < Z < 1.67) = 0.9525 – 0.0475 = 0.905
The probability that the sample mean is no more than 2 units from the population mean is 0.905, or 90.5%.
Chapter 7
Hypothesis Testing Procedures
Learning Objectives (1 of 3)
Define null and research hypothesis, test statistic, level of significance, and decision rule
Distinguish between Type I and Type II errors and discuss the implications of each
Explain the difference between one- and two-sided tests of hypothesis
Estimate and interpret p-values
Learning Objectives (2 of 3)
Explain the relationship between confidence interval estimates and p-values in drawing inferences
Perform analysis of variance by hand
Appropriately interpret the results of analysis of variance tests
Distinguish between one- and two-factor analysis of variance tests
Learning Objectives (3 of 3)
Perform chi-square tests by hand
Appropriately interpret the results of chi-square tests
Identify the appropriate hypothesis testing procedures based on type of outcome variable and number of samples
Hypothesis Testing
Research hypothesis is generated about unknown population parameter.
Sample data are analyzed and determined to support or refute the research hypothesis.
Hypothesis Testing Procedures
Step 1
Null hypothesis (H0):
No difference, no change
Research hypothesis (H1):
What investigator believes to be true
Hypothesis Testing Procedures
Step 2
Collect sample data and determine whether sample data support research hypothesis or not.
For example, in a test for μ, evaluate the sample mean X̄.
Hypothesis Testing Procedures
Step 3
Set up decision rule to decide when to believe null versus research hypothesis.
Depends on level of significance, α = P(Reject H0 | H0 is true)
Hypothesis Testing Procedures
Steps 4 and 5
Summarize sample information in test statistic (e.g., Z value).
Draw conclusion by comparing test statistic to decision rule.
Provide final assessment as to whether H1 is likely true given the observed data.
p-values
p-values represent the exact significance of the data.
Estimate p-values when rejecting H0 to summarize significance of the data (can approximate with statistical tables, can get exact value with statistical computing package).
The p-value is the smallest α at which we still reject H0.
Hypothesis Testing Procedures
Set up null and research hypotheses, select α.
Select test statistic.
Set up decision rule.
Compute test statistic.
Draw conclusion and summarize significance.
Errors in Hypothesis Tests
Hypothesis Testing for μ
Continuous outcome
One sample
H0: μ = μ0
H1: μ > μ0, μ < μ0, μ ≠ μ0
Test statistic:
n ≥ 30: Z = (X̄ – μ0)/(s/√n) (find critical value in Table 1C)
n < 30: t = (X̄ – μ0)/(s/√n) (find critical value in Table 2, df = n – 1)
The National Center for Health Statistics (NCHS) reports the mean total cholesterol for adults is 203. Is the mean total cholesterol in Framingham Heart Study participants significantly different?
In 3310 participants the mean is 200.3 with a standard deviation of 36.8.
Example 7.2.
Hypothesis Testing for μ (1 of 4)
1. H0: μ = 203
H1: μ ≠ 203 α = 0.05
2. Test statistic: Z = (X̄ – μ0)/(s/√n)
3. Decision rule:
Reject H0 if Z ≥ 1.96 or if Z ≤ –1.96
Example 7.2.
Hypothesis Testing for μ (2 of 4)
4. Compute test statistic: Z = (200.3 – 203)/(36.8/√3310) = –4.22
5. Conclusion. Reject H0 because –4.22 < –1.96. We have statistically significant evidence at α = 0.05 to show that the mean total cholesterol is different in Framingham Heart Study participants.
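The five steps of Example 7.2 can be traced in code. This sketch assumes the large-sample Z statistic, (sample mean minus null mean) divided by s/√n, and takes the critical value from `statistics.NormalDist` instead of Table 1C; variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

mu0 = 203                         # null value reported by NCHS
xbar, s, n = 200.3, 36.8, 3310    # Framingham sample summary
alpha = 0.05

z = (xbar - mu0) / (s / sqrt(n))                  # test statistic
critical = NormalDist().inv_cdf(1 - alpha / 2)    # two-sided, about 1.96
reject = abs(z) >= critical                       # decision rule
p_value = 2 * NormalDist().cdf(-abs(z))           # two-sided p-value

print(round(z, 2), round(critical, 2), reject)
```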
Example 7.2.
Hypothesis Testing for μ (3 of 4)
Example 7.2.
Hypothesis Testing for μ (4 of 4)
Significance of the findings: Z = –4.22
Table 1C. Critical Values for Two-Sided Tests
α Z
0.20 1.282
0.10 1.645
0.05 1.960
0.010 2.576
0.001 3.291
0.0001 3.891
Because |Z| = 4.22 exceeds 3.891, p < 0.0001.
New Scenario
Outcome is dichotomous (p = population proportion).
Result of surgery (success, failure)
Cancer remission (yes/no)
One study sample
Data
On each participant, measure outcome (yes/no)
n = sample size, x = number of positive responses, p̂ = x/n
Hypothesis Testing for p
Dichotomous outcome
One sample
H0: p = p0
H1: p > p0, p < p0, p ≠ p0
Test statistic: Z = (p̂ – p0)/√(p0(1 – p0)/n)
(Find critical value in Table 1C)
The NCHS reports that the prevalence of cigarette smoking among adults in 2002 is 21.1%. Is the prevalence of smoking lower among participants in the Framingham Heart Study?
In 3536 participants, 482 reported smoking.
Example 7.4.
Hypothesis Testing for p (1 of 3)
1. H0: p = 0.211
H1: p < 0.211 α = 0.05
2. Test statistic: Z = (p̂ – p0)/√(p0(1 – p0)/n)
3. Decision rule:
Reject H0 if z ≤ –1.645
Example 7.4.
Hypothesis Testing for p (2 of 3)
Example 7.4.
Hypothesis Testing for p (3 of 3)
4. Compute test statistic: p̂ = 482/3536 = 0.136, so Z = (0.136 – 0.211)/√(0.211(1 – 0.211)/3536) = –10.93
5. Conclusion. Reject H0 because –10.93 < –1.645. We have statistically significant evidence at α = 0.05 to show that the prevalence of smoking is lower among the Framingham Heart Study participants. (p < 0.0001)
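A sketch of the one-sample test for a proportion in Example 7.4, assuming the usual large-sample Z statistic with the null proportion in the standard error. Because the code keeps the sample proportion unrounded, Z comes out near –10.89 instead of the slide's –10.93, which rounds the sample proportion to 0.136 first; the conclusion is unchanged.

```python
from math import sqrt
from statistics import NormalDist

p0 = 0.211                 # null prevalence reported by NCHS
x, n = 482, 3536           # smokers and sample size in Framingham
alpha = 0.05

p_hat = x / n                                   # sample proportion
z = (p_hat - p0) / sqrt(p0 * (1 - p0) / n)      # test statistic
critical = NormalDist().inv_cdf(alpha)          # lower-tailed, about -1.645
reject = z <= critical

print(round(p_hat, 3), reject)
```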
Hypothesis Testing for Categorical and Ordinal Outcomes*
Categorical or ordinal outcome
One sample
H0: p1 = p10, p2 = p20,…,pk = pk0
H1: H0 is false
Test statistic: χ² = Σ (O – E)²/E
(Find critical value in Table 3, df = k – 1)
* χ² goodness-of-fit test
Chi-Square Tests
χ² tests are based on the agreement between expected (under H0) and observed (sample) frequencies.
Test statistic: χ² = Σ (O – E)²/E
Chi-Square Distribution
If H0 is true, χ² will be close to 0; if H0 is false, χ² will be large.
Reject H0 if χ² > Critical value from Table 3
A university survey reveals that 60% of students get no regular exercise, 25% exercise sporadically and 15% exercise regularly. The university institutes a health promotion campaign and re-evaluates exercise 1 year later.
None Sporadic Regular
Number of students 255 125 90
Example 7.6.
χ² Goodness-of-Fit Test (1 of 4)
1. H0: p1 = 0.60, p2 = 0.25, p3 = 0.15
H1: H0 is false α = 0.05
2. Test statistic: χ² = Σ (O – E)²/E
3. Decision rule: df = k – 1 = 3 – 1 = 2
Reject H0 if χ² ≥ 5.99
Example 7.6.
χ² Goodness-of-Fit Test (2 of 4)
4. Compute test statistic:
                    None    Sporadic    Regular    Total
No. students (O)    255     125         90         470
Expected (E)        282     117.5       70.5       470
(O – E)²/E          2.59    0.48        5.39
χ² = 2.59 + 0.48 + 5.39 = 8.46
Example 7.6.
χ² Goodness-of-Fit Test (3 of 4)
Example 7.6.
χ² Goodness-of-Fit Test (4 of 4)
5. Conclusion. Reject H0 because 8.46 > 5.99. We have statistically significant evidence at α = 0.05 to show that the distribution of exercise is not 60%, 25%, 15%.
Using Table 3, the p-value is p < 0.025 (8.46 exceeds the df = 2 critical value of 7.38 for α = 0.025 but not 9.21 for α = 0.01).
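The goodness-of-fit computation in Example 7.6 can be reproduced directly from the observed counts and the null proportions. The p-value line relies on a special case: for df = 2 the upper-tail area of the chi-square distribution is exp(–χ²/2), so no statistics package is needed here; for other degrees of freedom a table or package would be required.

```python
from math import exp

observed = [255, 125, 90]                 # none, sporadic, regular
null_proportions = [0.60, 0.25, 0.15]     # distribution under H0

n = sum(observed)
expected = [n * p for p in null_proportions]
chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))

df = len(observed) - 1                    # k - 1 = 2
reject = chi2 >= 5.99                     # critical value for alpha = 0.05, df = 2
p_value = exp(-chi2 / 2)                  # valid only because df = 2

print(round(chi2, 2), reject, round(p_value, 4))
```

The exact p-value is about 0.015, which falls between 0.01 and 0.025.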
New Scenario
Outcome is continuous.
SBP, weight, cholesterol
Two independent study samples
Data
On each participant, identify group and measure outcome.
Two Independent Samples (1 of 2)
RCT: Set of Subjects Who Meet
Study Eligibility Criteria
Randomize
Treatment 1 Treatment 2
Mean Treatment 1 Mean Treatment 2
Two Independent Samples (2 of 2)
Cohort Study: Set of Subjects Who Meet Study Inclusion Criteria
Group 1 Group 2
Mean Group 1 Mean Group 2
Hypothesis Testing for (μ1 – μ2) (1 of 2)
Continuous outcome
Two independent samples
H0: μ1 = μ2 (μ1 – μ2 = 0)
H1: μ1 > μ2, μ1 < μ2, μ1 ≠ μ2
Hypothesis Testing for (μ1 – μ2) (2 of 2)
Continuous outcome
Two independent samples
H0: μ1 = μ2
H1: μ1 > μ2, μ1 < μ2, μ1 ≠ μ2
Test statistic:
n1 ≥ 30 and n2 ≥ 30: Z = (X̄1 – X̄2)/(Sp√(1/n1 + 1/n2)) (find critical value in Table 1C)
n1 < 30 or n2 < 30: t = (X̄1 – X̄2)/(Sp√(1/n1 + 1/n2)) (find critical value in Table 2, df = n1 + n2 – 2)
Pooled Estimate of Common Standard Deviation, Sp
Sp = √(((n1 – 1)s1² + (n2 – 1)s2²)/(n1 + n2 – 2))
Previous formulas assume equal variances (σ1² = σ2²).
If 0.5 ≤ s1²/s2² ≤ 2, assumption is reasonable.
Example 7.9.
Hypothesis Testing for (μ1 – μ2) (1 of 3)
A clinical trial is run to assess the effectiveness of a new drug in lowering cholesterol. Patients are randomized to receive the new drug or placebo and total cholesterol is measured after 6 weeks on the assigned treatment.
Is there evidence of a statistically significant reduction in cholesterol for patients on the new drug?
Example 7.9.
Hypothesis Testing for (μ1 – μ2) (2 of 3)
            Sample Size    Mean     Std Dev
New drug    15             195.9    28.7
Placebo     15             227.4    30.3
1. H0: μ1 = μ2
H1: μ1 < μ2 α = 0.05
2. Test statistic: t = (X̄1 – X̄2)/(Sp√(1/n1 + 1/n2))
3. Decision rule: df = n1 + n2 – 2 = 28
Reject H0 if t ≤ –1.701
Example 7.9.
Hypothesis Testing for (μ1 – μ2) (3 of 3)
Assess Equality of Variances
Ratio of sample variances: 28.7²/30.3² = 0.90
Example 7.9.
Hypothesis Testing for (μ1 – μ2)
4. Compute test statistic: Sp = √((14(28.7)² + 14(30.3)²)/28) = 29.5, so t = (195.9 – 227.4)/(29.5√(1/15 + 1/15)) = –2.92
5. Conclusion. Reject H0 because –2.92 < –1.701. We have statistically significant evidence at α = 0.05 to show that the mean cholesterol level is lower in patients on treatment as compared to placebo. (p < 0.005)
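Example 7.9 can be reproduced from the summary statistics alone. The sketch assumes the pooled-variance t statistic, checks the variance ratio first, and hard-codes the critical value –1.701 quoted in the decision rule because the standard library has no t distribution.

```python
from math import sqrt

n1, mean1, sd1 = 15, 195.9, 28.7     # new drug
n2, mean2, sd2 = 15, 227.4, 30.3     # placebo

# Equal-variance check: ratio of sample variances should lie in [0.5, 2]
variance_ratio = sd1**2 / sd2**2

# Pooled estimate of the common standard deviation
sp = sqrt(((n1 - 1) * sd1**2 + (n2 - 1) * sd2**2) / (n1 + n2 - 2))

t = (mean1 - mean2) / (sp * sqrt(1 / n1 + 1 / n2))   # test statistic
reject = t <= -1.701        # lower-tailed critical value, df = 28 (from the slide)

print(round(variance_ratio, 2), round(sp, 1), round(t, 2), reject)
```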
New Scenario
Outcome is continuous.
SBP, weight, cholesterol
Two matched study samples
Data
On each participant, measure outcome under each experimental condition.
Compute differences (D = X1 – X2).
Two Dependent/Matched Samples
Subject ID Measure 1 Measure 2
1 55 70
2 42 60
.
.
Measures taken serially in time or under different experimental conditions.
Crossover Trial
[Diagram: eligible participants are randomized (R) to receive treatment then placebo, or placebo then treatment]
Each participant is measured on treatment and placebo.
Hypothesis Testing for μd
Continuous outcome
Two matched/paired samples
H0: μd = 0
H1: μd > 0, μd < 0, μd ≠ 0
Test statistic:
n ≥ 30: Z = (X̄d – μd)/(sd/√n) (find critical value in Table 1C)
n < 30: t = (X̄d – μd)/(sd/√n) (find critical value in Table 2, df = n – 1)
Example 7.10.
Hypothesis Testing for μd (1 of 3)
Is there a statistically significant difference in mean systolic blood pressures (SBPs) measured at exams 6 and 7 (approximately 4 years apart) in the Framingham Offspring Study?
Among n = 15 randomly selected participants, the mean difference was –5.3 units and the standard deviation was 12.8 units. Differences were computed by subtracting the exam 6 value from the exam 7 value.
1. H0: μd = 0
H1: μd ≠ 0 α = 0.05
2. Test statistic: t = (X̄d – μd)/(sd/√n)
3. Decision rule: df = n – 1 = 14
Reject H0 if t ≥ 2.145 or if t ≤ –2.145
Example 7.10.
Hypothesis Testing for μd (2 of 3)
4. Compute test statistic: t = (–5.3 – 0)/(12.8/√15) = –1.60
5. Conclusion. Do not reject H0 because –2.145 < –1.60 < 2.145. We do not have statistically significant evidence at α = 0.05 to show that there is a difference in systolic blood pressures over time.
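The paired test in Example 7.10 reduces to a one-sample t statistic on the differences. A minimal sketch, with the critical value 2.145 taken from the decision rule on the slide:

```python
from math import sqrt

n = 15                    # number of participants (pairs)
mean_diff = -5.3          # mean of exam 7 minus exam 6 differences
sd_diff = 12.8            # standard deviation of the differences

t = (mean_diff - 0) / (sd_diff / sqrt(n))   # test statistic under H0: mean difference = 0
reject = abs(t) >= 2.145                    # two-sided critical value, df = 14

print(round(t, 2), reject)
```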
Example 7.10.
Hypothesis Testing for μd (3 of 3)
New Scenario
Outcome is dichotomous
Result of surgery (success, failure)
Cancer remission (yes/no)
Two independent study samples
Data
On each participant, identify group and measure outcome (yes/no)
Hypothesis Testing for (p1 – p2)
Dichotomous outcome
Two independent samples
H0: p1 = p2
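H1: p1 > p2, p1 < p2, p1 ≠ p2
Test statistic: Z = (p̂1 – p̂2)/√(p̂(1 – p̂)(1/n1 + 1/n2)), where p̂ is the overall proportion of successes in the two samples combined
(Find critical value in Table 1C)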
Example 7.12.
Hypothesis Testing for (p1 – p2) (1 of 4)
Is the prevalence of CVD different in smokers as compared to nonsmokers in the Framingham Offspring Study?
Free of CVD History of CVD Total
Nonsmoker 2757 298 3055
Current smoker 663 81 744
Total 3420 379 3799
1. H0: p1 = p2
H1: p1 ≠ p2 α = 0.05
2. Test statistic: Z = (p̂1 – p̂2)/√(p̂(1 – p̂)(1/n1 + 1/n2))
3.Decision rule:
Reject H0 if Z ≤ –1.96 or if Z ≥ 1.96
Example 7.12.
Hypothesis Testing for (p1 – p2) (2 of 4)
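4. Compute test statistic: with group 1 = current smokers and group 2 = nonsmokers, p̂1 = 81/744 = 0.1089, p̂2 = 298/3055 = 0.0975, and p̂ = 379/3799 = 0.0998, so Z = (0.1089 – 0.0975)/√(0.0998(1 – 0.0998)(1/744 + 1/3055)) = 0.0114/0.0123 = 0.927
Example 7.12.
Hypothesis Testing for (p1 – p2) (3 of 4)
5. Conclusion. Do not reject H0 because –1.96 < 0.927 < 1.96. We do not have statistically significant evidence at α = 0.05 to show that there is a difference in prevalence of CVD between smokers and nonsmokers.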
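A sketch of the two-sample test for proportions in Example 7.12, taken from the counts in the table. It assumes group 1 is current smokers and group 2 is nonsmokers, and uses the pooled proportion in the standard error. Without intermediate rounding the statistic is about 0.924, slightly below the slide's 0.927; the conclusion is the same.

```python
from math import sqrt

x1, n1 = 81, 744          # current smokers with a history of CVD, total smokers
x2, n2 = 298, 3055        # nonsmokers with a history of CVD, total nonsmokers

p1, p2 = x1 / n1, x2 / n2
p_pooled = (x1 + x2) / (n1 + n2)      # overall proportion with CVD

z = (p1 - p2) / sqrt(p_pooled * (1 - p_pooled) * (1 / n1 + 1 / n2))
reject = abs(z) >= 1.96               # two-sided test at the 0.05 level

print(round(p1, 4), round(p2, 4), round(p_pooled, 4), reject)
```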
Example 7.12.
Hypothesis Testing for (p1 – p2) (4 of 4)
Hypothesis Testing for More than Two Means*
Continuous outcome
k independent samples, k > 2
H0: μ1 = μ2 = μ3 = … = μk
H1: Means are not all equal
Test statistic: F = MSB/MSE
(Find critical value in Table 4)
*Analysis of variance
Test Statistic: F Statistic
Comparison of two estimates of variability in data
Between-treatment variation is based on the assumption that H0 is true (i.e., population means are equal).
Within-treatment (residual or error) variation is independent of H0 (i.e., we do not assume that the population means are equal and we treat each sample separately).
F Statistic (1 of 2)
MSB: based on the difference between each group mean and the overall mean (between-group variation)
MSE: based on the difference between each observation and its group mean (within-group variation, or error)
F = MSB/MSE
MS = Mean Square
What values of F indicate H0 is likely true?
F Statistic (2 of 2)
Decision Rule
Reject H0 if F ≥ critical value of F with df1 = k – 1 and df2 = N – k from Table 4
k = Number of comparison groups
N = Total sample size
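The sums-of-squares formulas behind F = MSB/MSE did not survive extraction. In the usual one-factor ANOVA notation (group j has n_j observations X_ij, group mean X̄_j, and overall mean X̄; these symbol choices are assumptions consistent with the ANOVA table that follows), they are:

```latex
\begin{aligned}
SSB &= \sum_{j=1}^{k} n_j\,(\bar{X}_j - \bar{X})^2, & MSB &= \frac{SSB}{k-1},\\
SSE &= \sum_{j=1}^{k}\sum_{i=1}^{n_j} (X_{ij} - \bar{X}_j)^2, & MSE &= \frac{SSE}{N-k},\\
F &= \frac{MSB}{MSE}.
\end{aligned}
```

Values of F near 1 are consistent with H0; large values indicate that the group means differ by more than within-group variation would explain.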
ANOVA Table
Source of Variation    Sums of Squares    df       Mean Squares         F
Between treatments     SSB                k – 1    MSB = SSB/(k – 1)    F = MSB/MSE
Error                  SSE                N – k    MSE = SSE/(N – k)
Total                  SST                N – 1
Is there a significant difference in mean weight loss among four different diet programs?
(Data are pounds lost over 8 weeks)
Low-Cal Low-Fat Low-Carb Control
8 2 3 2
9 4 5 2
6 3 4 -1
7 5 2 0
3 1 3 3
Example 7.14.
ANOVA (1 of 12)
1. H0: μ1 = μ2 = μ3 = μ4
H1: Means are not all equal α = 0.05
2. Test statistic: F = MSB/MSE
Example 7.14.
ANOVA (2 of 12)
3.Decision rule:
df1 = k – 1 = 4 – 1 = 3
df2 = N – k = 20 – 4 =16
Reject H0 if F ≥ 3.24
Example 7.14.
ANOVA (3 of 12)
Summary Statistics on Weight Loss by Treatment
Low-Cal Low-Fat Low-Carb Control
N 5 5 5 5
Mean 6.6 3.0 3.4 1.2
Overall Mean = 3.6
Example 7.14.
ANOVA (4 of 12)
Example 7.14.
ANOVA (5 of 12)
SSB = Σ nj(X̄j – X̄)²
= 5(6.6 – 3.6)² + 5(3.0 – 3.6)² + 5(3.4 – 3.6)² + 5(1.2 – 3.6)²
= 75.8
Example 7.14.
ANOVA (6 of 12)
Example 7.14.
ANOVA (7 of 12)
Example 7.14.
ANOVA (8 of 12)
Example 7.14.
ANOVA (9 of 12)
SSE = ΣΣ(X – X̄j)² = 21.4 + 10.0 + 5.4 + 10.6 = 47.4
Example 7.14.
ANOVA (10 of 12)
Source of Variation    Sums of Squares    df    Mean Squares    F
Between Treatments     75.8               3     25.3            8.43
Error                  47.4               16    3.0
Total                  123.2              19
Example 7.14.
ANOVA (11 of 12)
4.Compute test statistic:
F = 8.43
5. Conclusion. Reject H0 because 8.43 > 3.24. We have statistically significant evidence at α = 0.05 to show that there is a difference in mean weight loss among four different diet programs.
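The ANOVA in Example 7.14 can be recomputed from the raw weight-loss data. This sketch carries full precision throughout, so its results differ slightly from the slide values, which round the overall mean to 3.6 and some within-group sums along the way: the code gives SSB = 75.75, SSE = 47.2, and F of about 8.56 instead of 75.8, 47.4, and 8.43. The decision to reject H0 is unaffected.

```python
groups = {
    "Low-Cal":  [8, 9, 6, 7, 3],
    "Low-Fat":  [2, 4, 3, 5, 1],
    "Low-Carb": [3, 5, 4, 2, 3],
    "Control":  [2, 2, -1, 0, 3],
}

all_values = [x for values in groups.values() for x in values]
k, N = len(groups), len(all_values)
overall_mean = sum(all_values) / N
means = {name: sum(values) / len(values) for name, values in groups.items()}

# Between-treatment and within-treatment (error) sums of squares
ssb = sum(len(values) * (means[name] - overall_mean) ** 2
          for name, values in groups.items())
sse = sum((x - means[name]) ** 2
          for name, values in groups.items() for x in values)

msb, mse = ssb / (k - 1), sse / (N - k)
f_stat = msb / mse
reject = f_stat >= 3.24          # critical value for df1 = 3, df2 = 16 (from the slide)

print(round(ssb, 2), round(sse, 2), round(f_stat, 2), reject)
```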
Example 7.14.
ANOVA (12 of 12)
Two-Factor ANOVA
Compare means of a continuous outcome across two grouping variables or factors
Overall test—is there a difference in cell means?
Factor A—marginal means
Factor B—marginal means
Interaction—difference in means across levels of Factor B for each level of Factor A?
Interaction
Cell Means          Factor B
                    1     2     3
Factor A    1       45    58    70
            2       65    55    38
[Figure: cell means plotted against the levels of Factor B, with one line for A1 (45, 58, 70) and one for A2 (65, 55, 38)]
No Interaction
Cell Means          Factor B
                    1     2     3
Factor A    1       45    58    70
            2       38    55    65
[Figure: cell means plotted against the levels of Factor B, with one line for A1 (45, 58, 70) and one for A2 (38, 55, 65)]
Example 7.16.
Two-Factor ANOVA (1 of 3)
Clinical trial to compare time to pain relief of three competing drugs for joint pain. Investigators hypothesize that there may be a differential effect in men versus women.
Design: N = 30 participants (15 men and 15 women) are assigned to three treatments (A, B, C)
Example 7.16.
Two-Factor ANOVA (2 of 3)
Mean times to pain relief by treatment and sex
Is there a difference in mean times to pain relief? Are differences due to treatment? Sex? Or both?
Source of Variation    Sums of Squares    df    Mean Square    F       p-value
Model                  967.0              5     193.4          20.7    0.0001
Treatment              651.5              2     325.7          34.8    0.0001
Sex                    313.6              1     313.6          33.5    0.0001
Treatment*Sex          1.9                2     0.9            0.1     0.9054
Error                  224.4              24    9.4
Total                  1191.4             29
Example 7.16.
Two-Factor ANOVA (3 of 3)
Hypothesis Testing for Categorical or Ordinal Outcomes*
Categorical or ordinal outcome
Two or more samples
H0: The distribution of the outcome is independent of the groups
H1: H0 is false
Test statistic: χ² = Σ (O – E)²/E
(Find critical value in Table 3: df = (r – 1)(c – 1))
* χ² test of independence
Chi-Square Test of Independence
Outcome is categorical or ordinal (2+ levels) and there are two or more independent comparison groups (e.g., treatments).
H0: Treatment and outcome are independent (distributions of outcome are the same across treatments)
Is there a relationship between students’ living arrangement and exercise status?
Exercise Status
None Sporadic Regular Total
Dormitory 32 30 28 90
On-campus apt. 74 64 42 180
Off-campus apt. 110 25 15 150
At home 39 6 5 50
Total 255 125 90 470
Example 7.17.
χ² Test of Independence (1 of 6)
1. H0: Living arrangement and exercise status are independent
H1: H0 is false α = 0.05
2. Test statistic: χ² = Σ (O – E)²/E
3. Decision rule: df = (r – 1)(c – 1) = 3(2) = 6
Reject H0 if χ² ≥ 12.59
Example 7.17.
χ² Test of Independence (2 of 6)
4. Compute test statistic: χ² = Σ (O – E)²/E
O = Observed frequency
E = Expected frequency
E = (row total)*(column total)/N
Example 7.17.
χ² Test of Independence (3 of 6)
4. Compute test statistic:
Table entries are Observed (Expected) frequencies
                   Exercise Status
                   None                           Sporadic     Regular      Total
Dormitory          32 (90*255/470 = 48.8)         30 (23.9)    28 (17.2)    90
On-campus apt.     74 (97.7)                      64 (47.9)    42 (34.5)    180
Off-campus apt.    110 (81.4)                     25 (39.9)    15 (28.7)    150
At home            39 (27.1)                      6 (13.3)     5 (9.6)      50
Total              255                            125          90           470
Example 7.17.
χ² Test of Independence (4 of 6)
4. Compute test statistic: χ² = Σ (O – E)²/E = 60.5
Example 7.17.
χ² Test of Independence (5 of 6)
Example 7.17.
χ² Test of Independence (6 of 6)
5. Conclusion. Reject H0 because 60.5 > 12.59. We have statistically significant evidence at α = 0.05 to show that living arrangement and exercise status are not independent. (p < 0.005)
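The expected frequencies and the test statistic for Example 7.17 can be reproduced from the observed table. The sketch computes each expected count as (row total)*(column total)/N and sums (O – E)²/E over all twelve cells; with unrounded expected counts the statistic is about 60.4, in line with the slide's 60.5. The critical value 12.59 is taken from the decision rule.

```python
observed = {
    "Dormitory":       [32, 30, 28],     # none, sporadic, regular
    "On-campus apt.":  [74, 64, 42],
    "Off-campus apt.": [110, 25, 15],
    "At home":         [39, 6, 5],
}

row_totals = {row: sum(counts) for row, counts in observed.items()}
col_totals = [sum(col) for col in zip(*observed.values())]
N = sum(row_totals.values())

chi2 = 0.0
for row, counts in observed.items():
    for j, o in enumerate(counts):
        e = row_totals[row] * col_totals[j] / N    # expected frequency under H0
        chi2 += (o - e) ** 2 / e

df = (len(observed) - 1) * (len(col_totals) - 1)   # (r - 1)(c - 1)
reject = chi2 >= 12.59                             # critical value, df = 6

print(df, round(chi2, 1), reject)
```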