library(tidyverse)
library(tidymodels)
library(openintro)
Hypothesis tests and confidence intervals
STA 199
Bulletin
- this ae is due for grade. Push your completed ae to GitHub within 48 hours to receive credit
- project draft report due to GitHub today
- homework 5 due next Thursday
- Exam 2 next Friday
Getting started
Clone your ae23-username repo from the GitHub organization.
Today
By the end of today you will…
- define type I and type II error
- compare hypothesis tests with confidence intervals
- practice conducting hypothesis tests
Load packages
Practice
Load data
The stent30 data set comes from the openintro package and is from a study conducted in 2011 on the effects of arterial stents as a therapy for stroke patients. See the original publication:
Chimowitz MI, Lynn MJ, Derdeyn CP, et al. 2011. Stenting versus Aggressive Medical Therapy for Intracranial Arterial Stenosis. New England Journal of Medicine 365:993-1003. doi: 10.1056/NEJMoa1105335.
or check ?stent30 for more information.
data(stent30)
glimpse(stent30)
Rows: 451
Columns: 2
$ group <fct> treatment, treatment, treatment, treatment, treatment, treatme…
$ outcome <fct> stroke, stroke, stroke, stroke, stroke, stroke, stroke, stroke…
Exercise 1
Do stents affect stroke outcome in patients?
Write the null and alternative hypotheses. Report the observed statistic.
Simulate under the null.
Compute and report the p-value, compare it to \(\alpha = 0.05\), and make a conclusion with appropriate context.
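The three steps above can be sketched with the infer package (attached by tidymodels). This is a minimal sketch on a small hypothetical data set, not the stent30 analysis itself: the data frame toy and its counts are made up for illustration, and the choice of a two-sided test is an assumption.

```r
library(tibble)
library(infer)

set.seed(1)

# Hypothetical toy data (NOT the stent30 study): 40 patients per group
toy <- tibble(
  group = factor(rep(c("treatment", "control"), each = 40)),
  outcome = factor(c(
    rep(c("stroke", "no event"), times = c(12, 28)),  # treatment group
    rep(c("stroke", "no event"), times = c(6, 34))    # control group
  ))
)

# Step 1: observed statistic, difference in stroke proportions (treatment - control)
obs_diff <- toy |>
  specify(outcome ~ group, success = "stroke") |>
  calculate(stat = "diff in props", order = c("treatment", "control"))

# Step 2: simulate under the null by shuffling the group labels
null_dist <- toy |>
  specify(outcome ~ group, success = "stroke") |>
  hypothesize(null = "independence") |>
  generate(reps = 1000, type = "permute") |>
  calculate(stat = "diff in props", order = c("treatment", "control"))

# Step 3: p-value, the share of simulated statistics at least as extreme as observed
p_value <- null_dist |>
  get_p_value(obs_stat = obs_diff, direction = "two-sided")

p_value
```

Comparing p_value to \(\alpha = 0.05\) then gives the decision: reject the null if the p-value is smaller, otherwise fail to reject.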
Confidence intervals and hypothesis tests
Here we revisit the data from the first three seasons of NC Courage games (2017-2019).
courage = read_csv("https://sta101-fa22.netlify.app/static/labs/data/courage.csv")
glimpse(courage)
Rows: 78
Columns: 10
$ game_id <chr> "washington-spirit-vs-north-carolina-courage-2017-04-15", …
$ game_date <chr> "4/15/2017", "4/22/2017", "4/29/2017", "5/7/2017", "5/14/2…
$ game_number <dbl> 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17,…
$ home_team <chr> "WAS", "NC", "NC", "BOS", "ORL", "NC", "NC", "CHI", "NC", …
$ away_team <chr> "NC", "POR", "ORL", "NC", "NC", "CHI", "NJ", "NC", "KC", "…
$ opponent <chr> "WAS", "POR", "ORL", "BOS", "ORL", "CHI", "NJ", "CHI", "KC…
$ home_pts <dbl> 0, 1, 3, 0, 3, 1, 2, 3, 2, 3, 0, 0, 2, 1, 1, 0, 1, 2, 2, 2…
$ away_pts <dbl> 1, 0, 1, 1, 1, 3, 0, 2, 0, 1, 1, 1, 0, 0, 0, 1, 2, 0, 3, 1…
$ result <chr> "win", "win", "win", "win", "loss", "loss", "win", "loss",…
$ season <dbl> 2017, 2017, 2017, 2017, 2017, 2017, 2017, 2017, 2017, 2017…
Do National Women’s Soccer League (NWSL) teams have a home-field advantage? We’ll answer this question in a few separate ways.
Hypothesis testing framework: does NC Courage score a significantly different number of points (on average) away than at home?
Exercise 2
- Create a new column location that tells you whether the Courage are “home” or “away”
- Create a new column pts that always reports the Courage points scored in a game.
- Save your result as a new data frame titled courage2.
# code here
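One possible approach, sketched on only the first three games copied from the glimpse(courage) output above. The names first_games and courage2_sketch are hypothetical, and the sketch assumes the Courage are always coded as "NC" in home_team or away_team.

```r
library(dplyr)
library(tibble)

# First three games, copied from the glimpse(courage) output
first_games <- tibble(
  home_team = c("WAS", "NC", "NC"),
  away_team = c("NC", "POR", "ORL"),
  home_pts  = c(0, 1, 3),
  away_pts  = c(1, 0, 1)
)

courage2_sketch <- first_games |>
  mutate(
    # The Courage are at home whenever they are listed as the home team
    location = if_else(home_team == "NC", "home", "away"),
    # Pick the points column that belongs to the Courage in that game
    pts = if_else(home_team == "NC", home_pts, away_pts)
  )

courage2_sketch
```

The same mutate() call should carry over to the full courage data frame if the team coding holds for every row.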
Exercise 3
To answer the question “does NC Courage score a significantly different number of points (on average) away than at home?”:
Write the null and alternative hypotheses. Report the observed statistic.
Simulate under the null.
Compute and report the p-value, compare it to \(\alpha = 0.05\), and make a conclusion with appropriate context.
# code here
Exercise 4
- Report the mean difference between away and home games and report a 95% bootstrap confidence interval. Use set.seed(3) and reps = 5000. Interpret your interval in context.
# code here
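A minimal sketch of a percentile bootstrap interval for a difference in means with infer. The data frame toy_games is hypothetical (made-up points, not real Courage results) and only borrows the location and pts column names from Exercise 2.

```r
library(tibble)
library(infer)

set.seed(3)

# Hypothetical toy data standing in for courage2 (NOT real game results)
toy_games <- tibble(
  location = factor(rep(c("home", "away"), each = 12)),
  pts = c(2, 3, 1, 2, 0, 4, 2, 1, 3, 2, 1, 2,   # home games
          1, 0, 2, 1, 1, 3, 0, 2, 1, 1, 0, 2)   # away games
)

# Observed difference in mean points, away - home
obs_diff <- toy_games |>
  specify(pts ~ location) |>
  calculate(stat = "diff in means", order = c("away", "home"))

# Bootstrap distribution: resample games with replacement, recompute the difference
boot_dist <- toy_games |>
  specify(pts ~ location) |>
  generate(reps = 5000, type = "bootstrap") |>
  calculate(stat = "diff in means", order = c("away", "home"))

# 95% percentile interval: the middle 95% of the bootstrap statistics
ci <- boot_dist |>
  get_ci(level = 0.95, type = "percentile")

ci
```

If such an interval excludes 0, a two-sided test of no difference at \(\alpha = 0.05\) will typically reject the null as well, which is one way to compare the two approaches.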
Exercise 5
Is there a better way we could investigate whether or not the Courage have a home-field advantage? Why?
Notes
Type 1 and Type 2 Errors
Truth | Reject the null | Fail to reject the null |
---|---|---|
\(H_0\) is true | Type 1 error | ✔️ |
\(H_A\) is true | ✔️ | Type 2 error |
The significance level, \(\alpha\), is the probability of a type 1 error. In some contexts, a type 1 error may be referred to as a “false positive” and a type 2 error as a “false negative”.
Intuitively, by considering extremes, one can see a trade-off exists between type 1 and type 2 error.
If \(\alpha = 0\), then the p-value stands no chance of being smaller than \(\alpha\) and we always fail to reject the null. This makes type 1 errors impossible.
Similarly, if \(\alpha = 1\), then all p-values will be smaller than \(\alpha\) and type 2 errors will become impossible, because we will always reject the null.
\(\beta\) is used to denote the probability of a type 2 error.
The power of a test is \(1 - \beta\), which is the probability that your test rejects the null hypothesis when the null hypothesis is false.
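The definitions above can be checked by simulation. The sketch below is illustrative only: it assumes a one-sample t-test of \(H_0: \mu = 0\) on normal data with n = 30, and the effect size 0.5 under the alternative is an arbitrary choice.

```r
set.seed(199)

alpha  <- 0.05
n_sims <- 2000
n      <- 30

# p-values from repeated t-tests of H0: mu = 0 when the data have true mean `mu`
sim_p_values <- function(mu) {
  replicate(n_sims, t.test(rnorm(n, mean = mu, sd = 1), mu = 0)$p.value)
}

p_null <- sim_p_values(mu = 0)    # H0 is true
p_alt  <- sim_p_values(mu = 0.5)  # H0 is false

type1_rate <- mean(p_null < alpha)  # rejecting a true null: estimates alpha
power      <- mean(p_alt < alpha)   # rejecting a false null: estimates 1 - beta
type2_rate <- 1 - power             # failing to reject a false null: estimates beta

# The trade-off: a stricter alpha cannot increase power on the same simulated data
power_strict <- mean(p_alt < 0.01)

c(type1_rate = type1_rate, power = power,
  type2_rate = type2_rate, power_strict = power_strict)
```

The estimated type 1 error rate should land close to \(\alpha\), and lowering \(\alpha\) to 0.01 trades fewer type 1 errors for more type 2 errors.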
Why it’s important to be careful with interpretation
(And why hypothesis tests don’t tell the whole story)
The data for this example comes from “Confounding and Simpson’s paradox” by Julious and Mullee (see Footnotes).
The data examines 901 individuals with diabetes and includes the following variables:
- insulin_dep: whether or not the patient has insulin dependent or non-insulin dependent diabetes
- age: whether or not the individual is less than 40 years old
- survival: whether or not the individual survived the length of the study
diabetes = read_csv("https://sta101.github.io/static/appex/data/diabetes.csv")
Flex Aisher thinks people with insulin dependent diabetes actually survive longer than those without insulin dependence. Flex wants to formally test his hypothesis.
Let \(p_{d}\) be the probability of survival for individuals with insulin dependent diabetes and \(p_{i}\) be the probability of survival for individuals with non-insulin dependent diabetes.
\[ H_0: p_{d} - p_{i} = 0 \]
\[ H_A: p_{d} - p_{i} > 0 \]
At first glance the data seem to back up his claim…
Exercise 6
Compute the probability of survival and death for diabetic individuals with and without insulin dependence.
# code here
Exercise 7
Is Flex’s claim significant at the \(\alpha = 0.05\) level? Perform a hypothesis test and report your results.
# code here
Exercise 8
Is the aggregate data misleading? Use the code chunk below to investigate further.
# code here
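To see how aggregate data can mislead, here is a minimal sketch with hypothetical counts (made up for illustration, NOT the diabetes data). The column names age_group, survived and total are assumptions and need not match the real data set.

```r
library(dplyr)
library(tibble)

# Hypothetical counts chosen to produce Simpson's paradox
toy <- tibble(
  age_group   = c("under 40", "under 40", "40 or over", "40 or over"),
  insulin_dep = c("yes", "no", "yes", "no"),
  survived    = c(180, 19, 10, 120),
  total       = c(200, 20, 20, 200)
)

# Aggregate view: pool both age groups
overall <- toy |>
  group_by(insulin_dep) |>
  summarize(surv_rate = sum(survived) / sum(total))

# Stratified view: survival rate within each age group
by_age <- toy |>
  mutate(surv_rate = survived / total) |>
  select(age_group, insulin_dep, surv_rate)

overall
by_age
```

In these made-up counts the insulin dependent group has the higher survival rate overall but the lower rate within each age group, because most of its members are under 40. Age acts as a confounder.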
Footnotes
Julious, S A, and M A Mullee. “Confounding and Simpson’s paradox.” BMJ (Clinical research ed.) vol. 309,6967 (1994): 1480-1. doi:10.1136/bmj.309.6967.1480