Inference for numerical data Solution

```{r setup, include=FALSE} knitr::opts_chunk$set(echo = TRUE) library(tidyverse) library(openintro) library(infer) ``` ## Exercise 1 (5 Points) **1 Point** H0: p = 0.66, HA: p not = 0.66 OR $H_0: p = 0.66, H_A: p \ne 0.66$ ```{r, warning = FALSE, message = FALSE} yrbss <- yrbss %>% mutate(physical_3plus = if_else(physically_active_7d > 2, "yes", "no")) #1 Point yrbss%>% filter(!is.na(physical_3plus))%>% prop_test(response = physical_3plus, success = "yes", p = 0.66, alternative = "two-sided", z = T) #1 Point for hypothesis test yrbss%>% filter(!is.na(physical_3plus))%>% prop_test(response = physical_3plus, success = "yes", alternative = "two-sided", z = T, conf_int = T, conf_level = 0.98) #1 Point for confidence interval ``` **0.5 Points** Since the p-value is greater than 0.02, we fail to reject the null hypothesis. **0.5 Points** CI: (0.6596, 0.6785) ## Exercise 2 (3 Points) **1 Point** H0: pmale - pfemale = 0, HA: pmale - pfemale not = 0 OR $H_0: p_{male} - p_{female} = 0, H_A: p_{male} - p_{female} \ne 0$ ```{r, warning = FALSE, message = FALSE} yrbss%>% filter(!is.na(physical_3plus), !is.na(gender))%>% prop_test(response = physical_3plus, success = "yes", explanatory = gender, order = c("male","female"), z = TRUE, conf_int = TRUE, conf_level = 0.95)#1 Point ``` **0.5 Points** Since p is less than 0.05, we reject the null hypothesis. **0.5 Points** CI: (0.1962, 0.2313) ## Exercise 3 (3 Points) **1 Point** H0: mu = 66.82, HA: mu not = 66.82 OR $H_0: \mu = 66.82, H_A: \mu \ne 66.82$ ```{r, warning = FALSE, message = FALSE} yrbss%>% filter(!is.na(weight))%>% t_test(response = weight, mu = 66.82, conf_int = TRUE, conf_level = 0.95) #1 Point ``` **0.5 Points** Since p-value is less than 0.05, we reject the null hypothesis. **0.5 Points** CI: (67.61, 68.20) ## Exercise 4 (3 Points) **2 Points** Explanation: There's seems to be a slight difference in the violin plots. Physically inactive students have a higher standard deviation compared to the physically active students. Physically inactive students have a slightly lower mean weight. ```{r, warning = FALSE, message = FALSE} yrbss %>% filter(!is.na(physical_3plus), !is.na(weight))%>% ggplot(aes(x = physical_3plus, y = weight, fill = physical_3plus))+ geom_violin() #1 Point ``` ## Exercise 5 (3 Points) **2 Points** There is an observable difference. But the difference isn't big enough for us to deem it statistically significant without an hypothesis test. ```{r, warning = FALSE, message = FALSE} yrbss %>% filter(!is.na(physical_3plus), !is.na(weight))%>% group_by(physical_3plus) %>% summarise(mean_weight = mean(weight)) #1 Point ``` ## Exercise 6 (3 Points) **1 Point** H0: mu_yes - mu_no = 0, HA: mu_yes - mu_no not = 0 OR $H_0: \mu_{yes} - \mu_{no} = 0, H_A: \mu_{yes} - \mu_{no} \ne 0$ ```{r, warning = FALSE, message = FALSE} yrbss%>% filter(!is.na(weight))%>% t_test(response = weight, explanatory = physical_3plus, order = c("yes","no"), mu = 0, conf_int = TRUE, conf_level = 0.95) #1 Point ``` **0.5 Points** Since p-value is less than 0.05, we reject the null hypothesis. **0.5 Points** CI: (1.12, 2.42)