---
title: "Lab 6: Central Limit Theorem"
author:  "ADD YOUR NAME HERE"
date: now
date-format: "DD/MM/YYYY HH:MM"
format:
    pdf
#    html:
#      theme: a11y
#      highlight-style: a11y
#      self-contained: true
---
 
 
## Load R libraries

```{r}
#| warning: false
#| message: false
library(mosaic)
library(knitr)
```

## Setting the seed of the random number generator

Use the `set.seed()` function in R to initialize the random number generator so that your results are the same each time you render the document.


```{r}
set.seed(02041971)
```
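
To see what the seed does, here is a minimal base-R sketch. The seed value `123` and the names `first_draw` and `second_draw` are arbitrary choices for illustration, not part of the lab. Resetting the same seed before drawing reproduces the same random numbers.

```r
# Illustrative only: the seed value and object names are arbitrary.
set.seed(123)
first_draw <- sample(1:100, size = 5)

set.seed(123)  # reset the generator to the same starting state
second_draw <- sample(1:100, size = 5)

# Both draws match because the generator restarted from the same state
identical(first_draw, second_draw)
```

If you try this in the console, rerun the `set.seed()` chunk above afterwards so the rest of the lab starts from the lab's seed.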
 
## Load the bear data

```{r}
bdat<-read.csv("bears.csv")
```

## Sampling Distribution of the Sample Mean

### Exercises 1 and 2

Explore the distribution of the number of bears per quadrat (for the 164 surveyed quadrats) by calculating summary statistics with `favstats` and by constructing a histogram of the sample data with `histogram`.
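
As a rough sketch of the two calls, the example below uses simulated counts rather than the bear data. The data frame `toy` and its column `count` are made up for illustration; in the lab you would presumably pass `bdat` and its `Num.Bears` column instead.

```r
library(mosaic)

# Hypothetical stand-in data: 200 simulated counts (not the bear data)
toy <- data.frame(count = rpois(200, lambda = 1.5))

# Summary statistics: min, quartiles, max, mean, sd, n, and number missing
toy_stats <- favstats(~ count, data = toy)
toy_stats

# Histogram of the counts (lattice graphics, loaded with mosaic)
histogram(~ count, data = toy)
```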

```{r}


```

QUESTION:  How would you describe this population?

ANSWER:  

## Creating the Sampling Distribution

### Exercise 1

Create a sampling distribution by taking 5000 different samples of size n = 5. Plot the distribution using `gf_dhistogram` with `gf_fitdistr` to overlay a normal distribution.

```{r}
sampdist5 <- do(5000) * {
  # Draw a random sample of 5 quadrats without replacement
  sampdat <- sample(bdat, size = 5, replace = FALSE)

  # Mean number of bears per quadrat in this sample
  mean.bears <- mean(~Num.Bears, data = sampdat)

  # Estimated total: mean bears per quadrat times the 164 quadrats
  N.hat <- 164 * mean.bears
}
# Density histogram of the 5000 estimates with a fitted normal curve
gf_dhistogram(~result, data = sampdist5) %>% gf_fitdistr(dist = "dnorm")
```


### Exercise 2

Create a sampling distribution by taking 5000 different samples of size n = 30. Plot the distribution using `gf_dhistogram` with `gf_fitdistr` to overlay a normal distribution.

```{r}
 

```


### Exercise 3

Create a sampling distribution by taking 5000 different samples of size n = 75. Plot the distribution using `gf_dhistogram` with `gf_fitdistr` to overlay a normal distribution.

```{r}


```

### Exercise 4

QUESTION: For each sample size, describe the sampling distribution. Consider its shape and center. How does the sampling distribution change as you increase the sample size?

ANSWER:  


## Central Limit Theorem

### Exercise 5


QUESTION: Does n = 5 appear to be large enough for the CLT to apply? What about n = 30? n = 75?

ANSWER:   



## Inference from a single sample

### Exercises 1 and 2
 

Take a single random sample of n = 75 quadrats and estimate the mean number of bears per quadrat. Also estimate the population size using $\hat{N} = 164 \times \bar{x}$, where $\bar{x}$ is the sample mean number of bears per quadrat.
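
The same two steps can be sketched on a made-up population. Everything here (`toy_pop`, `count`, 100 plots, a sample of 20) is hypothetical and chosen only to show the shape of the calculation, not the lab's answer.

```r
library(mosaic)

# Hypothetical population: 100 plots with simulated counts (not the bear data)
toy_pop <- data.frame(count = rpois(100, lambda = 2))

# One simple random sample of 20 plots, drawn without replacement
one_samp <- sample(toy_pop, size = 20, replace = FALSE)

# Sample mean count per plot
xbar <- mean(~ count, data = one_samp)

# Estimated population total: number of plots times the mean per plot
total_hat <- nrow(toy_pop) * xbar
c(xbar, total_hat)
```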
 
```{r}
 

```
 
 
### Exercise 3

Create a bootstrap distribution for $\hat{N}$ by resampling the 75 quadrats from your single sample 5000 times with replacement, calculating the sample mean $\bar{x}$ each time, and then $\hat{N} = 164 \times \bar{x}$. Calculate the standard error of $\hat{N}$ from the bootstrap distribution.
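
A minimal sketch of the resampling loop, assuming a made-up sample: `toy_samp`, `n_plots`, and the 1000 resamples are illustrative choices, not the lab's values. It relies on `do()` naming the output column `result` for a braced block, as the sampling-distribution chunk earlier in this lab does.

```r
library(mosaic)

# Hypothetical sample of 20 plots with simulated counts (not the bear data)
toy_samp <- data.frame(count = rpois(20, lambda = 2))
n_plots <- 100  # assumed number of plots in the toy population

# Resample the 20 plots with replacement 1000 times; each time, scale the
# resampled mean up to an estimated total
boot_toy <- do(1000) * {
  boot_dat <- resample(toy_samp)
  n_plots * mean(~ count, data = boot_dat)
}

# Bootstrap standard error: standard deviation of the bootstrap estimates
boot_se <- sd(~ result, data = boot_toy)
boot_se
```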


```{r}



```

### Exercise 4

Create a 95% confidence interval for N using the bootstrap distribution above.
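
The instruction above does not name a specific interval method, so the sketch below shows two common choices on simulated data: a percentile interval and an estimate plus or minus 2 standard errors. The objects `toy_samp`, `n_plots`, and `boot_toy` are hypothetical; use whichever approach your course materials specify.

```r
library(mosaic)

# Hypothetical sample and population size (not the bear data)
toy_samp <- data.frame(count = rpois(20, lambda = 2))
n_plots <- 100

# Point estimate of the total from the original toy sample
total_hat <- n_plots * mean(~ count, data = toy_samp)

# Bootstrap distribution of the estimated total
boot_toy <- do(1000) * {
  n_plots * mean(~ count, data = resample(toy_samp))
}

# Option 1: percentile interval (middle 95% of the bootstrap estimates)
ci_percentile <- quantile(boot_toy$result, probs = c(0.025, 0.975))

# Option 2: estimate plus or minus 2 bootstrap standard errors
boot_se <- sd(~ result, data = boot_toy)
ci_se <- c(total_hat - 2 * boot_se, total_hat + 2 * boot_se)

ci_percentile
ci_se
```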

```{r}


```

QUESTION:  Does your interval contain N? Are you surprised by your result? Why or why not?

ANSWER:  
 

## CLT and Proportions

### Exercise 1

```{r}
nbikers<-round(400700*0.041) # Number of bikers in Minneapolis
Bike.Y.N<-data.frame(bike=c(rep("Yes", nbikers), rep("No", 400700-nbikers))) # bikers & non-bikers
tally(~bike, data=Bike.Y.N, format = "proportion")
prop(~bike, data=Bike.Y.N, success="Yes") # population proportion
```


Generate a sampling distribution of $\hat{p}$ using a sample size of 50. Use `gf_dhistogram` with `gf_fitdistr` to overlay a normal distribution.
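
A sketch of the same idea on a made-up population, assuming 10,000 people of whom 10% answer "Yes", samples of size 40, and 1000 repetitions. All of these values and the names `toy_pop`, `answer`, `phat_toy`, and `plot_dat` are hypothetical.

```r
library(mosaic)

# Hypothetical population: 10,000 people, 10% of whom answer "Yes"
toy_pop <- data.frame(answer = c(rep("Yes", 1000), rep("No", 9000)))

# Draw 1000 samples of size 40 and record the sample proportion of "Yes"
phat_toy <- do(1000) * {
  toy_samp <- sample(toy_pop, size = 40, replace = FALSE)
  prop(~ answer, data = toy_samp, success = "Yes")
}

# Copy the first column into a data frame with a known column name, so the
# plot does not depend on how do() labels its output
plot_dat <- data.frame(phat = phat_toy[[1]])

# Density histogram of the sample proportions with a fitted normal curve
gf_dhistogram(~ phat, data = plot_dat) %>% gf_fitdistr(dist = "dnorm")

# Center of the sampling distribution; should be close to 0.10
mean(plot_dat$phat)
```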

```{r}
 

```

### Exercise 2

Repeat with a sample size of 100.

```{r}


```

### Exercise 3

Repeat with a sample size of 300.

```{r}
 
```

### Exercise 4

QUESTION:  For each sample size, describe the sampling distribution. Consider its shape and center. How does the sampling distribution change as you increase the sample size?

ANSWER:  
