Load libraries:
library(mosaic)
library(dplyr)
library(googledrive)
library(googlesheets4)
Let’s have a quick look at the data:
✔ Reading from "2025Scrabble (Responses)".
✔ Range 'Form Responses 1'.
head(scrabble)
# A tibble: 6 × 2
  Letters Score
    <dbl> <dbl>
1      11    27
2       7    12
3       5    12
4       7    24
5       6     8
6       8    17
Let’s visualize the data with a scatterplot, overlaying the regression line:
gf_point(Score~Letters, data=scrabble) %>% gf_lm()
And now, fit the linear regression model:
lm.scrab<-lm(Score~Letters, data=scrabble)
summary(lm.scrab)
Call:
lm(formula = Score ~ Letters, data = scrabble)
Residuals:
Min 1Q Median 3Q Max
-6.7395 -3.7950 -0.4614 1.8722 10.2605
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 1.2925 2.0678 0.625 0.537
Letters 1.7782 0.2674 6.649 2.31e-07 ***
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Residual standard error: 4.653 on 30 degrees of freedom
Multiple R-squared: 0.5957, Adjusted R-squared: 0.5822
F-statistic: 44.21 on 1 and 30 DF, p-value: 2.311e-07
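The third and fourth columns of the coefficient table can be rebuilt from the first two. The sketch below does this for the Letters row and also forms a confidence interval for the slope. It uses only numbers copied from the summary output above; the 95% level and the two-sided test are assumptions made for illustration, and the variable names are arbitrary.

```r
# Values copied from the summary(lm.scrab) output above
est <- 1.7782   # slope estimate for Letters
se  <- 0.2674   # standard error of the slope
df  <- 30       # residual degrees of freedom

# Column 3: t value = estimate / standard error
t_val <- est / se

# Column 4: two-sided p-value from a t distribution with 30 df
p_val <- 2 * pt(abs(t_val), df = df, lower.tail = FALSE)

# 95% CI for the slope (assumed level): estimate +/- t* x SE
t_star <- qt(0.975, df = df)
ci <- est + c(-1, 1) * t_star * se

round(t_val, 2)
signif(p_val, 3)
round(ci, 2)
```

Because the inputs are rounded, the results agree with the printed table only to about three significant digits. With the fitted model available, confint(lm.scrab) should return the same interval without the rounding.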
Exercises
- Review interpretation of intercept and slope
- Calculate Ŷ (the predicted score) for the number of letters in your name
- Calculate a CI for the slope using the summary output
- Know how to calculate the info in the 3rd and 4th columns of the summary output
- Know how to interpret R^2 and the residual standard error (i.e., σ̂)
Predictions and confidence intervals
scrab.pred<-makeFun(lm.scrab)
scrab.pred(Letters=10, interval="confidence")
       fit      lwr      upr
1 19.07402 16.76276 21.38528
scrab.pred(Letters=10, interval="prediction")
       fit      lwr      upr
1 19.07402 9.293873 28.85416
How do we interpret these intervals?
- confidence interval: I am 95% sure that the mean Scrabble score among individuals who have 10 letters in their name is between 16.76 and 21.39.
- prediction interval: I am 95% sure that the Scrabble score for a single individual with 10 letters in their name is between 9.29 and 28.85.
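To see why the prediction interval is so much wider, it can help to rebuild both intervals by hand. The sketch below uses only numbers printed above; the 95% level is assumed (it is the usual default), and the variable names are arbitrary. The prediction interval adds the residual variance, σ̂², to the variance of the estimated mean.

```r
# Values copied from the output above
b0     <- 1.2925     # intercept estimate
b1     <- 1.7782     # slope estimate
sigma  <- 4.653      # residual standard error
df     <- 30         # residual degrees of freedom
fit    <- 19.07402   # fitted value at Letters = 10
ci_lwr <- 16.76276   # lower limit of the confidence interval

# Point prediction by hand (differs slightly because of rounding)
y_hat <- b0 + b1 * 10

# Recover the standard error of the fitted mean from the CI half-width
t_star <- qt(0.975, df = df)          # assumes a 95% level
se_fit <- (fit - ci_lwr) / t_star

# A new observation also carries residual noise, so the variances add
se_pred <- sqrt(se_fit^2 + sigma^2)
pi_hand <- fit + c(-1, 1) * t_star * se_pred

round(y_hat, 2)
round(pi_hand, 2)
```

The hand-built limits match the printed prediction interval up to rounding. Base R's predict(lm.scrab, newdata = data.frame(Letters = 10), interval = "prediction") is an equivalent way to get the same numbers without makeFun().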