Chapter 3 Model building

3.1 Model specification

A model specification has three parts:

  1. Pick a model type, such as linear_reg() for linear regression.
  2. Set the engine, the underlying software that fits the model, such as "lm".
  3. Set the mode: regression or classification.

library(tidymodels)

lm_spec <- # your model specification
  linear_reg() %>%  # model type
  set_engine(engine = "lm") %>%  # model engine
  set_mode("regression") # model mode

# Show your model specification
lm_spec
## Linear Regression Model Specification (regression)
## 
## Computational engine: lm
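
The same three steps apply to other model types. As a minimal sketch, assuming the rpart package is available, a classification tree could be specified as below. The name tree_spec is hypothetical and is not used elsewhere in this chapter.

```r
library(tidymodels)

# A specification only records intent; no data is involved yet.
tree_spec <-
  decision_tree() %>%          # model type
  set_engine("rpart") %>%      # model engine (assumes rpart is installed)
  set_mode("classification")   # model mode

# Show the specification
tree_spec
```

Because a specification holds no data, it can be written once and fitted later to any suitable data set.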

3.2 Model training

In the training process, you run an algorithm on data and thereby produce a model. This process is also called model fitting.

lm_fit <- # your fitted model
  lm_spec %>%  # your model specification
  fit(
    median_house_value ~ median_income, # a linear regression formula
    data = new_train # your data
  )

# Show your fitted model
lm_fit
## parsnip model object
## 
## Fit time:  5ms 
## 
## Call:
## stats::lm(formula = median_house_value ~ median_income, data = data)
## 
## Coefficients:
##   (Intercept)  median_income  
##         46669          41309
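
The two coefficients define a straight line: predicted value = intercept + slope * median_income. The sketch below applies that line by hand to a hypothetical income value of 5, expressed in whatever units median_income uses. The variable names are illustrative only.

```r
# Coefficients copied from the printed fit above.
intercept <- 46669
slope <- 41309

# Hypothetical income value, in the units used by median_income.
income <- 5

# Apply the fitted line: intercept + slope * predictor.
predicted_value <- intercept + slope * income
predicted_value
## [1] 253214
```

Each additional unit of median_income therefore raises the predicted median_house_value by about 41309.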

3.3 Model predictions

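We use our fitted model to make predictions. Here we predict on the training data (new_train) and add the observed values as a price_truth column, so that predictions and truth sit side by side.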

price_pred <- 
  lm_fit %>% 
  predict(new_data = new_train) %>%
  mutate(price_truth = new_train$median_house_value)

head(price_pred)
## # A tibble: 6 x 2
##     .pred price_truth
##     <dbl>       <dbl>
## 1 390577.      452600
## 2 389594.      358500
## 3 346467.      352100
## 4 279782.      341300
## 5 213427.      269700
## 6 175554.      241400
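
The new_data argument also accepts observations the model has not seen, provided the predictor columns carry the same names as in the formula. The following is a self-contained sketch, not the chapter's own data: it assumes the built-in mtcars data set as a stand-in for new_train, and the names car_fit, new_cars and car_pred are hypothetical.

```r
library(tidymodels)

# Built-in mtcars data stands in for new_train (an assumption for illustration).
car_fit <-
  linear_reg() %>%
  set_engine("lm") %>%
  set_mode("regression") %>%
  fit(mpg ~ wt, data = mtcars)

# Hypothetical new observations; the column name must match the predictor.
new_cars <- tibble(wt = c(2.5, 3.5))

# predict() returns a tibble with a .pred column, one row per new observation.
car_pred <-
  car_fit %>%
  predict(new_data = new_cars) %>%
  bind_cols(new_cars)

car_pred
```

Binding the predictions back to the inputs keeps each prediction next to the values that produced it.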

3.4 Model evaluation

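We use the root mean squared error (RMSE) to evaluate our regression model. The function rmse(data, truth, estimate) computes it from a data frame, the column of observed values, and the column of predictions. The result is in the same units as the outcome, and because price_pred holds predictions for the training data, it measures training error.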

rmse(data = price_pred, 
     truth = price_truth, 
     estimate = .pred)
## # A tibble: 1 x 3
##   .metric .estimator .estimate
##   <chr>   <chr>          <dbl>
## 1 rmse    standard      83734.
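
To see what rmse() computes, the calculation can be reproduced by hand: square each error, average the squares, and take the square root. The sketch below does this in base R for only the six rows shown in the head() output earlier, so its result will differ from the value for the full training set. The name rmse_manual is hypothetical.

```r
# First six rows of price_pred, copied from the head() output (rounded values).
pred  <- c(390577, 389594, 346467, 279782, 213427, 175554)
truth <- c(452600, 358500, 352100, 341300, 269700, 241400)

# RMSE: square the errors, average them, take the square root.
rmse_manual <- sqrt(mean((truth - pred)^2))
rmse_manual
```

Squaring before averaging means that large errors weigh more heavily than small ones.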