Case study

The goal of this case study is to build a model that predicts the sale price of a house from its characteristics.

We will start with data exploration and then demonstrate how to build regression models with two Python modules:

  1. statsmodels

  2. scikit-learn

Data exploration

In November 2020, information on 98 houses in the Duke Forest neighborhood of Durham, NC was scraped from the real estate marketplace Zillow. All of the homes had recently been sold at the time of data collection.

Let’s start with our data exploration:

Jupyter notebook
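
As a rough sketch of what a first look at this kind of data might involve, the snippet below builds a small synthetic stand-in for the listings and runs a few common pandas checks. The column names `price`, `area`, and `bed`, as well as all values, are assumptions made for illustration; they are not the scraped Zillow data.

```python
import numpy as np
import pandas as pd

# Synthetic stand-in for the 98 listings (hypothetical columns and values).
rng = np.random.default_rng(0)
n = 98
area = rng.uniform(1000, 4500, size=n)   # living area in square feet
bed = rng.integers(2, 7, size=n)         # number of bedrooms (2 to 6)
price = 100_000 + 160 * area + rng.normal(0, 50_000, size=n)

df = pd.DataFrame({"price": price, "area": area, "bed": bed})

# Number of rows and columns
print(df.shape)

# Summary statistics for every numeric column
print(df.describe().round(1))

# Count of missing values per column
print(df.isna().sum())

# Linear correlation of each column with the response
print(df.corr()["price"].sort_values(ascending=False))
```

With real data, the same checks would typically be followed by plots of the response against each candidate predictor.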

Statsmodels

Linear regression with statsmodels:

Jupyter notebook
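
A minimal sketch of a simple linear regression with the statsmodels formula interface could look like the following. It uses synthetic data with hypothetical `price` and `area` columns rather than the actual listings, so the estimates say nothing about the real neighborhood.

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

# Synthetic stand-in data (hypothetical columns and values).
rng = np.random.default_rng(0)
n = 98
area = rng.uniform(1000, 4500, size=n)
price = 100_000 + 160 * area + rng.normal(0, 50_000, size=n)
df = pd.DataFrame({"price": price, "area": area})

# Ordinary least squares: price explained by area (intercept added by the formula)
model = smf.ols("price ~ area", data=df).fit()

print(model.params)     # estimated intercept and slope
print(model.rsquared)   # share of variance explained
# print(model.summary())  # full regression table

# Predict the price of two hypothetical homes
new_homes = pd.DataFrame({"area": [2000, 3000]})
predictions = model.predict(new_homes)
print(predictions)
```

The formula interface is convenient for this kind of work because `model.summary()` reports standard errors, confidence intervals, and p-values alongside the coefficients.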

Scikit-learn

Linear regression with scikit-learn:

Jupyter notebook
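
The following is a minimal sketch of the same kind of model in scikit-learn. The data are synthetic, the `price` and `area` columns are hypothetical, and the 80/20 train/test split is an assumption added here to illustrate out-of-sample evaluation.

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error, r2_score
from sklearn.model_selection import train_test_split

# Synthetic stand-in data (hypothetical columns and values).
rng = np.random.default_rng(0)
n = 98
area = rng.uniform(1000, 4500, size=n)
price = 100_000 + 160 * area + rng.normal(0, 50_000, size=n)
df = pd.DataFrame({"price": price, "area": area})

X = df[["area"]]   # feature matrix must be two-dimensional
y = df["price"]    # response

# Hold out part of the data to evaluate the fitted model
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0
)

reg = LinearRegression().fit(X_train, y_train)
print(reg.intercept_, reg.coef_)

# Evaluate on the held-out homes
y_pred = reg.predict(X_test)
rmse = np.sqrt(mean_squared_error(y_test, y_pred))
r2 = r2_score(y_test, y_pred)
print(rmse, r2)
```

Unlike statsmodels, scikit-learn does not report inferential statistics such as p-values; its interface is geared toward prediction and model evaluation.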