{
"cells": [
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "slide"
}
},
"source": [
"# Statsmodels"
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "slide"
}
},
"source": [
"## Setup"
]
},
{
"cell_type": "code",
"execution_count": 6,
"metadata": {
"slideshow": {
"slide_type": "fragment"
}
},
"outputs": [],
"source": [
"%matplotlib inline\n",
"\n",
"import pandas as pd\n",
"import seaborn as sns \n",
"import matplotlib.pyplot as plt\n",
"import statsmodels.formula.api as smf\n",
"from statsmodels.tools.eval_measures import mse, rmse\n",
"\n",
"sns.set_theme(style=\"ticks\", color_codes=True)"
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "slide"
}
},
"source": [
"## Data preparation"
]
},
{
"cell_type": "code",
"execution_count": 7,
"metadata": {
"scrolled": true,
"slideshow": {
"slide_type": "fragment"
}
},
"outputs": [],
"source": [
"# Load the preprocessed DataFrame `df`; see the notebook \"Data Exploration\" for the preprocessing details\n",
"from case_duke_data_prep import df"
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "slide"
}
},
"source": [
"## Data splitting"
]
},
{
"cell_type": "code",
"execution_count": 8,
"metadata": {
"slideshow": {
"slide_type": "fragment"
}
},
"outputs": [],
"source": [
"# Hold out 20% of the rows as a test set; random_state makes the split reproducible\n",
"train_dataset = df.sample(frac=0.8, random_state=0)\n",
"test_dataset = df.drop(train_dataset.index)"
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "slide"
}
},
"source": [
"## Modeling"
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "slide"
}
},
"source": [
"### Train the model"
]
},
{
"cell_type": "code",
"execution_count": 9,
"metadata": {
"slideshow": {
"slide_type": "slide"
}
},
"outputs": [],
"source": [
"# Fit a simple linear regression of price on area using ordinary least squares\n",
"lm = smf.ols(formula='price ~ area', data=train_dataset).fit()"
]
},
{
"cell_type": "code",
"execution_count": 10,
"metadata": {
"slideshow": {
"slide_type": "slide"
}
},
"outputs": [
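{
"data": {
"text/html": [
"<table class=\"simpletable\">\n",
"<tr>\n",
"  <td></td> <th>coef</th> <th>std err</th> <th>t</th> <th>P&gt;|t|</th> <th>[0.025</th> <th>0.975]</th>\n",
"</tr>\n",
"<tr>\n",
"  <th>Intercept</th> <td>8.593e+04</td> <td>6.21e+04</td> <td>1.383</td> <td>0.171</td> <td>-3.78e+04</td> <td>2.1e+05</td>\n",
"</tr>\n",
"<tr>\n",
"  <th>area</th> <td>167.7007</td> <td>20.741</td> <td>8.085</td> <td>0.000</td> <td>126.391</td> <td>209.010</td>\n",
"</tr>\n",
"</table>"
],
"text/plain": [
"<class 'statsmodels.iolib.table.SimpleTable'>"
]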
},
"execution_count": 10,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"# Short summary\n",
"lm.summary().tables[1]"
]
},
{
"cell_type": "code",
"execution_count": 11,
"metadata": {
"slideshow": {
"slide_type": "slide"
}
},
"outputs": [
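{
"data": {
"text/html": [
"<table class=\"simpletable\">\n",
"<caption>OLS Regression Results</caption>\n",
"<tr>\n",
"  <th>Dep. Variable:</th> <td>price</td> <th>R-squared:</th> <td>0.462</td>\n",
"</tr>\n",
"<tr>\n",
"  <th>Model:</th> <td>OLS</td> <th>Adj. R-squared:</th> <td>0.455</td>\n",
"</tr>\n",
"<tr>\n",
"  <th>Method:</th> <td>Least Squares</td> <th>F-statistic:</th> <td>65.37</td>\n",
"</tr>\n",
"<tr>\n",
"  <th>Date:</th> <td>Wed, 05 Jan 2022</td> <th>Prob (F-statistic):</th> <td>7.56e-12</td>\n",
"</tr>\n",
"<tr>\n",
"  <th>Time:</th> <td>22:37:38</td> <th>Log-Likelihood:</th> <td>-1053.3</td>\n",
"</tr>\n",
"<tr>\n",
"  <th>No. Observations:</th> <td>78</td> <th>AIC:</th> <td>2111.</td>\n",
"</tr>\n",
"<tr>\n",
"  <th>Df Residuals:</th> <td>76</td> <th>BIC:</th> <td>2115.</td>\n",
"</tr>\n",
"<tr>\n",
"  <th>Df Model:</th> <td>1</td> <th></th> <td></td>\n",
"</tr>\n",
"<tr>\n",
"  <th>Covariance Type:</th> <td>nonrobust</td> <th></th> <td></td>\n",
"</tr>\n",
"</table>\n",
"<table class=\"simpletable\">\n",
"<tr>\n",
"  <td></td> <th>coef</th> <th>std err</th> <th>t</th> <th>P&gt;|t|</th> <th>[0.025</th> <th>0.975]</th>\n",
"</tr>\n",
"<tr>\n",
"  <th>Intercept</th> <td>8.593e+04</td> <td>6.21e+04</td> <td>1.383</td> <td>0.171</td> <td>-3.78e+04</td> <td>2.1e+05</td>\n",
"</tr>\n",
"<tr>\n",
"  <th>area</th> <td>167.7007</td> <td>20.741</td> <td>8.085</td> <td>0.000</td> <td>126.391</td> <td>209.010</td>\n",
"</tr>\n",
"</table>\n",
"<table class=\"simpletable\">\n",
"<tr>\n",
"  <th>Omnibus:</th> <td>26.589</td> <th>Durbin-Watson:</th> <td>2.159</td>\n",
"</tr>\n",
"<tr>\n",
"  <th>Prob(Omnibus):</th> <td>0.000</td> <th>Jarque-Bera (JB):</th> <td>107.927</td>\n",
"</tr>\n",
"<tr>\n",
"  <th>Skew:</th> <td>-0.862</td> <th>Prob(JB):</th> <td>3.66e-24</td>\n",
"</tr>\n",
"<tr>\n",
"  <th>Kurtosis:</th> <td>8.499</td> <th>Cond. No.</th> <td>9.16e+03</td>\n",
"</tr>\n",
"</table><br/><br/>Notes:<br/>[1] Standard Errors assume that the covariance matrix of the errors is correctly specified.<br/>[2] The condition number is large, 9.16e+03. This might indicate that there are<br/>strong multicollinearity or other numerical problems."
],
"text/plain": [
"<class 'statsmodels.iolib.summary.Summary'>\n",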
"\"\"\"\n",
" OLS Regression Results \n",
"==============================================================================\n",
"Dep. Variable: price R-squared: 0.462\n",
"Model: OLS Adj. R-squared: 0.455\n",
"Method: Least Squares F-statistic: 65.37\n",
"Date: Wed, 05 Jan 2022 Prob (F-statistic): 7.56e-12\n",
"Time: 22:37:38 Log-Likelihood: -1053.3\n",
"No. Observations: 78 AIC: 2111.\n",
"Df Residuals: 76 BIC: 2115.\n",
"Df Model: 1 \n",
"Covariance Type: nonrobust \n",
"==============================================================================\n",
" coef std err t P>|t| [0.025 0.975]\n",
"------------------------------------------------------------------------------\n",
"Intercept 8.593e+04 6.21e+04 1.383 0.171 -3.78e+04 2.1e+05\n",
"area 167.7007 20.741 8.085 0.000 126.391 209.010\n",
"==============================================================================\n",
"Omnibus: 26.589 Durbin-Watson: 2.159\n",
"Prob(Omnibus): 0.000 Jarque-Bera (JB): 107.927\n",
"Skew: -0.862 Prob(JB): 3.66e-24\n",
"Kurtosis: 8.499 Cond. No. 9.16e+03\n",
"==============================================================================\n",
"\n",
"Notes:\n",
"[1] Standard Errors assume that the covariance matrix of the errors is correctly specified.\n",
"[2] The condition number is large, 9.16e+03. This might indicate that there are\n",
"strong multicollinearity or other numerical problems.\n",
"\"\"\""
]
},
"execution_count": 11,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"# Full summary\n",
"lm.summary()"
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "slide"
}
},
"source": [
"Individual statistics are available as attributes of the fitted results object:"
]
},
{
"cell_type": "code",
"execution_count": 12,
"metadata": {
"slideshow": {
"slide_type": "fragment"
}
},
"outputs": [
{
"data": {
"text/plain": [
"0.4553434818683253"
]
},
"execution_count": 12,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"# Adjusted R squared \n",
"lm.rsquared_adj"
]
},
{
"cell_type": "code",
"execution_count": 13,
"metadata": {
"slideshow": {
"slide_type": "fragment"
}
},
"outputs": [
{
"data": {
"text/plain": [
"0.4624169431427626"
]
},
"execution_count": 13,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"# R squared\n",
"lm.rsquared"
]
},
{
"cell_type": "code",
"execution_count": 14,
"metadata": {
"slideshow": {
"slide_type": "fragment"
}
},
"outputs": [
{
"data": {
"text/plain": [
"2110.625966301898"
]
},
"execution_count": 14,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"# AIC\n",
"lm.aic"
]
},
{
"cell_type": "code",
"execution_count": 15,
"metadata": {
"slideshow": {
"slide_type": "slide"
}
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"<class 'pandas.core.frame.DataFrame'>\n",
"Int64Index: 78 entries, 26 to 73\n",
"Data columns (total 7 columns):\n",
" # Column Non-Null Count Dtype \n",
"--- ------ -------------- ----- \n",
" 0 price 78 non-null int64 \n",
" 1 bed 78 non-null int64 \n",
" 2 bath 78 non-null float64 \n",
" 3 area 78 non-null int64 \n",
" 4 year_built 78 non-null int64 \n",
" 5 cooling 78 non-null category\n",
" 6 lot 78 non-null float64 \n",
"dtypes: category(1), float64(2), int64(4)\n",
"memory usage: 4.5 KB\n"
]
}
],
"source": [
"train_dataset.info()"
]
},
{
"cell_type": "code",
"execution_count": 16,
"metadata": {
"slideshow": {
"slide_type": "slide"
}
},
"outputs": [],
"source": [
"# Add the in-sample regression predictions to the DataFrame as column \"y_pred\"\n",
"train_dataset['y_pred'] = lm.predict()"
]
},
{
"cell_type": "code",
"execution_count": 17,
"metadata": {
"slideshow": {
"slide_type": "subslide"
}
},
"outputs": [
{
"data": {
"text/plain": [
"31402336646.61913"
]
},
"execution_count": 17,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"# MSE\n",
"mse(train_dataset['price'], train_dataset['y_pred'])"
]
},
{
"cell_type": "code",
"execution_count": 18,
"metadata": {
"slideshow": {
"slide_type": "subslide"
}
},
"outputs": [
{
"data": {
"text/plain": [
"177207.04457390832"
]
},
"execution_count": 18,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"# RMSE\n",
"rmse(train_dataset['price'], train_dataset['y_pred'])"
]
},
{
"cell_type": "code",
"execution_count": 19,
"metadata": {
"slideshow": {
"slide_type": "slide"
}
},
"outputs": [
{
"data": {
"image/png": "iVBORw0KGgoAAAANSUhEUgAAAXYAAAEGCAYAAABxfL6kAAAAOXRFWHRTb2Z0d2FyZQBNYXRwbG90bGliIHZlcnNpb24zLjUuMCwgaHR0cHM6Ly9tYXRwbG90bGliLm9yZy8/fFQqAAAACXBIWXMAAAsTAAALEwEAmpwYAAAovUlEQVR4nO3de3xTVaIv8F+oLZCpFSktZYCmR4/yfp3xjOh4yyDKq6UV8CqQ2h5xWgE7ncEnQ6vVcarCMDJixbHOFYoNo+gMLy9TURg+4wxcUe8IPvCIYlqCTRtaJPRl02SdPzCxjyR7p91JdnZ+388nH0mymqxl219W11p7LZ0QQoCIiDRjQLgrQEREymKwExFpDIOdiEhjGOxERBrDYCci0hgGOxGRxqgm2Jubm5GZmQmLxeK33KlTp3DHHXcgKysLd911F86fPx+iGhIRRQZVBPuxY8ewdOlSmM1mv+WEEFi5ciXy8/OxZ88ejBs3DhUVFaGpJBFRhLgk3BUAgB07dqC0tBQPPvig57Fdu3ahsrISLpcLEyZMQGlpKU6ePAm9Xo/09HQAwIoVK2C328NVbSIiVdKp6crTG2+8Edu2bUNbWxtKS0uxZcsWDBw4EL/73e8wePBgpKWlYefOnUhKSsKJEydwxRVX4OGHH8aQIUPCXXUiItVQxVBMT++++y5qampw2223ITs7GwcOHMCpU6fQ2dmJo0ePYunSpdi5cydGjx6Np556KtzVJSJSFVUMxfTkdDoxb948lJSUAABaWlrgdDrxySefwGAwYNKkSQCAzMxMFBUVhbOqRESqo8oe+7XXXou33noLjY2NEELg0UcfRWVlJaZNm4ampiZ89tlnAICDBw9iwoQJYa4tEZG6qLLHPnbsWBQWFiIvLw8ulwvjxo1DQUEBBg4ciOeeew4lJSVoa2tDSkoK1q9fH+7qEhGpiqomT4mIqP9UORRDRER9F9ahmPb2dnz88cdISkpCTExMOKtCRBQxnE4nbDYbJk6ciEGDBvV6PqzB/vHHH8NoNIazCkREEctkMuGaa67p9XhYgz0pKQnAxcqlpKSEsypERBHDarXCaDR6MrSnsAa7e/glJSUFo0aNCmdViIgijq8hbE6eEhFpDIOdiEhjGOxERBrDYCci0hgGOxGRgkwmE9LS0jBgwACkpaXBZDKFvA6q3CuGiCgSmUwmFBQUoLW1FQBQU1ODgoICAAjpNTvssRMRKaS4uNgT6m6tra0oLi4OaT0Y7ERECqmtrQ3o8WBhsBMRKSQ1NTWgx4OFwU5EpJCysjLo9fpuj+n1epSVlYW0Hgx2IiKFGI1GVFRUwGAwQKfTwWAwoKKiIuSbHXJVDBGRgoxGY9h3rWWPnYhIYxjsREQaw2AnItIYBjsRkcYw2ImINIbBTkSkMQx2IiKNYbATEWmMrGBvbm5GZmYmLBaLzzKHDh3CjTfeqFjFiIiobySD/dixY1i6dCnMZrPPMmfPnsW6deuUrBcREfWRZLDv2LEDpaWlSE5O9lmmpKQEhYWFfl/HbrfDYrF0u1mt1sBrTEREfknuFSO1K9m2bdswfvx4TJkyxW+5yspKlJeXB1Y7IiIKWL82Afv888+xf/9+bN26VbL3nZeXh4ULF3Z7zGq1hn2zHCIirelXsFdXV8Nms2Hx4sVwOBxoaGjAsmXLsH379l5lExISkJCQ0J+3IyIiGfoV7EVFRSgqKgIAWCwW5Obmeg11IiIKnT6tY8/Pz8dHH32kdF2IiEgBsnvsBw8e9Pz7xRdf7PX8qFGjupUhIqLw4JWnREQaw2AnItIYBjsRkcYw2ImINIbBTkSkMQx2IiKNYbATEWkMg52ISGMY7EREGsNgJwoCk8mEtLQ0DBgwAGlpaTCZTOGuEkWRfm0CRkS9mUwmFBQUoLW1FQBQU1ODgoICAOA21RQS7LETKay4uNgT6m6tra0oLi4OU40o2jDYiRRWW1sb0OMUfYI9VMdgJ1JYampqQI
9TdHEP1dXU1EAI4RmqUzLcGexECisrK4Ner+/2mF6vlzw/mKJDKIbqGOxECjMajaioqIDBYIBOp4PBYEBFRQUnTglAaIbquCqGKAiMRiODnLxKTU1FTU2N18eVwh47EVEIhWKoTlawNzc3IzMzExaLpddzb7/9NrKzs5GVlYVVq1bh/PnzilWOiEhrQjFUJxnsx44dw9KlS2E2m3s919zcjEcffRQVFRXYs2cPxowZg2effVaxyhERaZHRaITZbIbL5YLZbFZ82E4y2Hfs2IHS0lIkJyf3es7hcKC0tBTDhw8HAIwZMwZ1dXWKVpCIiAIjOXnqb9zn8ssvx8033wwAaG9vR0VFBe644w6vZe12O+x2e7fHrFZrIHUlIiIZFFkVc+HCBdxzzz0YO3YsFi5c6LVMZWUlysvLlXg7IiLyo9/B3tDQgLvuugvTp0/H2rVrfZbLy8vrFfpWq5VLwoiIFNavYHc6nVixYgXmzZuHVatW+S2bkJCAhISE/rwdERHJ0Kdgz8/PR1FREaxWKz799FM4nU68+eabAICJEyfy0mkiojCSHewHDx70/PvFF18EAEyaNAmfffaZ8rUiIqI+45WnREQaw2AnItIYBjsRkcYw2ImINIbb9hIRhZAQAh9u3ozmM2dw9a23Yvh//Ifi78FgJyIKgabPP8ef587F+a++8jz2r+eew6r6elwyaJCi78VgJyIKEldnJ/5RUoKj69Z5fT4mLg66AcqPiDPYiYgU9uHzz+NtiavxAWBxdTVi4uIUf38GOxGRAloaGvD8d1uY+zP8Rz/CLbt24dJRo4JWFwY7EVE/7LvjDnxaVSVZbn5VFcaHaNNDBjsRUYBsx4+jcsoUyXJJkydjyd//joGXXRaCWn2PwU5EJIMQAv/nqqvwzZdfSpad/vDDuOHXvw5BrbxjsBMR+fHFnj3YlZ0tq2xRczPifvCDINdIGoOdiKiHzvZ2/H7wYFllM195BWNvvz3INQoMg52I6DvvrluHd9askSynHz4cK7/+Oihr0JXAYCeiqNZiteL5ESNklTUePYoR//mfQa5R/6nz44aIFGEymZCWloYBAwYgLS0NJpMp3FVSjb1LlmCDTicZ6v+enY37hcD9QkREqAMMdqKgCXeomkwmFBQUoKamBkII1NTUoKCgIKrDveHDD7FBp8MGnQ7//eqrfsvebbHgfiFwy65doamcgjgUQxQE7lBtbW0FAE+oAoAxRBepFBcXe97frbW1FcXFxSGrgxoIIVBhMODC6dOSZX/y61/juocfDkGtgktWj725uRmZmZmwWCy9njtx4gQWLVqEOXPmoLi4GJ2dnYpXkijS+AvVUKmtrQ3oca35/C9/wQadDr8bMEAy1H/R0oL7hdBEqAMygv3YsWNYunQpzGaz1+cfeOABPPLII3jzzTchhMCOHTuUriNRxFFDqKampgb0uBY42to8Qy17Fi/2Wzbr9dc9Y+exen2IahgaksG+Y8cOlJaWIjk5uddzZ86cQXt7O6ZOnQoAWLRoEaqrqxWvJFGkUUOolpWVQd8jsPR6PcrKykJWh1D5f088gQ06HZ6RCOhLR4/GfU4n7hcCV0sEfySTHGP390PQ0NCApKQkz/2kpCTU19d7LWu322G327s9ZrVa5daTKKKUlZV1G2MHQh+q7nH04uJi1NbWIjU1FWVlZZoZX2/++mv8YeRIWWXv+OCDoJxUpFb9mjx1uVzQ6XSe+0KIbve7qqysRHl5eX/ejihiqCVUjUajZoLcbffixTj5l79Ilrtq8WJkv/56CGqkPv0K9pSUFNhsNs/9s2fPeh2yAYC8vDwsXLiw22NWq1VzP3REbloM1XCxvv8+qmSuIV/x9deIl3nBkVb1K9hHjhyJgQMH4oMPPsCPfvQj7N69G+np6V7LJiQkICEhoT9vR0RRRLhc+MPIkWiRMWT7v554Atf+6lchqFVk6FOw5+fno6ioCJMmTcKGDRtQUlKC5uZmTJgwAbm5uUrXkY
iiyH+/9hr23nabrLK/aG1FrMzNuqKJ7GA/ePCg598vvvii599jx47F61E6jkVEyuhoacGm+HhZZbN37sRVt9wS3ApFOF55SkRh889HH8WRxx6TLHfZv/0bfvbllz4XZ1B3DHYiCqkLFgteGD1aVtncDz9Esowj6Kg7BjsRhcTOrCx8uXevZLlxy5YhI4o3KlMCg52Igqbu6FGYrr1WVtmVVit+MHx4kGsUHRjsRKQo4XJhc3Iy2hobJcumr1+PHz/wQAhqFV0Y7ESkiBN/+hP+77Jlssr+sq0NlwwaFOQaRS8etEEhE+6DJ0h57efOeXZTlAr1hXv3enZTZKgHF3vsFBJqOHiClLP9Jz/B14cPS5YbOmYM7jxxgssUQ4zBTiHB03win+2jj1A5ebKssnnHjyNp0qQg14h8YbBTSKjh4Anqmw0ye9uJ48fjzk8+CXJtSA4GO4VEamoqampqvD5O6vP+xo04dO+9ssrmf/UVLktLC26FKCCcPKWQiKbTfHqKlEljp8PhmQiVCvUxt93mmQhlqKsPe+wUEmo5eCLUImHS+PV582CWeaTlL1paNHc+qBbphBAiXG9usVgwa9YsHDhwAKNGjQpXNYiCJi0tzesQlMFg8HlAfCg019XhDz/8oayy037+c8zatCnINaJASGUne+xEQaS2SWO5E6EAcF+Poy8pcnCMnSiIfE0Oh3LS+Ivduz1j51KyXn/dM3bOUI9c7LETBVFZWVm3MXYgdJPGgfTO7w/fiCwFAYOdKIhCPWksd2tcAFj+2WcYOmZMUOpB4cWhGAq6SFnup6SubS4uLkZZWRlcLhfMZrPioe5oa/MMtcgJdfdQC0Ndu2QF+969ezF//nzMnj3b6y/lJ598gsWLFyMrKwt333037Ha74hWl4AtGALuX+9XU1EAI4Vnup+VwD1Wb3WH+jIzlh4XnznkCnaKAkGC1WsXMmTPFuXPnREtLi1iwYIE4efJktzJLly4Vhw4dEkII8eSTT4qnn35a6mWFEEKcPn1aXH311eL06dOyylPwVFVVCb1eLwB4bnq9XlRVVfXrdQ0GQ7fXdN8MBoMyFVehYLbZ9vHH4reArNvWKVP6/X6kTlLZKTnGfvjwYUyfPh1DhgwBAMyZMwfV1dUoLCz0lHG5XGhpaQEAtLW14bLLLlPsg4dCI1ibdKltuV8oBKPNXKZIgZAcimloaEBSUpLnfnJyMurr67uVWbNmDUpKSnDDDTfg8OHDWLJkSa/XsdvtsFgs3W5Wq1WBJpASghXAaljuF2w9h7CGDh3qtVygbT72wguylymmr1vHZYrkIdljd/X49Bc9fnDa29tRXFyMrVu3YvLkydiyZQseeughVFRUdHudyspKlJeXK1h1UlKwNukK53K/UPC2ZUBsbCzi4uLQ0dHhKRdIm7lMkfpLsseekpICm83muW+z2ZCcnOy5//nnn2PgwIGY/N0+zbfffjuOHj3a63Xy8vJw4MCBbjctT6BFmmBt0mU0GlFRUQGDwQCdTgeDwYCKigrV7JPSX96GsBwOBy699NKA2vzazTfL7p3nvPceJ0LJL8ke+/XXX49nn30WTU1NGDx4MPbv34/HH3/c87zBYIDVasWpU6dwxRVX4MCBA5jkZYP9hIQEJCQkKFt7Ukww11sbjUbNBHlPvoaqmpqacPbsWb9f29HSgk3x8bLfi0FOckkG+/Dhw7F69Wrk5ubC4XDg1ltvxeTJk5Gfn4+ioiJMmjQJTz75JH75y19CCIHExEQ88cQToag7KUzLARwsfRnCCmSo5efnz2MgO0QUIFlXni5YsAALFizo9tiLL77o+feMGTMwY8YMZWtG5IfJZFLFFsBy5xAajh3DtqlTZb3mD6+7DstknCdK5Au3FKCIo6Y9zqWGsDgRSuHALQUo4vhbcx8ORqMRZrPZs2VA+2OPyZ4Infn733MilBTHHjtFHLVd9OQeFir0MtbuC4OcgonBThFHTQdju3vlhRLlAGDZkSP44f
Tpwa0QERjsFIHCfdGTvbYWFQaD7PLsnVOocYw9SkXyVrrhuujJPW4uJ9R/DXDsnMKGwR6FtLCVbs8Jy2CF+rtPPil7IhQAHvjuNtRgiOgPT4psDPYoFOxVJVoINHeYv7N2rWTZUr3eE+jAxWGh+fPnR/yHJ0UuBnsEUDoog7mqJJL/GnCHuZzeeVxCgmeoxduw0L59+1S1JJOiC4Nd5YIRlMHcSldta8ylCCECGmq5XwiMqKrC05df7vmgBdBrWEhtSzIpujDYVS4YQRmsnRwB9a0x98Ud5r8bIP0rcPMLL3h653I/aKNhH3pSLwa7ygUjKIO5qkTNgdZ44kTAvfP7hcCU77YrAOR/0Abzw5NICoNd5YIVlMFYVWIymdDc3NzrcTmBFswJV3eYbxk/XrLsSqvV7zJFuR+0Wt+HntSNFyipXLgvxpGr58ZcbomJiXjmmWf8BlowNvV6a8UKHHvhBdnl5a43D+SqV26DTOHCHrvKRUrPz9sQBQDEx8f7rKu7l56Tk6PYPIK7dy4n1B8AUG4wBHQREYdYKBIw2CNAqC7G6Y9A5wK6TkLKeU1/QzWBLFME0G3NeaBzFZHyQUvRjUMxpIhAN+by1cP39rXehmruzs9HXU6O7PqVGwyKbRzGIRZSO/bYSRGBDlFI9ZS7fm3XD4Hffnd7tK1Nsk43bd7smQjlEApFE/bYqV+6HlE3dOhQDB48GE1NTZLH1fnq4QMXD0jv9rU1NfhtAHXyNmYezMO6idRGVrDv3bsXzz//PDo7O5GXl9frl+HUqVMoLS3F+fPnkZSUhKeffhqXXXZZUCpM6tFziKSxsRF6vR4vv/yyZGD6Wu3Tdbw6kL3OSwG04uIqnBEmk9f35xAKRQvJoZj6+nps3LgR27dvx65du/Dqq6/iiy++8DwvhMDKlSuRn5+PPXv2YNy4caioqAhqpSlwwVgn3p+rYn1NQg7ctq1PE6HuWjQ2NkbM3jREwSLZYz98+DCmT5+OIUOGAADmzJmD6upqFBZe7Ed98skn0Ov1SE9PBwCsWLECdru91+vY7fZej1ut1v7Wn2QI1uHP/b0qtmsPeoNOJ3sy1NdEqFtrayvy8vI870EUbSR77A0NDUhKSvLcT05ORn19ved+bW0thg0bhrVr12LhwoUoLS3tNUkFAJWVlZg1a1a3G3/pQiNYG3P5WlEydOhQWX8dBLpM0T0RajabYZA47MLpdIat566FbYspskkGu8vlgq7LL54Qotv9zs5OHD16FEuXLsXOnTsxevRoPPXUU71eJy8vDwcOHOh24w98aCi934w7uGpqarr9LABAXFwc7Ha7z02yXJ2dfQrznhOi3la59BSOXSUjedti0g7JYE9JSYHNZvPct9lsSE5O9txPSkqCwWDApEmTAACZmZk4fvx4r9dJSEjAqFGjut1SUlKUaANJUHK/mZ4XFnX9oDcYDLj00kvhcDi6fU1rayvqcnKwQafD07Gxku8x67nnJI+Vc4/RJyYm+n2tUO8qGWnbFpM2SQb79ddfjyNHjqCpqQltbW3Yv3+/ZzwdAKZNm4ampiZ89tlnAICDBw9iwoQJwasxyeavZ93XNdzegksIAYPBALPZjKamJs/jV+H7dedyuMN82qpVssobjUacPXsWVVVViImJ8Vom1LtKRsq2xaRtksE+fPhwrF69Grm5ubjllluQmZmJyZMnIz8/Hx999BEGDRqE5557DiUlJcjIyMC7776LNWvWhKLuUSXQcVupnnVfL4OXCq7U1FRPmBd4Ldld4blz/T702Wg0orKyUtELkPo6Tq7mbYspiogwOn36tLj66qvF6dOnw1kN1auqqhJ6vV4A8Nz0er2oqqry+TUGg6FbeffNYDB4fX2DwSB0Op0wGAx9et0H4uLEbwHZt2AIpB1SrxPo/28lvpZILqnsZLCrXFVVlRgwYIDskHbT6XRev0an0/V6/UCCqGf5QMJcp9MpFr7BFMiHojdqbKMa60R9J5WdOiH68T
dwP1ksFsyaNQsHDhzAqFGjwlUN1TKZTLjzzjt7TUa66XQ6uFwur8+5x9Z7co+FB1quK7krWtzcOykmJiaira3N79WmajBgwAB4+7Xw9/9bzbztla/G/+8kn1R2chOwMJIaxy0uLvYZ6oD/cVtfywGbm5u7vY/cyb6qysqAlimW6vXdtseNjY3FuXPnImLFiNbGyblSJ/ow2MNEznpnqZUU/iYGfS0H7HnJvVSIucPc+l//JdmmP+FikD+o03XbLiAxMdFvb1dtK0a0thMkV+pEHwZ7mMjpRfnrISYmJkr+GW00GhEfH9/r8a7v4y3EJg0ciMKamoD3a/n/Xerd9XCQ+Ph4dHR0+Px6tfWEtXaYhtb+AiFp3LY3TOT0osrKyryOscfFxeGZZ55R5H26bmdb6B5r//ZbydddC8CBi+PO6DIe7a1n669nqNaesJZ2goyUc3NJOeyxh4mcXpTRaMSWLVu6DackJibipZdekh06Uu+zbdo01OXkfB/qEty9c/dHjfju4iR/PVtfdYiJiYnonnCk0NpfICSNq2LCJFQrFXy9z2MSx9J19VRiIhobG70+52/1jL86xMbGIiEhQdahHETUHVfFqFSoelFd38d9RaicUB9y5ZUYUVWFUr3eZ6jL/XO+Z1vdk6mNjY3cKIsoCNhj17jO9nb8fvBg2eW7Xtrva4074OX4ugD0Ze08EX2PPXYNCWT/EvcyRTmh/hKAUytX9tqvxdekp06ng9ls7vNfF1x+RxRcDHYVkBPYcta9m996q0/Hyp0AvB5n2J9lcv7apNTyOx5oQeRDKPY18IV7xcjfq8XX/iWJiYkB7dcS4+U13Le+1i3Qr+vvRllVVVUiMTGxV/252RZFC24CpnJyN5zquanXnQFuwOUWExPj9f1iYmK81q8vm0fJaVNfN6Xy9qHg7/8bkRZxEzCVk7vhlHvCUe6hFQC87nG+atUqPP/8870eX7lyJTZv3hzAq/sWzE20/E3oKvUeRGrHydMwkTv+K2e8eYNOh0KZoX4cF/dq8XVwxebNm7Fy5UrPiUMxMTGKhjoQ3EvYpSZYeZk8EYNdcSaTCcOGDUNOTo6sA419bTj1m9LSPk2EvgzpcNu8eTM6OzshhEBnZ6eioQ4EdxMtf23jZfJEFzHYFeReueLtgh5f26T2vHjHfQGRdflyyfdLfOwxz/a4bl3DLVyrRvxdfNXfOvnajjgxMZGXyRO5yRmo37Nnj5g3b564+eab/U5y/e1vfxMzZ85UbAIg0viaNHTfep5e5Fb/4YcBTYSiyyShr0lINR7RplSdeBoQRbt+r4qxWq1i5syZ4ty5c6KlpUUsWLBAnDx5slc5m80m5s6dG9XB7us4OsD7io2AjpWT+SHh1t/j3QIlJ2xDXSdSBj9I1UcqOyWHYg4fPozp06djyJAh0Ov1mDNnDqqrq3uVKykpQWFhYR/+ZtAOOeO/n7z8suyxc90PfoA1MTF4ABcTUO57AaG9ulPOxVOhrhMpQ+73ltRFMtgbGhqQlJTkuZ+cnIz6+vpuZbZt24bx48djypQpPl/HbrfDYrF0u1mt1n5UXX18jf/Gx8fjsdZW1OXk4K+5uZKvc78QGFFVhUeEgNPp7PW8nEnCUB6uIPfoNR74EHl4rF5kkjxow+VyXTxM4TtCiG73P//8c+zfvx9bt271G9SVlZUoLy/vZ3XVreuhFbW1tbhr8GCMaW0FmpslvzbzlVcw9vbbPfe9/UIB8vcwD+XhCnJ74jzwIfLwr6zIJBnsKSkpeP/99z33bTYbkpOTPferq6ths9mwePFiOBwONDQ0YNmyZdi+fXu318nLy8PChQu7PWa1WjW3imHJbbehLifn4h0Z2+P6Wm/u6xfH5XLJ+n/W80MmmHuep6amer1oqGdPPJR1ImXI/d6SykgN0rsnTxsbG0Vra6vIysoSx44d8zmgHw2Tp94mk/6Qmip7IvQHKpz87A81rsAhZfB7q079njwdPnw4Vq9ejdzcXN
xyyy3IzMzE5MmTkZ+fj48++ihoHzhq1XUy6TIhUFhTg7qcHFyQ+NO0Ed9fRNQC7z2ermu8m5ubERcX1+15tQ5b8Og17eL3NjJxr5gApaWlyT4fFABSXn4Zd999t+QReDw+jojkkspOyTF2usjyzjt4JT0dchZ0znruOUxbtcpzX6fT+RxXNplMKC4u9jqO6XA4EB8fj7NnzyrVDCKKAgx2CXL3agF8T4QajUavPW1vvfSe/O1kSETkDfeK8eK9DRtkX0T0ewClej1GVFUF/D6+ljR2pdPpeDEIEQWEPfbvOB0ObOwxWelLTEoKnhk40DO0UtHHsW85a4GFECguLubYOhHJFvXB/rf77sMHTz8tq2zRhQuIi48HAKxW4L19rRHuiReDEFEgojLYW202bO5ykZU/19x3H366YUNQ6uHtSkxveDEIEQUiqoK9cupU2I4dk1X2vh5bKQRDzysx9Xo9WlpaepWbP39+UOtBRNqi+cnTb06d8kyESoX6Lbt24X4hcH+P/XCCyWg0wmw2w+VyYdiwYV7L7Nu3T7H3C9fhG0QUOpoN9ldmzMAGnQ5/vPJKv+XiR470hPm/Z2eHqHbe+dtwSYlA5hasRNFBU0Mxp//+d7w6Y4assndbLLh05Mgg1ygwviZThw4d2m0s3h3IAAJaLeNvC1auuiHSjojvsbs6O/F0bCw26HSSoX7T5s2e3nnXUFfL8ISvQ6ABKLInNrdgJYoOERvs581mbNDp8HRsLFydnX7Lru7owIiqKtyybl2v8FbT8ISvDZeampq8lg80kHnQBVF0iMhgd7S14U833OC3zP9++21P7/yVHTt8hrfaTojpOplqNpthNBoVC2RffxGoccdIIuq7yAz25mY0nznT6/ER117rCXPDrFmex/2FdyQMTygVyNyClSg6ROTkqT4pCfNffhn/Ki9HXEICMkwm6Lucy9qTv/COhBNilDx5yNeGZESkHREZ7AAwPicH491H0EnwF96Rcg4nA5mI5IrIoZhA+RvK4PAEEWlNxPbYAyE1lMHeMBFpSVQEO8DwJqLoIWsoZu/evZg/fz5mz57tdX3322+/jezsbGRlZWHVqlU4f/684hUlIiJ5JIO9vr4eGzduxPbt27Fr1y68+uqr+OKLLzzPNzc349FHH0VFRQX27NmDMWPG4Nlnnw1qpd3UcsUoEZGaSAb74cOHMX36dAwZMgR6vR5z5sxBdXW153mHw4HS0lIMHz4cADBmzBjU1dUFr8bfUdMVo0REaiIZ7A0NDUjqskY8OTkZ9fX1nvuXX345br75ZgBAe3s7KioqcNNNN/V6HbvdDovF0u1mtVr7XHG1XTFKRKQWkpOnrh4HTggfe5VfuHAB99xzD8aOHYuFCxf2er6yshLl5eX9rO73IuGKUSKicJAM9pSUFLz//vue+zabDck9jpVraGjAXXfdhenTp2Pt2rVeXycvL69X4Fut1j6vVImEK0aJiMJBcijm+uuvx5EjR9DU1IS2tjbs378f6enpnuedTidWrFiBefPmobi42OfJQwkJCRg1alS3W0pKSp8rzg2tiIi8k+yxDx8+HKtXr0Zubi4cDgduvfVWTJ48Gfn5+SgqKoLVasWnn34Kp9OJN998EwAwceLEoAeskvunEBFpiU4IIcL15haLBbNmzcKBAwcwatSocFVDUe6tgPlhQ0TBIpWdUXPlaSi4l2D29wg7IqL+iIpNwEKFSzCJSA0Y7AriEkwiUgMGu4J4pigRqQGDXUFcgklEahCRwa7Wzb94aAcRqUHEBbvaN/8yGo0wm81wuVwwm80MddI8tXa0olnEBTtXnhCph9o7WtEq4oKdK0+I1IMdLXWKuGDnyhMi9WBHS50iLti58oRIPdjRUqeIC3auPCFSD3a01Cki94oxGo0MciIV4C6r6hSRwU5E6sGOlvpE3FAMERH5x2AnItIYBjsRkcYw2ImINCask6dOpxMAYLVaw1kNIqKI4s5Md4b2FNZgt9lsAH
hsHBFRX9hsNhgMhl6Ph/Uw6/b2dnz88cdISkpCTEyMz3JWqxVGoxEmkwkpKSkhrGF4RFN7o6mtANurdaFqr9PphM1mw8SJEzFo0KBez4e1xz5o0CBcc801ssunpKR4PZFbq6KpvdHUVoDt1bpQtNdbT92Nk6dERBrDYCci0hgGOxGRxkREsCckJKCwsBAJCQnhrkpIRFN7o6mtANurdWppb1hXxRARkfIiosdORETyMdiJiDQm7MHe3NyMzMxMWCwWAMDhw4exYMECzJ49Gxs3bvSUO3HiBBYtWoQ5c+aguLgYnZ2dAICvv/4aRqMRc+fOxcqVK9HS0hKWdshRXl6OjIwMZGRkYP369QC03d5nnnkG8+fPR0ZGBrZs2QJA2+0FgHXr1mHNmjUAtN3WO+64AxkZGcjOzkZ2djaOHTum6fYePHgQixYtwrx58/Cb3/wGgMq/vyKMPvzwQ5GZmSkmTJggTp8+Ldra2sSMGTNEbW2tcDgcYvny5eLQoUNCCCEyMjLEv/71LyGEEL/61a+EyWQSQghRUFAg3njjDSGEEOXl5WL9+vVhaYuUf/7zn+L2228X3377rejo6BC5ubli7969mm3vu+++K5YsWSIcDodoa2sTM2fOFCdOnNBse4UQ4vDhw+Laa68VDz30kKZ/ll0ul7jhhhuEw+HwPKbl9tbW1oobbrhB1NXViY6ODrF06VJx6NAhVbc3rD32HTt2oLS0FMnJyQCA48ePw2AwYPTo0bjkkkuwYMECVFdX48yZM2hvb8fUqVMBAIsWLUJ1dTUcDgfee+89zJkzp9vjapSUlIQ1a9YgLi4OsbGxuPLKK2E2mzXb3h//+MfYtm0bLrnkEjQ2NsLpdMJut2u2vd988w02btyIFStWAND2z/KpU6cAAMuXL0dWVhaqqqo03d633noL8+fPR0pKCmJjY7Fx40YMHjxY1e0N65YCPQ+8bWhoQFJSkud+cnIy6uvrez2elJSE+vp6nDt3DvHx8bjkkku6Pa5GV111leffZrMZf/3rX5GTk6PZ9gJAbGwsNm3ahJdeeglz587V9Pf3kUcewerVq1FXVwdA2z/Ldrsd1113HR5++GE4HA7k5ubiZz/7mWbbW1NTg9jYWKxYsQJ1dXX46U9/iquuukrV7Q37GHtXLpcLOp3Oc18IAZ1O5/Nx93+76nlfbU6ePInly5fjwQcfxOjRozXf3qKiIhw5cgR1dXUwm82abO9rr72GESNG4LrrrvM8puWf5WnTpmH9+vW49NJLMXToUNx6663YtGmTZtvrdDpx5MgRPPHEE3j11Vdx/PhxnD59WtXtVdVh1ikpKZ6tfIGLW1ImJyf3evzs2bNITk7G0KFDceHCBTidTsTExHjKq9UHH3yAoqIirF27FhkZGTh69Khm2/vll1+io6MD48aNw+DBgzF79mxUV1d328VTK+3dt28fbDYbsrOzcf78ebS2tuLMmTOabCsAvP/++3A4HJ4PMiEERo4cqdmf5WHDhuG6667D0KFDAQA33XST6n+WVdVjnzJlCr766ivU1NTA6XTijTfeQHp6OkaOHImBAwfigw8+AADs3r0b6enpiI2NxTXXXIN9+/YBAHbt2oX09PRwNsGnuro63HPPPdiwYQMyMjIAaLu9FosFJSUl6OjoQEdHBw4cOIAlS5Zosr1btmzBG2+8gd27d6OoqAg33ngj/vjHP2qyrQBw4cIFrF+/Ht9++y2am5uxc+dO3HvvvZpt78yZM/GPf/wDdrsdTqcT77zzDubOnavu9gZtWjYAM2fOFKdPnxZCXFxZsGDBAjF79mxRVlYmXC6XEEKIEydOiMWLF4s5c+aIe++9V3z77bdCCCEsFovIyckR8+bNE8uXLxfffPNN2Nrhz+OPPy6mTp0qsrKyPLft27drtr1CCLFp0yYxb948kZmZKTZt2iSE0O731+3Pf/6zeOihh4QQ2m7rxo0bxdy5c8Xs2bPF1q1bhRDabu9rr70mMjIyxO
zZs8Vjjz0mnE6nqtvLLQWIiDRGVUMxRETUfwx2IiKNYbATEWkMg52ISGMY7EREGsNgJyLSGAY7EZHGMNiJiDTmfwC/TX72W2d/AwAAAABJRU5ErkJggg==",
"text/plain": [
""
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"# Plot regression line \n",
"plt.scatter(train_dataset['area'], train_dataset['price'], color='black')\n",
"plt.plot(train_dataset['area'], train_dataset['y_pred'], color='darkred', linewidth=3);"
]
},
{
"cell_type": "code",
"execution_count": 20,
"metadata": {
"slideshow": {
"slide_type": "slide"
}
},
"outputs": [
{
"data": {
"image/png": "",
"text/plain": [
""
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"# Plot with Seaborn; lmplot fits and draws its own regression line\n",
"sns.lmplot(x='area', y='price', data=train_dataset, line_kws={'color': 'darkred'}, ci=None);"
]
},
{
"cell_type": "code",
"execution_count": 21,
"metadata": {
"slideshow": {
"slide_type": "slide"
}
},
"outputs": [
{
"data": {
"image/png": "",
"text/plain": [
""
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"# Residuals against fitted values for the training data\n",
"sns.residplot(x=\"y_pred\", y=\"price\", data=train_dataset, scatter_kws={\"s\": 80});"
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "slide"
}
},
"source": [
"### Validation with test data"
]
},
{
"cell_type": "code",
"execution_count": 22,
"metadata": {
"slideshow": {
"slide_type": "slide"
}
},
"outputs": [],
"source": [
"# Add the regression predictions for the test set to the DataFrame as column \"y_pred\"\n",
"test_dataset['y_pred'] = lm.predict(test_dataset['area'])"
]
},
{
"cell_type": "code",
"execution_count": 23,
"metadata": {
"slideshow": {
"slide_type": "slide"
}
},
"outputs": [
{
"data": {
"text/html": [
"<div>\n",
"<table border=\"1\" class=\"dataframe\">\n",
"  <thead>\n",
"    <tr style=\"text-align: right;\">\n",
"      <th></th> <th>price</th> <th>bed</th> <th>bath</th> <th>area</th> <th>year_built</th> <th>cooling</th> <th>lot</th> <th>y_pred</th>\n",
"    </tr>\n",
"  </thead>\n",
"  <tbody>\n",
"    <tr>\n",
"      <th>9</th> <td>650000</td> <td>3</td> <td>2.0</td> <td>2173</td> <td>1964</td> <td>other</td> <td>0.73</td> <td>450348.456333</td>\n",
"    </tr>\n",
"    <tr>\n",
"      <th>12</th> <td>671500</td> <td>3</td> <td>3.0</td> <td>2200</td> <td>1964</td> <td>central</td> <td>0.51</td> <td>454876.375736</td>\n",
"    </tr>\n",
"    <tr>\n",
"      <th>21</th> <td>645000</td> <td>4</td> <td>4.0</td> <td>2300</td> <td>1969</td> <td>central</td> <td>0.47</td> <td>471646.447601</td>\n",
"    </tr>\n",
"    <tr>\n",
"      <th>25</th> <td>603000</td> <td>4</td> <td>4.0</td> <td>3487</td> <td>1965</td> <td>central</td> <td>0.61</td> <td>670707.200633</td>\n",
"    </tr>\n",
"    <tr>\n",
"      <th>36</th> <td>615000</td> <td>3</td> <td>3.0</td> <td>2203</td> <td>1954</td> <td>other</td> <td>0.63</td> <td>455379.477892</td>\n",
"    </tr>\n",
"  </tbody>\n",
"</table>\n",
"</div>"
],
"text/plain": [
" price bed bath area year_built cooling lot y_pred\n",
"9 650000 3 2.0 2173 1964 other 0.73 450348.456333\n",
"12 671500 3 3.0 2200 1964 central 0.51 454876.375736\n",
"21 645000 4 4.0 2300 1969 central 0.47 471646.447601\n",
"25 603000 4 4.0 3487 1965 central 0.61 670707.200633\n",
"36 615000 3 3.0 2203 1954 other 0.63 455379.477892"
]
},
"execution_count": 23,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"test_dataset.head()"
]
},
{
"cell_type": "code",
"execution_count": 24,
"metadata": {
"slideshow": {
"slide_type": "slide"
}
},
"outputs": [
{
"data": {
"image/png": "",
"text/plain": [
""
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"# Plot regression line \n",
"plt.scatter(test_dataset['area'], test_dataset['price'], color='black')\n",
"plt.plot(test_dataset['area'], test_dataset['y_pred'], color='darkred', linewidth=3);"
]
},
{
"cell_type": "code",
"execution_count": 25,
"metadata": {
"slideshow": {
"slide_type": "slide"
}
},
"outputs": [
{
"data": {
"image/png": "",
"text/plain": [
""
]
},
"metadata": {},
"output_type": "display_data"
}
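],
"source": [
"# Out-of-sample residuals (actual minus predicted); residplot would refit price on y_pred within the test set\n",
"sns.scatterplot(x=test_dataset['y_pred'], y=test_dataset['price'] - test_dataset['y_pred'], s=80);"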
]
},
{
"cell_type": "code",
"execution_count": 26,
"metadata": {
"slideshow": {
"slide_type": "slide"
}
},
"outputs": [
{
"data": {
"text/plain": [
"119345.88525637302"
]
},
"execution_count": 26,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"# RMSE\n",
"rmse(test_dataset['price'], test_dataset['y_pred'])"
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "slide"
}
},
"source": [
"## Multiple regression"
]
},
{
"cell_type": "code",
"execution_count": 27,
"metadata": {
"slideshow": {
"slide_type": "slide"
}
},
"outputs": [],
"source": [
"# Multiple regression; the categorical column cooling is dummy-coded by the formula interface\n",
"lm_m = smf.ols(formula='price ~ area + bed + bath + year_built + cooling + lot', data=train_dataset).fit()"
]
},
{
"cell_type": "code",
"execution_count": 28,
"metadata": {
"scrolled": true,
"slideshow": {
"slide_type": "slide"
}
},
"outputs": [
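{
"data": {
"text/html": [
"<table class=\"simpletable\">\n",
"<caption>OLS Regression Results</caption>\n",
"<tr>\n",
"  <th>Dep. Variable:</th> <td>price</td> <th>R-squared:</th> <td>0.626</td>\n",
"</tr>\n",
"<tr>\n",
"  <th>Model:</th> <td>OLS</td> <th>Adj. R-squared:</th> <td>0.595</td>\n",
"</tr>\n",
"<tr>\n",
"  <th>Method:</th> <td>Least Squares</td> <th>F-statistic:</th> <td>19.83</td>\n",
"</tr>\n",
"<tr>\n",
"  <th>Date:</th> <td>Wed, 05 Jan 2022</td> <th>Prob (F-statistic):</th> <td>1.86e-13</td>\n",
"</tr>\n",
"<tr>\n",
"  <th>Time:</th> <td>22:37:42</td> <th>Log-Likelihood:</th> <td>-1039.1</td>\n",
"</tr>\n",
"<tr>\n",
"  <th>No. Observations:</th> <td>78</td> <th>AIC:</th> <td>2092.</td>\n",
"</tr>\n",
"<tr>\n",
"  <th>Df Residuals:</th> <td>71</td> <th>BIC:</th> <td>2109.</td>\n",
"</tr>\n",
"<tr>\n",
"  <th>Df Model:</th> <td>6</td> <th></th> <td></td>\n",
"</tr>\n",
"<tr>\n",
"  <th>Covariance Type:</th> <td>nonrobust</td> <th></th> <td></td>\n",
"</tr>\n",
"</table>\n",
"<table class=\"simpletable\">\n",
"<tr>\n",
"  <td></td> <th>coef</th> <th>std err</th> <th>t</th> <th>P&gt;|t|</th> <th>[0.025</th> <th>0.975]</th>\n",
"</tr>\n",
"<tr>\n",
"  <th>Intercept</th> <td>-2.944e+06</td> <td>2.26e+06</td> <td>-1.302</td> <td>0.197</td> <td>-7.45e+06</td> <td>1.56e+06</td>\n",
"</tr>\n",
"<tr>\n",
"  <th>cooling[T.other]</th> <td>-1.021e+05</td> <td>3.67e+04</td> <td>-2.778</td> <td>0.007</td> <td>-1.75e+05</td> <td>-2.88e+04</td>\n",
"</tr>\n",
"<tr>\n",
"  <th>area</th> <td>111.8295</td> <td>25.915</td> <td>4.315</td> <td>0.000</td> <td>60.156</td> <td>163.503</td>\n",
"</tr>\n",
"<tr>\n",
"  <th>bed</th> <td>5121.5208</td> <td>3.1e+04</td> <td>0.165</td> <td>0.869</td> <td>-5.68e+04</td> <td>6.7e+04</td>\n",
"</tr>\n",
"<tr>\n",
"  <th>bath</th> <td>2.678e+04</td> <td>2.94e+04</td> <td>0.910</td> <td>0.366</td> <td>-3.19e+04</td> <td>8.55e+04</td>\n",
"</tr>\n",
"<tr>\n",
"  <th>year_built</th> <td>1491.1176</td> <td>1157.430</td> <td>1.288</td> <td>0.202</td> <td>-816.732</td> <td>3798.968</td>\n",
"</tr>\n",
"<tr>\n",
"  <th>lot</th> <td>3.491e+05</td> <td>8.53e+04</td> <td>4.094</td> <td>0.000</td> <td>1.79e+05</td> <td>5.19e+05</td>\n",
"</tr>\n",
"</table>\n",
"<table class=\"simpletable\">\n",
"<tr>\n",
"  <th>Omnibus:</th> <td>27.108</td> <th>Durbin-Watson:</th> <td>1.919</td>\n",
"</tr>\n",
"<tr>\n",
"  <th>Prob(Omnibus):</th> <td>0.000</td> <th>Jarque-Bera (JB):</th> <td>112.999</td>\n",
"</tr>\n",
"<tr>\n",
"  <th>Skew:</th> <td>-0.874</td> <th>Prob(JB):</th> <td>2.90e-25</td>\n",
"</tr>\n",
"<tr>\n",
"  <th>Kurtosis:</th> <td>8.632</td> <th>Cond. No.</th> <td>4.57e+05</td>\n",
"</tr>\n",
"</table><br/><br/>Notes:<br/>[1] Standard Errors assume that the covariance matrix of the errors is correctly specified.<br/>[2] The condition number is large, 4.57e+05. This might indicate that there are<br/>strong multicollinearity or other numerical problems."
],
"text/plain": [
"<class 'statsmodels.iolib.summary.Summary'>\n",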
"\"\"\"\n",
" OLS Regression Results \n",
"==============================================================================\n",
"Dep. Variable: price R-squared: 0.626\n",
"Model: OLS Adj. R-squared: 0.595\n",
"Method: Least Squares F-statistic: 19.83\n",
"Date: Wed, 05 Jan 2022 Prob (F-statistic): 1.86e-13\n",
"Time: 22:37:42 Log-Likelihood: -1039.1\n",
"No. Observations: 78 AIC: 2092.\n",
"Df Residuals: 71 BIC: 2109.\n",
"Df Model: 6 \n",
"Covariance Type: nonrobust \n",
"====================================================================================\n",
" coef std err t P>|t| [0.025 0.975]\n",
"------------------------------------------------------------------------------------\n",
"Intercept -2.944e+06 2.26e+06 -1.302 0.197 -7.45e+06 1.56e+06\n",
"cooling[T.other] -1.021e+05 3.67e+04 -2.778 0.007 -1.75e+05 -2.88e+04\n",
"area 111.8295 25.915 4.315 0.000 60.156 163.503\n",
"bed 5121.5208 3.1e+04 0.165 0.869 -5.68e+04 6.7e+04\n",
"bath 2.678e+04 2.94e+04 0.910 0.366 -3.19e+04 8.55e+04\n",
"year_built 1491.1176 1157.430 1.288 0.202 -816.732 3798.968\n",
"lot 3.491e+05 8.53e+04 4.094 0.000 1.79e+05 5.19e+05\n",
"==============================================================================\n",
"Omnibus: 27.108 Durbin-Watson: 1.919\n",
"Prob(Omnibus): 0.000 Jarque-Bera (JB): 112.999\n",
"Skew: -0.874 Prob(JB): 2.90e-25\n",
"Kurtosis: 8.632 Cond. No. 4.57e+05\n",
"==============================================================================\n",
"\n",
"Notes:\n",
"[1] Standard Errors assume that the covariance matrix of the errors is correctly specified.\n",
"[2] The condition number is large, 4.57e+05. This might indicate that there are\n",
"strong multicollinearity or other numerical problems.\n",
"\"\"\""
]
},
"execution_count": 28,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"lm_m.summary()"
]
}
],
"metadata": {
"celltoolbar": "Slideshow",
"interpreter": {
"hash": "463226f144cc21b006ce6927bfc93dd00694e52c8bc6857abb6e555b983749e9"
},
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.7.11"
},
"varInspector": {
"cols": {
"lenName": 16,
"lenType": 16,
"lenVar": 40
},
"kernels_config": {
"python": {
"delete_cmd_postfix": "",
"delete_cmd_prefix": "del ",
"library": "var_list.py",
"varRefreshCmd": "print(var_dic_list())"
},
"r": {
"delete_cmd_postfix": ") ",
"delete_cmd_prefix": "rm(",
"library": "var_list.r",
"varRefreshCmd": "cat(var_dic_list()) "
}
},
"types_to_exclude": [
"module",
"function",
"builtin_function_or_method",
"instance",
"_Feature"
],
"window_display": false
}
},
"nbformat": 4,
"nbformat_minor": 2
}