Course overview
No. | Topic | Content | Python |
---|---|---|---|
1 | Introduction | Data-driven decision making; programming toolkit; programming process | Anaconda; Jupyter Notebook; Visual Studio Code |
2 | Data basics & study design | Types of data; data transformations; sampling; experiments; observational studies | pandas |
3 | Exploratory data analysis | Visualizing categorical data; visualizing numerical data; measures of central tendency; measures of distribution | pandas; seaborn; Plotly |
4 | Statistical inference | Hypothesis testing; decision errors; p-value and statistical significance; confidence intervals; crosstables (Pearson's chi-squared test); Student's t-test; A/B testing | pandas; statsmodels |
5 | Introduction to modeling | Statistical learning vs. machine learning; supervised vs. unsupervised learning; regression; classification; quality of fit; bias-variance trade-off | statsmodels; scikit-learn |
6 | Resampling methods | Training, evaluation and test set; validation set approach; k-fold cross-validation; bootstrap | statsmodels; scikit-learn |
7 | Linear regression | Fundamentals; qualitative predictors; interaction terms; non-linear transformations | statsmodels; scikit-learn |
8 | Regression diagnostics | Linearity; normality of the residuals; influence tests; multicollinearity; heteroskedasticity tests | statsmodels; scikit-learn |
9 | Advanced methods I | Subset selection methods; shrinkage methods (lasso, ridge regression, elastic net); dimension reduction methods (principal components regression) | statsmodels; scikit-learn |
10 | Advanced methods II | Regression splines; smoothing splines; generalized additive models; stacking | statsmodels; scikit-learn |
11 | Introduction to classification | Confusion matrix; recall; precision; F1 score; ROC curve; unbalanced data | statsmodels; scikit-learn |
12 | Classification models | Logistic regression; generative models (discriminant analysis) | statsmodels; scikit-learn |
13 | Alternative models | Time series analysis | statsmodels; Prophet |
14 | Probability | Introduction to probability; expected frequency tree; Bayes' theorem | pandas; PyMC3; Bambi |