Marketing Campaign Response Prediction

Using Gradient Boosted Trees

Jan Kirenz

Introduction

  • Predict the response to a marketing campaign

Example: Payback

Marketing use case

  • The goal is to predict whether a customer will respond positively (e.g. buys a product) to a future campaign, based on their features
  • We use data from previous campaigns to train a model

Boosting: An Intuitive Introduction

Overview

  • Boosting is an ensemble learning method

  • Combines multiple weak learners to build a strong classifier

  • Learners are trained sequentially

  • Each learner focuses on correcting the mistakes of its predecessor

Intuition

  1. Begin with a weak learner that performs slightly better than random guessing

  2. Train a new weak learner to correct the mistakes of the previous one

  3. Repeat the process, focusing on different error patterns each time

  4. Combine all weak learners into a strong classifier (a code sketch follows below)
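
A minimal from-scratch sketch of this loop for squared-error regression (all names are illustrative and independent of the campaign example later in this deck): each new stump is fit to the residual errors left by the ensemble so far.
import numpy as np
from sklearn.tree import DecisionTreeRegressor

rng = np.random.default_rng(42)
X_demo = rng.uniform(0, 10, size=(200, 1))
y_demo = np.sin(X_demo).ravel() + rng.normal(0, 0.1, 200)

learning_rate = 0.1
stumps = []
residual = y_demo.copy()
for _ in range(50):
    stump = DecisionTreeRegressor(max_depth=1)  # a weak learner
    stump.fit(X_demo, residual)                 # focus on current mistakes
    residual -= learning_rate * stump.predict(X_demo)
    stumps.append(stump)

# the strong learner is the scaled sum of all weak learners
prediction = learning_rate * sum(s.predict(X_demo) for s in stumps)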

Difference to Bagging

  • Bagging:
    • Learners are trained independently
    • Training samples are drawn with replacement (bootstrapping)
    • Combines learners by averaging (regression) or voting (classification)
  • Boosting:
    • Learners are trained sequentially
    • Emphasis is placed on misclassified instances
    • Combines learners by weighted averaging (contrast sketched below)
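
A hedged scikit-learn sketch of this contrast (the estimator keyword follows scikit-learn 1.2+; nothing here is reused later): the same decision stump serves as the weak learner in both styles.
from sklearn.ensemble import AdaBoostClassifier, BaggingClassifier
from sklearn.tree import DecisionTreeClassifier

stump = DecisionTreeClassifier(max_depth=1)
# bagging: independent learners on bootstrap samples, combined by voting
bagging = BaggingClassifier(estimator=stump, n_estimators=100)
# boosting: sequential learners, misclassified samples get more weight
boosting = AdaBoostClassifier(estimator=stump, n_estimators=100)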

Advantages of Boosting

  • Can achieve high accuracy with simple weak learners

  • Less prone to overfitting than single models

  • Can be applied to various learning algorithms

Disadvantages of Boosting

  • Sensitive to noise and outliers

  • Computationally expensive due to sequential training

  • Can overfit if weak learners are too complex

Code example

Data overview

  • age: Customer’s age (integer)
  • city: Customer’s place of residence (string: ‘Berlin’, ‘Stuttgart’)
  • income: Customer’s annual income (integer)
  • membership_days: Number of days the customer has been a member (integer)
  • campaign_engagement: Number of times the customer engaged with previous campaigns (integer)
  • target: Whether the customer responded positively to the campaign (0 or 1)

Import data

import pandas as pd

df = pd.read_csv(
    'https://raw.githubusercontent.com/kirenz/datasets/master/campaign.csv')

Data overview

df
age city income membership_days campaign_engagement target
0 56 Berlin 136748 837 3 1
1 46 Stuttgart 25287 615 8 0
2 32 Berlin 146593 2100 3 0
3 60 Berlin 54387 2544 0 0
4 25 Berlin 28512 138 6 0
... ... ... ... ... ... ...
995 22 Berlin 49241 2123 4 0
996 40 Stuttgart 116214 970 5 1
997 27 Stuttgart 64569 2552 6 0
998 61 Stuttgart 31745 2349 8 1
999 19 Berlin 46029 2185 2 0

1000 rows × 6 columns

Data info

df.info()
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 1000 entries, 0 to 999
Data columns (total 6 columns):
 #   Column               Non-Null Count  Dtype 
---  ------               --------------  ----- 
 0   age                  1000 non-null   int64 
 1   city                 1000 non-null   object
 2   income               1000 non-null   int64 
 3   membership_days      1000 non-null   int64 
 4   campaign_engagement  1000 non-null   int64 
 5   target               1000 non-null   int64 
dtypes: int64(5), object(1)
memory usage: 47.0+ KB

Data corrections

  • Encode categorical variables
df = pd.get_dummies(df, columns=['city'])
df.info()
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 1000 entries, 0 to 999
Data columns (total 7 columns):
 #   Column               Non-Null Count  Dtype
---  ------               --------------  -----
 0   age                  1000 non-null   int64
 1   income               1000 non-null   int64
 2   membership_days      1000 non-null   int64
 3   campaign_engagement  1000 non-null   int64
 4   target               1000 non-null   int64
 5   city_Berlin          1000 non-null   uint8
 6   city_Stuttgart       1000 non-null   uint8
dtypes: int64(5), uint8(2)
memory usage: 41.1 KB
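  • Aside: with only two cities, one dummy column is redundant; a drop_first variant (a sketch on a fresh copy, not used below) keeps a single indicator
df_drop = pd.read_csv(
    'https://raw.githubusercontent.com/kirenz/datasets/master/campaign.csv')
df_drop = pd.get_dummies(df_drop, columns=['city'], drop_first=True)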

Data splitting

  • Split the df into features (X) and target (y)
X = df.drop('target', axis=1)
y = df['target']
  • Save feature names for later evaluation steps
feature_names = X.columns
  • Make train and test split
from sklearn.model_selection import train_test_split

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42)
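  • A hedged variant: the target is fairly balanced here, but stratify=y would keep the class ratio identical in both splits
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42, stratify=y)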

Select model

  • Define hyperparameters as dictionary
params = {
    "n_estimators": 50,
    "max_depth": 3,
    "min_samples_split": 5,
}
  • n_estimators: Number of gradient boosted trees
  • max_depth: Maximum tree depth
  • min_samples_split: The minimum number of samples required to split an internal node
from sklearn.ensemble import GradientBoostingClassifier

clf = GradientBoostingClassifier(**params)
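  • A hedged aside (not part of this walkthrough): values like these are often tuned with cross-validation instead of being fixed by hand
from sklearn.model_selection import GridSearchCV

param_grid = {
    "n_estimators": [50, 100, 200],
    "max_depth": [2, 3, 4],
    "min_samples_split": [2, 5, 10],
}
search = GridSearchCV(GradientBoostingClassifier(), param_grid, cv=5)
# search.fit(X_train, y_train); search.best_params_ would then replace params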

Train model

  • Train the model on the training data
clf.fit(X_train, y_train)
GradientBoostingClassifier(min_samples_split=5, n_estimators=50)

Evaluate model

  • Predict on the testing data
# Predict on the testing data
y_pred = clf.predict(X_test)
  • Calculate accuracy
from sklearn.metrics import accuracy_score

accuracy_score(y_test, y_pred)
0.92

Confusion matrix

  • Print confusion matrix
from sklearn.metrics import confusion_matrix

print(confusion_matrix(y_test, y_pred))
[[94  7]
 [ 9 90]]

Classification report

  • Print classification report
from sklearn.metrics import classification_report

print(classification_report(y_test, y_pred))
              precision    recall  f1-score   support

           0       0.91      0.93      0.92       101
           1       0.93      0.91      0.92        99

    accuracy                           0.92       200
   macro avg       0.92      0.92      0.92       200
weighted avg       0.92      0.92      0.92       200

Obtain feature importance

  • Obtain feature importance
feature_importance = clf.feature_importances_
  • Save as dataframe
df_features = pd.DataFrame(
    {"score": feature_importance,
     "name": feature_names})

df_features
score name
0 0.135131 age
1 0.354875 income
2 0.005587 membership_days
3 0.503843 campaign_engagement
4 0.000564 city_Berlin
5 0.000000 city_Stuttgart
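  • A hedged cross-check (not in the original workflow): impurity-based importances can favor features with many distinct values; permutation importance on the test set is a common alternative
from sklearn.inspection import permutation_importance

perm = permutation_importance(
    clf, X_test, y_test, n_repeats=10, random_state=42)
df_perm = pd.DataFrame(
    {"score": perm.importances_mean, "name": feature_names})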

Plot feature importance

import altair as alt

alt.Chart(df_features).mark_bar().encode(
    x=alt.X('score'),
    y=alt.Y('name', sort='-x')
).properties(
    width=800,
    height=300
)

Save model

from joblib import dump

model_filename = 'gradientboosted_model.joblib'
dump(clf, model_filename)
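  • To reuse the saved model later (as the dashboard and API below do), load it back with joblib
from joblib import load

clf_restored = load(model_filename)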

Summary

  1. We trained a model

  2. Our model predicts whether a customer will respond positively to a campaign or not

  3. The model performs well on the test data (92% accuracy), so we want to use it

  4. We saved the model

  5. We want to use the model to target customers

How to use the model?

Dashboard & API

  • Integrate (“deploy”) the model in a dashboard (e.g. Streamlit)

  • Use an API (e.g. FastAPI) to allow other software applications to use the model

Streamlit dashboard
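
The original slide shows the dashboard itself; below is a minimal, hypothetical sketch of such an app (the file name app.py, widget labels, and default values are assumptions), reusing the saved model.
import streamlit as st
import pandas as pd
from joblib import load

model = load('gradientboosted_model.joblib')

st.title("Campaign Response Prediction")
age = st.number_input("Age", 18, 100, 35)
city = st.selectbox("City", ["Berlin", "Stuttgart"])
income = st.number_input("Annual income", 0, 500000, 50000)
membership_days = st.number_input("Membership days", 0, 5000, 365)
engagement = st.number_input("Campaign engagement", 0, 20, 3)

if st.button("Predict"):
    # rebuild the one-hot city columns used during training
    X_new = pd.DataFrame([{
        "age": age, "income": income,
        "membership_days": membership_days,
        "campaign_engagement": engagement,
        "city_Berlin": int(city == "Berlin"),
        "city_Stuttgart": int(city == "Stuttgart"),
    }])
    proba = model.predict_proba(X_new)[0, 1]
    st.write(f"Predicted response probability: {proba:.2f}")
Started locally with streamlit run app.py.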

FastAPI

  • Use a FastAPI app with a single /predict endpoint

  • Accepts POST requests with JSON data containing age, city, income, membership days, and campaign engagement.

  • The app returns a JSON response with the prediction (a sketch follows below).
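
  • A hedged sketch of such an app (the file name main.py is an assumption; the one-hot step mirrors the training-time get_dummies and the city values in the sample payload below)
from typing import List

import pandas as pd
from fastapi import FastAPI
from joblib import load
from pydantic import BaseModel

app = FastAPI()
model = load('gradientboosted_model.joblib')

class Customer(BaseModel):
    age: int
    city: str
    income: int
    membership_days: int
    campaign_engagement: int

class Batch(BaseModel):
    data: List[Customer]

@app.post("/predict")
def predict(batch: Batch):
    df = pd.DataFrame([dict(c) for c in batch.data])
    # recreate the one-hot city columns used during training
    X_new = df.assign(
        city_Berlin=(df["city"] == "city_Berlin").astype(int),
        city_Stuttgart=(df["city"] == "city_Stuttgart").astype(int),
    )[["age", "income", "membership_days", "campaign_engagement",
       "city_Berlin", "city_Stuttgart"]]
    df["prediction"] = model.predict_proba(X_new)[:, 1].round(2)
    return {"results": df.to_dict(orient="records")}
  • Served locally with, e.g., uvicorn main:app --reload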

Test API with data

  • You can test the API using Python’s requests library:
url = "http://127.0.0.1:8000/predict"
data = {
    "data": [
        {
            "age": 25,
            "city": "city_Berlin",
            "income": 25000,
            "membership_days": 4,
            "campaign_engagement": 1
        },
        {
            "age": 35,
            "city": "city_Stuttgart",
            "income": 120000,
            "membership_days": 250,
            "campaign_engagement": 8
        }
    ]
}

Get response

import requests

response = requests.post(url, json=data)

if response.status_code == 200:
    results = response.json()['results']
    df = pd.DataFrame(results)
    df.to_csv('predictions.csv', index=False)
    print("Predictions saved to 'predictions.csv'")
else:
    print(f"Error: {response.status_code}")
    print(response.text)

Output

age,city,income,membership_days,campaign_engagement,prediction
25,city_Berlin,25000,4,1,0.01
35,city_Stuttgart,120000,250,8,0.97

Marketing campaign

  • Next, we would filter all customers whose predicted response probability exceeds a certain threshold (see the sketch below)

  • What would be a good threshold?

  • Only target those customers with the marketing campaign
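
  • A hedged sketch of picking such a threshold from the test set (the 90% precision target is an assumption for illustration)
from sklearn.metrics import precision_recall_curve

proba = clf.predict_proba(X_test)[:, 1]
precision, recall, thresholds = precision_recall_curve(y_test, proba)

# e.g. the lowest threshold that still reaches 90% precision,
# so most customers we contact are actual responders
ok = precision[:-1] >= 0.9
threshold = thresholds[ok].min() if ok.any() else 0.5
target_customers = X_test[proba >= threshold]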

Questions?