Marketing Campaign Response Prediction

Using Gradient Boosted Trees

Jan Kirenz


  • Predict the response to a marketing campaign

Example Payback

Marketing use case

  • The goal is to predict if
    • a customer
    • will respond positively (e.g. buys a product)
    • to a future campaign
    • based on their features
  • We use data from previous campaigns to train a model

Boosting: An Intuitive Introduction


  • Boosting is an ensemble learning method

  • Combines multiple weak learners to build a strong classifier

  • Learners are trained sequentially

  • Each learner focuses on correcting the mistakes of its predecessor


  1. Begin with a weak learner that performs slightly better than random guessing

  2. Train a new weak learner to correct the mistakes of the previous one

  3. Repeat the process, focusing on different error patterns each time

  4. Combine all weak learners into a strong classifier
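
The four steps above can be sketched as a short loop. This is a minimal illustration, not the exact algorithm used later: it assumes squared-error loss on a regression task, scikit-learn regression stumps as weak learners, and synthetic data. The names `fit_boosted_trees` and `boosted_predict` and the value `learning_rate=0.1` are hypothetical choices made for this sketch.

```python
import numpy as np
from sklearn.tree import DecisionTreeRegressor


def fit_boosted_trees(X, y, n_rounds=50, learning_rate=0.1, max_depth=1):
    """Fit weak learners one after another, each on the current residuals."""
    baseline = float(np.mean(y))            # step 1: a very weak first guess
    prediction = np.full(len(y), baseline)
    trees = []
    for _ in range(n_rounds):               # step 3: repeat the correction
        residuals = y - prediction          # the mistakes made so far
        tree = DecisionTreeRegressor(max_depth=max_depth)
        tree.fit(X, residuals)              # step 2: learn to correct them
        prediction = prediction + learning_rate * tree.predict(X)
        trees.append(tree)
    return baseline, trees


def boosted_predict(X, baseline, trees, learning_rate=0.1):
    """Step 4: combine all weak learners into one prediction."""
    prediction = np.full(X.shape[0], baseline)
    for tree in trees:
        prediction = prediction + learning_rate * tree.predict(X)
    return prediction


# Synthetic data (an assumption for illustration only)
rng = np.random.default_rng(0)
X = rng.uniform(0, 6, size=(200, 1))
y = np.sin(X[:, 0]) + rng.normal(scale=0.1, size=200)

baseline, trees = fit_boosted_trees(X, y)
mse_baseline = float(np.mean((y - baseline) ** 2))
mse_boosted = float(np.mean((y - boosted_predict(X, baseline, trees)) ** 2))
print(f"MSE of the first guess: {mse_baseline:.3f}")
print(f"MSE after boosting:     {mse_boosted:.3f}")
```

Each stump alone is a poor model, but the training error shrinks as the corrections accumulate.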

Difference to Bagging

  • Bagging:
    • Learners are trained independently
    • Training samples are drawn with replacement (bootstrapping)
    • Combines learners by averaging (regression) or voting (classification)
  • Boosting:
    • Learners are trained sequentially
    • Emphasis is placed on misclassified instances
    • Combines learners by weighted averaging
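
To make the contrast concrete, the following sketch trains one bagging and one boosting ensemble on the same data. The dataset is synthetic and the settings (50 estimators each) are assumptions for illustration; the resulting accuracies say nothing about which method is better in general.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import BaggingClassifier, GradientBoostingClassifier
from sklearn.model_selection import train_test_split

# Synthetic binary classification data (assumption for illustration)
X, y = make_classification(n_samples=1000, n_features=5, n_informative=3,
                           n_redundant=0, random_state=42)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42)

# Bagging: independent decision trees, each fitted on a bootstrap sample,
# whose predictions are aggregated at the end
bagging = BaggingClassifier(n_estimators=50, random_state=42)
bagging.fit(X_train, y_train)

# Boosting: shallow trees fitted one after another, each correcting
# the errors left by the trees before it
boosting = GradientBoostingClassifier(n_estimators=50, max_depth=3,
                                      random_state=42)
boosting.fit(X_train, y_train)

bagging_acc = bagging.score(X_test, y_test)
boosting_acc = boosting.score(X_test, y_test)
print(f"Bagging accuracy:  {bagging_acc:.2f}")
print(f"Boosting accuracy: {boosting_acc:.2f}")
```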

Advantages of Boosting

  • Can achieve high accuracy with simple weak learners

  • Less prone to overfitting than single models

  • Can be applied to various learning algorithms

Disadvantages of Boosting

  • Sensitive to noise and outliers

  • Computationally expensive due to sequential training

  • Can overfit if weak learners are too complex
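
One way to watch for overfitting is to track accuracy after every added tree. The sketch below assumes synthetic, deliberately noisy data and a held-out validation split; `staged_predict` is the scikit-learn method that yields predictions after each boosting stage.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

# flip_y=0.1 assigns a random class to about 10% of the samples (label noise)
X, y = make_classification(n_samples=1000, n_features=5, n_informative=3,
                           n_redundant=0, flip_y=0.1, random_state=0)
X_train, X_val, y_train, y_val = train_test_split(
    X, y, test_size=0.2, random_state=0)

clf = GradientBoostingClassifier(n_estimators=200, max_depth=3, random_state=0)
clf.fit(X_train, y_train)

# staged_predict yields predictions after 1, 2, ..., n_estimators trees
train_acc = [accuracy_score(y_train, p) for p in clf.staged_predict(X_train)]
val_acc = [accuracy_score(y_val, p) for p in clf.staged_predict(X_val)]

best_n = int(np.argmax(val_acc)) + 1
print(f"Final training accuracy:   {train_acc[-1]:.2f}")
print(f"Final validation accuracy: {val_acc[-1]:.2f}")
print(f"Validation accuracy peaks at {best_n} trees")
```

If validation accuracy peaks well before the last tree while training accuracy keeps rising, fewer or simpler trees are likely the better choice.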

Code example

Data overview

  • age: Customer’s age (integer)
  • city: Customer’s place of residence (string: ‘Berlin’, ‘Stuttgart’)
  • income: Customer’s annual income (integer)
  • membership_days: Number of days the customer has been a member (integer)
  • campaign_engagement: Number of times the customer engaged with previous campaigns (integer)
  • target: Whether the customer responded positively to the campaign (0 or 1)

Import data

df = pd.read_csv(

Data overview

     age       city  income  membership_days  campaign_engagement  target
0     56     Berlin  136748              837                    3       1
1     46  Stuttgart   25287              615                    8       0
2     32     Berlin  146593             2100                    3       0
3     60     Berlin   54387             2544                    0       0
4     25     Berlin   28512              138                    6       0
...  ...        ...     ...              ...                  ...     ...
995   22     Berlin   49241             2123                    4       0
996   40  Stuttgart  116214              970                    5       1
997   27  Stuttgart   64569             2552                    6       0
998   61  Stuttgart   31745             2349                    8       1
999   19     Berlin   46029             2185                    2       0

1000 rows × 6 columns

Data info
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 1000 entries, 0 to 999
Data columns (total 6 columns):
 #   Column               Non-Null Count  Dtype 
---  ------               --------------  ----- 
 0   age                  1000 non-null   int64 
 1   city                 1000 non-null   object
 2   income               1000 non-null   int64 
 3   membership_days      1000 non-null   int64 
 4   campaign_engagement  1000 non-null   int64 
 5   target               1000 non-null   int64 
dtypes: int64(5), object(1)
memory usage: 47.0+ KB

Data corrections

  • Encode categorical variables
df = pd.get_dummies(df, columns=['city'])
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 1000 entries, 0 to 999
Data columns (total 7 columns):
 #   Column               Non-Null Count  Dtype
---  ------               --------------  -----
 0   age                  1000 non-null   int64
 1   income               1000 non-null   int64
 2   membership_days      1000 non-null   int64
 3   campaign_engagement  1000 non-null   int64
 4   target               1000 non-null   int64
 5   city_Berlin          1000 non-null   uint8
 6   city_Stuttgart       1000 non-null   uint8
dtypes: int64(5), uint8(2)
memory usage: 41.1 KB
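
A small sketch of what the encoding does, on a three-row stand-in for the campaign data (the values are illustrative). The output above shows `uint8` columns; pandas 2.0 and later return boolean columns by default, so passing `dtype=int` is one way to get 0/1 columns regardless of version.

```python
import pandas as pd

# Tiny stand-in for the campaign data (values are illustrative)
df = pd.DataFrame({
    "age": [56, 46, 32],
    "city": ["Berlin", "Stuttgart", "Berlin"],
})

# dtype=int forces 0/1 columns regardless of the pandas version's default
encoded = pd.get_dummies(df, columns=["city"], dtype=int)
print(encoded)
print(list(encoded.columns))
```

The original `city` column is replaced by one indicator column per city, named `city_<value>`.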

Data splitting

  • Split the df into features (X) and target (y)
X = df.drop('target', axis=1)
y = df['target']
  • Save feature names for later evaluation steps
feature_names = X.columns
  • Make train and test split
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42)
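
With 1000 rows and `test_size=0.2`, the split yields 800 training and 200 test rows. The sketch below checks this on synthetic stand-in data and additionally passes `stratify=y`, an optional argument not used in the original call, which keeps the share of positive responses nearly identical in both parts.

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Stand-in data with 1000 rows and 6 features (values are synthetic)
rng = np.random.default_rng(42)
X = rng.normal(size=(1000, 6))
y = rng.integers(0, 2, size=1000)

# stratify=y is an optional addition for this sketch
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42, stratify=y)

print(X_train.shape, X_test.shape)
print(f"Positive share overall: {y.mean():.3f}")
print(f"Positive share in test: {y_test.mean():.3f}")
```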

Select model

  • Define hyperparameters as dictionary
params = {
    "n_estimators": 50,
    "max_depth": 3,
    "min_samples_split": 5,
}
  • n_estimators: Number of gradient boosted trees
  • max_depth: Maximum tree depth
  • min_samples_split: The minimum number of samples required to split an internal node
from sklearn.ensemble import GradientBoostingClassifier

clf = GradientBoostingClassifier(**params)

Train model

  • Train the model on the training data
clf.fit(X_train, y_train)
GradientBoostingClassifier(min_samples_split=5, n_estimators=50)

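
The printed estimator lists only `min_samples_split` and `n_estimators`, although `max_depth` was also set. A likely explanation is that scikit-learn prints only parameters that differ from their defaults, and `max_depth=3` is the default for `GradientBoostingClassifier`. A minimal check, assuming a reasonably recent scikit-learn version:

```python
from sklearn.ensemble import GradientBoostingClassifier

params = {
    "n_estimators": 50,
    "max_depth": 3,
    "min_samples_split": 5,
}
clf = GradientBoostingClassifier(**params)

# Only parameters that differ from the defaults are printed
print(clf)

# The full configuration, including max_depth, is still available
print(clf.get_params()["max_depth"])
```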
Evaluate model

  • Predict on the testing data
# Predict on the testing data
y_pred = clf.predict(X_test)
  • Calculate accuracy
accuracy_score(y_test, y_pred)

Confusion matrix

  • Print confusion matrix
print(confusion_matrix(y_test, y_pred))
[[94  7]
 [ 9 90]]

Classification report

  • Print classification report
print(classification_report(y_test, y_pred))
              precision    recall  f1-score   support

           0       0.91      0.93      0.92       101
           1       0.93      0.91      0.92        99

    accuracy                           0.92       200
   macro avg       0.92      0.92      0.92       200
weighted avg       0.92      0.92      0.92       200
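
The figures in the report follow directly from the confusion matrix. The sketch below recomputes accuracy and the class 1 metrics by hand, using the four counts printed above.

```python
# Confusion matrix from above: rows are true classes, columns are predictions
tn, fp = 94, 7   # true class 0: correctly rejected, falsely predicted as 1
fn, tp = 9, 90   # true class 1: missed, correctly predicted as 1

total = tn + fp + fn + tp
accuracy = (tn + tp) / total
precision_1 = tp / (tp + fp)   # share of predicted responders who responded
recall_1 = tp / (tp + fn)      # share of actual responders the model found
f1_1 = 2 * precision_1 * recall_1 / (precision_1 + recall_1)

print(f"accuracy:    {accuracy:.2f}")
print(f"precision 1: {precision_1:.2f}")
print(f"recall 1:    {recall_1:.2f}")
print(f"f1 1:        {f1_1:.2f}")
```

For a campaign, precision tells how many contacted customers are likely to respond, while recall tells how many potential responders would be reached.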

Obtain feature importance

  • Obtain feature importance
feature_importance = clf.feature_importances_
  • Save as dataframe
df_features = pd.DataFrame(
    {"score": feature_importance,
     "name": feature_names})

      score                 name
0  0.135131                  age
1  0.354875               income
2  0.005587      membership_days
3  0.503843  campaign_engagement
4  0.000564          city_Berlin
5  0.000000       city_Stuttgart
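
The scores sum to 1, so they can be read as shares. A short sketch that ranks the features and adds a cumulative share, using the values from the table above (the `cumulative` column is an addition for illustration):

```python
import pandas as pd

# Scores copied from the table above
df_features = pd.DataFrame({
    "score": [0.135131, 0.354875, 0.005587, 0.503843, 0.000564, 0.000000],
    "name": ["age", "income", "membership_days",
             "campaign_engagement", "city_Berlin", "city_Stuttgart"],
})

# Rank features from most to least important and accumulate their shares
ranked = df_features.sort_values("score", ascending=False).reset_index(drop=True)
ranked["cumulative"] = ranked["score"].cumsum()
print(ranked)
```

In this model, campaign_engagement and income together account for about 86% of the total importance, while the city columns contribute almost nothing.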

Plot feature importance

    y=alt.Y('name', sort='-x')

Save model

from joblib import dump

model_filename = 'gradientboosted_model.joblib'
dump(clf, model_filename)
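
A saved model can be restored with `joblib.load`. The sketch below checks a full round trip on a small synthetic model and writes to a temporary directory, so it does not touch the real model file; the data and the decision rule behind `y` are assumptions for illustration.

```python
import os
import tempfile

import numpy as np
from joblib import dump, load
from sklearn.ensemble import GradientBoostingClassifier

# Small synthetic training set (assumption for illustration)
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 6))
y = (X[:, 0] + X[:, 1] > 0).astype(int)

clf = GradientBoostingClassifier(n_estimators=50, max_depth=3,
                                 min_samples_split=5, random_state=0)
clf.fit(X, y)

# Save to a temporary directory so the sketch leaves no files behind
with tempfile.TemporaryDirectory() as tmp_dir:
    model_path = os.path.join(tmp_dir, "gradientboosted_model.joblib")
    dump(clf, model_path)
    restored = load(model_path)

# The restored model should predict exactly like the original
same_predictions = bool(np.array_equal(clf.predict(X), restored.predict(X)))
print(same_predictions)
```

joblib files are pickle-based, so they should only be loaded from trusted sources, ideally with the same scikit-learn version that created them.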


  1. We trained a model

  2. Our model predicts whether a customer will respond positively or not

  3. The model does a good job and we want to use it

  4. We saved the model

  5. We want to use the model to target customers

How to use the model?

Dashboard & API

  • Integrate (“deploy”) the model in a dashboard (e.g. Streamlit)

  • Use an API (e.g. FastAPI) to allow other software applications to use the model

Streamlit dashboard


  • Use a FastAPI app with a single /predict endpoint

  • Accepts POST requests with JSON data containing age, city, income, membership days, and campaign engagement.

  • The app will return a JSON response with the prediction.
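
Inside such an endpoint, the JSON records have to be turned into the same columns, in the same order, that the model was trained on. How the original app does this is not shown; the sketch below is one possible approach, assuming the request sends city values that already match the dummy column names (for example "city_Berlin"). The helper name `payload_to_features` is hypothetical.

```python
import pandas as pd

# Column order of the training features (encoded DataFrame without target)
FEATURE_COLUMNS = ["age", "income", "membership_days",
                   "campaign_engagement", "city_Berlin", "city_Stuttgart"]


def payload_to_features(payload: dict) -> pd.DataFrame:
    """Turn {"data": [record, ...]} into a frame with the training columns."""
    records = pd.DataFrame(payload["data"])
    # The city values already carry the dummy column name, e.g. "city_Berlin"
    city_dummies = pd.get_dummies(records["city"], dtype=int)
    features = pd.concat([records.drop(columns="city"), city_dummies], axis=1)
    # Add any city column missing from this request and fix the column order
    return features.reindex(columns=FEATURE_COLUMNS, fill_value=0)


payload = {"data": [{"age": 25, "city": "city_Berlin", "income": 25000,
                     "membership_days": 4, "campaign_engagement": 1}]}
features = payload_to_features(payload)
print(features)
```

The `reindex` step matters because a request containing only Berlin customers would otherwise lack the `city_Stuttgart` column, and the model would reject the input.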

Test API with data

  • You can test the API using Python’s requests library:
url = ""
data = {
    "data": [
        {
            "age": 25,
            "city": "city_Berlin",
            "income": 25000,
            "membership_days": 4,
            "campaign_engagement": 1
        },
        {
            "age": 35,
            "city": "city_Stuttgart",
            "income": 120000,
            "membership_days": 250,
            "campaign_engagement": 8
        }
    ]
}

Get response

import requests

# Send the customer records to the /predict endpoint as JSON
response = requests.post(url, json=data)

if response.status_code == 200:
    results = response.json()['results']
    df = pd.DataFrame(results)
    df.to_csv('predictions.csv', index=False)
    print("Predictions saved to 'predictions.csv'")
else:
    print(f"Error: {response.status_code}")



Marketing campaign

  • Next, we would keep only the customers whose predicted response exceeds a certain threshold

  • What would be a good threshold?

  • Only target those customers with the marketing campaign
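
One possible way to choose the threshold is a break-even rule: contact a customer only if the expected profit from a response exceeds the cost of the contact. The sketch below is an illustration under stated assumptions: the API is assumed to return a response probability per customer, and the column names, cost, and profit figures are all hypothetical.

```python
import pandas as pd

# Hypothetical API results: one predicted response probability per customer
predictions = pd.DataFrame({
    "customer_id": [1, 2, 3, 4, 5],
    "probability": [0.91, 0.12, 0.55, 0.34, 0.78],
})

# Assumed economics: contacting a customer costs 2, a response earns 10
cost_per_contact = 2.0
profit_per_response = 10.0

# Contacting pays off when probability * profit is at least the cost
threshold = cost_per_contact / profit_per_response

targets = predictions[predictions["probability"] >= threshold]
print(f"Break-even threshold: {threshold:.2f}")
print(targets["customer_id"].tolist())
```

A higher contact cost or a smaller profit per response pushes the threshold up and shrinks the target group.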
