What we cover

In this tutorial, we run Kubeflow Pipelines samples on a local machine.

Kubeflow Pipelines is a platform for building and deploying portable, scalable machine learning (ML) workflows based on Docker containers. We mainly follow the instructions from this Kubeflow tutorial.
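To make the idea of a workflow built from separate steps more concrete, here is a minimal sketch in plain Python. It does not use the Kubeflow Pipelines SDK: the step names, the `WORKFLOW` dictionary and the `run_workflow` helper are all made up for illustration, and each function merely stands in for what would be a container in a real pipeline.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Hypothetical steps; in a real pipeline each would run in its own container.
def load_data():
    return [1, 2, 3, 4]

def double(values):
    return [2 * v for v in values]

def total(values):
    return sum(values)

# Each step names the steps whose outputs it consumes.
WORKFLOW = {
    "load_data": {"func": load_data, "needs": []},
    "double": {"func": double, "needs": ["load_data"]},
    "total": {"func": total, "needs": ["double"]},
}

def run_workflow(workflow):
    """Run every step after its dependencies; return outputs keyed by step name."""
    graph = {name: set(spec["needs"]) for name, spec in workflow.items()}
    outputs = {}
    for name in TopologicalSorter(graph).static_order():
        spec = workflow[name]
        args = [outputs[dep] for dep in spec["needs"]]
        outputs[name] = spec["func"](*args)
    return outputs

results = run_workflow(WORKFLOW)
print(results["total"])  # 20
```

The point of the sketch is only the ordering: a step runs once the outputs it depends on are available, which is the same dependency graph you later see drawn in the pipelines UI.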

To start this tutorial, you need a working MiniKF installation. If you don't have MiniKF installed on your system, review this Kubeflow installation tutorial.

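According to Kubeflow's documentation, the Kubeflow Pipelines platform consists of:

  - A user interface (UI) for managing and tracking experiments, jobs, and runs.
  - An engine for scheduling multi-step ML workflows.
  - An SDK for defining and manipulating pipelines and components.
  - Notebooks for interacting with the system using the SDK.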

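The following are the goals of Kubeflow Pipelines:

  - End-to-end orchestration: enabling and simplifying the orchestration of machine learning pipelines.
  - Easy experimentation: making it easy to try numerous ideas and techniques and to manage your various trials and experiments.
  - Easy re-use: enabling you to re-use components and pipelines to quickly create end-to-end solutions without having to rebuild each time.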

Kubeflow Pipelines offers a few samples that you can use to try out Kubeflow Pipelines quickly. The steps below show you how to run a basic sample that includes some Python operations, but doesn't include a machine learning (ML) workload:

  1. Open the Kubeflow menu at the top left and select Pipelines.
  2. Click the name of the sample, [Tutorial] Data passing in python components, on the pipelines UI.
  3. On the top right, click Create experiment.
  4. Provide the experiment details.
  5. Click Next.
  6. Change the name of the run.
  7. Click Start.
  8. Click the name of the run on the experiments dashboard to open the run.
  9. Wait until the pipeline is finished.
  10. Explore the graph and other aspects of your run by clicking on the components of the graph and the other UI elements.

You can find the source code for the Data passing in python components tutorial in the Kubeflow Pipelines repo.
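The topic of this sample, passing data between Python components, can be illustrated without the SDK. The sketch below is a stand-in under stated assumptions, not the sample's code: the function names are hypothetical, plain functions play the role of components, and a temporary directory stands in for the storage a pipeline backend would manage. It shows two patterns: returning a small value directly and handing over larger data through a file.

```python
import json
import tempfile
from pathlib import Path

def produce_number() -> int:
    """Small values can simply be returned to the next step."""
    return 42

def write_records(output_path: Path, count: int) -> None:
    """Larger data is written to a file that a later step reads."""
    records = [{"id": i, "square": i * i} for i in range(count)]
    output_path.write_text(json.dumps(records))

def summarize_records(input_path: Path) -> int:
    """Consume the file produced by write_records and sum the squares."""
    records = json.loads(input_path.read_text())
    return sum(r["square"] for r in records)

def run_pipeline() -> dict:
    """Wire the hypothetical steps together in order."""
    with tempfile.TemporaryDirectory() as workdir:
        records_file = Path(workdir) / "records.json"
        number = produce_number()
        write_records(records_file, count=5)
        squares_total = summarize_records(records_file)
    return {"number": number, "squares_total": squares_total}

result = run_pipeline()
print(result)  # {'number': 42, 'squares_total': 30}
```

In this sketch the caller chooses the file path and passes it to both steps, so neither step needs to know where the other one stores its data.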

This section shows you how to run the XGBoost sample available from the pipelines UI. Unlike the basic sample described above, the XGBoost sample does include ML components.

  1. Open the Kubeflow menu at the top left and select Pipelines.
  2. Click the name of the sample, [Demo] XGBoost - Iterative model training, on the pipelines UI.
  3. On the top right, click Create experiment.
  4. Provide the experiment details.
  5. Click Next.
  6. Change the name of the run.
  7. Click Start.
  8. Click the name of the run on the experiments dashboard to open the run.
  9. Wait until the pipeline is finished (the last component is Xgboost predict).
  10. Explore the graph and other aspects of your run by clicking on the components of the graph and the other UI elements.

You can find the source code in the Kubeflow Pipelines repo.
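As a rough illustration of what iterative model training can mean, here is a toy sketch in plain Python. It does not use XGBoost and is not the sample's code. It assumes that "iterative" means training a boosted model for a few rounds, checking its error, and then training further from the existing model. The helper names (`fit_stump`, `train`, `predict`, `mse`), the data and the learning rate are all invented for the example.

```python
def fit_stump(xs, residuals):
    """Pick the threshold split on x that best fits the residuals (squared error)."""
    best = None
    for threshold in sorted(set(xs))[:-1]:  # skip the max so both sides are non-empty
        left = [r for x, r in zip(xs, residuals) if x <= threshold]
        right = [r for x, r in zip(xs, residuals) if x > threshold]
        left_mean = sum(left) / len(left)
        right_mean = sum(right) / len(right)
        error = (sum((r - left_mean) ** 2 for r in left)
                 + sum((r - right_mean) ** 2 for r in right))
        if best is None or error < best[0]:
            best = (error, threshold, left_mean, right_mean)
    return best[1:]  # (threshold, left_mean, right_mean)

def predict(model, x):
    """Base value plus the scaled contribution of every stump."""
    value = model["base"]
    for threshold, left_mean, right_mean in model["stumps"]:
        value += model["learning_rate"] * (left_mean if x <= threshold else right_mean)
    return value

def mse(model, xs, ys):
    return sum((y - predict(model, x)) ** 2 for x, y in zip(xs, ys)) / len(xs)

def train(xs, ys, rounds, model=None, learning_rate=0.5):
    """Add `rounds` stumps; pass an existing model to continue training from it."""
    if model is None:
        model = {"base": sum(ys) / len(ys), "stumps": [], "learning_rate": learning_rate}
    else:
        model = {**model, "stumps": list(model["stumps"])}  # copy, keep the original intact
    for _ in range(rounds):
        residuals = [y - predict(model, x) for x, y in zip(xs, ys)]
        model["stumps"].append(fit_stump(xs, residuals))
    return model

xs = list(range(10))
ys = [x * x for x in xs]

first = train(xs, ys, rounds=5)                 # initial training
second = train(xs, ys, rounds=5, model=first)   # continue from the first model

print(len(first["stumps"]), len(second["stumps"]))
print(mse(second, xs, ys) < mse(first, xs, ys))
```

Each round fits a small model to the errors left by the previous rounds, so continuing from `first` only adds new stumps on top of the ones already trained.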

When you are done, you can exit Kubeflow and stop MiniKF:

  1. Log out from Kubeflow (click the ⍈ symbol at the top right of the Kubeflow UI).
  2. Navigate to your MiniKF browser window at http://10.10.10.10.
  3. Click on the terminal (in the middle of the screen).
  4. Press Ctrl-C to exit.

Congratulations! You have completed the tutorial and learned how to:

✅ Run a basic Kubeflow pipeline
✅ Run an ML pipeline with XGBoost

This tutorial showed you how to run some of the examples supplied in the Kubeflow Pipelines UI. Next, you may want to run a pipeline from a notebook, or compile and run a sample from the code. See the guide to experimenting with the Kubeflow Pipelines samples.

Build your own machine-learning pipelines with the Kubeflow Pipelines SDK.

Jan Kirenz

Thank you for participating in this tutorial. If you found any issues along the way I'd appreciate it if you'd raise them by clicking the "Report a mistake" button at the bottom left of this site.

Copyright: Jan Kirenz (2021) | kirenz.com | CC BY-NC 2.0 License