What we cover


In this example, we mainly follow the TensorFlow tutorial "Building a TFX Pipeline Locally" to build a pipeline from a prebuilt template.

To complete this tutorial, you need the following environment:

Furthermore, you should be aware of pathing differences between macOS, Windows and Linux:

We first need to set up our environment and create new folders:

  1. On Windows, open the Start menu and launch an Anaconda Command Prompt. On macOS or Linux, open a terminal window.
  2. Activate the Anaconda virtual environment (in my case "tf"):
conda activate tf
  3. Create a project folder tfx-files with mkdir (make directory):
mkdir tfx-files
  4. Change directory (cd) into tfx-files, where we will create a sub-folder called output:
cd your-path-to-tfx-files
  5. Create the new sub-folder output:
mkdir output
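
The folder setup above can also be scripted in a platform-independent way, which sidesteps the path separator differences between Windows, macOS and Linux. The following is a minimal sketch, not part of the original tutorial; the function name create_project_dirs and its base_dir parameter are illustrative assumptions.

```python
from pathlib import Path


def create_project_dirs(base_dir):
    """Create base_dir/tfx-files/output and return (project_dir, output_dir)."""
    project_dir = Path(base_dir) / "tfx-files"
    output_dir = project_dir / "output"
    # parents=True also creates tfx-files; exist_ok=True makes reruns harmless.
    output_dir.mkdir(parents=True, exist_ok=True)
    return project_dir, output_dir
```

For example, create_project_dirs(Path.home()) would place tfx-files in your home directory, which is only one possible location.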

Create a copy of the pipeline template

In the next steps, we use the TFX command-line interface (CLI) which is a part of the TFX package. All commands start with tfx.

First, we use tfx template, a group of commands for listing and copying TFX pipeline templates.

  1. List the currently available TFX pipeline templates:
tfx template list
  2. Copy the penguin template to your local machine (you have to adjust the following command):
tfx template copy --model=penguin --pipeline_name=pipeline-tutorial \

Only change the entry for destination_path:

  3. A copy of the pipeline template has been created at the path you specified.

Explore the directories and files that were copied to your pipeline's project directory tfx-files:
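
If you prefer to inspect the copied template from Python rather than a file browser, a short helper can list everything below the project directory. This is a sketch under the assumption that the template was copied into tfx-files; list_project_files is a hypothetical helper name, not something the TFX template provides.

```python
from pathlib import Path


def list_project_files(project_dir):
    """Return all files and folders below project_dir as sorted relative paths."""
    root = Path(project_dir)
    # as_posix() gives forward slashes on every platform, so output looks the same everywhere.
    return sorted(p.relative_to(root).as_posix() for p in root.rglob("*"))


# Usage (replace the placeholder with your own path):
# for entry in list_project_files("your-path-to-tfx-files"):
#     print(entry)
```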

Before we can create our pipeline, we first need to change some code in the file local_runner.py. This script creates a pipeline run and specifies the run's parameters, such as the DATA_PATH and OUTPUT_DIR.

Note that you don't necessarily have to change the variable definition of


since the given expression returns the full path name in a multiplatform-safe way.
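
As a generic illustration of what a multiplatform-safe path expression can look like, the sketch below builds the absolute path of a folder that sits next to a script. It is an assumption-based example and is not claimed to be the exact expression used in local_runner.py; the helper name data_dir_next_to is hypothetical.

```python
import os


def data_dir_next_to(script_path, folder_name="data"):
    """Return the absolute path of folder_name located beside script_path."""
    script_dir = os.path.dirname(os.path.abspath(script_path))
    # os.path.join inserts the separator of the current operating system.
    return os.path.join(script_dir, folder_name)
```

Inside a script, calling data_dir_next_to(__file__) resolves the folder relative to the script itself, so the result does not depend on the directory from which the script is started.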

  1. Open the file with your code editor and define the variables OUTPUT_DIR (in line 32) and DATA_PATH:
OUTPUT_DIR = 'your-path-to-tfx-files/output'
DATA_PATH = 'your-path-to-tfx-files/data/'
  2. Save all changes and close the file.
  3. In your terminal, change directory (cd) into the project directory tfx-files:
cd your-path-to-tfx-files
  4. Run the following command in your pipeline directory:
tfx pipeline create --pipeline_path=your-path-to-tfx-files/local_runner.py

In my case this would be tfx pipeline create --pipeline_path=/Users/jankirenz/tfx-files/local_runner.py

After you run the command, the last output line should display Pipeline "pipeline-tutorial" created successfully.

  5. Finally, run this command:
tfx run create --pipeline_name=pipeline-tutorial
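
The two CLI calls above can also be driven from a small Python script, which may be convenient if you rerun the pipeline often. This is only a sketch: it assumes the tfx executable is available in the active environment, and the helper names build_tfx_commands and run_tfx_commands are hypothetical. The flags are the ones used in the commands above.

```python
import subprocess
from pathlib import Path


def build_tfx_commands(project_dir, pipeline_name):
    """Return the 'pipeline create' and 'run create' commands as argument lists."""
    runner = Path(project_dir) / "local_runner.py"
    return [
        ["tfx", "pipeline", "create", f"--pipeline_path={runner}"],
        ["tfx", "run", "create", f"--pipeline_name={pipeline_name}"],
    ]


def run_tfx_commands(project_dir, pipeline_name):
    """Run both commands in order from inside the project directory."""
    for command in build_tfx_commands(project_dir, pipeline_name):
        # check=True raises CalledProcessError and stops at the first failing command.
        subprocess.run(command, cwd=project_dir, check=True)
```

Keeping the command construction separate from the execution makes it possible to print and check the commands before anything is actually run.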

Open your pipeline's pipeline/configs.py file and review the contents:

Open your pipeline's pipeline/pipeline.py file and review the contents:

Congratulations! You have completed the tutorial and learned how to:

✅ Install a TFX pipeline template
✅ Create a local TFX pipeline run
✅ Review the pipeline components

Jan Kirenz

Thank you for participating in this tutorial. If you found any issues along the way I'd appreciate it if you'd raise them by clicking the "Report a mistake" button at the bottom left of this site.

Copyright: Jan Kirenz (2021) | kirenz.com | CC BY-NC 2.0 License