Deployment

Model deployment is often challenging for a number of reasons (Tang, 2022), for example:

  • It involves deploying models to heterogeneous environments, e.g. edge devices, mobile devices, GPUs, etc.

  • It is hard to compress a model to a size small enough to fit on devices with limited storage while preserving its accuracy and minimizing the overhead of loading it for inference.

  • Deployed models sometimes need to process new data records within limited memory on small devices.

  • Many deployment environments have poor network connectivity, so cloud-based solutions may not meet the requirements.

  • There is interest in stronger user-data privacy paradigms in which user data never has to leave the mobile device.

  • There’s growing demand to perform on-device model-based data filtering before collecting the data.

Review Yuan Tang’s excellent overview of the different options for deploying models in R. Here, we only take a look at some of them.

TensorFlow

TensorFlow Extended (TFX) is an end-to-end platform for deploying production ML pipelines.
R

For example, TensorFlow’s SavedModel format, as well as its optimized counterpart TensorFlow Lite, enables on-device machine learning inference with low latency and a small binary size.

The packages listed below can produce models in this format. Note that these packages are R wrappers around the corresponding Python APIs, built on the reticulate package. Although a Python installation is required to create the models, it is not required at inference time for deployment.

  • The tensorflow package provides full access to the TensorFlow API for numerical computation using data flow graphs.

  • The tfestimators package provides a high-level API for machine learning models as well as highly customized neural network architectures.

  • The keras package provides a high-level API for constructing different types of neural networks.
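As a sketch of how such a model might be produced from R, assuming the keras and tensorflow packages plus a working Python TensorFlow installation, a small network can be exported in the SavedModel format and then converted to TensorFlow Lite. The layer sizes and file names here are purely illustrative:

```r
library(keras)

# Define and compile a small example network
model <- keras_model_sequential() %>%
  layer_dense(units = 8, activation = "relu", input_shape = c(4)) %>%
  layer_dense(units = 1)
model %>% compile(optimizer = "adam", loss = "mse")

# Export in the SavedModel format (a directory, not a single file)
save_model_tf(model, "my_saved_model")

# Convert to TensorFlow Lite for on-device inference (via reticulate)
library(tensorflow)
converter <- tf$lite$TFLiteConverter$from_saved_model("my_saved_model")
writeBin(converter$convert(), "model.tflite")
```

The resulting `model.tflite` file is what would ship to an edge or mobile device; the SavedModel directory is what a server-side runtime such as TensorFlow Serving would load.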

Python

If you want to learn more about Google’s TensorFlow Extended (TFX), an end-to-end Python-based platform for deploying production ML pipelines, here are some tutorials to get you started:

RStudio’s Model Management

If you want to learn more about the complete data science lifecycle, including deployment, take a look at RStudio’s “Model Management”, a workflow within the overall model lifecycle that can be used to manage multiple versions of deployed models in production:

Plumber Web API

To deploy your model as a web API, you can use the plumber R package, which lets users expose existing R code as a service available to others on the web:
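A minimal sketch of what this looks like, using a toy linear model on the built-in `cars` data (the endpoint name and model are illustrative): plumber turns a file of specially annotated R functions into HTTP endpoints.

```r
# plumber.R -- comments beginning with #* are plumber annotations
library(plumber)

# A toy model: stopping distance as a function of speed
model <- lm(dist ~ speed, data = cars)

#* Predict stopping distance for a given speed
#* @param speed vehicle speed in mph
#* @get /predict
function(speed) {
  predict(model, newdata = data.frame(speed = as.numeric(speed)))
}
```

Running `plumber::pr("plumber.R") |> plumber::pr_run(port = 8000)` then serves the model, so a request such as `http://localhost:8000/predict?speed=10` returns a prediction as JSON.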