Explore


Our playbooks are collections of observations that we have made many times across different sectors and clients. However, there are some emerging technologies and approaches that we have only applied in one or two places to date, but which we think are really promising. We believe they will become recommended practices in the future, or are at least worth experimenting with. For now, we recommend that you at least explore them.

Feature stores

Data is central to any ML system - it’s needed both online and offline, for exploration and real-time prediction. One of the challenges in operationalising any ML algorithm is ensuring that the data used to train the model is also available in production. It is not simply the raw data that is used by the model - in most cases the raw data needs to be transformed in some way to create a data feature. (See Key terms for a description of features and feature engineering.)
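To make the idea of a feature concrete, here is a minimal sketch of that kind of raw-to-feature transform. It turns raw order events into a per-product average lead time feature (an example borrowed from the experience report below); the column names and the pandas-based approach are illustrative assumptions, not part of the playbook.

```python
# Hypothetical feature-engineering step: aggregate raw order events into a
# per-product "average lead time" feature. Column names are assumptions.
import pandas as pd

def build_lead_time_feature(raw_orders: pd.DataFrame) -> pd.DataFrame:
    """Turn raw order events into a per-product feature table."""
    orders = raw_orders.copy()
    orders["lead_time_days"] = (
        orders["delivered_at"] - orders["ordered_at"]
    ).dt.total_seconds() / 86_400
    # One row per product: the model consumes this feature, not the raw events.
    return (
        orders.groupby("product_id")["lead_time_days"]
        .mean()
        .rename("average_lead_time_days")
        .reset_index()
    )
```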

Creating a feature can be time-consuming, and you need it to be available for both offline and online activities. Furthermore, a feature you have created for one purpose may well be relevant for another task. A feature store is a component that manages the ingestion of raw data (from databases, event streams etc.) and turns it into features which can be used both to train models and as an input to the operational model. It takes the place of the data warehouse and the operational data pipelines - providing a batch API or query mechanism for retrieval of feature datasets for model training, as well as a low-latency API to provide data for real-time predictions.

The benefits are that:

  • You do not need to create a separate data pipeline for online inference

  • Exactly the same transforms are used for training as for online inference
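As a rough illustration of these two benefits, the sketch below shows a toy, in-memory feature store that serves exactly the same stored feature values through a batch interface for training and a low-latency lookup for online inference. It is a simplified, hypothetical design and not the API of any particular product; real feature stores back these paths with a data warehouse and a key-value store.

```python
# Toy feature store illustrating the dual offline/online access pattern.
# Everything here is a simplification for illustration only.
from typing import Dict, List

import pandas as pd


class ToyFeatureStore:
    def __init__(self) -> None:
        # Feature values keyed by entity id, e.g. product_id -> {feature_name: value}
        self._features: Dict[str, Dict[str, float]] = {}

    def ingest(self, feature_df: pd.DataFrame, entity_column: str) -> None:
        """Write a feature table, e.g. the output of build_lead_time_feature."""
        for row in feature_df.to_dict(orient="records"):
            entity_id = str(row.pop(entity_column))
            self._features.setdefault(entity_id, {}).update(row)

    def get_training_dataframe(self, feature_names: List[str]) -> pd.DataFrame:
        """Batch retrieval: a snapshot of features for model training."""
        rows = [
            {"entity_id": eid, **{name: vals.get(name) for name in feature_names}}
            for eid, vals in self._features.items()
        ]
        return pd.DataFrame(rows)

    def get_online_features(self, entity_id: str, feature_names: List[str]) -> Dict[str, float]:
        """Low-latency lookup of features for one entity at prediction time."""
        vals = self._features.get(str(entity_id), {})
        return {name: vals.get(name) for name in feature_names}
```

Because both retrieval paths read the same stored values, the transform that produced them only has to be written and run once - which is exactly what the two benefits above describe.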

Experience report

In my experience, MLOps is not just about tooling. It’s a culture - a mindset for bringing data scientists, engineers and content experts together - but let’s just focus on some tooling for now!

One of the marks of successfully getting machine learning into operations is that tedious and difficult tasks are automated, and it is easy for developers and data scientists to work together. I’ve recently been using Google Vertex AI as the framework for managing machine learning models at an online retailer. Prior to using Vertex AI, there were several teams doing ML operations in different ways. Some were using Airflow and Kubernetes, others were using hand-rolled in-house builds and data stores.

We have used Vertex AI to create a shared toolset for managing the model lifecycle, with standardised components to do the typical things you need to do:

  • Workflow management/orchestration

  • Model serving

  • Model repository

  • Feature store

I have found the feature store to be really useful. Our models need to use aggregated features like average lead times for products, and the Vertex AI feature store is a good place to calculate and store them. Using the feature store means that I know that the data is processed the same way for training as when applied to the model in production. It saves us time because we don’t have to create separate data pipelines for the deployed model in operation. It also has other advantages - keeping data in the feature store makes it easier to query how these aggregated features have changed over time. I think they will become a standard part of most ML environments.

Bas Geerdink, ML specialist

Equal Experts, EU

See Provide an environment that allows data scientists to create and test models