Business impact is more than just accuracy - understand your baseline

When working on an initiative that involves cutting-edge technology like AI/ML, it is very easy to be blindsided by the technological aspects of the initiative. Discussions about the algorithms to be used, computational power, speciality hardware and software, bending data to your will, and opportunities to reveal deep insights can leave business stakeholders with high expectations bordering on the magical.

The engineers in the room will want to get cracking as soon as possible. Most initiatives, however, will run into data definition challenges, data availability challenges and data quality issues. The cool tech, having shown near-miraculous output as a proof of concept, starts falling short of the production-level expectations set at the POC stage, creating disillusionment.

To avoid this disillusionment, it is important at the beginning of the initiative to detail the business metrics that the work is expected to affect.

The team and the stakeholders then have to translate these metrics into a desired level of accuracy or performance for the ML, measured against an established baseline. The desired level of accuracy can be staggered in relation to the size of the business outcome (the impact on the business metrics) to define a hurdle rate beyond which the model becomes acceptable.
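
As a minimal sketch of how such an agreement might be captured (the metric, baseline figure and hurdle rate below are hypothetical illustrations, not values from the playbook), the comparison against the baseline can be made explicit rather than implicit:

```python
# Hypothetical sketch: judge a candidate model against the measured "as-is" baseline
# and an agreed hurdle rate, rather than against an arbitrary accuracy target.
# All figures are illustrative assumptions.

BASELINE_CONVERSION_RATE = 0.042   # measured from the current, non-ML process
HURDLE_RATE = 0.10                 # agreed minimum relative improvement (10%)

def beats_hurdle(candidate_metric: float,
                 baseline_metric: float = BASELINE_CONVERSION_RATE,
                 hurdle_rate: float = HURDLE_RATE) -> bool:
    """Return True if the candidate improves on the baseline by at least the hurdle rate."""
    improvement = (candidate_metric - baseline_metric) / baseline_metric
    return improvement >= hurdle_rate

# A model lifting conversion from 4.2% to 4.8% clears a 10% hurdle (~14% relative improvement).
print(beats_hurdle(0.048))  # True
```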

Rather than choosing an arbitrary or, worse, a random accuracy level that may not be achievable because of factors outside the team's control, this step ensures that they can define an acceptable level of performance that translates into a valuable business outcome.

The minimum acceptable accuracy or performance level (the hurdle rate) will vary depending on the use case being addressed. An ML model that blocks transactions based on fraud potential needs very high accuracy, whereas a model that predicts a customer's repeat-buy propensity to help marketers with retargeting can tolerate a lower bar.
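
For illustration only, the agreed minimum can be recorded per use case so the difference in bars is explicit; the metric names and thresholds below are assumptions, not recommendations:

```python
# Hypothetical, per-use-case acceptance criteria: a fraud-blocking model needs a far
# stricter bar than a repeat-buy propensity model used for retargeting.
ACCEPTANCE_CRITERIA = {
    "fraud_transaction_blocking": {"metric": "precision", "minimum": 0.99},
    "repeat_buy_propensity": {"metric": "roc_auc", "minimum": 0.70},
}

def is_acceptable(use_case: str, measured_value: float) -> bool:
    """Check a measured metric against the agreed minimum for that use case."""
    return measured_value >= ACCEPTANCE_CRITERIA[use_case]["minimum"]

print(is_acceptable("repeat_buy_propensity", 0.74))       # True
print(is_acceptable("fraud_transaction_blocking", 0.95))  # False
```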

Without this understanding, the team working on the initiative won't know whether they are moving in the right direction. They may go through extended cycles of performance/accuracy improvement, assuming anything less is unacceptable, when in reality they could have generated immense business value simply by deploying what they already have.

Experience report

At the start of a project to use machine learning for product recommendation, business stakeholders were using vague terminology to define the outcomes of the initiative. They were planning downstream activities that would use the model output on the assumption that the model would accurately predict repurchase behaviour and product recommendations, as if it could magically get it right all the time. They did not account for the probabilistic nature of the model predictions, or for what would need to be done to handle the ambiguities.

During inception, the team took time to explain the challenges of trying to build a model that matched those expectations, especially as we could show them that they had limited data available and that, even where data was available, its quality was questionable.

We then explored and understood their “as-is” process. We worked with them to establish the metrics from that process as the current baseline, and then agreed a good-enough improvement (the hurdle rate) that could create significant business outcomes for the initiative. During these discussions we identified the areas where the predictions were going to create ambiguous downstream data (e.g. although the model could predict with high enough accuracy who would buy again, it could only suggest a basket of products the customer was likely to buy, rather than the one specific product the business users were initially expecting).

As the business understood the constraints (mostly arising from data availability and quality), they were able to design downstream processes that could still use the best available predictions to drive the business outcome.

This iterative process, in which we started with a baseline and agreed an acceptable improvement, ensured that the data team was not stuck chasing unattainable accuracy while building the models. It also allowed the business to design downstream processes that handled the ambiguities without any surprises. As a result, the initiative actually got the models live into production and improved them based on real-world scenarios, rather than getting stuck pursuing long-term hypothetical goals.

Rajesh Thiagarajan

Principal Consultant, Equal Experts, India