User Trust and Engagement

A common pitfall when surfacing a machine learning score or algorithmic insight is that end users don’t understand or trust the new data points, and so ignore the insight no matter how accurate or useful it is.

This usually happens when ML is conducted primarily by data scientists in isolation from users and stakeholders, and can be avoided by:

  • Engage with users from the start - understand what problem they expect the model to solve for them, and use that to frame your initial investigation and analysis.

  • Demo and explain your model results to users as part of your iterative model development - take them on the journey with you.

  • Focus on explainability - this may relate to the model itself: users may want feedback on how it arrived at its decision (e.g. surfacing the values of the most important features used to provide a recommendation), or it may mean guiding your users on how to act on the end result (e.g. talking through how to set a threshold against a credit risk score). A minimal sketch of the first approach appears after this list.

  • Favour concrete, domain-based values over abstract scores or data points - users engage with them more readily, so feed this consideration into your algorithm selection.

  • Give access to model monitoring and metrics (link here) once you are in production - this helps maintain user trust by letting users check in on model health whenever they have concerns.

  • Provide a feedback mechanism - ideally available directly alongside the model result. This lets users confirm good results and flag suspicious ones, and can be a great source of labelling data; a sketch of such a feedback record follows the explainability example below. Knowing that their actions have a direct impact on the model builds trust and a sense of empowerment.
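
The sketch below shows one way to surface feature-level explanations alongside a score. It assumes a linear model, where each feature’s contribution to the log-odds is simply coefficient × value; the feature names and synthetic data are illustrative stand-ins, not taken from a real fraud system.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

# Illustrative feature names and synthetic data - stand-ins, not a real dataset.
FEATURES = ["claim_amount", "account_age_days", "prior_claims", "country_risk"]
rng = np.random.default_rng(0)
X = rng.normal(size=(500, len(FEATURES)))
y = (X[:, 0] + 0.5 * X[:, 3] + rng.normal(scale=0.5, size=500) > 1).astype(int)

model = LogisticRegression().fit(X, y)

def top_contributions(model, x, names, k=3):
    """Return the k features contributing most to this prediction.

    For a linear model, each feature's contribution to the log-odds is
    coefficient * feature value - easy to compute and easy to explain.
    """
    contributions = model.coef_[0] * x
    order = np.argsort(np.abs(contributions))[::-1][:k]
    return [(names[i], float(contributions[i])) for i in order]

# Explain a single prediction rather than showing only a bare score.
x_new = X[0]
proba = model.predict_proba(x_new.reshape(1, -1))[0, 1]
print(f"Fraud likelihood: {proba:.0%}")
for name, c in top_contributions(model, x_new, FEATURES):
    print(f"  {name}: {c:+.2f} to log-odds")
```

For non-linear models, libraries such as SHAP provide analogous per-prediction attributions; either way, the point is that users see "claim_amount pushed this score up" rather than an opaque number.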
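
As a sketch of what a feedback mechanism might capture (the field names and JSON-lines store here are assumptions for illustration), each verdict is recorded against the prediction it refers to, so it can later be joined back to the model inputs as labelling data:

```python
import json
from dataclasses import dataclass, asdict
from datetime import datetime, timezone
from pathlib import Path

# Hypothetical feedback store: one JSON line per user verdict.
FEEDBACK_LOG = Path("feedback.jsonl")

@dataclass
class Feedback:
    prediction_id: str   # key to join back to the scored claim
    user_id: str
    verdict: str         # "confirmed" or "suspicious"
    comment: str = ""
    recorded_at: str = ""

def record_feedback(fb: Feedback) -> None:
    """Append a user's verdict next to the model result it refers to."""
    fb.recorded_at = datetime.now(timezone.utc).isoformat()
    with FEEDBACK_LOG.open("a") as f:
        f.write(json.dumps(asdict(fb)) + "\n")

# e.g. wired to "confirm" / "flag" buttons shown alongside each score:
record_feedback(Feedback(prediction_id="claim-4231", user_id="caseworker-7",
                         verdict="suspicious", comment="bank account reused"))
```

Keyed by prediction_id, these verdicts can be joined to the stored model inputs to produce fresh labelled examples for retraining.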

Experience report

We had a project tasked with using machine learning to find fraudulent repayment claims, which were being investigated manually inside an application used by case workers. The data science team initially understood the problem to be one of helping the case workers know which claims were fraudulent, and in isolation developed a model that surfaced a 0-100 score for the overall likelihood of fraud.

Users seldom engaged with this score - they weren’t clear how it was derived, and they still had to carry out the full investigation themselves to confirm the fraud.

A second iteration instead provided a score on the bank account involved in the repayment, rather than an overall indicator. This saw much higher user engagement because it gave case workers a concrete jumping-off point for investigation and action.

Equal Experts, UK
