Return on Investment for Machine Learning


Instead of asking “How do we get 100% accuracy?”, the right question is “How do we maximize ROI?”

Jul 7 · 7 min read

Photo by Michał Parzuchowski on Unsplash

Machine learning deals with probabilities, which means there’s always a chance for mistakes. This inherent uncertainty makes many decision makers feel uncomfortable with implementing machine learning and traps them in an endless chase for the magical 100% accuracy. The fear of mistakes nearly always pops up when I’m working with companies taking their first steps towards intelligent automation, and I get asked “What happens if the prediction is wrong?”

If this issue is not addressed, the company will very likely spend a hefty amount of resources and years of development time on machine learning without ever getting a return on its investment. In this article, I’ll show you the simple equation I use to relieve these concerns and make decision makers more comfortable with the uncertainty.

When is machine learning worth it

Just like with any investment, the feasibility of machine learning comes down to whether it generates more value than it costs. It’s a normal Return on Investment (ROI) calculation which, in the context of machine learning, weighs the generated value against the cost of mistakes at a given accuracy. So instead of asking “How do we get 100% accuracy?”, the right question is “How do we maximize ROI?”

Determining the expected returns is quite straightforward. I usually open the business case for a machine learning implementation by weighing the benefits against the potential costs in mathematical terms. This can be formalized in an equation which basically says “What’s left of the generated value after the cost of mistakes is accounted for?” Solving this simple equation allows us to estimate the profits for different scenarios.

returns = value - (1 - accuracy) × cost of a mistake

Let’s look at the variables:

  • returns : Generated net value or profit per prediction
  • value : The new value generated by every prediction (e.g. assigning a document to the right category now takes 0.01 seconds instead of 5 minutes, so the value is 5 minutes saved)
  • accuracy : The share of predictions the algorithm gets right (between 0 and 1)
  • cost of a mistake : The additional costs incurred by a wrong prediction (e.g. it takes 20 minutes for someone to change the value which was predicted)
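The relationship between these variables can be sketched in Python. The formula used here is an assumption reconstructed from the verbal description ("what's left of the generated value after the cost of mistakes") and it agrees with the 75% break-even example given further down; the function and parameter names are illustrative only.

```python
def returns_per_prediction(value: float, accuracy: float, cost_of_mistake: float) -> float:
    """Net value per prediction when every prediction is accepted as-is.

    Assumed form: returns = value - (1 - accuracy) * cost_of_mistake
    All quantities must use the same unit (e.g. minutes of work).
    """
    if not 0.0 <= accuracy <= 1.0:
        raise ValueError("accuracy must be between 0 and 1")
    # (1 - accuracy) is the share of predictions that turn out wrong.
    return value - (1.0 - accuracy) * cost_of_mistake


# Using the examples from the list: 5 minutes saved per prediction,
# 20 minutes of extra work to fix a wrong one.
for accuracy in (0.60, 0.75, 0.90):
    print(accuracy, returns_per_prediction(5, accuracy, 20))
```

Running the loop shows the returns going from negative to positive as accuracy rises, which is the scenario comparison the equation is meant to support.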

By rearranging the equation and setting returns to zero, we get the minimum accuracy required to generate net value. This is called the break-even accuracy:

break-even accuracy = 1 - value / cost of a mistake

The equation gets more intuitive when plotted in a graph:

[Figure: returns plotted as a function of accuracy]

So let’s say each prediction the algorithm makes saves you 5 minutes of work, but it takes 20 minutes of extra work to fix a wrong prediction. We can now calculate the break-even accuracy to be 1 - 5/20 = 75%. Any improvement beyond this point brings concrete profits.
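The same arithmetic as a small Python sketch. The function name is hypothetical, and clamping the result at zero is an added assumption to cover the case where a mistake costs less than the value a prediction generates.

```python
def break_even_accuracy(value: float, cost_of_mistake: float) -> float:
    """Minimum accuracy at which returns are zero: 1 - value / cost_of_mistake."""
    if cost_of_mistake <= 0:
        raise ValueError("cost_of_mistake must be positive")
    # Assumption: if fixing a mistake costs less than the value generated,
    # every accuracy level is profitable, so the threshold is clamped at 0.
    return max(0.0, 1.0 - value / cost_of_mistake)


# 5 minutes saved per prediction, 20 minutes to fix a wrong prediction.
print(break_even_accuracy(5, 20))  # 0.75
```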

The above equation assumes that we blindly accept every prediction the algorithm makes and fix the errors afterwards. Sounds risky? We can do much better by extending the equation with confidence scores to lower the risks.

Optimizing ROI

A machine learning algorithm (done right) doesn’t just spew out predictions, it also tells us how confident it is in each one. The majority of mistakes happen when the algorithm is unsure of its answer, which allows us to automate the predictions with the highest confidence and manually review the few with the lowest. While manual review does cost a bit of labor, it’s normally much cheaper than fixing a mistake later on.
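A minimal sketch of this routing idea, assuming each prediction comes with a confidence score between 0 and 1. The function names, the `review_fraction` parameter and the toy data are all made up for illustration; they are not taken from any particular library or from real results.

```python
from typing import List, Tuple


def split_by_confidence(confidences: List[float],
                        review_fraction: float) -> Tuple[List[int], List[int]]:
    """Return (automated_indices, review_indices).

    The `review_fraction` least confident predictions are sent to manual
    review; the rest are handled automatically.
    """
    if not 0.0 <= review_fraction <= 1.0:
        raise ValueError("review_fraction must be between 0 and 1")
    n_review = round(len(confidences) * review_fraction)
    # Indices ordered from least to most confident.
    ranked = sorted(range(len(confidences)), key=lambda i: confidences[i])
    return sorted(ranked[n_review:]), sorted(ranked[:n_review])


def accuracy(correct: List[bool], indices: List[int]) -> float:
    """Share of correct predictions among the given indices."""
    return sum(correct[i] for i in indices) / len(indices)


# Toy data (invented): confidence per prediction and whether it was right.
confidences = [0.99, 0.95, 0.55, 0.90, 0.97, 0.60, 0.92, 0.98, 0.88, 0.96]
correct = [True, True, False, True, True, False, True, True, True, True]

automated, review = split_by_confidence(confidences, review_fraction=0.2)
print("sent to manual review:", review)
print("overall accuracy:", accuracy(correct, list(range(len(correct)))))
print("accuracy of automated part:", accuracy(correct, automated))
```

In this toy data both mistakes fall in the low-confidence group, so the automated part ends up more accurate than the model overall. Real models only behave this way if their confidence scores are reasonably well calibrated.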

Let’s choose a threshold which picks out the 10% least confident predictions for manual review. The remaining 90% will be handled automatically. This ratio is called the confidence split. The accuracy in the high-confidence bracket will now be considerably better, since many of the mistakes are caught in the small low-confidence bracket. This leads us to the extended equation. It says “What’s left of the generated value after the cost of mistakes and manual review are accounted for?”

