Three ways to categorize machine learning platforms

Category: IT Technology · Published: 4 years ago



Photo courtesy of https://unsplash.com/@image_conscious

Machine learning (ML) platforms take many forms and usually solve only one or a few parts of the ML problem space. So how do you make sense of the different platforms that all call themselves ML platforms?

Machine learning platforms cover everything from labeling and visualizing data to training models and monitoring deployed ones. All of these activities, and more, fall under what people call a machine learning platform. As a technology evaluator, you therefore need to understand the core concepts of machine learning well enough to know which kind of platform suits your needs. This blog post introduces three ways of looking at ML platforms.

What are machine learning platforms?

Machine learning (ML) platforms are services or tools that allow you to automate or outsource parts of your data science work. How they do that, however, can differ widely. A library such as MLflow (https://mlflow.org/) is a platform in the same way that an analytics product such as H2O (https://www.h2o.ai/) is a platform. After a few minutes of reading about the two products, you will realize that they solve completely different problems with completely different approaches. How, then, can a layperson compare these two platforms?

First, let’s define the standard problems in machine learning. At a high level, machine learning can be divided into three parts: data management, model development, and prediction serving.

Let’s look at each one of these and see what they entail.

Data management

Data, for its part, can be divided into tabular data (e.g. databases with customer information) and unstructured data (e.g. images of a product in different scenarios). Data management in machine learning covers collecting, preprocessing (ETL), labeling, annotating, and exploring data. Each of these tasks requires different tools.
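To make the preprocessing step more concrete, here is a minimal sketch using only the Python standard library. The CSV content, the column names, and the `preprocess` function are all made up for illustration; a real pipeline would likely read from a database or a data lake and handle far more cases.

```python
import csv
import io
from typing import Dict, List

# Made-up customer export with inconsistent formatting and a missing value.
RAW_CSV = """customer_id,country,monthly_spend
1, fi ,120.5
2,US,
3,us,80
"""


def preprocess(raw_csv: str) -> List[Dict[str, object]]:
    """Parse the export, normalize values, and drop rows that lack a spend figure."""
    rows = []
    for record in csv.DictReader(io.StringIO(raw_csv)):
        spend = record["monthly_spend"].strip()
        if not spend:
            continue  # skip incomplete rows; imputing a value is another option
        rows.append({
            "customer_id": int(record["customer_id"]),
            "country": record["country"].strip().upper(),
            "monthly_spend": float(spend),
        })
    return rows


clean_rows = preprocess(RAW_CSV)
```

Dropping incomplete rows is only one possible policy. Whether to drop, impute, or flag them is exactly the kind of decision a data management tool may or may not make for you.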

Model development

Model development, for its part, can be divided into feature extraction and training. Training also differs depending on whether you are building AutoML solutions, using pre-existing models, building traditional machine learning models (like decision trees), or training deep learning models on large unstructured data. Each case has different needs for infrastructure, frameworks, and collaboration tools.
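The split between feature extraction and training can be sketched in a few lines of plain Python. The example below trains a decision stump (a decision tree with a single split) on toy data. The features, the data, and every function name are assumptions chosen for illustration, not something any particular platform prescribes.

```python
from typing import List, Tuple


def extract_features(text: str) -> List[float]:
    """Feature extraction: turn a raw record into numbers (hypothetical features)."""
    words = text.split()
    return [float(len(words)), float(sum(w.isupper() for w in words))]


def train_stump(X: List[List[float]], y: List[int]) -> Tuple[int, float]:
    """Training: pick the (feature index, threshold) split with the fewest errors.

    The learned rule predicts 1 when the feature value exceeds the threshold.
    """
    best = (0, 0.0)
    best_errors = len(y) + 1
    for j in range(len(X[0])):
        for threshold in sorted({row[j] for row in X}):
            errors = sum(
                (row[j] > threshold) != bool(label) for row, label in zip(X, y)
            )
            if errors < best_errors:
                best, best_errors = (j, threshold), errors
    return best


def predict(stump: Tuple[int, float], features: List[float]) -> int:
    """Apply the learned split to one feature vector."""
    feature_index, threshold = stump
    return int(features[feature_index] > threshold)


# Toy, made-up data: label 1 marks messages written in capitals.
raw = ["hello there", "BUY NOW", "see you soon", "FREE PRIZE TODAY"]
labels = [0, 1, 0, 1]
X = [extract_features(text) for text in raw]
stump = train_stump(X, labels)
```

A model this small trains instantly on a laptop. Deep learning on large unstructured data needs very different infrastructure, which is why the training approach shapes the platform requirements.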

Prediction serving

Prediction serving has two main deployment options: the model either ships as part of a software application or sits behind an external endpoint that other services call. Models can also be served for batch inference (when the need for predictions is sporadic) or live inference (when the need is constant). Other issues to consider in prediction serving include A/B testing of models, canary serving of new models, rollback of models, and model staleness.
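The serving concepts above can be illustrated with a small, self-contained sketch. It assumes two stand-in models and a hypothetical `canary_percent` setting that sends a fixed share of live requests to the new model. Real serving systems usually handle this at the load balancer or gateway level, so treat it as a toy illustration rather than a reference design.

```python
import hashlib
from typing import Callable, Dict, Iterable, List

Model = Callable[[float], float]  # hypothetical interface: one feature in, one score out


def stable_model(x: float) -> float:
    """Stand-in for the currently deployed model."""
    return 2.0 * x


def candidate_model(x: float) -> float:
    """Stand-in for a newly trained model under evaluation."""
    return 2.0 * x + 1.0


def bucket(request_id: str) -> int:
    """Map a request id to a fixed bucket in [0, 100) so routing is repeatable."""
    digest = hashlib.sha256(request_id.encode("utf-8")).hexdigest()
    return int(digest, 16) % 100


def route(request_id: str, canary_percent: int) -> str:
    """Return which model variant should serve this request."""
    return "candidate" if bucket(request_id) < canary_percent else "stable"


def predict_live(request_id: str, x: float, canary_percent: int) -> Dict[str, object]:
    """Live inference: answer a single request immediately."""
    variant = route(request_id, canary_percent)
    model = candidate_model if variant == "candidate" else stable_model
    return {"request_id": request_id, "variant": variant, "prediction": model(x)}


def predict_batch(xs: Iterable[float], model: Model = stable_model) -> List[float]:
    """Batch inference: score a whole collection in one pass."""
    return [model(x) for x in xs]
```

In this sketch, setting `canary_percent` to 0 acts as a rollback, because every request returns to the stable model. Recording the `variant` field with each prediction is what makes a later A/B comparison possible.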

Summary

As the three main categories of the machine learning pipeline show, the needs, and the tools that support them, vary greatly. The team’s background adds further dimensions to the toolchain. Data scientists with a background in software engineering tend to value tools that let them develop models in an IDE, whereas people who recently studied analytics and data science are often more accustomed to interactive web interfaces such as Jupyter notebooks. Your company’s existing infrastructure, whether that is a particular cloud provider or on-premises GPU clusters, also affects which tools you can use.

It’s thus fair to assume that the best tool for you is different from that of your competitors.

