Why we deploy machine learning models with Go — not Python
There’s more to production machine learning than Python scripts
Apr 14 · 5 min read
At this point, it should be a surprise to no one that Python is the most popular language for machine learning. While ML frameworks use languages like CUDA C/C++ for actual computation, they all offer a Python interface. As a result, most ML practitioners work in Python.
So, naturally, the codebase for Cortex — our ML infrastructure — is 88.3% Go.
Deploying models at scale is different than writing Python scripts that call PyTorch and TensorFlow functions. To actually run a production machine learning API at scale, we need our infrastructure to do things like:
- Autoscaling, so that traffic fluctuations don’t break our APIs (and our AWS bill stays manageable).
- API management, to handle multiple deployments.
- Rolling updates, so that we can update models while still serving requests.
We built Cortex to provide this functionality, and we decided to write it in Go for a few reasons:
1. The infrastructure community has already embraced Go
We are software engineers, not data scientists, by background. We got into ML because we wanted to build features like Gmail’s Smart Compose, not because we were fascinated by backpropagation (although it is admittedly cool). We wanted a simple tool that would take a trained model and automatically implement all the infra features needed to deploy it as an API: reproducible deployments, scalable request handling, automated monitoring, and so on.
And while that all-in-one model-to-microservice platform didn’t exist yet, we’d implemented each of those features in normal software before. We knew what tools were right for the job—and what language they were written in.
There’s a reason the teams that build tools like Kubernetes, Docker, and Terraform use Go. It’s fast. It handles concurrency well. It compiles down to a single binary. In this way, choosing Go was relatively low-risk for us. Other teams had already used it to solve similar challenges.
Additionally, being written in Go makes contributing easier for infrastructure engineers, who are likely already familiar with the language.
2. Go solves our problems related to concurrency and scheduling
Managing a deployment requires many services to be running both concurrently and on a precise schedule. Thankfully, Goroutines, channels, and Go’s built-in timers and tickers provide an elegant solution for concurrency and scheduling.
At a high level, a Goroutine is an otherwise normal function that Go runs concurrently by executing on a virtual, independent thread. Multiple Goroutines can fit on a single OS thread. Channels allow Goroutines to share data, while timers and tickers allow us to schedule Goroutines.
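To make that concrete, here is a minimal sketch (not Cortex’s actual code) that combines all three: a ticker schedules the work, each check runs on its own goroutine, and a channel carries the result back. The `check` function is a hypothetical placeholder.

```go
package main

import (
	"fmt"
	"time"
)

// check is a hypothetical health probe; in real infrastructure it
// would hit an API endpoint or read a metric.
func check(api string, results chan<- string) {
	results <- fmt.Sprintf("%s: ok", api)
}

func main() {
	results := make(chan string)
	ticker := time.NewTicker(1 * time.Second)
	defer ticker.Stop()

	for i := 0; i < 3; i++ {
		<-ticker.C                  // the ticker fires once per second
		go check("my-api", results) // each probe runs on its own goroutine
		fmt.Println(<-results)      // the channel hands the result back
	}
}
```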
We use Goroutines to implement concurrency when needed — like when Cortex needs to upload multiple files to S3 and running them in parallel will save time — or to keep a potentially long-running function, like streaming logs from CloudWatch, from blocking the main thread.
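A simplified sketch of the parallel-upload pattern looks like this; `uploadToS3` is a hypothetical stand-in for a real call into the AWS SDK:

```go
package main

import (
	"fmt"
	"sync"
)

// uploadToS3 is a hypothetical stand-in for a real S3 upload
// (e.g. via the AWS SDK's s3manager.Uploader).
func uploadToS3(key string) error {
	fmt.Println("uploaded", key)
	return nil
}

func main() {
	keys := []string{"model.onnx", "config.yaml", "requirements.txt"}

	var wg sync.WaitGroup
	for _, key := range keys {
		wg.Add(1)
		go func(k string) { // one goroutine per file
			defer wg.Done()
			if err := uploadToS3(k); err != nil {
				fmt.Println("upload failed:", err)
			}
		}(key)
	}
	wg.Wait() // block until every upload finishes
}
```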
Additionally, we use timers and tickers within Goroutines to run Cortex’s autoscaler. I’ve written up a full report on how we implement replica-level autoscaling in Cortex, but the short version is that Cortex counts the number of queued and inflight requests, calculates how many concurrent requests each replica should be handling, and scales appropriately.
To do this, Cortex’s monitoring functions need to execute at consistent intervals. Go’s scheduler makes sure monitoring happens when it is supposed to, and Goroutines allow each monitoring function to execute concurrently and independently for each API.
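Here is a rough sketch of what such a per-API monitoring loop could look like; `queuedAndInflight` and `scaleTo` are hypothetical placeholders, and Cortex’s real autoscaler is more involved:

```go
package main

import "time"

// queuedAndInflight and scaleTo are hypothetical placeholders for
// reading request counts and resizing a deployment.
func queuedAndInflight(api string) int          { return 12 }
func scaleTo(api string, replicas int)          {}

// monitor runs in its own goroutine for each API, so every API is
// monitored concurrently and independently.
func monitor(api string, targetPerReplica int, interval time.Duration) {
	ticker := time.NewTicker(interval)
	defer ticker.Stop()
	for range ticker.C { // fires at a consistent interval
		total := queuedAndInflight(api)
		// ceiling division: enough replicas to cover the current load
		replicas := (total + targetPerReplica - 1) / targetPerReplica
		scaleTo(api, replicas)
	}
}

func main() {
	go monitor("text-generator", 4, 10*time.Second)
	go monitor("image-classifier", 4, 10*time.Second)
	select {} // keep the daemon alive
}
```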
Implementing all of this functionality in Python may be doable with tools like asyncio, but the fact that Go makes it so easy is a boon for us.
3. Building a cross-platform CLI is easier in Go
Our CLI deploys models and manages APIs.
We want the CLI to work on both Linux and Mac. Originally, we tried writing it in Python, but users consistently had trouble getting it to work in different environments. When we rebuilt the CLI in Go, we were able to compile it down to a single binary, which could be distributed across platforms without much extra engineering effort on our part.
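For example, cross-compiling any Go program is just a matter of setting two environment variables at build time. Here is an illustrative sketch (not our exact release process):

```go
// main.go: a minimal cross-platform CLI entry point.
// The same source compiles to a standalone binary for each target, e.g.:
//
//   GOOS=darwin GOARCH=amd64 go build -o cortex-darwin .
//   GOOS=linux  GOARCH=amd64 go build -o cortex-linux  .
//
// (illustrative commands, not our actual release pipeline)
package main

import (
	"fmt"
	"os"
)

func main() {
	if len(os.Args) < 2 {
		fmt.Println("usage: cortex <command>")
		os.Exit(1)
	}
	fmt.Println("running command:", os.Args[1])
}
```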
The performance benefits of a compiled Go binary versus an interpreted language are also significant. According to the Computer Language Benchmarks Game, Go is dramatically faster than Python.
It’s not coincidental that many other infrastructure CLIs — eksctl, kops, and the Helm client, to name a few — are written in Go.
4. Go lends itself to reliable infrastructure
As a final point, Go helps with Cortex’s most important feature: reliability.
Reliability is obviously important in all software, but it is absolutely critical with inference infrastructure. A bug in Cortex could seriously run up the inference bill.
While we apply a thorough testing process to every release, Go’s static typing and compilation step provide an initial defense against errors. If there’s a serious bug, there’s a good chance it will get caught during compilation. With a small team, this is very helpful.
Go’s unforgiving nature may make it a bit more painful to get started with than Python, but these internal guardrails act as a sort of first line of defense for us, helping us avoid silly type errors.
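As a contrived illustration (not real Cortex code), Go won’t even compile a function that silently mixes ints and floats:

```go
package main

import "fmt"

func concurrencyPerReplica(inflight int, replicas int) float64 {
	// return inflight / replicas  // compile error: cannot use int as float64
	return float64(inflight) / float64(replicas) // explicit conversion required
}

func main() {
	fmt.Println(concurrencyPerReplica(12, 4)) // 3
}
```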
Python for scripting, Go for infrastructure
We still love Python, and it has its place within Cortex, specifically around model inference.
Cortex supports Python for model serving scripts. We write Python that loads models into memory, conducts pre/post inference processing, and serves requests. However, even that Python code is packaged up into Docker containers, which are orchestrated by code that is written in Go.
Python will (and should) remain the most popular language for data science and machine learning engineering. However, when it comes to machine learning infrastructure, we’re happy with Go.
Are you an engineer interested in Go and machine learning? If so, consider contributing to Cortex!