Some insight into how we’re building our technology.
Mar 23 · 4 min read
Here’s some transparency into the technical decisions that have driven our technology, along with some insight into the things our stack has made easy and the things our tech choices have made hard.
Hopefully, this gives you an idea of some of the relevant technologies in our field. From talking to fellow startup founders, this stack is pretty similar across a lot of other machine-learning-focused data teams, with some variations by industry and personal circumstance.
Overview of our stack:
Spawner API:
- Languages: Python, C++, SQL
- AI/ML: TensorFlow (our toolkit of choice for DL), Scikit-Learn (our go-to for most non-DL tasks)
- Other Libraries: Pandas, Numpy, fbprophet, NLTK, scipy, ffn, pyodbc, APScheduler
- Database: SQL Server (considering a migration to PostgreSQL)
- Warehouse: N/A
- ETL: Python, Airflow
- Visualizations: Streamlit, Plotly (visualizing app performance), Altair (viz and dashboarding for new ideas), Tableau (internal business intelligence)
- Hosting: Azure (core), Heroku (side projects & demos)
- Tracking & SC: GitHub, Notion (keeping engineering, PMs and marketing synced up)
Spawner Portal:
- Languages & Frameworks: (FE) React + Next.js, (BE) Python
- Database: SQL Server
- Hosting: Azure
Languages
We use Python for basically everything. When a component isn’t serving efficiently, or wasn’t built well enough to keep up, we think about converting it to C++, with the Python version serving merely as a reference implementation. We use Python most heavily for our modeling and ETL. We’re very much a data company, so of course there’s SQL everywhere.
AI/ML
We like TensorFlow for its great documentation and the large number of devs who are familiar with it. That said, PyTorch is starting to make some real headway, especially with all the great work Facebook Research has done recently. For now, TensorFlow makes up the majority of our stack, but I see nothing wrong with TF and PyTorch mingling in the future.
Scikit-Learn pops up all over the place. Its ease of use is undeniable, and it’s in production at companies all over the industry. It’s really the bread and butter of the non-deep-learning work we do on the ML side.
Frontends & Frameworks
Quite frankly, I’m not a frontend dev, so I won’t waste any of your time on this section. Our first hire liked Vue.js, so we went that direction originally. He thought React/Next.js made more sense for another part of the codebase, so that’s how that happened. We’re incredibly pleased with the work our devs have done. We love Next.js for its SEO friendliness.
Database
Our stack lives on Azure, so SQL Server seemed to make the most sense up front. From a cost and ease of use perspective, the two are obviously tightly integrated. Other than the social pressure of “you’re not using MySQL or PostgreSQL???” it’s doing everything we need, for now. We’re eyeing a potential move to PostgreSQL.
Data Warehouse
We’re slowly moving into Databricks as the volume of data grows. I’m a huge Databricks fan, also a big Snowflake fan. The two partnering up is awesome. Databricks is spectacular and I expect to bring Databricks fully online very soon.
Extract, Transform, Load (ETL)
Until recently, I was using Python, SQL, and SQLAlchemy almost exclusively for ETL tasks. Someone I know forced me to check out Airflow, and all of a sudden it’s becoming part of our stack. Scheduling workflows feels a little more natural in Airflow than stringing together cron jobs.
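To make the "workflow instead of cron jobs" idea concrete, here is a minimal sketch in plain Python. It is not our production code and it is not the Airflow API. The table name `daily_metric`, the sample rows, and the task functions are all hypothetical, `sqlite3` stands in for SQL Server, and the standard library’s `graphlib` stands in for the dependency ordering that an Airflow DAG would give you.

```python
import sqlite3
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Hypothetical raw rows; in practice these would come from a source system.
RAW_ROWS = [("2021-03-01", " 10.5 "), ("2021-03-02", "n/a"), ("2021-03-03", "12")]


def extract(state):
    # Extract: pull the raw rows into the shared pipeline state.
    state["raw"] = list(RAW_ROWS)


def transform(state):
    # Transform: keep only rows whose value parses as a float.
    clean = []
    for day, value in state["raw"]:
        try:
            clean.append((day, float(value.strip())))
        except ValueError:
            continue
    state["clean"] = clean


def load(state):
    # Load: upsert the cleaned rows into the target table.
    conn = state["conn"]
    conn.execute(
        "CREATE TABLE IF NOT EXISTS daily_metric (day TEXT PRIMARY KEY, value REAL)"
    )
    conn.executemany("INSERT OR REPLACE INTO daily_metric VALUES (?, ?)", state["clean"])
    conn.commit()


TASKS = {"extract": extract, "transform": transform, "load": load}

# Each task maps to the set of tasks that must finish before it runs.
DEPENDENCIES = {"extract": set(), "transform": {"extract"}, "load": {"transform"}}


def run_pipeline(conn):
    # Resolve the dependency graph into a run order, then run each task once.
    state = {"conn": conn}
    order = list(TopologicalSorter(DEPENDENCIES).static_order())
    for name in order:
        TASKS[name](state)
    return order


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    print(run_pipeline(conn))
    print(conn.execute("SELECT day, value FROM daily_metric ORDER BY day").fetchall())
```

The point of the sketch is that the run order falls out of declared dependencies rather than out of carefully staggered cron timings. A scheduler like Airflow layers retries, scheduling, and monitoring on top of that same idea.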
Tracking & Source Control
We use GitHub. Shocker.
We use Notion, and I find myself using it for more than just project management and tracking. I use it for personal accountability and, really, as a technical diary. I’m able to keep track of what I do on a daily basis, make sure I’m allocating time efficiently, and track what everyone on the team is up to and where I can help out or make someone’s life easier.
Visualizations
We love Streamlit; it helps us demo models and API endpoints in very little time. I dig Plotly and Altair at the moment for their ease of use on non-public projects. Plotly gives us a bunch of flexibility and features without much effort. Altair gives us more in-depth features and customization for extra effort.
For business metrics and keeping track of revenue, churn, GA, etc., we love throwing stuff into Tableau. It’s an easy way for our non-technical folks to dig straight into the analytics.
You can check out our product here.
Let’s continue the conversation on Twitter!