Medium Articles that Made Me a Better Data Scientist in July

Improving my skillset and knowledge about contemporary machine learning

Alexandra Amidon

Aug 2 · 4 min read

Medium Articles that Made Me a Better Data Scientist in July — Photo by Alisa Anton on Unsplash

I am a data scientist and avid reader (and writer) of articles about data science and machine learning. It takes time to read journal articles, listen to podcasts, analyze interesting data, and play with new machine learning packages. Blog posts offer much of this knowledge in a condensed form.

When browsing, I look for articles that will teach me something new and applicable. Where is the field of machine learning heading? What is the latest research? How can I do my job as a data scientist better? How can I understand machine learning algorithms better?

My favorite articles are those that help me understand machine learning algorithms, especially algorithms that weren’t widely used (or didn’t exist) when I completed my MS in data science in 2017. The most useful articles include code examples, because re-implementing an algorithm from scratch is time-consuming.

All this said, there are a lot of data science articles out there! Which ones are worth reading?

In this post, I share some articles that I read and found useful in July. These helped grow my knowledge about data science and machine learning. I hope you find them just as valuable.

The Severe Limitations of Supervised Learning Are Piling Up

Is research turning in a different direction?

towardsdatascience.com

What is the future of machine learning research?

Supervised learning algorithms have brought a lot of value to businesses. However, the marginal value of small improvements in a supervised learning algorithm is decreasing. Why? The algorithms are already quite good, and they need many labeled examples to produce good results. Labels are expensive to acquire, and many datasets have no explicit “labels” at all!

Supervised learning is in the process of realizing another limitation: at its best, it only does exactly what we want it to do.
Supervised learning can only interpolate. Reinforcement learning and similar evolutionary algorithms have the potential to extrapolate.

The future of research is likely in unsupervised, semi-supervised, and reinforcement learning!

Why I Liked This Article

This article speaks to the future of machine learning research and the sort of breakthroughs I might expect to see.

NGBoost algorithm: solving probabilistic prediction problems

Predict a distribution of the target variable, not just point estimate

towardsdatascience.com

NGBoost is a “natural gradient” boosting algorithm that can predict the distribution of a target variable, not just a point estimate. This is important because often the uncertainty of a model, or range of probable values, is just as important as the exact predicted value.

How is this done? Per the paper, “NGBoost generalizes gradient boosting to probabilistic regression by treating the parameters of the conditional distribution as targets for a multiparameter boosting algorithm.”
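
To make the "natural gradient" part more concrete, here is a minimal sketch of my own (it is not code from the article or from the NGBoost package). It fits the two parameters of a Normal distribution, the mean and the log of the standard deviation, by natural-gradient descent on the negative log-likelihood. It deliberately leaves out the boosting step: as I understand the paper, NGBoost would fit base learners to these per-example natural gradients instead of averaging them. The function name, learning rate, and toy data are all assumptions made for illustration.

```python
import math

def fit_normal_natural_gradient(ys, lr=0.1, steps=500):
    """Fit a Normal's mean and std by natural-gradient descent on the NLL."""
    mu, log_sigma = 0.0, 0.0  # start from N(0, 1)
    n = len(ys)
    for _ in range(steps):
        var = math.exp(2.0 * log_sigma)
        # Ordinary NLL gradients are -(y - mu) / var and 1 - (y - mu)^2 / var.
        # For the (mu, log sigma) parametrization the Fisher information is
        # diag(1 / var, 2); multiplying by its inverse gives the natural gradients.
        g_mu = sum(-(y - mu) for y in ys) / n
        g_log_sigma = sum(0.5 * (1.0 - (y - mu) ** 2 / var) for y in ys) / n
        mu -= lr * g_mu
        log_sigma -= lr * g_log_sigma
    return mu, math.exp(log_sigma)

ys = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]  # toy target values
mu, sigma = fit_normal_natural_gradient(ys)

# The result is a distribution, so we can report a range and not only a point.
low, high = mu - 1.96 * sigma, mu + 1.96 * sigma
print(f"point estimate {mu:.2f}, approximate 95% interval [{low:.2f}, {high:.2f}]")
```

The fitted values should approach the sample mean and the (population) standard deviation of the toy data, which is what maximum likelihood gives for a Normal.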

Why I Liked This Article

This article nicely explains a technical paper about the new NGBoost algorithm. As icing on the cake, the article also shares a Python package that makes it easy to apply NGBoost in practice. (Paper + open source code = gold)

GPT-3, a Giant Step for Deep Learning And NLP

A few days ago, OpenAI announced a new successor to their Language Model (LM) - GPT-3. This is the largest model trained…

link.medium.com

Many articles (and tweets) have been written demonstrating the impressive capabilities of GPT-3. But how does it work? Fortunately, this article breaks down and explains the key points of the 72-page GPT-3 paper.

Why I Liked This Article

GPT-3 is a powerful model with many applications. I personally hope to incorporate it into a project sometime! As a practitioner, there is a lot of value in understanding how language models work, especially when choosing how to proceed in an NLP project. But the paper itself is a bit long to read for general understanding; this blog post highlights the key points that I need to know.

SHAP explained the way I wish someone explained it to me

Making sense of the formula used for computing SHAP values

towardsdatascience.com

Explainable ML is an important new area in machine learning. SHAP is a popular method that highlights how a black-box model uses data to make predictions.

The explainer of a black-box model should not itself be a black-box.

This article provides a visual and intuitive explanation of the SHAP algorithm.
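
The formula behind SHAP values is the Shapley value from game theory: a feature's contribution is its average marginal effect on the prediction over all subsets of the other features. Below is a small sketch of my own that computes it exactly for a toy model. The `model` function, the input `x`, and the all-zeros `baseline` are hypothetical, and "removing" a feature by swapping in a baseline value is a simplification I chose for brevity; it is not how the article or the SHAP library necessarily handles missing features.

```python
from itertools import combinations
from math import factorial

def model(x):
    # Hypothetical black-box model with an interaction between features 0 and 2.
    return 10 + 2 * x[0] + 3 * x[1] + x[0] * x[2]

def shapley_values(f, x, baseline):
    """Exact Shapley values of f at x; absent features take their baseline value."""
    n = len(x)

    def value(coalition):
        # Prediction when only the features in `coalition` are "present".
        masked = [x[j] if j in coalition else baseline[j] for j in range(n)]
        return f(masked)

    phis = []
    for i in range(n):
        others = [j for j in range(n) if j != i]
        phi = 0.0
        for size in range(n):  # coalition sizes 0 .. n-1
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                s = set(subset)
                # Marginal contribution of feature i when added to coalition s.
                phi += weight * (value(s | {i}) - value(s))
        phis.append(phi)
    return phis

x = [1, 2, 4]
baseline = [0, 0, 0]
phi = shapley_values(model, x, baseline)
print(phi)
```

The contributions add up to the difference between the prediction at `x` and the prediction at the baseline, and the interaction term is split evenly between the two features involved. The loop visits every subset, so this exact version is only practical for a handful of features.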

Why I Liked This Article

I have worked with model explainability methods like SHAP before, but admittedly my knowledge of the algorithm was rudimentary. After this blog post, I have a much stronger understanding and can explain how it works.

Deep Learning for Anomaly Detection: A Comprehensive Survey

This post summarizes a comprehensive survey paper on deep learning for anomaly detection — “ Deep Learning for Anomaly…

link.medium.com

This article summarizes a survey paper on deep learning for anomaly detection.

The paper/article describes the key challenges of the anomaly detection task:

The difficulty of achieving a high anomaly detection recall rate
Anomaly detection in high-dimensional and/or non-independent data
Data-efficient learning of normality/abnormality
Noise-resilient anomaly detection
Detection of complex or multidimensional anomalies
Anomaly explanation

There are three general ways to use deep learning for anomaly detection:

Deep learning for feature extraction
Learning feature representations of normality
End-to-end anomaly score learning
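
To give a feel for the second approach, learning a representation of normality, here is a rough sketch of my own. It is not from the article or the survey paper, and it is not deep learning: it swaps in a one-component PCA projection as a linear stand-in for an autoencoder. The idea it illustrates is the same, though. Fit a compressed representation on normal data only, then use reconstruction error as the anomaly score. The toy data and function names are assumptions.

```python
import math

def fit_principal_direction(points):
    """Return the mean and the unit principal direction of 2-D points."""
    n = len(points)
    mx = sum(p[0] for p in points) / n
    my = sum(p[1] for p in points) / n
    cxx = sum((p[0] - mx) ** 2 for p in points) / n
    cyy = sum((p[1] - my) ** 2 for p in points) / n
    cxy = sum((p[0] - mx) * (p[1] - my) for p in points) / n
    # Angle of the leading eigenvector of the 2x2 covariance matrix.
    theta = 0.5 * math.atan2(2.0 * cxy, cxx - cyy)
    return (mx, my), (math.cos(theta), math.sin(theta))

def anomaly_score(point, mean, direction):
    """Reconstruction error after encoding to 1-D and decoding back to 2-D."""
    dx, dy = point[0] - mean[0], point[1] - mean[1]
    code = dx * direction[0] + dy * direction[1]       # encode
    rx, ry = code * direction[0], code * direction[1]  # decode
    return math.hypot(dx - rx, dy - ry)

# "Normal" data: points close to the line y = 2x, with a small deterministic wobble.
normal = [(t / 10.0, 2.0 * t / 10.0 + (0.05 if t % 2 else -0.05))
          for t in range(-20, 21)]
mean, direction = fit_principal_direction(normal)

scores = [anomaly_score(p, mean, direction) for p in normal]
outlier_score = anomaly_score((1.0, -2.0), mean, direction)
print(max(scores), outlier_score)
```

Points that follow the learned pattern reconstruct almost perfectly, while the point far from the line gets a much larger score. A deep autoencoder would replace the linear projection with a learned nonlinear encoder and decoder.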

Why I Liked This Article

My reasons for sharing this article are simple: it is well-written and I am quite interested in this topic.
