Medium Articles that Made Me a Better Data Scientist in July

Category: IT Technology · Published: 4 years ago

Improving my skillset and knowledge about contemporary machine learning

Photo by Alisa Anton on Unsplash

I am a data scientist and avid reader (and writer) of articles about data science and machine learning. It takes time to read journal articles, listen to podcasts, analyze interesting data, and play with new machine learning packages. Blog posts offer much of this knowledge in a condensed form.

When browsing, I look for articles that will teach me something new and applicable. Where is the field of machine learning heading? What is the latest research? How can I do my job as a data scientist better? How can I understand machine learning algorithms better?

My favorite articles are those that help me understand machine learning algorithms — especially algorithms that weren’t widely used (or didn’t exist) when I completed my MS in data science in 2017. The most useful articles are those that include code examples because recoding algorithms is time-consuming.

All this said, there are a lot of data science articles out there! Which to read?

In this post, I share some articles that I read and found useful in July. These helped grow my knowledge about data science and machine learning. I hope you find them just as valuable.

The Severe Limitations of Supervised Learning Are Piling Up

What is the future of machine learning research?

Supervised learning algorithms have brought a lot of value to businesses. However, the marginal value of small improvements in a supervised learning algorithm is decreasing. Why? The algorithms themselves are already pretty good and require many labels to provide good results. Labels are expensive to acquire and there are many datasets with no explicit “labels”!

Supervised learning is in the process of realizing another limitation: at its best, it only does exactly what we want it to do.
Supervised learning can only interpolate. Reinforcement learning and similar evolutionary algorithms have the potential to extrapolate.

The future of research is likely in unsupervised, semi-supervised, and reinforcement learning!

Why I Liked This Article

This article speaks to the future of machine learning research and the kinds of breakthroughs I might expect to see.

NGBoost algorithm: solving probabilistic prediction problems

NGBoost is a “natural gradient” boosting algorithm that can predict the distribution of a target variable, not just a point estimate. This is important because often the uncertainty of a model, or range of probable values, is just as important as the exact predicted value.

How is this done? Per the paper, “NGBoost generalizes gradient boosting to probabilistic regression by treating the parameters of the conditional distribution as targets for a multiparameter boosting algorithm.”
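To make the "natural gradient" part concrete, here is a minimal sketch. It is not the NGBoost implementation: it drops the boosting and the input features entirely and only fits the two parameters of a Normal distribution by natural-gradient descent on the negative log-likelihood. The function name, learning rate, and step count are assumptions chosen for illustration.

```python
import math


def fit_normal_natural_gradient(ys, steps=500, lr=0.1):
    """Fit (mu, sigma) of a Normal distribution to ys by natural-gradient descent.

    Illustrative assumption: parameters are (mu, log sigma), with no features
    and no base learners, unlike the real NGBoost algorithm.
    """
    mu, log_sigma = 0.0, 0.0
    n = len(ys)
    for _ in range(steps):
        sigma2 = math.exp(2.0 * log_sigma)
        # Ordinary gradients of the mean negative log-likelihood.
        grad_mu = sum(-(y - mu) / sigma2 for y in ys) / n
        grad_log_sigma = sum(1.0 - (y - mu) ** 2 / sigma2 for y in ys) / n
        # Natural gradient = inverse Fisher information times the gradient.
        # For (mu, log sigma) the Fisher matrix is diag(1 / sigma^2, 2).
        mu -= lr * sigma2 * grad_mu
        log_sigma -= lr * grad_log_sigma / 2.0
    return mu, math.exp(log_sigma)


targets = [2.0, 4.0, 4.0, 5.0, 7.0, 8.0]
mu, sigma = fit_normal_natural_gradient(targets)
# The result is a full distribution, so an interval comes for free.
print(f"mean={mu:.3f}, std={sigma:.3f}")
print(f"approx. 95% interval: [{mu - 1.96 * sigma:.2f}, {mu + 1.96 * sigma:.2f}]")
```

The point of the sketch is the output type: instead of a single number, the fit returns a mean and a spread, which is what lets a probabilistic model report a range of probable values.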

Why I Liked This Article

This article nicely explains a technical paper about the new NGBoost algorithm. As icing on the cake, the article also shares a Python package that makes it easy to apply NGBoost in practice. (Paper + open source code = gold)

GPT-3, a Giant Step for Deep Learning And NLP

Many articles (and tweets) have been written demonstrating the impressive capabilities of GPT-3. But how does it work? Fortunately, this article breaks down and explains the key points of the 72-page GPT-3 paper.

Why I Liked This Article

GPT-3 is a powerful model with many applications. I personally hope to incorporate it into a project sometime! As a practitioner, there is a lot of value in understanding how language models work, especially when choosing how to proceed in an NLP project. But the paper itself is a bit long to read for general understanding; this blog post highlights the key points that I need to know.

SHAP explained the way I wish someone explained it to me

Explainable ML is an important new area in machine learning. SHAP is a popular method that highlights how a black-box model uses data to make predictions.

The explainer of a black-box model should not itself be a black-box.

This article provides a visual and intuitive explanation of the SHAP algorithm.
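As a rough illustration of the idea underneath SHAP, the sketch below computes exact Shapley values by brute force: each feature's contribution is its average marginal effect on the prediction over all coalitions of the other features. The `shapley_values` helper, the toy model, and the all-zeros baseline are hypothetical and chosen for this example; the SHAP package relies on faster approximations rather than this enumeration.

```python
from itertools import combinations
from math import factorial


def shapley_values(model, x, baseline):
    """Exact Shapley values for one prediction (exponential cost, toy sizes only).

    Features outside a coalition are replaced by their baseline value,
    which is an assumption of this sketch.
    """
    n = len(x)

    def value(coalition):
        mixed = [x[i] if i in coalition else baseline[i] for i in range(n)]
        return model(mixed)

    phi = [0.0] * n
    for i in range(n):
        others = [j for j in range(n) if j != i]
        for size in range(n):
            # Shapley weight for a coalition of this size.
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                members = set(subset)
                phi[i] += weight * (value(members | {i}) - value(members))
    return phi


def toy_model(v):
    # A stand-in "black box" with an interaction between the two features.
    return 2 * v[0] + 3 * v[1] + v[0] * v[1]


contributions = shapley_values(toy_model, x=[1.0, 2.0], baseline=[0.0, 0.0])
print(contributions)
# The contributions add up to prediction minus baseline prediction.
print(sum(contributions), toy_model([1.0, 2.0]) - toy_model([0.0, 0.0]))
```

Note how the interaction term is split evenly between the two features, and how the contributions sum to the gap between the prediction and the baseline prediction.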

Why I Liked This Article

I have worked with model explainability methods like SHAP before, but admittedly my knowledge of the algorithm was rudimentary. After this blog post, I have a much stronger understanding and can explain how it works.

Deep Learning for Anomaly Detection: A Comprehensive Survey

This article summarizes a survey paper on deep learning for anomaly detection.

The paper/article describes the key challenges of the anomaly detection task:

  1. The difficulty of achieving a high anomaly detection recall rate
  2. Anomaly detection in high-dimensional and/or not-independent data
  3. Data-efficient learning of normality/abnormality
  4. Noise-resilient anomaly detection
  5. Detection of complex or multidimensional anomalies
  6. Anomaly explanation

There are three general ways to use deep learning for anomaly detection:

  • Deep learning for feature extraction
  • Learning feature representations of normality
  • End-to-end anomaly score learning
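To give a feel for "learning normality, then scoring deviations" and for why recall is hard, here is a deliberately simple sketch. It is not deep learning: a mean and standard deviation stand in for a learned representation of normality, and the data, function names, and thresholds are all made up for illustration.

```python
from statistics import mean, pstdev


def fit_normality(normal_values):
    """Summarize 'normal' data; a stand-in for a learned representation."""
    return mean(normal_values), pstdev(normal_values)


def anomaly_score(value, profile):
    """Score a point by how many standard deviations it sits from normal."""
    mu, sigma = profile
    return abs(value - mu) / sigma


def recall(scores, labels, threshold):
    """Fraction of true anomalies (label 1) whose score exceeds the threshold."""
    flagged = [score > threshold for score in scores]
    true_positives = sum(1 for f, label in zip(flagged, labels) if f and label)
    return true_positives / sum(labels)


profile = fit_normality([10, 11, 9, 10, 10, 12, 8, 10])
test_points = [10.5, 13.0, 25.0, 9.0]
test_labels = [0, 1, 1, 0]  # 1 marks a true anomaly
scores = [anomaly_score(p, profile) for p in test_points]

# A strict threshold misses the subtle anomaly; a looser one catches it.
print(recall(scores, test_labels, threshold=3.0))
print(recall(scores, test_labels, threshold=2.0))
```

Lowering the threshold raises recall here, but on real data it also flags more normal points, which is the trade-off behind the first challenge in the list above.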

Why I Liked This Article

My reasons for sharing this article are simple: it is well-written and I am quite interested in this topic.

