Medium Articles that Made Me a Better Data Scientist in July
Improving my skillset and knowledge about contemporary machine learning
Aug 2 · 4 min read
I am a data scientist and avid reader (and writer) of articles about data science and machine learning. It takes time to read journal articles, listen to podcasts, analyze interesting data, and play with new machine learning packages. Blog posts offer much of this knowledge in a condensed form.
When browsing, I look for articles that will teach me something new and applicable. Where is the field of machine learning heading? What is the latest research? How can I do my job as a data scientist better? How can I understand machine learning algorithms better?
My favorite articles are those that help me understand machine learning algorithms — especially algorithms that weren’t widely used (or didn’t exist) when I completed my MS in data science in 2017. The most useful articles are those that include code examples because recoding algorithms is time-consuming.
All this said, there are a lot of data science articles out there! Which to read?
In this post, I share some articles that I read and found useful in July. These helped grow my knowledge about data science and machine learning. I hope you find them just as valuable.
The Severe Limitations of Supervised Learning Are Piling Up
What is the future of machine learning research?
Supervised learning algorithms have brought a lot of value to businesses. However, the marginal value of small improvements in a supervised learning algorithm is decreasing. Why? The algorithms themselves are already pretty good and require many labels to provide good results. Labels are expensive to acquire and there are many datasets with no explicit “labels”!
Supervised learning is in the process of realizing another limitation: at its best, it only does exactly what we want it to do. Supervised learning can only interpolate. Reinforcement learning and similar evolutionary algorithms have the potential to extrapolate.
The future of research is likely in unsupervised, semi-supervised, and reinforcement learning!
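To make the interpolation point concrete, here is a minimal sketch that is not taken from the article. It assumes a toy 1-nearest-neighbor regressor trained on y = 2x for x between 0 and 10; the function name and data are hypothetical. Other model families (a linear model, for example) behave differently outside the training range, so treat this as one illustration rather than a proof of the general claim.

```python
def nearest_neighbor_predict(train_x, train_y, query):
    """Return the label of the training point closest to the query (1-NN)."""
    closest = min(range(len(train_x)), key=lambda i: abs(train_x[i] - query))
    return train_y[closest]

# Hypothetical training data: y = 2x, observed only for x in [0, 10].
train_x = [float(i) for i in range(11)]
train_y = [2.0 * x for x in train_x]

# Inside the training range, the prediction stays close to the true value (8.4).
inside = nearest_neighbor_predict(train_x, train_y, 4.2)

# Outside the range, the model can only repeat the nearest label it has seen,
# even though the true value at x = 20 would be 40.
outside = nearest_neighbor_predict(train_x, train_y, 20.0)

print(inside, outside)
```

The model does reasonably well between labeled points but has nothing to say beyond them, which is the limitation the article describes.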
Why I Liked This Article
This article speaks to the future of machine learning research and what sort of breakthroughs I might expect to see.
NGBoost algorithm: solving probabilistic prediction problems
NGBoost is a “natural gradient” boosting algorithm that can predict the distribution of a target variable, not just a point estimate. This is important because often the uncertainty of a model, or range of probable values, is just as important as the exact predicted value.
How is this done? Per the paper, “NGBoost generalizes gradient boosting to probabilistic regression by treating the parameters of the conditional distribution as targets for a multiparameter boosting algorithm.”
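As a rough illustration of the "natural gradient" part, here is a minimal sketch that is not the NGBoost implementation. It assumes a Normal distribution parameterized by (mu, log sigma) and fits a single constant distribution to a handful of made-up targets. NGBoost itself fits base learners to per-example natural gradients at every boosting stage; that step is omitted here. The function name, data, and hyperparameters are hypothetical.

```python
import math

def natural_gradient_fit(targets, learning_rate=0.1, steps=500):
    """Fit Normal(mu, sigma) to targets by natural-gradient descent on the mean NLL."""
    mu, log_sigma = 0.0, 0.0
    n = len(targets)
    for _ in range(steps):
        sigma_sq = math.exp(2.0 * log_sigma)
        # Ordinary gradients of the mean negative log-likelihood.
        grad_mu = sum(mu - y for y in targets) / (n * sigma_sq)
        grad_log_sigma = sum(1.0 - (y - mu) ** 2 / sigma_sq for y in targets) / n
        # For this parameterization the Fisher information is diag(1/sigma^2, 2),
        # so the natural gradient rescales each ordinary gradient by its inverse.
        natural_mu = grad_mu * sigma_sq
        natural_log_sigma = grad_log_sigma / 2.0
        mu -= learning_rate * natural_mu
        log_sigma -= learning_rate * natural_log_sigma
    return mu, math.exp(log_sigma)

# Made-up targets; the fit should recover their mean and (population) standard deviation.
targets = [2.1, 1.9, 2.5, 1.5, 2.0, 2.4, 1.6, 2.0]
mu, sigma = natural_gradient_fit(targets)
print(f"predicted distribution: Normal(mu={mu:.3f}, sigma={sigma:.3f})")
```

The output is a full distribution (a mean and a spread) rather than a single number, which is what allows a range of probable values to be reported alongside the point estimate.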
Why I Liked This Article
This article nicely explains a technical paper about the new NGBoost algorithm. As icing on the cake, the article also shares a Python package that makes it easy to apply NGBoost in practice. (Paper + open source code = gold)
GPT-3, a Giant Step for Deep Learning And NLP
Many articles (and tweets) have been written demonstrating the impressive capabilities of GPT-3. But how does it work? Fortunately, this article breaks down and explains the key points of the 72-page GPT-3 paper.
Why I Liked This Article
GPT-3 is a powerful model with many applications. I personally hope to incorporate it into a project sometime! As a practitioner, there is a lot of value in understanding how language models work, especially in choosing how to proceed in an NLP project. But the paper itself is a bit long to read for general understanding; this blog post highlights the key points that I need to know.
SHAP explained the way I wish someone explained it to me
Explainable ML is an important new area in machine learning. SHAP is a popular method that highlights how a black-box model uses data to make predictions.
The explainer of a black-box model should not itself be a black-box.
This article provides a visual and intuitive explanation of the SHAP algorithm.
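SHAP builds on Shapley values from cooperative game theory. As a minimal sketch (not the SHAP library, and not code from the article), the example below computes exact Shapley values for a toy three-feature model by enumerating every feature subset. It assumes "missing" features are replaced with a fixed baseline value, which is a simplification; the toy model, function names, and inputs are all hypothetical.

```python
from itertools import combinations
from math import factorial

def shapley_values(model, x, baseline):
    """Exact Shapley values; features outside a subset take their baseline value."""
    n = len(x)

    def value(subset):
        mixed = [x[j] if j in subset else baseline[j] for j in range(n)]
        return model(mixed)

    phi = []
    for i in range(n):
        others = [j for j in range(n) if j != i]
        total = 0.0
        for size in range(n):
            # Weight of a subset of this size in the Shapley average.
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                s = set(subset)
                # Marginal contribution of feature i when added to subset s.
                total += weight * (value(s | {i}) - value(s))
        phi.append(total)
    return phi

def toy_model(features):
    """Hypothetical black box: one additive term plus one interaction term."""
    return 2 * features[0] + features[1] * features[2]

x = [1.0, 2.0, 3.0]
baseline = [0.0, 0.0, 0.0]
phi = shapley_values(toy_model, x, baseline)
print(phi)
```

The attributions sum to the difference between the prediction for x and the prediction for the baseline, and the two interacting features split their joint effect equally. Enumeration is exponential in the number of features, so this only works for tiny examples.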
Why I Liked This Article
I have worked with model explainability methods like SHAP before, but admittedly my knowledge of the algorithm was rudimentary. After this blog post, I have a much stronger understanding and can explain how it works.
Deep Learning for Anomaly Detection: A Comprehensive Survey
This article summarizes a survey paper on deep learning for anomaly detection.
The paper/article describes the key challenges of the anomaly detection task:
- Difficulty achieving a high anomaly detection recall rate
- Anomaly detection in high-dimensional and/or non-independent data
- Data-efficient learning of normality/abnormality
- Noise-resilient anomaly detection
- Detection of complex or multidimensional anomalies
- Anomaly explanation
There are three general ways to use deep learning for anomaly detection:
- Deep learning for feature extraction
- Learning feature representations of normality
- End-to-end anomaly score learning
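The second approach, learning a representation of normality, is commonly realized with an autoencoder scored by reconstruction error. The sketch below is a deliberately simplified stand-in, not a deep model and not code from the survey: it assumes a one-unit linear autoencoder with tied weights, which amounts to projecting onto the first principal direction, found here by power iteration. The data points and function names are hypothetical.

```python
import math

def fit_direction(points, iterations=50):
    """Power iteration on the second-moment matrix of 2-D points."""
    m11 = sum(p[0] * p[0] for p in points)
    m12 = sum(p[0] * p[1] for p in points)
    m22 = sum(p[1] * p[1] for p in points)
    w = (1.0, 0.0)
    for _ in range(iterations):
        v = (m11 * w[0] + m12 * w[1], m12 * w[0] + m22 * w[1])
        norm = math.hypot(v[0], v[1])
        w = (v[0] / norm, v[1] / norm)
    return w

def anomaly_score(point, w):
    """Squared reconstruction error after encoding to 1-D and decoding back."""
    code = point[0] * w[0] + point[1] * w[1]      # encode
    recon = (code * w[0], code * w[1])            # decode with tied weights
    return (point[0] - recon[0]) ** 2 + (point[1] - recon[1]) ** 2

# Hypothetical "normal" data lying close to the line y = x.
normal_points = [(1.0, 1.1), (2.0, 1.9), (3.0, 3.2),
                 (-1.0, -0.9), (-2.0, -2.1), (0.5, 0.4)]
w = fit_direction(normal_points)

normal_scores = [anomaly_score(p, w) for p in normal_points]
outlier_score = anomaly_score((2.0, -2.0), w)
print(max(normal_scores), outlier_score)
```

Points that follow the pattern learned from normal data reconstruct well and get low scores, while a point that breaks the pattern reconstructs poorly and gets a high score. A deep autoencoder follows the same scoring idea with a nonlinear encoder and decoder.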
Why I Liked This Article
My reasons for sharing this article are simple: it is well-written and I am quite interested in this topic.