Why does computing standard deviation in pandas and NumPy yield different results?

Category: IT Technology · Published: 5 years ago

Curious? Let’s talk about statistics, populations, and samples…

Apr 29 · 5 min read

Image by Gerd Altmann from Pixabay

Have you noticed that when you compute a standard deviation with pandas and compare it to the result of the equivalent NumPy function, you get different numbers?

I bet some of you have not noticed this. And even if you have, you may be asking: why?

In this short article, we will:

show that the standard deviation results from the two libraries really do differ (at least at first glance),
discuss why that is (focusing on populations, samples, and how this distinction affects the way each library calculates standard deviation),
and finally show how to obtain the same results with pandas and NumPy (in the end, they should agree on a computation as simple as standard deviation).

Let’s get started.
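
The three points above can be sketched in a few lines. This is a minimal illustration, not the author's original example: the data in `values` is made up, and the variable names are hypothetical. It relies on the documented defaults of the two libraries: `np.std` uses `ddof=0` and divides the sum of squared deviations by N (the population formula), while `Series.std` uses `ddof=1` and divides by N - 1 (the sample formula).

```python
import numpy as np
import pandas as pd

# Hypothetical data, chosen only for illustration.
values = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]

arr = np.array(values)
ser = pd.Series(values)

# NumPy default: ddof=0, so the divisor is N (population standard deviation).
numpy_default = np.std(arr)

# pandas default: ddof=1, so the divisor is N - 1 (sample standard deviation).
pandas_default = ser.std()

print("NumPy default: ", numpy_default)
print("pandas default:", pandas_default)

# Passing the same ddof to both libraries makes the results agree.
numpy_sample = np.std(arr, ddof=1)       # matches the pandas default
pandas_population = ser.std(ddof=0)      # matches the NumPy default

print("NumPy, ddof=1: ", numpy_sample)
print("pandas, ddof=0:", pandas_population)
```

Which `ddof` is appropriate depends on whether the data is treated as the whole population or as a sample drawn from it; the sketch only shows that the two libraries agree once the choice is made explicit.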
