Lies, Damned Lies, and Averages: Perc50, Perc95 explained for Programmers


I got a customer ticket the other day that said they weren’t worried about response time because “New Relic is showing our average response time to be sub 200ms”. Sounds good, right? Well, when it comes to performance - you can’t use the average if you don’t know the distribution. It’s usually best to use the median, which is also perc50, though you’ll also want to look at your long tail of responses. If you’re not following, then this post is for you.

Normal distributions and you

What is a distribution? It's the shape of the values that you get when you plot them in a histogram. Mainly you're looking at how frequently a number shows up in your data set. If that sounds like a "bell curve," then you're right, but only for a normal distribution. Here's a histogram of values generated from this code:

require 'rubystats'

average = 178
std_dev = 10
rand = Rubystats::NormalDistribution.new(average, std_dev)

1000.times.each { puts rand.rng }

I got the idea of using the rubystats gem from this post on generating random numbers in Ruby.

[Image: histogram of the 1,000 generated values]

With a normal distribution, the "median" is roughly the same as the average. In the data set I used to generate this image, the average was 177.8 and the median was 178.1. You might recall that the median is essentially the middle point of the sorted data set.
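If you want to compute a median yourself, here's a minimal sketch in plain Ruby (not code from the original post): sort the values and grab the middle one, averaging the two middle values when the count is even.

# Minimal median sketch: sort, then take the middle value
# (or the average of the two middle values for an even count).
def median(values)
  sorted = values.sort
  mid = sorted.size / 2
  sorted.size.odd? ? sorted[mid] : (sorted[mid - 1] + sorted[mid]) / 2.0
end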

If the average and median are pretty close, why would I recommend you not use averages? Well, they're only close if you've got a particular kind of distribution, such as a normal or flat distribution. Here are some numbers I pulled from a recent benchmark:

[Image: histograms of the benchmark results, one red and one blue distribution]

In this case, red and blue represent measurements of the code before and after a change. You can see that the shapes of the two distributions are different. If you use an average here, the values end up pretty far from the medians. The difference between the min and max values in this data set is 0.78 (seconds), and the difference between the median and the average is 0.17 (seconds). That means comparing averages instead of medians would put you off by roughly 22% of the entire range, which is not great.
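As a quick back-of-the-envelope check using the numbers above (a sketch, not from the original benchmark):

range_in_seconds  = 0.78 # max - min for this data set
median_vs_average = 0.17 # how far the average sits from the median
percent_off = (median_vs_average / range_in_seconds * 100).round
puts percent_off # => 22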

The blue and red graph above showing the non-normal distribution is pretty standard in web performance. Most of the requests are clustered around common values, but there is a tiny fraction of requests that are significant outliers. In this data set, you can see the median for red is about 3.18, while its maximum value is almost 4. In an ideal world we would be able to see a histogram of values while comparing performance, but that's not always possible. The key takeaway in this section is that web performance likely does not follow a normal distribution, and using an average to sum up those measurements is a terrible idea.
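To make that long tail concrete, here's a hypothetical example (the numbers are made up, not from the benchmark above) where a handful of slow outliers drag the average well above the median:

# Hypothetical response times in ms: 950 fast requests plus 50 slow outliers.
response_times = Array.new(950) { 180 + rand(40) } + Array.new(50) { 1_000 + rand(3_000) }

average = response_times.sum / response_times.size.to_f
median  = response_times.sort[response_times.size / 2]
puts "average: #{average.round}ms, median: #{median}ms" # the tail drags the average up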

Perc50 and Perc95

Previously I mentioned "perc50" and "perc95", so what exactly do those terms mean? The "perc" stands for percentile, and the number tells you which percentile. For example, "perc50" indicates you're looking at a number where 50% of requests are at or below that number. Here's how that looks in code:

# Returns the value at the given percentile (e.g. 50 for perc50) by
# averaging the two sorted values that straddle that position.
def perc(number, values)
  raise "Not a valid perc number #{number}" if number > 100 || number < 0

  sorted_values = values.sort
  index = (sorted_values.size - 1) * (number / 100.0)

  (sorted_values[index.floor] + sorted_values[index.ceil]) / 2.0
end

If we use this to find perc50 of the normal distribution we generated earlier, then it gives a similar answer:

require 'rubystats'

average = 178
std_dev = 10
rand = Rubystats::NormalDistribution.new(average, std_dev)

normal_values = 1000.times.map { rand.rng }

puts perc(50, normal_values)
# => 178.1

How would this look on a histogram? Here’s how I think of it:

[Image: histogram with the perc50 value (178.1) marked]

Essentially that value of 178.1 is saying that 50% of the items in our array are at that value or below. When you increase to perc95, here's what it looks like:

perc(95, normal_values)
# => 194.9

[Image: histogram with the perc95 value (194.9) marked]

Here we’re saying that 95% of all values are 194.9 or lower.
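As a quick sanity check (again a sketch, not code from the original post), you can count how many of the generated values fall at or below that perc95 value:

p95 = perc(95, normal_values)
at_or_below = normal_values.count { |v| v <= p95 }
puts at_or_below / normal_values.size.to_f # => roughly 0.95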

Heroku response time percentiles

When you look at your Heroku dashboard, you'll get perc50, perc95, perc99, and "max" values. The idea is to give you a snapshot of the distribution of your data. Here's a screenshot from my app CodeTriage, which helps people contribute to open source:

[Image: Heroku response time metrics for CodeTriage]

The median is lightning fast at 47ms, so at least half of all requests are that fast (or faster). But it looks like we've got a pretty long tail: perc95 is more than double our median, and perc99 is more than six times our median. The "max" value (which is essentially perc100) is even worse. When you visualize these numbers, imagine a clustered peak right around 47ms and then a really wide tail that ends at 3,071ms.
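If you want to produce that kind of summary yourself, here's a sketch using the perc helper from earlier; the response times below are made up purely for illustration:

# Made-up response times (ms): mostly fast, a slower tier, a few painful outliers.
response_times_ms = Array.new(900) { 30 + rand(80) } +
                    Array.new(90)  { 110 + rand(200) } +
                    Array.new(10)  { 500 + rand(2_600) }

puts "perc50: #{perc(50, response_times_ms).round}ms"
puts "perc95: #{perc(95, response_times_ms).round}ms"
puts "perc99: #{perc(99, response_times_ms).round}ms"
puts "max:    #{response_times_ms.max}ms"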

What this says to me is that, on average, my app is pretty fast. But at the fringes, people are waiting multiple seconds just to get a response from the server.

Lies and averages

In the case of my customer, they were absolutely right that their average was perfectly fine. But what they didn't see is that their perc95 was in the tens of seconds. That is an eternity to wait for a page to render. While it's perhaps a bit disingenuous to say the average was a "lie," it was certainly not the best representation of their data.

The next time someone gives you a single numerical answer for how fast something is, ask whether it's an average, a median, or something else. For bonus points, ask for the distribution or a histogram. As humans we tend to prefer reassuring lies over hard truths, and that certainly applies to benchmarks and profiling data. By learning about these essential measurements, you're arming yourself with a better understanding of your code, your performance characteristics, and the world.

If you liked this post, you might also like Statistical Literacy and You: An Intro for Programmers.

