Go is a Pretty Average Language (2019)


I had a data visualization problem at work. I’ve been thinking about set coverage issues, and wanted to test some ideas for visualizations. I had wanted to visualize the space of aggregate measures (i.e. things like means, etc). It later transpired that I didn’t need it, because my thinking around the issue had been wrong to begin with. But I had written some code, and was eager to try it out. By the end, it had morphed into something entirely different, but it made for an entertaining night nonetheless.

For some reason, the Computer Language Benchmarks Game had been sitting open in one of my browser tabs for about a month. I wondered if the data was freely available. It was. I rubbed my hands and got to work. An hour later, I had this (explanations and how-to follow).

[Figure: source code size (X) against CPU time (Y), one subplot per language, best-performing programs only]

What Is This All About Then

The X-axis represents the size of the clean (i.e. no comments, normalized white space) GZip’d program source code, in bytes. The Y-axis represents performance, as measured by CPU time - the lower, the better. The dots are all the data points for all the benchmarks for all the languages in the Benchmarks Game dataset. Each subplot highlights one language: each red line runs from the mean (X, Y) to a data point of that language. The specific subset of programming languages is arbitrary: they’re languages I’ve written useful programs in (yes, don’t laugh, even NodeJS).
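The "star" in each subplot can be sketched in a few lines of plain Go. This is only an illustration of the idea, not the code behind the charts: the `point` and `segment` types and both function names are made up here, the numbers are invented, and it assumes the mean is taken over the highlighted language's own points.

```go
package main

import "fmt"

// point is one program in the dataset (hypothetical type):
// X is the gzipped source size in bytes, Y is the CPU time.
type point struct{ X, Y float64 }

// segment is a line from the mean point to one data point.
type segment struct{ From, To point }

// meanPoint returns the arithmetic mean of the X and Y coordinates.
// It returns the zero point for an empty slice.
func meanPoint(pts []point) point {
	if len(pts) == 0 {
		return point{}
	}
	var sx, sy float64
	for _, p := range pts {
		sx += p.X
		sy += p.Y
	}
	n := float64(len(pts))
	return point{X: sx / n, Y: sy / n}
}

// spokes builds one segment per data point, all starting at the mean.
func spokes(pts []point) []segment {
	m := meanPoint(pts)
	out := make([]segment, 0, len(pts))
	for _, p := range pts {
		out = append(out, segment{From: m, To: p})
	}
	return out
}

func main() {
	// Invented numbers, for illustration only.
	pts := []point{{400, 1}, {600, 3}, {800, 8}}
	fmt.Println("mean:", meanPoint(pts))
	for _, s := range spokes(pts) {
		fmt.Println(s.From, "->", s.To)
	}
}
```

A tight star means the language's programs cluster around its mean; long spokes mean a wide spread in size, speed, or both.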

This chart shows the balance between the verbosity of a program written in a language, and its runtime performance. The ideal programming language would sit in the lower-left quadrant. You may have noticed I left one of my favourite languages, Python, out. This is because I had truncated the chart at the given maximum X and Y values. Such a plot for Python would show no red line. I replaced it with Swift.

The density of lines shows how many programs are being considered. In the dataset, there are multiple implementations of each benchmark in each language. For the image above, I considered only the highest-performing program for each language and benchmark. Here’s one without filtering for only the best:

[Figure: the same chart with all implementations included, not only the best-performing ones]

The area spanned by the lines is essentially the variance. I had originally wanted to also plot the polygon connecting the points. But I reasoned that the human mind is not good at understanding areas, so decided against it.
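The difference between the two charts is just a filter: keep only the fastest program for each language and benchmark, or keep everything. A rough sketch of that filter follows. The `result` record and the `bestOnly` function are hypothetical names, the rows are invented, and it assumes "highest performing" means lowest CPU time.

```go
package main

import "fmt"

// result is a hypothetical record for one program in the dataset.
type result struct {
	Lang, Bench string
	Size        float64 // gzipped source size, in bytes
	CPU         float64 // CPU time; lower is better
}

// bestOnly keeps, for every (language, benchmark) pair, the single
// result with the lowest CPU time. Pairs stay in order of first appearance.
func bestOnly(rs []result) []result {
	type key struct{ lang, bench string }
	idx := make(map[key]int) // pair -> position in out
	var out []result
	for _, r := range rs {
		k := key{r.Lang, r.Bench}
		i, seen := idx[k]
		switch {
		case !seen:
			idx[k] = len(out)
			out = append(out, r)
		case r.CPU < out[i].CPU:
			out[i] = r
		}
	}
	return out
}

func main() {
	// Invented rows, for illustration only.
	rs := []result{
		{"go", "bench-a", 900, 7.5},
		{"go", "bench-a", 1100, 6.0},
		{"go", "bench-b", 700, 2.0},
		{"c", "bench-a", 1200, 4.0},
	}
	for _, r := range bestOnly(rs) {
		fmt.Println(r.Lang, r.Bench, r.Size, r.CPU)
	}
}
```

Skipping the call to `bestOnly` and plotting every row gives the denser of the two charts.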

What’s the point of these charts? Well, for one, it allows me to play the amateur taxonomist. It is by no means formal, but now I can quantify the species of programming languages by the shape of their size-performance plot. It allows for the following observations to be made.

Some Observations

Go is for Buddhists who like “The Middle Way”. It is really average. By sheer chance I left it at the middle of the plot. Rust shares the performance characteristics of C, but is more verbose. The biggest surprise is the functional languages, OCaml and Haskell. For a language famed for its terseness, Haskell, it turns out, isn’t as terse as expected - its average source code size is larger than the average Go source code size. OCaml on the other hand was the dark horse. It’s about as verbose as Go, but performs better on average. This should be no surprise to anyone using FFTW though. OCaml has been known to generate great backend code (and I have used OCaml to generate some initial Gorgonia pieces too).

Julia was rather surprising in several aspects: Julia is a lisp. So its terseness is expected. Julia uses LLVM, which has crazy amazing backend optimization. Which is why its lower performance is somewhat surprising. However, using a median aggregate measure, Julia looks more as expected.
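Why the choice of aggregate matters is easy to show with a toy example: a single slow outlier drags the mean a long way while barely moving the median. The numbers below are invented and are not Julia's actual results, and `mean` and `median` are plain helper functions written for this sketch.

```go
package main

import (
	"fmt"
	"sort"
)

// mean returns the arithmetic mean of xs, or 0 for an empty slice.
func mean(xs []float64) float64 {
	if len(xs) == 0 {
		return 0
	}
	var sum float64
	for _, x := range xs {
		sum += x
	}
	return sum / float64(len(xs))
}

// median returns the middle value of xs (the average of the two middle
// values when len(xs) is even), or 0 for an empty slice.
// The input slice is not modified.
func median(xs []float64) float64 {
	if len(xs) == 0 {
		return 0
	}
	s := append([]float64(nil), xs...) // copy before sorting
	sort.Float64s(s)
	mid := len(s) / 2
	if len(s)%2 == 1 {
		return s[mid]
	}
	return (s[mid-1] + s[mid]) / 2
}

func main() {
	// Invented CPU times: four quick runs and one very slow outlier.
	times := []float64{1, 2, 3, 4, 40}
	fmt.Println("mean:  ", mean(times))
	fmt.Println("median:", median(times))
}
```

With a mean-centred star, one badly performing benchmark moves the centre of the whole plot, which is one plausible reason a language can look worse than expected.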

How Was This Made

To the surprise of absolutely no one who reads this blog frequently (hello to the two of you), I wrote it in Go. I used Gonum’s plot library to generate the plots. The full source code can be found here.

The program does warrant some explanation. There is no plot.Plotter for drawing the lines, so I had to write that from scratch. These are the relevant lines:

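// star is a data structure used for plotting line stars
type star struct {
	plotter.XYs
	draw.LineStyle
	mx, my   float64 // mean point, in data coordinates
	trx, try float64 // truncate at
}

// Plot draws one line from the mean point to each data point,
// clamping the far end of each line at the truncation limits.
func (s *star) Plot(c draw.Canvas, p *plot.Plot) {
	// tx and ty map data coordinates to canvas coordinates.
	tx, ty := p.Transforms(&c)
	trx, try := tx(s.trx), ty(s.try)
	ls := s.LineStyle
	mx, my := tx(s.mx), ty(s.my)
	for _, xy := range s.XYs {
		x := tx(xy.X)
		y := ty(xy.Y)
		// Keep the end point inside the truncated plot area.
		if x > trx {
			x = trx
		}
		if y > try {
			y = try
		}
		c.StrokeLine2(ls, mx, my, x, y)
	}
}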

I didn’t quite remember how to do this. But thankfully I had written a book on how to plot with custom Gonum plotters, so I could just refer to it. If you want full explanations, buy the book, or ask me nicely.

* As part of my marketing contract requirement, I am required to write an article about my book. The experience of writing a book had been more rushed than expected, and if asked, I will say "I could have done better". So I shall subvert this article for the aforementioned marketing purpose.

Further, to tile the plots, I made use of Gorgonia’s tensor package, which provides truly generic multidimensional arrays. It’s not strictly necessary for 3x3 plots, but it was a quick and easy way to do things for me:

t := tensor.New(tensor.WithBacking(ps), tensor.WithShape(len(list)/cols, cols))
matUgh, err := native.Matrix(t)
dieIfErr(err)
mat := matUgh.([][]*plot.Plot)
tiles := draw.Tiles{Rows: t.Shape()[0], Cols: t.Shape()[1]}
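For a grid this small, the same reshaping can be done without any library. The sketch below is a plain-Go stand-in for the tensor step, not what the program does: `grid` is a made-up helper, strings stand in for the `*plot.Plot` values, and any leftover items that do not fill a complete row are ignored, mirroring the integer division in `len(list)/cols`.

```go
package main

import "fmt"

// grid views a flat slice as rows of cols items each. The rows are
// sub-slices of flat, so nothing is copied. Items that do not fill a
// complete final row are left out. It returns nil when cols <= 0.
func grid(flat []string, cols int) [][]string {
	if cols <= 0 {
		return nil
	}
	rows := len(flat) / cols
	out := make([][]string, rows)
	for r := range out {
		out[r] = flat[r*cols : (r+1)*cols]
	}
	return out
}

func main() {
	// Placeholder names standing in for the nine per-language plots.
	ps := []string{"p0", "p1", "p2", "p3", "p4", "p5", "p6", "p7", "p8"}
	mat := grid(ps, 3)
	fmt.Println("rows:", len(mat), "cols:", len(mat[0]))
	for _, row := range mat {
		fmt.Println(row)
	}
}
```

The row and column counts that `draw.Tiles` needs are then simply `len(mat)` and `len(mat[0])`.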

I do definitely think the tiling function could have been more neatly written. But it is what it is. Old mate Sebastien Binet from Go-HEP suggests using the TiledPlot data structure from hplot. I concur with the suggestion.

Conclusion

I set out to visualize the space of aggregates in neural networks. I got sidetracked and plotted some charts about programming languages as an attempt to quantify them in some way. I came to some probably spurious conclusions. Go is pretty average.

There is an irony in here somewhere: I have now taken more time to write this blog post than to write the program. I’d like to hear what you think.

Addendum

In a case of not-doing-your-research-before-blabbing, Isaac Gouy, current maintainer of the Computer Language Benchmarks Game, mentions in the comments below that something similar had been done by Guillaume Marceau almost a decade ago. Guillaume even has the whole quadrant thing set up and properly explained. Do note that if you want to compare my results with Guillaume’s, the axes are flipped on Guillaume’s.

Additionally, the post above has been corrected to include a note that the source code size is GZip’d. This does not change the result of the “analysis” (if ever there was one).

