Go is a Pretty Average Language (2019)

Category: IT Technology · Published: 4 years ago


I had a data visualization problem at work. I'd been thinking about set coverage issues, and wanted to test some ideas for visualizations. I had wanted to visualize the space of aggregate measures (i.e. things like means). It later transpired that I didn't need it, because my thinking around the issue had been wrong to begin with. But I had written some code, and was eager to try it out. By the end, it had morphed into something entirely different, but last night was a good, entertaining night nonetheless.

For some reason, the Computer Language Benchmarks Game had been sitting open in one of my browser tabs for about a month. I wondered if the data was freely available. It was. I rubbed my hands and got to work. An hour later, I had this (explanations and how-to follow).

[Figure: grid of per-language plots of GZip'd source code size (X-axis) against CPU time (Y-axis), best-performing programs only]

What Is This All About Then

The X-axis represents the size of the clean (i.e. no comments, normalized white space) GZip'd program source code, in bytes. The Y-axis represents performance, as measured by CPU time: the lower, the better. The dots are all the data points for all the benchmarks for all the languages in the Benchmarks Game dataset. Each subplot highlights one language. Each red line runs from the mean (X, Y) to a data point of the given language. The specific subset of programming languages is arbitrary: they're languages I've written useful programs in (yes, don't laugh, even NodeJS).
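The construction of each star can be sketched in plain Go. This is only an illustration, assuming the mean is taken over the given language's own points; the `point`, `segment`, `centroid` and `starSegments` names are hypothetical and do not come from the original program.

```go
package main

// point is one (source size, CPU time) measurement.
// The type and field names are hypothetical.
type point struct{ X, Y float64 }

// segment is one red line: from the hub of the star to a data point.
type segment struct{ From, To point }

// centroid returns the mean (X, Y) of pts, or the zero point if pts is empty.
func centroid(pts []point) point {
	if len(pts) == 0 {
		return point{}
	}
	var sx, sy float64
	for _, p := range pts {
		sx += p.X
		sy += p.Y
	}
	n := float64(len(pts))
	return point{X: sx / n, Y: sy / n}
}

// starSegments returns one segment per data point, all starting at the centroid.
func starSegments(pts []point) []segment {
	hub := centroid(pts)
	segs := make([]segment, 0, len(pts))
	for _, p := range pts {
		segs = append(segs, segment{From: hub, To: p})
	}
	return segs
}
```

A language whose programs cluster tightly produces a small, dense star; a language with widely varying programs produces long spokes.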

This chart shows the balance between the verbosity of a program written in a language and its runtime performance. The ideal programming language would sit in the lower left quadrant. You may have noticed I left one of my favourite languages, Python, out. This is because I truncated the chart at the given maximum X and Y values, so a plot for Python would show no red lines. I replaced it with Swift.

The density of the lines shows how many programs are being considered. The dataset contains multiple implementations of each benchmark in each language. For the image above, I considered only the best-performing program for each language and benchmark. Here's one without that filter:
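The "best only" filter amounts to keeping the lowest CPU time per (language, benchmark) pair. The following is a rough sketch of that idea, not the code the charts were made with; the `result` type and its field names are assumptions for illustration.

```go
package main

// result is one row of benchmark data. The field names are hypothetical;
// the real dataset has its own column names.
type result struct {
	Lang, Bench string
	Size        float64 // GZip'd source size, in bytes
	CPU         float64 // CPU time, lower is better
}

// bestPerBenchmark keeps, for every (language, benchmark) pair,
// the row with the lowest CPU time.
func bestPerBenchmark(rs []result) map[[2]string]result {
	best := make(map[[2]string]result)
	for _, r := range rs {
		k := [2]string{r.Lang, r.Bench}
		if cur, ok := best[k]; !ok || r.CPU < cur.CPU {
			best[k] = r
		}
	}
	return best
}
```

Skipping this step and plotting every row gives the denser, unfiltered version of the chart.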

[Figure: the same grid of per-language plots, with all programs included rather than only the best-performing ones]

The area covered by the lines is essentially the variance. I had originally wanted to also plot the polygon connecting the points, but I reasoned that the human mind is not good at understanding areas, so I decided against it.

What's the point of these charts? Well, for one, they allow me to play the amateur taxonomist. It is by no means formal, but now I can quantify the species of programming languages by the shape of their size-performance plot. That allows the following observations to be made.

Some Observations

Go is for Buddhists who like "The Middle Way". It is really average. By sheer chance I left it in the middle of the plot. Rust shares the performance characteristics of C, but is more verbose. The biggest surprise is the functional languages, OCaml and Haskell. For a language famed for its terseness, Haskell, it turns out, isn't as terse as expected: its average source code size is larger than Go's. OCaml, on the other hand, was the dark horse. It's about as verbose as Go, but performs better on average. This should be no surprise to anyone using FFTW though. OCaml has been known to generate great backend code (and I have used OCaml to generate some initial Gorgonia pieces too).

Julia was rather surprising in several respects. Julia is a lisp, so its terseness is expected. Julia uses LLVM, which has crazy amazing backend optimization, which is why its lower performance is somewhat surprising. However, using a median aggregate measure, Julia looks more as expected.
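Why the choice of aggregate matters can be seen with a small sketch: a single very slow program drags the mean far more than the median. The `mean` and `median` helpers below are illustrative and not taken from the original program.

```go
package main

import "sort"

// mean returns the arithmetic mean of xs, or 0 for an empty slice.
func mean(xs []float64) float64 {
	if len(xs) == 0 {
		return 0
	}
	var sum float64
	for _, x := range xs {
		sum += x
	}
	return sum / float64(len(xs))
}

// median returns the middle value of xs (the average of the two middle
// values for an even count), or 0 for an empty slice. xs is not modified.
func median(xs []float64) float64 {
	if len(xs) == 0 {
		return 0
	}
	s := append([]float64(nil), xs...) // sort a copy, not the caller's slice
	sort.Float64s(s)
	mid := len(s) / 2
	if len(s)%2 == 1 {
		return s[mid]
	}
	return (s[mid-1] + s[mid]) / 2
}
```

With made-up timings of 1, 2, 3, 4 and 100 seconds, the mean is 22 while the median is 3, so one outlier is enough to move the hub of a star a long way.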

How Was This Made

To the surprise of absolutely no one who reads this blog frequently (hello to the two of you), I wrote it in Go. I used Gonum's plot library to generate the plots. The full source code can be found here.

The program does warrant some explanation. There is no built-in plot.Plotter for drawing the lines, so I had to write one from scratch. These are the relevant lines:

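// star is a data structure used for plotting line stars
type star struct {
	plotter.XYs
	draw.LineStyle
	mx, my   float64 // centre of the star, in data coordinates
	trx, try float64 // truncate at these X and Y values, in data coordinates
}

// Plot implements plot.Plotter. It draws one line from the centre of the
// star to each data point.
func (s *star) Plot(c draw.Canvas, p *plot.Plot) {
	// tx and ty map data coordinates to canvas coordinates.
	tx, ty := p.Transforms(&c)
	trx, try := tx(s.trx), ty(s.try)
	ls := s.LineStyle
	mx, my := tx(s.mx), ty(s.my)
	for _, xy := range s.XYs {
		x := tx(xy.X)
		y := ty(xy.Y)
		// Clamp points that lie beyond the truncation limits.
		if x > trx {
			x = trx
		}
		if y > try {
			y = try
		}
		c.StrokeLine2(ls, mx, my, x, y)
	}
}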

I didn't quite remember how to do this. Thankfully, I had written a book on how to plot with custom Gonum plotters, so I could just refer to it. If you want full explanations, buy the book, or ask me nicely.*

* As part of my marketing contract, I am required to write an article about my book. The experience of writing the book was more rushed than expected, and if asked, I will say "I could have done better". So I shall subvert this article for the aforementioned marketing purpose.

Further, to tile the plots, I made use of Gorgonia's tensor package, which provides truly generic multidimensional arrays. It's not strictly necessary for 3x3 plots, but it was a quick and easy way to do things for me:

t := tensor.New(tensor.WithBacking(ps), tensor.WithShape(len(list)/cols, cols))
matUgh, err := native.Matrix(t)
dieIfErr(err)
mat := matUgh.([][]*plot.Plot)
tiles := draw.Tiles{Rows: t.Shape()[0], Cols: t.Shape()[1]}

I definitely think the tiling function could have been written more neatly. But it is what it is. Old mate Sebastien Binet from Go-HEP suggests using the TiledPlot data structure from hplot. I concur with the suggestion.
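For a grid this small, the same row-by-row view can also be had without any library. The sketch below is an alternative to the tensor approach, not what the original program does; `reshape` is a hypothetical helper and needs Go 1.18 or later for generics.

```go
package main

// reshape views flat as rows of length cols, without copying the elements.
// Trailing elements that do not fill a complete row are dropped, since the
// row count is len(flat)/cols with integer division.
// It returns nil when cols is not positive.
func reshape[T any](flat []T, cols int) [][]T {
	if cols <= 0 {
		return nil
	}
	rows := len(flat) / cols
	out := make([][]T, 0, rows)
	for r := 0; r < rows; r++ {
		out = append(out, flat[r*cols:(r+1)*cols])
	}
	return out
}
```

Calling `reshape(ps, cols)` on a slice of nine plots with `cols` set to 3 would give three rows of three, which is the shape the tiling code expects.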

Conclusion

I set out to visualize the space of aggregates in neural networks. I got sidetracked and plotted some charts about programming languages as an attempt to quantify them in some way. I came to some probably spurious conclusions. Go is pretty average.

There is an irony somewhere in this: I have now taken more time to write this blog post than to write the program. I'd like to hear what you think.

Addendum

In a case of not-doing-your-research-before-blabbing, Isaac Gouy, the current maintainer of the Computer Language Benchmarks Game, mentions in the comments below that something similar was done by Guillaume Marceau almost a decade ago. Guillaume even has the whole quadrant thing set up and properly explained. Do note that if you want to compare my results with Guillaume's, the axes are flipped on Guillaume's.

Additionally, the post above has been corrected to include a note that the source code size is GZip'd. This does not change the result of the "analysis" (if ever there was one).

