Wc in D: 712 Characters Without a Single Branch

栏目: IT技术 · 发布时间: 6年前

内容简介:After readingIt consists of one file and has 34 lines and 712 characters.Sure, it is using

After reading “Beating C With 80 Lines Of Haskell: Wc” , which I found on Hacker News , I thought D could do better. So I wrote a wc in D.

The Program

It consists of one file and has 34 lines and 712 characters.

import std.stdio : writefln, File;
import std.algorithm : map, fold, splitter;
import std.range : walkLength;
import std.typecons : Yes;
import std.uni : byCodePoint;

struct Line {
	size_t chars;
	size_t words;
}

struct Output {
	size_t lines;
	size_t words;
	size_t chars;
}

Output combine(Output a, Line b) pure nothrow {
	return Output(a.lines + 1, a.words + b.words, a.chars + b.chars);
}

Line toLine(char[] l) pure {
	return Line(l.byCodePoint.walkLength, l.splitter.walkLength);
}

void main(string[] args) {
	auto f = File(args[1]);
	Output o = f
		.byLine(Yes.keepTerminator)
		.map!(l => toLine(l))
		.fold!(combine)(Output(0, 0, 0));

	writefln!"%u %u %u %s"(o.lines, o.words, o.chars, args[1]);
}

Sure, it is using Phobos, D’s standard library , but then why wouldn’t it? Phobos is awesome and ships with every D compiler. The program itself does not contain a single if statement. The Haskell wc implementation has several if statements. The D program, apart from the main function, contains three tiny functions. I could have easily put all the functionally in one range chain, but then it probably would have exceeded 80 characters per line. That’s a major code-smell.

The Performance

Is the D wc faster than the coreutils wc ? No, but it took me 15 minutes to write mine (I had to search for walkLength , because I forgot its name).

file lines bytes coreutils haskell D
app.d 46 906 3.5 ms +- 1.9 ms 39.6 ms +- 7.8 ms 8.9 ms +- 2.1 ms
big.txt 862 64k 4.7 ms +- 2.0 ms 39.6 ms +- 7.8 ms 9.8 ms +- 2.1 ms
vbig.txt 1.7M 96M 658.6ms +- 24.5ms 226.4 ms +- 29.5 ms 1.102 s +- 0.022 s
vbig2.txt 12.1M 671M 4.4 s +- 0.058 s 1.1 s +- 0.039 s 7.4 s +- 0.085 s

Memory:

file coreutils haskell D
app.d 2052K 7228K 7708K
big.txt 2112K 7512K 7616K
vbig.txt 2288K 42620K 7712K
vbig2.txt 2360K 50860K 7736K

Is the Haskell wc faster? For big files, absolutely, but then it is using threads. For small files, GNU’s coreutils still beats the competition. At this stage my version is very likely IO bound, and it’s fast enough anyway.

I’ll not claim that one language is faster than another. If you spend a chunk of time on optimizing a micro-benchmark, you are likely going to beat the competition. That’s not real life. But I will claim that functional programming in D gives functional programming in Haskell a run for its money.

A Bit About Ranges

Wc in D: 712 Characters Without a Single Branch A range is an abstraction that you can consume through iteration without consuming the underlying collection (if there is one). Technically, a range can be a struct or a class that adheres to one of a handful of Range interfaces. The most basic form, the InputRange , requires the function

void popFront();

and two members or properties:

T front;
bool empty;

T is the generic type of the elements the range iterates.

In D, ranges are special in a way that other objects are not. When a range is given to a foreach statement, the compiler does a little rewrite.

foreach (e; range) { ... }

is rewritten to

for (auto __r = range; !__r.empty; __r.popFront()) {
    auto e = __r.front;
    ...
}

auto e = infers the type and is equivalent to T e = .

Given this knowledge, building a range that can be used by foreach is easy.

struct Iota {
	int front;
	int end;

	@property bool empty() const {
		return this.front == this.end;
	}

	void popFront() {
		++this.front;
	}
}

unittest {
	import std.stdio;
	foreach(it; Iota(0, 10)) {
		writeln(it);
	}
}

Iota is a very simple range. It functions as a generator, having no underlying collection. It iterates integers from front to end in steps of one. This snippet introduces a little bit of D syntax.

@property bool empty() const {

The @property attribute allows us to use the function empty the same way as a member variable (calling the function without the parenthesis). The trailing const means that we don’t modify any data of the instance we call empty on. The built-in unit test prints the numbers 0 to 10 .

Another small feature is the lack of an explicit constructor. The struct Iota has two member variables of type int . In the foreach statement in the test, we create an Iota instance as if it had a constructor that takes two int s. This isa struct literal. When the D compiler


以上就是本文的全部内容,希望对大家的学习有所帮助,也希望大家多多支持 码农网

查看所有标签

猜你喜欢:

本站部分资源来源于网络,本站转载出于传递更多信息之目的,版权归原作者或者来源机构所有,如转载稿涉及版权问题,请联系我们

社交红利2.0

社交红利2.0

徐志斌 / 中信出版社 / 2015-9 / 42元

大型社交网络发展至今,开始显露出更为惊人的力量。有一个独特现象与这一结果相伴相生,即新应用或服务一进入社交网络就即时引爆,就像用户在等待它出现一样。随即开始的病毒式扩散,让创业者成为全民话题的焦点。但这一切是如何实现的?具备哪些特征的合作伙伴才可以被即时引爆? 作者从其长期追踪的近30个一进入微博、微信就引爆的经典案例中甄选出若干典型案例。从大量一手鲜活的后台数据入手,并结合腾讯对社交网络的......一起来看看 《社交红利2.0》 这本书的介绍吧!

JS 压缩/解压工具
JS 压缩/解压工具

在线压缩/解压 JS 代码

MD5 加密
MD5 加密

MD5 加密工具

UNIX 时间戳转换
UNIX 时间戳转换

UNIX 时间戳转换