Wc in D: 712 Characters Without a Single Branch

栏目: IT技术 · 发布时间: 6年前

内容简介:After readingIt consists of one file and has 34 lines and 712 characters.Sure, it is using

After reading “Beating C With 80 Lines Of Haskell: Wc” , which I found on Hacker News , I thought D could do better. So I wrote a wc in D.

The Program

It consists of one file and has 34 lines and 712 characters.

import std.stdio : writefln, File;
import std.algorithm : map, fold, splitter;
import std.range : walkLength;
import std.typecons : Yes;
import std.uni : byCodePoint;

struct Line {
	size_t chars;
	size_t words;
}

struct Output {
	size_t lines;
	size_t words;
	size_t chars;
}

Output combine(Output a, Line b) pure nothrow {
	return Output(a.lines + 1, a.words + b.words, a.chars + b.chars);
}

Line toLine(char[] l) pure {
	return Line(l.byCodePoint.walkLength, l.splitter.walkLength);
}

void main(string[] args) {
	auto f = File(args[1]);
	Output o = f
		.byLine(Yes.keepTerminator)
		.map!(l => toLine(l))
		.fold!(combine)(Output(0, 0, 0));

	writefln!"%u %u %u %s"(o.lines, o.words, o.chars, args[1]);
}

Sure, it is using Phobos, D’s standard library , but then why wouldn’t it? Phobos is awesome and ships with every D compiler. The program itself does not contain a single if statement. The Haskell wc implementation has several if statements. The D program, apart from the main function, contains three tiny functions. I could have easily put all the functionally in one range chain, but then it probably would have exceeded 80 characters per line. That’s a major code-smell.

The Performance

Is the D wc faster than the coreutils wc ? No, but it took me 15 minutes to write mine (I had to search for walkLength , because I forgot its name).

file lines bytes coreutils haskell D
app.d 46 906 3.5 ms +- 1.9 ms 39.6 ms +- 7.8 ms 8.9 ms +- 2.1 ms
big.txt 862 64k 4.7 ms +- 2.0 ms 39.6 ms +- 7.8 ms 9.8 ms +- 2.1 ms
vbig.txt 1.7M 96M 658.6ms +- 24.5ms 226.4 ms +- 29.5 ms 1.102 s +- 0.022 s
vbig2.txt 12.1M 671M 4.4 s +- 0.058 s 1.1 s +- 0.039 s 7.4 s +- 0.085 s

Memory:

file coreutils haskell D
app.d 2052K 7228K 7708K
big.txt 2112K 7512K 7616K
vbig.txt 2288K 42620K 7712K
vbig2.txt 2360K 50860K 7736K

Is the Haskell wc faster? For big files, absolutely, but then it is using threads. For small files, GNU’s coreutils still beats the competition. At this stage my version is very likely IO bound, and it’s fast enough anyway.

I’ll not claim that one language is faster than another. If you spend a chunk of time on optimizing a micro-benchmark, you are likely going to beat the competition. That’s not real life. But I will claim that functional programming in D gives functional programming in Haskell a run for its money.

A Bit About Ranges

Wc in D: 712 Characters Without a Single Branch A range is an abstraction that you can consume through iteration without consuming the underlying collection (if there is one). Technically, a range can be a struct or a class that adheres to one of a handful of Range interfaces. The most basic form, the InputRange , requires the function

void popFront();

and two members or properties:

T front;
bool empty;

T is the generic type of the elements the range iterates.

In D, ranges are special in a way that other objects are not. When a range is given to a foreach statement, the compiler does a little rewrite.

foreach (e; range) { ... }

is rewritten to

for (auto __r = range; !__r.empty; __r.popFront()) {
    auto e = __r.front;
    ...
}

auto e = infers the type and is equivalent to T e = .

Given this knowledge, building a range that can be used by foreach is easy.

struct Iota {
	int front;
	int end;

	@property bool empty() const {
		return this.front == this.end;
	}

	void popFront() {
		++this.front;
	}
}

unittest {
	import std.stdio;
	foreach(it; Iota(0, 10)) {
		writeln(it);
	}
}

Iota is a very simple range. It functions as a generator, having no underlying collection. It iterates integers from front to end in steps of one. This snippet introduces a little bit of D syntax.

@property bool empty() const {

The @property attribute allows us to use the function empty the same way as a member variable (calling the function without the parenthesis). The trailing const means that we don’t modify any data of the instance we call empty on. The built-in unit test prints the numbers 0 to 10 .

Another small feature is the lack of an explicit constructor. The struct Iota has two member variables of type int . In the foreach statement in the test, we create an Iota instance as if it had a constructor that takes two int s. This isa struct literal. When the D compiler


以上就是本文的全部内容,希望对大家的学习有所帮助,也希望大家多多支持 码农网

查看所有标签

猜你喜欢:

本站部分资源来源于网络,本站转载出于传递更多信息之目的,版权归原作者或者来源机构所有,如转载稿涉及版权问题,请联系我们

社交红利

社交红利

徐志斌 / 北京联合出版公司 / 2013-8 / 42

如今的互联网,社交网络已占据了主要的位置。如腾讯微博、微信、QQ空间、人人网、新浪微博、唱吧、美丽说、啪啪等等,都可以算是社交网络,将大部分活跃的人们聚集起来,通过文字、图片、语音等形式分享着身边的事。这些社交网络吸引着更多兴趣相投的陌生人成为朋友结成圈子,也衍生出的海量流量和机会,为业界和创业者提供着源源不绝的新机会。可以这样说,社交网络在将散落在人们中的需求汇聚起来,等待着企业来提供服务。因此......一起来看看 《社交红利》 这本书的介绍吧!

JSON 在线解析
JSON 在线解析

在线 JSON 格式化工具

正则表达式在线测试
正则表达式在线测试

正则表达式在线测试

HEX CMYK 转换工具
HEX CMYK 转换工具

HEX CMYK 互转工具