Wc in D: 712 Characters Without a Single Branch

栏目: IT技术 · 发布时间: 4年前

内容简介：After readingIt consists of one file and has 34 lines and 712 characters.Sure, it is using

After reading “Beating C With 80 Lines Of Haskell: Wc” , which I found on Hacker News , I thought D could do better. So I wrote a wc in D.

The Program

It consists of one file and has 34 lines and 712 characters.

import std.stdio : writefln, File;
import std.algorithm : map, fold, splitter;
import std.range : walkLength;
import std.typecons : Yes;
import std.uni : byCodePoint;

struct Line {
	size_t chars;
	size_t words;
}

struct Output {
	size_t lines;
	size_t words;
	size_t chars;
}

Output combine(Output a, Line b) pure nothrow {
	return Output(a.lines + 1, a.words + b.words, a.chars + b.chars);
}

Line toLine(char[] l) pure {
	return Line(l.byCodePoint.walkLength, l.splitter.walkLength);
}

void main(string[] args) {
	auto f = File(args[1]);
	Output o = f
		.byLine(Yes.keepTerminator)
		.map!(l => toLine(l))
		.fold!(combine)(Output(0, 0, 0));

	writefln!"%u %u %u %s"(o.lines, o.words, o.chars, args[1]);
}

Sure, it is using Phobos, D’s standard library , but then why wouldn’t it? Phobos is awesome and ships with every D compiler. The program itself does not contain a single if statement. The Haskell wc implementation has several if statements. The D program, apart from the main function, contains three tiny functions. I could have easily put all the functionally in one range chain, but then it probably would have exceeded 80 characters per line. That’s a major code-smell.

The Performance

Is the D wc faster than the coreutils wc ? No, but it took me 15 minutes to write mine (I had to search for walkLength , because I forgot its name).

file	lines	bytes	coreutils	haskell	D
app.d	46	906	3.5 ms +- 1.9 ms	39.6 ms +- 7.8 ms	8.9 ms +- 2.1 ms
big.txt	862	64k	4.7 ms +- 2.0 ms	39.6 ms +- 7.8 ms	9.8 ms +- 2.1 ms
vbig.txt	1.7M	96M	658.6ms +- 24.5ms	226.4 ms +- 29.5 ms	1.102 s +- 0.022 s
vbig2.txt	12.1M	671M	4.4 s +- 0.058 s	1.1 s +- 0.039 s	7.4 s +- 0.085 s

Memory:

file	coreutils	haskell	D
app.d	2052K	7228K	7708K
big.txt	2112K	7512K	7616K
vbig.txt	2288K	42620K	7712K
vbig2.txt	2360K	50860K	7736K

Is the Haskell wc faster? For big files, absolutely, but then it is using threads. For small files, GNU’s coreutils still beats the competition. At this stage my version is very likely IO bound, and it’s fast enough anyway.

I’ll not claim that one language is faster than another. If you spend a chunk of time on optimizing a micro-benchmark, you are likely going to beat the competition. That’s not real life. But I will claim that functional programming in D gives functional programming in Haskell a run for its money.

A Bit About Ranges

Wc in D: 712 Characters Without a Single Branch A range is an abstraction that you can consume through iteration without consuming the underlying collection (if there is one). Technically, a range can be a struct or a class that adheres to one of a handful of Range interfaces. The most basic form, the InputRange , requires the function

void popFront();

and two members or properties:

T front;
bool empty;

T is the generic type of the elements the range iterates.

In D, ranges are special in a way that other objects are not. When a range is given to a foreach statement, the compiler does a little rewrite.

foreach (e; range) { ... }

is rewritten to

for (auto __r = range; !__r.empty; __r.popFront()) {
    auto e = __r.front;
    ...
}

auto e = infers the type and is equivalent to T e = .

Given this knowledge, building a range that can be used by foreach is easy.

struct Iota {
	int front;
	int end;

	@property bool empty() const {
		return this.front == this.end;
	}

	void popFront() {
		++this.front;
	}
}

unittest {
	import std.stdio;
	foreach(it; Iota(0, 10)) {
		writeln(it);
	}
}

Iota is a very simple range. It functions as a generator, having no underlying collection. It iterates integers from front to end in steps of one. This snippet introduces a little bit of D syntax.

@property bool empty() const {

The @property attribute allows us to use the function empty the same way as a member variable (calling the function without the parenthesis). The trailing const means that we don’t modify any data of the instance we call empty on. The built-in unit test prints the numbers 0 to 10 .

Another small feature is the lack of an explicit constructor. The struct Iota has two member variables of type int . In the foreach statement in the test, we create an Iota instance as if it had a constructor that takes two int s. This isa struct literal. When the D compiler

以上就是本文的全部内容，希望对大家的学习有所帮助，也希望大家多多支持码农网

查看所有标签

猜你喜欢:

Wc in D: 712 Characters Without a Single Branch

本站部分资源来源于网络，本站转载出于传递更多信息之目的，版权归原作者或者来源机构所有，如转载稿涉及版权问题，请联系我们。

码农书籍

深入浅出Tapestry

董黎伟 / 电子工业出版社 / 2007-3 / 49.0

本书以循序渐进的方式，从Tapestry框架技术的基本概念入手，讲解Tapestry框架在J2EE Web应用程序中的整体架构实现。使读者在学习如何使用Tapestry框架技术的同时，还能够获得在J2EE Web应用程序中应用Tapestry框架的先进经验。本书详细介绍了Hivemind框架的原理与应用，使读者不但可以通过Hivemind来重构Tapestry的官方实现，还可以使用Hive......一起来看看《深入浅出Tapestry》这本书的介绍吧!

码农工具

Wc in D: 712 Characters Without a Single Branch

The Program

The Performance

A Bit About Ranges

深入浅出Tapestry

JS 压缩/解压工具

SHA 加密

RGB CMYK 转换工具