Donald Knuth Was Framed

栏目: IT技术 · 发布时间: 4年前

内容简介:The other day I was talking with a friend about structured editing and literate programming came up. LP was one of Donald Knuth’s ideas, to structure programs as readable documents instead of just machine docs. He was interested in it, I was cautiously ske

The other day I was talking with a friend about structured editing and literate programming came up. LP was one of Donald Knuth’s ideas, to structure programs as readable documents instead of just machine docs. He was interested in it, I was cautiously skeptical. We both knew the famous story about it:

In 1986, Jon Bentley asked Knuth to demonstrate the concept of literate programming by writing a program in WEB. Knuth came up with an 8-pages long monolithic listing that was published together with a critique by Douglas McIlroy of Bell Labs. McIlroy praised intricacy of Knuth’s solution, his choice of a data structure (Frank M. Liang’s hash trie), but noted that more practical, much faster to implement, debug and modify solution of the problem takes only six lines of shell script by reusing standard Unix utilities. McIlroy concluded:[12]

Knuth has shown us here how to program intelligibly, but not wisely. I buy the discipline. I do not buy the result. He has fashioned a sort of industrial-strength Faberge egg—intricate, wonderfully worked, refined beyond all ordinary desires, a museum piece from the start. 

The program was “print out the top K most-used words in a text.” The six lines of shell script were

tr -cs A-Za-z '\n' |
tr A-Z a-z |
sort |
uniq -c |
sort -rn |
sed ${1}q

A damning counter. But neither of us had ever read the paper. And as you know, I’m all about primary sources. We pulled up the paper here and read through it together. And it left us with a very different understanding of literate programming, and the challenge, than the famous story gave.

First of all, we found that Literate Programming as conceived by Knuth not just “text, code, text, code”. You could say “now we add foo to section <<bar>> ” even after having already created bar and modified it several times before. Writing the code out-of-order is the whole point. You’d use a tool to take the narrative, or “weave” and synthesize the code, the “tangle”. Very roughly, you could write:

Blah blah blah

  <<foo>>
  <<bar>>

Blah blah
  <<bar>> = print("world
Blah
  <<foo>> = print("hello ");
Blah
  <<bar>> += !");

And it would weave that into the actual program.

The actual paper paints LP in a much better light than we thought it did.

  • Knuth was asked to demonstrate his tool for LP, WEB, and was demonstrating LP as part of that. He wasn’t competing with McIlroy, and didn’t know how McIlroy was going to critique it.
  • The actual content is effectively his stream-of-consciousness notes about what he’s doing. He discusses why he makes the choices he does and what he’s thinking about as primary concerns. It’s easy to skim or deep-dive the code.
  • Most of the “eight pages” aren’t because Knuth is doing LP, but because he’s Donald Knuth:
    • One page is him setting up the problem (“what do we mean by ‘word’? What if multiple words share the same frequency?”) and one page is just the index.
    • Another page is just about working around specific Pascal issues no modern language has, like “how do we read in an integer” and “how do we identify letters when Pascal’s character set is poorly defined.”
    • Then there’s almost four pages of handrolling a hash trie.
  • Knuth is doing everything from first principles. One of McIlroy’s strongest criticisms is “everything there- even input conversion and sorting-is programmed monolithically and from scratch. In particular the isolation of words, the handling of punctuation, and the treatment of case distinctions are built in.” Of course it is! Knuth was trying to show how WEB makes it easier to write and understand programs, not explain how Pascal packages work! His ideas aren’t invalidated just because we can import libraries.

[rant] McIlroy could only write such a tight script because we’re doing something that’s near perfect for shell scripting: simple processing of text. If we choose a slightly different problem, like “find the top K pairs of words and print the Levenshtein distance between each pair” then shell scripting that would be a lot harder.[/rant]

Not to say that McIlroy is entirely wrong, and he raises a lot of good points. But I don’t think this is the damning indictment of LP or a total vindication of the Unix model. It also got me to look into the Leo Editor , which seems a more modern version of LP. I’ll report on how it goes.


以上就是本文的全部内容,希望对大家的学习有所帮助,也希望大家多多支持 码农网

查看所有标签

猜你喜欢:

本站部分资源来源于网络,本站转载出于传递更多信息之目的,版权归原作者或者来源机构所有,如转载稿涉及版权问题,请联系我们

一本书读懂大数据

一本书读懂大数据

黄颖 / 吉林出版集团有限责任公司 / 2014-12-1

进入大数据时代,让数据开口说话将成为司空见惯的事情,书中将从大数据时代的前因后果讲起,全面分析大数据时代的特征、企业实践的案例、大数据的发展方向、未来的机遇和挑战等内容,展现一个客观立体、自由开放的大数据时代。一起来看看 《一本书读懂大数据》 这本书的介绍吧!

HTML 编码/解码
HTML 编码/解码

HTML 编码/解码

UNIX 时间戳转换
UNIX 时间戳转换

UNIX 时间戳转换

HEX CMYK 转换工具
HEX CMYK 转换工具

HEX CMYK 互转工具