The growth of command line options

The growth of command line options, 1979-Present

My hobby: opening up McIlroy’s UNIX philosophy on one monitor while reading manpages on the other.

The first of McIlroy's dicta is often paraphrased as "do one thing and do it well", which is shortened from "Make each program do one thing well. To do a new job, build afresh rather than complicate old programs by adding new 'features.'"

McIlroy's example of this dictum is:

Surprising to outsiders is the fact that UNIX compilers produce no listings: printing can be done better and more flexibly by a separate program.

If you open up the manpage for ls on a Mac, you’ll see that it starts with

ls [-ABCFGHLOPRSTUW@abcdefghiklmnopqrstuwx1] [file ...]

That is, the one-letter flags to ls include every lowercase letter except for {jvyz}, 14 uppercase letters, plus @ and 1. That’s 22 + 14 + 2 = 38 single-character options alone.

On Ubuntu 17, if you read the manpage for coreutils ls, you don’t get a nice summary of options, but you’ll see that ls has 58 options (including --help and --version).
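
The arithmetic is easy to double-check mechanically. A minimal Python sketch, using only the flag string copied from the synopsis above:

```python
import string

# Flag string copied from the macOS ls synopsis quoted above.
flags = "ABCFGHLOPRSTUW@abcdefghiklmnopqrstuwx1"

lower = [c for c in flags if c in string.ascii_lowercase]
upper = [c for c in flags if c in string.ascii_uppercase]
other = [c for c in flags if c not in string.ascii_letters]

# Lowercase letters that are NOT used as flags.
missing_lower = sorted(set(string.ascii_lowercase) - set(lower))

print(len(lower), len(upper), len(other), len(flags))  # 22 14 2 38
print("".join(missing_lower))                          # jvyz
```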

To see if ls is an aberration or if it's normal to have commands that do this much stuff, we can look at some common commands, sorted by frequency of use.

command    1979  1996  2015  2017
ls           11    42    58    58
rm            3     7    11    12
mkdir         0     4     6     7
mv            0     9    13    14
cp            0    18    30    32
cat           1    12    12    12
pwd           0     2     4     4
chmod         0     6     9     9
echo          1     4     5     5
man           5    16    39    40
which         -     0     1     1
sudo          -     0    23    25
tar          12    53   134   139
touch         1     9    11    11
clear         -     0     0     0
find         14    57    82    82
ln            0    11    15    16
ps            4    22    85    85
ping          -    12    12    29
kill          1     3     3     3
ifconfig      -    16    25    25
chown         0     6    15    15
grep         11    22    45    45
tail          1     7    12    13
df            0    10    17    18
top           -     6    12    14

This table has the number of command line options for various commands for v7 Unix (1979), Slackware 3.1 (1996), Ubuntu 12 (2015), and Ubuntu 17 (2017). In the original rendering, cells are darker and blue-er when they have more options (log scale) and are greyed out if no command was found; greyed-out cells are shown here as -.

We can see that the number of command line options has dramatically increased over time; entries tend to get darker going to the right (more options) and there are no cases where entries get lighter (fewer options).
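
The "never gets lighter" claim can be checked directly against the numbers. The sketch below is only a sanity check: each row lists the counts that are present in the table, in chronological order, so it doesn't depend on which column a missing cell belongs to.

```python
# Option counts copied from the table, oldest system first.
# Rows with three numbers are commands missing from one of the systems.
counts = {
    "ls": [11, 42, 58, 58], "rm": [3, 7, 11, 12], "mkdir": [0, 4, 6, 7],
    "mv": [0, 9, 13, 14], "cp": [0, 18, 30, 32], "cat": [1, 12, 12, 12],
    "pwd": [0, 2, 4, 4], "chmod": [0, 6, 9, 9], "echo": [1, 4, 5, 5],
    "man": [5, 16, 39, 40], "which": [0, 1, 1], "sudo": [0, 23, 25],
    "tar": [12, 53, 134, 139], "touch": [1, 9, 11, 11], "clear": [0, 0, 0],
    "find": [14, 57, 82, 82], "ln": [0, 11, 15, 16], "ps": [4, 22, 85, 85],
    "ping": [12, 12, 29], "kill": [1, 3, 3, 3], "ifconfig": [16, 25, 25],
    "chown": [0, 6, 15, 15], "grep": [11, 22, 45, 45], "tail": [1, 7, 12, 13],
    "df": [0, 10, 17, 18], "top": [6, 12, 14],
}

def never_decreases(row):
    """True if every count is at least as large as the one before it."""
    return all(a <= b for a, b in zip(row, row[1:]))

decreasing = [cmd for cmd, row in counts.items() if not never_decreases(row)]
grew = [cmd for cmd, row in counts.items() if row[-1] > row[0]]

print(decreasing)                      # [] -> no command ever lost options
print(len(grew), "of", len(counts))    # 25 of 26 (clear stays at 0)
```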

McIlroy has long decried the increase in the number of options, size, and general functionality of commands:

Everything was small and my heart sinks for Linux when I see the size [inaudible]. The same utilities that used to fit in eight k[ilobytes] are a meg now. And the manual page, which used to really fit on, which used to really be a manual page, is now a small volume with a thousand options... We used to sit around in the UNIX room saying "what can we throw out? Why is there this option?" It's usually, it's often because there's some deficiency in the basic design -- you didn't really hit the right design point. Instead of putting in an option, figure out why, what was forcing you to add that option. This viewpoint, which was imposed partly because there was very small hardware ... has been lost and we're not better off for it.

Ironically, one of the reasons for the rise in the number of command line options is another McIlroy dictum, "Write programs to handle text streams, because that is a universal interface" (see ls for one example of this).

If structured data or objects were passed around, formatting could be left to a final formatting pass. But, with plain text, the formatting and the content are intermingled; because formatting can only be done by parsing the content out, it's common for commands to add formatting options for convenience. Alternatively, formatting can be done when the user leverages their knowledge of the structure of the data and encodes that knowledge into arguments to cut, awk, sed, etc. (also using their knowledge of how those programs handle formatting; it's different for different programs and the user is expected to, for example, know how cut -f4 is different from awk '{ print $4 }'). That's a lot more hassle than passing in one or two arguments to the last command in a sequence and it pushes the complexity from the tool to the user.
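
To make the cut -f4 versus awk '{ print $4 }' difference concrete, here is a rough Python model of the default field splitting of each tool. The function names are invented for this sketch and it ignores most of what the real tools do: cut splits on every single TAB, while awk splits on runs of whitespace.

```python
def cut_field(line, n, delimiter="\t"):
    """Rough model of `cut -fN`: split on every single delimiter (TAB by default)."""
    if delimiter not in line:
        return line  # cut passes lines without the delimiter through unchanged
    fields = line.split(delimiter)
    return fields[n - 1] if n <= len(fields) else ""

def awk_field(line, n):
    """Rough model of `awk '{ print $N }'`: split on runs of whitespace."""
    fields = line.split()
    return fields[n - 1] if n <= len(fields) else ""

line = "a b\tc d\te\tf g"
print(repr(cut_field(line, 4)))    # 'f g'  (4th TAB-separated field)
print(repr(awk_field(line, 4)))    # 'd'    (4th whitespace-separated field)

spaced = "one two three four"
print(repr(cut_field(spaced, 4)))  # 'one two three four'  (no TAB in the line)
print(repr(awk_field(spaced, 4)))  # 'four'
```

The same "fourth field" request gives different answers on the same input, which is exactly the kind of per-tool knowledge the user ends up carrying around.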

I've heard people say that there isn't really any alternative to this kind of complexity for command line tools, but people who say that have never really tried the alternative, something like PowerShell. I have plenty of complaints about PowerShell, but passing structured data around and easily being able to operate on structured data without having to hold metadata information in my head so that I can pass the appropriate metadata to the right command line tools at the right places in the pipeline isn't among my complaints.
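
As a loose analogy (this is Python, not PowerShell, and the record type and field names are invented for illustration), a pipeline over structured records might look like the sketch below. Filtering and sorting refer to fields by name, and only the last stage knows anything about text layout.

```python
from dataclasses import dataclass

@dataclass
class Entry:
    # Hypothetical record type; the fields are assumptions for this sketch.
    name: str
    size: int
    owner: str

def larger_than(entries, min_size):
    """Filter stage: works on the size field, no text parsing involved."""
    return [e for e in entries if e.size > min_size]

def sorted_by(entries, field):
    """Sort stage: the field is named, not located by column position."""
    return sorted(entries, key=lambda e: getattr(e, field))

def render(entries):
    """Final formatting pass: the only stage that produces text."""
    return "\n".join(f"{e.size:>8}  {e.owner:<8}  {e.name}" for e in entries)

entries = [
    Entry("notes with spaces.txt", 1200, "alice"),
    Entry("a.out", 90000, "bob"),
    Entry("todo", 300, "alice"),
]

result = sorted_by(larger_than(entries, 1000), "size")
print(render(result))
```

Note that the file name containing spaces needs no special handling anywhere in the pipeline.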

The sleight of hand that's happening when someone says that we can keep software simple and compatible by making everything handle text is the pretense that text data doesn't have a structure that needs to be parsed. In some cases, we can just think of everything as a single space-separated line, or maybe a table with some row and column separators that we specify (with some behavior that isn't consistent across tools, of course). That adds some hassle when it works, and then there are the cases where serializing data to a flat text format adds considerable complexity since the structure of data means that simple flattening requires significant parsing work to re-ingest the data in a meaningful way.

Another reason commands now have more options is that people have added convenience flags for functionality that could have been done by cobbling together a series of commands. These go all the way back to v7 Unix, where ls has an option to reverse the sort order (which could have been done by passing the output to tac).
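
A small sketch of the flattening problem, using made-up records: once a field can contain the separator, a consumer that picks "field 2" silently gets the wrong answer, and recovering the data requires extra knowledge about the layout.

```python
# Hypothetical records: (name, size). One name contains a space.
records = [("report.txt", "120"), ("my notes.txt", "45")]

# Flatten to space-separated text, one record per line.
flat = "\n".join(f"{name} {size}" for name, size in records)

# A consumer that assumes "the second field is the size" is silently wrong.
naive_sizes = [line.split(" ")[1] for line in flat.splitlines()]

# Recovering the data needs out-of-band knowledge: here, that size is always last.
recovered = [tuple(line.rsplit(" ", 1)) for line in flat.splitlines()]

print(naive_sizes)           # ['120', 'notes.txt']
print(recovered == records)  # True
```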

Over time, more convenience options have been added. For example, to pick a command that originally had zero options, mv can move and create a backup (three options; two are different ways to specify a backup, one of which takes an argument and the other of which takes zero explicit arguments and reads an implicit argument from the VERSION_CONTROL environment variable; one option allows overriding the default backup suffix). mv now also has options to never overwrite and to only overwrite if the file is newer.

mkdir is another program that used to have no options where, excluding security things for SELinux or SMACK as well as help and version options, the added options are convenience flags: setting the permissions of the new directory and making parent directories if they don't exist.

If we look at tail, which originally had one option (-number, telling tail where to start), it's added both formatting and convenience options. For formatting, it has -z, which makes the line delimiter NUL instead of a newline. Some examples of convenience options are -f to print when there are new changes, -s to set the sleep interval between checking for -f changes, and --retry to retry if the file isn't accessible.
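
To illustrate why a delimiter option like -z earns its place, here is a simplified Python model of "give me the last n records" with a configurable delimiter. The function name and the sample input are made up for this sketch; it is not how tail is implemented.

```python
def tail_records(data: bytes, n: int, delimiter: bytes = b"\n"):
    """Return the last n delimiter-terminated records, roughly like tail -n."""
    records = data.split(delimiter)
    if records and records[-1] == b"":
        records.pop()  # a trailing delimiter terminates the last record
    return records[-n:] if n > 0 else []

# Hypothetical input: three NUL-terminated file names, one containing a newline.
names = b"a.txt\x00odd\nname.txt\x00b.txt\x00"

# Newline-delimited view: the records are cut in the wrong places.
print(tail_records(names, 2))
# NUL-delimited view, as with -z: the last two names come back intact.
print(tail_records(names, 2, delimiter=b"\x00"))
```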

McIlroy says "we're not better off" for having added all of these options but I'm better off. I've never used some of the options we've discussed and only rarely use others, but that's the beauty of command line options -- unlike with a GUI, adding these options doesn't clutter up the interface.

This isn't to say there's no cost to adding options -- more options means more maintenance burden, but that's a cost that maintainers pay to benefit users, which isn't obviously unreasonable considering the ratio of maintainers to users. This is analogous to Gary Bernhardt's comment that it's reasonable to practice a talk fifty times since, if there's a three hundred person audience, the ratio of time spent practicing to time spent watching the talk will still only be 1:6. In general, this ratio will be even more extreme with commonly used command line tools.

Methodology for table

Command frequencies were sourced from public command history files on GitHub, not necessarily representative of your personal usage. Only "simple" commands were kept, which ruled out things like curl, git, gcc (which has > 1000 options), and wget. What's considered simple is arbitrary. Shell builtins, like cd, weren't included.

Repeated options aren't counted as separate options. For example, git blame -C, git blame -C -C, and git blame -C -C -C have different behavior, but these would all be counted as a single argument even though -C -C is effectively a different argument from -C.

The table counts sub-options as a single option. For example, ls has the following:

--format=WORD across -x, commas -m, horizontal -x, long -l, single-column -1, verbose -l, vertical -C

Even though there are seven format options, this is considered to be only one option.

Options that are explicitly listed as not doing anything are still counted as options, e.g., ls -g, which reads "Ignored; for Unix compatibility.", is counted as an option.

Multiple versions of the same option are also considered to be one option. For example, with ls, -A and --almost-all are counted as a single option.
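
The counting rules so far can be summarized in a few lines. This is only an illustration of the rules, not the procedure used to build the table: the list of entries is a hand-picked excerpt, and the helper name is invented.

```python
# Hand-picked excerpt of ls option entries, one entry per documented option.
entries = [
    "-a, --all",
    "-A, --almost-all",   # two spellings, one option
    "--format=WORD",      # seven possible WORD values, still one option
    "-g",                 # documented as ignored, still counted
]

def spellings(entry):
    """All spellings of one option, e.g. '-A, --almost-all' -> ['-A', '--almost-all']."""
    return [s.split("=")[0].strip() for s in entry.split(",")]

option_count = len(entries)                                # the table's rule
spelling_count = sum(len(spellings(e)) for e in entries)   # counting every alias

print(option_count, spelling_count)  # 4 6
```

Counting every spelling or every sub-option separately would inflate the numbers, so the table's figures are, if anything, conservative.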

In cases where the manpage says an option is supposed to exist, but doesn't, the option isn't counted. For example, the v7 mv manpage says

BUGS

If file1 and file2 lie on different file systems, mv must copy the file and delete the original. In this case the owner name becomes that of the copying process and any linking relationship with other files is lost.

Mv should take -f flag, like rm, to suppress the question if the target exists and is not writable.

-f isn't counted as a flag in the table because the option doesn't actually exist.

The latest year in the table is 2017 because I wrote the first draft for this post in 2017 and didn't get around to cleaning it up until 2020.

Thanks to Leah Hanson, Hillel Wayne, and Wesley Aptekar-Cassels for comments/corrections/discussion.

