Anybody can write good bash (with a little effort)

Jan 23, 2020

Tags: programming, rant

What this post is

A gentle admonishment to ~~use shell scripts where appropriate~~ accept that shell scripts will appear in your codebases, and to lean heavily on automated tools, modern features, safety rails, and best practices whenever possible.

Shell programming is a popular and predictable target of ire in programming communities: virtually everybody has a horror story about a vintage, broken, or monstrous shell script underpinning a critical component of their development environment or project.

Personal favorites include:

  • The ominous run.sh, which regularly:

    1. Runs something, somewhere
    2. Lacks the executable bit
    3. Doesn't specify its shell with a shebang
    4. Expects to be run as a particular user, or runs itself again in a different context
    5. Does very bad things if run from the wrong directory
    6. May or may not fork
    7. May or may not write a pidfile correctly, or at all
    8. May or may not check that pidfile, and subsequently clobber itself on the next run
    9. All of the above
  • make.sh (or build.sh, or compile.sh, or ...), which:

    1. Doesn't understand CC, CXX, CFLAGS, or any other standard build environment variable
    2. Gets clever and tries to implement its own preprocessor
    3. Contains a half-baked -j implementation
    4. Contains a half-baked make implementation, including (broken) install and clean targets
    5. Decides that it knows better than you where it should be installed
    6. Fails if anything, anywhere has a space in it
    7. Leaves the build in an indeterminate state if interrupted
    8. Happily chugs along after a command fails, leaving the build undiagnosable
    9. All of the above
  • test.sh, which:

    1. Expects to be run in some kind of virtual environment (a venv, a container, a folder containing a bundle, &c)
    2. ...tries to install and/or configure and/or load that virtual environment if not run inside it correctly, usually breaking more things
    3. Incorrectly detects that it is or isn't in the environment it wants, and tries to do the wrong thing
    4. Masks and/or ignores the exit codes of the test runner(s) it invokes internally
    5. Swallows and/or clobbers the output of the runner(s) it invokes internally
    6. Contains a half-baked unit test implementation that doesn't clean up intermediates or handle signals correctly
    7. Gets really clever with colored output and doesn't bother to check isatty
    8. All of the above
  • env.sh, which:

    1. May or may not actually be a shell script
    2. May or may not be eval'd into a shell process of indeterminate privilege and state somewhere in your stack
    3. May or may not just be split on = in Python by your burnt-out DevOps person
    4. All of the above, at different stages and on different machines

I've experienced all of these, and am personally guilty of a (slight) majority of them. Despite that (and perhaps because of it) I continue to believe that shell scripts have an important (and irreplaceable) niche in my development cycle, and should occupy that same niche in yours.

I'll go through the steps I take to write (reliable, composable) bash below.

Basics

A bash script (i.e., a bash file that's meant to be run directly) doesn't end up in my codebases unless it:

  • Has the executable bit
  • Has a shebang and that shebang is #!/usr/bin/env bash
  • Has a top level comment that briefly explains its functionality
  • Has set -e (and ideally set -euo pipefail)
  • Can either:
    • Be run from any directory, or
    • Fail immediately and loudly if it isn't run from the correct directory (see the sketch just below)
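
A minimal header that satisfies these rules might look like this; the script name and the cd-to-its-own-directory check are illustrative choices, not prescriptions:

#!/usr/bin/env bash

# frobulate.sh: one-line summary of what this script does.

set -euo pipefail

# One way to enforce the directory rule: cd to the script's own directory,
# and fail loudly if that isn't possible.
cd "$(dirname "${BASH_SOURCE[0]}")" || { >&2 echo "Fatal: couldn't cd to script directory"; exit 1; }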

I also put two functions in (almost) every script:

function installed {
  cmd=$(command -v "${1}")

  [[ -n "${cmd}" ]] && [[ -f "${cmd}" ]]
  return ${?}
}

function die {
  >&2 echo "Fatal: ${@}"
  exit 1
}

These compose nicely with bash's conditional tests and operators (and each other) to give me easy sanity checks at the top of my scripts:

[[ "${BASH_VERSINFO[0]}" -lt 4 ]] && die "Bash >=4 required"

deps=(curl nc dig)
for dep in "${deps[@]}"; do
  installed "${dep}" || die "Missing '${dep}'"
done

Some other niceties:

  • I use shopt -s extglob and shopt -s globstar in some of my scripts, slightly preferring it over (simple) find invocations. Compare this find invocation:

    items=$(find . -name 'foo*' -o -name 'bar*')

    to the shorter (and process-spawn-free):

    items=(**/@(foo|bar)*)

    Linux Journal has a nice extended globbing reference here; globstar is explained in the GNU shopt documentation here.
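
    A slightly fuller sketch of the same idea; note that nullglob is my own addition here (so a pattern with no matches expands to an empty array rather than to itself), not something the post prescribes:

    # Enable extended globs, **, and empty expansion for non-matching patterns
    shopt -s extglob globstar nullglob

    items=(**/@(foo|bar)*)
    for item in "${items[@]}"; do
      echo "found: ${item}"
    done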

Automated linting and formatting

In terms of popularity and functionality, shellcheck reigns supreme. Going by its changelog, shellcheck has been around for a little under 7 years. It's also available in just about every package manager.

As of 0.7.0, shellcheck can even auto-generate (unified-format) patches for some problems:

shellcheck -f diff my_script.sh | patch

It also includes a (sadly optional) check for my personal bugbear: non-mandatory variable braces:

# Bad
foo="$bar"
stuff="$# $? $$ $_"

# Good
foo="${bar}"
stuff="${#} ${?} ${$} ${_}"

shellcheck also doesn't complain about usage of [ (instead of [[), even when the shell is explicitly GNU bash.

There's also bashate and mvdan/sh, neither of which I've used.

Environment variables, not flags

In the past, I've used the shift builtin and getopt (sometimes at the same time) to do flag parsing. I've mostly given up on that, and have switched to the following pattern (a short sketch follows the list):

  • Boolean and trivial flags are passed via environment variables:

    VERBOSE=1 STAMP=$(date +%s) frobulate-website

    I find this substantially easier to read and remember than flags (did I use -v or -V for verbose in this script?), and it allows me to use this nice syntax for defaults:

    VERBOSE=${VERBOSE:-0}
    STAMP=${STAMP:-$(date +%s)}
  • Where possible, stdin, stdout, and stderr are used instead of dedicated positional files:

    VERBOSE=1 DEBUG=1 frobulate-website < /var/www/index.html > /var/www/index2.html
  • The only parameters are positional ones, and should generally conform to a variable-argument pattern (i.e., program <arg> [arg ...]).

  • -h and -v are only added if the program has non-trivial argument handling and is expected to be (substantially) revised in the future.

    • I generally prefer not to implement -v at all, favoring a line in the header of -h's output instead.
    • Running the command with no arguments is treated as equivalent to -h.
  • All other kinds of flags, inputs, and mechanisms (including getopt ) are only used as a last resort.
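
Put together, a hypothetical frobulate-website written in this style might look roughly like the sketch below (the script name and the sed expression are illustrative, not from the original post):

#!/usr/bin/env bash

# frobulate-website: reads HTML on stdin, writes "frobulated" HTML to stdout.
# Trivial knobs arrive as environment variables rather than flags.

set -euo pipefail

VERBOSE=${VERBOSE:-0}
STAMP=${STAMP:-$(date +%s)}

if [[ "${VERBOSE}" -eq 1 ]]; then
  >&2 echo "frobulating with stamp ${STAMP}"
fi

# stdin in, stdout out; no positional files to manage
sed -e "s/@STAMP@/${STAMP}/g"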

Compose liberally

Don't be afraid of composing pipes and subshells:

# Combine the outputs of two `stage-one` invocations into
# a single pipeline for `stage-two`
(stage-one foo && stage-one bar) | stage-two

Or of using code blocks to group operations:

# Code blocks aren't subshells, so `exit` works as expected
risky-thing || { >&2 echo "risky-thing didn't work!"; exit 1; }

Subshells and blocks can be used in many of the same contexts; which one you use should depend on whether you need an independent temporary shell or not:

# Both of these work, but the latter preserves the variables

(read line1 && read line2 && echo "${line1} vs. ${line2}") < "${some_input}"
# line1 and line2 are undefined

{ read line1 && read line2 && echo "${line1} vs. ${line2}"; } < "${some_input}"
# line1 and line2 are defined and contain their last values

Note the slight syntactic differences: blocks require spacing and a final semicolon (when on a single line).

Use process substitution to avoid temporary file creation and management:

Bad:

function cleanup {
  rm -f /tmp/foo-*
}

output=$(mktemp -t foo-XXXXXX)
trap cleanup EXIT

first-stage > "${output}"
second-stage --some-annoying-input-flag "${output}"

Good:

second-stage --some-annoying-input-flag <(first-stage)

You can also use them to cleanly process stderr :

# Drop `big-task`'s stdout and redirect its stderr to a substituted process
(big-task > /dev/null) 2> >(sed -ne '/^EMERG: /p')

Roundup

The shell is a particularly bad programming language that is particularly easy to write (unsafe, unreadable) code in.

It's also a particularly effective language with idioms and primitives that are hard to (tersely, faithfully) reproduce in objectively better languages.

It's also not going anywhere anytime soon: according to sloccount, kubernetes@e41bb32 has 28055 lines of shell in it.

The moral of the story: shell is going to sneak into your projects. You should be prepared with good practices and good tooling for when it does.

If you somehow manage to keep it out of your projects, people will use shell to deploy your projects or to integrate it into their projects. You should be prepared to justify your project's behavior and (non-)conformity to the (again, objectively bad) status quo of UNIX-like environments for when they come knocking.

