source |all docs for version 0.8.pre1 |all versions |oilshell.org
Simple Word Evaluation in Unix Shell
This document describes Oil's word evaluation semantics (shopt -s simple_word_eval) for experienced shell users. It may also be useful to those who want to implement this behavior in another shell.
The main idea is that Oil behaves like a traditional programming language:
- It's parsed from start to end in a single pass.
- It's evaluated in a single step too.
That is, parsing and evaluation aren't interleaved, and code and data aren't confused.
Table of Contents
An Analogy: Word Expressions Should Be Like Arithmetic Expressions
No Implicit Splitting, Dynamic Globbing, or Empty Elision
Splicing, Static Globbing, and Brace Expansion
Where These Rules Apply
Opt In to the Old Behavior With Explicit Expressions
More Word Evaluation Issues
Arithmetic Is Statically Parsed
Tip: View the Syntax Tree With -n
An Analogy: Word Expressions Should Be Like Arithmetic Expressions
In Oil, "word expressions" like
    $x "hello $name" $(hostname) 'abc'$x${y:-${z//pat/replace}}"$(echo hi)$((a[i] * 3))"
are parsed and evaluated in a straightforward way, like this expression when x == 2:
    1 + x / 2 + x * 3  →  8  # Python, JS, Ruby, etc. work this way
In contrast, in shell, words are "expanded" in multiple stages, like this:
    1 + "x / 2 + \"x * 3\""  →  8  # Hypothetical, confusing language
That is, it would be odd if Python looked inside a program's strings for expressions to evaluate, but that's exactly what shell does! There are multiple places where there's a silent eval, and you need quoting to inhibit it. Neglecting this can cause security problems due to confusing code and data (links below).
In other words, the defaults are wrong. Programmers are surprised by shell's behavior, and it leads to incorrect programs.
So in Oil, you can opt out of the multiple "word expansion" stages described in the POSIX shell spec. Instead, there's only one stage: evaluation.
Design Goals
The new semantics should be easily adoptable by existing shell scripts.
- Importantly, bin/osh is POSIX-compatible and runs real bash scripts. You can gradually opt into stricter and saner behavior with shopt options (or by running bin/oil). The most important one is simple_word_eval, and the others are listed below.
- Even after opting in, the new syntax shouldn't break many scripts. If it does break, the change to fix it should be small. For example, echo @foo is not too common, and it can be made bash-compatible by quoting it: echo '@foo'.
Examples
In the following examples, the argv command prints the argv array it receives in a readable format:
    $ argv one "two three"
    ['one', 'two three']
I also use Oil's var keyword for assignments. (TODO: This could be rewritten with shell assignment for the benefit of shell implementers.)
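The argv command used in these examples is a helper rather than a standard utility. To try the examples in an ordinary POSIX shell, a rough stand-in could be sketched as a shell function. This is only an assumption about the output format: it doesn't escape quotes inside arguments, and the variable names argv_out, argv_sep, and argv_arg are made up for illustration.

```shell
# Hypothetical stand-in for the argv helper: prints its arguments as a
# bracketed, comma-separated list so that argument boundaries are visible.
argv() {
  argv_out='['
  argv_sep=''
  for argv_arg in "$@"; do
    argv_out="$argv_out$argv_sep'$argv_arg'"
    argv_sep=', '
  done
  printf '%s]\n' "$argv_out"
}

argv one "two three"   # prints ['one', 'two three']
```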
No Implicit Splitting, Dynamic Globbing, or Empty Elision
In Oil, the following constructs always evaluate to one argument:
- Variable / "parameter" substitution: $x, ${y}
- Command sub: $(echo hi) or backticks
- Arithmetic sub: $(( 1 + 2 ))
That is, quotes aren't necessary to avoid:
- Word Splitting, which uses $IFS.
- Empty Elision. For example, x=''; ls $x passes ls no arguments.
- Dynamic Globbing. Globs are dynamic when the pattern comes from program data rather than the source code.
Here's an example showing that each construct evaluates to one arg in Oil:
    oil$ var pic = 'my pic.jpg'  # filename with spaces
    oil$ var empty = ''
    oil$ var pat = '*.py'  # pattern stored in a string
    oil$ argv ${pic} $empty $pat $(cat foo.txt) $((1 + 2))
    ['my pic.jpg', '', '*.py', 'contents of foo.txt', '3']
In contrast, shell applies splitting, globbing, and empty elision after the substitutions. Each of these operations returns an indeterminate number of strings:
    sh$ pic='my pic.jpg'  # filename with spaces
    sh$ empty=
    sh$ pat='*.py'  # pattern stored in a string
    sh$ argv ${pic} $empty $pat $(cat foo.txt) $((1 + 2))
    ['my', 'pic.jpg', 'a.py', 'b.py', 'contents', 'of', 'foo.txt', '3']
To get the desired behavior, you have to use double quotes:
    sh$ argv "${pic}" "$empty" "$pat" "$(cat foo.txt)" "$((1 + 2))"
    ['my pic.jpg', '', '*.py', 'contents of foo.txt', '3']
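The three implicit operations can also be observed one at a time by counting arguments. The sketch below is plain POSIX shell, not Oil; it assumes the default $IFS, uses a hypothetical count_args helper, and creates a scratch directory with mktemp -d (which it leaves behind) so that the glob has something to match.

```shell
# Hypothetical helper: print how many arguments were received.
count_args() { echo "$#"; }

pic='my pic.jpg'
empty=''

count_args $pic       # 2: the value is split on the space
count_args "$pic"     # 1
count_args $empty     # 0: the empty word is elided
count_args "$empty"   # 1: an empty string is still an argument

# Dynamic globbing: the pattern lives in a variable, not in the program text.
workdir=$(mktemp -d)
touch "$workdir/a.py" "$workdir/b.py"
cd "$workdir" || exit 1
pat='*.py'
count_args $pat       # 2: expands to a.py and b.py
count_args "$pat"     # 1: the literal string *.py
```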
Splicing, Static Globbing, and Brace Expansion
The constructs in the last section evaluate to a single argument. In contrast, these three constructs evaluate to 0 to N arguments:
- Splicing an array: "$@" and "${myarray[@]}"
- Static Globbing: echo *.py. Globs are static when they occur in the program text.
- Brace expansion: {alice,bob}@example.com
In Oil, shopt -s parse_at enables these shortcuts for splicing:
- @myarray for "${myarray[@]}"
- @ARGV for "$@"
Example:
    oil$ var myarray = @('a b' c)  # array with 2 elements
    oil$ set -- 'd e' f  # 2 arguments
    oil$ argv @myarray @ARGV *.py {ian,jack}@sh.com
    ['a b', 'c', 'd e', 'f', 'g.py', 'h.py', 'ian@sh.com', 'jack@sh.com']
is just like:
    bash$ myarray=('a b' c)
    bash$ set -- 'd e' f
    bash$ argv "${myarray[@]}" "$@" *.py {ian,jack}@sh.com
    ['a b', 'c', 'd e', 'f', 'g.py', 'h.py', 'ian@sh.com', 'jack@sh.com']
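The "0 to N" part is easiest to see with the positional parameters, which work in any POSIX shell. A small sketch, again with a hypothetical count_args helper and the default $IFS:

```shell
# Hypothetical helper: print how many arguments were received.
count_args() { echo "$#"; }

set -- 'd e' f
count_args "$@"    # 2: each positional parameter stays one argument
count_args $@      # 3: unquoted, 'd e' is split again
count_args "$*"    # 1: everything is joined into a single string

set --             # no positional parameters
count_args "$@"    # 0: splicing an empty list yields no arguments
```

Only the quoted "$@" form preserves the element boundaries, which is the behavior that @ARGV is shorthand for.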
Unchanged: quotes disable globbing and brace expansion:
    $ echo *.py
    foo.py bar.py
    $ echo "*.py"  # globbing disabled with quotes
    *.py
    $ echo {spam,eggs}.sh
    spam.sh eggs.sh
    $ echo "{spam,eggs}.sh"  # brace expansion disabled with quotes
    {spam,eggs}.sh
Where These Rules Apply
These rules apply when a sequence of words is being evaluated, exactly as in shell:
- Command: echo $x foo
- For loop: for i in $x foo; do ...
- Array Literals: a=($x foo) and var a = @($x foo) (oil-array)
Shell has other word evaluation contexts like:
    sh$ x="${not_array[@]}"
    sh$ echo hi > "${not_array[@]}"
which aren't affected by simple_word_eval.
Opt In to the Old Behavior With Explicit Expressions
Oil can express everything that shell can.
- Split with @split(mystr)
- Glob with @glob(mypat) (not implemented)
- Elide an empty string by converting it to an array of length 0 or 1, then splice that array into a command. See the example in the last section.
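For readers without Oil at hand, the idea of splitting only when asked can be approximated in POSIX shell. The sketch below is not how @split is implemented; split_words is a hypothetical helper name, and it assumes splitting on spaces only.

```shell
# Hypothetical helper: split a string on spaces, explicitly, and print one
# field per line. The function body is a subshell, so the changes to IFS
# and the 'set -f' option don't leak into the caller.
split_words() (
  set -f        # turn off globbing so that only splitting happens
  IFS=' '       # split on spaces only (an assumption of this sketch)
  for field in $1; do
    printf '%s\n' "$field"
  done
)

mystr='a.txt  *.py b.txt'
split_words "$mystr"   # three lines: a.txt, *.py, b.txt
```

The point of the design is that the split is visible at the call site, instead of happening implicitly wherever a variable is left unquoted.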
More Word Evaluation Issues
More shopt Options
- nullglob - Globs matching nothing don't evaluate to code.
- dashglob is true by default, but disabled when Oil is enabled, so that files that begin with - aren't returned. This avoids confusing flags and files.
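To see why filenames that start with a dash are a hazard, consider how an ordinary POSIX shell treats them today. The following sketch (the scratch directory and the file name -n are made up for illustration) shows a glob returning a name that a command could mistake for a flag, along with the common ./ workaround:

```shell
workdir=$(mktemp -d)
cd "$workdir" || exit 1
touch -- -n          # a file whose name looks like a flag

set -- *             # the glob returns the bare name
first_plain=$1       # -n
set -- ./*           # prefixing the pattern keeps the name unambiguous
first_prefixed=$1    # ./-n

printf '%s\n' "$first_plain" "$first_prefixed"
```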
Strict options cause fatal errors:
- strict_tilde - Failed tilde expansions don't evaluate to code.
- strict_word_eval - Invalid slices and invalid UTF-8 aren't ignored.
Arithmetic Is Statically Parsed
This is an intentional incompatibility described in the Known Differences doc.
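As a reminder of what dynamic parsing means for arithmetic: POSIX shells expand $x inside $(( )) as text before the expression is parsed, so the data in a variable can change the structure of the expression. A small sketch of that traditional behavior (not of Oil's):

```shell
x='1 + 2'

# $x is pasted in as text first, so this is parsed as 1 + 2 * 3.
pasted=$(( $x * 3 ))
echo "$pasted"    # 7

# A reader expecting x to act like a single value would have to write:
grouped=$(( ($x) * 3 ))
echo "$grouped"   # 9
```

With static parsing, the shape of the expression is fixed by the source text rather than by the contents of x.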
Summary
Oil word evaluation is enabled with shopt -s simple_word_eval, and proceeds in a single step.
Variable, command, and arithmetic substitutions predictably evaluate to a single argument, regardless of whether they're empty or have spaces. There's no implicit splitting, globbing, or elision of empty words.
You can opt into those behaviors with explicit expressions like @split(mystr), which evaluates to an array.
Oil also supports shell features that evaluate to 0 to N arguments: splicing, globbing, and brace expansion.
There are other options that "clean up" word evaluation. All options are designed to be gradually adopted by other shells, shell scripts, and eventually POSIX.
Notes
Related Documents
- The Simplest Explanation of Oil. Some color on the rest of the language.
- Known Differences Between OSH and Other Shells. Mentioned above: Arithmetic is statically parsed. Arrays and strings are kept separate.
- OSH Word Evaluation Algorithm on the Wiki. Informally describes the data structures, and describes legacy constructs.
- Security implications of forgetting to quote a variable in bash/POSIX shells by Stéphane Chazelas. Describes the "implicit split+glob" operator, which Oil word evaluation removes.
- This is essentially the same security issue I rediscovered in January 2019. It appears in all ksh-derived shells, and some shells recently patched it. I wasn't able to exploit it in a "real" context; otherwise I'd have made more noise about it.
- Also described by the Fedora Security team: Defensive Coding: Shell Double Expansion
Tip: View the Syntax Tree With -n
This gives insight into how Oil parses shell:
    $ osh -n -c 'echo ${x:-default}$(( 1 + 2 ))'
    (C {<echo>}
      {
        (braced_var_sub
          token: <Id.VSub_Name x>
          suffix_op: (suffix_op.Unary op_id:Id.VTest_ColonHyphen arg_word:{<default>})
        )
        (word_part.ArithSub
          anode:
            (arith_expr.Binary
              op_id: Id.Arith_Plus
              left: (arith_expr.ArithWord w:{<Id.Lit_Digits 1>})
              right: (arith_expr.ArithWord w:{<Id.Lit_Digits 2>})
            )
        )
      }
    )
You can pass --ast-format text for more details.
Evaluation of the syntax tree is a single step.
Elision Example
The elision of empty strings from commands is verbose but we could simplify it with a builtin function if necessary:
    var x = ''  # empty in this case
    var tmp = @()
    if (x) {  # test if string is non-empty
      append :tmp $x  # appends 'x' to the array variable 'tmp'
    }
    argv a @tmp b
    # OUTPUT:
    # ['a', 'b']
When it's not empty:
    var x = 'X'
    ...
    argv a @tmp b
    # OUTPUT:
    # ['a', 'X', 'b']
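For comparison, plain POSIX shell has a compact idiom for the same effect: ${x:+"$x"} expands to nothing when x is empty or unset, and to the quoted value otherwise. This is existing shell syntax, not an Oil builtin, and count_args is a hypothetical helper:

```shell
# Hypothetical helper: print how many arguments were received.
count_args() { echo "$#"; }

x=''
count_args a ${x:+"$x"} b    # 2: nothing is passed for the empty string

x='X Y'
count_args a ${x:+"$x"} b    # 3: 'X Y' is passed as one argument
```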
Generated on Mon Feb 24 23:27:01 PST 2020