This is Chapter 7, Arrays and Hashes, from Creating a compiler with Raku.
In this chapter, we will extend the Lingua language with aggregate data types: arrays and hashes. From this point, we will call numbers and strings scalar variables.
Arrays
Arrays are collections of elements, which use their common variable name and are accessible via integer indices. Let us introduce the following syntax for array declaration:
my data[];
It uses the same my keyword as for declaring scalar variables (which can keep numbers or strings) and has two square brackets after the name. The variable-declaration grammar rule can now be split into two parts, one for arrays and one for scalars:
rule variable-declaration {
    'my' [
        | <array-declaration>
        | <scalar-declaration>
    ]
}
Arrays go first here, as their definition contains extra characters after the variable name and can be caught earlier.
Alternatively, we could introduce a new keyword, say arr, to define arrays instead of my, and simplify parsing at this point: arr data. But let us return to my choice, my data[], as it also has its own advantages when we come to initialisation, and it reduces the number of reserved keywords.
The previous rule for scalar variable declaration migrates to a separate rule:
rule array-declaration {
    <variable-name> '[' ']'
}
rule scalar-declaration {
    <variable-name> [ '=' <value> ]?
}
The new array-declaration rule requires a pair of square brackets and does not yet include an initialiser part.
In the actions, we also have to distinguish between arrays and scalars, and we can do it by checking the presence of the $<array-declaration> match object.
method variable-declaration($/) {
    if $<scalar-declaration> {
        %!var{$<scalar-declaration><variable-name>} =
            $<scalar-declaration><value>
                ?? $<scalar-declaration><value>.made
                !! 0;
    }
    elsif $<array-declaration> {
        %!var{$<array-declaration><variable-name>} = $[];
    }
}
It works, but it looks overloaded because of the nested match object keys. In fact, there is no need for that, because individual actions can be created for each case.
method scalar-declaration($/) {
    %!var{$<variable-name>} = $<value> ?? $<value>.made !! 0;
}
method array-declaration($/) {
    %!var{$<variable-name>} = $[];
}
With this change, the variable-declaration method is not needed anymore and can be removed from the LinguaActions class.
You can temporarily replace it with the following code just to see how the parser works with arrays:
method variable-declaration($/) {
    dd %!var;
}
The method displays what the variable storage contains after each variable declaration. Let us test this in action:
my x = 3;
say x;
my data[];
This program successfully compiles, and you can see how the %!var hash changes:
Hash %!var = {:x(3)}
3
Hash %!var = {:data($[]), :x(3)}
OK
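To try the split between array and scalar declarations in isolation, here is a minimal sketch. It is not the Lingua grammar: the names MiniDecl and MiniDeclActions, the TOP rule, and the simplified variable-name and value tokens (lowercase letters and unsigned integers only) are assumptions made for illustration.

```raku
# A stripped-down stand-in for the Lingua grammar: declarations only.
grammar MiniDecl {
    rule TOP { [ <variable-declaration> ';' ]* }
    rule variable-declaration {
        'my' [
            | <array-declaration>
            | <scalar-declaration>
        ]
    }
    rule array-declaration  { <variable-name> '[' ']' }
    rule scalar-declaration { <variable-name> [ '=' <value> ]? }
    token variable-name { <[a..z]>+ }   # simplified: lowercase letters only
    token value         { \d+ }         # simplified: unsigned integers only
}

class MiniDeclActions {
    has %.var;   # variable storage, as in the chapter

    # One action per kind of declaration, so no nested match keys are needed.
    method scalar-declaration($/) {
        %!var{~$<variable-name>} = $<value> ?? +$<value> !! 0;
    }
    method array-declaration($/) {
        %!var{~$<variable-name>} = $[];
    }
}

my $actions = MiniDeclActions.new;
MiniDecl.parse('my x = 3; my data[];', :$actions);

say $actions.var<x>;            # 3
say $actions.var<data>.^name;   # Array
```

Each declaration triggers exactly one of the two actions, which is why the combined variable-declaration action is no longer required.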
Assigning to an array item
Now that we can create an array, it is time to fill its elements with some data:
data[0] = 10;
data[1] = 20;
The assignment rule can be updated similarly to how we did it with string indexing in the previous chapter, by adding an optional integer index in square brackets:
rule assignment {
    <variable-name> [ '[' <integer> ']' ]? '=' <value>
}
In the corresponding action, the presence of the index indicates that we are working with an array, otherwise it is a scalar variable.
method assignment($/) {
    if $<integer> {
        %!var{~$<variable-name>}[+$<integer>] =
            $<value>.made;
    }
    else {
        %!var{~$<variable-name>} = $<value>.made;
    }
}
After you run the program with the above assignments, the data variable will keep two values in the storage:
Hash %!var = {:data($[10, 20])}
Side story: The joy of syntax
Before moving on towards more features for arrays and hashes, let us transform the grammar a bit. In the assignment method, the if/else check occupies more lines than the “useful” code. We can make two transformations to make the methods more compact.
First, let us repeat the trick with splitting a rule into two. Instead of one universal assignment rule, we can have two subrules:
rule assignment {
    | <array-item-assignment>
    | <scalar-assignment>
}
rule array-item-assignment {
    <variable-name> [ '[' <integer> ']' ] '=' <value>
}
rule scalar-assignment {
    <variable-name> '=' <value>
}
This makes the grammar more verbose, but the actions become more compact:
method array-item-assignment($/) {
    %!var{~$<variable-name>}[+$<integer>] = $<value>.made;
}
method scalar-assignment($/) {
    %!var{~$<variable-name>} = $<value>.made;
}
The second possible solution is to keep the original assignment rule and use the where clause in the method signatures to dispatch the calls depending on the content of the match object.
multi method assignment($/ where $<integer>) {
    %!var{~$<variable-name>}[+$<integer>] = $<value>.made;
}
multi method assignment($/ where !$<integer>) {
    %!var{~$<variable-name>} = $<value>.made;
}
The negative condition !$<integer> in the signature of the second variant of the multi-method is optional, but I’d prefer to keep it for clarity of the code.
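The where-based dispatch can be tried on its own with a small sketch like the one below. The MiniAssign grammar and MiniAssignActions class are hypothetical stand-ins with simplified tokens (integer values only), not the real Lingua classes.

```raku
# A reduced grammar with a single assignment rule and an optional index.
grammar MiniAssign {
    rule TOP { [ <assignment> ';' ]* }
    rule assignment {
        <variable-name> [ '[' <integer> ']' ]? '=' <value>
    }
    token variable-name { <[a..z]>+ }
    token integer       { \d+ }
    token value         { \d+ }   # simplified: integers only
}

class MiniAssignActions {
    has %.var;

    # Chosen when the match object contains an index.
    multi method assignment($/ where $<integer>) {
        %!var{~$<variable-name>}[+$<integer>] = +$<value>;
    }
    # Chosen when there is no index: a plain scalar assignment.
    multi method assignment($/ where !$<integer>) {
        %!var{~$<variable-name>} = +$<value>;
    }
}

my $actions = MiniAssignActions.new;
MiniAssign.parse('n = 5; data[0] = 10; data[1] = 20;', :$actions);

say $actions.var<n>;      # 5
say $actions.var<data>;   # [10 20]
```

The grammar stays a single rule, and the decision about which code runs moves entirely into the method signatures.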
There are two more actions that can be rewritten in the same manner. The value action:
multi method value($/ where $<expression>) {
    $/.make($<expression>.made);
}
multi method value($/ where $<string>) {
    $/.make($<string>.made);
}
The second action with a big if/elsif/else condition is expr:
multi method expr($/ where $<number>) {
    $/.make($<number>.made);
}
multi method expr($/ where $<string>) {
    $/.make($<string>.made);
}
multi method expr($/ where $<variable-name> && $<integer>) {
    $/.make(%!var{$<variable-name>}.substr(+$<integer>, 1));
}
multi method expr($/ where $<variable-name> && !$<integer>) {
    $/.make(%!var{$<variable-name>});
}
multi method expr($/ where $<expr>) {
    $/.make(process($<expr>, $<op>));
}
multi method expr($/ where $<expression>) {
    $/.make($<expression>.made);
}
These methods look so trivial now. Notice that some of the candidates check more than one key in the match object, for example: $<variable-name> && !$<integer>.
Accessing array elements
The next goal is to start using individual array items, for instance, as shown in the next fragment:
say data[0];
say data[1];
my n = data[0] * data[1];
say n;
Our current actions class already supports indexing strings, and that is exactly the place we have to extend:
multi method expr($/ where $<variable-name> && $<integer>) {
    if %!var{$<variable-name>} ~~ Array {
        $/.make(%!var{$<variable-name>}[+$<integer>]);
    }
    else {
        $/.make(%!var{$<variable-name>}.substr(
            +$<integer>, 1));
    }
}
The method checks the type of the variable stored in the %!var hash, and if it is an array, returns the requested element. The other branch works with strings as it did before.
The grammar can be simplified once again by extracting the sequence representing an array (and string) index to a separate rule:
rule index { '[' <integer> ']' }
Use the new rule inside assignment and inside expr when you take the value:
rule assignment {
    <variable-name> <index>? '=' <value>
}
. . .
multi rule expr(4) {
    | <number>
    | <variable-name> <index>?
    | '(' <expression> ')'
}
If you ever want to change the syntax of indexes, say, to data:3 instead of data[3], there is a single place to do that: the index rule.
The actions must be adapted too. The index’s attribute is an integer value:
method index($/) {
    $/.make(+$<integer>);
}
And thus you should use $<index>.made to read it from other methods:
multi method assignment($/ where $<index>) {
    %!var{~$<variable-name>}[$<index>.made] = $<value>.made;
}
multi method assignment($/ where !$<index>) {
    %!var{~$<variable-name>} = $<value>.made;
}
. . .
multi method expr($/ where $<variable-name> && $<index>) {
    if %!var{$<variable-name>} ~~ Array {
        $/.make(%!var{$<variable-name>}[$<index>.made]);
    }
    else {
        $/.make(%!var{$<variable-name>}.substr(
            $<index>.made, 1));
    }
}
multi method expr($/ where $<variable-name> && !$<index>) {
    $/.make(%!var{$<variable-name>});
}
Once again, redundant conditions such as !$<index> are used in the where clause to make the code more readable; the multi-method can be correctly dispatched without them.
List assignments
So far, arrays can be created but you have to assign their elements one by one. Let us allow list assignment and initialisation:
my data[] = 111, 222, 333;
data = 7, 9, 11;
A new syntax element, the comma, appears here. It does not clash with any other construct of the language, so it can be easily embedded into the grammar.
rule array-declaration {
    <variable-name> '[' ']' [ '=' <value>+ %% ',' ]?
}
rule assignment {
    <variable-name> <index>? '=' <value>+ %% ','
}
In both cases, the value rule is used, which means you can use numbers, strings, and arithmetic expressions as initialising values for the array elements:
my strings[] = "alpha", "beta", "gamma";
say strings[1]; # beta
my arr[] = 11, 3 * 4, 2 * (6 + 0.5);
say arr[0]; # 11
say arr[1]; # 12
say arr[2]; # 13
To implement it in actions, let’s make a helper method init-array that takes the name of the variable and the list of the values:
method init-array($variable-name, @values) {
    %!var{$variable-name} = $[];
    for @values -> $value {
        %!var{$variable-name}.push($value.made);
    }
}
multi method array-declaration($/ where $<value>) {
    self.init-array($<variable-name>, $<value>);
}
multi method assignment($/ where !$<index>) {
    if %!var{$<variable-name>} ~~ Array {
        self.init-array($<variable-name>, $<value>);
    }
    . . .
}
When making a new array, you can also type Array.new instead of $[].
Unlike, for example, the set of operator functions, the init-array routine is made a method, as it has to have access to the variable storage %!var.
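The following sketch shows how a quantifier with the %% separator collects all the values into a list that a helper like init-array can walk through. To stay independent of the rest of the Lingua grammar, it uses a hypothetical MiniList grammar built from tokens with explicit \s* for whitespace, and its values are plain integers, so the helper converts them with + instead of calling .made.

```raku
grammar MiniList {
    # Matches 'name[] = v1, v2, ...;' with explicit whitespace handling.
    token TOP {
        <variable-name> '[]' \s* '=' \s*
        <value>+ %% [ ',' \s* ]
        ';'
    }
    token variable-name { <[a..z]>+ }
    token value         { \d+ }   # simplified: integers only
}

class MiniListActions {
    has %.var;

    # Resets the array and fills it with the parsed values.
    method init-array($variable-name, @values) {
        %!var{$variable-name} = $[];
        %!var{$variable-name}.push(+$_) for @values;
    }
    method TOP($/) {
        # $<value> is a list of matches because of the + quantifier.
        self.init-array(~$<variable-name>, $<value>);
    }
}

my $actions = MiniListActions.new;
MiniList.parse('data[] = 111, 222, 333;', :$actions);

say $actions.var<data>;   # [111 222 333]
```

Note that %% also accepts a trailing separator, so a list such as 1, 2, with a dangling comma still matches; the single % form does not allow that.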
Printing arrays
Another thing that is really desirable for arrays is printing all their elements in a single instruction. Instead of listing separate items, we’d like to pass the whole array to the say function:
my data[] = 5, 7, 9;
say data;
In fact, Raku can already do that, because our implementation of say just passes the whole container to Raku’s say, which prints the data like this:
[5 7 9]
Let us be less humble and create our own output format by checking the type of the variable, as we did before:
method function-call($/) {
    my $object = $<value>.made;
    if $object ~~ Array {
        say $object.join(', ');
    }
    else {
        say $object;
    }
}
This function prints the array as a comma-separated list of its items:
5, 7, 9
Hashes
In the remaining part of this chapter, we will implement hashes in our Lingua language. You have seen most of the ideas in the example of arrays, so the changes should be transparent and obvious. We have to implement a few things: declaration, declaration with initialisation, assignment to the whole hash and to a single element, reading a single value, and printing the hash.
The following fragments demonstrate the syntax we use. To declare a hash, use a pair of curly braces after the name of the variable:
my data{};
Initialisation and assignments are done using a comma-separated list of key-value pairs. Keys are always strings; values can be any scalar value (numbers or strings). The separator between the key and the value is a colon:
my hash{} = "alpha" : 1, "beta": 2, "gamma": 3;
my days{};
days = "Mon": "work", "Sat": "rest";
The grammar includes a separate rule for hash declaration:
rule variable-declaration {
    'my' [
        | <array-declaration>
        | <hash-declaration>
        | <scalar-declaration>
    ]
}
rule hash-declaration {
    <variable-name> '{' '}' [
        '=' [ <string> ':' <value> ]+ %% ','
    ]?
}
The assignment rule should know how to deal with hashes. This time, the changes can be done in-place without creating new rules.
rule assignment {
    <variable-name> <index>? '=' [
        | [ <string> ':' <value> ]+ %% ','
        | <value>+ %% ','
    ]
}
In the actions, you have to carefully implement the declaration and hash assignment methods. They both use a common method, init-hash, to set the keys and the values of the hash.
method init-hash($variable-name, @keys, @values) {
    %!var{$variable-name} = Hash.new;
    while @keys {
        %!var{$variable-name}.{@keys.shift.made} = @values.shift.made;
    }
}
multi method hash-declaration($/) {
    self.init-hash($<variable-name>, $<string>, $<value>);
}
multi method assignment($/ where !$<index>) {
    . . .
    elsif %!var{$<variable-name>} ~~ Hash {
        self.init-hash($<variable-name>, $<string>, $<value>);
    }
    . . .
}
Another part of the hash implementation is accessing values using the keys. It is wise to reuse the index rule and make it a collection of two alternatives:
rule index {
    | <array-index>
    | <hash-index>
}
rule array-index {
    '[' <integer> ']'
}
rule hash-index {
    '{' <string> '}'
}
Use the new rules in the already existing methods. The where clauses receive an additional condition to make sure we caught the rule we want.
multi method assignment($/ where $<index> && $<index><array-index>) {
    %!var{$<variable-name>}[$<index>.made] = $<value>[0].made;
}
multi method assignment($/ where $<index> && $<index><hash-index>) {
    %!var{$<variable-name>}{$<index>.made} = $<value>[0].made;
}
multi method assignment($/ where !$<index>) {
    . . .
    elsif %!var{$<variable-name>} ~~ Hash {
        self.init-hash($<variable-name>, $<string>, $<value>);
    }
    . . .
}
The new index rule can already work in the expr(4) rule, which allows us to read the values by the given hash key using the hash{"key"} syntax. All we need is to update the expr method to let it know about the new data structure:
multi method expr($/ where $<variable-name> && $<index>) {
    . . .
    elsif %!var{$<variable-name>} ~~ Hash {
        $/.make(%!var{$<variable-name>}{$<index>.made});
    }
    . . .
}
Add the following lines to a test program and confirm that it works:
days{"Tue"} = "work";
say days{"Sat"};
Finally, teach the say function to print hashes:
method function-call($/) {
    . . .
    elsif $object ~~ Hash {
        my @str;
        for $object.keys.sort -> $key {
            @str.push("$key: $object{$key}");
        }
        say @str.join(', ');
    }
    . . .
}
If you are good at using the map method, try making a better version of the function. The expected output in response to say days; should be like this:
Mon: work, Sat: rest, Tue: work
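One possible answer to the map exercise is sketched below. The format-value routine is a hypothetical helper, not part of the Lingua sources; it returns the string instead of printing it, and it picks the formatting by the type of its argument in the same multi-dispatch spirit as the rest of the chapter.

```raku
# Arrays are printed as a comma-separated list of items.
multi sub format-value(Array $array) {
    $array.join(', ');
}

# Hashes are printed as 'key: value' pairs, ordered by key.
multi sub format-value(Hash $hash) {
    $hash.keys.sort.map(-> $key { "$key: $hash{$key}" }).join(', ');
}

# Anything else (numbers, strings) is simply stringified.
multi sub format-value($scalar) {
    ~$scalar;
}

my %days = Mon => 'work', Sat => 'rest', Tue => 'work';

say format-value(%days);        # Mon: work, Sat: rest, Tue: work
say format-value([5, 7, 9]);    # 5, 7, 9
say format-value(42);           # 42
```

With such a helper, the body of function-call could shrink to a single say of the formatted value; whether that is worth an extra routine is a matter of taste.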
Review and test
Let us take a look at the current state of the Lingua language. We have done a lot of work and implemented support for numbers, strings, arrays, and hashes. It is possible to change the content of the variables and print their values. Let us take another small step and allow variables as array indices or hash keys.
rule array-index {
    '[' [ <integer> | <variable-name> ] ']'
}
rule hash-index {
    '{' [ <string> | <variable-name> ] '}'
}
The corresponding actions can be transformed to pairs of trivial multi-methods:
multi method array-index($/ where !$<variable-name>) {
    $/.make(+$<integer>);
}
multi method array-index($/ where $<variable-name>) {
    $/.make(%!var{$<variable-name>});
}
multi method hash-index($/ where !$<variable-name>) {
    $/.make($<string>.made);
}
multi method hash-index($/ where $<variable-name>) {
    $/.make(%!var{$<variable-name>});
}
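Here is a small sketch of how a variable index is resolved through the variable storage. The MiniIndex grammar and MiniIndexActions class are hypothetical reductions of the Lingua code: they only parse a single expression of the form name[index], and the storage is pre-filled by hand rather than by declarations.

```raku
grammar MiniIndex {
    rule TOP { <variable-name> <array-index> }
    rule array-index {
        '[' [ <integer> | <variable-name> ] ']'
    }
    token variable-name { <[a..z]>+ }
    token integer       { \d+ }
}

class MiniIndexActions {
    has %.var;

    # The value of the whole expression is the indexed array element.
    method TOP($/) {
        make %!var{~$<variable-name>}[$<array-index>.made];
    }
    # A literal index is converted to a number.
    multi method array-index($/ where !$<variable-name>) {
        make +$<integer>;
    }
    # A variable index is looked up in the storage first.
    multi method array-index($/ where $<variable-name>) {
        make %!var{~$<variable-name>};
    }
}

# Pre-filled storage, as if 'my n = 5;' and 'my data[] = ...;' had run.
my $actions = MiniIndexActions.new(
    var => { n => 5, data => [2, 3, 5, 7, 11, 13, 17] }
);

say MiniIndex.parse('data[n]', :$actions).made;   # 13
say MiniIndex.parse('data[2]', :$actions).made;   # 5
```

Both candidates leave a plain number in the index's made attribute, so the code that uses the index does not need to know where the number came from.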
Refer to the repository to check that you have the correct files. If so, you will be able to run the following test program, which uses most of the features implemented so far.
# Illustrating the Pythagorean theorem
my a = 3;
my b = 4;
my c = 5;
my left = a**2 + b**2;
my right = c**2;
say "The hypotenuse of a rectangle triangle with the sides $a and $b is indeed $c, as $left = $right.";
/* Using floating-point numbers for
   computing the length of a circle */
my pi = 3.1415926;
my R = 7;
my c = 2 * pi * R;
say "The length of a circle of radius $R is $c.";
# A list of prime numbers
my n = 5;
my data[] = 2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31;
my nth = data[n];
say "$n th prime number is $nth.";
# Demonstrating the use of hashes
my countries{} = "France": "Paris", "Germany": "Berlin", "Italy": "Rome";
my country = "Italy";
my city = countries{country};
say "$city is the capital of $country.";
This program prints the following result.
$ ./lingua test22.lng
The hypotenuse of a rectangle triangle with the
sides 3 and 4 is indeed 5, as 25 = 25.
The length of a circle of radius 7 is 43.9822964.
5 th prime number is 13.
Rome is the capital of Italy.
OK
It is quite fascinating to see that the interpreter understands a program that never existed before. You wrote it, you can make lots of changes to it, and the program will still show the results that you expect.
In the next chapters, we will work on more complex aspects of the language.