Chapter 7. Arrays and Hashes


This is a chapter from Creating a compiler with Raku.

In this chapter, we will extend the Lingua language with aggregate data types: arrays and hashes. From this point, we will call numbers and strings scalar variables.

Arrays

Arrays are collections of elements, which use their common variable name and are accessible via integer indices. Let us introduce the following syntax for array declaration:

my data[];

It uses the same my keyword as for declaring scalar variables (which can keep numbers or strings) and has a pair of square brackets after the name. The variable-declaration grammar rule can now be split into two parts, one for arrays and one for scalars:

rule variable-declaration {
    'my' [
        | <array-declaration>
        | <scalar-declaration>
    ]
}

Arrays go first here, as their definition contains extra characters after the variable name and can be caught earlier.

Alternatively, we could introduce a new keyword, say arr, to define arrays instead of my, and simplify parsing at this point: arr data. But let us stay with the my keyword, my data[], as it has its own advantages when we come to initialisation, and it reduces the number of reserved keywords.

The previous rule for scalar variable declaration migrates to a separate rule:

rule array-declaration {
    <variable-name> '[' ']'
}

rule scalar-declaration {
    <variable-name> [ '=' <value> ]?
}

The new array-declaration rule requires a pair of square brackets and does not yet include an initialiser part.

In the actions, we also have to distinguish between arrays and scalars, and we can do it by checking the presence of the $<array-declaration> match object.

method variable-declaration($/) {
    if $<scalar-declaration> {
        %!var{$<scalar-declaration><variable-name>} =
            $<scalar-declaration><value> ??
            $<scalar-declaration><value>.made !! 0;
    }
    elsif $<array-declaration> {
        %!var{$<array-declaration><variable-name>} = $[];
    }
}

It works, but it looks overloaded because of the nested match object keys. In fact, there is no need for that, because an individual action can be created for each case.

method scalar-declaration($/) {
    %!var{$<variable-name>} = $<value> ?? $<value>.made !! 0;
}

method array-declaration($/) {
    %!var{$<variable-name>} = $[];
}

With this change, the variable-declaration method is not needed anymore and can be removed from the LinguaActions class.

You can temporarily replace it with the following code just to see how the parser works with arrays:

method variable-declaration($/) {
    dd %!var;
}

The method displays what the variable storage contains after each variable declaration. Let us test this in action:

my x = 3;
say x;

my data[];

This program successfully compiles, and you can see how the %!var hash changes:

Hash %!var = {:x(3)}
3
Hash %!var = {:data($[]), :x(3)}
OK
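To try the declaration rules in isolation, here is a minimal, self-contained sketch. The DeclDemo and DeclActions names and the simplified variable-name and value tokens are assumptions made for this example only; the real Lingua grammar has richer rules, and its value action returns a computed attribute rather than a bare number.

```raku
# Simplified stand-ins: only lowercase names and integer values.
grammar DeclDemo {
    rule TOP { <variable-declaration> ';' }
    rule variable-declaration {
        'my' [
            | <array-declaration>
            | <scalar-declaration>
        ]
    }
    rule array-declaration  { <variable-name> '[' ']' }
    rule scalar-declaration { <variable-name> [ '=' <value> ]? }
    token variable-name { <[a..z]>+ }
    token value         { \d+ }
}

class DeclActions {
    has %.var;

    # A scalar defaults to 0 when no initialiser is given.
    method scalar-declaration($/) {
        %!var{~$<variable-name>} = $<value> ?? +$<value> !! 0;
    }

    # An array starts as an empty itemised array.
    method array-declaration($/) {
        %!var{~$<variable-name>} = $[];
    }
}

my $actions = DeclActions.new;
DeclDemo.parse($_, :$actions) for 'my x = 3;', 'my y;', 'my data[];';

say $actions.var<x>;           # 3
say $actions.var<y>;           # 0
say $actions.var<data>.^name;  # Array
```

Each statement is parsed separately here, with the same actions object collecting the variables, so the storage accumulates across the three calls.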

Assigning to an array item

OK, we can create an array and it’s time to fill its elements with some data:

data[0] = 10;
data[1] = 20;

The assignment rule can be updated similarly to how we did it with string indexing in the previous chapter by adding an optional integer index in square brackets:

rule assignment {
    <variable-name> [ '[' <integer> ']' ]? '=' <value>
}

In the corresponding action, the presence of the index indicates that we are working with an array, otherwise it is a scalar variable.

method assignment($/) {
    if $<integer> {
        %!var{~$<variable-name>}[+$<integer>] =
            $<value>.made;
    }
    else {
        %!var{~$<variable-name>} = $<value>.made;
    }
}

After you run the program with the above assignments, the data variable will keep two values in the storage:

Hash %!var = {:data($[10, 20])}
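It is worth seeing what the underlying Raku containers do when an element is assigned by index, because the action above relies on it directly. The following sketch uses a plain %var hash as a stand-in for the %!var attribute; it is an illustration of Raku behaviour, not part of the Lingua sources.

```raku
my %var;                 # stand-in for the %!var storage

%var<data>[0] = 10;      # the array springs into existence here
%var<data>[1] = 20;
%var<data>[4] = 50;      # indices 2 and 3 are skipped

say %var<data>.elems;        # 5
say %var<data>[2].defined;   # False
```

So, unless a check is added elsewhere in the interpreter, an assignment such as data[4] = 50 grows the array and leaves undefined holes, and an indexed assignment to a name that was never declared creates the array silently.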

Side story: The joy of syntax

Before moving on towards more features for arrays and hashes, let us transform the grammar a bit. In the assignment method, the if-else check occupies more lines than the “useful” code. There are two possible transformations that make the methods more compact.

First, let us repeat the trick with splitting a rule into two. Instead of one universal assignment rule, we can have two subrules:

rule assignment {
    | <array-item-assignment>
    | <scalar-assignment>
}

rule array-item-assignment {
    <variable-name> [ '[' <integer> ']' ] '=' <value>
}

rule scalar-assignment {
    <variable-name> '=' <value>
}

This makes the grammar more verbose, but the actions become more compact:

method array-item-assignment($/) {
    %!var{~$<variable-name>}[+$<integer>] = $<value>.made;
}

method scalar-assignment($/) {
    %!var{~$<variable-name>} = $<value>.made;
}

The second possible solution is to keep the original assignment rule and use the where clause in the methods’ signatures to dispatch the calls depending on the content of the match object.

multi method assignment($/ where $<integer>) {
    %!var{~$<variable-name>}[+$<integer>] = $<value>.made;
}

multi method assignment($/ where !$<integer>) {
    %!var{~$<variable-name>} = $<value>.made;
}

The negative condition !$<integer> in the signature of the second variant of the multi-method is optional, but I’d prefer to keep it for clarity of the code.

There are two more actions that can be re-written in the same manner. The value action:

multi method value($/ where $<expression>) {
    $/.make($<expression>.made);
}

multi method value($/ where $<string>) {
    $/.make($<string>.made);
}

The second action with a big if-elsif-else condition is expr:

multi method expr($/ where $<number>) {
    $/.make($<number>.made);
}

multi method expr($/ where $<string>) {
    $/.make($<string>.made);
}

multi method expr($/ where $<variable-name> && $<integer>) {
    $/.make(%!var{$<variable-name>}.substr(+$<integer>, 1));
}

multi method expr($/ where $<variable-name> && !$<integer>) {
    $/.make(%!var{$<variable-name>});
}

multi method expr($/ where $<expr>) {
    $/.make(process($<expr>, $<op>));
}

multi method expr($/ where $<expression>) {
    $/.make($<expression>.made);
}

These methods look so trivial now. Notice that some of the candidates check more than one key in the match object, for example: $<variable-name> && !$<integer>.
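The where-based dispatch can be tried on its own with a small sketch. The AssignDemo and AssignActions names and the simplified tokens are assumptions for this example, and +$<value> replaces $<value>.made so that no value action is needed.

```raku
grammar AssignDemo {
    rule TOP { [ <assignment> ';' ]* }
    rule assignment {
        <variable-name> [ '[' <integer> ']' ]? '=' <value>
    }
    token variable-name { <[a..z]>+ }
    token integer       { \d+ }
    token value         { \d+ }   # simplified: integers only
}

class AssignActions {
    has %.var;

    # Chosen when the optional index part was matched.
    multi method assignment($/ where $<integer>) {
        %!var{~$<variable-name>}[+$<integer>] = +$<value>;
    }

    # Chosen for a plain scalar assignment.
    multi method assignment($/ where !$<integer>) {
        %!var{~$<variable-name>} = +$<value>;
    }
}

my $actions = AssignActions.new;
AssignDemo.parse('data[0] = 10; data[1] = 20; x = 5;', :$actions);

say $actions.var<data>;  # [10 20]
say $actions.var<x>;     # 5
```

Note that data[0] selects the first candidate even though the index is zero: the where clause tests whether the integer token matched at all, not its numeric value.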

Accessing array elements

The next goal is to start using individual array items, for instance, as it is shown in the next fragment:

say data[0];
say data[1];

my n = data[0] * data[1];
say n;

Our current actions class supports indexing strings already, and that’s the exact place which we have to extend:

multi method expr($/ where $<variable-name> && $<integer>) {
    if %!var{$<variable-name>} ~~ Array {
        $/.make(%!var{$<variable-name>}[+$<integer>]);
    }
    else {
        $/.make(%!var{$<variable-name>}.substr(
            +$<integer>, 1));
    }
}

The method checks the type of the variable stored in the %!var hash, and if it is an array, returns the requested element. The other branch works with strings as it did before.
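The type check can be isolated in a few lines. The element-at routine below is a hypothetical helper written for this illustration only; it applies the same smartmatch against Array and the same substr fallback as the action above.

```raku
# Hypothetical helper: not part of the Lingua sources.
sub element-at($container, Int $index) {
    $container ~~ Array
        ?? $container[$index]            # array: take the element
        !! $container.substr($index, 1); # string: take one character
}

say element-at($[10, 20, 30], 1);   # 20
say element-at("abcdef", 1);        # b
```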

The grammar can be simplified once again by extracting the sequence representing an array (and string) index to a separate rule:

rule index {
    '[' <integer> ']'
}

Use the new rule inside assignment and inside expr when you take the value:

rule assignment {
    <variable-name> <index>? '=' <value>
}

. . .

multi rule expr(4) {
    | <number>
    | <variable-name> <index>?
    | '(' <expression> ')'
}

If you ever want to change the syntax of indexes, say, to data:3 instead of data[3], there is a single place to do that: the index rule.

The actions must be adapted too. The index’s attribute is an integer value:

method index($/) {
    $/.make(+$<integer>);
}

And thus you should use $<index>.made to read it from other methods:

multi method assignment($/ where $<index>) {
    %!var{~$<variable-name>}[$<index>.made] = $<value>.made;
}

multi method assignment($/ where !$<index>) {
    %!var{~$<variable-name>} = $<value>.made;
}

. . .

multi method expr($/ where $<variable-name> && $<index>) {
    if %!var{$<variable-name>} ~~ Array {
        $/.make(%!var{$<variable-name>}[$<index>.made]);
    }
    else {
        $/.make(%!var{$<variable-name>}.substr(
            $<index>.made, 1));
    }
}

multi method expr($/ where $<variable-name> && !$<index>) {
    $/.make(%!var{$<variable-name>});
}

Once again, redundant conditions such as !$<index> are used in the where clause to make the code more readable; the multi-method can be correctly dispatched without them.

List assignments

So far, arrays can be created but you have to assign their elements one by one. Let us allow list assignment and initialisation:

my data[] = 111, 222, 333;

data = 7, 9, 11;

A new syntax element, the comma, appears here. It does not clash with any other construct of the language, so it can be easily embedded into the grammar.

rule array-declaration {
    <variable-name> '[' ']' [ '=' <value>+ %% ',' ]?
}

rule assignment {
    <variable-name> <index>? '=' <value>+ %% ','
}

In both cases, the value rule is used, which means you can use numbers, strings, and arithmetic expressions as initialising values for the array elements:

my strings[] = "alpha", "beta", "gamma";
say strings[1]; # beta

my arr[] = 11, 3 * 4, 2 * (6 + 0.5);
say arr[0]; # 11
say arr[1]; # 12
say arr[2]; # 13
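A side note on the %% quantifier modifier used in these rules: unlike a single %, it also accepts a trailing separator. The sketch below demonstrates the difference with plain regexes; the $strict and $lenient names are made up for this example, and the explicit \s* stands in for the whitespace handling that a rule provides implicitly.

```raku
# One or more numbers separated by commas.
my $strict  = / ^ [ \d+ ]+ %  [ ',' \s* ] $ /;   # no trailing comma
my $lenient = / ^ [ \d+ ]+ %% [ ',' \s* ] $ /;   # trailing comma allowed

say so '7, 9, 11'  ~~ $strict;    # True
say so '7, 9, 11,' ~~ $strict;    # False
say so '7, 9, 11,' ~~ $lenient;   # True
```

This suggests that, with the rules as written, a list such as 7, 9, 11, with a comma before the semicolon should be accepted too. If that is not desired, % is the stricter choice.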

To implement it in actions, let’s make a helper method init-array that takes the name of the variable and the list of the values:

method init-array($variable-name, @values) {
    %!var{$variable-name} = $[];
    for @values -> $value {
        %!var{$variable-name}.push($value.made);
    }
}

multi method array-declaration($/ where $<value>) {
    self.init-array($<variable-name>, $<value>);
}

multi method assignment($/ where !$<index>) {
    if %!var{$<variable-name>} ~~ Array {
        self.init-array($<variable-name>, $<value>);
    }
    . . .
}

When making a new array, you can also type Array.new instead of $[].

Unlike, for example, the set of operator functions, the init-array routine is a method, as it needs access to the variable storage %!var.

Printing arrays

Another desirable feature for arrays is printing all their elements with a single instruction. Instead of listing separate items, we’d like to pass the whole array to the say function:

my data[] = 5, 7, 9;
say data;

In fact, this already works, because our implementation of say just passes the whole container to Raku’s say, which prints the data like this:

[5 7 9]

Let us be less humble and create our own output format by checking the type of the variable, as we did before:

method function-call($/) {
    my $object = $<value>.made;

    if $object ~~ Array {
        say $object.join(', ');
    }
    else {
        say $object;
    }
}

This function prints the array as a comma-separated list of its items:

5, 7, 9
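The two output formats can be compared side by side without the parser. The format-value routine is a hypothetical helper that mirrors the branching of function-call but returns the string instead of printing it.

```raku
# Hypothetical helper: returns what function-call would print.
sub format-value($object) {
    $object ~~ Array ?? $object.join(', ') !! $object.gist;
}

say $[5, 7, 9];                 # [5 7 9]   (Raku's default format)
say format-value($[5, 7, 9]);   # 5, 7, 9   (our own format)
say format-value(42);           # 42
```

A side benefit of joining the items ourselves: as far as I know, Raku’s say abbreviates long lists after the first hundred elements, while join always produces the complete list.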

Hashes

In the remaining part of this chapter, we will implement hashes in our Lingua language. You have seen most of the ideas in the example of arrays, so the changes should be transparent and obvious. We have to implement a few things: declaration, declaration with initialisation, assignment to the whole hash and to a single element, reading a single value, and printing the hash.

The following fragments demonstrate the syntax we use. To declare a hash, use a pair of curly braces after the name of the variable:

my data{};

Initialisation and assignments are done using a comma-separated list of key-value pairs. Keys are always strings; values can be any scalar value (numbers or strings). The separator between the key and the value is a colon:

my hash{} = "alpha" : 1, "beta": 2, "gamma": 3;

my days{};
days = "Mon": "work", "Sat": "rest";

The grammar includes a separate rule for hash declaration:

rule variable-declaration {
    'my' [
        | <array-declaration>
        | <hash-declaration>
        | <scalar-declaration>
    ]
}

rule hash-declaration {
    <variable-name> '{' '}' [
        '=' [ <string> ':' <value> ]+ %% ','
    ]?
}

The assignment rule should know how to deal with hashes. This time, the changes can be done in-place without creating new rules.

rule assignment {
    <variable-name> <index>? '='
        [
            | [ <string> ':' <value> ]+ %% ','
            |                <value>+   %% ','
        ]
}

In the actions, you have to carefully implement the declaration and hash assignment methods. They both use a common method, init-hash, to set the keys and the values of the hash.

method init-hash($variable-name, @keys, @values) {
    %!var{$variable-name} = Hash.new;
    while @keys {
        %!var{$variable-name}.{@keys.shift.made} =
            @values.shift.made;
    }
}

multi method hash-declaration($/) {
    self.init-hash($<variable-name>, $<string>, $<value>);
}

multi method assignment($/ where !$<index>) {
    . . .
    elsif %!var{$<variable-name>} ~~ Hash {
        self.init-hash($<variable-name>,
                       $<string>, $<value>);
    }
    . . .
}

Another part of the hash implementation is accessing values by their keys. It is wise to re-use the index rule and make it a collection of two alternatives:

rule index {
    | <array-index>
    | <hash-index>
}

rule array-index {
    '[' <integer> ']'
}

rule hash-index {
    '{' <string> '}'
}
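The chapter does not show how the index action changes after this split. Since other methods keep reading $<index>.made, one plausible arrangement is for index to forward the attribute of whichever alternative matched. The sketch below is an assumption rather than the code from the repository: the IndexDemo and IndexActions names, the simplified string token, and the body of the index method are all made up for this example.

```raku
grammar IndexDemo {
    rule TOP   { <index> }
    rule index {
        | <array-index>
        | <hash-index>
    }
    rule array-index { '[' <integer> ']' }
    rule hash-index  { '{' <string> '}' }
    token integer { \d+ }
    token string  { '"' ( <-["]>* ) '"' }   # simplified string literal
}

class IndexActions {
    method array-index($/) { $/.make(+$<integer>) }
    method hash-index($/)  { $/.make(~$<string>[0]) }

    # Forward the attribute of whichever alternative matched.
    method index($/) {
        $/.make(($<array-index> // $<hash-index>).made);
    }
}

say IndexDemo.parse('[3]',     actions => IndexActions)<index>.made;  # 3
say IndexDemo.parse('{"Sat"}', actions => IndexActions)<index>.made;  # Sat
```

With such a forwarding method in place, callers do not need to know which kind of index was used in order to obtain its value.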

Use the new rules in the already existing methods. The where clauses receive an additional condition to make sure we caught the rule we want.

multi method assignment($/ where $<index> &&
                        $<index><array-index>) {
    %!var{$<variable-name>}[$<index>.made] =
        $<value>[0].made;
}

multi method assignment($/ where $<index> &&
                        $<index><hash-index>) {
    %!var{$<variable-name>}{$<index>.made} =
        $<value>[0].made;
}

multi method assignment($/ where !$<index>) {
    . . .
    elsif %!var{$<variable-name>} ~~ Hash {
        self.init-hash($<variable-name>,
                       $<string>, $<value>);
    }
    . . .
}

The new index rule can already work in the expr(4) rule, which allows us to read the values by the given hash key using the hash{"key"} syntax. All we need is to update the expr method to let it know about the new data structure:

multi method expr($/ where $<variable-name> && $<index>) {
    . . .
    elsif %!var{$<variable-name>} ~~ Hash {
        $/.make(%!var{$<variable-name>}{$<index>.made});
    }
    . . .
}

Add the following lines to a test program and confirm that it works:

days{"Tue"} = "work";
say days{"Sat"};

Finally, teach the say function to print hashes:

method function-call($/) {
    . . .
    elsif $object ~~ Hash {
        my @str;
        for $object.keys.sort -> $key {
            @str.push("$key: $object{$key}");
        }
        say @str.join(', ');
    }
    . . .
}

If you are good at using the map method, try making a better version of the function. The expected output in response to say days; should be like this:

Mon: work, Sat: rest, Tue: work 
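One possible map-based variant is sketched below. The format-hash routine is a hypothetical helper, not the author’s solution; it builds the string and leaves printing to the caller, so the body could be dropped into the Hash branch of function-call.

```raku
# Hypothetical helper: formats a hash as sorted "key: value" pairs.
sub format-hash(%hash) {
    %hash.keys.sort.map(-> $key { "$key: %hash{$key}" }).join(', ');
}

my %days = Mon => 'work', Sat => 'rest', Tue => 'work';
say format-hash(%days);   # Mon: work, Sat: rest, Tue: work
```

The temporary @str array from the loop version disappears, because map produces the list of formatted pairs directly.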

Review and test

Let us take a look at the current state of the Lingua language. We have done a lot of work and implemented support for numbers, strings, arrays and hashes. It is possible to change the content of the variables and print their values. Let us make another small step to allow variables as array indices or hash keys.

rule array-index {
    '[' [ <integer> | <variable-name> ] ']'
}

rule hash-index {
    '{' [ <string> | <variable-name> ] '}'
}

The corresponding actions can be transformed to pairs of trivial multi-methods:

multi method array-index($/ where !$<variable-name>) {
    $/.make(+$<integer>);
}

multi method array-index($/ where $<variable-name>) {
    $/.make(%!var{$<variable-name>});
}

multi method hash-index($/ where !$<variable-name>) {
    $/.make($<string>.made);
}

multi method hash-index($/ where $<variable-name>) {
    $/.make(%!var{$<variable-name>});
}

Refer to the repository to check that you have the correct files; if so, you will be able to run the following test program, which uses most of the features implemented so far.

# Illustrating the Pythagorean theorem
my a = 3;
my b = 4;
my c = 5;
my left = a**2 + b**2;
my right = c**2;
say "The hypotenuse of a rectangle triangle with the 
sides $a and $b is indeed $c, as $left = $right.";

/* Using floating-point numbers for 
computing the length of a circle */
my pi = 3.1415926;
my R = 7;
my c = 2 * pi * R;
say "The length of a circle of radius $R is $c.";

# A list of prime numbers
my n = 5;
my data[] = 2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31;
my nth = data[n];
say "$n th prime number is $nth.";

# Demonstrating the use of hashes
my countries{} = 
    "France": "Paris", "Germany": "Berlin", "Italy": "Rome";
my country = "Italy";
my city = countries{country};
say "$city is the capital of $country.";

This program prints the following result.

$ ./lingua test22.lng 
The hypotenuse of a rectangle triangle with the
sides 3 and 4 is indeed 5, as 25 = 25.
The length of a circle of radius 7 is 43.9822964.
5 th prime number is 13.
Rome is the capital of Italy.
OK

It is quite fascinating to see that the interpreter understands a program that never existed before. You wrote it and you can make lots of changes in it, and the program will still show the results that you expect.

In the next chapters, we will work on more complex sides of the language.

