A recursive descent parser in Forth

栏目: IT技术 · 发布时间: 6年前

内容简介:I was reading through Schorre’sSo it compiles down to an idealised assembly language whose instructions have to be implemented in some other language. I know that there is at least one C and lua implementation, for example. No doubt the Raku folks out ther

I was reading through Schorre’s paper on Metta II again (see also the Wikipedia article ).  For those that don’t know, Meta II is a metacompiler, which is to say, a language that can compile itself. It can’t, of course, be turtles all the way down, otherwise you wouldn’t be able to bootstrap it.

So it compiles down to an idealised assembly language whose instructions have to be implemented in some other language. I know that there is at least one C and lua implementation, for example. No doubt the Raku folks out there can’t permit this to go unchallenged given its grammar functionality.

A criticism of Meta II is that a fair bit of work has to go into writing a virtual machine. It’s not an excessive amount of work, though.

One weakness of Meta II, I think, is that you can’t perform arbitrary actions, only transform input to output. To give an example, Meta II might produce an instruction like:

TST ' .OUT'

which tests the input string against the given string (in this ‘ .OUT’). This gives us a problem: we cannot, for example, create assembly code output from Meta II, as assemblers don’t allow for the inlining of strings. They need to be collected and put in a separate area. Meta II has no mechanism that allows you to do that.

Raku grammars allow you to perform actions as it parses, but often they aren’t what you want. Take, for example, a rule that might look something like

rule my-if { 
    'IF' <expr> 'THEN' <statements> 'END' 
    { do-something } 
}

The problem is that <expr> and <statements> are processed before do-something, which is not what you want. You want to be able to “mix in” processing at different levels of parsing.

The solution in Raku’s case is either to bolt on an action class to the parsed tree, or parse the resultant tree manually yourself. The former approach is the standard one, but I think that the latter approach might be easier in a lot of circumstances.

It got me to thinking as to how Forth, a David compared to Raku’s Goliath, might compete in this arena.

As a small diversion, I tried to see if I could put together a recursive descent parser for arithmetic involving just numbers, parentheses, addition and multiplication. Here’s what I whipped up using gforth:

create yytext 80 allot
variable yylen
: .yytext yytext yylen @ type ; \ print yytext
: yylex parse-name yylen ! yytext yylen @ move ; \ scan next token
: yy0 yytext c@ ;
: IS? yy0 = ; \ ( c -- t ) is first char of lexeme c?
: NUM? yy0 '0 >= yy0 '9 <= and ;

\ NOW WE DEFINE OUR ACTUAL RULES

defer ex1

: BRA yylex ex1 yylex ;
: EX3 num? if ." LDA " .yytext cr yylex else BRA then ;
: EX2 EX3 begin '* is? while yylex EX3 ." MUL" cr repeat ;
: EX1' EX2 begin '+ is? while yylex EX2 ." ADD" cr repeat ;

' ex1' IS ex1

: RUN refill drop yylex ex1 cr .s bye ;
run
1 + ( 5 + ( 2 + 3 * 4 ) * 6 )

Running it produces the following code:

LDA 1
LDA 5
LDA 2
LDA 3
LDA 4
MUL
ADD
LDA 6
MUL
ADD
ADD

Now, there are severe limitations to the approach, and there’s a little bit of “cheating” going on, with some simplifying assumptions, but I was gratified with just how little code I needed to produce the desired result.

For comparison purposes, the equivalent rules in Metta II are:

EX3 = .NUMBER .OUT('LD ' *) / '(' EX1 ')' .,EX2 = EX3 $ ('*' EX3 .OUT('MUL')) .,EX1 = EX2 $ ('+' EX2 .OUT('ADD')) .,

Part of the cheating I adopted involves using Forth’s own parser. How could this be improved?

There are a few approaches I can think of.

One might be to implement the idealised assembly language exactly as it is stated by Schorre. The main problem as I see it is with the problem of branching instructions. It’s difficult to visualise the nesting structure of the outputted branches, but if the forward branches are strictly nested, then we should be able to cope  .

A second approach might be that instead of writing a Forth that parses the idealised code, make Meta II output valid Forth code.  We then don’t have to write anything. We just run the code.

A third might be to write a “messy Meta II” parser, in the style of my Forth solution above, which takes in a clean Metta II description and produces a messy one.

A fourth solution might be to write the grammars in a more Forth-friendly style, with something like:

: EX1 EX2 $ ( '+'EX2 => ."ADD" ) ;

You’d then define $ as an immediate word that scans the next token. ‘(‘ would also be an immediate word that does its own scanning up to and include the ‘)’. I’m sure there’d be many details to work out.

The coolness of all this is that it would then be “Forth all the way down” (well, maybe a couple of levels deep, but you wouldn’t need to implement an assembler). And you’d be able to perform any arbitrary actions you liked.

Interesting, no? I have to say, there’s something just way intriguing about Forth itself and how it can effectively reconfigure itself. Perhaps one can achieve similar expressiveness with Raku’s “slang” capabilities?

Maybe one could write a Forth as a Raku slang, and use that slang to create other slangs? Come on Rakuers, you know you want to :wink:

For more mind-blowing stuff, check out the page on creating a BNF parser in Forth. I haven’t studied it yet, but it looks interesting. I am constantly amazed as to how experienced Forthers manage to achieve so much with so little.

Your Forth program isn’t finished until it looks like a Haiku. :wink:

Postscript: Greetings Bulgaria! Looking at the analytics for this blog, I notice that an overwhelming number of readers today are from Bulgaria. I have no idea why they should find my blog interesting all of a sudden, but welcome anyway.


以上就是本文的全部内容,希望对大家的学习有所帮助,也希望大家多多支持 码农网

查看所有标签

猜你喜欢:

本站部分资源来源于网络,本站转载出于传递更多信息之目的,版权归原作者或者来源机构所有,如转载稿涉及版权问题,请联系我们

创业的艺术2.0

创业的艺术2.0

〔美〕盖伊·川崎 / 刘悦、段歆玥 / 译言·东西文库/电子工业出版社 / 2016-9 / 68

“创业者导师”——盖伊•川崎的《创业的艺术2.0》被阿丽亚娜•赫芬顿评为“终极的创业手册”。无论您是企业家、小企业主、企业开拓者还是非盈利组织的领导人,都可以让你的产品、服务或理念获得成功。 盖伊选取了不用角度,探索前十年商界的巨大变化,并寻求解决之道。曾经所向披靡的市场巨头深陷水深火热之中,社交媒体也取代了人际关系和广告,成为营销推广的主要渠道。众筹也成为广大投资者的可行之举。“云”更是每......一起来看看 《创业的艺术2.0》 这本书的介绍吧!

随机密码生成器
随机密码生成器

多种字符组合密码

MD5 加密
MD5 加密

MD5 加密工具

RGB HSV 转换
RGB HSV 转换

RGB HSV 互转工具