Full Proof that C++ Grammar is Undecidable

Category: IT Technology · Published: 4 years ago

Most programming languages’ grammars fall into the category of Context-Free Grammar (CFG), or sometimes Context-Sensitive Grammar (CSG). What about C++? It turns out that parsing C++ is literally undecidable. In other words, if you have a compiler that parses all valid C++ sources perfectly, you have solved the Halting Problem.

Relationship between various categories of formal languages

To prove this, we will write a program that is parsed differently depending on the solution to the Halting Problem. But we need to implement a Turing Machine to formulate the Halting Problem, and doing that with only template metaprogramming is a huge pain, so we will instead use the Post Correspondence Problem, which has been proven to be equivalent to the Halting Problem (each can be reduced to the other).

The Post Correspondence Problem’s statement is simple: You are given a set of dominoes, each with a string of symbols written on its top half and another on its bottom half. Suppose there is an unlimited supply of each type of domino. Can you line up one or more dominoes so that the string of symbols along the top matches the string of symbols along the bottom?

Three types of dominoes [bba, bb], [ab, aa], [a, baa] arranged so that the top part “bbaabbbaa” reads the same as the bottom part.
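
To make the example concrete, here is a small sketch that checks one arrangement at compile time. It is not part of the original article: DominoText, charAt, isMatch, kDominoes, kOrder and kWrongOrder are hypothetical names chosen for illustration, and the code assumes a C++14 compiler (loops inside constexpr functions). The order 1, 2, 1, 3 (zero-based 0, 1, 0, 2) spells “bbaabbbaa” on both halves.

```cpp
#include <cstddef>

// One domino: a string on the top half and a string on the bottom half.
struct DominoText {
    const char* top;
    const char* bottom;
};

// Returns the i-th character of the concatenated top (or bottom) halves of
// the dominoes picked by `order`, or '\0' past the end of the concatenation.
constexpr char charAt(const DominoText* dominoes, const int* order,
                      std::size_t count, bool useTop, std::size_t i) {
    for (std::size_t k = 0; k < count; ++k) {
        const char* s = useTop ? dominoes[order[k]].top
                               : dominoes[order[k]].bottom;
        for (; *s != '\0'; ++s) {
            if (i == 0) return *s;
            --i;
        }
    }
    return '\0';
}

// True if the arrangement is non-empty and both halves read the same.
constexpr bool isMatch(const DominoText* dominoes, const int* order,
                       std::size_t count) {
    if (count == 0) return false;
    for (std::size_t i = 0;; ++i) {
        const char t = charAt(dominoes, order, count, true, i);
        const char b = charAt(dominoes, order, count, false, i);
        if (t != b) return false;
        if (t == '\0') return i > 0;  // both ended together
    }
}

constexpr DominoText kDominoes[] = {{"bba", "bb"}, {"ab", "aa"}, {"a", "baa"}};
constexpr int kOrder[] = {0, 1, 0, 2};       // the arrangement from the caption
constexpr int kWrongOrder[] = {0, 1, 2, 0};  // same dominoes, different order

static_assert(isMatch(kDominoes, kOrder, 4), "top and bottom should agree");
static_assert(!isMatch(kDominoes, kWrongOrder, 4), "this order does not match");
```

Checking a given arrangement is the easy part; the difficulty lies in deciding whether any arrangement exists at all.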

Turns out, there exists no algorithm that says “yes” or “no” to the Post Correspondence Problem in finite time, given any set of dominoes as input.

We first define a variadic struct template Row, which represents a “row” of symbols (ints). We add a convenience static member constant that says whether this row is empty.
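
A minimal sketch of such a template might look like the following; the author's own version is in the gist linked at the end. The member name empty and the encoding a = 0, b = 1 are assumptions made for this illustration.

```cpp
// A row of symbols, stored entirely in the template arguments.
template <int... Symbols>
struct Row {
    // True when the row holds no symbols at all.
    static constexpr bool empty = (sizeof...(Symbols) == 0);
};

// Assumed encoding of the two symbols used in the example dominoes.
constexpr int a = 0, b = 1;

using EmptyRow = Row<>;
using Bba = Row<b, b, a>;  // the string "bba"

static_assert(EmptyRow::empty, "a row without symbols is empty");
static_assert(!Bba::empty, "a row with symbols is not empty");
```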

Now we define a template that represents a domino:
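
One way to sketch this, assuming a domino is simply a pair of rows; the member aliases upper and lower are hypothetical names, and Row is repeated so the snippet compiles on its own:

```cpp
#include <type_traits>

template <int... Symbols>
struct Row {
    static constexpr bool empty = (sizeof...(Symbols) == 0);
};

// A domino pairs an upper row with a lower row.
template <typename Upper, typename Lower>
struct Domino {
    using upper = Upper;
    using lower = Lower;
};

constexpr int a = 0, b = 1;  // assumed symbol encoding

// The first domino of the example, [bba, bb].
using First = Domino<Row<b, b, a>, Row<b, b>>;

static_assert(std::is_same<First::upper, Row<b, b, a>>::value, "upper is bba");
static_assert(std::is_same<First::lower, Row<b, b>>::value, "lower is bb");
```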

And a handy type alias that concatenates two Rows together:
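
A possible shape for that alias, sketched with a helper partial specialization; the names Concat and ConcatImpl are assumptions, not taken from the article:

```cpp
#include <type_traits>

template <int... Symbols>
struct Row {
    static constexpr bool empty = (sizeof...(Symbols) == 0);
};

// Helper: only the specialization for two Rows is defined.
template <typename A, typename B>
struct ConcatImpl;

template <int... As, int... Bs>
struct ConcatImpl<Row<As...>, Row<Bs...>> {
    using type = Row<As..., Bs...>;  // splice both symbol packs together
};

// Concat<Row<1, 2>, Row<3>> is Row<1, 2, 3>.
template <typename A, typename B>
using Concat = typename ConcatImpl<A, B>::type;

static_assert(std::is_same<Concat<Row<1, 2>, Row<3>>, Row<1, 2, 3>>::value,
              "symbols are appended in order");
static_assert(Concat<Row<>, Row<>>::empty, "empty plus empty stays empty");
```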

Now, to exhaustively search through the entire search space, we will use a Breadth-First Search (BFS) algorithm. At the core of BFS is a FIFO queue, so we implement that first:
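
A compile-time FIFO queue can be sketched as a type list with push-at-the-back and pop-from-the-front operations. The names Queue, Push, PushImpl and Pop below are hypothetical and not necessarily those used by the author:

```cpp
#include <type_traits>

// The queue contents live in the template argument pack, front first.
template <typename... Items>
struct Queue {
    static constexpr bool empty = (sizeof...(Items) == 0);
};

// Push: append any number of new items at the back.
template <typename Q, typename... NewItems>
struct PushImpl;

template <typename... Items, typename... NewItems>
struct PushImpl<Queue<Items...>, NewItems...> {
    using type = Queue<Items..., NewItems...>;
};

template <typename Q, typename... NewItems>
using Push = typename PushImpl<Q, NewItems...>::type;

// Pop: split off the front item. Deliberately left undefined for Queue<>.
template <typename Q>
struct Pop;

template <typename Head, typename... Tail>
struct Pop<Queue<Head, Tail...>> {
    using head = Head;
    using rest = Queue<Tail...>;
};

// First in, first out: int was pushed first, so it is popped first.
using Q1 = Push<Queue<int>, char, long>;
static_assert(std::is_same<Pop<Q1>::head, int>::value, "front item comes out");
static_assert(std::is_same<Pop<Q1>::rest, Queue<char, long>>::value,
              "remaining items keep their order");
```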

Each state of the search space consists of the upper row and lower row of the already-arranged dominoes. We can check if the state is a match by checking if the two rows are identical and not empty.
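
The match check described above can be sketched as follows. State is a hypothetical name; since a Row is fully determined by its template arguments, two rows are identical exactly when their types are the same, so std::is_same is enough:

```cpp
#include <type_traits>

template <int... Symbols>
struct Row {
    static constexpr bool empty = (sizeof...(Symbols) == 0);
};

// A search state: the upper and lower rows built so far.
template <typename Upper, typename Lower>
struct State {
    // A match needs identical rows and at least one symbol.
    static constexpr bool match =
        std::is_same<Upper, Lower>::value && !Upper::empty;
};

static_assert(!State<Row<>, Row<>>::match, "two empty rows do not count");
static_assert(State<Row<1, 0>, Row<1, 0>>::match, "identical non-empty rows");
static_assert(!State<Row<1>, Row<1, 0>>::match, "different rows");
```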

Now we implement the core of the algorithm. First, pop the head of the queue, and check if the popped state is a match. If it is, the problem is solved, and the answer is “yes”; if it isn’t, push all the child states generated by appending a domino at the right end, and keep going. If the queue is somehow empty, that means we looked at the entire search space and did not find a match, so the answer is “no”.
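
The following is one possible way to express that loop as recursive template instantiation; it is a sketch, not the author's listing (see the gist linked at the end). Search, Step, State, Concat and the upper/lower aliases are assumed names, and the building blocks are repeated in compact form so the snippet compiles on its own. To make it testable end to end, the sketch also seeds the queue with a state of two empty rows inside a DominoList wrapper.

```cpp
#include <type_traits>

template <int... Symbols>
struct Row { static constexpr bool empty = (sizeof...(Symbols) == 0); };

template <typename Upper, typename Lower>
struct Domino { using upper = Upper; using lower = Lower; };

template <typename A, typename B> struct ConcatImpl;
template <int... As, int... Bs>
struct ConcatImpl<Row<As...>, Row<Bs...>> { using type = Row<As..., Bs...>; };
template <typename A, typename B>
using Concat = typename ConcatImpl<A, B>::type;

template <typename Upper, typename Lower>
struct State {
    static constexpr bool match =
        std::is_same<Upper, Lower>::value && !Upper::empty;
};

template <typename... States> struct Queue {};

// Search<Q, Dominos...>::match is the BFS answer starting from queue Q.
template <typename Q, typename... Dominos> struct Search;

// Step dispatches on whether the popped head was a match.
template <bool Found, typename Head, typename Rest, typename... Dominos>
struct Step;

// Head matched: the answer is "yes" and nothing further is instantiated.
template <typename Head, typename Rest, typename... Dominos>
struct Step<true, Head, Rest, Dominos...> {
    static constexpr bool match = true;
};

// Head did not match: push one child state per domino and keep searching.
template <typename U, typename L, typename... Rest, typename... Dominos>
struct Step<false, State<U, L>, Queue<Rest...>, Dominos...>
    : Search<Queue<Rest...,
                   State<Concat<U, typename Dominos::upper>,
                         Concat<L, typename Dominos::lower>>...>,
             Dominos...> {};

// Empty queue: the whole search space was visited, so the answer is "no".
template <typename... Dominos>
struct Search<Queue<>, Dominos...> {
    static constexpr bool match = false;
};

// Non-empty queue: pop the head and dispatch on its match flag.
template <typename Head, typename... Rest, typename... Dominos>
struct Search<Queue<Head, Rest...>, Dominos...>
    : Step<Head::match, Head, Queue<Rest...>, Dominos...> {};

// Seeds the queue with the initial state (two empty rows).
template <typename... Dominos>
struct DominoList {
    static constexpr bool match =
        Search<Queue<State<Row<>, Row<>>>, Dominos...>::match;
};

constexpr int a = 0, b = 1;  // assumed symbol encoding
using Example = DominoList<Domino<Row<b, b, a>, Row<b, b>>,
                           Domino<Row<a, b>, Row<a, a>>,
                           Domino<Row<a>, Row<b, a, a>>>;
static_assert(Example::match, "the example dominoes have a solution");
```

Because this sketch never prunes dead-end states, an unsolvable instance with at least one domino never empties the queue; the instantiation simply keeps recursing until the compiler's template depth limit is hit.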

The initial state is two empty rows, so the queue starts out holding a single state with nothing in either row.

Now, DominoList::match holds the solution to the Post Correspondence Problem for the dominoes passed as the template argument Dominos!

We can now define a helper struct ParseThis whose member typeOrValue is the type int if the solution to the Post Correspondence Problem is “yes”, and the value 0 of type int if the solution is “no”, using SFINAE:
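
A hedged sketch of how such a helper could be written with std::enable_if on a partial specialization. SolvableList and UnsolvableList are hypothetical stand-ins for instantiations of DominoList, used here only to keep the snippet short and self-contained:

```cpp
#include <type_traits>

// Stand-ins for DominoList: all ParseThis needs is a boolean member `match`.
struct SolvableList   { static constexpr bool match = true;  };
struct UnsolvableList { static constexpr bool match = false; };

// Primary template, chosen when the answer is "no": typeOrValue is a value.
template <typename List, typename Enable = void>
struct ParseThis {
    static constexpr int typeOrValue = 0;
};

// Specialization, viable only when List::match is true: typeOrValue is a type.
template <typename List>
struct ParseThis<List, typename std::enable_if<List::match>::type> {
    using typeOrValue = int;
};

static_assert(std::is_same<ParseThis<SolvableList>::typeOrValue, int>::value,
              "\"yes\" makes typeOrValue name the type int");
static_assert(ParseThis<UnsolvableList>::typeOrValue == 0,
              "\"no\" makes typeOrValue the value 0");
```

The same member name denotes a type in one case and an object in the other, which is exactly what makes the meaning of any expression that mentions it depend on the search result.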

Therefore, a declaration of the form int x(...), with the typeOrValue member of ParseThis for the dominoes [bba, bb], [ab, aa], [a, baa] between the parentheses, is a function declaration if the solution to the Post Correspondence Problem with those dominoes is “Yes”, and a variable declaration if it is “No”. (In the former case, it is equivalent to int x(int);, and in the latter case, it is equivalent to int x(0); or int x = 0;.)
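
The effect can be observed directly with type traits. In this sketch, SolvableList and UnsolvableList are again hypothetical stand-ins for DominoList instantiations whose answers are “yes” and “no”; in the article's setting the template argument would be the DominoList built from the three dominoes.

```cpp
#include <type_traits>

struct SolvableList   { static constexpr bool match = true;  };
struct UnsolvableList { static constexpr bool match = false; };

template <typename List, typename Enable = void>
struct ParseThis {
    static constexpr int typeOrValue = 0;  // "no": a value
};

template <typename List>
struct ParseThis<List, typename std::enable_if<List::match>::type> {
    using typeOrValue = int;               // "yes": a type
};

// Two textually identical declarations, apart from the template argument.
int x(ParseThis<SolvableList>::typeOrValue);    // parsed as: int x(int);
int y(ParseThis<UnsolvableList>::typeOrValue);  // parsed as: int y(0);

static_assert(std::is_function<decltype(x)>::value, "x is a function");
static_assert(std::is_same<decltype(y), int>::value, "y is an int variable");
```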

There we have it! We can substitute any set of dominoes into that declaration, and a conforming compiler would have to decide whether x is a function or a variable. If such a compiler existed, however, we could use it to solve the Post Correspondence Problem for any input, which means we could use it to solve the Halting Problem. But no such program can exist, which means parsing C++ is undecidable. QED

Read the full code here:

https://gist.github.com/mujjingun/efcdc9d8e82bc44c67843a542d3917d9

