Expressive power of graph neural networks and the Weisefeiler-Lehman test

栏目: IT技术 · 发布时间: 4年前

内容简介:TL;DR:One of the challenges is that graphs encountered in applications are combinations of continuous and discrete structures (node- and edge features and connectivity, respectively), and thus this question can be posed in different ways. One possible form

TL;DR: Do you have a feeling that deep learning on graphs is a bunch of heuristics that work sometimes and nobody has a clue why? In this post, I discuss the graph isomorphism problem, the Weisfeiler-Lehman heuristic for graph isomorphism testing, and how it can be used to analyse the expressive power of graph neural networks. This is the first in the series of three posts on the expressivity of graph neural networks. In Part 2, I will discuss how to depart from the Weisfeiler-Lehman hierarchy and in Part 3, I will suggest why it may be a good idea to revisit the whole graph isomorphism framework.

Expressive power of graph neural networks and the Weisefeiler-Lehman test

T raditional feed-forward networks (multi-layer perceptrons) are known to be universal approximators: they can approximate any smooth function to any desired accuracy. For graph neural networks, which have emerged relatively recently, the representation properties are less understood. It often happens to observe in experiments that graph neural networks excel on some datasets but at the same time perform disappointingly on others. In order to get to the root of this behaviour, one has to answer the question: how powerful are graph neural networks?

One of the challenges is that graphs encountered in applications are combinations of continuous and discrete structures (node- and edge features and connectivity, respectively), and thus this question can be posed in different ways. One possible formulation is whether graph neural networks can distinguish between different types of graph structures. This is a classical question in graph theory known as the graph isomorphism problem , aiming to determine whether two graphs are topologically equivalent [1]. Two isomorphic graphs have the same connectivity and differ only by a permutation of their nodes.

Somewhat surprisingly, the exact complexity class of the graph isomorphism problem is unknown. It is not known to be solvable in polynomial time nor to be NP-complete , and is sometimes attributed to a special “ GI class ” [2].

Weisfeiler-Lehman test.The seminal 1968 paper of Boris Weisfeiler andAndrey Lehman [3] proposed an efficient heuristic now known as the Weisfeiler-Lehman (WL) test that was initially believed to be a polynomial-time solution for the graph isomorphism problem [4]. A counterexample was found a year later; however, it appears that the WL test works for almost all graphs, in the probabilistic sense [5].

Expressive power of graph neural networks and the Weisefeiler-Lehman test

Example of execution of the Weisfeiler-Lehman test on two isomorphic graphs. Curled brackets denote multisets. The algorithm stops after the colouring does not change and produces an output (histogram of colours). Equal outputs for the two graphs suggest that they are possibly isomorphic.

The WL test is based on iterative graph recolouring [6] (“colour” in graph theory refers to a discrete node label), starting with all nodes of identical colour. At each step, the algorithm aggregates the colours of nodes and their neighbourhoods representing them as multisets [7], and hashes the aggregated colour multisets into unique new colours. The algorithm stops upon reaching a stable colouring. If at that point the colourings of the two graphs differ, the graphs are deemed non-isomorphic. However, if the colourings are the same, the graphs are possibly (but not necessarily) isomorphic. In other words, the WL test is a necessary but insufficient condition for graph isomorphism. There exist non-isomorphic graphs for which the WL test produces identical colouring and thus considers them “possibly isomorphic”; the test is said to fail in this case. One such example is shown in the following figure:

Expressive power of graph neural networks and the Weisefeiler-Lehman test

Two non-isomorphic graphs on which the WL graph isomorphism test fails, as evident from the identical colouring it produces. In chemistry, these graphs represent the molecular structure of two different compounds, decalin (left) and bicyclopentyl (right). Figure adapted from [14].

Graph isomorphism networks.Keyulu Xu [9] and Christopher Morris [10] (and at least two years earlier, Thomas Kipf in his blog post ) noticed that the WL test bears striking resemblance to graph message passing neural networks [8], a way of doing convolution-like operations on graphs. In a message-passing layer, the features of each node are updated by aggregating the features of the neighbours. The choice of the aggregation and the update operations is crucial: only multiset injective functions make it equivalent to the WL algorithm. Some popular choices for aggregators used in the literature such as maximum or mean are actually strictly less powerful than WL and fail to distinguish between very simple graph structures:

Expressive power of graph neural networks and the Weisefeiler-Lehman test

Examples of graph structures that cannot be distinguished by max but can be distinguished by mean aggregator (first and second) and can be distinguished by neither max nor mean (first and third). The reason is that the features aggregated in this way from the neighbours of the black node will be the same. Figure adapted from [9].

Xu proposed a choice of the aggregation and update functions that make message passing neural networks equivalent to the WL algorithm, calling them Graph Isomorphism Networks (GIN). This is as powerful as a standard message-passing neural network can get. But more than a new architecture, the main impact was formulating the question of expressiveness in a simple setting that could be related to a classical problem from graph theory. This idea has already spurred multiple follow-up works.

Weisfeiler-Lehman hierarchy.One direction of extending the results of Xu and Morris was using more powerful graph isomorphism tests. Proposed by László Babai, the k-WL test is a higher-order extension of the Weisfeiler-Lehman algorithm that works on k -tuples instead of individual nodes. With the exception of 1-WL and 1-WL tests that are equivalent, ( k +1)-WL is strictly stronger than k -WL, for any k ≥2, i.e. there exist examples of graphs on which k -WL fails and ( k +1)-WL succeeds, but not vice versa. k -WL is thus a hierarchy or increasingly more powerful graph isomorphism tests, sometimes referred to as the Weisfeiler-Lehman hierarchy [10].

It is possible to design graph neural networks that follow the k -WL test and are thus strictly more powerful than message passing architectures. One such first architecture, k -GNN, was proposed by Morris [11]. A key difference between traditional message passing neural networks and such higher-order GNNs is the fact that they are non-local , as the k -WL algorithm operates on k -tuples of nodes. This has significant implications both on the implementation of the algorithm and its computational and memory complexity: k -GNNs require ( nᵏ ) memory. As a way of mitigating complexity, Morris devised a local version of k -GNNs based on aggregation in local neighbourhoods, which however is less expressive than k -WL.

Somewhat different high-order graph architectures were proposed by Haggai Maron, in whose Ph.D. defence at Weizmann Institute I had the privilege to participate in September 2019. Maron defined a class of Invariant Graph Networks (IGN) based on k -order tensors [12] and showed they are as powerful as k -WL. IGNs are derived from a different variant of k -WL [10] and are more advantageous in terms of their complexity compared to k -GNNs. In particular, the 3-WL-equivalent IGN has “only” quadratic complexity, which is probably the only practically useful graph neural network architecture strictly more powerful than message passing, but still a far cry from the linear complexity of the former[16].

From a theoretical standpoint, the works on provably powerful graph neural networks provided a rigorous mathematical framework that can help interpret and compare different algorithms. There have been multiple follow-up works that extended these results using methods from graph theory and distributed local algorithms [14].

From a practical standpoint, however, there has hardly been a significant impact of these new architectures: for example, the latest benchmarks [15] show that recent provably powerful algorithms actually underperform older techniques. This is not an uncommon situation in machine learning where there are often big gaps between theory and practice. One of the explanations could be deficiencies in the benchmark itself. But perhaps more profound reasons are that better expressivity does not necessarily offer better generalisation (sometimes quite the opposite), and moreover, it is possible that the graph isomorphism model might not capture correctly the actual notion of graph similarity in specific applications — I will discuss this in my next posts. For sure, this line of works is extremely fruitful for building bridges to other disciplines and bringing methods previously not used in the field of deep learning on graphs.


以上所述就是小编给大家介绍的《Expressive power of graph neural networks and the Weisefeiler-Lehman test》,希望对大家有所帮助,如果大家有任何疑问请给我留言,小编会及时回复大家的。在此也非常感谢大家对 码农网 的支持!

查看所有标签

猜你喜欢:

本站部分资源来源于网络,本站转载出于传递更多信息之目的,版权归原作者或者来源机构所有,如转载稿涉及版权问题,请联系我们

可视化未来

可视化未来

[美] 埃雷兹·艾登、[法] 让-巴蒂斯特·米歇尔 / 王彤彤、沈华伟、程学旗 / 浙江人民出版社 / 2015-9 / 54.90元

科学的传播速度有多快?今时今日我们很少谈论上帝了吗?人们什么时候开始用“having sex” 而不用“making love”? 史上的人是在哪岁成名的?语法的变化速度到底有多快?哪些作家被纳粹审查得最彻底? “donut” 什么时候开始取代“doughnut”? 我 们能否预测人类未来?比尔·克林顿和花椰菜哪个更出名? 《可视化未来》一书的两位作者通过与“谷歌图书”的合作,得以有机会研究......一起来看看 《可视化未来》 这本书的介绍吧!

HTML 编码/解码
HTML 编码/解码

HTML 编码/解码

RGB HSV 转换
RGB HSV 转换

RGB HSV 互转工具

HEX HSV 转换工具
HEX HSV 转换工具

HEX HSV 互换工具