Machine Learning on Graphs: Why Should You Care?

Category: IT Technology · Published: 4 years ago


A basic overview of graphs and their intersection with machine learning.

A few years ago, “Balboa Creole French” was considered a language on the verge of disappearing [1]. Balboa Island is located in Newport Beach, California. People there speak their own modified version of French because many French families moved there after the First World War and started to learn English, German, and Spanish until the language was formed. There are around 20 people who still speak that language.

Of course, everything I just said was a complete hoax, but people did not realize it until someone actually went to the island to learn the language and found that it had never existed in the first place (at least that’s what the rumors say).

Now, you might ask what this has to do with machine learning on graphs. Well, around four years ago, researchers at Stanford University [2] built classifiers that detected such hoaxes on Wikipedia with an accuracy of 86%, compared to a human-level accuracy of 66%!

The classifier they used was an ensemble of decision trees called a random forest. The interesting part was how they crafted the features.


Graph Diagrams for Real and Hoax Wikipedia Articles

One of the key ideas in the paper was that real articles link more coherently than false ones. A Wikipedia article contains markup links pointing to other Wikipedia articles. For real articles, the linked articles are more connected to one another than for hoaxes, and this turned out to be a key factor in identifying Wikipedia hoaxes.
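To make "linked together" a little more concrete, here is a minimal sketch of one possible coherence score: the fraction of pairs of articles linked from a page that also link to each other (a local clustering coefficient over the page's outlinks). The function name `link_coherence` and the toy link data are assumptions made up for illustration, and this is not necessarily the exact feature used in [2].

```python
from itertools import combinations


def link_coherence(article, links):
    """Fraction of pairs of articles linked from `article` that link to each other.

    `links` maps an article title to the set of titles it links to.
    Returns 0.0 when the article has fewer than two outgoing links.
    """
    neighbors = sorted(links.get(article, set()) - {article})
    pairs = list(combinations(neighbors, 2))
    if not pairs:
        return 0.0
    connected = sum(
        1
        for a, b in pairs
        # Count the pair if a link exists in either direction.
        if b in links.get(a, set()) or a in links.get(b, set())
    )
    return connected / len(pairs)


# Hypothetical toy data, invented for this example only.
links = {
    "Real article": {"Paris", "France", "Seine"},
    "Paris": {"France", "Seine"},
    "France": {"Paris"},
    "Seine": {"France"},
    "Hoax article": {"Paris", "Jazz", "Granite"},
    "Jazz": set(),
    "Granite": set(),
}

real_score = link_coherence("Real article", links)
hoax_score = link_coherence("Hoax article", links)
print(real_score, hoax_score)
```

In this toy data, the pages linked from the real article all reference one another, while the pages linked from the hoax are unrelated, so the real article scores higher. A score like this could then be fed to a classifier as one feature among many.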

Now, go to Google and type a question like “When did Leonardo Da Vinci die?”. You will get a lot of results for your search, but at the top, you will see a small box with the answer inside. How did Google know what we wanted?

Back in 2012, Google released its Knowledge Graph, which models entities in the world and the relationships between them as a graph. So the name you type is not treated as a mere string, but as a node in a huge graph. Leonardo Da Vinci is one node of this graph. Another node is May 2, 1519, which is his death date. There is a link connecting these two nodes. The link’s name, or relation, is Date of Death.

Of course, querying this graph and finding ways to embed the nodes and relations is another story, which I will not tackle here!
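Leaving real query engines and embeddings aside, the node, relation, node idea can be pictured with a toy example. The sketch below stores facts as (subject, relation, object) triples and looks one up. The `answer` helper and the extra "Date of Birth" fact are assumptions added for illustration, and this says nothing about how Google's Knowledge Graph is actually implemented.

```python
from collections import defaultdict

# Each fact is a (subject, relation, object) triple.
# The "Date of Death" triple comes from the example in the text;
# the "Date of Birth" triple is added here only for illustration.
triples = [
    ("Leonardo da Vinci", "Date of Death", "May 2, 1519"),
    ("Leonardo da Vinci", "Date of Birth", "April 15, 1452"),
]

# Index the triples as: subject -> relation -> object.
graph = defaultdict(dict)
for subject, relation, obj in triples:
    graph[subject][relation] = obj


def answer(entity, relation):
    """Follow the `relation` link out of `entity`; None if no such link exists."""
    return graph.get(entity, {}).get(relation)


print(answer("Leonardo da Vinci", "Date of Death"))
```

The hard parts in practice are mapping a free-text question to the right entity and relation, and doing so over billions of facts, which is exactly the part this toy skips.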

Another interesting application of machine learning on graphs is predicting the side-effects of taking multiple drugs together. Many patients have to take more than one drug at a time. Each drug affects a certain set of proteins, so we can build a graph whose nodes are drugs and proteins, with an arrow indicating that a drug affects a protein. We already know the effects of some drugs taken together. The problem is that we do not know the effects of all pairs of drugs, since there are over 13000 drugs and running experiments for every pair is time-consuming.


Drug and Protein Graph

The alternative is to use machine learning to predict these side-effects. In the graph, drugs are represented by triangles and proteins by circles. A link from a drug to a protein indicates that the protein is affected by this drug. A link between two drugs indicates that there is a side-effect if the two drugs are taken together. Notice how if drug #1 and drug #2 are taken together, nausea occurs. What happens if drug #2 and drug #3 are taken together? This is a task called Link Prediction, where we aim to predict whether there is a link between two nodes by taking advantage of the other links in the graph! Several side-effects have been predicted using machine learning, without the need for time-consuming experiments.
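As a rough illustration of link prediction, the sketch below scores each drug pair without a known link by how much their protein targets overlap (Jaccard similarity). This is a simple neighborhood-overlap heuristic standing in for a learned model, not the method used in the research mentioned above. The protein names and target sets are hypothetical, and the score only suggests that some interaction is plausible, not which side-effect would occur.

```python
from itertools import combinations


def jaccard(a, b):
    """Size of the intersection divided by size of the union (0.0 if both empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def score_unknown_pairs(targets, known_links):
    """Score every drug pair that has no known link by protein-target overlap."""
    scores = {}
    for a, b in combinations(sorted(targets), 2):
        if frozenset({a, b}) in known_links:
            continue  # this link is already observed, nothing to predict
        scores[(a, b)] = jaccard(targets[a], targets[b])
    return scores


# Hypothetical drug -> affected-protein edges, invented for this example.
targets = {
    "drug1": {"P1", "P2"},
    "drug2": {"P2", "P3"},
    "drug3": {"P3", "P4"},
}

# The one drug-drug link we already know about, as in the figure.
known_links = {frozenset({"drug1", "drug2"}): "nausea"}

scores = score_unknown_pairs(targets, known_links)
for pair, score in sorted(scores.items(), key=lambda item: -item[1]):
    print(pair, round(score, 3))
```

With these made-up targets, drug #2 and drug #3 share a protein while drug #1 and drug #3 share none, so the former pair ranks as the more likely candidate for a link.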

To end, graphs have been gaining increasing attention over the past couple of years, especially in the machine learning community. They are a language for describing complex data across various domains. Combined with machine learning, they have had a great impact on social networking, drug design, AI reasoning, and many more areas.

I have given a basic overview of applications of graphs in machine learning. I am thinking of publishing articles tackling both the theoretical and practical sides. I will cover basic graph theory, social networks, random graph models, spectral clustering, graph neural networks, and deep generative models for graphs. I will also accompany these with code implementations. But first, I need to know if there is an audience for this. If you are interested, please let me know what you think!

Thanks for your time!

