KnowledgeNet: A Benchmark for Knowledge Base Population

栏目: IT技术 · 发布时间: 6年前

内容简介:The problem of automatically augmenting a knowledge base with facts expressed in natural language is known as Knowledge Base Population (KBP). This problem has been extensively studied in the last couple of decades; however, progress has been slow in part

EMNLP 2019 paper , dataset leaderboard  and code

Knowledge bases (also known as knowledge graphs or ontologies) are valuable resources for developing intelligence applications, including search, question answering, and recommendation systems. However, high-quality knowledge bases still mostly rely on structured data curated by humans. Such reliance on human curation is a major obstacle to the creation of comprehensive, always-up-to-date knowledge bases such as the Diffbot Knowledge Graph .

The problem of automatically augmenting a knowledge base with facts expressed in natural language is known as Knowledge Base Population (KBP). This problem has been extensively studied in the last couple of decades; however, progress has been slow in part because of the lack of benchmark datasets. 

KnowledgeNet: A Benchmark for Knowledge Base Population
Knowledge Base Population (KBP) is the problem of automatically augmenting a knowledge base with facts expressed in natural language.

KnowledgeNet is a benchmark dataset for populating Wikidata with facts expressed in natural language on the web. Facts are of the form (subject; property; object), where subject and object are linked to Wikidata. For instance, the dataset contains text expressing the fact ( Gennaro Basile ; RESIDENCE; Moravia ), in the passage:

“Gennaro Basile was an Italian painter, born in Naples but active in the German-speaking countries. He settled at Brunn, in Moravia, and lived about 1756…”

KBP has been mainly evaluated via annual contests promoted by TAC . TAC evaluations are performed manually and are hard to reproduce for new systems . Unlike TAC, KnowledgeNet employs an automated and reproducible way to evaluate KBP systems at any time, rather than once a year. We hope a faster evaluation cycle will accelerate the rate of improvement for KBP.

Please refer to our EMNLP 2019 Paper for details on KnowlegeNet, but here are some takeaways:

  • State-of-the-art models (using BERT ) are far from achieving human performance (0.504 vs 0.822).
  • The traditional pipeline approach for this problem is severely limited by error propagation.
  • KnowledgeNet enables the development of end-to-end systems, which are a promising solution for addressing error propagation.

以上所述就是小编给大家介绍的《KnowledgeNet: A Benchmark for Knowledge Base Population》,希望对大家有所帮助,如果大家有任何疑问请给我留言,小编会及时回复大家的。在此也非常感谢大家对 码农网 的支持!

查看所有标签

猜你喜欢:

本站部分资源来源于网络,本站转载出于传递更多信息之目的,版权归原作者或者来源机构所有,如转载稿涉及版权问题,请联系我们

Making Things See

Making Things See

Greg Borenstein / Make / 2012-2-3 / USD 39.99

Welcome to the Vision Revolution. With Microsoft's Kinect leading the way, you can now use 3D computer vision technology to build digital 3D models of people and objects that you can manipulate with g......一起来看看 《Making Things See》 这本书的介绍吧!

HTML 编码/解码
HTML 编码/解码

HTML 编码/解码

XML、JSON 在线转换
XML、JSON 在线转换

在线XML、JSON转换工具

HEX CMYK 转换工具
HEX CMYK 转换工具

HEX CMYK 互转工具