内容简介:The problem of automatically augmenting a knowledge base with facts expressed in natural language is known as Knowledge Base Population (KBP). This problem has been extensively studied in the last couple of decades; however, progress has been slow in part
EMNLP 2019 paper , dataset , leaderboard and code
Knowledge bases (also known as knowledge graphs or ontologies) are valuable resources for developing intelligence applications, including search, question answering, and recommendation systems. However, high-quality knowledge bases still mostly rely on structured data curated by humans. Such reliance on human curation is a major obstacle to the creation of comprehensive, always-up-to-date knowledge bases such as the Diffbot Knowledge Graph .
The problem of automatically augmenting a knowledge base with facts expressed in natural language is known as Knowledge Base Population (KBP). This problem has been extensively studied in the last couple of decades; however, progress has been slow in part because of the lack of benchmark datasets.
KnowledgeNet is a benchmark dataset for populating Wikidata with facts expressed in natural language on the web. Facts are of the form (subject; property; object), where subject and object are linked to Wikidata. For instance, the dataset contains text expressing the fact ( Gennaro Basile ; RESIDENCE; Moravia ), in the passage:
“Gennaro Basile was an Italian painter, born in Naples but active in the German-speaking countries. He settled at Brunn, in Moravia, and lived about 1756…”
KBP has been mainly evaluated via annual contests promoted by TAC . TAC evaluations are performed manually and are hard to reproduce for new systems . Unlike TAC, KnowledgeNet employs an automated and reproducible way to evaluate KBP systems at any time, rather than once a year. We hope a faster evaluation cycle will accelerate the rate of improvement for KBP.
Please refer to our EMNLP 2019 Paper for details on KnowlegeNet, but here are some takeaways:
- State-of-the-art models (using BERT ) are far from achieving human performance (0.504 vs 0.822).
- The traditional pipeline approach for this problem is severely limited by error propagation.
- KnowledgeNet enables the development of end-to-end systems, which are a promising solution for addressing error propagation.
以上所述就是小编给大家介绍的《KnowledgeNet: A Benchmark for Knowledge Base Population》,希望对大家有所帮助,如果大家有任何疑问请给我留言,小编会及时回复大家的。在此也非常感谢大家对 码农网 的支持!
猜你喜欢:本站部分资源来源于网络,本站转载出于传递更多信息之目的,版权归原作者或者来源机构所有,如转载稿涉及版权问题,请联系我们。
Making Things See
Greg Borenstein / Make / 2012-2-3 / USD 39.99
Welcome to the Vision Revolution. With Microsoft's Kinect leading the way, you can now use 3D computer vision technology to build digital 3D models of people and objects that you can manipulate with g......一起来看看 《Making Things See》 这本书的介绍吧!