EMNLP 2019 paper, dataset, leaderboard and code
Knowledge bases (also known as knowledge graphs or ontologies) are valuable resources for developing intelligent applications, including search, question answering, and recommendation systems. However, high-quality knowledge bases still rely mostly on structured data curated by humans. This reliance on human curation is a major obstacle to the creation of comprehensive, always-up-to-date knowledge bases such as the Diffbot Knowledge Graph.
The problem of automatically augmenting a knowledge base with facts expressed in natural language is known as Knowledge Base Population (KBP). This problem has been extensively studied in the last couple of decades; however, progress has been slow in part because of the lack of benchmark datasets.
KnowledgeNet is a benchmark dataset for populating Wikidata with facts expressed in natural language on the web. Facts are of the form (subject; property; object), where subject and object are linked to Wikidata. For instance, the dataset contains text expressing the fact (Gennaro Basile; RESIDENCE; Moravia), in the passage:
“Gennaro Basile was an Italian painter, born in Naples but active in the German-speaking countries. He settled at Brunn, in Moravia, and lived about 1756…”
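As a rough illustration of the (subject; property; object) form, the fact above could be held in a small data structure like the one below. This is a minimal sketch and not the dataset's actual file format: the class names, field names and the `wikidata_id` placeholder are assumptions made for illustration, and the real Wikidata identifiers are deliberately left unset.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class Entity:
    """An entity mention in a passage (hypothetical structure)."""
    mention: str                       # surface text as it appears in the passage
    start: int                         # character offset of the mention in the passage
    wikidata_id: Optional[str] = None  # placeholder: the real Wikidata ID is not shown here


@dataclass(frozen=True)
class Fact:
    """A fact of the form (subject; property; object)."""
    subject: Entity
    prop: str
    obj: Entity

    def as_triple(self) -> Tuple[str, str, str]:
        return (self.subject.mention, self.prop, self.obj.mention)


passage = (
    "Gennaro Basile was an Italian painter, born in Naples but active in the "
    "German-speaking countries. He settled at Brunn, in Moravia, and lived about 1756…"
)


def make_entity(text: str, mention: str) -> Entity:
    # Locate the first occurrence of the mention in the passage.
    start = text.find(mention)
    if start < 0:
        raise ValueError(f"mention {mention!r} not found in passage")
    return Entity(mention=mention, start=start)


fact = Fact(
    subject=make_entity(passage, "Gennaro Basile"),
    prop="RESIDENCE",
    obj=make_entity(passage, "Moravia"),
)
print(fact.as_triple())
```

Keeping the character offsets alongside the mention ties each fact back to the text that expresses it, which is what separates populating a knowledge base from text from simply listing triples.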
KBP has mainly been evaluated via annual contests organized by the Text Analysis Conference (TAC). TAC evaluations are performed manually and are hard to reproduce for new systems. Unlike TAC, KnowledgeNet provides an automated and reproducible way to evaluate KBP systems at any time, rather than once a year. We hope a faster evaluation cycle will accelerate the rate of improvement for KBP.
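To give a sense of what an automated evaluation can look like, the sketch below scores a set of predicted facts against a set of gold facts using precision, recall and F1. It is an illustration under stated assumptions, not the official KnowledgeNet evaluator (see the paper for the actual protocol): the function name `score_facts`, the exact-match criterion and the `PLACE_OF_BIRTH` property label are all hypothetical.

```python
from typing import Iterable, Tuple

Triple = Tuple[str, str, str]  # (subject, property, object)


def score_facts(gold: Iterable[Triple], predicted: Iterable[Triple]) -> Tuple[float, float, float]:
    """Return (precision, recall, F1) using exact triple matching (an assumption)."""
    gold_set, pred_set = set(gold), set(predicted)
    true_positives = len(gold_set & pred_set)
    precision = true_positives / len(pred_set) if pred_set else 0.0
    recall = true_positives / len(gold_set) if gold_set else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Gold facts for the example passage; the PLACE_OF_BIRTH label is hypothetical.
gold = {
    ("Gennaro Basile", "RESIDENCE", "Moravia"),
    ("Gennaro Basile", "PLACE_OF_BIRTH", "Naples"),
}
# Output of an imaginary system: one correct fact and one wrong one.
predicted = {
    ("Gennaro Basile", "RESIDENCE", "Moravia"),
    ("Gennaro Basile", "RESIDENCE", "Naples"),
}

precision, recall, f1 = score_facts(gold, predicted)
print(precision, recall, f1)  # 0.5 0.5 0.5
```

Because the score is computed from the system output and the annotations alone, it can be rerun for any new system at any time, which is the property that manual evaluations lack.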
Please refer to our EMNLP 2019 paper for details on KnowledgeNet, but here are some takeaways:
- State-of-the-art models (using BERT) are far from achieving human performance (F1 of 0.504 vs 0.822).
- The traditional pipeline approach for this problem is severely limited by error propagation.
- KnowledgeNet enables the development of end-to-end systems, which are a promising solution for addressing error propagation.