Understanding Frame Semantic Parsing in NLP

An attempt to make computers understand the meaning of our language

Studying computational linguistics can be challenging, especially because linguists have coined a lot of terms. Some terms name tasks, such as word sense disambiguation, co-reference resolution, or lemmatization. Others name the attributes each task works with, for example, lemma, part-of-speech tag (POS tag), semantic role, and phoneme.

This article aims to give a broad understanding of the frame semantic parsing task in layman's terms. It starts with what the task is used for, then defines some terms, and finally looks at existing models for frame semantic parsing. It does not contain complete references to definitions, models, and datasets, only the things I subjectively consider important.

The frame semantic parsing task begins with the FrameNet project [1], whose complete reference is available on its website [2]. The project aims to capture the meaning of words.

What Is Semantic Frame Parsing Used For?

Semantic frame parsing may be used in applications that need a deeper understanding of the meaning of words, such as question answering. It tries to determine what the text is talking about (an oversimplified paraphrase of a frame) and who did what to whom (an oversimplified paraphrase of frame elements, or semantic roles). Consider an example:

[The price of bananas] increased [5%]
[The price of bananas] rose [5%]
There has been a [5%] rise in [the price of bananas]

The phrases in brackets are the arguments, while “increased”, “rose”, and “rise” are the predicates.

All of these sentences mean the same thing, but how can a computer understand them? We want to be able to ask a computer, for example,

“How much has the price of bananas increased?”

Given such mixed structures, the computer may get confused and fail to find the correct answer.

But what if the computer can parse those sentences into semantic frames? It will recognize that they most probably evoke the Motion_Directional frame. Then it will recognize that [The price of bananas] is the Theme and [5%] is the Distance, both frame elements of the Motion_Directional frame. Knowing this, it should answer with the Distance frame element.
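
The idea can be sketched in a few lines of Python. The annotations below are written by hand to mirror the three example sentences, using the frame and element names from this article. The `FrameAnnotation` class and the `answer_how_much` function are hypothetical names used only for illustration; a real system would get its annotations from a trained parser.

```python
from dataclasses import dataclass


@dataclass
class FrameAnnotation:
    """One parsed sentence: the target word, its frame, and its frame elements."""
    target: str
    frame: str
    elements: dict  # frame element name -> text span


# Hand-written annotations for the three example sentences (not parser output).
parsed = [
    FrameAnnotation("increased", "Motion_Directional",
                    {"Theme": "The price of bananas", "Distance": "5%"}),
    FrameAnnotation("rose", "Motion_Directional",
                    {"Theme": "The price of bananas", "Distance": "5%"}),
    FrameAnnotation("rise", "Motion_Directional",
                    {"Theme": "the price of bananas", "Distance": "5%"}),
]


def answer_how_much(theme, annotations):
    """Return the Distance element of the first frame whose Theme matches."""
    for ann in annotations:
        if ann.elements.get("Theme", "").lower() == theme.lower():
            return ann.elements.get("Distance")
    return None


# Each sentence, whatever its surface structure, yields the same answer.
for ann in parsed:
    print(ann.target, "->", answer_how_much("the price of bananas", [ann]))
```

Because the question is answered from the frame elements rather than from word order, the three differently structured sentences are handled in the same way.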

Terms Definitions

Reading articles and papers on frame semantic parsing can be confusing. At first glance, it is hard to understand most of the terms in the reading materials, so it helps to learn some core terms first.

Frame (Semantic Frame)

A frame, or semantic frame, is a category for a part of a sentence. This category indicates that the part of the sentence will have certain components. In a sense, the semantic frame is like a rule book. When you see that a part of the sentence has this semantic frame, you know what else may be in that part of the sentence. Let me show you an example:

Cyra tried to swing her sword to parry, but it was too heavy.

You will notice that sword is a “weapon” and her (which can be co-referenced to Cyra) is a “wielder”. This sentence has a high probability of being categorized as containing the “Weapon” frame (see the frame index). According to the “Weapon” frame, it must have a “Weapon” element. Optionally, it may contain a “Wielder” role, as in this example.
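
The rule-book idea can be written down as a small data structure. This is a minimal sketch: the `Frame` class and its `check` method are hypothetical, and only the two elements mentioned above are listed, whereas the real frame index lists more.

```python
from dataclasses import dataclass, field


@dataclass
class Frame:
    name: str
    required: set                                 # elements that must appear
    optional: set = field(default_factory=set)    # elements that may appear

    def check(self, found):
        """Return (missing required elements, elements unknown to this frame)."""
        found = set(found)
        missing = self.required - found
        unknown = found - self.required - self.optional
        return missing, unknown


# Only the elements named in this article; not the full FrameNet entry.
weapon = Frame("Weapon", required={"Weapon"}, optional={"Wielder"})

# Elements spotted in "Cyra tried to swing her sword to parry, ..."
found = {"Weapon": "sword", "Wielder": "her"}
missing, unknown = weapon.check(found)
print(missing, unknown)  # both sets are empty: the sentence fits the frame
```

If the sentence had mentioned only a wielder, `check` would report the “Weapon” element as missing.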

Targets

Targets are the words or sequences of words that should be labeled with frames. It is easier to understand with an example.

Figure 1: Example of a Sentence that has been through frame semantic parsing [4]

Figure 1 shows an example of a sentence with 4 targets, denoted by the highlighted words and sequences of words. Those targets are “played”, “major”, “preventing”, and “drying up”. Each of these targets corresponds directly to a frame: PERFORMERS_AND_ROLES, IMPORTANCE, THWARTING, and BECOMING_DRY, respectively, shown as the boxed categories.

Frame Element

A frame element is a component of a semantic frame, specific to certain frames. If you have looked at the frame index, you will have noticed that some words are highlighted. These are the frame elements, and each frame may have different types of frame elements.

In Figure 1, frame elements are denoted by underlines. For example, “Hoover Dam”, “a major role”, and “in preventing Las Vegas from drying up” are frame elements of the PERFORMERS_AND_ROLES frame.

But then you may think: wow, isn’t that the whole sentence? Does a frame cover a whole sentence, then? It depends on the rules of each specific frame. Here is a shorter example: “Las Vegas” is a frame element of the BECOMING_DRY frame. See? It does not have to be a whole sentence.
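
Targets and frame elements are both just spans of tokens, which the following sketch makes concrete. The sentence is reconstructed from the phrases quoted above, which is an assumption since the figure itself is not reproduced here, and `find_span` is a hypothetical helper. Role names are left out because the text above does not give them.

```python
# Sentence reconstructed from the quoted phrases (an assumption).
tokens = "Hoover Dam played a major role in preventing Las Vegas from drying up".split()


def find_span(phrase, tokens):
    """Return (start, end) token indices of the first match, end exclusive."""
    words = phrase.split()
    for i in range(len(tokens) - len(words) + 1):
        if tokens[i:i + len(words)] == words:
            return i, i + len(words)
    raise ValueError(f"{phrase!r} not found")


# Target -> frame, as listed for Figure 1.
targets = {
    "played": "PERFORMERS_AND_ROLES",
    "major": "IMPORTANCE",
    "preventing": "THWARTING",
    "drying up": "BECOMING_DRY",
}

# Frame -> frame element phrases mentioned in the text above.
frame_elements = {
    "PERFORMERS_AND_ROLES": ["Hoover Dam", "a major role",
                             "in preventing Las Vegas from drying up"],
    "BECOMING_DRY": ["Las Vegas"],
}

for target, frame in targets.items():
    element_spans = [find_span(p, tokens) for p in frame_elements.get(frame, [])]
    print(frame, "target:", find_span(target, tokens), "elements:", element_spans)
```

The spans for PERFORMERS_AND_ROLES cover almost every token, while the single BECOMING_DRY element covers only two.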

Lemma

A lemma is the basic form of a word. In English, runs, ran, and run all have the same lemma: run. A lemma is not exactly the same thing as a word, because it may contain more than one word, for example, “atomic weapon” or “flame-thrower”.

Lexical Unit (LU)

A lexical unit, in this context, is a pairing of the basic form of a word (lemma) with a frame. In the frame index, a lexical unit is also paired with its part-of-speech tag (such as Noun/n or Verb/v). I believe the purpose is to state clearly which meaning the lemma refers to (a lemma or word that has multiple meanings is called polysemous).
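
A lexical unit can be modeled as a small record, as in the sketch below. The `LexicalUnit` type is hypothetical, and the two frame names attached to “run” are placeholders invented for this example rather than real FrameNet frames.

```python
from collections import defaultdict
from typing import NamedTuple


class LexicalUnit(NamedTuple):
    lemma: str   # basic form, possibly more than one word
    pos: str     # part-of-speech tag, e.g. "n" or "v"
    frame: str   # the frame this sense of the lemma belongs to

    def label(self):
        """Format the unit as lemma.pos, e.g. 'sword.n'."""
        return f"{self.lemma}.{self.pos}"


# Illustrative entries; the frame names for "run" are placeholders.
lexical_units = [
    LexicalUnit("sword", "n", "Weapon"),
    LexicalUnit("atomic weapon", "n", "Weapon"),
    LexicalUnit("run", "v", "PLACEHOLDER_MOVING_FRAME"),
    LexicalUnit("run", "v", "PLACEHOLDER_OPERATING_FRAME"),
]

frames_by_label = defaultdict(list)
for lu in lexical_units:
    frames_by_label[lu.label()].append(lu.frame)

# A label that maps to more than one frame marks a polysemous lemma.
polysemous = {label: frames for label, frames in frames_by_label.items()
              if len(frames) > 1}
print(polysemous)
```

Pairing the lemma with a frame is what lets the two senses of “run” be told apart.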

Source: Frame Index

Existing Models

The typical pipeline for this task is to identify targets, classify which frame each target evokes, and identify arguments.

Early work on automatic frame semantic labeling involved two steps: identifying frame element boundaries in the sentence, and frame labeling [3]. This early work used many grammatical features as input for frame labeling: phrase type (noun phrase, verb phrase, and clause), grammatical function, position, voice, and headword. The final system achieved 66% on identifying frame element boundaries and 80% accuracy on frame labeling.
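
The three steps can be laid out as a skeleton. This is only a toy sketch: the lookup tables stand in for trained models, and every name in it (`TOY_LEXICON`, `TOY_ARGUMENTS`, and the functions) is invented for illustration.

```python
# Toy lexicon: target word -> frame. Stands in for a trained frame classifier.
TOY_LEXICON = {"sword": "Weapon"}
# Toy argument rules: frame -> {frame element: word that fills it}.
TOY_ARGUMENTS = {"Weapon": {"Weapon": "sword", "Wielder": "her"}}


def identify_targets(tokens):
    """Step 1: pick the token positions that evoke some frame."""
    return [i for i, tok in enumerate(tokens) if tok.lower() in TOY_LEXICON]


def classify_frame(tokens, target_index):
    """Step 2: choose a frame for one target."""
    return TOY_LEXICON[tokens[target_index].lower()]


def identify_arguments(tokens, frame):
    """Step 3: find the frame elements of the chosen frame in the sentence."""
    return {role: word for role, word in TOY_ARGUMENTS[frame].items()
            if word in tokens}


def parse(sentence):
    """Run the three steps in order and collect (target, frame, arguments)."""
    tokens = sentence.replace(",", "").replace(".", "").split()
    results = []
    for i in identify_targets(tokens):
        frame = classify_frame(tokens, i)
        results.append((tokens[i], frame, identify_arguments(tokens, frame)))
    return results


print(parse("Cyra tried to swing her sword to parry, but it was too heavy."))
```

Each step consumes the output of the previous one, so an error in target identification propagates to the later steps.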
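
To make the feature list concrete, here is a sketch of how one candidate constituent might be described before it is passed to a classifier. The `ConstituentFeatures` class is hypothetical, and the values are filled in by hand for “[The price of bananas] increased [5%]”; it is not a reproduction of the feature extraction in [3].

```python
from dataclasses import dataclass, asdict


@dataclass
class ConstituentFeatures:
    phrase_type: str           # e.g. "NP", "VP", "S"
    grammatical_function: str  # e.g. "subject", "object"
    position: str              # "before" or "after" the predicate
    voice: str                 # "active" or "passive"
    head_word: str

    def to_feature_strings(self):
        """Flatten into name=value strings, a common input form for classifiers."""
        return [f"{name}={value}" for name, value in asdict(self).items()]


# Hand-filled values for the constituent "The price of bananas".
subject = ConstituentFeatures("NP", "subject", "before", "active", "price")
print(subject.to_feature_strings())
```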

A more recent model uses a neural network to do frame semantic parsing, to see whether it can reduce the use of syntactic features [4]. The authors split the task into 3 parts: target identification, frame labeling, and argument identification.

Target identification is the task of determining which words or phrases should be labeled. A target can also be called a predicate, and it could be a noun or a verb. The authors used 3 features for target identification: the tokens, the part of speech of those tokens, and the token lemmas. They also combined a pre-trained token embedding, using the GloVe representation, with a trained token embedding. They then used a bi-LSTM layer, whose output is used to predict whether a token is a target.

The frame labeling task here means the same thing as in the earlier method. Here, the authors suggest using a bi-LSTM-based classifier. The model should take at least the tokens, lemmas, part-of-speech tags, and the target position, which is the result of the earlier task.

What I find interesting here is the argument identification part. “Argument” here is probably not what some of you may think; rather, it refers to the predicate-argument structure [5]. In other words, given that we have found a predicate, which words or phrases are connected to it? It is essentially the same as semantic role labeling [6]: who did what to whom. The main difference is that semantic role labeling assumes all predicates are verbs [7], while semantic frame parsing makes no such assumption.
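
To see why that assumption matters, here is a small sketch using the three banana sentences from earlier. The part-of-speech tags are assigned by hand, and the dictionary layout is only illustrative.

```python
# Predicates from the three example sentences, with hand-assigned POS tags.
predicates = [
    {"word": "increased", "pos": "v", "arguments": ["The price of bananas", "5%"]},
    {"word": "rose", "pos": "v", "arguments": ["The price of bananas", "5%"]},
    {"word": "rise", "pos": "n", "arguments": ["5%", "the price of bananas"]},
]

# A labeler that only accepts verb predicates skips the noun "rise".
verb_only = [p["word"] for p in predicates if p["pos"] == "v"]
# Frame semantic parsing accepts a predicate of any part of speech.
any_pos = [p["word"] for p in predicates]

print(verb_only)  # ['increased', 'rose']
print(any_pos)    # ['increased', 'rose', 'rise']
```

Under the verb-only assumption, the third sentence would contribute no predicate at all, even though it carries the same information as the other two.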

References

[1] Baker, Collin F., Charles J. Fillmore, and John B. Lowe. 1998. “The Berkeley FrameNet Project,” 86. https://doi.org/10.3115/980845.980860.

[2] FrameNet project website. https://framenet.icsi.berkeley.edu.

[3] Gildea, Daniel, and Daniel Jurafsky. 2002. “Automatic Labeling of Semantic Roles.” Computational Linguistics 28 (3). https://doi.org/10.1162/089120102760275983.

[4] Swayamdipta, Swabha, Sam Thomson, Chris Dyer, and Noah A. Smith. 2017. “Frame-Semantic Parsing with Softmax-Margin Segmental RNNs and a Syntactic Scaffold.” http://arxiv.org/abs/1706.09528.

[5] https://en.wikipedia.org/wiki/Argument_(linguistics).

[6] https://en.wikipedia.org/wiki/Semantic_role_labeling.

[7] Jurafsky, Daniel, and James H. Martin. 2019. Speech and Language Processing, 3rd edition. https://web.stanford.edu/~jurafsky/slp3/.
