Joint Learning of Entity Recognition and Relation Extraction Based on Neural Networks

Posted by deaguero at 2020-03-09

Author: Luo Ling

PhD student at Dalian University of Technology

Research interests: deep learning, text classification, entity recognition

Joint learning is not a new term. In natural language processing, researchers have long used joint models based on traditional machine learning to jointly learn closely related NLP tasks, for example joint learning of entity recognition and entity normalization, or of word segmentation and part-of-speech tagging.

Recently, researchers have turned to neural network methods for the joint learning of entity recognition and relation extraction. I have read some of the related work and would like to share it here (some material is quoted from the slide presentations of Suncong Zheng, an author of the papers discussed below).

The task considered here is to extract from unstructured text the entities and the relations between them, as (entity 1, relation, entity 2) triples. The relations are drawn from a predefined set of relation types, as in the figure below:

At present there are two kinds of approach. The first is the pipeline method: given an input sentence, first recognize the named entities, then pair the recognized entities and classify the relation of each pair, and finally output the triples that hold an entity relation.

The disadvantages of the pipeline method are as follows:

1. Error propagation: errors made by the entity recognition module degrade the performance of the subsequent relation classification;

2. The interdependence of the two subtasks is ignored. In the example in the figure, if a Country-President relation holds, we know the first entity must be of a location (country) type and the second a person type; the pipeline method cannot exploit such information;

3. Unnecessary redundant information is produced. Because the recognized entities are paired exhaustively before classification, the many unrelated entity pairs introduce redundancy and raise the error rate.
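The redundancy in point 3 is easy to see with a short sketch (the sentence and entity names below are hypothetical): a pipeline that recognizes n entities must send every ordered pair to the relation classifier, even though almost none of the pairs hold a relation.

```python
from itertools import permutations

def candidate_pairs(entities):
    """Every ordered pair of recognized entities (entity 1 vs. entity 2)
    becomes a candidate that the relation classifier must score."""
    return list(permutations(entities, 2))

# Four hypothetical recognized entities; suppose only the ordered pair
# ("United States", "Trump") actually holds a relation.
entities = ["United States", "Trump", "Apple Inc", "Tim Cook"]
pairs = candidate_pairs(entities)
print(len(pairs))  # 12 candidate pairs, 11 of them unrelated
```

With n recognized entities the classifier scores n·(n−1) ordered pairs, so the fraction of related pairs shrinks quickly as n grows.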

Ideal joint learning looks like this: a sentence goes into a single joint model for entity recognition and relation extraction, and the related entity triples come out directly. This overcomes the shortcomings of the pipeline method listed above, though the model structure may be more complex.

Here I focus mainly on joint learning with neural network methods. I divide the existing work into two categories: 1. parameter sharing and 2. tagging schemes, covering mainly the following papers.

In the paper Joint Entity and Relation Extraction Based on a Hybrid Neural Network, Zheng et al. perform joint learning by sharing the underlying representations of a neural network.

Specifically, the input sentence is encoded by a shared word embedding layer followed by a shared bidirectional LSTM layer; on top of these, an LSTM decodes the labels for named entity recognition (NER) and a CNN performs relation classification (RC).

Compared with the current mainstream NER model, BiLSTM-CRF, this model replaces the CRF layer: the previously predicted label is embedded and fed into the current decoding step, which addresses the label dependency problem in NER.

For relation classification, entities are paired according to the NER predictions, and the text between each entity pair is classified by a CNN. The model therefore relies mainly on sharing the parameters of the underlying layers: during training, both tasks update the shared parameters via the backpropagation algorithm, which captures the dependency between the two subtasks.
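As a minimal illustration of how shared parameters receive gradient from both tasks (the scalar model and the two squared-error losses are my own toy assumptions, not the paper's architecture):

```python
# One shared scalar weight w stands in for the shared embedding/BiLSTM
# parameters; each task contributes its own loss term on top of it.

def ner_loss(w, x, y):
    return (w * x - y) ** 2

def rc_loss(w, x, y):
    return (w * x - y) ** 2

def grad_joint(w, x, y_ner, y_rc):
    # d/dw [(w*x - y)^2] = 2*x*(w*x - y); the shared parameter
    # accumulates gradient contributions from BOTH tasks.
    g_ner = 2 * x * (w * x - y_ner)
    g_rc = 2 * x * (w * x - y_rc)
    return g_ner + g_rc

w, lr = 0.0, 0.05
for _ in range(200):
    w -= lr * grad_joint(w, x=1.0, y_ner=2.0, y_rc=4.0)
print(round(w, 3))  # converges to 3.0, a compromise between both tasks
```

The shared weight settles at the minimizer of the summed loss, which is how parameter sharing lets each task influence the representation the other uses.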

The paper "End-to-End Relation Extraction using LSTMs on Sequences and Tree Structures" follows a similar idea, using parameter sharing for joint learning, but differs in the decoding models used for NER and RC.

In that paper, Miwa et al. also use parameter sharing: NER labels are decoded with a neural network, and dependency information is added for RC, where a BiLSTM over the shortest path between the two entities in the dependency tree classifies the relation.

According to the experiments in these two papers, joint learning through parameter sharing achieves better results than the pipeline method, improving the F1 score on their tasks by about 1%; it is a simple and general approach. The same idea is applied to entity and relation extraction from biomedical text in the paper A Neural Joint Model for Entity and Relation Extraction from Biomedical Text.

However, the parameter-sharing approach still really contains two subtasks, which interact only through the shared parameters. Moreover, NER must be run first, and entities are then paired according to the NER predictions for relation classification, so the redundant information from unrelated entity pairs remains.

For this reason, Zheng et al. put forward a new tagging strategy for relation extraction in the paper Joint Extraction of Entities and Relations Based on a Novel Tagging Scheme, which was published at ACL 2017 and selected as an outstanding paper.

Their new tagging strategy turns relation extraction, previously a combination of a sequence labeling task and a classification task, into a single sequence labeling problem. An end-to-end neural network model then produces the relational entity triples directly.

Each new tag consists of three parts: 1) position information {B (entity begin), I (entity inside), E (entity end), S (single-token entity)}; 2) relation type information {encoded according to the predefined relation types}; 3) entity role information {1 (entity 1), 2 (entity 2)}. Note that all words not belonging to any entity relation triple are labeled "O".
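A small sketch of how such tags could be composed (the sentence, the relation abbreviation "CP" for Country-President, and the hyphen-joined string encoding are my own assumptions for illustration):

```python
def make_tag(position, rel_type, role):
    """Compose a tag from its three parts:
    position in {B, I, E, S}, a relation type code, role in {1, 2}."""
    return f"{position}-{rel_type}-{role}"

# Hypothetical sentence "United States president Trump" with one
# Country-President ("CP") triple; "president" is outside the triple.
tags = {
    "United": make_tag("B", "CP", 1),  # begin of entity 1
    "States": make_tag("E", "CP", 1),  # end of entity 1
    "president": "O",                  # not part of any triple
    "Trump": make_tag("S", "CP", 2),   # single-token entity 2
}
print(tags["United"], tags["Trump"])  # B-CP-1 S-CP-2
```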

According to the tag sequence, entities with the same relation type are combined into a triple as the final output. If a sentence contains more than one relation of the same type, pairing follows the nearest-neighbor principle. At present this tag set cannot represent overlapping entity relations.
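The decoding step can be sketched as follows; the tag string format "position-relation-role" (e.g. "B-CP-1") is my assumed encoding, and the nearest-neighbor pairing matches each entity-1 with the closest entity-2 of the same relation type:

```python
def extract_entities(tokens, tags):
    """Return (entity_text, rel_type, role, token_index) tuples."""
    entities, current = [], None
    for i, (tok, tag) in enumerate(zip(tokens, tags)):
        if tag == "O":
            current = None
            continue
        pos, rel, role = tag.split("-")
        if pos == "S":                      # single-token entity
            entities.append((tok, rel, role, i))
            current = None
        elif pos == "B":                    # entity begins
            current = [tok, rel, role, i]
        elif pos in ("I", "E") and current: # entity continues / ends
            current[0] += " " + tok
            if pos == "E":
                entities.append(tuple(current))
                current = None
    return entities

def decode_triples(tokens, tags):
    ents = extract_entities(tokens, tags)
    triples = []
    for e1 in (e for e in ents if e[2] == "1"):
        # nearest-neighbor pairing among entity-2 of the same type
        partners = [e for e in ents if e[2] == "2" and e[1] == e1[1]]
        if partners:
            e2 = min(partners, key=lambda e: abs(e[3] - e1[3]))
            triples.append((e1[0], e1[1], e2[0]))
    return triples

tokens = ["United", "States", "president", "Trump"]
tags = ["B-CP-1", "E-CP-1", "O", "S-CP-2"]
print(decode_triples(tokens, tags))  # [('United States', 'CP', 'Trump')]
```

Because each entity carries exactly one tag, a token shared by two triples cannot be represented, which is the overlapping-relation limitation noted above.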

The task thus becomes a sequence labeling problem, and the overall model is shown below: a BiLSTM encodes the sentence, and the LSTM decoder mentioned in the parameter-sharing section above decodes the tags.

What differs from the classical model is a biased objective function. When a token's label is "O", the normal objective is used; when the label is not "O", i.e. it is a relational entity label, its influence on the loss is amplified by a factor α. The experimental results show that the biased objective predicts entity relation pairs more accurately.
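The biased objective can be sketched as a weighted negative log-likelihood (the value of alpha and the toy probabilities below are my assumptions, not the paper's settings):

```python
import math

def biased_nll(gold_tags, probs, alpha=10.0):
    """Negative log-likelihood where every non-'O' token is weighted
    by alpha. probs[i] is the model's predicted probability of the
    gold tag for token i."""
    loss = 0.0
    for tag, p in zip(gold_tags, probs):
        weight = 1.0 if tag == "O" else alpha
        loss -= weight * math.log(p)
    return loss

gold = ["B-CP-1", "E-CP-1", "O", "S-CP-2"]
probs = [0.9, 0.9, 0.9, 0.9]
# With alpha > 1, mistakes on entity tokens dominate the loss,
# pushing the model to get the relational tags right.
print(biased_nll(gold, probs, alpha=10.0) > biased_nll(gold, probs, alpha=1.0))
```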

To summarize, neural network methods for joint entity recognition and relation extraction fall into these two categories. Of the two, parameter sharing is simple and easy to implement and is widely used in multi-task learning.

The new tagging strategy put forward by Zheng et al. still has some problems (for example, it cannot recognize overlapping entity relations), but it offers a new idea, truly combining the two subtasks into a single sequence labeling problem. Within this tagging strategy there is room for further improvement and development to advance the end-to-end relation extraction task.

[1] S. Zheng, Y. Hao, D. Lu, H. Bao, J. Xu, H. Hao, et al. Joint Entity and Relation Extraction Based on a Hybrid Neural Network. Neurocomputing, 2017.

[2] M. Miwa, M. Bansal. End-to-End Relation Extraction using LSTMs on Sequences and Tree Structures. ACL, 2016.

[3] F. Li, M. Zhang, G. Fu, D. Ji. A Neural Joint Model for Entity and Relation Extraction from Biomedical Text. BMC Bioinformatics, 18, 2017.

[4] S. Zheng, F. Wang, H. Bao, Y. Hao, P. Zhou, B. Xu. Joint Extraction of Entities and Relations Based on a Novel Tagging Scheme. ACL, 2017.

