A Unified Architecture for Natural Language Processing: Deep Neural Networks with Multitask Learning

Authors

Abstract

We describe a single convolutional neural network architecture that given a sentence, outputs a host of language processing predictions: part-of-speech tags, chunks, named entity tags, semantic roles, semantically similar words and the likelihood that the sentence makes sense (grammatically and semantically) using a language model. The entire network is trained jointly on all these tasks using weight-sharing, an instance of multitask learning. All the tasks use labeled data except the language model which is learnt from unlabeled text and represents a novel way of performing semi-supervised learning for the shared tasks. We show how both multitask learning and semi-supervised learning improve the generalization of the shared tasks, resulting in a learnt model with state-of-the-art performance.

Discussion

Enter your comment (wiki syntax is allowed):
MQENT
 
paper/2008/391.txt · Last modified: 2008/06/22 03:35 (external edit)
 
Driven by DokuWiki