dc.contributor.advisor | Breslin, John G. | |
dc.contributor.author | Ruder, Sebastian | |
dc.date.accessioned | 2019-09-26T07:49:40Z | |
dc.date.available | 2019-09-26T07:49:40Z | |
dc.date.issued | 2019-06-07 | |
dc.identifier.uri | http://hdl.handle.net/10379/15463 | |
dc.description.abstract | The current generation of neural network-based natural language processing models excels at learning from large amounts of labelled data. Given these capabilities, natural language processing is increasingly applied to new tasks, new domains, and new languages. Current models, however, are sensitive to noise and adversarial examples and prone to overfitting. This brittleness, together with the cost of annotation, challenges the supervised learning paradigm.
Transfer learning allows us to leverage knowledge acquired from related data in order to improve performance on a target task. Implicit transfer learning in the form of pretrained word representations has been a common component of natural language processing pipelines. In this dissertation, we argue that more explicit transfer learning is key to dealing with the dearth of training data and to improving the downstream performance of natural language processing models. We present experimental results on transferring knowledge from related domains, tasks, and languages that support this hypothesis.
We make several contributions to transfer learning for natural language processing: Firstly, we propose new methods to automatically select relevant data for supervised and unsupervised domain adaptation. Secondly, we propose two novel architectures that improve sharing in multi-task learning and outperform both single-task learning and the state of the art. Thirdly, we analyze the limitations of current models for unsupervised cross-lingual transfer, propose a method to mitigate them, and introduce a novel latent variable cross-lingual word embedding model. Finally, we propose a framework based on fine-tuning language models for sequential transfer learning and analyze the adaptation phase. | en_IE |
dc.publisher | NUI Galway | |
dc.rights | Attribution-NonCommercial-NoDerivs 3.0 Ireland | |
dc.rights.uri | https://creativecommons.org/licenses/by-nc-nd/3.0/ie/ | |
dc.subject | natural language processing | en_IE |
dc.subject | machine learning | en_IE |
dc.subject | deep learning | en_IE |
dc.subject | transfer learning | en_IE |
dc.subject | Engineering and Informatics | en_IE |
dc.subject | Information technology | en_IE |
dc.subject | Computer science | en_IE |
dc.title | Neural transfer learning for natural language processing | en_IE |
dc.type | Thesis | en |
dc.contributor.funder | Irish Research Council for Science, Engineering and Technology | en_IE |
dc.local.note | This dissertation demonstrates that neural network models for natural language processing that leverage relevant existing knowledge from related domains, tasks, and languages outperform models that do not use this information across a wide range of tasks, and it proposes new transfer learning algorithms for each of these settings. | en_IE |
dc.local.final | Yes | en_IE |
nui.item.downloads | 14380 | |