dc.contributor.advisor: Davis, Brian
dc.contributor.author: Abdelaal, Hazem
dc.description.abstract: Knowledge base creation and population form an essential formal backbone for a variety of intelligent applications, decision support and expert systems, and intelligent search. Although knowledge extraction from unstructured text offers a means of easing the knowledge acquisition process, the ambiguous nature of language tends to reduce accuracy in more complex semantic analysis. Controlled Natural Languages (CNLs) are subsets of natural language that are grammatically restricted in order to reduce or eliminate ambiguity, either for the purposes of machine understanding or for unambiguous human communication within a domain or industry context, as with Simplified English. Moreover, CNLs help engage non-expert users with no background in knowledge engineering, as these languages offer user-friendly interfaces that are easier to understand and more readily accepted by users. The latter type, human-oriented CNL, is under-researched despite having found favor in industry over many years. Rewriting such human-oriented CNL content into a machine-oriented CNL could potentially unlock significant silos of implicit, valuable, general-purpose domain knowledge. In this thesis, we have developed an approach built on a series of corpus-based rewriting rules for subsequent knowledge capture. Our work confirms that a substantial amount of human-oriented CNL content can be easily translated into a machine-processable CNL for formal knowledge capture with little semantic loss. In addition, we describe a novel dataset that aligns a representative sample of Simplified English Wikipedia sentences with a well-known machine-oriented CNL. This linguistic resource is both human-readable and semantically machine-interpretable, and it can be used by the community as a gold-standard dataset to benefit a variety of language processing and knowledge-based applications. (en_IE)
dc.publisher: NUI Galway
dc.rights: Attribution-NonCommercial-NoDerivs 3.0 Ireland
dc.subject: Natural Language Processing (en_IE)
dc.subject: Knowledge Extraction (en_IE)
dc.subject: Controlled Natural Language (en_IE)
dc.subject: Semantic Web (en_IE)
dc.subject: Engineering and Informatics (en_IE)
dc.title: Knowledge extraction from simplified natural language text (en_IE)
dc.contributor.funder: Science Foundation Ireland (en_IE)
dcterms.project: info:eu-repo/grantAgreement/SFI/SFI Research Centres/12/RC/2289/IE/INSIGHT - Irelands Big Data and Analytics Research Centre/ (en_IE)

Except where otherwise noted, this item's license is described as Attribution-NonCommercial-NoDerivs 3.0 Ireland