
dc.contributor.advisor: Curry, Edward
dc.contributor.advisor: Breslin, John
dc.contributor.author: Khan, Muhammad Jaleed
dc.date.accessioned: 2024-02-26T11:35:00Z
dc.date.issued: 2024-02-26
dc.identifier.uri: http://hdl.handle.net/10379/18065
dc.description.abstract: Visual reasoning is a critical component of artificial intelligence that aims to understand, interpret, and reason about complex visual content. It has an interdisciplinary nature incorporating visual feature extraction and image generation from computer vision, linguistic feature extraction and language generation from natural language processing, and graph-based representation and semantic enrichment from knowledge representation and reasoning. Data-centric visual reasoning techniques often face limitations in intuitively interpreting visual content due to the limited expressiveness and generalisability of scene representations. We propose a knowledge-enhanced neurosymbolic visual reasoning framework based on scene graph enrichment. This framework employs deep learning techniques for object detection and relationship prediction in visual content to generate scene graph representations, which are then refined and semantically enriched using common sense knowledge extracted from a heterogeneous knowledge graph. The enriched scene graphs are used in downstream visual reasoning tasks, including image captioning, visual question answering and image generation. A comprehensive experimental analysis on the standard datasets and evaluation benchmarks demonstrates considerable improvement over existing state-of-the-art methods in terms of relationship recall rate, image captioning quality, question answering accuracy and image generation realism. The encouraging results validate the effectiveness of leveraging heterogeneous common sense knowledge for enhanced scene understanding and visual reasoning. [en_IE]
dc.publisher: NUI Galway
dc.rights: Attribution-NonCommercial-NoDerivs 3.0 Ireland
dc.rights: CC BY-NC-ND 3.0 IE
dc.rights.uri: https://creativecommons.org/licenses/by-nc-nd/3.0/ie/
dc.subject: Science and Engineering [en_IE]
dc.subject: Computer Science [en_IE]
dc.subject: Data Science [en_IE]
dc.subject: Artificial Intelligence [en_IE]
dc.subject: scene representation [en_IE]
dc.subject: scene graph [en_IE]
dc.subject: semantic enrichment [en_IE]
dc.subject: common sense knowledge [en_IE]
dc.subject: visual reasoning [en_IE]
dc.subject: visual question answering [en_IE]
dc.subject: image captioning [en_IE]
dc.subject: neurosymbolic integration [en_IE]
dc.subject: deep learning [en_IE]
dc.subject: knowledge graphs [en_IE]
dc.title: Neurosymbolic visual reasoning with scene graph enrichment [en_IE]
dc.type: Thesis [en]
dc.description.embargo: 2025-02-28
dc.local.final: Yes [en_IE]
nui.item.downloads: 0


Except where otherwise noted, this item's license is described as Attribution-NonCommercial-NoDerivs 3.0 Ireland