|dc.description.abstract||The automatic inference of video semantics is an important but highly challenging problem whose solution can greatly contribute towards the annotation, retrieval, personalisation and reusability of video on the web. From a semantic annotation and retrieval perspective, this thesis investigates the influence of multiple video contexts for inferring video semantics, specifically aiming to improve video tagging and content description. The objective of the thesis is twofold: 1) formalising the representation of a video and its content via an ontological model and 2) inferring concepts to augment the model. First, a lightweight conceptual model of a video is proposed to describe a video object at four different structural abstractions (video, shot, frame, image region) and four different meta information categories (media, content feature, content semantics and context). Second, we investigate an ensemble of methods to infer video semantics from multiple contextual sources in order to augment the above model.
The study showed that contextual sources contribute positively to understanding the 'aboutness' of a video and that one can discover many descriptive concepts not originally described by the creator. Experimenting with different contextual sources showed that context-contributed semantic enrichment is not restricted to document-level video annotations, but can go further and be used to localise the detected entities inside the video timeline for fine-grained, time-stamped annotation. In all studies we found that a combination of cues results in more robust concept detection than the cues in isolation. We evaluated our approaches using both quantitative analysis and user-based qualitative feedback. The principal benefits of the context-based approach over a content-based approach are that it is computationally inexpensive, maximises the use of the wisdom of crowds and is easily adaptable across domains.
Finally, we built an integrated 'Annotate, Search and Browse' prototype on top of the proposed framework that supports complex structured queries, ontology-based concept querying and temporal segment querying, as well as normal keyword search.||en_US