The use of ensemble techniques in multiclass speech emotion recognition to improve both accuracy and confidence in classifications

Murphy, Alan

dc.contributor.advisor	Redfern, Sam
dc.contributor.author	Murphy, Alan
dc.date.accessioned	2016-01-26T17:09:19Z
dc.date.available	2016-01-26T17:09:19Z
dc.date.issued	2015-09-30
dc.identifier.uri	http://hdl.handle.net/10379/5499
dc.description.abstract	Creating machines with the ability to reason, perceive, learn and make decisions based on a human-like intelligence has been an interest of artificial intelligence researchers for decades, with the long-term goal of developing a general intelligence capable of solving problems just as humans do. Affective computing is the area of these studies which focusses on the design and development of intelligent devices which can perceive, process and synthesize human emotion. Humans can interpret emotion in a number of different ways, for example by processing spoken utterances, non-verbal cues, facial expressions and written communication. Changes in our nervous system indirectly alter spoken utterances, which makes it possible for people to perceive how others feel by listening to them speak. These changes can also be interpreted by machines through the extraction of speech features. The field of Speech Emotion Recognition (SER) takes advantage of this capability and has offered many approaches to recognizing affect in spoken utterances. Our research focusses on this problem and offers a contribution to state-of-the-art systems which can not only increase the accuracy of predictions but also improve their reliability, or confidence. The majority of state-of-the-art SER systems employ complex statistical algorithms to model the relationship between acoustic parameters extracted from spoken language. This model can then be used to classify new instances of emotionally expressive speech. There are other SER systems which use the content of spoken utterances, i.e. what is being said, along with acoustic parameters to make a more informed prediction. Our work highlights how state-of-the-art SER systems do not employ state-of-the-art text analysis techniques and are therefore limiting their predictive ability.
This thesis therefore presents a classification system which exploits best practices from both the acoustic and text processing domains to create an SER system which exhibits more accurate and confident predictions than state-of-the-art systems to date.	en_IE
dc.rights	Attribution-NonCommercial-NoDerivs 3.0 Ireland
dc.rights.uri	https://creativecommons.org/licenses/by-nc-nd/3.0/ie/
dc.subject	Speech emotion recognition	en_IE
dc.subject	Informatics	en_IE
dc.subject	Computer science	en_IE
dc.subject	Department of Information Technology	en_IE
dc.title	The use of ensemble techniques in multiclass speech emotion recognition to improve both accuracy and confidence in classifications	en_IE
dc.type	Thesis	en_IE
dc.contributor.funder	College of Engineering	en_IE
dc.local.note	This thesis presents an emotion recognition system to classify Ekman’s six basic emotions. It combines acoustic parameter processing, text analysis through keyword spotting, weighting, the application of heuristic rules and statistical text processing. This is the first system to study classification confidence in multiclass emotion recognition.	en_IE
dc.local.final	Yes	en_IE
nui.item.downloads	2930

Files in this item

Name: license.txt
Size: 5.659Kb
Format: Text file

Name: PhD Thesis.pdf
Size: 8.343Mb
Format: PDF

This item appears in the following Collection(s)

University of Galway Theses (PhD Theses)

Except where otherwise noted, this item's license is described as Attribution-NonCommercial-NoDerivs 3.0 Ireland