
dc.contributor.advisor: Howley, Enda
dc.contributor.advisor: Duggan, Jim
dc.contributor.author: Mannion, Patrick
dc.date.accessioned: 2018-02-16T15:10:10Z
dc.date.available: 2018-02-16T15:10:10Z
dc.date.issued: 2017-08-17
dc.identifier.uri: http://hdl.handle.net/10379/7142
dc.description.abstract: Multi-Agent Reinforcement Learning (MARL) is a powerful Machine Learning paradigm, where multiple autonomous agents can learn to improve the performance of a system through experience. The majority of MARL implementations aim to optimise systems with respect to a single objective, despite the fact that many real-world problems are inherently multi-objective in nature. Examples of multi-objective problems where MARL may be applied include water resource management, traffic signal control, electricity generator scheduling and robot coordination tasks. Compromises between conflicting objectives may be defined using the concept of Pareto dominance. The Pareto optimal or non-dominated set consists of solutions that are mutually incomparable: no solution in the set is outperformed by another on every objective. Reward shaping has been proposed as a means to address the credit assignment problem in single-objective MARL; however, it has been shown to alter the intended goals of the domain if misused, leading to unintended behaviour. Potential-Based Reward Shaping (PBRS) and difference rewards (D) are commonly used shaping methods for MARL, both of which have been repeatedly shown to improve learning speed and the quality of joint policies learned by agents in single-objective problems. Research into multi-objective MARL is still in its infancy, and very few studies have dealt with the issue of credit assignment in this context. This thesis explores the possibility of using reward shaping to improve agent coordination in multi-objective MARL domains. The implications of using either D or PBRS are evaluated from a theoretical perspective, and the results of several empirical studies support the conclusion that these shaping techniques do not alter the true Pareto optimal solutions in multi-objective MARL domains. Therefore, the benefits of reward shaping can now be leveraged in a broader range of application domains, without the risk of altering the agents' intended goals. (en_IE)
dc.rights: Attribution-NonCommercial-NoDerivs 3.0 Ireland
dc.rights.uri: https://creativecommons.org/licenses/by-nc-nd/3.0/ie/
dc.subject: Multi-Agent Systems (en_IE)
dc.subject: Reinforcement Learning (en_IE)
dc.subject: Multi-Objective Optimisation (en_IE)
dc.subject: Credit Assignment (en_IE)
dc.subject: Reward Shaping (en_IE)
dc.subject: Engineering and Informatics (en_IE)
dc.subject: Information technology (en_IE)
dc.title: Knowledge-based multi-objective multi-agent reinforcement learning (en_IE)
dc.type: Thesis (en_IE)
dc.contributor.funder: Irish Research Council (en_IE)
dc.local.final: Yes (en_IE)
nui.item.downloads: 5759
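
The abstract refers to Pareto dominance and Potential-Based Reward Shaping (PBRS). As a minimal illustrative sketch (not drawn from the thesis itself), the Python snippet below shows the standard strict Pareto-dominance test between reward vectors, a non-dominated-set filter, and the usual PBRS term F(s, s') = gamma * Phi(s') - Phi(s) added to an environment reward. The example reward vectors, potential values and discount factor are assumptions chosen purely for illustration.

    # Sketch of two concepts from the abstract: Pareto dominance between
    # multi-objective reward vectors, and potential-based reward shaping (PBRS).
    # All numeric values below are illustrative only.

    def dominates(a, b):
        """True if reward vector a Pareto-dominates b: at least as good on
        every objective and strictly better on at least one."""
        return all(x >= y for x, y in zip(a, b)) and any(x > y for x, y in zip(a, b))

    def pareto_front(vectors):
        """Keep only the non-dominated (Pareto optimal) reward vectors."""
        return [v for v in vectors if not any(dominates(u, v) for u in vectors)]

    def shaped_reward(reward, phi_s, phi_s_next, gamma=0.9):
        """PBRS: add F(s, s') = gamma * Phi(s') - Phi(s) to the environment
        reward received on the transition s -> s'."""
        return reward + gamma * phi_s_next - phi_s

    if __name__ == "__main__":
        # Illustrative two-objective reward vectors; (1, 1) is dominated by (2, 2).
        candidates = [(3, 1), (2, 2), (1, 3), (1, 1)]
        print(pareto_front(candidates))                   # -> [(3, 1), (2, 2), (1, 3)]
        # Illustrative potential values for a single transition.
        print(shaped_reward(reward=1.0, phi_s=0.5, phi_s_next=1.0))  # -> 1.4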

