dc.contributor.advisor: Hayes, Conor
dc.contributor.author: Hromic, Hugo
dc.date.accessioned: 2019-05-07T10:06:27Z
dc.date.available: 2019-05-07T10:06:27Z
dc.date.issued: 2019-05-02
dc.identifier.uri: http://hdl.handle.net/10379/15146
dc.description.abstract: Microblogging social media focuses on fast, open, real-time communication using short messages between users and their followers. Twitter is currently one of the largest and most widely known microblogging online social networks (OSNs) in the world, with more than 330 million monthly active users as of December 2017. Moreover, an average of 500 million Tweets (short messages) per day are generated within the service. Microblogging social media generate large amounts of content, and community finding techniques are a suitable alternative for organising it. However, a fundamental challenge in the community detection literature is the diversity of definitions of a user community, which makes evaluating and interpreting algorithms difficult. Therefore, in this thesis, two types of user community definition are adopted and investigated for microblogging: functional and structural definitions. A functional community groups its users by a common independent social function, e.g. fans of the same football team, while membership of a structural community depends exclusively on connectivity in a network, e.g. as measured by modularity. In this work, functional definitions are built and characterised to be used as user-labelled ground-truth using eight types of social functions from Twitter interaction networks. Afterwards, these ground-truth functional communities are evaluated in static and dynamic scenarios using thirteen popular structural community definitions from the literature. The goodness, robustness and sensitivity of these structural community definitions for detecting the functional ground-truth under different perturbation strategies are investigated. The proposed evaluation is carried out using five different Twitter datasets captured during diverse periods of time. The results of the study show that definitions based on internal and mixed connectivity, e.g. Triangle Participation Ratio, Fraction Over Median Degree or Conductance, work best for the Twitter use case and are very robust. On the other hand, other scores such as Modularity are limited and do not perform well due to the sparsity and noise of microblogging. Furthermore, using user activity as a basis to refine communities into their active hotspots further improves the performance of community detection in microblogging. It is demonstrated in this work that standard community detection algorithms are challenged by the fast-paced dynamics and link sparsity of microblogging data. Therefore, it is argued that temporal characteristics must be considered for community detection methods in microblogging. [en_IE]
dc.publisher: NUI Galway
dc.rights: Attribution-NonCommercial-NoDerivs 3.0 Ireland
dc.rights.uri: https://creativecommons.org/licenses/by-nc-nd/3.0/ie/
dc.subject: community detection [en_IE]
dc.subject: empirical evaluation [en_IE]
dc.subject: microblogging [en_IE]
dc.subject: twitter [en_IE]
dc.subject: ground-truth [en_IE]
dc.subject: graph mining [en_IE]
dc.subject: temporal dynamics [en_IE]
dc.subject: methods [en_IE]
dc.subject: Engineering and Informatics [en_IE]
dc.subject: Computer science [en_IE]
dc.subject: Data analytics [en_IE]
dc.title: Methods for defining dynamic online communities and community detection in fast-paced social media streams [en_IE]
dc.type: Thesis [en]
dc.contributor.funder: Science Foundation Ireland [en_IE]
dc.contributor.funder: Seventh Framework Programme [en_IE]
dc.local.note: This work is a study of the formation and evolution of online user communities in microblogging social media, represented by the widely used Twitter service. In this study, we propose the construction of ground-truth user communities based on social functions, which are later evaluated using a number of state-of-the-art community detection approaches based on network structure. The proposed evaluation considers both static (without temporal information) and dynamic scenarios for the development of these user communities. Our results show that certain community detection approaches work better than others in the case of microblogging social media, and that their performance can be further improved by taking the temporal dynamics into consideration. Consequently, a set of recommendations for applying community detection in microblogging is proposed, along with a set of practical applications of community detection on Twitter. [en_IE]
dc.local.final: Yes [en_IE]
dcterms.project: info:eu-repo/grantAgreement/SFI/SFI Research Centres/12/RC/2289/IE/INSIGHT - Irelands Big Data and Analytics Research Centre/ [en_IE]
dcterms.project: info:eu-repo/grantAgreement/SFI/SFI Strategic Research Cluster/08/SRC/I1407/IE/SRC Clique: Graph & Network Analysis Cluster/ [en_IE]
dcterms.project: info:eu-repo/grantAgreement/EC/FP7::SP1::ICT/257859/EU/Risk and Opportunity management of huge-scale BUSiness communiTy cooperation./ROBUST [en_IE]
nui.item.downloads: 3868


Except where otherwise noted, this item's license is described as Attribution-NonCommercial-NoDerivs 3.0 Ireland