Utilization of social breadcrumbs for user profiling in personalization
Personalization efforts aim to alleviate the "information overload" problem by helping users address their information needs in the best way possible. An increasing number of systems that employ personalization have appeared in the recent past, and even well-known commercial giants are directing their efforts towards enhanced personalization within their services (e.g. Amazon product recommendations, Netflix movie recommendations, Google Now). A fundamental building block of any personalization attempt is the user model that powers it. User modelling has remained a central theme within the broad research area of personalization, yet most traditional sources for user modelling are controversial on account of the loss of privacy associated with them. With the advent of the Social Web, a paradigm shift has occurred in the way content is generated on the Web, turning it into an online gathering point for the masses. Users now leave traces of their online experiences on various Social Web platforms; these traces are referred to as "social breadcrumbs" in the context of this thesis. Recent research efforts have begun to explore the possibility of utilizing Social Web data for the creation of personalization-centric user models; most of these approaches attempt to make use of bookmarks and social tags for user modelling. These sources, however, are less effective because few users make use of bookmarking and social annotation tools, which renders them infeasible for large-scale use in personalized applications. Given the limitations of current user modelling efforts, we explored social network usage patterns and personalization-related privacy concerns in an attempt to derive aspects of Social Web data that can lead towards effective user profiles. The analyzed correlations led us to propose a Twitter-based user model which takes into account not only the language usage patterns of the user under consideration but also those of the users in his/her network.
More specifically, a framework based on statistical language models is proposed. This framework enables us to model the probability distribution of words within the language that a user employs on Twitter, in addition to the probability distribution of words within the language of those users whom he/she considers trustworthy (on Twitter). The expressive nature of the user model is demonstrated via the incorporation of two similarity measures: common users within a network are utilized for the network-based similarity measure, and common topical interests are utilized for the topical similarity measure. To the best of our knowledge, this work constitutes one of the first attempts to take into account social network usage information for the generation of user profiles. The proposed model was extensively explored in the context of two application scenarios, namely Web search personalization and scientific article recommendation, both of which are fundamentally challenging. For application to Web search personalization, we take into account the various Twitter behaviors a user engages in. Adjusting the parameters on the basis of Twitter behavior-based heuristics yields an effective solution to personalized Web search, which was verified via extensive offline and online experimental evaluations. Similarly, for application to scientific article recommendation, the model was adjusted by taking into account only the network of followed users and by replacing the similarity measures with a topic modelling-based filtering measure that helps identify topics relevant to a user's research interests. The recommendation framework outperforms a standard baseline and produces rich recommendations of scientific articles for the user.