ThemeCrowds: Multiresolution Summaries of Twitter Usage

Publication Type  Report
Year of Publication  2011
Authors  Daniel Archambault; Derek Greene; John Hannon; Pádraig Cunningham; Neil Hurley
Abstract  

Users of social media sites, such as Twitter, rapidly generate large volumes of text content on a daily basis. Visual summaries are needed to understand what groups of people are saying collectively in this unstructured text data. Users will typically discuss a wide variety of topics, where the number of authors talking about a specific topic can quickly grow or diminish over time, and what the collective is saying about the subject can shift as a situation develops. In this paper, we present a technique that summarises what collections of Twitter users are saying about certain topics over time. As the correct resolution for inspecting the data is unknown in advance, the users are clustered hierarchically over a fixed time interval based on the similarity of their posts. The visualisation technique takes this data structure as its input. Given a topic, it finds the correct resolution of users at each time interval and provides tags to summarise what the collective is discussing. The technique is tested on three microblogging corpora, consisting of up to tens of millions of tweets and over a million users. We provide some preliminary user feedback from a research group interested in the area of social media analysis, where this tool could be applied.
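The pipeline outlined in the abstract (cluster users hierarchically by post similarity, then pick a resolution for a given topic and tag each group) can be illustrated for a single time interval with the minimal sketch below. It is not the authors' implementation. The bag-of-words cosine similarity, the greedy merge strategy, the `min_share` threshold, and every function and variable name are assumptions chosen for illustration, and the four users and their posts are made-up toy data.

```python
from collections import Counter
from math import sqrt

def cosine(a, b):
    """Cosine similarity between two term-count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    na = sqrt(sum(v * v for v in a.values()))
    nb = sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def build_hierarchy(profiles):
    """Greedy agglomerative clustering (assumed stand-in): repeatedly merge
    the two clusters whose summed term counts are most similar."""
    nodes = [{"users": [u], "terms": terms, "children": []}
             for u, terms in sorted(profiles.items())]
    while len(nodes) > 1:
        pairs = [(i, j) for i in range(len(nodes)) for j in range(i + 1, len(nodes))]
        i, j = max(pairs, key=lambda p: cosine(nodes[p[0]]["terms"], nodes[p[1]]["terms"]))
        a, b = nodes[i], nodes[j]
        merged = {"users": a["users"] + b["users"],
                  "terms": a["terms"] + b["terms"],
                  "children": [a, b]}
        nodes = [n for k, n in enumerate(nodes) if k not in (i, j)] + [merged]
    return nodes[0]

def select_crowds(node, profiles, topic, min_share=0.6):
    """Walk down from the root and keep the coarsest clusters in which at
    least min_share of the users mention the topic (hypothetical criterion)."""
    users = node["users"]
    share = sum(1 for u in users if topic in profiles[u]) / len(users)
    if share >= min_share:
        return [node]
    return [c for child in node["children"]
            for c in select_crowds(child, profiles, topic, min_share)]

def tags(node, topic, k=2):
    """Most frequent terms in the cluster, excluding the topic itself."""
    return [t for t, _ in node["terms"].most_common() if t != topic][:k]

# Toy posts for one time interval (invented for this example).
posts = {
    "alice": ["earthquake tokyo damage", "earthquake relief tokyo"],
    "bob": ["earthquake tokyo relief donate"],
    "carol": ["football match goal", "football league"],
    "dave": ["football goal transfer"],
}
profiles = {u: Counter(" ".join(p).lower().split()) for u, p in posts.items()}
root = build_hierarchy(profiles)
crowds = select_crowds(root, profiles, "earthquake")
for crowd in crowds:
    print(crowd["users"], tags(crowd, "earthquake"))
```

In this toy data only half of all users mention "earthquake", so the root is too coarse and the search descends to the cluster containing alice and bob. Repeating the selection for each time interval would give the per-interval resolutions that the abstract describes.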

Attachment  Size
ucd-csi-2011-07.pdf  3.5 MB