First look at analysing threaded Twitter discussions from large archives using NodeXL #moocmooc

This post is a bit messy. I got caught trying out too many ideas at once, but hopefully you’ll still find it useful.

Sheila recently posted Analytics and #moocmooc, in which she collects some thoughts on the role of analytics in courses and how some of the templates I’ve developed can give you an overview of what is going on. As I commented on that post, I still think there is more work to do to make archives from event hashtags more useful, even if it is just surfacing the tweets that got the most ‘reaction’.
There are three main reactions that are relatively easy to extract from Twitter: retweets, favourites and replies. There are issues with what these actions actually indicate, as well as with the reliability of the data. For example, users will use favourites in different ways, and not everyone uses a Twitter client that can send, or chooses to send, a proper reply tweet (if you start a message with an @username without clicking a reply button, Twitter loses the thread).
But let’s ignore these issues for now and start with the hypothesis that a reaction to a tweet is worth further study. Let’s also, for now, narrow down on threaded discussions. How might we do this? As mentioned in Sheila’s post, we’ve been archiving #moocmooc tweets using the Twitter Archiving Google Spreadsheet, TAGS v3. As well as the tweet text, other metadata is recorded, including a unique identifier for each tweet and, where available, the id of the tweet it is replying to.

Google Spreadsheet columns

We could just filter the spreadsheet for rows with reply ids, but let’s take a visual approach. Downloading the data as an Excel file, we can open it using the free add-in NodeXL.
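If you did want to go down the plain filtering route, here’s a rough sketch in Python of pulling out the rows that carry a reply id. The sample data is made up, and the column names (`id_str`, `from_user`, `text`, `in_reply_to_status_id_str`) are my assumption about what a TAGS export looks like, so check them against your own sheet. Ids are kept as strings because tweet ids are large enough to get mangled if a tool treats them as numbers.

```python
import csv
import io

# Hypothetical sample in the shape of a TAGS export. The column names
# are assumptions; adjust them to match your own spreadsheet.
SAMPLE_CSV = """id_str,from_user,text,in_reply_to_status_id_str
101,alice,What is a MOOC for? #moocmooc,
102,bob,@alice Good question #moocmooc,101
103,carol,Reading list is up #moocmooc,
104,alice,@bob Any thoughts? #moocmooc,102
"""


def reply_edges(rows, id_col="id_str", reply_col="in_reply_to_status_id_str"):
    """Return (tweet_id, replied_to_id) pairs for rows that have a reply id."""
    edges = []
    for row in rows:
        replied_to = (row.get(reply_col) or "").strip()
        if replied_to:  # an empty cell means the tweet is not a threaded reply
            edges.append((row[id_col].strip(), replied_to))
    return edges


rows = list(csv.DictReader(io.StringIO(SAMPLE_CSV)))
edges = reply_edges(rows)
print(edges)
```

Each pair is effectively one edge in the conversation graph, which is the same information NodeXL needs to draw the threads.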

NodeXL allows us to graph connections, in this case conversation threads. It also lets us do other useful things, like grouping conversations together to make further analysis easier. Skipping over the detail, here’s what you get if you condense 6,500 #moocmooc tweets into grouped conversations.

moocmooc grouped conversations

This is more than just a pretty picture. In NodeXL I’ve configured it so that when I hover over a dot, each of which represents an individual tweet, I get a summary of what was said, by whom and when (shown below).

NodeXL being used to examine nodes

It’s probably not too surprising to see strings of conversations, but by graphing what was an archive of over 6,500 tweets we can start focusing on what might be interesting subsets and conversation shapes. There are some interesting patterns that emerge:

conversation group 1 conversation group 2 conversation group 3

Within NodeXL I can extract these for further analysis. So the middle image can be viewed as:

Examination of conversation group 2
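NodeXL does the grouping for you, but if you’re curious about what that step amounts to, here’s a minimal sketch in Python. It is not NodeXL’s own algorithm, just a simple union-find that puts every tweet connected by a chain of replies into the same conversation. The ids are invented, and the input is assumed to be a list of (tweet id, replied-to id) pairs built from the archive.

```python
from collections import defaultdict


def group_conversations(edges):
    """Group tweet ids into conversations: sets of ids linked by replies."""
    parent = {}

    def find(node):
        # Follow parent links to the representative id of the group.
        parent.setdefault(node, node)
        while parent[node] != node:
            parent[node] = parent[parent[node]]  # shorten the path as we go
            node = parent[node]
        return node

    for tweet_id, replied_to in edges:
        root_a, root_b = find(tweet_id), find(replied_to)
        if root_a != root_b:
            parent[root_a] = root_b  # merge the two groups

    groups = defaultdict(set)
    for node in list(parent):
        groups[find(node)].add(node)
    # Largest conversations first; ties broken by smallest id for stable output.
    return sorted(groups.values(), key=lambda g: (-len(g), min(g)))


# Made-up (tweet_id, replied_to_id) pairs. "999" stands for a replied-to
# tweet that is not itself in the archive, e.g. one without the hashtag.
edges = [("102", "101"), ("104", "102"), ("202", "201"), ("305", "999")]
threads = group_conversations(edges)
for thread in threads:
    print(sorted(thread))
```

Sorting by size is a cheap way to surface the longer exchanges first, which are the ones that stand out in the graph.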

There’s a lot more you can do with this type of data: you can start looking at how many people are involved in conversations, the number of questions per conversation and lots more. I should also say, before I forget, that NodeXL can be configured to collect Twitter search results with its built-in Twitter search tool. It can also be configured to do the collection on a regular basis (hmm, I should really have a go at doing that myself). So potentially you’ve got a nice little tool to analyse Twitter conversations in real time …
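As a taste of the ‘lots more’, here’s a rough Python sketch that counts participants and questions per conversation. Everything in it is illustrative: the tweets are invented, the field names `from_user` and `text` are assumptions about the archive columns, and counting tweets that contain a ‘?’ is only a crude stand-in for spotting real questions.

```python
def conversation_stats(tweets, threads):
    """Summarise each conversation: tweets, distinct authors, question count."""
    stats = []
    for thread in threads:
        # Skip ids we have no record for (replied-to tweets outside the archive).
        known = [tweets[tweet_id] for tweet_id in thread if tweet_id in tweets]
        stats.append({
            "tweets": len(known),
            "participants": len({t["from_user"] for t in known}),
            # Crude proxy: any tweet containing "?" counts as a question.
            "questions": sum("?" in t["text"] for t in known),
        })
    return stats


# Invented sample data keyed by tweet id.
tweets = {
    "101": {"from_user": "alice", "text": "What is a MOOC for? #moocmooc"},
    "102": {"from_user": "bob", "text": "@alice Good question #moocmooc"},
    "104": {"from_user": "alice", "text": "@bob Any thoughts? #moocmooc"},
}
threads = [{"101", "102", "104"}]

stats = conversation_stats(tweets, threads)
print(stats)
```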
If you’d like to explore the data more it’s available from the NodeXLGraphGallery. I’m going off to play some more 😉

