Twitter has recently frustrated a number of developers and mashup artists by moving to tighter restrictions in its latest API. Top of the list for many are that all Twitter Search API requests need to be authenticated (you can’t just grab and run; a request has to be made via a Twitter account), the removal of XML/Atom feeds, and reduced rate limits. There are some gains which don’t appear to have been widely written about, so I’ll share them here.
#1 Get the last 18,000 tweets instead of 1,500
Reading over the release notes and discussion for the latest version of NodeXL, I spotted that:
you now specify how many tweets you want to get from Twitter, up to a maximum of 18,000 tweets
Previously, in the old API, the hard limits were 1,500 tweets from the last 7 days. This meant that if you requested a very popular search term you’d only get the last 1,500 tweets, making any tweets made earlier in the day inaccessible. In the new API there is still the ‘last 7 days’ limit, but you can page back a lot further. Because the API limits you to 100 tweets per call and 180 calls per hour, you could potentially get 18,000 tweets (100 × 180) in one hit. If you cache the maximum tweet id and wait an hour for the rate limit to refresh, you could theoretically get even more (I’ve removed the 1.5k limit in TAGSv5.0, but haven’t fully tested how much of the 18k you can get before being hit by script timeouts).
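As a rough sketch of the paging idea (not the TAGS or NodeXL implementation), the loop below keeps asking for pages of up to 100 tweets and moves a `max_id` cursor back each time, stopping after 180 calls or when nothing more comes back. The `fetch_page` callable is hypothetical: it stands in for whatever authenticated search request you make. `make_fake_fetch` is just a stub so the example runs without a Twitter account. The limits are the figures quoted above, passed in as parameters. I'm assuming `max_id` is inclusive, which is why the cursor is set to the lowest id seen minus one.

```python
from typing import Callable, Dict, List, Optional

Tweet = Dict[str, int]
# Hypothetical signature: (query, count, max_id) -> tweets, newest first.
FetchPage = Callable[[str, int, Optional[int]], List[Tweet]]


def collect_tweets(fetch_page: FetchPage, query: str,
                   per_call: int = 100, max_calls: int = 180) -> List[Tweet]:
    """Page backwards through search results until the call budget runs out."""
    tweets: List[Tweet] = []
    max_id: Optional[int] = None  # None means "start from the newest tweet"
    for _ in range(max_calls):
        page = fetch_page(query, per_call, max_id)
        if not page:
            break  # nothing older is available
        tweets.extend(page)
        # Assumes max_id is inclusive, so step just below the oldest tweet seen.
        max_id = min(t["id"] for t in page) - 1
    return tweets


def make_fake_fetch(total: int) -> FetchPage:
    """Stub for a real authenticated request: pretends tweets have ids 1..total."""
    def fake_fetch(query: str, count: int, max_id: Optional[int]) -> List[Tweet]:
        newest = total if max_id is None else min(max_id, total)
        oldest = max(newest - count + 1, 1)
        return [{"id": i} for i in range(newest, oldest - 1, -1)]
    return fake_fetch


popular = collect_tweets(make_fake_fetch(25000), "#hashtag")
quiet = collect_tweets(make_fake_fetch(250), "#hashtag")
print(len(popular), len(quiet))
```

With the stub, a popular term is capped at 100 × 180 tweets, while a quiet one simply runs out after three calls. Swapping in a real request function would also need handling for errors and script timeouts, which this sketch ignores.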
#2 Increased metadata with a tweet
Below is an illustration of the data returned in a single search result comparing the old and new search API.
If you compare the old data and the new data, the main addition is a lot more profile data. A lot of this isn’t of huge interest (unless you wanted to do an analysis of profile colours), but there is some useful stuff. For example, in this case I have profile information for both the original tweeter and the retweeter, as well as friend/follower counts, location and more (I’ve already shown how you can combine this data with Google Analytics for comparative analysis).
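To give a feel for what can be pulled out of the extra profile data, here is a minimal sketch that summarises the accounts involved in a single search result. It assumes the result has already been parsed into a Python dict, and that the field names (`user`, `retweeted_status`, `followers_count`, `friends_count`, `location`) match what your own responses contain, so check them against real data. The sample tweet and the `summarise_accounts` helper are made up for illustration.

```python
from typing import Any, Dict, Optional


def profile(user: Dict[str, Any]) -> Dict[str, Any]:
    """Keep only the profile fields that are useful for analysis."""
    return {
        "screen_name": user.get("screen_name"),
        "followers": user.get("followers_count"),
        "friends": user.get("friends_count"),
        "location": user.get("location"),
    }


def summarise_accounts(tweet: Dict[str, Any]) -> Dict[str, Optional[Dict[str, Any]]]:
    """Return profiles for the original tweeter and, if present, the retweeter."""
    original = tweet.get("retweeted_status")
    if original:
        # The outer tweet belongs to the retweeter, the nested one to the author.
        return {"original": profile(original["user"]),
                "retweeter": profile(tweet["user"])}
    return {"original": profile(tweet["user"]), "retweeter": None}


# Illustrative data only, not a real search result.
sample = {
    "id": 2,
    "user": {"screen_name": "retweeter", "followers_count": 120,
             "friends_count": 80, "location": "Edinburgh"},
    "retweeted_status": {
        "id": 1,
        "user": {"screen_name": "author", "followers_count": 4500,
                 "friends_count": 300, "location": "London"},
    },
}

summary = summarise_accounts(sample)
print(summary["original"]["screen_name"], summary["retweeter"]["followers"])
```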
Whilst I’m sure this won’t appease the hardcore Twitter devs/3rd parties, for hackademics like myself grabbing extra tweets and richer data has its benefits.
Join the conversation
I’ve taken to using the streaming API to collect tweets.
The search API won’t necessarily give you every tweet on a topic, but streaming will – as long as you can store them fast enough.
The only gotcha is that you have to decide in advance what/who/where you want to track and you’ll catch no tweets when your code isn’t running.