Thursday, July 14, 2016

Archiving Tweets: Thank you, Martin Hawksey, for TAGS

When I was looking for a free way to collect tweets, I stumbled upon Martin Hawksey’s TAGS, the Twitter Archiving Google Sheet. Using his Google Sheet template in Google Drive, you can pull tweets from Twitter. Through some easy steps, you “program” the sheet to collect messages from Twitter by hashtags, handles (@), and search operators such as AND, OR, and quotation marks.
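To make the search-term idea concrete, here is a minimal Python sketch of how hashtags, handles, and quoted phrases might be combined into a single search string before pasting it into the sheet. The function name `build_query` and its parameters are hypothetical and not part of TAGS. It assumes the usual Twitter search conventions: a space between terms means AND, the word OR means either term, and quotation marks request an exact phrase.

```python
def build_query(hashtags=(), handles=(), phrases=(), match_all=False):
    """Combine search terms into one query string.

    Terms are joined with OR by default. With match_all=True they are
    joined with spaces, which Twitter search treats as AND.
    """
    terms = []
    terms += ["#" + tag.lstrip("#") for tag in hashtags]
    terms += ["@" + handle.lstrip("@") for handle in handles]
    # Quotation marks request an exact-phrase match.
    terms += ['"' + phrase.strip('"') + '"' for phrase in phrases]
    joiner = " " if match_all else " OR "
    return joiner.join(terms)


print(build_query(hashtags=["StandWithWendy"], phrases=["Wendy Davis"]))
```

The printed string is what you would type as the search term. Nothing in the sketch talks to Twitter or Google.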

A few years ago, I started with Version 5.0, and it took me a few tries to get the set-up right. Two new versions are available, 6.0 and 6.1, and they were easy to implement. The steps for set-up are well described and you can find videos to help. A support forum is another resource if you run into hiccups on the set-up. You can create as many files as you want to capture tweets and authentication is now a one-time requirement.
Screenshot of a TAGS spreadsheet

Using TAGS, you are essentially archiving tweets to a Google spreadsheet. You can change the frequency of the data pull so it updates automatically, or you can handle it manually with the Run Now! option. I typically opt for the automated pull. The first tab of the spreadsheet has the “read me” and instructions for how to conduct your pull. It’s easy to follow and has links to additional resources if you want extra help. The other tabs in the sheet include the list of tweets, a summary, and a dashboard. The list of tweets, called the archive, alone contains a lot of information: the name of the person tweeting, whether the tweet is in reply to someone else, links to profiles, and user language.

Screenshot of dashboard

The summary lists the top tweeters, with stats on how many tweets were retweets, how many were unique, and the average and median number of tweets per person in the sample. The dashboard provides graphics for the top tweeters, a line graph showing the number of tweets over time, and Twitter activity for the collection.
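If you ever want to reproduce these summary numbers outside the sheet, the sketch below shows one way to do it in Python. It is not how TAGS calculates them. The column names `from_user` and `text` are assumptions about the archive tab, the rule that a retweet starts with "RT @" is an assumption, and the sample rows are made up for illustration.

```python
from collections import Counter
from statistics import mean, median


def summarize(rows, top_n=3):
    """Return top tweeters plus retweet and per-person statistics.

    `rows` is a list of dicts with the assumed keys "from_user" and "text".
    """
    per_user = Counter(row["from_user"] for row in rows)
    # Assumption: a retweet is any tweet whose text starts with "RT @".
    retweets = sum(1 for row in rows if row["text"].startswith("RT @"))
    counts = list(per_user.values())
    return {
        "top_tweeters": per_user.most_common(top_n),
        "retweets": retweets,
        "non_retweets": len(rows) - retweets,
        "mean_per_person": mean(counts),
        "median_per_person": median(counts),
    }


# Made-up sample rows, for illustration only.
sample = [
    {"from_user": "alice", "text": "Watching the filibuster #StandWithWendy"},
    {"from_user": "alice", "text": "RT @bob: Still standing #StandWithWendy"},
    {"from_user": "bob", "text": "Still standing #StandWithWendy"},
    {"from_user": "alice", "text": "Hour ten #StandWithWendy"},
]
summary = summarize(sample)
print(summary)
```

Checking a handful of rows by hand like this is a quick way to confirm you understand what each number on the summary tab means.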

Screenshot of charts/graphs

The data collected are rawer than some other systems, but that’s what I like. The data are on a spreadsheet that can be loaded into statistical programs or other more powerful analytical tools. While it might be nice to learn an advanced coding language like Python to create your own capture, it’s not on my front burner. And copying and pasting out of Twitter is too manual. Other systems provide fancy graphics, but if you just need an archive or a spreadsheet, TAGS works. Also, the spreadsheet is a format that could be used for qualitative and quantitative analysis.
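For anyone curious what "loading the spreadsheet into another tool" can look like, here is a small Python sketch that reads a CSV export of the archive and counts hashtags. It assumes you have downloaded the archive tab as CSV and that the tweet text sits in a column named `text`. Both the column name and the two-row sample are assumptions for illustration, not real data.

```python
import csv
import io
import re
from collections import Counter

HASHTAG = re.compile(r"#(\w+)")


def hashtag_counts(csv_text, text_column="text"):
    """Count hashtags (case-insensitively) in a CSV export of the archive."""
    reader = csv.DictReader(io.StringIO(csv_text))
    counts = Counter()
    for row in reader:
        counts.update(tag.lower() for tag in HASHTAG.findall(row[text_column]))
    return counts


# Made-up two-row export, for illustration only.
export = (
    "from_user,text\n"
    "alice,Watching the filibuster #StandWithWendy #txlege\n"
    "bob,Still standing #standwithwendy\n"
)
print(hashtag_counts(export).most_common(2))
```

With a real export, you would read the downloaded file's contents and pass them to `hashtag_counts` in place of the sample string.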

The upsides include using a system you may already have credentials for, Google Drive. When working with colleagues, you can easily share the data, and they don’t have to create yet another account on a website or app. If your sheet has an error or reaches its limits, you’ll get reminders that it needs work.

The dataset limits were one of the downsides originally, but I was able to grab 17,000 tweets. Another negative is the rigor of the pull: it won’t download every tweet. According to Hawksey’s blog, it’s a function of the Twitter Search API, where relevance, not completeness, rules. You need a little forethought if you want to use this tool effectively. It will only pull historical tweets from the past seven days. If you wanted to watch a hashtag from an event, you’d need to set up the Google Sheet prior to the event. The dashboard only presents a few graphics, and they are basic. For a client or board presentation, you would want to jazz them up a bit. To run other kinds of analyses, like those measuring sentiment, location, network density and links, you would need to import the sheet into another program. This may be an extra unwanted step for some users.

Several audiences could benefit from the quick data pulled using TAGS. Marketers and public relations practitioners looking to archive messages about a particular campaign or event could set up a sheet prior to a launch. Organizers could capture and archive tweets during an annual conference. Academic researchers willing to accept the limitations of the tweets pulled could archive data for future investigations. Teachers could archive tweets from a classroom live-tweet to look for evidence of student learning. Because it is free, TAGS is also appealing for nonprofit professionals.

As an example, I have 17,000 #StandWithWendy tweets from Wendy Davis’ 2013 filibuster saved for a research project. Originally, the sheet would take a long time to load. Now, it moves much faster. I want the ability to see my data and then clean it, based on my project. TAGS is one way for me to do so, and share with my research partners at the same time.

The newest versions, 6.0 and 6.1, offer a one-time set-up, a support community, and the ability to capture your last 3,000 favorited tweets. If you have used the previous versions, go with 6.0, which the TAGS team recommends. It worked for me without any problems.

To learn more about or to get started with TAGS, go to

(This post is not a sponsored message. It originated as a sample for my students on reviewing a measurement tool for our course on social media analytics.)