Confession: I’m not a coder. I haven’t coded since grad school, when I learned how to write basic HTML so I could build wikis (back then, wikis required you to write HTML code to make them work).
So when I saw Kris Schaffer’s article “I deleted 40,000 tweets last week. Here’s why (and how),” I thought, “that’s cool and I’ll never be able to do it because it requires coding…….in Python.”
But in the last year, for reasons I probably don’t have to tell you, I’ve spent a lot of time thinking about what it means to have data “out there” on the web, in the hands of platform companies (like Facebook, Twitter, Amazon, Ancestry, etc. etc. etc.), and in the databases of who-knows-who-purchased-my-data-from-platform-companies. Or in the hands of trolls, harassers, or anyone else who wishes to do me harm. Or wishes to do my family harm.
I’ve also been wondering a lot about how higher education institutions like Middlebury should be thinking about, working with, managing, and protecting student data as students engage with Middlebury. This was the impetus behind my recent writing on Digital Sanctuary.
As I’ve thought about my own digital footprint, I’ve been inspired by Kris Schaffer’s digital minimalism and Mike Caulfield’s information environmentalism (in this case, it’s more of a personal information environmentalism–improving my personal information environments). Frankly, it was past time to put action behind these reflections. It was time to start minimizing and deleting. I recognize that deleting old tweets is just a start, and it doesn’t mean that my data disappears completely. I know Twitter will keep a copy of my data that I will not be able to delete. But deleting old tweets from my public profile, and from the advanced search functions of Twitter, will at least make those tweets inaccessible to others who might try to do harm.
Twitter doesn’t make it easy for you to delete old stuff. Even services that offer to help you delete old tweets (and there are many) can only delete a limited number: 3,200, because Twitter won’t allow them to delete any more than that. Your only recourse if you want to delete tweets, say, between 2008 and 2015, is to serve up code through Twitter’s API that will tell Twitter to delete your old stuff.
Thank goodness for Kris Schaffer and his unending patience. I tried to follow his step-by-step instructions for writing code to delete old tweets through Twitter’s API but I was too much of a Python novice to know what I was doing. I needed help and he graciously walked me through the basics I needed to move into the deleting stage.
The good news: As I write this blog post, I am watching my old tweets being deleted in real time. Celebratory happy dance!
Since my starting point for this process was “absolute newbie,” I’m going to walk you through Kris’s process, but add steps that were not as obvious to me on the road to tweet-deleting success.
1) First things first: Download your Twitter archive (which has the text of all of your tweets) and unzip the folder. As Kris says, doing this allows you to save a copy of any tweets you’re planning to delete (oh, sweet memories), but this is also the file that will guide the Twitter deletions. Instructions for downloading the Twitter archive are here: https://support.twitter.com/articles/20170160?lang=en#
When you download and unzip your tweet archive, the folder will be named something long and ugly. I’d recommend renaming the folder to something more typable (you will have to type it later), but leave the tweets.csv file inside it named tweets.csv.
2) Next, you’ll create an app on Twitter that uses Twitter’s API to tell it to delete tweets. Here, I’ll copy Kris’ instructions, because they are simple and straightforward:
“Next, create a new app on apps.twitter.com. Just click “Create New App”; then provide a name for your app (“tweet deleter” is fine), description (same), and website (your website, or even the URL for your Twitter account).
Once you’ve created your account, you need to get a few authorization codes to link your Python script with your Twitter app. Open the “Keys and Access Tokens” tab, where you’ll find a Consumer Key (API Key) and a Consumer Secret (API Secret). You’ll also probably need to click on “Generate My Access Token and Token Secret” to generate the Access Token and Access Secret. Keep this tab open in your browser so you can copy these codes into your Python script, and be sure not to share these codes with others, or they’ll be able to access your account.”
3) Here’s where things get fun(ky): You’re going to need to be able to edit Python scripts and to run Python. THIS IS NOT AS SCARY AS IT SEEMS AT FIRST! You’ll need a Python editor (you can’t just use a Mac- or Windows-issued text editor– that was my first rookie mistake). You’ll need a text editor that understands Python quirks. I downloaded and am using Sublime Text and I really like it. So, download a Python editor like Sublime Text.
You’ll also need a Python environment in which to run your code. Following Kris’ recommendation, I downloaded and installed Anaconda. Once you’ve done so (and it took me an embarrassingly long time to figure this out), click Environments on the left sidebar and click the arrow next to “root.” Click Open with Python, and a new Python terminal window will open up. This is where you’ll input your code later.
4) Next: Let’s code!
We’ll be using most of Kris’ code, so I recommend that you copy and paste his code into a Sublime Text window (or whichever code editor you’re using). Go ahead and open a Python environment as well, using Anaconda (or your Python environment of choice).
You’ll start by installing Tweepy, a code framework for working with Twitter’s API. In your Python terminal, you’ll see three arrows pointing to the right, like this: >>>
Next to those arrows, type:
pip install tweepy
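and then press Enter on your keyboard. You’ll see a bunch of stuff happen on the Python terminal. Tweepy should now be installed. (If Python answers with a SyntaxError instead, pip wants to be run from a regular command-line terminal rather than at the >>> prompt, so run the same command there.)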
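If you’re not sure whether the install worked, a quick check at the >>> prompt can tell you. This is a minimal sketch using only Python’s standard library; the helper name is_installed is my own and isn’t part of Kris’ code.

```python
import importlib.util

def is_installed(package_name):
    """Return True if Python can find the named package."""
    return importlib.util.find_spec(package_name) is not None

if is_installed("tweepy"):
    print("Tweepy is installed.")
else:
    print("Tweepy is missing; try the install step again.")
```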
Now, go to your text editor (like Sublime Text) and copy/paste in the following code:
consumer_key = ''
consumer_secret = ''
access_key = ''
access_secret = ''
See where it says consumer_key, consumer_secret, access_key, and access_secret? Between the single quotation marks on each, you’ll paste in the codes you got when you created your Twitter deletion app in Step 2.
So the code in your text editor will look something more like this:
consumer_key = '394nkdr93jfks9fjajfskefofkeajfke'
consumer_secret = 'kdfajkldfjaioejfiaofanvanvnvpaei29jdna'
access_key = '9rajioajnfanvioaejie9oajf'
access_secret = '92jioanfkanfalaionn;a;'
Now, I just put gobbledygook in there, but you’ll paste in the actual codes provided on your apps.twitter.com page. Basically, you’ve told Tweepy that these are the credentials it’ll need to engage with the Twitter API for your Twitter account/tweets.
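It’s easy to miss one of the four codes when copying and pasting. Here’s a rough sketch of a check you could run first. The dictionary and the helper name missing_credentials are my own invention for illustration, and the empty strings are placeholders for your real codes.

```python
# Placeholder values: swap in the four codes from your own Twitter app page.
credentials = {
    "consumer_key": "",
    "consumer_secret": "",
    "access_key": "",
    "access_secret": "",
}

def missing_credentials(creds):
    """Return the names of any credentials that are empty or only whitespace."""
    return [name for name, value in creds.items() if not value.strip()]

missing = missing_credentials(credentials)
if missing:
    print("Still need to paste in:", ", ".join(missing))
else:
    print("All four credentials are filled in.")
```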
Still with me? Whew! Hang in there.
Now that Tweepy knows that we’re going to be coding for your Twitter account, based on the credentials you inputted, let’s tell Tweepy that it’ll be reading a .csv file to know which tweets should be marked for deletion.
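import csv
def read_csv(file):
    """
    reads a CSV file into a list of lists
    """
    with open(file, encoding = 'utf-8') as csvfile:
        reader = csv.reader(csvfile, delimiter = ',')
        rows = []
        for line in reader:
            row_data = []
            for element in line:
                row_data.append(element)
            if row_data != []:
                rows.append(row_data)
    rows.pop(0)  # drop the header row
    return rows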
You don’t change anything about this code (including the indentations). Just input it directly as it’s written (copy/paste into and from Sublime Text) and press Enter.
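If “a list of lists” sounds abstract, here’s a small, self-contained sketch that writes a made-up two-tweet archive to a temporary file and reads it back. The sample rows and the four-column layout are assumptions for illustration (a real tweets.csv has more columns), and the read_csv here is a compact variant written for this demo.

```python
import csv
import os
import tempfile

def read_csv(file):
    """Read a CSV file into a list of lists, skipping the header row."""
    with open(file, encoding="utf-8", newline="") as csvfile:
        rows = [row for row in csv.reader(csvfile) if row]
    return rows[1:]

# Made-up sample in the shape of a tweets.csv archive (only four columns shown).
sample = (
    "tweet_id,in_reply_to_status_id,in_reply_to_user_id,timestamp\n"
    "111,,,2014-06-01 12:00:00 +0000\n"
    "222,,,2016-02-03 08:30:00 +0000\n"
)

with tempfile.TemporaryDirectory() as folder:
    path = os.path.join(folder, "tweets.csv")
    with open(path, "w", encoding="utf-8") as f:
        f.write(sample)
    tweets = read_csv(path)

print(len(tweets))    # number of tweets, header excluded
print(tweets[0][0])   # the first tweet's ID
print(tweets[0][3])   # the first tweet's timestamp
```

Each tweet becomes one inner list, so tweets[0] is the whole first tweet and tweets[0][3] is just its timestamp.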
On Kris’ blog post, he tells you to run an authentication function but, after I ran into multiple issues with the function, he said I didn’t need to run that function. So next is to authenticate Tweepy to Twitter.
Here’s the code that you’ll paste next to the >>>:
import tweepy
auth = tweepy.OAuthHandler(consumer_key, consumer_secret)
auth.set_access_token(access_key, access_secret)
api = tweepy.API(auth)
# api.me() works in Tweepy 3.x; in Tweepy 4 and later, use api.verify_credentials() instead
print("Authenticated as: %s" % api.me().screen_name)
Again, you don’t have to change a thing in this code: input it exactly as it is (put it in your Python editor first, then copy/paste into your Python terminal). You’ve already told Tweepy your consumer key/secret and access key/secret, so it should be good to go. On success, the terminal prints “Authenticated as:” followed by your Twitter username.
Now, tell Tweepy and the Twitter API to read your downloaded tweet archive, by inputting this at the >>>:
tweets = read_csv('/path/to/file/tweets.csv')
Where it says /path/to/file/tweets.csv, you’ll want to replace that text with the directory path for the tweets.csv file that you downloaded earlier. Remember how I said that you’d want to rename the unzipped folder to something nice? This is why. For me, my code was:
tweets = read_csv('/users/myusername/downloads/twitterarchive/tweets.csv')
I renamed my unzipped folder “twitterarchive” instead of the long title that folder originally had. Need help figuring out the directory for your file? On a Mac, you can right-click the file and select Get Info, which opens an info window for the file.
The directory is under the Where: section. After Macintosh HD is where our directory path begins: /users/myusername/downloads/twitterarchive
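A mistyped path is an easy way to get stuck here, so it can help to ask Python whether the file is really where you think it is before reading it. This is only a sketch: the helper name find_archive and the folder ~/Downloads/twitterarchive are assumptions, so substitute your own folder.

```python
from pathlib import Path

def find_archive(folder):
    """Return the path to tweets.csv inside folder, or None if it is not there."""
    candidate = Path(folder).expanduser() / "tweets.csv"
    return candidate if candidate.is_file() else None

# Hypothetical location: replace with wherever you unzipped your archive.
archive = find_archive("~/Downloads/twitterarchive")
if archive is None:
    print("No tweets.csv found there; double-check the folder name.")
else:
    print("Found it:", archive)
```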
Ok, now that we’ve told Tweepy/Twitter API where the file is, we can start querying the file for certain tweets and then telling Tweepy/Twitter API to delete them.
5) Deleting! Our next move is to tell Tweepy/Twitter what to delete. My goal was to delete all tweets that happened before January 2016. I joined Twitter in 2008, so that meant deleting tweets between 2008 and 2015. Here is the script I used:
tweets_marked = []
month_list = ['2014-01', '2014-02', '2014-03', '2014-04', '2014-05', '2014-06', '2014-07', '2014-08', '2014-09', '2014-10', '2014-11', '2014-12', '2015-01', '2015-02', '2015-03', '2015-04', '2015-05', '2015-06', '2015-07', '2015-08', '2015-09', '2015-10', '2015-11', '2015-12']
for tweet in tweets:
    # tweet[3] is the timestamp column in tweets.csv; its first 7 characters are the year and month
    if tweet[3][0:7] in month_list:
        tweets_marked.append(tweet)
I marked tweets for two years at a time, so the script above is what I used to identify the tweets between 2014-2015. To do a different set of years, just change the years in the script.
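Typing out every month by hand invites typos (a missing comma or a skipped month is hard to spot). As an alternative, here’s a short sketch that builds the list for any range of years; the function name months_between is my own, not part of Kris’ code.

```python
def months_between(first_year, last_year):
    """Build 'YYYY-MM' strings for every month from first_year through last_year."""
    return [
        f"{year}-{month:02d}"
        for year in range(first_year, last_year + 1)
        for month in range(1, 13)
    ]

month_list = months_between(2014, 2015)
print(len(month_list))                 # 24
print(month_list[0], month_list[-1])   # 2014-01 2015-12
```

To mark a different set of years, change the two numbers passed to months_between.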
Before I deleted, I checked to make sure I was deleting the correct tweets. This script gave me a chance to review all tweets that were marked for deletion:
print(len(tweets_marked), 'tweets marked for deletion.')
for tweet in tweets_marked:
    print(tweet)
Note: You have to press Enter twice before all of the tweets will show up. Weird thing.
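If you have thousands of marked tweets, scrolling through every one is a lot. A per-month tally can be a quicker sanity check. The sketch below uses made-up rows in place of a real tweets_marked list (leave those sample rows out when running it on your own data) and assumes the timestamp sits in column 3.

```python
from collections import Counter

# Made-up rows standing in for tweets_marked; column 3 is assumed to be the timestamp.
tweets_marked = [
    ["111", "", "", "2014-06-01 12:00:00 +0000"],
    ["112", "", "", "2014-06-15 09:10:00 +0000"],
    ["113", "", "", "2015-01-02 18:45:00 +0000"],
]

# Count marked tweets by their 'YYYY-MM' prefix.
per_month = Counter(tweet[3][0:7] for tweet in tweets_marked)
for month, count in sorted(per_month.items()):
    print(month, count)
```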
Once I decided I was ready to delete the tweets, I copy/pasted this script from my code editor file:
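# build list of marked status IDs
to_delete_ids = []
delete_count = 0
for tweet in tweets_marked:
    to_delete_ids.append(tweet[0])

# delete marked tweets by status ID
for status_id in to_delete_ids:
    try:
        api.destroy_status(status_id)
        print(status_id, 'deleted!')
        delete_count += 1
    except Exception:
        print(status_id, 'could not be deleted.')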
Press Enter twice and the deletion begins. It’s a thing of beauty. You can get a count of how many tweets were deleted through that process by inputting this code:
print(delete_count, 'tweets deleted.')
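Since deleting can’t be undone, you might want a rehearsal first. Here’s a rough sketch of a deletion loop with a dry-run switch. The names delete_tweets and fake_delete are hypothetical, and fake_delete is a stand-in so the example runs without touching Twitter at all.

```python
def delete_tweets(status_ids, delete_one, dry_run=True):
    """Try to delete each ID with delete_one; return (deleted, failed) ID lists."""
    deleted, failed = [], []
    for status_id in status_ids:
        if dry_run:
            # Rehearsal mode: report what would happen, delete nothing.
            print(status_id, "would be deleted.")
            continue
        try:
            delete_one(status_id)
            deleted.append(status_id)
        except Exception:
            failed.append(status_id)
    return deleted, failed

def fake_delete(status_id):
    """Stand-in for a real delete call; pretends ID '222' is already gone."""
    if status_id == "222":
        raise ValueError("no such tweet")

deleted, failed = delete_tweets(["111", "222", "333"], fake_delete, dry_run=False)
print(len(deleted), "deleted;", len(failed), "failed")
```

With a real, authenticated Tweepy session you would pass api.destroy_status in place of fake_delete, and only set dry_run=False once the rehearsal output looks right.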
After the deletion process is finished, you can re-run the scripts with other time variables (or check out Kris’ alternative options for marking certain tweets for deletion).
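One such time variable is a simple cutoff date, which saves you from listing months at all. This sketch marks everything before January 1, 2016. It assumes timestamps start with a date written as YYYY-MM-DD and sit in column 3; the sample rows and the helper name is_before are made up for illustration.

```python
from datetime import datetime

def is_before(timestamp, cutoff):
    """True if a timestamp starting with 'YYYY-MM-DD' falls before cutoff."""
    tweeted = datetime.strptime(timestamp[0:10], "%Y-%m-%d")
    return tweeted < cutoff

# Made-up rows in the shape of the archive; column 3 is the timestamp.
tweets = [
    ["111", "", "", "2015-12-31 23:59:00 +0000"],
    ["222", "", "", "2016-01-01 00:00:00 +0000"],
]
cutoff = datetime(2016, 1, 1)
tweets_marked = [tweet for tweet in tweets if is_before(tweet[3], cutoff)]
print(len(tweets_marked), "tweets marked for deletion.")
```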
Through this process, I deleted 7,233 old tweets (all tweets before 2016). I’m hoping to run a similar process to delete old Direct Messages from Twitter.
Are you ready to try it? If you tried it, how did your tweet purge go? Where did you run into issues and how did you resolve them? Share your experiences in the comments section below!