Gender diversity on twitter?

Results of who I follow on twitter

I rarely read Twitter due to the API changes, which I’ve talked about in the past. But I saw Teknoteacher talking about changing who he follows after reading about the accounts male tech CEOs follow. I thought I’d share some things I discovered too, especially after reading this a while back.

So my results are above, using the online tool –

But a while ago I used Open Humans’ Twitter archive analyzer by Bastian Greshake Tzovaras. It was super sobering!

Here are my replies by gender from when I first started using Twitter back in 2007. As you can see, there was a massive spike of conversation with males in 2012. I also generally talk to more men than women on Twitter.

My replies & gender

Likewise, when retweeting based on gender, it’s mainly males. Recently it’s a lot closer to 50%, which is great, but I wonder how my lack of Twitter use will affect things? (I have requested a new update of my Twitter data.)

My retweets & gender

Of course my instant thought is there is noise in the figures, as it’s not always clear if people are male or female, for many reasons. But it’s disappointing to read Elon Musk’s tweet.

And read about others such as…

Sundar Pichai, the CEO of Google, follows 267 accounts on Twitter. Of those, 238 appear to be men. He follows nearly as many Twitter Eggs (15) as women (21).

Satya Nadella, Microsoft CEO, followed the most women (39) of any of the accounts examined by the Guardian, though that is still half the number of men he follows (78) out of a total of 165 accounts.

I’d really like to see this applied to race too, not just gender. It reminds me that I was going to learn more Python so I can create this as a Jupyter personal notebook in Open Humans.


I updated Open Humans with my latest Twitter data export and here are the results.
Once again very sobering to see. Got to make some changes.

Screenshot of replies for 2019

Worth adding, from the TwArχiv site:

The graph shows you the number of replies to Twitter users that are classified as either male or female. The classifications are predictions based on users’ first names as given in their Twitter accounts. The predictions themselves are performed by the Python package gender_guesser. It uses name/gender-frequencies from a larger text corpus. mostly_male, mostly_female, andy and unknown classifications are ignored. To decrease the noise the daily values have been averaged by a daily average over a 180 day window (dataframe.rolling('180d').mean()).

Ideally these graphs would include non-binary folks. Doing this is a bit trickier. It is thus a work in progress.
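As a rough illustration of what that description amounts to, here is a minimal Python sketch using only the standard library. It is not TwArχiv’s actual code: the tiny `NAME_TO_GENDER` table is a made-up stand-in for the name lookup that gender_guesser performs, the sample replies are invented, and `classify`, `daily_counts` and `rolling_mean` are hypothetical helpers, the last one imitating a trailing 180 day window.

```python
from collections import Counter
from datetime import date, timedelta

# Hypothetical stand-in for the gender_guesser lookup. The real package can
# also answer "mostly_male", "mostly_female", "andy" and "unknown".
NAME_TO_GENDER = {"alice": "female", "bob": "male", "sam": "andy"}


def classify(display_name):
    """Guess a label from the first word of a display name."""
    parts = display_name.split()
    first = parts[0].lower() if parts else ""
    return NAME_TO_GENDER.get(first, "unknown")


def daily_counts(replies, wanted=("male", "female")):
    """Count replies per (day, label), ignoring every other label."""
    counts = Counter()
    for day, display_name in replies:
        label = classify(display_name)
        if label in wanted:
            counts[(day, label)] += 1
    return counts


def rolling_mean(counts, label, days, window=180):
    """For each day, average the daily counts inside the trailing window."""
    result = {}
    for day in days:
        start = day - timedelta(days=window)
        values = [counts[(d, label)] for d in days if start < d <= day]
        result[day] = sum(values) / len(values)
    return result


# Invented sample data: (day of the reply, display name replied to).
replies = [
    (date(2019, 1, 1), "Alice Example"),
    (date(2019, 1, 1), "Bob Example"),
    (date(2019, 1, 2), "Bob Example"),
    (date(2019, 1, 2), "Sam Example"),  # classified "andy", so ignored
]
days = [date(2019, 1, 1), date(2019, 1, 2)]
counts = daily_counts(replies)
male = rolling_mean(counts, "male", days)
female = rolling_mean(counts, "female", days)
```

With only two days of data the window does very little, but over years of tweets the same averaging is what smooths the daily spikes into the curves in the screenshots.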

Screenshot of retweets for 2019

Also worth mentioning…

Even more interesting than whether replying to people might be gendered can be the question which voices are being amplified. On Twitter a good indicator of amplification are retweets. These can be gender balanced or show biases, similarly to the replies to other users.

The graph shows you the number of retweets to Twitter users that are classified as either male or female. The classifications are again predictions made by the Python package gender_guesser. To decrease the noise the daily values have again been averaged by a daily average over a 180 day window (dataframe.rolling('180d').mean()).

Ideally these graphs would include non-binary folks. Doing this is a bit trickier. It is thus a work in progress.

Maybe it really is time to drop Twitter…

Dead twitter

I used to use Corebird on my laptop for Twitter access. Today it was broken, and a quick search turned up a page explaining it all.

As many of you may know, Twitter decided to remove the UserStream API, which many third-party clients use, including Corebird. It’s a vital part of the user experience and is used for real-time timeline updates, DM retrieval, mentions, etc.

The replacement is the Accounts Activity API. I have not looked much into its details since the technical difficulties are enough to make it virtually impossible for me to port Corebird to it, but what I know is that real-time tweet updates aren’t supported and the prices are well beyond what I could possibly pay (“$2,899 per month for 250 users”).

Now, there would be a few ways out, of course. Porting to the Accounts Activity API is off the table, but other protocols exist. Since Corebird has never been anything else than a Twitter client, there is no abstraction for the Twitter API however, so porting to another protocol will be a lot of work again. Since I’m not a student anymore, I can’t promise to do any of that work. The master branch is additionally in a very WIP state with the ongoing GTK4 port and a bunch of other features.

The API removal will take place mid-August, so Corebird will mostly stop working at that point. I do not know of any real alternative that is not of course.

If this explanation was too convoluted, has one as well.

I’d like to thank everyone who helped me over the years and all the patrons on here especially for all the support.

Seriously… I’m so very, very close to dropping Twitter. Although I benefited greatly from it in the past, they have seriously overstepped the mark, and my alternative, Mastodon, is growing massively. I already stopped cross-posting to Facebook after their decision to drop automated posting.

Microblogging dataportability at last?

Twitter data dump

Finally got the ability to download my tweets… Over 6 years of tweets in 6.8 meg of files.

It comes as a zip file, not a tar file, which is interesting because Facebook uses tars for its data dumps. The structure is interesting because it’s less of a dump and more of a formal backup of your data, complete with an HTML file to bring it all together. There’s a README.txt file which reads…

# How to use your Twitter archive data
The simplest way to use your Twitter archive data is through the archive browser interface provided in this file. Just double-click `index.html` from the root folder and you can browse your entire history of Tweets from inside your browser.

In the `data` folder, your Twitter archive is present in two formats: JSON and CSV exports by month and year.

  • CSV is a generic format that can be imported into many data tools, spreadsheet applications, or consumed simply using a programming language.

## JSON for Developers

  • The JSON export contains a full representation of your Tweets as returned by v1.1 of the Twitter API. See for more information.
  • The JSON export is also used to power the archive browser interface (index.html).
  • To consume the export in a generic JSON parser in any language, strip the first and last lines of each file.

To provide feedback, ask questions, or share ideas with other Twitter developers, join the discussion forums on
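The README’s last tip for developers is easy enough to try in Python. This is only a sketch under assumptions: `parse_archive_json` is a hypothetical helper name, the sample text is invented rather than copied from a real archive file, and the fallback to stripping only the first line is my own guess in case a file does not end with a wrapper line.

```python
import json


def parse_archive_json(text):
    """Parse one archive data file that wraps its JSON in extra lines."""
    lines = text.splitlines()
    # First follow the README literally (drop the first and last lines).
    # If that is not valid JSON, fall back to dropping only the first line.
    for body in (lines[1:-1], lines[1:]):
        try:
            return json.loads("\n".join(body))
        except json.JSONDecodeError:
            continue
    raise ValueError("could not find JSON after stripping wrapper lines")


# Invented sample: these wrapper lines are an assumption for illustration.
sample = 'var tweets =\n[ {"id_str": "1", "text": "hello"} ]\n;'
tweets = parse_archive_json(sample)
```

For a real file you would read it first, for example with `open(path, encoding="utf-8").read()`, and pass the text in.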

Most of the data is JSON, which bugs me a little only because I would personally have to transform it all to XML, but alas I’m sure everyone loves it. The CSV spreadsheets are odd and could do with being XML instead, but once again I’m sure it’s useful to someone out there. The nice thing is there is tons of metadata around each microblog/tweet, including the geo-location, time and device/client. Even the URLs have some interesting things around them, because I was wondering how they were going to deal with shortened URLs, retweets and mentions…

"urls" : [ {
  "indices" : [ 69, 89 ],
  "url" : "",
  "expanded_url" : "",
  "display_url" : ""
} ]

Doesn’t always work… especially when using URL shorteners which don’t keep the URL after a certain time period. Interestingly, internally Twitter always uses its own for everything…
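To show what those fields are good for, here is a small Python sketch that swaps each shortened link in a tweet’s text for its `expanded_url`, using `indices` as the start and end character positions. The helper name `expand_urls`, the sample tweet and the example addresses are all made up for illustration, and falling back to `url` when `expanded_url` is empty is my own assumption.

```python
def expand_urls(text, url_entities):
    """Replace each shortened URL in `text` with its expanded_url."""
    # Work from the end of the text so earlier indices stay valid.
    ordered = sorted(url_entities, key=lambda e: e["indices"][0], reverse=True)
    for entity in ordered:
        start, end = entity["indices"]
        # Keep the short URL if no expanded one was recorded.
        replacement = entity.get("expanded_url") or entity["url"]
        text = text[:start] + replacement + text[end:]
    return text


# Invented tweet: the addresses below are placeholders, not real links.
short = "https://short.example/abc"
tweet_text = "Reading this: " + short + " now"
start = tweet_text.index(short)
entities = [{
    "indices": [start, start + len(short)],
    "url": short,
    "expanded_url": "https://example.org/a-much-longer-address",
    "display_url": "example.org/a-much-longer",
}]
expanded = expand_urls(tweet_text, entities)
```

This only recovers whatever the archive recorded at the time, so a link that went through an expired third-party shortener stays just as dead.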

Right now I’m just interested in the period around my brush with death… Real shame there’s no references to mentions you’ve had, as I would have loved to have seen some of those. Guess Twitter were not going to delve into that can of worms…

I want to know why there’s no importer?

CNET have an overview of how and what to do with the archive. Thanks Matt

Don’t forget to Tweet seat

cinema "Batalha" #4

Tony Tweets a piece following my blog about what cinema could learn from TV.

The theater may seem like the least appropriate place to check your Twitter feed, but that’s exactly the kind of behavior a Minnesota venue is encouraging with the launch of a designated “Tweet Seats” section. The Guthrie Theater in Minneapolis opened up its Tweet Seats for the first time this week, for the first of four performances of The Servant of Two Masters. Priced at $15 a ticket (compared to the $34 a standard ticket costs), the seats are all located in a balcony-level section where, according to the theater, spectators’ Twitter habits “will not be disruptive to other patrons.”

Tony is worried this might affect the way films are actually made, but as I blogged, it could be interesting for cinema…

My biggest problem is the light and sound phones generate when I’m trying to watch the film. If the seats are up above or right at the back, then it could work? Although the back seats are usually for couples not really interested in the film… Won’t even tell you what I’ve found in the backseats while I’ve been working…

At the end of the day, it’s coming, like it or not, Tony and others…

…regardless of how theatergoers choose to allocate their tweet time, the Guthrie and other venues seem more willing to embrace the mobile habits of contemporary audiences, rather than discourage them. Theaters in Boston launched similar experiments late last year, as have the Cincinnati Symphony Orchestra, Palm Beach Opera and New York’s Public Theater

Now is the time for the Cinemas and the movie industry to get behind this and do some interesting prototyping…

Sign me up people…!