Franklin Hu

Visualizing Reading Diversity

Starting in 2016, I made a concerted effort to broaden my reading to a more diverse set of authors, which, as a first pass, boiled down to authors who aren’t white men.

I use Trello to track and categorize my reading list by labels based on author background, and while this helps with prioritization[1], I hadn’t gone back to look at how I was doing in relation to my original goals.

I’ve cobbled together a set of scripts to dump all my Goodreads data, massage and fix up incomplete data, and export it as a set of graphs by a few dimensions. The Goodreads API is pretty abysmal to work with, but I cleaned up a Golang client library that had enough functionality for what I needed (reviews and authors).

This is definitely still a work in progress, and I plan on adding a few more facets[2] as well as digging more into some of the intersectionalities (e.g. group by gender+heritage[3], rating+gender).


The past two and a half years have trended in the right direction in terms of reading more books by women (2015: 28.5%, 2017: 44.8%) and people of color (2015: 4.7%, 2017: 34.4%). There’s still tons of room for improvement, including tweaking how I choose what to read next.

Are there things you do to broaden your reading?

Note: the graphs in this post are a snapshot in time, but I plan on periodically updating the ones on my Reading page.

  1. Mostly toward books written by women and authors of color. It’s surprisingly easy when reading non-fiction to end up only reading books by white men. 

  2. Nationality, age, fiction/non-fiction, translation, etc. Breaking down heritage into “white” or “person of color” is a bit broader than I’d like, but as far as I know, there aren’t any good ways to programmatically get this type of data for authors. The tagging here was done entirely by hand. 

  3. If anyone has a better word than “heritage” I’m all ears.