Google’s text mining and statistical research

Obviously as a key part of the statistical research methods essay I should currently be writing, I’ve been inspired by Yolana at Colonial Psychiatry Hub to have a play with Google’s data mining tool Ngram Viewer, which searches Google’s archive of books for the frequency of any word you plug in. There’s plenty of room for serious error, not only in Google’s own classification of its books, but also from spelling inconsistencies, fashions in words and grammar, bias towards the types of books that survive from before 1700, and all sorts. As Yolana says, Anterotesis takes a good and funny look at the weird things you can come up with.

However, for African political terms, which have really only come into their own since the 1800s, there are some expected and fun results. I’ve played with the key words from my seminar on “Tribe and Nation” here.


