Google’s text mining and statistical research

As an obviously key part of the statistical research methods essay I should currently be writing, I’ve been inspired by Yolana at Colonial Psychiatry Hub to have a play with Google’s data mining tool Ngram Viewer, which searches its archive of books for the frequency of any word you plug in. There’s plenty of room for serious error, not only in Google’s own classification of its books, but also from spelling inconsistencies, fashions in words and grammar, bias towards the types of books that survive from before 1700, and all sorts. As Yolana says, Anterotesis takes a good and funny look at the weird things you can come up with.

However, for African political terms, which have really only come into their own since the 1800s, there are some expected and fun results. I’ve played with the key words from my seminar on “Tribe and Nation” here.

Filed under Academia, Procrastination
