Google's N-Gram viewer
This software is freely available through Google, but its
scope is very limited. It draws on Google Books, which holds digitized books
dating back centuries that can be searched for specific terms. Because of
this, it is possible to search the books on Google Books for a
term and see when and how it was used over time.
This is a useful tool for linguists, but also for English
scholars more broadly, because it shows how language is used over time. A
linguist might use it to study language change, while a researcher in
literature might use it to compare how a term is used in one text with how
common the term was at that time, or with how it is used in other works of the
same period.
Searching and N-gram graphs
My first search on the N-gram viewer was for the word “teenagers”
because I knew from my undergrad capstone that the concept is a fairly new one.
As expected, the N-gram graph shows that the word does not come into use
until around the 1940s, and then it quickly picks up.
As you can see from this screen, I have limited the search
to the years between 1935 and 1960. I did this because the default search
range showed that the term doesn’t appear until around 1935, but with the
broader range it was difficult to see exactly when the term first
gained prominence.
Narrowing the range even further, to between 1935 and 1950,
shows a low frequency until 1940 and then a sharp
increase. This agrees with the culture of the period. With the end
of the Great Depression and the Second World War, America entered a period
of prosperity. This allowed for more leisure time, which in turn allowed our current
schemas of teenagers to develop, as shown by looking at the context of the
search term.
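The narrowing I did by hand can also be sketched in code. This is only a rough illustration, assuming you already have a yearly frequency series for the term in hand. The numbers below are made up for the example and are not real N-gram viewer output, and the function names `restrict_years` and `sharpest_rise` are my own.

```python
def restrict_years(series, start, end):
    """Keep only the entries whose year falls between start and end (inclusive)."""
    return {year: freq for year, freq in series.items() if start <= year <= end}


def sharpest_rise(series):
    """Return the year showing the largest increase over the previous data point."""
    years = sorted(series)
    best_year, best_jump = None, 0
    for prev, cur in zip(years, years[1:]):
        jump = series[cur] - series[prev]
        if jump > best_jump:
            best_year, best_jump = cur, jump
    return best_year


# Made-up frequencies per year (illustrative only, not real data).
example = {1935: 0, 1938: 1, 1940: 2, 1943: 15, 1946: 20, 1950: 24, 1960: 30}

window = restrict_years(example, 1935, 1950)
print(sorted(window))         # the years left after narrowing the range
print(sharpest_rise(window))  # the year where use picks up most sharply
```

With a narrower window the flat early years and the sudden jump are easier to pick out, which is the same reason the smaller range made the graph easier to read.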
Context
If you click on the year ranges at the bottom of the screen,
Google’s N-gram viewer will return every instance of the search term depicted
in the graph for that range. As shown in the previous image, the first range is
between 1935 and 1943.
When sorted by relevance, the results show that the term “teenager” is typically
used in reports and studies. As shown below, the first three results are a
pamphlet, a report, and a study on sleep and academics. The other eight sources are
more of the same. This shows that teenagers weren’t yet the subcultural group
that we know them as today, but rather existed more as a sociological
classification.
The range of 1951 to 1957 is the first in which we see teenagers
referred to in the way that we are used to now. The third result, from the American
Lutheran, talks about how parents in 1954 were struggling to understand their
unruly teenagers.
To give some perspective, the following image shows the
snippet view of the text in question. This shows that teenagers in the
mid-fifties were being somewhat rebellious. To the modern eye, this is an
amusing complaint compared with what is said about teenagers today. However, this
snippet is all that is available through Google. One limitation of the Google N-gram
viewer is that it doesn’t always have the full text of the book available. This
can be frustrating because the context in the snippet view might be too limited
for a researcher’s needs.
Overall, Google’s N-gram viewer is useful for starting
research or seeing how terms are used in specific time periods, but should
probably not be used for serious research.
In addition to having a limited scope, it often produces
incorrect results because of the way that Google Books is archived. The
following image shows a search for video games with results dating back to
1945. This is impossible because the first video game in history wasn’t revealed
until 1950, and video games developed for the public didn’t exist until the 1970s.
Clicking on the entry for context shows that the original
publication date is 1945, but the section containing the search term is from
1990. This is likely due to a minor error, and a single mistake like this wouldn’t
skew the results much. However, these inaccuracies make the tool more difficult
and frustrating to use.
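One way to catch this kind of problem is to flag any result whose listed year is earlier than the term could plausibly have existed. The sketch below is only an illustration of that check. The `Hit` record, the `flag_anachronisms` function, and the placeholder titles are all my own inventions, and Google does not hand you results in this form.

```python
from dataclasses import dataclass


@dataclass
class Hit:
    title: str
    listed_year: int  # publication year as recorded in the archive


def flag_anachronisms(hits, earliest_plausible_year):
    """Return the hits whose listed year predates the earliest plausible use of the term."""
    return [hit for hit in hits if hit.listed_year < earliest_plausible_year]


# Hypothetical results for "video games"; the titles are placeholders.
hits = [
    Hit("Placeholder volume A", 1945),
    Hit("Placeholder volume B", 1983),
    Hit("Placeholder volume C", 1991),
]

# 1950 is the cutoff because that is when the first video game was revealed.
suspect = flag_anachronisms(hits, earliest_plausible_year=1950)
for hit in suspect:
    print(hit.title, hit.listed_year)
```

Anything this check flags still has to be opened and read, since the listed year only tells you that something is off, not which part of the volume the term actually comes from.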




