Ngrams 1851

After running both 5-grams and 7-grams, I found the 7-grams more useful because they offered fuller phrases; this, however, was partly due to the abundance of the articles a and the in the n-grams. On future passes, I strongly recommend adding both articles to the stop word list.
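To show why the articles matter, here is a minimal sketch of n-gram extraction with and without a stop word list. It is not the tool used for these runs; the function name, the stop word set, and the sample sentence are all assumptions made for illustration, and the sentence is invented rather than quoted from an advertisement.

```python
import re

# Hypothetical stop word list holding only the two articles discussed above.
STOP_WORDS = {"a", "the"}

def ngrams(text, n, stop_words=frozenset()):
    """Return every run of n consecutive words, after dropping stop words."""
    tokens = [t for t in re.findall(r"[a-z']+", text.lower())
              if t not in stop_words]
    return [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]

# Invented sample sentence, not taken from the 1851 advertisements.
sample = ("Taken up and committed to the jail of a county in Texas "
          "a man about thirty years of age")

with_articles = ngrams(sample, 5)
without_articles = ngrams(sample, 5, STOP_WORDS)

print(with_articles[1])     # the article takes up one of the five slots
print(without_articles[1])  # the same window now reaches one word further
```

With the articles removed, a 5-gram covers more content words, which is part of the fuller context that the 7-grams otherwise supplied.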

Overall, there were few unexpected results. Negro was either used as a generic noun (a synonym for slave or runaway) or as an adjective predominantly followed by man. Man most often appeared in the phrase a negro man, but boy most often appeared as said boy. Girl appeared only twice and woman not at all. Five of the six appearances of color followed copper; one followed griff. It would be interesting to investigate why the phrase copper color occurred almost exclusively in 1851. Was copper, as a skin descriptor, just entering common usage, so that advertisers felt the need to indicate that it referred to skin color? Interestingly, mulatto appeared five times, but it stood alone only once; there were three instances of bright mulatto and one of brown mulatto. Unsurprisingly, Texas most often followed County and rarely the name of a city. Says was used exclusively to indicate a claim (i.e., a name or an owner's name) made by a captured runaway. Said was used exclusively as an article in place of the or this.
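Counts such as how often color followed copper can be checked by tallying the word immediately before each keyword. The following is a rough sketch under stated assumptions: the helper name is hypothetical, and the sample text is invented to mimic the phrasing discussed above, so its counts are not the results reported here.

```python
import re
from collections import Counter

def preceding_words(text, keyword):
    """Tally the word that comes immediately before each use of keyword."""
    tokens = re.findall(r"[a-z']+", text.lower())
    return Counter(tokens[i - 1]
                   for i in range(1, len(tokens))
                   if tokens[i] == keyword)

# Invented sample text; the real input would be the 1851 advertisements.
sample = ("said boy is of copper color, another of dark copper color, "
          "and one of griff color")

color_counts = preceding_words(sample, "color")
boy_counts = preceding_words(sample, "boy")

print(color_counts.most_common())
print(boy_counts.most_common())
```

Running the same tally on each year's advertisements separately would be one way to test whether copper color really clusters in 1851.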

Likewise, there were no surprises when using my top five words. Most of the top five were included in the class list, but those not already searched were years, feet, high, and county. All four yielded the results that one would expect.

For the choose five, I ran black, yellow, dark, about, and speaks. Black occurred most frequently as a skin color descriptor and only rarely described an article of clothing (always a hat) or an animal (one mule and one horse); yellow exclusively described skin color. Likewise, dark always described skin color, appearing next to either complexion or a color (always copper or yellow). About was most commonly associated with age (32 occurrences), followed by height (15); it was rarely linked to a date (6), weight (4), location (2), or physical description (2). Lastly, speaks appeared three times. Each time, it indicated a speech pattern characterized by broken English or multilingual speech.

