Thursday, November 17, 2016

Data Mining



This week our class began exploring the concept of data mining. As Wikipedia explains, data mining is “the computational process of discovering patterns in large data sets involving methods at the intersection of artificial intelligence, machine learning, statistics and database systems.” Furthermore, as Wikipedia notes, “the overall goal of the data mining process is to extract information from a data set and transform it into an understandable structure for further use.”
After learning more about data mining, our tribe decided to take a shot at it. We did this by examining our most recent blog posts, which were written individually. Using Voyant, a site that is “a web-based reading and analysis environment for digital texts,” our tribe wanted to discover which words each of us used most frequently. We also wanted to find out whether, and how, these frequent words overlapped with those of other tribe members. To begin, each of us copied and pasted our individual identity blog into Voyant’s textbox, which showed us which words were used the most. The results were as follows:
Jonas’s three most frequent words: identity, person, online
Sam’s three most frequent words: identity, intelligence, people
Morgan’s three most frequent words: online, intelligence, able
Chase’s three most frequent words: social, identity, virtual
Our tribe’s most frequently used terms (left to right): Jonas, Sam, Morgan, and Chase
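For readers curious about what a word-frequency count involves, here is a minimal sketch in Python. It is only an illustration of the general idea, not how Voyant works internally. The function name `top_words`, the tiny stop-word list, and the sample sentence are all made up for this example (Voyant uses its own, much longer stop-word list).

```python
import re
from collections import Counter

# Hypothetical, deliberately short stop-word list (an assumption for this sketch).
STOPWORDS = {
    "the", "a", "an", "and", "or", "of", "to", "in", "is", "are", "it",
    "that", "this", "we", "our", "for", "on", "with", "as", "be", "by",
    "my", "i",
}

def top_words(text, n=3):
    """Return the n most frequent non-stop-words in text, most frequent first."""
    # Lowercase the text and pull out runs of letters (and apostrophes).
    words = re.findall(r"[a-z']+", text.lower())
    # Count every word that is not a stop word.
    counts = Counter(w for w in words if w not in STOPWORDS)
    return [word for word, _ in counts.most_common(n)]

# Made-up sample text, not taken from any of our blogs.
sample = "My online identity is the identity I share online. Identity is personal."
print(top_words(sample, n=2))
```

Pasting a whole blog post into `sample` and asking for `n=3` would give a rough equivalent of the three-word lists above, though the exact results depend on which stop words are filtered out.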


After looking at our identity blogs individually, our tribe compared the frequent terms from each tribe member’s blog and found that the most commonly used words were identity, intelligence, and online. Once we gathered this data, we used the site Ngram Viewer, “an online search engine that charts frequencies of any comma-delimited search strings,” to discover how frequently those three terms have been used over time.



As our graph showed, the term “intelligence” has had several spikes since the 1500s, most notably in the 1600s and from the mid-1700s until 1900. This led our tribe to wonder what was going on during those times that made people ask about intelligence. Was the rise of the term intelligence in the 1600s due to the Pilgrims arriving in Massachusetts? Did the spike that began in the mid-1700s have something to do with the French Revolution or the United States gaining independence from England?

Along with intelligence, the term “identity” had a major spike in 1628. Was this because of England’s exploration of what we know today as America? Were the English trying to find out who they were by exploring a new land? Furthermore, after this point in 1628, use of the term intelligence decreased drastically until the mid-1950s, right around the time computers were being made and used.

Finally, our tribe examined how the word “online” fared over time. As our tribe suspected, the term had zero measured frequency until the beginning of the 1980s, when computers and their use started becoming widespread, leading many people to venture online.

After we gathered our data, our tribe was fascinated by the data mining we had done using our past blogs on identity. We found not only that we used common terms in our own blogs, but also that those same words have been used throughout time, showing us how often or how rarely those words have been discussed by people. Above all, our data mining provided us with some essential evidence about how our world and the time we live in are shaped by the terms and words we use.
