Dealing with the Overwhelming Flow of Information – Big Data

The Internet has saved researchers the trouble of physically pulling out drawers and flipping through documents. Nowadays almost everyone starts their research online. It seems easy to get enough information from the existing online libraries, databases, and journal collections, from the JSTOR online archive to the increasingly popular Google Scholar, to name just a few. Not to mention the many ongoing digital-archiving projects led by governments, libraries, and non-profits. This endless supply of information can easily create the illusion that we are in the know, and that the truth is just a few clicks away.

But is it really that easy? Anyone bold enough to have embarked on a data-driven research project knows that after you type in the keywords and click search, the brief moment of excitement is usually followed by a much longer period of panic. What do I do with the 493,000 related articles?

Managing overwhelming data and information is no longer a problem exclusively for scientists and mathematicians; it has become a challenge for all researchers, students, governments, and private companies. Luckily, there are also tools and models available to manage big data, helping us eliminate the noise on the surface and dig out the underlying findings.

One of those tools is ASSANA (Accelerating Social Science Analysis for a New Age), a project designed to explore computationally intensive programs that facilitate data analysis in social research. Based on Derrick Cogburn and Amy Wozniak's paper, "Computationally Intensive Content Analysis of Public Diplomacy Data: Understanding the Public Remarks of US Secretaries of State, 1997-2011," the basic ASSANA workflow is as follows:

Data collection and input into the system -> coding -> content analysis based on the coding -> generate content report (output)
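The coding and content-analysis steps of that workflow can be sketched in a few lines of Python. This is a minimal illustration, not ASSANA's actual implementation: the codebook, the keyword lists, and the document snippets below are all hypothetical, and real dictionary-based content analysis would use a far richer codebook and more careful text processing.

```python
import re
from collections import Counter

# Hypothetical codebook mapping thematic codes to keyword lists
# (illustrative only; the actual ASSANA codebook is not described here).
CODEBOOK = {
    "diplomacy": ["treaty", "negotiation", "ambassador"],
    "security": ["terrorism", "defense", "military"],
}

def code_document(text, codebook=CODEBOOK):
    """Coding step: a code applies if any of its keywords appear in the text."""
    words = set(re.findall(r"[a-z]+", text.lower()))
    return {code for code, keywords in codebook.items()
            if any(kw in words for kw in keywords)}

def content_report(documents, codebook=CODEBOOK):
    """Content-analysis step: count how many documents received each code,
    producing the report (output) at the end of the pipeline."""
    counts = Counter()
    for doc in documents:
        counts.update(code_document(doc, codebook))
    return counts

# Stand-in for the collected input data (invented sample remarks).
remarks = [
    "The ambassador announced a new treaty today.",
    "Defense officials discussed terrorism threats.",
    "A negotiation over the treaty continues.",
]
print(content_report(remarks))  # e.g. Counter({'diplomacy': 2, 'security': 1})
```

Even this toy version makes the scale problem concrete: the same loop that handles three sample remarks would have to run over hundreds of thousands of documents, which is why the computing resources discussed below matter.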

Our ability to analyze the data therefore depends largely on the results of massive-scale computing. At the personal level, this creates a technology barrier for those who cannot afford better solutions; at the national level, it turns into a competition to upgrade both hardware and software, which ultimately pushes technological advancement further and further.
