Friday, July 25, 2014

Data Analysis Shortcomings

    The growing volume of data available on the Internet presents challenges for analysts.  Government agencies collect enormous amounts of data daily, and new methods for storing and managing that data until it can be analyzed appear constantly.  Unfortunately, it is impossible for analysts to review all of this data manually.  This is where data analysis tools become necessary.

     Data analysis tools have limitations.  One of the biggest limitations lies not with the tools themselves, but with the user.  This becomes clear when users do not know how to get the most out of a tool, or do not use it as it was designed simply because they do not know any better.  Analysts may also resist using these tools because they focus only on the tools' limitations, view them as threats to their jobs, or lack the skills needed to use them.  The marketing hype surrounding analysis tools can also lead organizations to choose the wrong ones.

    In the article Shiny, Shiny Data: The Thrill of the Chase, the author Leetaru (2014) points out that many are distracted by shiny new object syndrome.  They believe the hype that new data tools will change how they analyze their data, and they blindly adopt those tools simply because they are easy to use.  The fault with choosing tools because they are easy is that they are most likely the wrong tools, and the desire for an easy tool rather than an accurate one is itself the evidence.  Leetaru gives an eye-opening example of this fault from a presentation he sat in on about the Syrian regime.  The presenters did not offer any sources for their data, saying only that it was based on billions of observations.  Leetaru asked how it could be possible to obtain that much open source, street-level data on the rebels.  They disclosed that the information had been obtained from Twitter.  They had scanned Twitter for English-language tweets that originated in Syria, even though they knew that the software used to codify the tweets warned that the results might be invalid.  The better option would have been to monitor Arabic-language Facebook posts, because that is how the rebels were communicating.  Twitter was used simply because the data was easier to access and easier to use, and because no one on the team spoke Arabic.

     While there are many good tools available for analysts, Silicon Valley seems to be failing to develop applications specific to Washington's needs, and Washington is failing to recognize which tools would be most beneficial for those needs.  Leetaru recommends that Washington increase its data literacy and that Silicon Valley increase its application literacy.  Both are necessary to bring the two together in pursuit of data-driven intelligence and policy making.  For organizations to choose the appropriate tools for their data analysis, they need to work more closely with those who are developing the software.


Reference:

Leetaru, K. (2014, May 14). Shiny, Shiny Data: The Thrill of the Chase. Retrieved July 22, 2014, from Foreign Policy: http://www.foreignpolicy.com/articles/2014/05/14/nsa_intelligence_big_data_tradecraft_silicon_valley


      
