NSA To Use Cloud Model For Intelligence Analysis
Hugh Pickens writes "Information Week reports that the National Security Agency is taking a cloud computing approach in developing a new collaborative intelligence gathering system that will link disparate intelligence databases geographically distributed in data centers around the country. The system will house streaming data, unstructured text, large files, and other forms of intelligence data, and analysts will be able to add metadata and tags that, among other things, designate how securely information is to be handled and how widely it gets disseminated. For end users, the system will come with search, discovery, collaboration, correlation, and analysis tools. The intelligence agency is using Hadoop, an open-source implementation of Google's MapReduce parallel processing system, to make it easier to 'rapidly reconfigure data' and because of Hadoop's ability to scale. The NSA's decision to use cloud computing technologies isn't about cutting costs or seeking innovation for innovation's sake; rather, cloud computing is seen as a way to enable new scenarios and unprecedented scalability. 'The object is to do things that were essentially impossible before,' says Randy Garrett, director of technology for NSA's integrated intelligence program."
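For readers unfamiliar with the model: the details of the NSA's system obviously aren't public, but the MapReduce pattern Hadoop implements is simple to sketch. A map phase emits key/value pairs from each input split, a shuffle groups them by key, and a reduce phase aggregates each group; in Hadoop these phases run in parallel across the cluster. A toy single-process word count in Python (purely illustrative, no Hadoop API involved):

```python
from collections import defaultdict

def map_phase(document):
    # Map: emit (word, 1) for every word in one input split
    for word in document.split():
        yield (word.lower(), 1)

def shuffle(pairs):
    # Shuffle: group intermediate values by key
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups

def reduce_phase(groups):
    # Reduce: aggregate each key's values into a final count
    return {key: sum(values) for key, values in groups.items()}

docs = ["the cloud", "the data cloud"]
pairs = [p for d in docs for p in map_phase(d)]
counts = reduce_phase(shuffle(pairs))
print(counts)  # {'the': 2, 'cloud': 2, 'data': 1}
```

The 'rapidly reconfigure data' point follows from this structure: since the reduce logic only sees grouped key/value pairs, you can re-run a different analysis over the same stored data just by swapping the map and reduce functions.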
Distributed == cloud? (Score:5, Insightful)
The data you keep
The NSA
May get a peek.
Burma shave.
But seriously... Distributed data is now "the cloud"? Is my dirty laundry in "the cloud" because it is scattered in my bedroom?
Re:Money well spent? (Score:3, Insightful)
You could say the same thing about search engines back in the mid-90s, before Google's PageRank.
I couldn't agree more; however, the ramifications of inaccurate or misleading results there are minimal, at least to the user if not to Google's bottom line. What is at issue here is collating a resource from disparate sources, in vastly different formats, to enable the relevant agencies to better sort the wheat from the chaff. Having been part of a team aiming to standardise similar data sets into a search resource, I speak from experience when I say this is no simple undertaking. Slashdot has recently [slashdot.org] posted an article on the difficulties of trawling datasets with tools that inevitably produce false positives, and the ramifications of false positives for the people in question are onerous, to say the least. And that's without factoring in the likelihood that no more personnel will be available to follow up flagged content (people), so the net result is either an unused collection of positives or an across-the-board policy change to haul in outliers under the pretense that they have been highlighted by 'powerful new technology'.
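The false-positive problem the parent describes is the classic base-rate issue: when genuine targets are rare, even a very accurate screen flags mostly innocents. A quick sketch with made-up numbers (the rates below are hypothetical, not anything claimed about the NSA's system):

```python
# Hypothetical figures purely for illustration of the base-rate effect
population = 1_000_000      # records screened
true_targets = 100          # genuine items of interest
false_positive_rate = 0.01  # 1% of innocent records get flagged
detection_rate = 0.99       # 99% of genuine targets get flagged

flagged_innocent = (population - true_targets) * false_positive_rate
flagged_guilty = true_targets * detection_rate
precision = flagged_guilty / (flagged_guilty + flagged_innocent)

print(f"flagged: {flagged_guilty + flagged_innocent:.0f}, "
      f"precision: {precision:.2%}")
```

With these numbers roughly 10,000 records get flagged and under 1% of them are genuine hits, which is exactly why an unstaffed pile of positives, or a policy of hauling in everyone flagged, are the two likely outcomes.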