Researchers Forecast the Spread of Diseases Using Wikipedia
An anonymous reader writes: Scientists from Los Alamos National Laboratory have used Wikipedia logs as a data source for forecasting disease spread. The team was able to successfully monitor influenza in the United States, Poland, Japan, and Thailand, dengue fever in Brazil and Thailand, and tuberculosis in China and Thailand. The team was also able to forecast all but one of these, tuberculosis in China, at least 28 days in advance.
Re: (Score:2)
Re: (Score:2)
Who's there?
Interesting (Score:1)
forecast using /. (Score:2)
Wondering when they'll start trying to predict diseases (or maybe PC sales) from /. posts.
Sounds familiar (Score:1)
Sounds familiar, hasn't someone already done that half a year or a year ago using Google search string mapping?
Re: (Score:2, Informative)
Thought so, it was Google, and they even created a page with real-time stats.
http://www.google.org/flutrends/us/#US
Re: (Score:1)
Re: (Score:2)
Which is ancient magic compared to the power of hashtags. ... #duh
How? (Score:2)
How did they do it? I started reading the linked paper, but my brain started hurting two sentences in. I couldn't extract any useful information on the 'how'.
Re:How? (Score:5, Informative)
This implicitly makes some big assumptions, among which the facts that people are aware of the disease and that they have internet access.
You can easily understand why their approach is of very limited usefulness, and scientifically questionable. I think it is no coincidence that their method fails on the data for Uganda (where internet usage probably isn't widespread) and does not score well for China (where censorship limits both information about disease outbreaks and internet access).
They also state in their paper: "With these constraints in mind, we used our professional judgement to select diseases and countries.", and this raised my eyebrows a lot...
I would like to challenge their approach by sifting Wikipedia access data for the keyword "Ebola" in Slovenian, and then forecasting the spread of Ebola in Slovenia (nil so far...), but I try to spend my time testing methods that are better posed.
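For anyone who wants to poke at the underlying assumption (that page views for a disease article rise when the disease is spreading), here is a minimal sketch of the kind of check involved: shift a page-view series by some number of time steps and see how well it correlates with reported case counts. This is not the paper's actual model. The function names and the toy numbers are made up for illustration, and real data would need per-language access logs and official case reports.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation of two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equal-length sequences of length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0  # a flat series carries no signal
    return cov / sqrt(var_x * var_y)


def lagged_correlation(page_views, cases, lag):
    """Correlate page views at step t with case counts at step t + lag."""
    if len(page_views) != len(cases):
        raise ValueError("series must have the same length")
    if lag < 0 or lag >= len(cases) - 1:
        raise ValueError("lag out of range")
    if lag == 0:
        return pearson(page_views, cases)
    return pearson(page_views[:-lag], cases[lag:])


def best_lag(page_views, cases, max_lag):
    """Return the lag (0..max_lag) with the highest correlation."""
    return max(range(max_lag + 1),
               key=lambda lag: lagged_correlation(page_views, cases, lag))


# Toy data, invented for illustration: case counts trail page views by 2 steps.
views = [10, 12, 30, 80, 60, 25, 12, 10, 11, 10]
cases = [20, 20, 20, 24, 60, 160, 120, 50, 24, 20]

if __name__ == "__main__":
    print("best lag:", best_lag(views, cases, max_lag=4))
```

A high correlation at some lag only says the two series move together. It does nothing to address the objections above about awareness, internet access, or cherry-picked countries.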
"There are three kinds of lies: lies, damned lies, and statistics."
Re: (Score:2)
Re:How? (Score:4, Funny)
They made the assumption that if a disease is spreading somewhere, people there start looking for information about the disease on Wikipedia
Imagine the potential: if a lot of search logs contain "EBOL-AAAARGH", they'll know a particularly fast-acting variant of the virus has emerged.
Re: (Score:2)
Which raises the question: if you search for symptom keywords (rash, boils, bleeding, coughing), can Wikipedia actually list diseases with those keywords?
From experience I do know that a lot of food names can be typed in a native language and will still go to the correct page on English Wikipedia, roughly.
But if I start searching for terms and keywords, Wikipedia tends to be worse than Google.
Re: (Score:2)
'They made the assumption'
They made a hypothesis, then tested that hypothesis against the null hypothesis. This is otherwise known as science. Why do you hate science?
Re: (Score:3)
Re: (Score:2)
I think the most important piece of news of this story is that Wikipedia is no better than Google or Facebook, and exploits/sells search data too.
Re: (Score:2, Funny)
Look, we're onto your game. The suggestion that you've been living under a rock was a dead giveaway that you're a zombie...
"... Spread of Diseases Using Wikipedia" (Score:2)
Wait... what? Diseases now use Wikipedia?
Re: (Score:2)
Re: (Score:1)
+MAX_INT. GP and Parent have officially won the thread.
Re: (Score:2)
No, silly - the diseases themselves are not using Wikipedia; people are going to use Wikipedia to spread diseases.
(I rather enjoy the triple meaning ambiguity in this headline)
Wouldn't it be nice if headlines used commas and reflexive pronouns?
Or if there were someone who checked them over before publishing, like a proofreader?
I too read it as using Wikipedia to spread the diseases. Which is, I guess, doable if someone logs gene sequences there that someone else can splice into harmless but compatible bacteria.
Would publishing that kind of information be illegal?
Re: (Score:2)
Oh noes.. when will we be able to get wikicondoms and how would that work?
Useless now that it's known? (Score:2)
Now that they've spread the word, will the approach start to be 'gamed' by big pharma or gov't trying to sow the seasonal flu panic?
Take that Educators! (Score:1)
Re: (Score:1)
The teachers might not know about 'Talk Pages', 'Revisions', and 'What Links Here':
things that make wikipedia much more advanced than traditional encyclopedias.
Re: (Score:2)
The teachers might not know about 'Talk Pages', 'Revisions', and 'What Links Here':
things that make wikipedia much more advanced than traditional encyclopedias.
No, teachers know that lazy students will just blindly copy and paste stuff from wikipedia.
Man! Wikipedia is mean. (Score:2)
umm (Score:2)
Re: (Score:2)
Re: (Score:2)
It's been done, sort of (Score:2)
This is why... (Score:2)
I always wash my hands after using Wikipedia.
Wikipedia the vector (Score:2)
Like others I found the headline confusing. I read it as "Researchers are predicting the use of Wikipedia as a vector for the spread of disease". This may mean that:
google flu trends (Score:1)
Google has been forecasting flu through search data for a while.
http://www.google.org/flutrends/us/
It doesn't work perfectly though:
http://www.nature.com/news/when-google-got-flu-wrong-1.12413
Re: (Score:1)
yes, but google does not share its log files!
Google published a Nature paper out of it. AFAIK the data (Google queries) on which that research is based is kept secret. Therefore it is not possible to validate what they did. Science cannot be based on secret data, and the journal Nature in this case published an advertisement ("how awesome is Google"), not a scientific paper ("these are the data, this is our method, check out our conclusions").
As the authors here say, approaches from closed sources like