Researchers Develop an Internet Truth Machine


Posted by timothy on Sunday December 16, 2012 @09:25AM from the how-to-create-better-liars dept.

Hugh Pickens writes "Will Oremus writes that when something momentous is unfolding (the Arab Spring, Hurricane Sandy, Friday's horrific elementary school shooting in Connecticut), Twitter is the world's fastest, most comprehensive, and least reliable source of breaking news. In ongoing events like natural disasters, Twitter misinformation can be deadly. During Sandy, for instance, some tweets helped emergency responders figure out where to direct resources. Others provoked needless panic, such as one claiming that the Coney Island hospital was on fire, and a few were downright dangerous, such as the one claiming that people should stop using 911 because the lines were jammed. Now a research team at Yahoo has analyzed tweets from Chile's 2010 earthquake and looked at the potential of machine-learning algorithms to automatically assess the credibility of information tweeted during a disaster. The researchers' machine-learning classifier uses 16 features to assess the credibility of newsworthy tweets, and the team identified the features that mark information as more credible: credible tweets tend to be longer and include URLs; credible tweeters have higher follower counts; credible tweets are negative rather than positive in tone; and credible tweets do not include question marks, exclamation marks, or first- or third-person pronouns. Researchers at India's Institute of Information Technology also found that credible tweets are less likely to contain swear words (PDF) and significantly more likely to contain frowny emoticons than smiley faces. The bottom line is that an algorithm has the potential to work much faster than a human, and as it improves, it could evolve into an invaluable 'first opinion' for flagging news items on Twitter that might not be true, writes Oremus. 'Even that wouldn't fully prevent Twitter lies from spreading or misleading people. But it might at least make their purveyors a little less comfortable and a little less smug.'"
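As a rough illustration of the surface features the summary lists, here is a minimal Python sketch that turns one tweet into a feature dictionary. It is not the researchers' classifier: the summary names only some of the 16 features, and the pronoun lists, the emoticon checks, and the function name `extract_features` are all assumptions made up for this example.

```python
import re

# Hypothetical word lists for illustration only; the lexicons used in the
# studies are not given in the summary.
FIRST_PERSON = {"i", "me", "my", "mine", "we", "us", "our", "ours"}
THIRD_PERSON = {"he", "him", "his", "she", "her", "hers",
                "they", "them", "their", "theirs"}
URL_PATTERN = re.compile(r"https?://\S+")


def extract_features(text, follower_count):
    """Return a dict of surface features for a single tweet."""
    # Strip URLs first so a '?' inside a query string is not counted
    # as a question mark in the message itself.
    text_without_urls = URL_PATTERN.sub(" ", text)
    words = re.findall(r"[a-z']+", text_without_urls.lower())
    return {
        "length": len(text),
        "has_url": bool(URL_PATTERN.search(text)),
        "follower_count": follower_count,
        "has_question_mark": "?" in text_without_urls,
        "has_exclamation_mark": "!" in text_without_urls,
        "has_first_person": any(w in FIRST_PERSON for w in words),
        "has_third_person": any(w in THIRD_PERSON for w in words),
        "has_frowny": any(e in text_without_urls for e in (":(", ":-(")),
        "has_smiley": any(e in text_without_urls for e in (":)", ":-)")),
    }
```

A dictionary like this would be the input to whatever trained model does the actual scoring; the sketch deliberately stops short of assigning weights, since the summary does not report any.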


This discussion has been archived. No new comments can be posted.


Cultural bias? (Score:5, Insightful)

by Anonymous Coward writes: on Sunday December 16, 2012 @09:30AM (#42306737)

This is really interesting research, but it's also based on one event in one country.
Conclusions based on what may be language or cultural norms (such as "did you phrase in the positive or the negative") might not translate to other locales well (e.g. Hurricane Sandy in the US).
But, then, that's what's great about science. Testable predictions we can apply to data.

- - Re: (Score:2)
    
    by Attila Dimedici ( 1036002 ) writes:
    
    See, I am one to say, "The world is going to end." The thing is, people always get disappointed when they ask me "When?" because my answer is, "I don't know, and neither does anyone else."
- - Re: (Score:3)
    
    by thereitis ( 2355426 ) writes:
    
    Where are you getting 16 years from? https://en.wikipedia.org/wiki/File:Global_Temperature_Anomaly_1880-2010_(Fig.A).gif [wikipedia.org] The trend in temperature is clearly 'up' and has been for many years.
    - Re:Cultural bias? (Score:5, Insightful)
      
      by jfengel ( 409917 ) writes: on Sunday December 16, 2012 @04:46PM (#42308519) Homepage Journal
      
      It's a popular denier meme: 1998 was a very hot year and if you start your data series there you can show an overall decline.
      Viewed on any other scale, this artifact goes away. But it doesn't matter how many times you tell deniers about that; they know what story they want to tell and will continue to cherry pick the data to tell it.
      
      - Re: (Score:1)
        
        by Belial6 ( 794905 ) writes:
        
        Why would you expect anyone to take what you have to say seriously when you jump straight to Goodwining the discussion?
        
        Re: (Score:1)
        
        by steviesteveo12 ( 2755637 ) writes:
        
        I'm not sure we mean the same things by "Goodwining"
        
        Re: (Score:2)
        
        by Belial6 ( 794905 ) writes:
        
        Tablet auto correct problem. It should have said "Godwin".
        
        Re: (Score:1)
        
        by steviesteveo12 ( 2755637 ) writes:
        
        Even then, I didn't realise he made a Nazi reference.
        
        Re: (Score:2)
        
        by Belial6 ( 794905 ) writes:
        
        That would be the 'Denier' label. It is a reference to Holocaust 'Deniers'.
        
        Re: (Score:1)
        
        by steviesteveo12 ( 2755637 ) writes:
        
        Oh no, that's reference to climate change 'deniers', in that they deny climate change.
        "It's a popular [climate change] denier meme: 1998 was a very hot year and if you start your data series there you can show an overall decline."
        There certainly is such a thing as a Holocaust Denier (although even then I personally wouldn't have associated the Nazis with Holocaust *denial* as such) but they deny a separate thing.
        
        Re: (Score:2)
        
        by Belial6 ( 794905 ) writes:
        
        It is a clear attempt to put those that question climate change into the same category as those who would deny the Holocaust.
        
        Re: (Score:1)
        
        by steviesteveo12 ( 2755637 ) writes:
        
        Well, if you really think so. It seems a bit of a stretch to me. A denier is just someone who denies something. The first stage of grief is denial -- but not in the sense that they're denying the Holocaust.
        
        Re: (Score:2)
        
        by Belial6 ( 794905 ) writes:
        
        No one uses the label of "Denier" for someone that is going through grief. Your argument is like the child that stands around saying "Bitch", "Ass", "Cock", and when their parents reprimand them, they claim they are just listing animals. Words have meaning that is based on context. When used as an ad hominem, as it was done in the previous post, it is a way to discredit anything said by the person by declaring him evil. The specific language that is used is clearly an attempt to parallel climate c
    - Re: (Score:2)
      
      by sublayer ( 2465650 ) writes:
      
      Where are you getting 16 years from? https://en.wikipedia.org/wiki/File:Global_Temperature_Anomaly_1880-2010_(Fig.A).gif [wikipedia.org] The trend in temperature is clearly 'up' and has been for many years.
      Obviously that's due to the decreasing number of pirates [wikipedia.org].
- Re: (Score:2)
  
  by __aaltlg1547 ( 2541114 ) writes:
  
  You could use the same algorithm to derive credibility indicators for any language and region and use multiple verified events and facts to train the system.
  By the way, what about no pronouns vs. 1st-person and 3rd-person? What about no emoticons?
  What about links to known-unreliable sources as opposed to nominally credible sources?
  - Re: (Score:3)
    
    by ArsenneLupin ( 766289 ) writes:
    
    You could use the same algorithm to derive credibility indicators for any language and region and use multiple verified events and facts to train the system.
    But what if its results leak, and bird song adapts to meet expectation, but without actually being more reliable?
    - Re: (Score:2)
      
      by __aaltlg1547 ( 2541114 ) writes:
      
      You could use the same algorithm to derive credibility indicators for any language and region and use multiple verified events and facts to train the system.
      But what if its results leak, and bird song adapts to meet expectation, but without actually being more reliable?
      Arms race.
I wonder... (Score:2, Interesting)

by Anonymous Coward writes:

How effective would this be on real media? I bet it'd put those bastards in their place! :)
- - Re:I wonder... (Score:5, Insightful)
    
    by __aaltlg1547 ( 2541114 ) writes: on Sunday December 16, 2012 @12:22PM (#42307317)
    
    I correctly judged the credibility of the "Iraq has WMDs" based mostly on the tone of original news reports.
    We had information from UN weapons inspectors stating they were able to go wherever they wanted and examine whatever they wanted and so far had not found any evidence of a currently-active program or any stockpiles of usable weapons. The tone of these reports was direct and devoid of pleas to emotion.
    The White House labeled these reports "not helpful" and directed the public's attention to historic atrocities and put forward innuendo regarding alleged Iraqi support for terrorists. It certainly looked like fearmongering. The very fact that the WH was labeling actual current information from Iraq as "not helpful" was to me the most damaging to their case. If they were interested in the truth, I reasoned, current information from international inspectors could only be helpful.
    
- Re: (Score:1)
  
  by maxwell demon ( 590494 ) writes:
  
  You've used a question mark, an exclamation mark and a positive smiley. Thus you lost any credibility, according to the cited criteria.
  - Re: (Score:2)
    
    by Ol Biscuitbarrel ( 1859702 ) writes:
    
    Unfortunately you're - we're - correct. :-( At least he didn't use profanity (PDF) [iiitd.edu.in].
Rating individual tweets, accurate? (Score:5, Insightful)

by JaredOfEuropa ( 526365 ) writes: on Sunday December 16, 2012 @09:35AM (#42306759) Journal

So it provides a first opinion on first posts, sort of. Neat, but I do wonder how accurate this is going to be to vet individual tweets. Twitter trolls may get wise to this and game the system to get their stuff past this filter. A bit like phishers learning how to spell. In the end, the best check is still independent verification, for example by other people tweeting the same thing (not just retweeting of course). If this system could automatically group and cross-verify tweets from multiple sources on the same subject, that would be a step in the right direction.
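The cross-verification idea above can be sketched in a few lines of Python. This assumes each tweet has already been assigned a `topic` label by some earlier grouping step, which is the hard part and is not shown; the field names `topic`, `author`, and `is_retweet`, and the threshold of three sources, are hypothetical choices for this example.

```python
from collections import defaultdict


def independent_sources(tweets):
    """Map each topic to the set of distinct authors who posted about it
    in an original tweet (retweets are skipped)."""
    sources = defaultdict(set)
    for tweet in tweets:
        if tweet["is_retweet"]:
            continue
        sources[tweet["topic"]].add(tweet["author"])
    return dict(sources)


def corroborated_topics(tweets, min_sources=3):
    """Return the topics reported by at least `min_sources` distinct authors."""
    return {
        topic
        for topic, authors in independent_sources(tweets).items()
        if len(authors) >= min_sources
    }
```

Counting distinct authors rather than tweets means one account repeating itself, or a pile of retweets, does not count as corroboration. It would still be fooled by sock-puppet accounts.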

- Re: (Score:2)
  
  by Hentes ( 2461350 ) writes:
  
  The best check is the site of an actual seismologist. Tweets shouldn't be trusted in emergency scenarios.
Researchers Develop an Internet Truth Machine (Score:3)

by omar.sahal ( 687649 ) writes: on Sunday December 16, 2012 @09:39AM (#42306771) Homepage Journal

Couldn't some enterprising douche programmer use similar programs to write better misleading tweets?

- Re: (Score:2)
  
  by account_deleted ( 4530225 ) writes:
  
  Comment removed based on user account deletion
- Re: (Score:2, Insightful)
  
  by artor3 ( 1344997 ) writes:
  
  Sure, but most people tweeting false info in a disaster are just stupid kids (or man-children) who think it's funny. They're probably not going to put lots of effort into it, because then it wouldn't be fun.
- Re: (Score:2)
  
  by SeaFox ( 739806 ) writes:
  
  Of course. We have SEO programmers.
  Coming up next, TTO -- Twitter Trustworthiness Optimizers.
  We start with lots of sock-puppet follower accounts, add a pessimistic spin and frowny faces. Also use links that will probably lead to astroturfing sources, and finally give the tweet a healthy copy-edit before it's posted to put it in a first or second-person perspective and make it a declarative, expletive-free message.
Chile's Earthquake (Score:5, Interesting)

by thejynxed ( 831517 ) writes: on Sunday December 16, 2012 @09:40AM (#42306773)

It's interesting to note that a seismology student at a university in Chile finally had enough of the false earthquake information going around on Twitter and elsewhere that he wired a big batch of seismographs to post their results directly via Twitter. The last I knew, they had over 1 million followers, and this particular student has been getting big thank yous from residents of the country.

Reliable (Score:5, Funny)

by Anonymous Coward writes: on Sunday December 16, 2012 @09:44AM (#42306781)

Twitter is the world's fastest, most comprehensive, and least reliable source of breaking news

Twitter has dethroned Fox News?!?

- Re: (Score:2)
  
  by Attila Dimedici ( 1036002 ) writes:
  
  It went beyond that. It is even less reliable than MSNBC.
- Re: (Score:1)
  
  by Anonymous Coward writes:
  
  With Fox News, you can reliably conclude that the opposite of what they say is true.
- Re: (Score:2)
  
  by RudyHartmann ( 1032120 ) writes:
  
  No political bias there. Uh huh.
- - Re: (Score:3)
    
    by Artifakt ( 700173 ) writes:
    
    Fox news is the only TV news that actually went into court petitioning for a verdict that it was OK for them to lie, that they didn't lose the first amendment derived right to keep sources confidential just because they were using those sources to deliberately lie. They got that verdict. As part of that case, Fox news is the only TV news that has admitted for the record, in a court of law that they out and out lied. Maybe that's why more people think Fox lies. You are technically correct because you used t
- Re: (Score:3)
  
  by cheekyjohnson ( 1873388 ) writes:
  
  I trust you, Anonymous Coward.
- Re:Truth? Whose? (Score:5, Insightful)
  
  by M. Baranczak ( 726671 ) writes: on Sunday December 16, 2012 @10:38AM (#42306913)
  
  Reality is the stuff that doesn't go away when you stop believing it.
  Don't be a pedantic asshole. We can't determine the absolute truth, but we can get a close enough approximation.
  
- Re: (Score:2)
  
  by maxwell demon ( 590494 ) writes:
  
  1. Reality is relative, as Einstein showed us.
  Wrong. Einstein's theory of relativity doesn't say that reality is relative. Indeed it is very absolute in that theory. What is relative is the way we slice it into space and time.
this is pretty stupid (Score:1)

by Anonymous Coward writes:

let me FTFY:
"... the results of Twitter misinformation can be potentially deadly... team at Yahoo has analyzed tweets... to automatically assess the credibility of information... A machine-learning classifier developed by the researchers uses 16 features to assess the credibility... : credible tweets tend to be longer and include URLs; credible tweeters have higher follower counts; credible tweets are negative rather than positive in tone; and credible tweets do not include question marks, exclamation marks
This is crap (Score:3, Interesting)

by Anonymous Coward writes: on Sunday December 16, 2012 @10:20AM (#42306859)

One of the criteria in their algorithm seems to be that credible tweets were
... significantly more likely to contain frowny emoticons than smiley faces.
They were evaluating tweets about a disaster; not a lot of smiley faces there.
The algorithm seems to have a bias toward bad news. So, if my buddy tweets that a rare Belgian beer will be available at the local liquor store, the algorithm will decide that it isn't credible because of the smiley face.
We just had the above case. Beer that you usually have to cross the Atlantic to get became available for about 30 minutes locally. Some of us lined up starting at 3:00 AM. I would have been really ticked off if some algorithm had made me miss the news.

[:~P (Score:2)

by wrencherd ( 865833 ) writes:

Who stops to type emoticons in the middle of a natural disaster (including switching to the alternate keyboard to get those characters)?
- Re: (Score:1)
  
  by froth-bite ( 2777385 ) writes:
  
  Who stops to type emoticons in the middle of a natural disaster (including switching to the alternate keyboard to get those characters)?
  the same people who sing in the rain?
  - Re: (Score:2)
    
    by Ol Biscuitbarrel ( 1859702 ) writes:
    
    Corpses everywhere. xD The stench of death pervades my very being. :)
- Re: (Score:3)
  
  by grcumb ( 781340 ) writes:
  
  Who stops to type emoticons in the middle of a natural disaster (including switching to the alternate keyboard to get those characters)?
  It happens. When the Rabaul Queen [wikipedia.org] capsized[*] in heavy seas, killing an estimated 321 people, there were dozens of tweets and facebook posts from people on board. They used emoticons because it's a lot easier to write :-( than it is to write 'I'm really frightened right now.' Let me tell you, when I was assigned to write about the disaster, it was very, very difficult to read those posts and remain unmoved.
  Moral: Don't make assumptions about people's state of mind unless you have some insight into what they
  - Re: (Score:2)
    
    by wrencherd ( 865833 ) writes:
    
    So you're saying that you believe that people who are facing their last moments on Earth, if given wifi/cell access during that time would/should NOT call their loved ones to say good-bye/"I love you", but should/would post on twitter instead, and include emoticons?
    
    ps -- Thanks for your de rigueur introduction of victimhood into the discussion. :-O
2 Separate Issues: Evidence vs. Headlines (Score:3)

by retroworks ( 652802 ) writes: on Sunday December 16, 2012 @10:24AM (#42306869) Homepage Journal

There are two topics here. One is the use of potentially valuable information by, say, emergency responders (leads, evidence, etc.). The program could be useful there. The second (e.g. "don't use 911") is "a headline", i.e. it is aimed at spreading news (or troll farts) as media to the social public. These are definitely two completely separate problems to solve. The second problem is best solved by evolution, as people who get their "news" off of social media become even stupider than they were to begin with and die off.

Twits (Score:2, Funny)

by blagooly ( 897225 ) writes:

It is Twitter, not Tweeter. Therefore Twits. Not Tweets. Twits.
Gaming Reliability/Credibility Assessment (Score:5, Interesting)

by girlinatrainingbra ( 2738457 ) writes: on Sunday December 16, 2012 @11:02AM (#42307009)

Of course, in just the same way that spammers can game Bayesian spam filters or rule-matching pattern filters by knowing what the rules are, a known set of rules for assessing the credibility of a tweet allows someone to tweak their tweets in order to be assessed as having high credibility:
1 -- max out your tweet length
2 -- include a URL [doesn't say whether to use a link shrtnr ;>(]
3 -- use a Twitter account with a high number of followers
4 -- use a negative tone
5 -- no question marks or exclamation points
6 -- use 2nd person (same as don't use 1st or 3rd person)
7 -- don't use swear words
8 -- use a sad emoticon
.
Example to maximize this:
a - break into / hack a high follower account (e.g. justinbieber) and tweet:
cat > finaltweet
You should know Mayan Calendar sez: world ending this week. Confirmed@ http://netcraft.calendar.mayan/ [netcraft.calendar.mayan] you go hug loved 1s now. :>( beebs
wc finaltweet
1 20 139 finaltweet

First iteration was:
gia@sodium$ cat > count2
You should know that Mayan Calendar says : world ending within week. Confirmed by http://netcraft.calendar.mayan/ [netcraft.calendar.mayan] , you should hug loved ones now. :>( -- beebs
gia@sodium$ wc count2
1 25 159 count2
Please note that the "[netcraft.calendar.mayan]" was inserted by /.'s /-code and is not part of the wc wordcount :>(
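For anyone without a shell handy, the same length check can be approximated in Python. This is only a sketch: `wc_counts` and `fits_in_tweet` are names invented here, and the 140-character limit is the one mentioned elsewhere in this discussion. The function reports newline count, whitespace-separated word count, and UTF-8 byte count, which is what plain `wc` prints for a file.

```python
def wc_counts(text):
    """Return (lines, words, bytes) for `text`, in the same order that
    plain `wc` prints them."""
    data = text.encode("utf-8")
    # wc counts newline characters, whitespace-separated words, and bytes.
    return data.count(b"\n"), len(data.split()), len(data)


def fits_in_tweet(text, limit=140):
    """True if `text`, ignoring a trailing newline, is within `limit` characters."""
    return len(text.rstrip("\n")) <= limit
```

For example, `wc_counts("hug loved ones now\n")` gives `(1, 4, 19)`. Note that the byte count includes the trailing newline that `cat > file` leaves behind, so the message itself is one character shorter than the number `wc` reports.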

- Re: (Score:1)
  
  by maxwell demon ( 590494 ) writes:
  
  Or what about this:
  The world is going to end. :-( It will be eaten by a black hole approaching Earth, reaching us on Dec 21. See http://gaotse.cx for details.
  (Link intentionally misspelled)
Looking for disaster, just look at car commercials (Score:3, Insightful)

by DarthVaderDave ( 978825 ) writes: on Sunday December 16, 2012 @11:46AM (#42307161)

If you weren't aware that Hurricane Sandy, Irene or whatever occurred, just tune into the local television and watch the car commercials. If I see one more Maxon, Salerno Dwayne, Rutherford Ford or Honda Hurricane Sandy stimulus event, I'm going to throw up. THAT is how you know something bad has happened.

- Re: (Score:2)
  
  by Samantha Wright ( 1324923 ) writes:
  
  Woah; check your latent variables before you wreck your latent variables. Correlation is not causation. How do you know that maleness and natural disasters aren't both caused by URLs on Twitter?
There's a basic problem here. (Score:3, Insightful)

by __aaltlg1547 ( 2541114 ) writes: on Sunday December 16, 2012 @11:58AM (#42307221)

The basic problem with any such approach is that tweets are individual opinions and you cannot arrive at the truth or falsehood of objective facts by analyzing a collection of he-saids and she-saids.
The hospital is either on fire or it is not on fire, regardless of what anybody says.

Sweet! 8-( (Score:1)

by Impy the Impiuos Imp ( 442658 ) writes:

Researchers...also found that credible tweets are less likely to contain swear words (PDF) and significantly more likely to contain frowny emoticons than smiley faces
.
Hey, that's pretty cool! :)
I mean, that's pretty cool! :(
- Re: (Score:1)
  
  by maxwell demon ( 590494 ) writes:
  
  I think for the first one you wanted to write: "Hey, that's fucking cool! :)"
  And for the second one, you don't want the exclamation mark. That was also claimed to be a sign of non-credibility.
Remember this when you ask for help. (Score:1)

by maxwell demon ( 590494 ) writes:

credible tweets do not include question marks, exclamation marks, or first- or third-person pronouns
Don't write "Help!" (exclamation mark) or "please help me" (first person pronoun).
Leads me to believe I hit the bull in the eye. (Score:2)

by 3seas ( 184403 ) writes:

How to know they are real valuable is when they are censored
My tweets were censored because they had URLs, even to Twitlonger.
So I resorted to these tweets instead @ ~140 characters limits (how long a tweet can be):
#taxes 1) The Declaration of independence recognized the peoples rights & duty to ... remove budgeting & accounting failed tasks from Gov't.
#taxes 2) for proper representation, given all the budgeting & accounting fails, &more, the people must direct where their taxes R 2 B used.
#
- Trying to be funny? (Score:2)
  
  by drainbramage ( 588291 ) writes:
  
  If you weren't AC I would moderate you 'Woosh'.
  Snopes != truth
Snopes for Twitter, then (Score:2)

by macraig ( 621737 ) writes:

Snopes needs to borrow this algorithm and create a subsection devoted to Twitter. It will highlight the unreliable posts and list which criteria made them fail the sniff test. Then, if there's time and resources, a human being might follow up the most significant ones and flesh out the stories.
Community Moderation (Score:1)

by MrHim ( 703476 ) writes:

How about combining learning algorithms with community moderation?
Negative in tone? (Score:2)

by chrismcb ( 983081 ) writes:

Credible tweets are negative? How is "Coney Island hospital is on fire" or "don't use 911, the lines are maxed out" positive in tone?
Reminds me of... (Score:1)

by beberly37 ( 1236914 ) writes:

This reminds me of the anecdote about a DOD learning "AI" program to identify tanks in images that worked perfectly in the lab. When they took it into the field it didn't. They taught it by showing it pictures of landscapes with and without tanks. As it turns out, all of the tank pictures also had clouds and all of the no tank pictures didn't have clouds. So the AI was working, doing exactly what it was taught, identifying clouds.
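A toy Python sketch of the same failure mode, using entirely made-up data: the "learner" just picks whichever single feature agrees with the label most often. The feature names `clouds` and `tracks` and both example sets are invented for illustration and have nothing to do with any real system.

```python
def agreement(examples, feature):
    """Count how many examples have `feature` equal to the `tank` label."""
    return sum(ex[feature] == ex["tank"] for ex in examples)


def best_single_feature(examples, feature_names):
    """Pick the feature that agrees with the label on the most examples."""
    return max(feature_names, key=lambda name: agreement(examples, name))


def accuracy(examples, feature):
    """Fraction of examples where predicting `feature` matches the label."""
    return agreement(examples, feature) / len(examples)


# Synthetic "lab" photos: every tank photo happens to have clouds.
lab = [
    {"clouds": True, "tracks": True, "tank": True},
    {"clouds": True, "tracks": True, "tank": True},
    {"clouds": False, "tracks": False, "tank": False},
    {"clouds": False, "tracks": True, "tank": False},
]

# Synthetic "field" photos: clouds no longer line up with tanks.
field = [
    {"clouds": True, "tracks": True, "tank": True},
    {"clouds": False, "tracks": True, "tank": True},
    {"clouds": True, "tracks": False, "tank": False},
    {"clouds": False, "tracks": False, "tank": False},
]

chosen = best_single_feature(lab, ["clouds", "tracks"])
```

With this data, `chosen` is `"clouds"`, which scores 1.0 on `lab` and 0.5 on `field`. The same worry applies to credibility features learned from a single disaster: a feature can be perfectly predictive in the training set for reasons that do not carry over.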
headline fix (Score:2)

by AdamWill ( 604569 ) writes:

Headline should read "Researchers Develop Tool For Twitter Trolls To Improve Plausibility Of Their Tweets"
