
Wikipedia News

Competition Produces Vandalism Detection For Wikis 62

Posted by timothy
from the citation-needed dept.
marpot writes "Recently, the 1st International Competition on Wikipedia Vandalism Detection (PDF) finished: 9 groups (5 from the USA, 1 affiliated with Google) tried their best to detect all vandalism cases in a large-scale evaluation corpus. The winning approach (PDF) detects 20% of all vandalism cases without misclassifying any regular edits; moreover, it can be adjusted to detect 95% of the vandalism edits while misclassifying only 30% of all regular edits. Thus, by applying both settings, manual double-checking would be required on only 34% of all edits. It is not yet known whether the rule-based bots on Wikipedia can compete with this machine-learning-based strategy. Anyway, there is still a lot of potential for improvement, since the top 2 detectors use entirely different detection paradigms: the first analyzes an edit's content, whereas the second (PDF) analyzes an edit's context using WikiTrust."
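The "34% of all edits" figure follows from simple arithmetic once you assume a vandalism rate for the corpus. As a back-of-the-envelope sketch (the summary does not state the corpus's vandalism rate; roughly 7% is assumed here for illustration):

```python
# Back-of-the-envelope check of the "34% of all edits" figure.
# Assumption (not given in the summary): vandalism makes up roughly
# 7% of all edits in the evaluation corpus.
VANDALISM_RATE = 0.07

# High-precision setting: 20% of vandalism caught, no regular edits flagged,
# so these edits are confirmed as vandalism without human review.
auto_confirmed = 0.20 * VANDALISM_RATE

# High-recall setting: 95% of vandalism flagged, plus 30% of regular
# edits misclassified as vandalism.
flagged = 0.95 * VANDALISM_RATE + 0.30 * (1 - VANDALISM_RATE)

# Manual double-checking is needed for edits flagged by the high-recall
# setting that were not already confirmed by the high-precision setting.
needs_review = flagged - auto_confirmed

print(f"flagged: {flagged:.3f} of all edits")
print(f"needs manual review: {needs_review:.3f} of all edits")
```

With a 7% vandalism rate, the high-recall setting flags about 34.6% of all edits, and about 33.1% still need a human look, which is in the ballpark of the summary's 34% claim; the exact figure depends on the corpus's true vandalism rate.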

Comments Filter:
  • by Dan East (318230) on Sunday September 26, 2010 @11:19AM (#33703402) Homepage Journal

    If the algorithm can detect 20% with perfect precision, then that must constitute extremely low-hanging fruit. That type of vandalism is merely an annoyance: it is so obvious that the end user readily recognizes it as such and can skip over it or revert the edit.

    The real issue is disinformation, which is vastly more subtle. The only defense is fact-checking or seeking out references. If the algorithm were capable of recognizing that kind of vandalism, the developers should have the software writing all the articles in the first place, because it would have to be pretty spectacular to manage that.

  • by Anonymous Coward on Sunday September 26, 2010 @11:20AM (#33703410)

    What about the people who "own" a page with the assistance of powerful insiders and revert any changes to their "pet" pages, even spelling fixes or simple corrections to bad information?

    Will edits by *those* insiders, who are ruining Wikipedia for the rest of us, be flagged by the algorithm as vandalism?

  • top 2 (Score:3, Insightful)

    by trb (8509) on Sunday September 26, 2010 @12:08PM (#33703672)

    Anyway, there is still a lot of potential for improvement, since the top 2 detectors use entirely different detection paradigms

    This implies that the lower-scoring detectors are less valuable as sources of improvement. That's not true, and it wasn't stated in the paper's "Conclusions" section. If the lowest-scoring detector finds 5% of the bad data, and it's a different slice from what the other detectors find, then that's quite valuable.

  • by Grimbleton (1034446) on Sunday September 26, 2010 @12:23PM (#33703756)

    They already compete to be the first to revert edits they disagree with.

  • by bunratty (545641) on Sunday September 26, 2010 @12:29PM (#33703782)
    Care to show us even one article where 99% of good edits are reverted? Remember, that would mean that over 99% of all edits are reverted.
  • Hah, bout time. (Score:3, Insightful)

    by OnePumpChump (1560417) on Sunday September 26, 2010 @02:15PM (#33704384)
    4chan and Somethingawful have been having Wikipedia vandalizing competitions for years. (Usually, whoever's edit or fake article stays the longest wins.)
