Algorithm-Generated Articles Won't Kill the Journalism Star
theodp writes: The AP's announcement that software will write the majority of its earnings reports, argues The Atlantic's Joe Pinsker, doesn't foretell the end of journalism — such reports hardly require humans anyway. Pinsker writes, "While, yes, it's true that algorithms can cram stories about vastly different subjects into the same uncanny monotone — they can cover Little League like Major League Baseball, and World of Warcraft raids like firefights in Iraq — they're really just another handy attempt at sifting through an onslaught of data. Automated Insights' success goes hand-in-hand with the rise of Big Data, and it makes sense that the company's algorithms currently do best when dealing in number-based topics like sports and stocks." So, any chance that Madden-like (video) generated play-by-play technology could one day be applied to live sporting events?
Earnings reports are in XML now. (Score:5, Interesting)
The SEC started requiring companies to file their earnings reports in the Extensible Business Reporting Language a few years ago. At first, it was only for big companies; now it's everybody. The SEC displays this info in a standard format online. Here are the latest earnings for DICE Holdings [sec.gov], Slashdot's parent. Here's the raw XML behind that data. [sec.gov] Turning that into verbiage isn't that hard.
I've been doing this for years at Downside.com, extracting the raw data from the human-readable text. This is now obsolete, but it's still running. Here's the same DICE financial statement as processed by Downside. [downside.com] That's Perl code that's been running for 15 years now. When it started, nobody was doing that. Now that everybody in finance has that data, it's probably time to retire Downside's old extraction engine.
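To give a rough idea of the XML-to-verbiage step described above, here is a minimal Python sketch. It is not the SEC's schema and not Downside's code: the XML is a simplified stand-in for a real XBRL filing (which uses namespaced elements and context references), and the `report` wrapper, the `PriorRevenues` element, the `describe` function, the company name, and every figure are made up for illustration.

```python
import xml.etree.ElementTree as ET

# Simplified stand-in for an XBRL filing. Real filings use namespaced
# elements with contextRef/unitRef attributes; the element names,
# company, and numbers below are hypothetical.
SAMPLE = """
<report company="Example Corp" period="Q2">
  <Revenues>1250000</Revenues>
  <PriorRevenues>1000000</PriorRevenues>
  <NetIncomeLoss>-40000</NetIncomeLoss>
</report>
"""


def describe(xml_text):
    """Turn the tagged figures into one sentence of earnings prose."""
    root = ET.fromstring(xml_text.strip())
    revenue = int(root.findtext("Revenues"))
    prior = int(root.findtext("PriorRevenues"))
    net = int(root.findtext("NetIncomeLoss"))

    # Percentage change in revenue against the prior period.
    change = (revenue - prior) / prior * 100
    direction = "up" if change >= 0 else "down"

    # Pick the wording for profit versus loss.
    result = "net income" if net >= 0 else "a net loss"

    return (
        f"{root.get('company')} reported {root.get('period')} revenue of "
        f"${revenue:,}, {direction} {abs(change):.1f}% from the prior period, "
        f"with {result} of ${abs(net):,}."
    )


print(describe(SAMPLE))
```

Once the numbers arrive already tagged, the "writing" is mostly template filling plus a few comparisons, which is presumably why earnings reports are an easy first target for automation.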
Comment removed (Score:3, Interesting)
Re:Journalism died a long time ago (Score:4, Interesting)
Indeed. If they automatize things, we will at least have consistent low quality...
Actually, I think the use of algorithms to write articles is great. I'm currently working on an anti-article algorithm that extracts just the facts from algorithm-generated articles and turns them into tweets. So instead of having to plough through a long slew of pseudo-intelligent analysis, all you get are the essential sound bytes: "Cat explodes; canary charged by police", that sort of thing. Pretty soon it'll be bigger than Facebook.