Replacing Sports Bloggers With an Algorithm
tesmar tips a report up at TechCrunch that begins "Here come the robo sports journalists. While people in the media biz worry about content mills like Demand Media and Associated Content spitting out endless SEO-targeted articles written by low-paid Internet writers, at least those articles are still written by humans. We may no longer need the humans, at least for data-driven stories. A startup in North Carolina, StatSheet, today is launching a remarkable network of 345 sports sites, one dedicated to each Division 1 college basketball team in the US. For instance, there is a site for the Michigan State Spartans, North Carolina Tar Heels, and Ohio Buckeyes. Every story on each site was written by a robot, or to put it more precisely, by StatSheet's content algorithms. 'The posts are completely auto-generated,' says founder Robbie Allen. 'The only human involvement is with creating the algorithms that generate the posts.'"
Re:Close, but still not practical (Score:3, Insightful)
It doesn't have to fool humans (unfortunately). It just has to fool Googlebot.
There's a T-shirt for that... (Score:3, Insightful)
Why am I suddenly reminded of this t-shirt? [thinkgeek.com] :)
Including the core cliché of sports reporting (Score:3, Insightful)
Re:Close, but still not practical (Score:5, Insightful)
The important thing here is that this isn't replacing deep, insightful thoughts and analysis, which still have to be done by a human. If you want a reasoned opinion that pulls together the statistics and external factors (e.g. a player's mind-set or personal life), and adds in some humor, then you're going to want a skilled human doing the writing. But if your interest is more along the lines of "Who won, by how much, and what were the main things that led to them winning (e.g. was it strong offense or good defense)?" then auto-generated content is fine. In fact, as with all aspects of automation, the point is to free up humans from doing the boring, silly tasks, so that they can concentrate on the more important tasks.
After reading some of the auto-generated articles (Michigan State Spartans [spartanball.com], North Carolina Tar Heels [carolinaupdate.com], and Ohio Buckeyes [buckeyesbeat.com]) I must say I'm quite impressed with how good the content is. Obviously it won't be winning any prizes, but I can't say that it's any worse than human-generated summaries of matches. It goes through the details, throwing in some contextual commentary (e.g. "the underdogs") obviously based on a nice database of stats. What's even better is that the articles also present some of the stats themselves, allowing the reader to skip the writeup and focus on the numbers/graphs if they prefer.
So, frankly, I see this as a good thing. It's a waste of human talent (even mechanical-turk caliber talent) to write a bunch of formulaic summaries when a computer can clearly do a decent job. This lets the humans focus on tasks that are more difficult to automate.
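To make "formulaic summaries" concrete, here is a minimal sketch of how a recap could be assembled from a box score with fixed templates. It is only a guess at the general approach, not StatSheet's actual algorithm: the GameResult fields, the margin thresholds, and the verb choices are all assumptions made up for illustration.

```python
from dataclasses import dataclass


@dataclass
class GameResult:
    # Hypothetical fields for illustration; not StatSheet's real schema.
    home: str
    away: str
    home_score: int
    away_score: int
    favorite: str  # team expected to win before tip-off


def recap(game: GameResult) -> str:
    """Build a one-sentence recap from a final score using fixed templates."""
    if game.home_score == game.away_score:
        raise ValueError("a finished basketball game cannot be tied")

    # Who won, and by how much.
    if game.home_score > game.away_score:
        winner, loser = game.home, game.away
        high, low = game.home_score, game.away_score
    else:
        winner, loser = game.away, game.home
        high, low = game.away_score, game.home_score
    margin = high - low

    # Pick a verb from the margin of victory (thresholds are arbitrary).
    if margin >= 20:
        verb = "routed"
    elif margin >= 10:
        verb = "beat"
    else:
        verb = "edged"

    # Contextual commentary keyed off the pre-game favorite.
    subject = winner if winner == game.favorite else f"the underdog {winner}"
    sentence = f"{subject} {verb} {loser} {high}-{low}."
    return sentence[0].upper() + sentence[1:]


if __name__ == "__main__":
    print(recap(GameResult("Michigan State", "Ohio", 78, 70, "Michigan State")))
    print(recap(GameResult("North Carolina", "Ohio", 60, 85, "North Carolina")))
```

A real system would presumably draw on far more stats (shooting percentages, streaks, rankings) and many more templates, but the shape is the same: numbers in, canned phrases chosen by simple rules, sentence out.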
Emotionless Facts (Score:2, Insightful)
Comment removed (Score:5, Insightful)
Re:Sentences don't lead into one another (Score:4, Insightful)
Re:fans (Score:4, Insightful)
That wouldn't get rid of them. Better to keep them occupied with a cheap diversion (which also keeps the athletes busy). Do you really want the jocks and their fans wandering around looking for something to do?
Re:Close, but still not practical (Score:3, Insightful)
If they are just formulas around numbers, just give us the numbers. No need for all the fluff around it.
You're a geek. So am I, but here's a secret I learnt: Lots of people are afraid of numbers. Much in the same way they are afraid of punks - they don't really think they will harm them, but they prefer to have them accompanied by words/police.
Re:tiny issue (Score:4, Insightful)
Perhaps you're confused about the outcome of Feist Publications, Inc., v. Rural Telephone Service Co [wikipedia.org]. Phone books and other collections of facts may not be copyrighted because they lack creativity. Hence the question:
While sribe focuses on the creative element, one must also ask who the copyright would go to. The Constitution grants Congress the power
To promote the Progress of Science and useful Arts, by securing for limited Times to Authors and Inventors the exclusive Right to their respective Writings and Discoveries.
To my understanding, this has always been interpreted to mean that authors have rights to their writings and inventors have rights to their discoveries.
Re:Close, but still not practical (Score:2, Insightful)
here's a secret I learnt: Lots of people are afraid of numbers.
Generally speaking, you're right. That said, I'm an engineer, and I've never met a geek more into numbers than the sports fans I've met. They might not understand the implications of the stats they're spewing out, but that doesn't mean they don't have them memorized.
The site is going after the right demographics. Sports fans are hungry for numbers like these.