Replacing Sports Bloggers With an Algorithm
tesmar tips a report up at TechCrunch that begins "Here come the robo sports journalists. While people in the media biz worry about content mills like Demand Media and Associated Content spitting out endless SEO-targeted articles written by low-paid Internet writers, at least those articles are still written by humans. We may no longer need the humans, at least for data-driven stories. A startup in North Carolina, StatSheet, today is launching a remarkable network of 345 sports sites, one dedicated to each Division 1 college basketball team in the US. For instance, there is a site for the Michigan State Spartans, North Carolina Tar Heels, and Ohio Buckeyes. Every story on each site was written by a robot, or to put it more precisely, by StatSheet's content algorithms. 'The posts are completely auto-generated,' says founder Robbie Allen. 'The only human involvement is with creating the algorithms that generate the posts.'"
Close, but still not pratical (Score:4, Interesting)
Re:Close, but still not pratical (Score:5, Funny)
They aren't. They're using it as a replacement for the output of sportswriters.
Re:Close, but still not pratical (Score:5, Insightful)
The important thing here is that this isn't replacing deep, insightful thoughts and analysis, which still has to be done by a human. If you want a reasoned opinion that pulls together the statistics, external factors (e.g. a player's mind-set or personal life), and adds in some humor, then you're going to want a skilled human doing the writing. But if your interest is more along the lines of "Who won, by how much, and what were the main things that led to them winning (e.g. was it strong offense or good defense)?" then auto-generated content is fine. In fact, as with all aspects of automation, the point is to free up humans from doing the boring, silly tasks, so that they can concentrate on the more important tasks.
After reading some of the auto-generated articles (Michigan State Spartans [spartanball.com], North Carolina Tar Heels [carolinaupdate.com], and Ohio Buckeyes [buckeyesbeat.com]) I must say I'm quite impressed with how good the content is. Obviously it won't be winning any prizes, but I can't say that it's any worse than human-generated summaries of matches. It goes through the details, throwing in some contextual commentary (e.g. "the underdogs") obviously based on a nice database of stats. What's even better is that the articles also present some of the stats themselves, allowing the reader to skip the writeup and focus on the numbers/graphs if they prefer.
So, frankly, I see this as a good thing. It's a waste of human talent (even mechanical-turk caliber talent) to write a bunch of formulaic summaries when a computer can clearly do a decent job. This lets the humans focus on tasks that are more difficult to automate.
Comment removed (Score:5, Insightful)
Re: (Score:3, Informative)
The raw numbers are useful, but many people would like to read a quick summary of the highlights of the statistics, rather than having to read through them themselves.
Somebody who is not well acquainted with the specific stats may have trouble telling what is unusual, or combining them to reach a conclusion. Even those familiar with the statistics would often find it quicker to read the computer-generated summary than to skim the numbers to determine whether they are worth spending more time on.
But
I checked the site, and... (Score:2)
Re: (Score:3, Insightful)
If they are just formulas around numbers, just give us the numbers. No need for all the fluff around it.
You're a geek. So am I, but here's a secret I learnt: Lots of people are afraid of numbers. Much in the same way they are afraid of punks - they don't really think they will harm them, but they prefer to have them accompanied by words/police.
Re: (Score:2, Insightful)
here's a secret I learnt: Lots of people are afraid of numbers.
Generally speaking, you're right. That said, I'm an engineer, and I've never met a geek more into numbers than the sports fans I've met. They might not understand the implications of the stats they're spewing out, but that doesn't mean they don't have them memorized.
The site is going after the right demographics. Sports fans are hungry for numbers like these.
Re: (Score:2)
I was just thinking: yeah the writing is dry and disjointed, much like my scientific articles. I wouldn't mind a robo writing assistant to help me put out journal articles. Much of it is, in fact, dry and formulaic.
Re: (Score:2)
I was just thinking: yeah the writing is dry and disjointed, much like my scientific articles. I wouldn't mind a robo writing assistant to help me put out journal articles. Much of it is, in fact, dry and formulaic.
That's because facts are generally dry and formulaic.
Related, I've heard that the "TV Sportscaster" is the single most truthful person on the evening news. He outputs facts: scores, stats, etc. When he shows highlights, it's footage from an event that actually happened, and usually includes appropriate context such as "and the Vikings went on to lose it, 24-10."
Meteorologists try to show us the future, and while they have a measurable accuracy rating, what they say is certainly not fact.
News reporters are
Re: (Score:2)
Re: (Score:1)
Re:Close, but still not pratical [sic] (Score:2)
Unfortunately, there is a large population of humans who have no skills beyond what are characterized here as boring, silly tasks, nor much inclination to step up and learn how to be more productive. And regardless of whether they deserve to be employed or not, making them unemployed doesn't help the population as a whole.
So my applause goes more often to technology that helps people work smarter, not as often to that which o
Re: (Score:1)
Re: (Score:2)
It's a flaw that came to light during the Industrial Revolution, and in fact hit so hard it spawned Communism as a response. It is also a flaw that's inherent to Capitalism and can't be fixed under it.
Of course, as soon as automation catches up with all tasks - that is, as soon as Artificial Intelligence catches up with Human Intelligence, at least as far as practical m
Re: (Score:1)
I'd very much like it if we could keep the advantages of technology
Re: (Score:2)
Unfortunately, there is a large population of humans who have no skills beyond what are characterized here as boring, silly tasks, nor much inclination to step up and learn how to be more productive.
And strangely, this description coincides with the characteristics of the sports fans I know. They will probably be content with whatever they're fed. No, I am not a fan of sports. Mod me to oblivion.
Re: (Score:2)
Of course, this raises a question: Why don't I simply get the numbers from Wikipedia, RSS feeds or whatever and run the algorithm myself? Why bother with a website, which is bound to be sub-optimal for my uses due to the demand
Re: (Score:1)
Re: (Score:2)
It looks like the old sysadmin insult, "Go away - or I will replace you with a very short shell script" has actually happened to the sports writers...
Re:Close, but still not pratical (Score:4, Funny)
Re:Close, but still not pratical (Score:5, Funny)
Hard to read? Disjointed? Mentally uncomfortable? Sounds like it could fit right in here on /. ;-)
A clever attempt, RoboWrongSizeGlass, but not clever enough! Trying to point the finger at humans while sneaking in another templated contribution! Haha! Your plans will never work! :P
Re: (Score:1)
Actually Slashdot could use a submission bot: Most submissions just copy the first paragraph of the article anyway. That should not be hard to automate. For the title, just use the page title. The main problem is to detect content which would likely be a successful submission. Maybe a Bayesian filter could be used, trained on past accepted and rejected submissions. Connect it to a spider constantly searching the web, and you can automatically fill Slashdot.
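To make the Bayesian filter idea concrete, here is a minimal sketch of a word-based naive Bayes classifier trained on past accepted and rejected submissions. The class name `SubmissionFilter`, the two labels, and the whitespace tokenizer are all assumptions made for illustration; nothing here reflects how Slashdot actually handles submissions.

```python
import math
from collections import Counter


class SubmissionFilter:
    """Tiny naive Bayes text classifier with add-one (Laplace) smoothing."""

    LABELS = ("accepted", "rejected")

    def __init__(self):
        self.word_counts = {label: Counter() for label in self.LABELS}
        self.doc_counts = Counter()

    def train(self, text, label):
        # Naive tokenizer: lowercase and split on whitespace.
        self.word_counts[label].update(text.lower().split())
        self.doc_counts[label] += 1

    def score(self, text, label):
        # Assumes at least one training example exists for every label.
        counts = self.word_counts[label]
        total = sum(counts.values())
        vocab = len(set().union(*self.word_counts.values()))
        # Log prior plus the sum of smoothed log likelihoods.
        logp = math.log(self.doc_counts[label] / sum(self.doc_counts.values()))
        for word in text.lower().split():
            logp += math.log((counts[word] + 1) / (total + vocab))
        return logp

    def predict(self, text):
        return max(self.LABELS, key=lambda label: self.score(text, label))
```

A spider would call `predict` on the first paragraph of each page it finds and queue only the ones labeled "accepted". A real filter would need far more training data and a better tokenizer than this.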
Re: (Score:3, Insightful)
It doesn't have to fool humans (unfortunately). It just has to fool Googlebot
Re: (Score:2)
So? If it fools Googlebot while still providing significant value to humans, then mission fucking accomplished. [xkcd.com]
Re:Close, but still not pratical (Score:4, Interesting)
I can't imagine anyone using it as an actual replacement for even semi well-written content.
It's a bit uncomfortable to read in spots, but way above the quality of most blogs and nothing you can actually point out as an error. So if they manage not to swamp the sites in ads, and provide good statistics as well, I can't see why they couldn't get a rather large take of the advertising action - large in proportion to the manhours invested in writing articles, that is. It could be very lucrative and if it is - well, say goodbye to a lot of reporters who aren't the primetime writers but just pad out the papers: they're going to be automated away or at best, write the templates for the robowriters.
Re: (Score:3, Interesting)
Lucrative is right. They seem to have garnered 1.3 million dollars in funding for their efforts.
Re: (Score:2)
It's a bit uncomfortable to read in spots, but way above the quality of most blogs and nothing you can actually point out as an error.
The problem isn't the quality of the writing, it's the content. It's extremely bland. The sports writers, both reporters and blog writers, are usually fans of the sports, and they describe the highlights of the game. They talk about how players performed, about controversial calls by the refs, last second shots, etc.
Re: (Score:1)
Re: (Score:1)
And a robot will never be able to write an article like this one: http://bluesundaycolts.blogspot.com/2010/11/what-peyton-manning-is-doing-and-how-to.html [blogspot.com]
Point missed. (Score:1)
It's not written for you, so they simply do not care. It's written for Google and other search engines to help push whatever their advertisers are selling.
The age of internet search is dead. We need something more intuitive and something decidedly human.
This post.... (Score:3, Funny)
This post was written by a robot.
Re:This post.... (Score:5, Funny)
What makes you say this post was written by a robot?
Re: (Score:2)
What makes you say this post was written by a robot?
My inability to avoid making puns.
Re: (Score:3, Funny)
Do you enjoy your inability to avoid making puns?
Re:This post.... (Score:5, Funny)
Re:This post.... (Score:4, Funny)
Please tell me more about this post was written by a robot.
If I am correct, you are having trouble believing about this post was written by a robot? I do apologize for the inconvenience. Is there anything else I can do for you today? Thank you for calling Dell technical support.
Re: (Score:2)
Would it help you to know what makes me say this post was written by a robot?
Re: (Score:2)
How do you feel about what makes me say this post was written by a robot?
Not bad... (Score:5, Interesting)
I've read a couple of articles and they are no worse than the SEO-targeted content written by freelancers on oDesk for $2/hr (with English as a second or third language).
The "algorithm" seems quite elaborate, taking the odds of winning into account as well. It produces lines such as "The [team] was not supposed to win this game, but made it happen" and, from combined player statistics, "Coming off a poorly put together team last year, this year, the [team] looks to have greater talent."
It reminds me of how someone in junior high would write. Impressive. Similar to MIT's paper generator: http://web.mit.edu/newsoffice/2005/paper.html [mit.edu]
PHP + MySQL + Mad Libs for Sports.
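The "Mad Libs" description can be sketched in a few lines. This is only a guess at the general approach, not StatSheet's actual code: the template wording, the `home_win_prob` field, and the 0.5 upset threshold are all invented for illustration, and Python stands in for PHP.

```python
import random

# Hypothetical templates keyed by game situation.
TEMPLATES = {
    "upset": [
        "The {winner} were not supposed to win this game, but made it happen, "
        "beating the {loser} {ws}-{ls}.",
    ],
    "expected": [
        "The {winner} took care of business against the {loser}, {ws}-{ls}.",
    ],
}


def recap(game, rng=random):
    """Fill a template from a dict of game facts (ties are assumed impossible)."""
    home_won = game["home_score"] > game["away_score"]
    if home_won:
        winner, loser = game["home"], game["away"]
        winner_prob = game["home_win_prob"]
    else:
        winner, loser = game["away"], game["home"]
        winner_prob = 1 - game["home_win_prob"]
    ws = max(game["home_score"], game["away_score"])
    ls = min(game["home_score"], game["away_score"])
    # The pregame odds decide which family of sentences to draw from.
    kind = "upset" if winner_prob < 0.5 else "expected"
    return rng.choice(TEMPLATES[kind]).format(winner=winner, loser=loser, ws=ws, ls=ls)
```

Adding more templates per situation, and more situations (blowout, overtime, losing streak), is what would make the output feel less repetitive.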
Re: (Score:1)
Including the core cliché of sports reporting (Score:3, Insightful)
Sports Journalism Turing Test ... (Score:2)
... was the article written by a human, or a computer? Can you tell the difference? I remember when robots started being deployed in factories, there were concerns about workers sabotaging the robots that were destined to steal their jobs. Will this happen in the sports newsroom?
"The RoboSportReporter is broken again. It looks and smells like someone poured a beer into him."
Re: (Score:3, Funny)
"The RoboSportReporter is broken again. It looks and smells like someone poured a beer into him."
They were just trying to make him more realistic.
Help reading (Score:3, Funny)
Next logical step (Score:2)
Replace the athletes with algorithms. Just think of the savings.
THE Ohio STATE University Buckeyes (Score:1)
Re: (Score:1)
Re: (Score:1)
fans (Score:3, Funny)
Now we need a sports fan algorithm to rid ourselves of all these needless sports fans in the world and replace them with something more worth the resources.
Re:fans (Score:4, Insightful)
That wouldn't get rid of them. Better to keep them occupied with a cheap diversion (which also keeps the athletes busy). Do you really want the jocks and their fans wandering around looking for something to do?
xkcd (Score:5, Funny)
Obligatory Simpsons (Score:2, Funny)
This is the DJ 3000. It plays CDs automatically, and it has three distinct varieties of inane chatter:
- Hey hey -- how about that weather out there?
- Whoa, that was the caller from hell.
- Well, hot dog -- we have a wiener.
- Those clowns in congress did it again -- what a bunch of clowns.
How does it keep up with the news like that?
Re: (Score:1)
- Those clowns in congress did it again -- what a bunch of clowns.
You missed the one calling every single last member of the other party terrorists.
I can hardly wait (Score:1)
for automated theater and restaurant critics. The human responses will be priceless.
Impressed (Score:3, Interesting)
I read the first article on the first linked site and I was impressed. I wouldn't have known it was generated by a computer. Even knowing that it was computer-generated, I'd still be happy with the quality for this kind of reporting. Very good.
Human involvement (Score:3, Funny)
I am going to guess that there will not be any humans involved in reading the output either.
tiny issue (Score:5, Interesting)
Can you copyright the output of an algorithm? Seriously, copyright requires a creative element...
Re: (Score:1)
Re: (Score:2)
Re:tiny issue (Score:4, Funny)
Tell that to a phone book or other assemblage of facts.
I tried, but the phone book wouldn't listen to me.
Re:tiny issue (Score:4, Insightful)
Perhaps you're confused about the outcome of Feist Publications, Inc., v. Rural Telephone Service Co [wikipedia.org]. Phone books and other collections of facts may not be copyrighted because they lack creativity. Hence the question:
While sribe focuses on the creative element, one must also ask who the copyright would go to. The Constitution grants Congress the power
To my understanding this has always been interpreted to mean authors have rights to their writings and inventors have rights to their discoveries.
Re: (Score:2)
Perhaps you're unaware that phone books are not copyrighted? Perhaps you're unaware that this issue went all the way to the Supreme Court?
There's a T-shirt for that... (Score:3, Insightful)
Why am I suddenly reminded of this t-shirt? [thinkgeek.com] :)
Slashdot -- proudly Luddite (Score:3, Funny)
Re: (Score:2)
At least we know that Slashdot isn't generated by robots. A robot wouldn't make the idiotic mistakes that the current human (for want of a better word) editors do. E.g. "one dedicated to each Division 1 college basketball tam in the US." Robots don't suffer from dyslexia, and aren't too lazy to use a spell check.
Robots wouldn't have so many dupes either (as they are the easiest thing to check for). As soon as /. goes a week without a duplicate story, then we know the robots have taken over.
Re: (Score:2)
Re: (Score:2)
A robot wouldn't make the idiotic mistakes that the current human
Unless of course it was programmed to deceive the audience into believing that it was a human...
Sentences don't lead into one another (Score:1)
Re:Sentences don't lead into one another (Score:4, Insightful)
Re: (Score:1)
The second sentence refers to the man and wife de
I like it (Score:2, Interesting)
Now sports editors have something to show novice reporters. "If you can't give me something a whole lot better than this, you're fired".
It's a reminder that standards for every knowledge-based profession are going up every year, driven by the combination of the Internet, globalization, and Moore's Law. And this is just the start of it for journalism.
Re: (Score:2)
Journalistic standards won't go up; the scare stories about immigrants and terrorists will be autogenerated. I mean, they're already pretty much cut and pasted from last week's shock horror exclusives as it is.
Emotionless Facts (Score:2, Insightful)
Re: (Score:2)
> Part of good sports writing is that it evokes emotions.
Then I've never read any good sports writing (unless boredom counts).
Algorithm fails to inform not tar on their heels (Score:1)
Let me know when the algorithm can insult rivals.
It's not just computers that fail the Turing test (Score:2)
Re: (Score:2)
On the other hand, I'm reminded of a Numb3rs episode wherein a supercomputer was programmed to appear to pass Turing tests.
This is new? (Score:2)
In the UK the tabloids have been auto-generating content for at least 20 years.
Re: (Score:1)
So who collects the stats? (Score:2)
Re: (Score:1)
Re: (Score:2)
Re: (Score:2)
Isn't that what Firehose is supposed to do?
Verb Selection (Score:3, Interesting)
I've always been amused by how sports reporters vary the verb used to describe a win. They can't just keep saying "Team A beat Team B" over and over, so they mix it up, based on how wide the score was. For a win with a small margin, they might say "Detroit edged Ottawa", or "The Rangers slid past the Ducks". For a large margin, perhaps "The Coyotes pummeled the Blues". I give extra credit if the verb matches the subject, as in "So-and-so doused the Flames".
I think it would be a lot of fun to write a program for this.
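Such a program could start as small as this sketch. The margin thresholds, the verb lists, and the pun table are all made up for illustration, and letting a pun override the margin is just one possible design choice.

```python
import random

# Hypothetical (max_margin, verbs) bands, smallest margin first.
VERBS_BY_MARGIN = [
    (3, ["edged", "slid past", "squeaked by"]),
    (10, ["beat", "defeated", "topped"]),
    (float("inf"), ["pummeled", "routed", "crushed"]),
]

# Extra credit: verbs that match the losing team's name.
PUN_VERBS = {"the Flames": "doused"}


def pick_verb(margin, loser, rng=random):
    """Return a verb describing a win by `margin` over `loser`."""
    if margin <= 0:
        raise ValueError("margin must be positive for a win")
    if loser in PUN_VERBS:
        return PUN_VERBS[loser]
    for limit, verbs in VERBS_BY_MARGIN:
        if margin <= limit:
            return rng.choice(verbs)


def headline(winner, winner_score, loser, loser_score, rng=random):
    """Build a one-line result such as 'Detroit edged Ottawa, 3-2'."""
    verb = pick_verb(winner_score - loser_score, loser, rng)
    return f"{winner} {verb} {loser}, {winner_score}-{loser_score}"
```

Calling `headline("Detroit", 3, "Ottawa", 2)` picks one of the small-margin verbs at random. The thresholds would obviously need tuning per sport, since a 3-point margin means something very different in hockey than in basketball.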
Lame. Could be better. (Score:2)
That's kind of lame. It's just a one-paragraph summary of the game.
A more promising approach would be to start with a play by play summary [go.com]. Football play-by-plays look like this:
Re: (Score:1)
just read the boxscore instead (Score:2)
For hockey, I skip the article and go straight to the boxscore. It has this great innovation: presenting the information in chronological order so you can follow along, rather than describing them in reverse go-ahead order.
If you see a 10-minute misconduct by some skill dude, you might have to read the article to find out whether the guy went ape, or just forgot to tie down his jersey in a tug fest of the midgets. This is exactly the information that's not likely to be found in the robospiel. Sometimes I
The stories read like they're written by bots (Score:2)
Am I a butterfly? (Score:2)
Am I a butterfly imagining that everybody on Slashdot is a bot discussing a story about bots, or am I a bot posting about people or bots on Slashdot imagining that I'm a bot or a butterfly or something?
weather forecasts by robot (Score:2)
Good thing for regular news (Score:2)
Good thing for regular news here in the USA that our news isn't data-driven... it's opinion-driven, and you need a person to make up an opinion... or do you? I'm pretty sure someone could generate the Fox News content just by scanning CNN's articles and negating all of the opinion statements.
Re: (Score:2)
Don't praise the machine!