AI The Media News Technology

Could a Computer Write This Story? 101

Posted by Soulskill
from the kicking-newspaper-writers-when-they're-down dept.
An anonymous reader tips an article at CNN about the development of technology that automates the process of writing news articles. It started with simple sports reporting, but now at least one company is setting its sights on more complicated articles. Quoting: "Narrative Science then began branching out into finance and other topics that are driven heavily by data. Soon, Hammond says, large companies came looking for help sorting huge amounts of data themselves. 'I think the place where this technology is absolutely essential is the area that's loosely referred to as big data,' Hammond said. 'So almost every company in the world has decided at one point that in order to do a really good job, they need to meter and monitor everything.' ... Meanwhile, Hammond says Narrative Science is looking to eventually expand into long form news stories. That's an idea that's unsettling to some journalism experts."
This discussion has been archived. No new comments can be posted.

  • by Anonymous Coward
    the FP's
  • by Anonymous Coward on Saturday May 12, 2012 @09:23AM (#39978311)

    Could a Computer Write Better Stories on Slashdot?

    YES.

    • by million_monkeys (2480792) on Saturday May 12, 2012 @09:40AM (#39978395)

      Could a Computer Write Better Stories on Slashdot?

      Slashdot summaries would be fairly well suited to being done by computer. They are usually taken from existing articles available on the web. They follow a straightforward format that is largely a quote/summary of the article. Occasionally they provide links to previous stories on the same topic. Computers can already do those things. You could even have an algorithm to put in random typos. I'm not sure how successful a computer would be at generating the tag lines like "from the kicking-newspaper-writers-when-they're-down dept.", but the rest seems doable. If slashdot were run by a bunch of geeks with the desire to do so, the story process could probably be automated, including the process of finding and rating interesting stories by scanning various sites.

      • Re: (Score:2, Funny)

        by Anonymous Coward

        In fact, I stop reading slashdot summaries as soon as I decide whether I want to read the article or not. Otherwise I keep reading the same material over and over again, which is a waste of my time. Better slashdot summaries would be welcome, yes. The current crop waste time and aren't very strong indicators, as in there's a fairly high "oh the article was crap after all" percentage.

        • People have been asking for the ability to moderate the stories once they hit the front page, not just the comments and the firehose.

          The stupid "like" and "+1" buttons don't have the same effect. If there are 100 +1s, that means nothing by itself - for example, if the story has 1,0000 -1s or "hates" it helps put any up-rating into context.

          So a "200 people liked this story, 5,000 said it sucked" would be appreciated.

      • by Nrrqshrr (1879148)
        There is really no need for that part with the typos, please.
      • by mpicker0 (411333)
        It's already been done. Check out: http://www.bbspot.com/toys/slashtitle/ [bbspot.com]
      • by znrt (2424692)

        Computers can already do those things

        they can't. for now they only produce eliza-like simulations on very restricted domains. just look at the crap examples that fluff company exposes on their site, then consider that's the best they can come up with.

        I'm not sure how successful a computer would be at generating the tag lines like "from the kicking-newspaper-writers-when-they're-down dept.", but the rest seems doable.

        so composing an abstract is "already done" and a tag line is "would be?". you haven't thought a lot about this, did you?

        the real nut here is not just "parse some data" but extract a semantic model out from some text (pick your domain, ofc), then you need a reasoning/inferential engine with a met

        • I'm not sure how successful a computer would be at generating the tag lines like "from the kicking-newspaper-writers-when-they're-down dept.", but the rest seems doable.

          so composing an abstract is "already done" and a tag line is "would be?". you haven't thought a lot about this, did you?

          Do you understand the difference between extracting a quote from an existing text vs. creating completely new text intended to be humorous?

    • by jd2112 (1535857)
      Perhaps, but it would almost certainly be neuter at checking for dupes.
  • by Anonymous Coward

    3d animation , a computer made voice and now a computer making the story

    WHY do we need copyright again then?

  • Using baseball as an example, it's possible to automate the box score creation, but only if a user inputs the pitch-by-pitch scoring information for what was thrown, how the batter reacted, and where the ball went among other things. You can't make a computer make the decision whether the play was a hit by the batter or an error by the fielder yet. Bottom line, it's totals that a computer can come up with, but the atomic facts still need to be gathered by a human.
    • What if you have cameras and motion analysis software tracking the entire game?
      • If you can do that you can automate balls and strikes and fair and foul, but you still can't automate the official scorer's determination whether the ball was "catchable" or not.
    • Then the journalists are fucked. What most of them do is rewriting press releases submitted by companies, copying subscription news stories, and maybe adding a critical sentence or two cribbed from wikipedia.
      • by sco08y (615665)

        Then the journalists are fucked. What most of them do is rewriting press releases submitted by companies, copying subscription news stories, and maybe adding a critical sentence or two cribbed from wikipedia.

        If by "fucked" you mean, "likely to switch to a job that isn't quite so mind numbingly repetitive and pointless," then, yes, they are fucked. Many people will be fired, yes I've been there, it sucks, but they will all find new jobs, and they will mostly be better off in the long run.

      • My point is, the chain of knowledge must start with a human somewhere. If computers rewrite the press release, the press release is coming from a human. Also, the right to reprocess news has to come from somewhere, you can't use an automated or human process to steal somebody else's news.
        • I fail to see your point. The human company employee who writes the press release isn't part of the news industry, isn't paid by a newspaper, and has no relationship with the journalist we're discussing. The press release is just a raw material which enters into the production of the news story.

          What you're implying is that humans will always have a role, but that role is entirely on the fringes of the economy, as high value consumers and as low value producers. All the value added stuff, the selection, a

    • You can't make a computer make the decision whether the play was a hit by the batter or an error by the fielder yet.

      This sentence seems to be implying that the newspaper reporter is the one that decides whether to score a play a hit or an error. The article is about automated generation of newspaper articles, not computer-refereed sports.
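To make the box-score point from the top of this thread concrete, here is a minimal sketch of the "totals" half of the job. It assumes a human scorer has already labeled every plate appearance, including the hit-versus-error call. The `Outcome` codes, the `BoxLine` struct and the `tally` function are hypothetical names invented for illustration, and sacrifices and other special cases are ignored.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical outcome codes. The hit-versus-error judgment is assumed
 * to have been made by a human scorer before the data reaches this code. */
typedef enum { PLAY_OUT, PLAY_HIT, PLAY_ERROR, PLAY_WALK } Outcome;

/* One batter's line in the box score. */
typedef struct {
    int at_bats;
    int hits;
    int walks;
    int reached_on_error;
} BoxLine;

/* Sum human-scored plate appearances into a single box-score line. */
BoxLine tally(const Outcome *plays, size_t n)
{
    BoxLine line = {0, 0, 0, 0};
    assert(plays != NULL || n == 0);
    for (size_t i = 0; i < n; i++) {
        switch (plays[i]) {
        case PLAY_HIT:
            line.at_bats++;
            line.hits++;
            break;
        case PLAY_ERROR:
            line.at_bats++;          /* counts as an at-bat, but not as a hit */
            line.reached_on_error++;
            break;
        case PLAY_WALK:
            line.walks++;            /* a walk is not an at-bat */
            break;
        case PLAY_OUT:
            line.at_bats++;
            break;
        }
    }
    return line;
}
```

The arithmetic is trivial once the labels exist, which is the thread's point: the hard part is deciding whether a given play is a `PLAY_HIT` or a `PLAY_ERROR`, and nothing in this sketch attempts that.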

  • Yes, it could, although I hope they can do it better than the AI used to edit and post this story.

    • Editors like the kind on Slashdot are hard to automate. You could rely on the +/- buttons in the firehose to pick stories, but that's not automation, that's crowdsourcing. Additionally, you can't automate the submitters without Slashdot looking like Digg where every RSS feed that wants in participates. Automation tools can make Slashdot easier to write, but can't fully replace the man-in-the-machine concept.
  • by guttentag (313541)
    The anonymous reader who submitted the story must be new here. The only automaton-written stories on this site are marked "Slashdot TV."
  • by Dr. Tom (23206) <tomh@nih.gov> on Saturday May 12, 2012 @09:35AM (#39978373) Homepage

    "${subject} ${verb} ${object}," said a source inside the ${CurrentPresident} ${administration} who spoke on condition of anonymity because they were not authorized to speak to the press.

    • The word "administration" is a variable, not a part of the constant strings?
    • Yahoo! Finance has been using this sort of AI for a long time now.

      Stock goes up after earnings: "XXX went up after posting 20% higher profit"
      Stock goes down again a few minutes later: "XXX went down after posting earnings that were lower than analysts expected"
      Stock goes back up again a few minutes later: back to first version
      (I'm not exaggerating, I've seen this happen many times)

      Same thing with "futures pointing up because investors are happy about xxx", then "market opened down because investors are afra

  • A computer can find the first paragraph for every story on the AP Wire and then post a discussion forum to go with it, but a computer can't analyze the story or write and moderate comments that are any good. AI just isn't there yet. There's a lot to news that still requires people.
    • ...but a computer can't analyze the story or write and moderate comments that are any good.

      What makes you think such a quality is needed or desired in a writer for most newspaper editors?

      • To tell a story, you need to speak with the person who generated the story and decide whether you believe them. Somebody must be a witness to the event that happened, otherwise there's little to no way to report the story.
  • Easy! (Score:5, Funny)

    by mustafap (452510) on Saturday May 12, 2012 @09:41AM (#39978397) Homepage

    void main (void) {
        printf("First Post!\n");
    }

    • You can automate first post, but you can't automate first post modded to 5.
    • by rrohbeck (944847)

      Fail for not including stdio.h.
      main is supposed to return an int.
      There's no substitution in the string so puts would be way more efficient and would save you precious microseconds in your quest for First Post.
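Applying the three corrections above might look like the sketch below. The helper name `first_post` is an assumption introduced here so the call can be checked in isolation, and `fputs` stands in for the suggested `puts` so the output stream can be passed in. Either call writes the string without parsing format specifiers.

```c
#include <assert.h>
#include <stdio.h>

/* Write the message with fputs, which copies the string as-is instead of
 * scanning it for format specifiers. Returns 0 on success, 1 on a write
 * error, so the result can be used directly as a process exit status. */
int first_post(FILE *out)
{
    assert(out != NULL);
    return fputs("First Post!\n", out) == EOF ? 1 : 0;
}
```

A conforming entry point would then be `int main(void) { return first_post(stdout); }`, which covers the `int` return type the reply asks for.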

  • Already Been Done (Score:4, Insightful)

    by Kamel Jockey (409856) on Saturday May 12, 2012 @09:42AM (#39978403) Homepage
  • by Anonymous Coward

    with regards to sports news. [slashdot.org].

    Here is a swatch of today's baseball news [statsheet.com], courtesy of writer bots.

  • But will whatever is output make any sense and be verifiable?

    To quote thusly:

    How can the purple yeti be so red,
    Or chestnuts, like a widgeon, calmly groan?
    No sheep is quite as crooked as a bed,
    Though chickens ever try to hide a bone.
    I grieve that greasy turnips slowly march:
    Indeed, inflated is the icy pig:
    For as the alligator strikes the larch,
    So sighs the grazing goldfish for a wig.
    Oh, has the pilchard argued with a top?
    Say never that the parsnip is too weird!
    I tell thee that a wolf-man will not hop
    And no m

    • by Zocalo (252965)
      On the whole, I found this latest work to be somewhat lacking in comparison to earlier works such as "Oh freddled gruntbuggly", but fortunately I don't currently have a poetry appreciation chair available for the full experience.
      • by bmo (77928)

        It is 34 year old computer history.

        You may want to check and see if your geek dues have been paid up.

        --
        BMO

        • You may want to check and see if your geek dues have been paid up.

          --
          BMO

          I tried to use bitcoins but the only thing they would accept was a check.

  • We've seen more than we want with regards to claims of copyright over news and information. Copyrights are supposed to be reserved to "creative works." Computer output should not be considered creative works... at least not until AI is advanced enough that machines can think on their own. (I had to write that... our future overlords will read all of this and decide to select me for extermination.)

    Seems like the further we go along, the more absurd these things become.

    • Computer output should not be considered creative works...

      It's not that simple. Presumably, output created purely from factual data isn't, but if the input is already a creative work, then the output is a copyrighted derivative; for example, binaries produced by compilers are still copyrighted by the source code author.

      So, if they write the program to analyze an existing corpus of articles and create new ones based on it (and a database of new facts), I'd say the result could be considered a derivative work, owned by whoever owns the copyright of that corpus.

  • by Snaller (147050) on Saturday May 12, 2012 @10:03AM (#39978505) Journal

    Could have fooled me.

    • by Nerdfest (867930)

      Stories generated by software (especially open source software) from large databases of facts would have the advantage that they would be more likely to be free of bias. I think this sort of story would still leave room for editorials, opinion pieces, commentaries, suggestions for what went wrong and how to improve things, interviews, and research, so journalists have nothing to worry about. Some of the major news outfits would have a decidedly different flavour though.

  • Maybe then we won't have to hear about the left or the right or some other such person who has some wild conspiracy to destroy the country with their agenda.
  • Expert Systems are gradually making all but the top thinkers obsolete. No longer will being really really smart or really really talented be enough. Computerized CNC machines have already replaced cabinet makers. Measure your space, put numbers & wood in and out comes easy to assemble cabinet parts that fit perfect. Human beings become interchangeable cogs that just push buttons. In the Jetsons that meant you didn't work all that much, but in real life we can't imagine paying someone who doesn't work (un
    • Those are just tools to enhance productivity. This, spammers will be all over this. There's already a significant cottage industry in getting topics and writing articles for them to game the search engines, what these guys do then is "spin" the articles into hundreds of other articles by swapping around the sentences. One can earn a few dollars per article, with careful rules about the number of keywords and their placement. If the process becomes automated I reckon it will throw search engine results into

    • by unitron (5733)

      Anyway... What are we going to do with all these people?

      There was a movie about that. I think it was the only time Edward G. Robinson did SciFi.

  • I have to say that for the past two years news stories have been degrading in quality and actual syntax or structure; going from cohesive chunks of texts with start and finish to simple conglomerates of non connecting paragraphs. I do not know if this is: (a) a result of journalists becoming more and more lazy due to the high output of stories they need to pump, (b) they are becoming more and more retarded as national education deteriorates, or (c) the technology is already out there and they are keeping co

  • Did anyone else mentally read this summary in General Hammond from Stargate's voice? :D

  • by RyanFenton (230700) on Saturday May 12, 2012 @10:29AM (#39978641)

    When I worked a bit at EA, as a gameplay programmer on the Tiger Woods PGA Tour 2010 project, one of the things I worked on was the scripting/event/audio system that makes the announcers react to the player's actions.

    The main task of such an event engine, working with a finite pool of reactions, is to track what it has said over a given time period, tune it so it doesn't repeat a phrase too often, and use it to fill as much 'empty air' as we can while it hasn't reached an annoying threshold.

    The problem in that case, of course is that we only got to record so many responses with a professional voice actor, and only so much room on the disc.

    With a news response engine, you wouldn't have it respond to everything - you'd have a very specific class of stories used to patch holes, the kind that is already nearly automatic. Grabbing retweets, say "this person said this about this person", send it to an editor for review, then use it to fill gaps in a web page layout.

    But then you'd still have to balance the rate of repetition of such types of news stories - which is a game of novelty and adaptive tuning.

    It's certainly possible - but given the company, I expect it to be used for a while with lots of embarrassing things the editors miss showing up, until the marketing crew discovers they can use it to inject advertising messages into the news stream. This input from several sources gaming the system will lead to it becoming useless over time, leading to it eventually being reinvented independently several times.

    Meanwhile, Fox news will become a 24/7 lottery news channel - you too can become rich! They'll put parts of a lottery number in each commercial, then have the exact same news hosts as now tell people about how much you have to gain, using traditional conservative talking points to bolster the appeal.

    MSNBC? They'll just keep selling airtime to infomercials when they can - they've already become the costs-nothing-to-produce-prison-shows channel.

    Ryan Fenton
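The repetition tuning described in the comment above can be sketched as a least-recently-used pick with a cooldown. This is a guess at the general idea, not the shipped EA system: the `Phrase` struct, the `pick_phrase` function, the tick-based clock and the single `cooldown` parameter are all assumptions made for illustration.

```c
#include <assert.h>
#include <stddef.h>

#define NO_PHRASE (-1)

/* One recorded line from the finite pool. A negative last_used value
 * means the phrase has never been played. */
typedef struct {
    const char *text;
    long last_used;   /* tick at which the phrase was last played */
} Phrase;

/* Pick the phrase that has been silent the longest, provided it has
 * rested for at least `cooldown` ticks. On success the phrase is stamped
 * with `now` and its index is returned. If every phrase is still too
 * fresh, NO_PHRASE is returned and the caller leaves the air empty. */
int pick_phrase(Phrase *pool, size_t n, long now, long cooldown)
{
    int best = NO_PHRASE;
    assert(pool != NULL || n == 0);
    for (size_t i = 0; i < n; i++) {
        int never_used = pool[i].last_used < 0;
        if (!never_used && now - pool[i].last_used < cooldown)
            continue;                  /* played too recently to repeat */
        if (best == NO_PHRASE || pool[i].last_used < pool[best].last_used)
            best = (int)i;             /* prefer the least recently used */
    }
    if (best != NO_PHRASE)
        pool[best].last_used = now;
    return best;
}
```

Raising `cooldown` trades more dead air for less repetition, which is roughly the balance the comment describes. A news engine would presumably apply the same trade-off to story types rather than to recorded phrases.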

  • by mrroot (543673) on Saturday May 12, 2012 @10:44AM (#39978731)
    Automated story writing will be easier once the next updated version of the newspeak dictionary is released. Unfortunately I am a doubleplusungood newspeaker, but at least computer written stories will help me avoid crimethink.
  • The value of a human writer over the dumping of raw data is that the writer, you hope, had taken the time to understand what the facts mean, how they might affect you and what is more or less important among the facts. Also, what "facts" are controversial or just too fanciful to be credited at all.

    I would expect an automated report to have perfect grammar and to relate whatever facts were input, but be devoid of any insight and to have confusing presentation of material and ambiguous statements.

  • The secret sauce in a Slashdot story summary is alarmist imprecision.

    The problem with this venture as a business model is that when you fully automate a human process with no value add, it tips the lack of value-add from painfully obvious to gratingly obvious in some subtle way. The least trace of eau-de-uncanny-valley causes the sleeping princess to finally notice the pea. The pea is then perp-walked out of the castle, and the cycle continues.

    The first thing we do, let's kill all the similes.

    Rooting for t

  • Give the program access to a company's enterprise data warehouse and any other data storage, and have it write an article on the health of the company. Could have some interesting results for investors, auditors and investigators. "This company is a hidden gem" or "This company is so rotten you should be able to smell it in the reception".
  • 5 monkeys. 10 minutes.
  • "Toronto recovers for Toronto, and now Toronto is running the Toronto to the Toronto... it's a Toronto! He's done it!" -Watson writing subroutine 100010101101

    "Toronto recovers for Toronto, and now Toronto is running the Toronto to the Toronto... it's a Toronto! Toronto has done won Toronto! [Watson subroutine 100010101101, please use less pronouns]" -Watson editing subroutine 100010101110

    ...after the game, Toronto happily announced to the cameras that he's going to Toronto.
    • by unitron (5733)

      Someone please re-program Watson to know when to use "less" and when to use "fewer".

  • Eventually all jobs will be done by computers. It's just a matter of time. We will either live in a currency-less society (a la star trek) or all of the currency will be controlled and held by those in charge of the computers.
  • I don't know about anyone else, but if I read an article it's because I want a unique input or take on data already given. If there is going to be some robot just writing a bland article about information already presented, can't they just give me the straight information and be done with it? There will not be any good takes on an event when it is programmed at this stage in time.
  • Anyway, define 'write'.

    Weather Reports, Obituaries, Graduation Notifications -- it's all been thought
    of many decades ago.

  • Foxnews decided to save money by auto-generating the facts also ;-)

  • Cal [scribd.com]
