Books Math Entertainment

Algorithm Aims To Predict Fiction Bestsellers 146

benonemusic writes "Three computer scientists at Stony Brook University in New York believe they have found some rules through a computer program that might predict which fiction books will be successful. Their algorithm had as much as an 84 percent accuracy rate when applied to already published manuscripts in Project Gutenberg and other sources. Among their findings was that more successful books relied on verbs describing thought processes rather than actions and emotions. However, some disagree with the findings. Author Ron Hansen said style is not the key, but instead readers' interest in the topics in the book." There has been work done already on finding the formula for a hit song, and using analytics to craft a blockbuster movie.
This discussion has been archived. No new comments can be posted.

  • by Anonymous Coward

    ... becomes more cold.

  • Sex, drugs, and rock 'n' roll.
    • by mcgrew ( 92797 ) *

      Damn, so that's why Nobots is selling so poorly... I forgot to put rock and roll in it! Damn... why didn't you tell me before I wrote it?

      • by Quirkz ( 1206400 )

        Ah, mine has all three, and it's still selling poorly. I suspect it's the puns and spoonerisms keeping people away. I mean, how many bestsellers have spoonerisms? Other than Lolita, which is probably the exception that proves the rule.

  • to be made from suckers. No fancy computer program is going to replace actual talent.
    • by symbolset ( 646467 ) * on Thursday January 09, 2014 @02:43AM (#45904697) Journal
      So you haven't been to the movies or read a bestselling book lately? There is no talent to replace.
      • So you haven't been to the movies or read a bestselling book lately? There is no talent to replace.

        Lately? Sturgeon's Law is 50 years old or more.

    • by mcgrew ( 92797 ) *

      No fancy computer program is going to replace actual talent.

      I don't think there's any correlation between talent and success whatever. Wikipedia quotes Stephen King as saying that James Patterson "is a terrible writer, but very successful." I read Patterson's "When the Wind Blows" and wasn't very impressed with his writing, either, especially the switching back and forth between 1st and 3rd person. But almost every time I see a woman with a book it's one of his.

      Asimov's Hugo-winning Foundation trilogy didn'

      • Funny, I have the same opinion of Stephen King.

        Well, maybe not "terrible", but there have been some pretty bad moments. And not enough good ones.

        • I forget which King story I read, it involved a woman going to a cabin and being tied to a bed by her lover who dies on top of her. While she's there she starts to have images of things that might happen to her.

          I don't think I finished it, the writing and droning was so bad. I only read it because I had never read King before and someone handed it to me.

          Reminds me of the Family Guy scene where King is in front of his publisher who is asking what book King is working on. King grabs the lamp and says it's ab

        • by mcgrew ( 92797 ) *

          Funny, I have the same opinion of Stephen King. Well, maybe not "terrible", but there have been some pretty bad moments.

          Well, I never cared for his genre (horror), but I don't see how anyone could say The Green Mile isn't some great writing. I only read it because I'd seen the movie and a friend had a copy I could borrow (six very skinny volumes). It really sucked me in. Patterson? I write better than him, King kicks my ass..

          Of course, since I wasn't a literature or English major, my opinion of Patterson and

      • I don't think there's any correlation between talent and success whatever. Wikipedia quotes Stephen King as saying that James Patterson "is a terrible writer, but very successful."

        "Terrible writer" is subjective. While I'm sure that initial luck and subsequent promotion have something to do with it, he obviously writes stories that a lot of people like. I think Harry Turtledove is a complete hack of a writer, but I read his alternative history stuff because I like the subject so much.

        wasn't very impressed with his writing, either, especially the switching back and forth between 1st and 3rd person

        You may not like switching back and forth between 1st and 3rd person, but it's not an unusual technique. I like it when it's done well (never read Patterson so I couldn't say if he does it well).

        • by mcgrew ( 92797 ) *

          "Terrible writer" is subjective.

          Well, it is to me but if you're a literature or English major I'd say your opinion carries quite a bit of weight since wikipedia says "From 1966, King studied English at the University of Maine, graduating in 1970 with a Bachelor of Arts in English." So I'd say his opinion (terrible) carries far more weight than mine (not that good).

          he obviously writes stories that a lot of people like.

          Yes, he writes murder mysteries with sex. Women eat those up, he panders to them. Actually,

      • Long ago I determined that having talent was a ticket to poverty.

        I know dozens of highly talented musicians, writers, artists, etc. who make jack from their talent. They play lots of gigs, get art shows, get books published but they make very little money from it. All but one have menial jobs to support their dream of someday succeeding with their talents. The only one who doesn't have a real job is very good at marketing her art and builds sculptures for local businesses. But she only makes about $40K/

      • by Tipa ( 881911 )

        Huh? Asimov originally serialized the Foundation series in Astounding Magazine, for which he was paid quite well.

        Those Golden Age SF pros didn't write a word if they weren't going to be paid for that word. This was their livelihood.

        • by mcgrew ( 92797 ) *

          Yes, the magazine paid him, but when it was published as books (I forgot the name of the publisher) it didn't sell and he received no royalties for the books; the publisher just didn't have the marketing muscle that Doubleday did. Asimov recounts this story in one of his books, I don't remember which one.

      • by hey! ( 33014 )

        I don't think there's any correlation between talent and success whatever. Wikipedia quotes Stephen King as saying that James Patterson "is a terrible writer, but very successful."

        I think you are confusing *craft* with *talent*. Craft, talent and taste are all distinct things. So a talented author can write a sloppy and vulgar book. Likewise an author of little talent can write a tasteful and technically admirable book. I see this in my writers' group all the time: diligently crafted and thoughtful manuscripts that nobody but their author will ever love. The world of unpublished manuscripts is full of irredeemable garbage, but there are plenty of ambitious, clever, and disc

        • by mcgrew ( 92797 ) *

          One does indeed need both talent and craft for a work to be good. However,

          it's not true you can manufacture success with total swill.

          Pet rocks, mood rings, Milli Vanilli... all you need is money.

          For example, the first "manufactured" hit band was 60s TV show group THE MONKEES.

          I'm not sure they were the first, but at any rate it was never a secret that they didn't perform their own music; they weren't designed to be a real band, but fiction about a fictional band.

          As to Clancy, I only read one of his books (Re

    • The Black Swan will explain why this research is so ludicrously stupid.
      • Unlikely. Their comparison is the outcome of a popularity contest, which in the terminology that Taleb used is an inhabitant of Mediocristan. The distribution is relatively smooth as it involves the average opinion of a large population.

        • Taleb's point was that you can see patterns in past behavior that don't necessarily indicate future performance. He even used literary work as one of his first examples.

          Here's the completely predictable: One day, a small movie studio will start pushing its own movies where they explicitly try to make "golden-age" movies that aren't formulaic. They'll become wicked-popular. How do we know this? All of the push-back against formulaic crap!

          Here's the unpredictable: Someone made a formula for movies ba

          • Taleb's point was that you can see patterns in past behavior that don't necessarily indicate future performance. He even used literary work as one of his first examples.

            No, it really wasn't. His point was that the phenomena that we encounter are modeled by two very different types of distribution. In one kind the past is a good predictor of the future because deviations from the norm don't happen. In the other kind the past is a poor predictor of the future because although deviations from the norm don't ha

            • Except "groundbreaking authors" come out of nowhere, and literature experiences sudden extreme changes in what's stylistically popular all the time. They're trying to use this week's popularity contest to predict the next eternity's type of fluff to write; what they're going to do is produce pulp that doesn't sell for very long.

              • True (to the first part).

                But they are not trying to predict the success of authors, they are trying to predict the success of works. Predicting the output of any author would be difficult, modelling human creativity and all that jazz. But predicting the success of a work is simple(-ish) machine learning. Build a learning bias for style-features in the text and throw an optimisation at it.

                For the second part - when do styles of literature experience sudden extreme changes in popularity? I've seen slow chang
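To make the "style-features" idea above concrete, here is a minimal sketch of the feature-extraction step only. It assumes two tiny hand-picked word lists (`THOUGHT_VERBS` and `ACTION_VERBS`), which are hypothetical and are not taken from the Stony Brook paper; a real system would presumably use far richer features and feed them to a trained classifier.

```python
import re
from collections import Counter

# Hypothetical word lists, chosen only for illustration; not from the study.
THOUGHT_VERBS = {"recognized", "remembered", "thought", "knew", "wondered"}
ACTION_VERBS = {"ran", "jumped", "grabbed", "hurled", "tossed"}


def style_features(text):
    """Return the share of tokens that fall in each hypothetical verb list."""
    tokens = re.findall(r"[a-z']+", text.lower())
    counts = Counter(tokens)
    total = len(tokens) or 1  # avoid division by zero on empty input
    return {
        "thought_rate": sum(counts[w] for w in THOUGHT_VERBS) / total,
        "action_rate": sum(counts[w] for w in ACTION_VERBS) / total,
    }


# A made-up sentence, used only to exercise the function.
sample = "She remembered the letter and wondered why he ran."
features = style_features(sample)
```

The resulting dictionary is the kind of per-book feature vector an optimiser could then be "thrown at"; whether such features predict anything is exactly what the thread is arguing about.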

  • Bias and other flaws in the design and statistical analysis.

    Suffering increases every day from the ever increasing Marketing Research and its derivations and accompanying costs. Keep in mind, there is more to costs than just money.

  • by ackthpt ( 218170 ) on Thursday January 09, 2014 @12:27AM (#45904315) Homepage Journal

    Is for the enjoyment like article much very.

    Posted by Comment Bot v1.0, Universe Algorithms, division 9 Sirius Cybernetics Corporation.

  • How does Jackie Collins sell so many books [wikipedia.org]? She uses too many verbs? I thought it was about the overly dripping romance themes that women seem to like?!?!

    • by icebike ( 68054 )

      Don't forget: Successful books relied on: "verbs describing".

      All this time I thought adjectives described. Silly me. No wonder my great novel failed.

      • Re: (Score:3, Informative)

        by Anonymous Coward

        Don't forget: Successful books relied on: "verbs describing".

        All this time I thought adjectives described. Silly me. No wonder my great novel failed.

        If that's what you thought then yes, that's probably one of your problems. Compare the following sentences:
        "He pitched the ball."
        "He hurled the ball."
        "He tossed the ball."
        "He lobbed the ball."
        "He chucked the ball."

        Where's the adjective to describe the manner in which the ball moved? There isn't one. The verb gives you the description of HOW the ball moved.
        In direct contradiction to this "algorithm", stronger writers tend to rely more on descriptive verbs, weaker writers tend to rely on less descriptive word

      • All this time I thought adjectives described.

        Never mention grammar on Slashdot. It'll bring out more responses than a programming language flame war.

        P.S. That's why I always got a laugh out of the stereotype that engineers and programmers are semi-literate. My experience is that many are sticklers for the language, and that's not just limited to grammar.

        • My experience is that many are sticklers for the language, and that's not just limited to grammar.

          Many are.

          And others can't tell the difference between "lose" and "loose", or "they're" and "their" and "there", or "where" and "wear", or "your" and "you're".

          Those aren't exactly uncommon mistakes on /.

          • I always try to be careful about such things, but those differences are strictly about the stupidity of spelling in the English language. I think bad spellers are mostly people who believed their teacher's claims that English is more than vaguely phonetic. I also think some "rebels" should get together, decide on a single spelling for each set of homophones, and tell everybody else to go screw themselves. No, I haven't had the guts to do it myself yet.

  • Reading Level (Score:5, Informative)

    by TubeSteak ( 669689 ) on Thursday January 09, 2014 @12:30AM (#45904325) Journal

    They began their research with Project Gutenberg, a database of 44,500 books in the public domain. A book was considered successful when it was critically acclaimed and had a high download count. The books chosen for analysis represented all genres of literature, from science fiction to poetry.

    Then, they added some books not in the Gutenberg database, including Charles Dickens' "Tale of Two Cities," and Ernest Hemingway's "The Old Man and the Sea." They also added Dan Brown's latest novel, "The Lost Symbol," and books that have won the Pulitzer Prize, the National Book Award, and other awards.

    Nowadays, marketing and signalling have as much to do with sales as anything else.
    I imagine that if some publisher could make the kind of advertising push that Bill O'Reilly does,
    they could put anything onto the NYTimes best seller list too.

    • All books written by politically active people like O'Reilly are nothing more than slush funds to funnel money towards a particular party or candidate. The Clintons have done it, Sarah Palin's a master of it... Your donors buy up your books, giving you fame, getting the press to talk about you... and then "donate" them to fund-raisers who "give" them away to donors. It looks like you sold lots of books, you're all over the news because of it but no one's reading the book, not even the anchors claiming to inter

      • Re:Reading Level (Score:4, Interesting)

        by retchdog ( 1319261 ) on Thursday January 09, 2014 @02:19AM (#45904629) Journal

        It's not just legit donors, either. One of the games these people play is to charge institutions speaking fees for a public appearance, part of which charge is the required purchase of, say, 5,000 books for their library or for "promotional purposes". The institution plays along, sending 90%+ of the books to be pulped the next day, and the speaker's sales stats get bumped. Ridiculous.

      • by cffrost ( 885375 )

        [...] Sarah Palin's a master of it...

        Sarah Palin's handler(s)/management (team), more likely. We're talking about a person who thought the 2003 invasion of Iraq was (to paraphrase) "revenge for 9/11," or some such nonsense. In other words, I "betcha" there's little acumen of any utility rattling around in that skull of hers.

        God I hate marketing.

        I hope for all exposed beings to possess the wherewithal to resist for-profit and political propaganda in all of its forms, and manipulation therefrom, particularly anything shat out by the United States' six-headed corpora

    • They began their research with Project Gutenberg, a database of 44,500 books in the public domain.
      Then, they added some books not in the Gutenberg database, including Charles Dickens' "Tale of Two Cities," and Ernest Hemingway's "The Old Man and the Sea." They also added...books that have won the Pulitzer Prize, the National Book Award, and other awards.

      How does Project Gutenberg select its texts?

      A book was considered successful when it was critically acclaimed and had a high download count.

      "Critically acclaimed" by who and when?

      How many of the most downloaded titles are on academically "required" or "recommended" reading lists?

      The prize-winner can sometimes tell you more about the internal and external dynamics of the judging than the quality of the book,

    • Tale of Two Cities is in Gutenberg. That's where I read it from.

      Marketing never hurts, but the advent of minimal-cost publishing via ebooks also has helped some authors. There are several best-selling authors who started out as "dollar discounts" from one of the e-publishers.

      • Tale of Two Cities is in Gutenberg. That's where I read it from.

        Charles Dickens, Mark Twain and others were heavily marketed in the 19th century. It's not a 20th century invention. Speaking of Mark Twain, you'll find satire about advertising in "A Connecticut Yankee in King Arthur's Court". Thanks to the protagonist, there were knights running around with advertisements for toothpaste on their suits of armor.

    • And books on Project Gutenberg have more to do with which are on high school reading lists than anything else. I'd say 90% of the reading I've done of public domain books/poems was done for assignments.

  • I was about to say that this speaks poorly of the breadth of the current generation's literary interests, and then I recalled books like Little Women and Lord of the Flies, or even Arthur C. Clarke's Childhood's End (although the Rama series might be more about descriptions than emotional exposes). Still, it's a little disheartening that technical manuals don't hit the bestseller lists. On the upside, Noam Chomsky will be overjoyed by this development; soon software systems will be developed to 'generate'
    • Re:Stagnation (Score:5, Insightful)

      by noh8rz10 ( 2716597 ) on Thursday January 09, 2014 @12:49AM (#45904387)

      On the upside, Noam Chomsky will be overjoyed by this development; soon software systems will be developed to 'generate' hit books. Someone get Angelina (Mike Cook's, not Pitt's).

      I see, so Angelina Jolie used to be an academy-award-winning actress, but now she's just Mrs. Pitt?

      • I see, so Angelina Jolie used to be an academy-award-winning actress, but now she's just Mrs. Pitt?

        She's an aging sack of bad plastic surgery who's been in too many terrible movies. A pretty good match for her hubby at that.

        • I see, so Angelina Jolie used to be an academy-award-winning actress, but now she's just Mrs. Pitt?

          She's an aging sack of bad plastic surgery who's been in too many terrible movies. A pretty good match for her hubby at that.

          That's why they didn't take each other's names :)

          But seriously son she's the mother of six children, so stop being a douche

  • We all know advertising and product placement can make a big difference to return on investment, so what about including paid-for marketing and TV show plugs in the modelling? Nothing can be successful if no one has heard of it.
  • by MacTO ( 1161105 ) on Thursday January 09, 2014 @12:44AM (#45904369)

    Two quotes stand out for me:

    "It's very difficult to quantify decisions that are often made by intuition and relationships."

    The study claims that at least some of those decisions are quantifiable, which pretty much contradicts Hamilburg's point.

    "Of stylistic characteristics, the scientists are flying in the face of most teaching of creative writing when they emphasize nouns over verbs. Verbs are the engine of fiction and quality writing is often measured by their variety, precision, and force,"

    Hansen appears to have missed the point of the study: it is about what sells, rather than what's taught or what makes quality writing.

    • by plover ( 150551 ) on Thursday January 09, 2014 @01:32AM (#45904513) Homepage Journal

      However, the study's sample makes exactly the same mistake. They used Project Gutenberg as the source, and download counts as a substitute for sales. Sales has one measure: the number of dollars in the cash box at the end of the day. They should be measuring books on the NY Times bestseller list, or the Amazon Top 10 list, which have actually sold for money and are actually popular (fraudulently placed books aside). And they should be comparing them against books from their own genres, or at least books that had similar attributes.

      I think what they'd really find is that "books that sell well are those that are marketed well", regardless of the words they contain.

      Maybe they could focus on a specific key reviewer: what does Oprah like and not like? Maybe when they cross compile the data from all the books, they will find they've only discovered Oprah's tastes. Which isn't a bad outcome, if they are ultimately trying to discover what kinds of books will be better positioned to make the author money. But I don't think they've come close to predicting fiction "best-sellers" yet.

      • by RabidReindeer ( 2625839 ) on Thursday January 09, 2014 @08:52AM (#45905689)

        Success comes in two flavors.

        Gutenberg is stacked with classics. Stuff that has been successful over a long period of time. Some classics were flops when they were first published and some go periodically in and out of favor.

        The NYT bestseller list, Oprah, et al. focus on what's popular today. Relatively few books that make those lists will be popular in a century, just as many of the bestsellers from Dickens' day would only be known to literary historians. And missing from Gutenberg.

        • by plover ( 150551 )

          I was commenting based on the title of the articles discussing the study: "Algorithm aims to predict fiction bestsellers"; and "Computer Algorithm Seeks to Crack Code of Fiction Bestsellers". The strong implications are that the algorithm is designed to unlock the secret of making money by writing books that contain certain words or linguistic structures. I'm arguing that a book's financial success has much less to do with any ephemeral "bestsellerness" quality, and has a much stronger association with "m

          • Tl;dr: marketing wins.

            Well, I think it would be more fair to say marketing and current fads win. A bestselling author may not need to do any marketing at all, other than mentioning, "By the way, I'm coming out with another book," and it will probably still sell well. Books about famous people or written by celebrities will also often sell, regardless of whether they are marketed heavily. Similarly, books about current fads (diets, financial advice, etc.) may also sell pretty well -- the first book regarding a fad may need som

            • by plover ( 150551 )

              I just read the first few pages of the study, and it seems the authors tried to control for the "fame of the author" aspect as much as they could, with things like excluding a second text by the same author in the same genre, that sort of approach. And as suspected, the study is much more modest than the article titles suggest. They are looking for "success" as defined by their own criteria, not "money" or "bestsellers".

              But it was the marketing hype that got me to read a study by some random researchers.

        • Gutenberg is stacked with classics. Stuff that has been successful over a long period of time. Some classics were flops when they were first published and some go periodically in and out of favor.

          Or, in other words, what counts as a "classic" right now is simply what's popular today. I think the trends can be better seen in music history. Take, for example, Pachelbel's Canon in D [wikipedia.org], that piece which seemingly shows up everywhere as "classical music." Johann Pachelbel, however, was a master composer [wikipedia.org], well-known in his lifetime for all sorts of compositions. Today he has one stupid piece played at thousands of weddings and other occasions every year, just because of some whims of audiences in the la

  • Comment removed based on user account deletion
  • can it explain... (Score:4, Interesting)

    by able1234au ( 995975 ) on Thursday January 09, 2014 @01:17AM (#45904457)

    Perhaps they can explain why Fifty Shades did well despite being badly written.

    There is a danger in this process that we end up with a "Save the cat" problem where everything has to follow a formula
    http://www.slate.com/articles/arts/culturebox/2013/07/hollywood_and_blake_snyder_s_screenwriting_book_save_the_cat.html [slate.com]

    • by bob_super ( 3391281 ) on Thursday January 09, 2014 @01:44AM (#45904543)

      50 shades is a textbook example of a perfect marketing campaign. It cannot fit an algorithm, it's a total outlier.

      They sent out press releases to all the agencies about the new phenomenon of women using the wonderful anonymity of e-readers/tablets to read Mommy porn, like that "50 shades" thing.
      Journalists just repeated the press releases, over and over again, almost exactly word for word, on various networks, because that's a topic that draws viewer attention.

      And suddenly everyone knew that apparently a lot of people were reading that "50 shades" book, and that reading it was both cool and risqué. Jackpot.

      I read one page of the book that was published on a website. It was worse than the transcript of a reality TV show. It wasn't just bad literature, it was barely passable English.
      But the marketing was absolutely brilliant.

      • That's classic. I would prefer to read the book on the marketing campaign. It is original, brilliantly executed and delivered results. Forget the original book.

      • Or Harry Potter which as Rowling was constantly told, had none of the right tick boxes ticked and many of the 'avoid' ones ticked. Didn't end up doing too badly as I recall after the first few dozen rejections. To be fair, that probably falls into the ~15% 'got it wrong' region the story mentions.
        • I am really surprised at this. I really like the series, but I would never consider it anything other than a somewhat bland very easy read. I think they need to review their formula, because I think HP is a text book example of a mass marketable, guilty pleasure/easy read, that everyone can enjoy.

          • The reasons she was given were that it was far too long, that kids' books don't make money, etc. It only got published as a favour after the publisher's 8yo daughter got sight of the manuscript and pestered him for more. The initial print run was 500 copies, they really weren't expecting it to sell.
      • Porn works like that. Have you ever landed on a porn video site? Most of the videos have no story, and show even less acting ability or camera skills.

        But you know what? Nobody cares! It's the same with 50 shades, people don't read it because it's art. Women read it to get ideas and fantasies. And to be honest most porn sites don't cater to women, so they have a limited choice in the matter.

      • 50 shades is a textbook example of a perfect marketing campaign. It cannot fit an algorithm, it's a total outlier.

        I suspect that, almost by definition, many best-sellers are outliers. They owe their popularity to marketing, the whims of the book-buying public, what's currently trendy, etc. Like 50 shades of grey, they likely won't succumb to an algorithm.

    • If you think Fifty Shades was bad you need to read Naked Came the Stranger, [wikipedia.org] a best seller with absolutely no literary or social merit that became even more popular when it was revealed to be a hoax.
    • by hey! ( 33014 )

      I read Snyder's book because he was a friend of a friend. First off, it's not about *everything*. It's about movie scripts. Secondly it's a bit naive to blame the lack of creativity of modern movies on his book; that's a trend that predates 2005.

      In any case screenwriters are nothing like the olympian figures playwrights are in theater. The main creative force in a movie is the director, and writers are relatively minor figures in the enterprise. In the theater the script is gospel. In the movies a direc

      • What Snyder did for screenplays was fine. He helped writers understand a structure. The problem was management being risk averse and insisting that all movies follow the Cat, even in some cases to the exact minute. Management not knowing their industry and so constraining script writers from doing what they do best. It is the same problem where one successful movie comes out and suddenly there are lots of copies in the same genre simply because that is lower risk. None of this is Snyder's fault. One

    • Perhaps they can explain why Fifty Shades did well despite being badly written.

      Because people like reading about sex. That's also why romance novels routinely feature good-looking half-naked people on the cover.

      Think of it this way: If the movie is about sex, we'll put up with inane dialog, completely predictable plots, and wooden acting, just to watch a couple of people we'll never meet get it on. Why would you expect books to be much different?

      • I get that, but 50 shades is badly written sex. There is no shortage of better written books that will steam up your glasses.

    • The algorithm would be trying to guess how well the book would do on the market, not how well it was written.
      How well a book is written has little to do with how many copies you can sell of it.

  • Look for modern fiction to adjust to fit the parameters of the application, degrading to a common level and uniform format. The literature cannot be observed without being altered. It will be lot like the mandatory movie formula [slate.com]. The content itself is irrelevant.
  • by OhANameWhatName ( 2688401 ) on Thursday January 09, 2014 @01:50AM (#45904563)
    1. Read the algorithm
    2. Write a book
    3. Profit!!!

    I just wrote an algorithm that predicts that no book detailing the death of creativity at the hands of science will ever be written.
  • by speedplane ( 552872 ) on Thursday January 09, 2014 @01:59AM (#45904583) Homepage
    Does this article make everyone else as sick as it makes me?
    • Re:Uck (Score:5, Funny)

      by symbolset ( 646467 ) * on Thursday January 09, 2014 @02:51AM (#45904711) Journal
      Nowhere does it mention the one weird trick that effortlessly melts away the pounds in six minutes while you sleep - that the government doesn't want you to know because it creates instant wealth for the few who know this secret.
    • Does this article make everyone else as sick as it makes me?

      Nope, I got no idea what you are talking about. In fact, I found it pleasant.

      Acknowledging large shortcomings of their study, the one thing they seem to find was that if you want your fiction book to remain popular with a broad audience, you should take my middle school English teacher's advice and show, don't tell.

      They came up with no magic "save the cat" formulas to make hits, and the industry expert says that this study won't help him much: stories are still too complex to predict best sellers.

      Further, they p

  • These things don't actually work. They're curiosities and nothing more.

    When they finally develop strong AI... then you might have something. But a non-intelligent system is not going to figure these things out.

  • Preceding any great scientific advancement or discovery it is no accident that you will find a surge in the fiction and cultural themes surrounding it.

    The New World, Forensics, Avionics, Electronic Computing, Nuclear Reaction, Rocketry, Robotics.

    The cultural mind thinks as you do. Its subconscious boils with the direction it will soon take. Ask yourself: What is seen much more now in your culture? What makes you think you have any choice but to latch onto any thoughts but those which come to mind from wi

  • http://bit.ly/1dgDo7d [bit.ly] . Come on slashdot editors, do the legwork and link the article directly! Otherwise people will post a link in the comments, and who's to say it's not a goatse?

    Anyway, I'm a little worried about the methodology. If you train on PG and test on PG, your generalization error will suffer. This is especially easy to get wrong when both the train and test set are constructed repeatedly with various thresholding rules, and the classifier features are (presumably) optimized during the resear

    • You're a cunt for linking to a URL shortener instead of the article directly. You're as bad as the shithead editors. If you're worried about goatse or similar, don't use some shit URL shortener.

      http://aclweb.org/anthology/D/D13/D13-1181.pdf [aclweb.org]

      There. Easy. Why couldn't you have done that you high-horsed cunt?

    • by mcgrew ( 92797 ) *

      http://bit.ly/1dgDo7d . Come on slashdot editors, do the legwork and link the article directly!

      Come on, martin, do the legwork and link [aclweb.org] it directly. This isn't Twitter and most folks are wary of shortened links; trolls love hiding their goatse and tubgirl links. I only clicked it because your UID is relatively low and you hadn't (yet) been modded down.

  • It's already been done - though only in fiction.

    Roald Dahl wrote about a machine called the Great Automatic Grammatizator. You plug in various parameters - such as type of book, characters, proportions of violence/sex/humour - and in fifteen minutes flat it churns out something that's pretty much guaranteed to be a bestseller according to those parameters. Being a writer himself - and a somewhat dark one at that - the end result was a dystopian universe in which writers were forced to give up

  • Remember, there's a HUGE difference between successful and "good".

    "Successful" means appealing to the dozen or so big publishers' editors, such that they are willing to pimp your book and market it. They can - and have, obviously - taken utter crapola to the top of the "bestseller" lists.

    I entirely understand that the algorithm favors deep internal monologues, because those editors clearly love them.

    • there's a HUGE difference between successful and "good"

      "Good" is subjective. It's some sort of consensus amongst people who are, for whatever reason, considered literary experts. Consider the "classics". Some are good and some suck. I tried to read "Moby Dick" and found the perfect cure for insomnia. People said "just get past all the boring and extraneous stuff". Sorry, but if a book is full of boring and extraneous stuff, then it's not a good book. Maybe it would have been if Melville had had a good editor. OTOH some classics are great. I just read "All Quiet

  • Their algorithm had as much as an 84 percent accuracy rate when applied to already published manuscripts

    I could write an algorithm that's 100% accurate selecting yesterday's lottery numbers.

    • I could write an algorithm that's 100% accurate selecting yesterday's lottery numbers.

      That's why data analysts cross-validate their models. Granted, cross-validation doesn't cure everything (e.g. if the question is already overly specific, or if the analyst double dips in some other way) but it will stop over-fitting and performing at 100%. I downloaded the paper and did a quick search: the authors used a support vector machine for the classification (which effectively allows for fitting of very non-linear boundaries) and they tested it with 5-fold cross-validation. So they given that the
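
      For anyone who hasn't seen it spelled out, here is a minimal sketch of what 5-fold cross-validation does. Everything in it is an assumption made for illustration: the toy two-cluster data, the function names, and above all the classifier, which is a plain nearest-centroid rule standing in for the paper's support vector machine so the example needs nothing beyond the standard library. The point is only the bookkeeping: each fold is scored by a model that never saw it.

```python
import random


def k_fold_indices(n, k, seed=0):
    """Shuffle indices 0..n-1 and deal them into k nearly equal folds."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    return [idx[i::k] for i in range(k)]


def centroid(rows):
    """Column-wise mean of a list of feature rows."""
    return [sum(col) / len(rows) for col in zip(*rows)]


def fit_nearest_centroid(X, y):
    """Return {label: centroid}, computed from the training rows only."""
    by_label = {}
    for row, label in zip(X, y):
        by_label.setdefault(label, []).append(row)
    return {label: centroid(rows) for label, rows in by_label.items()}


def predict(model, row):
    """Pick the label whose centroid is closest to the row."""
    def sqdist(c):
        return sum((a - b) ** 2 for a, b in zip(row, c))
    return min(model, key=lambda label: sqdist(model[label]))


def cross_val_accuracy(X, y, k=5, seed=0):
    """Hold out each fold in turn, train on the rest, score on the held-out fold."""
    folds = k_fold_indices(len(X), k, seed)
    scores = []
    for held_out in folds:
        held = set(held_out)
        train = [i for i in range(len(X)) if i not in held]
        model = fit_nearest_centroid([X[i] for i in train], [y[i] for i in train])
        correct = sum(predict(model, X[i]) == y[i] for i in held_out)
        scores.append(correct / len(held_out))
    return scores


# Hypothetical toy data: two Gaussian clusters standing in for "flop" (0) and "hit" (1).
rng = random.Random(1)
X = [[rng.gauss(0, 1), rng.gauss(0, 1)] for _ in range(50)]
X += [[rng.gauss(3, 1), rng.gauss(3, 1)] for _ in range(50)]
y = [0] * 50 + [1] * 50

scores = cross_val_accuracy(X, y, k=5)
print("per-fold accuracy:", scores)
print("mean accuracy:", sum(scores) / len(scores))
```

      The mean of the five held-out scores is the number to quote. It says nothing about double dipping: if features or thresholds were tuned against the same folds, the estimate is still optimistic.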

  • by ai4px ( 1244212 ) on Thursday January 09, 2014 @10:05AM (#45906049)
    A blockbuster movie? Space, cowboys, roughnecks, scenes of things blowing up, impending doom saved at the last minute and a guy who doesn't make it home and leaves behind a beautiful girl. Oh, and crazy Russians. Perfect formula. A blockbuster song? Repeating lyrics which drone on and a drum machine. The public just seems to love it this way!
  • What the algorithm looked at was writing style. That's hardly new. Teachers have been recommending this or that writing style, probably since the preferred medium was stone tablets. Slavish devotion to such recommendations is obviously undesirable, and a few outliers and experiments are necessary if you don't want writing styles to become stultified. But taking some advice about it is nothing new or undesirable. This study said nothing about structure (for which there are also standard recommendations) or s

  • If you read the article they're not really examining best sellers at all. A site like Gutenberg has no correlation with modern best sellers.

    Film, TV and Internet have all had drastic effects on the market as well. Thus old books aren't really representative.

  • Writers aren't going to spend effort to create a well written book about subjects people aren't interested in.

  • Get something that, krufted up, will work... and the publishers will use it, rather than have readers decide what should be published. You like the crap packaged as "music" from the members of the RIAA? You'll see that in books, too....

                      mark

  • "[they believe they have found an algorithm that might] predict which fiction books will be successful. Their algorithm had as much as an 84 percent accuracy rate when applied to already published manuscripts in Project Gutenberg and other sources."

    I can predict the success rate of already published books with 100% accuracy.

    Backtesting is usually bogus because it means nothing unless the experimenter can precisely enumerate the total number of rules that were formulated and discarded--including those formul
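
    A rough simulation of why the count of discarded rules matters. It assumes a made-up world where book success is a coin flip and the features are pure noise, so no rule can truly predict anything. All names and parameters here are hypothetical, and a "rule" is just a random weight vector over the features. Try enough rules against a small set of past books and the best one looks good in the backtest anyway.

```python
import random


def make_books(n, n_features, rng):
    """Noise features and coin-flip outcomes: by construction nothing predicts success."""
    X = [[rng.gauss(0, 1) for _ in range(n_features)] for _ in range(n)]
    y = [rng.random() < 0.5 for _ in range(n)]
    return X, y


def accuracy(rule, X, y):
    """A rule predicts 'hit' when its weighted sum of the features is positive."""
    hits = 0
    for row, outcome in zip(X, y):
        score = sum(w * v for w, v in zip(rule, row))
        hits += (score > 0) == outcome
    return hits / len(X)


def dredge(n_rules=1000, n_past=40, n_future=2000, n_features=5, seed=0):
    """Keep the rule with the best backtest, then score it on books it never saw."""
    rng = random.Random(seed)
    past_X, past_y = make_books(n_past, n_features, rng)
    future_X, future_y = make_books(n_future, n_features, rng)
    rules = [[rng.gauss(0, 1) for _ in range(n_features)] for _ in range(n_rules)]
    best = max(rules, key=lambda r: accuracy(r, past_X, past_y))
    return accuracy(best, past_X, past_y), accuracy(best, future_X, future_y)


backtest_acc, future_acc = dredge()
print("best rule on past books:  ", backtest_acc)
print("same rule on future books:", future_acc)
```

    The backtest figure comes out well above chance while the same rule on fresh books sits near a coin flip. Reporting only the winner, without the number of rules tried, hides exactly that gap.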

