
Algorithm Aims To Predict Fiction Bestsellers 146

benonemusic writes "Three computer scientists at Stony Brook University in New York believe they have found some rules through a computer program that might predict which fiction books will be successful. Their algorithm had as much as an 84 percent accuracy rate when applied to already published manuscripts in Project Gutenberg and other sources. Among their findings was that more successful books relied on verbs describing thought processes rather than actions and emotions. However, some disagree with the findings. Author Ron Hansen said style is not the key, but instead readers' interest in the topics in the book." There has been work done already on finding the formula for a hit song, and using analytics to craft a blockbuster movie.
This discussion has been archived. No new comments can be posted.

  • can it explain... (Score:4, Interesting)

    by able1234au ( 995975 ) on Thursday January 09, 2014 @01:17AM (#45904457)

    Perhaps they can explain why Fifty Shades did well despite being badly written.

    There is a danger in this process that we end up with a "Save the cat" problem, where everything has to follow a formula.
    http://www.slate.com/articles/arts/culturebox/2013/07/hollywood_and_blake_snyder_s_screenwriting_book_save_the_cat.html [slate.com]

  • by plover ( 150551 ) on Thursday January 09, 2014 @01:32AM (#45904513) Homepage Journal

    However, the study's sample makes exactly the same mistake. They used Project Gutenberg as the source, and download counts as a substitute for sales. Sales have one measure: the number of dollars in the cash box at the end of the day. They should be measuring books on the NY Times bestseller list, or the Amazon Top 10 list, which have actually sold for money and are actually popular (fraudulently placed books aside). And they should be comparing them against books from their own genres, or at least books that had similar attributes.

    I think what they'd really find is that "books that sell well are those that are marketed well", regardless of the words they contain.

    Maybe they could focus on a specific key reviewer: what does Oprah like and not like? Maybe when they cross compile the data from all the books, they will find they've only discovered Oprah's tastes. Which isn't a bad outcome, if they are ultimately trying to discover what kinds of books will be better positioned to make the author money. But I don't think they've come close to predicting fiction "best-sellers" yet.

  • Re:Reading Level (Score:4, Interesting)

    by retchdog ( 1319261 ) on Thursday January 09, 2014 @02:19AM (#45904629) Journal

    It's not just legit donors, either. One of the games these people play is to charge institutions speaking fees for a public appearance, part of which charge is the required purchase of, say, 5,000 books for their library or for "promotional purposes". The institution plays along, sending 90%+ of the books to be pulped the next day, and the speaker's sales stats get bumped. Ridiculous.

  • by symbolset ( 646467 ) * on Thursday January 09, 2014 @02:43AM (#45904697) Journal

    So you haven't been to the movies or read a bestselling book lately? There is no talent to replace.

  • by RabidReindeer ( 2625839 ) on Thursday January 09, 2014 @08:52AM (#45905689)

    Success comes in two flavors.

    Gutenberg is stacked with classics. Stuff that has been successful over a long period of time. Some classics were flops when they were first published and some go periodically in and out of favor.

    The NYT bestseller list, Oprah, et al. focus on what's popular today. Relatively few books that make those lists will be popular in a century, just as many of the bestsellers from Dickens' day would only be known to literary historians. And missing from Gutenberg.
