Algorithm Aims To Predict Fiction Bestsellers
benonemusic writes "Three computer scientists at Stony Brook University in New York believe they have found some rules through a computer program that might predict which fiction books will be successful. Their algorithm had as much as an 84 percent accuracy rate when applied to already published manuscripts in Project Gutenberg and other sources. Among their findings was that more successful books relied on verbs describing thought processes rather than actions and emotions. However, some disagree with the findings. Author Ron Hansen said style is not the key, but instead readers' interest in the topics in the book." There has been work done already on finding the formula for a hit song, and using analytics to craft a blockbuster movie.
And modern life ... (Score:1)
... becomes more cold.
topic (Score:1)
Re: (Score:2)
Damn, so that's why Nobots is selling so poorly... I forgot to put rock and roll in it! Damn... why didn't you tell me before I wrote it?
Re: (Score:2)
Ah, mine has all three, and it's still selling poorly. I suspect it's the puns and spoonerisms keeping people away. I mean, how many bestsellers have spoonerisms? Other than Lolita, which is probably the exception that proves the rule.
There is so much money (Score:1)
Re:There is so much money (Score:4, Interesting)
Re: (Score:2)
So you haven't been to the movies or read a bestselling book lately? There is no talent to replace.
Lately? Sturgeon's Law is 50 years old or more.
Re: (Score:3)
No fancy computer program is going to replace actual talent.
I don't think there's any correlation between talent and success whatever. Wikipedia quotes Stephen King as saying that James Patterson "is a terrible writer, but very successful." I read Patterson's "When the Wind Blows" and wasn't very impressed with his writing, either, especially the switching back and forth between 1st and 3rd person. But almost every time I see a woman with a book it's one of his.
Asimov's Hugo-winning Foundation trilogy didn'
Re: (Score:2)
Funny, I have the same opinion of Stephen King.
Well, maybe not "terrible", but there have been some pretty bad moments. And not enough good ones.
Re: (Score:2)
I forget which King story I read, it involved a woman going to a cabin and being tied to a bed by her lover who dies on top of her. While she's there she starts to have images of things that might happen to her.
I don't think I finished it, the writing and droning was so bad. I only read it because I had never read King before and someone handed it to me.
Reminds me of the Family Guy scene where King is in front of his publisher who is asking what book King is working on. King grabs the lamp and says it's ab
Re: (Score:2)
Funny, I have the same opinion of Stephen King. Well, maybe not "terrible", but there have been some pretty bad moments.
Well, I never cared for his genre (horror), but I don't see how anyone could say The Green Mile isn't some great writing. I only read it because I'd seen the movie and a friend had a copy I could borrow (six very skinny volumes). It really sucked me in. Patterson? I write better than him, King kicks my ass..
Of course, since I wasn't a literature or English major, my opinion of Patterson and
Re: (Score:2)
I don't think there's any correlation between talent and success whatever. Wikipedia quotes Stephen King as saying that James Patterson "is a terrible writer, but very successful."
"Terrible writer" is subjective. While I'm sure that initial luck and subsequent promotion have something to do with it, he obviously writes stories that a lot of people like. I think Harry Turtledove is a complete hack of a writer, but I read his alternative history stuff because I like the subject so much.
wasn't very impressed with his writing, either, especially the switching back and forth between 1st and 3rd person
You may not like switching back and forth between 1st and 3rd person, but it's not an unusual technique. I like it when it's done well (never read Patterson so I couldn't say if he does it well).
Re: (Score:2)
"Terrible writer" is subjective.
Well, it is to me but if you're a literature or English major I'd say your opinion carries quite a bit of weight since wikipedia says "From 1966, King studied English at the University of Maine, graduating in 1970 with a Bachelor of Arts in English." So I'd say his opinion (terrible) carries far more weight than mine (not that good).
he obviously writes stories that a lot of people like.
Yes, he writes murder mysteries with sex. Women eat those up, he panders to them. Actually,
Re: (Score:2)
Long ago I determined that having talent was a ticket to poverty.
I know dozens of highly talented musicians, writers, artists, etc. who make jack from their talent. They play lots of gigs, get art shows, get books published but they make very little money from it. All but one have menial jobs to support their dream of someday succeeding with their talents. The only one who doesn't have a real job is very good at marketing her art and builds sculptures for local businesses. But she only makes about $40K/
Re: (Score:3)
Huh? Asimov originally serialized the Foundation series in Astounding Magazine, for which he was paid quite well.
Those Golden Age SF pros didn't write a word if they weren't going to be paid for that word. This was their livelihood.
Re: (Score:2)
Yes, the magazine paid him, but when it was published as books (I forgot the name of the publisher) it didn't sell and he received no royalties for the books; the publisher just didn't have the marketing muscle that Doubleday did. Asimov recounts this story in one of his books, I don't remember which one.
Re: (Score:2)
I don't think there's any correlation between talent and success whatever. Wikipedia quotes Stephen King as saying that James Patterson "is a terrible writer, but very successful."
I think you are confusing *craft* with *talent*. Craft, talent and taste are all distinct things. So a talented author can write a sloppy and vulgar book. Likewise an author of little talent can write a tasteful and technically admirable book. I see this in my writer's group all the time, diligently crafted and thoughtful manuscripts that nobody but their author will ever love. The world of unpublished manuscripts is full of irredeemable garbage, but there are plenty of ambitious, clever, and disc
Re: (Score:2)
One does indeed need both talent and craft for a work to be good. However,
it's not true you can manufacture success with total swill.
Pet rocks, mood rings, milli vanilli... all you need is money.
For example, the first "manufactured" hit band was 60s TV show group THE MONKEES.
I'm not sure they were the first, but at any rate it was never a secret that they didn't perform their own music; they weren't designed to be a real band, but fiction about a fictional band.
As to Clancy, I only read one of his books (Re
Re: (Score:2)
Re: (Score:2)
Unlikely. Their comparison is the outcome of a popularity contest, which in the terminology that Taleb used is an inhabitant of Mediocristan. The distribution is relatively smooth as it involves the average opinion of a large population.
Re: (Score:2)
Taleb's point was that you can see patterns in past behavior that don't necessarily indicate future performance. He even used literary work as one of his first examples.
Here's the completely predictable: One day, a small movie studio will start pushing its own movies where they explicitly try to make "golden-age" movies that aren't formulaic. They'll become wicked-popular. How do we know this? All of the push-back against formulaic crap!
Here's the unpredictable: Someone made a formula for movies ba
Re: (Score:2)
No, it really wasn't. His point was that the phenomena that we encounter are modeled by two very different types of distribution. In one kind the past is a good predictor of the future because deviations from the norm don't happen. In the other kind the past is a poor predictor of the future because although deviations from the norm don't ha
Re: (Score:2)
Except "groundbreaking authors" come out of nowhere, and literature experiences sudden extreme changes in what's stylistically popular all the time. They're trying to use this week's popularity contest to predict the next eternity's type of fluff to write; what they're going to do is produce pulp that doesn't sell for very long.
Re: (Score:2)
True (to the first part).
But they are not trying to predict the success of authors, they are trying to predict the success of works. Predicting the output of any author would be difficult, modelling human creativity and all that jazz. But predicting the success of a work is simple(-ish) machine learning. Build a learning bias for style-features in the text and throw an optimisation at it.
For the second part - when do styles of literature experience sudden extreme changes in popularity? I've seen slow chang
New Coke/New Waists/New Privacy Invasions/New ETC (Score:1)
Bias and other flaws in the design and statistical analysis.
Suffering increases every day from the ever increasing Marketing Research and its derivations and accompanying costs. Keep in mind, there is more to costs than just money.
Automated response (Score:5, Funny)
Is for the enjoyment like article much very.
Posted by Comment Bot v1.0, Universe Algorithms, division 9 Sirius Cybernetics Corporation.
Re: (Score:2)
Actually, it's syntactically perfect PostScript. :-)
So does this explain... (Score:2)
How Jackie Collins sells so many books [wikipedia.org]? She uses too many verbs? I thought it was about the overly dripping romance themes that women seem to like?!?!
Re: (Score:2)
Don't forget: Successful books relied on:
verbs describing
.
All this time I thought adjectives described. Silly me. No wonder my great novel failed.
Re: (Score:2)
You're welcome.
Re: (Score:3, Informative)
Don't forget: Successful books relied on:
verbs describing
.
All this time I thought adjectives described. Silly me. No wonder my great novel failed.
If that's what you thought then yes, that's probably one of your problems. Compare the following sentences:
"He pitched the ball."
"He hurled the ball."
"He tossed the ball."
"He lobbed the ball."
"He chucked the ball."
Where's the adjective to describe the manner in which the ball moved? There isn't one. The verb gives you the description of HOW the ball moved.
In direct contradiction to this "algorithm", stronger writers tend to rely more on descriptive verbs, weaker writers tend to rely on less descriptive word
Re: (Score:2)
All this time I thought adjectives described.
Never mention grammar on Slashdot. It'll bring out more responses than a programming language flame war.
P.S. That's why I always got a laugh out of the stereotype that engineers and programmers are semi-literate. My experience is that many are sticklers for the language, and that's not just limited to grammar.
Re: (Score:2)
Many are.
And others can't tell the difference between "lose" and "loose", or "they're" and "their" and "there", or "where" and "wear", or "your" and "you're".
Those aren't exactly uncommon mistakes on /.
Re: (Score:2)
I always try to be careful about such things, but those differences are strictly about the stupidity of spelling in the English language. I think bad spellers are mostly people who believed their teacher's claims that English is more than vaguely phonetic. I also think some "rebels" should get together, decide on a single spelling for each set of homophones, and tell everybody else to go screw themselves. No, I haven't had the guts to do it myself yet.
Re: (Score:2)
That is ridiculous.
Reading Level (Score:5, Informative)
They began their research with Project Gutenberg, a database of 44,500 books in the public domain. A book was considered successful when it was critically acclaimed and had a high download count. The books chosen for analysis represented all genres of literature, from science fiction to poetry.
Then, they added some books not in the Gutenberg database, including Charles Dickens' "Tale of Two Cities," and Ernest Hemingway's "The Old Man and the Sea." They also added Dan Brown's latest novel, "The Lost Symbol," and books that have won the Pulitzer Prize, the National Book Award, and other awards.
Nowadays, marketing and signalling has as much to do with sales as anything else.
I imagine that if some publisher could make the kind of advertising push that Bill O'Reilly does,
they could put anything onto the NYTimes best seller list too.
Re: (Score:3)
All books written by politically active people like O'Reilly are nothing more than slush funds to funnel money towards a particular party or candidate. The Clintons have done it, Sarah Palin's a master of it... Your donors buy up your books, giving you fame, getting the press to talk about you... and then "donate" them to fund-raisers who "Give" them away to donors. It looks like you sold lots of books, you're all over the news because of it but no one's reading the book, not even the anchors claiming to inter
Re:Reading Level (Score:4, Interesting)
It's not just legit donors, either. One of the games these people play is to charge institutions speaking fees for a public appearance, part of which charge is the required purchase of, say, 5,000 books for their library or for "promotional purposes". The institution plays along, sending 90%+ of the books to be pulped the next day, and the speaker's sales stats get bumped. Ridiculous.
Re: (Score:2)
[...] Sarah Palin's a master of it...
Sarah Palin's handler(s)/management (team), more likely. We're talking about a person who thought the 2003 invasion of Iraq was (to paraphrase) "revenge for 9/11," or some such nonsense. In other words, I "betcha" there's little acumen of any utility rattling around in that skull of hers.
God I hate marketing.
I hope for all exposed beings to possess the wherewithal to resist for-profit and political propaganda in all of its forms, and manipulation therefrom, particularly anything shat out by the United States' six-headed corpora
Re: (Score:2)
They began their research with Project Gutenberg, a database of 44,500 books in the public domain.
Then, they added some books not in the Gutenberg database, including Charles Dickens' "Tale of Two Cities," and Ernest Hemingway's "The Old Man and the Sea." They also added...books that have won the Pulitzer Prize, the National Book Award, and other awards.
How does Project Gutenberg select its texts?
A book was considered successful when it was critically acclaimed and had a high download count.
"Critically acclaimed" by who and when?
How many of the most downloaded titles are on academically "required" or "recommended" reading lists?
The prize-winner can sometimes tell you more about the internal and external dynamics of the judging than the quality of the book,
Re: (Score:2)
Tale of Two Cities is in Gutenberg. That's where I read it from.
Marketing never hurts, but the advent of minimal-cost publishing via ebooks also has helped some authors. There are several best-selling authors who started out as "dollar discounts" from one of the e-publishers.
Re: (Score:2)
Tale of Two Cities is in Gutenberg. That's where I read it from.
Charles Dickens, Mark Twain and others were heavily marketed in the 19th century. It's not a 20th century invention. Speaking of Mark Twain, you'll find satire about advertising in "A Connecticut Yankee in King Arthur's Court". Thanks to the protagonist, there were knights running around with advertisements for toothpaste on their suits of armor.
Re: (Score:2)
And books on Project Gutenberg have more to do with which are on high school reading lists than anything else. I'd say 90% of the reading I've done of public domain books/poems was done for assignments.
Stagnation (Score:2)
Re:Stagnation (Score:5, Insightful)
On the upside, Noam Chomsky will be overjoyed by this development; soon software systems will be developed to 'generate' hit books. Someone get Angelina (Mike Cook's, not Pitt's).
I see, so Angelina Jolie used to be an academy-award-winning actress, but now she's just Mrs. Pitt?
Re: (Score:2)
I see, so Angelina Jolie used to be an academy-award-winning actress, but now she's just Mrs. Pitt?
She's an aging sack of bad plastic surgery who's been in too many terrible movies. A pretty good match for her hubby at that.
Re: (Score:2)
I see, so Angelina Jolie used to be an academy-award-winning actress, but now she's just Mrs. Pitt?
She's an aging sack of bad plastic surgery who's been in too many terrible movies. A pretty good match for her hubby at that.
That's why they didn't take each other's names :)
But seriously son she's the mother of six children, so stop being a douche
Re: (Score:2)
Sure they can. It's been happening for millennia. Have you been living under a rock?
What about marketing? (Score:2)
Authors fail to understand ... (Score:3)
Two quotes stand out for me:
"It's very difficult to quantify decisions that are often made by intuition and relationships."
The study claims that at least some of those decisions are quantifiable, which pretty much contradicts Hamilburg's point.
"Of stylistic characteristics, the scientists are flying in the face of most teaching of creative writing when they emphasize nouns over verbs. Verbs are the engine of fiction and quality writing is often measured by their variety, precision, and force,"
Hansen appears to have missed the point of the study: it is about what sells, rather than what's taught or what makes quality writing.
Re:Authors fail to understand ... (Score:5, Interesting)
However, the study's sample makes exactly the same mistake. They used Project Gutenberg as the source, and download counts as a substitute for sales. Sales has one measure: the number of dollars in the cash box at the end of the day. They should be measuring books on the NY Times bestseller list, or the Amazon Top 10 list, which have actually sold for money and are actually popular (fraudulently placed books aside.) And they should be comparing them against books from their own genres, or at least books that had similar attributes.
I think what they'd really find is that "books that sell well are those that are marketed well", regardless of the words they contain.
Maybe they could focus on a specific key reviewer: what does Oprah like and not like? Maybe when they cross compile the data from all the books, they will find they've only discovered Oprah's tastes. Which isn't a bad outcome, if they are ultimately trying to discover what kinds of books will be better positioned to make the author money. But I don't think they've come close to predicting fiction "best-sellers" yet.
Re:Authors fail to understand ... (Score:4, Interesting)
Success comes in two flavors.
Gutenberg is stacked with classics. Stuff that has been successful over a long period of time. Some classics were flops when they were first published and some go periodically in and out of favor.
The NYT bestseller list, Oprah, et al. focus on what's popular today. Relatively few books that make those lists will be popular in a century just as many of the bestsellers from Dickens' day would only be known to literary historians. And missing from Gutenberg.
Re: (Score:3)
I was commenting based on the title of the articles discussing the study: "Algorithm aims to predict fiction bestsellers"; and "Computer Algorithm Seeks to Crack Code of Fiction Bestsellers". The strong implications are that the algorithm is designed to unlock the secret of making money by writing books that contain certain words or linguistic structures. I'm arguing that a book's financial success has much less to do with any ephemeral "bestsellerness" quality, and has a much stronger association with "m
Re: (Score:2)
Tl;dr: marketing wins.
Well, I think it would be more fair to say marketing and current fads win. A bestselling author may not need to do any marketing at all, other than mentioning, "By the way, I'm coming out with another book," and it will probably still sell well. Books about famous people or written by celebrities will also often sell, regardless of whether they are marketed heavily. Similarly, books about current fads (diets, financial advice, etc.) may also sell pretty well -- the first book regarding a fad may need som
Re: (Score:2)
I just read the first few pages of the study, and it seems the authors tried to control for the "fame of the author" aspect as much as they could, with things like excluding a second text by the same author in the same genre, that sort of approach. And as suspected, the study is much more modest than the article titles suggest. They are looking for "success" as defined by their own criteria, not "money" or "bestsellers".
But it was the marketing hype that got me to read a study by some random researchers.
Re: (Score:3)
Gutenberg is stacked with classics. Stuff that has been successful over a long period of time. Some classics were flops when they were first published and some go periodically in and out of favor.
Or, in other words, what counts as a "classic" right now is simply what's popular today. I think the trends can be better seen in music history. Take, for example, Pachelbel's Canon in D [wikipedia.org], that piece which seemingly shows up everywhere as "classical music." Johann Pachelbel, however, was a master composer [wikipedia.org], well-known in his lifetime for all sorts of compositions. Today he has one stupid piece played at thousands of weddings and other occasions every year, just because of some whims of audiences in the la
Re: (Score:2)
can it explain... (Score:4, Interesting)
Perhaps they can explain why Fifty Shades did well despite being badly written.
There is a danger in this process that we end up with a "Save the cat" problem where everything has to follow a formula
http://www.slate.com/articles/arts/culturebox/2013/07/hollywood_and_blake_snyder_s_screenwriting_book_save_the_cat.html [slate.com]
Re:can it explain... (Score:5, Insightful)
50 shades is a textbook example of a perfect marketing campaign. It cannot fit an algorithm, it's a total outlier.
They sent out press releases to all the agencies about the new phenomenon of women using the wonderful anonymity of e-readers/tablets to read Mommy porn, like that "50 shades" thing.
Journalists just repeated the press releases, over and over again, almost exactly word for word, on various networks, because that's a topic that draws viewer attention.
And suddenly everyone knew that apparently a lot of people were reading that "50 shades" book, and that reading it was both cool and risqué. Jackpot.
I read one page of the book that was published on a website. It was worse than the transcript of a reality TV show. It wasn't just bad literature, it was barely passable English.
But the marketing was absolutely brilliant.
Re: (Score:2)
That's classic. I would prefer to read the book on the marketing campaign. It is original, brilliantly executed and delivered results. Forget the original book.
Re: (Score:2)
Re: (Score:2)
Re: (Score:2)
Re: (Score:2)
I am really surprised at this. I really like the series, but I would never consider it anything other than a somewhat bland very easy read. I think they need to review their formula, because I think HP is a text book example of a mass marketable, guilty pleasure/easy read, that everyone can enjoy.
Re: (Score:2)
Re: (Score:3)
But you know what? Nobody cares! It's the same with 50 shades, people don't read it because it's art. Women read it to get ideas and fantasies. And to be honest most porn sites don't cater to women, so they have a limited choice in the matter.
Re: (Score:2)
50 shades is a textbook example of a perfect marketing campaign. It cannot fit an algorithm, it's a total outlier.
I suspect that, almost by definition, many best-sellers are outliers. They owe their popularity to marketing, the whims of the book-buying public, what's currently trendy, etc. Like 50 shades of grey, they likely won't succumb to an algorithm.
Re: (Score:2)
Re: (Score:3)
I read Snyder's book because he was a friend of a friend. First off, it's not about *everything*. It's about movie scripts. Secondly it's a bit naive to blame the lack of creativity of modern movies on his book; that's a trend that predates 2005.
In any case screenwriters are nothing like the olympian figures playwrights are in theater. The main creative force in a movie is the director, and writers are relatively minor figures in the enterprise. In the theater the script is gospel. In the movies a direc
Re: (Score:2)
What Snyder did for screenplays was fine. He helped writers understand a structure. The problem was management being risk averse and insisting that all movies follow the Cat even in some cases to the exact minute. Management not knowing their industry and so constraining script writers from doing what they do best. It is the same problem where one successful movie comes out and suddenly there are lots of copies in the same genre simply because that is lower risk. None of this is Snyder's fault. One
Re: (Score:2)
Perhaps they can explain why Fifty Shades did well despite being badly written.
Because people like reading about sex. That's also why romance novels routinely feature good-looking half-naked people on the cover.
Think of it this way: If the movie is about sex, we'll put up with inane dialog, completely predictable plots, and wooden acting, just to watch a couple of people we'll never meet get it on. Why would you expect books to be much different?
Re: (Score:2)
I get that, but 50 shades is badly written sex. There is no shortage of better written books that will steam up your glasses.
Re: (Score:2)
The algorithm would be trying to guess how well the book would do on the market, not how well it was written.
How well a book is written has little to do with how many copies you can sell of it.
Quantum Literature (Score:2)
What a stupid idea (Score:3)
2. Write a book
3. Profit!!!
I just wrote an algorithm that predicts that no book detailing the death of creativity at the hands of science will ever be written.
Uck (Score:3)
Re:Uck (Score:5, Funny)
Re: (Score:2)
Does this article make everyone else as sick as it makes me?
Nope, I got no idea what you are talking about. In fact, I found it pleasant.
Acknowledging large shortcomings of their study, the one thing they seem to find was that if you want your fiction book to remain popular with a broad audience, you should take my middle school English teacher's advice and show don't tell.
They came up with no magic "save the cat" formulas to make hits, and the industry expert says that this study won't help him much; stories are still too complex to predict best sellers.
Further, they p
Waste of time (Score:2)
These things don't actually work. They're curiosities and nothing more.
When they finally develop strong AI... then you might have something. But a non-intelligent system is not going to figure these things out.
The illusion of choice made real. (Score:2)
Preceding any great scientific advancement or discovery it is no accident that you will find a surge in the fiction and cultural themes surrounding it.
The New World, Forensics, Avionics, Electronic Computing, Nuclear Reaction, Rocketry, Robotics.
The cultural mind thinks as you do. Its subconscious boils with the direction it will soon take. Ask yourself: What is seen much more now in your culture? What makes you think you have any choice but to latch onto any thoughts but those which come to mind from wi
here's the article (Score:2)
Anyway, I'm a little worried about the methodology. If you train on PG, and test on PG your generalization error will suffer. This is especially easy to get wrong when both the train and test set are constructed repeatedly with various thresholding rules, and the classifier features are (presumably) optimized during the resear
Re: (Score:2)
You're a cunt for linking to a URL shortener instead of the article directly. You're as bad as the shithead editors. If you're worried about goatse or similar, don't use some shit URL shortener.
http://aclweb.org/anthology/D/D13/D13-1181.pdf [aclweb.org]
There. Easy. Why couldn't you have done that you high-horsed cunt?
Re: (Score:2)
http://bit.ly/1dgDo7d . Come on slashdot editors, do the legwork and link the article directly!
Come on, martin, do the legwork and link [aclweb.org] it directly. This isn't twitter and most folks are wary of shortened links; trolls love hiding their goatse and tubgirl links. I only clicked it because your UID is relatively low and you hadn't (yet) been modded down.
Already done - albeit in fiction (Score:2)
It's already been done - though only in fiction.
Roald Dahl wrote about a machine called the Great Automatic Grammatizator. A machine that you plug in various parameters - such as type of book, characters, proportions of violence/sex/humour - and it churns out something that's pretty much guaranteed to be a bestseller according to those parameters in fifteen minutes flat. Being a writer himself - and a somewhat dark one at that - the end result was a dystopian universe in which writers were forced to give up
algorithm (Score:2)
Remember, there's a HUGE difference between successful and "good".
"Successful" means appealing to the dozen or so big publishers' editors, such that they are willing to pimp your book and market it. They can - and have, obviously - taken utter crapola to the top of the "bestseller" lists.
I entirely understand that the algorithm favors deep internal monologues, because those editors clearly love them.
Re: (Score:2)
there's a HUGE difference between successful and "good"
"Good" is subjective. It's some sort of consensus amongst people who are, for whatever reason, considered literary experts. Consider the "classics". Some are good and some suck. I tried to read "Moby Dick" and found the perfect cure for insomnia. People said "just get past all the boring and extraneous stuff". Sorry, but if a book is full of boring and extraneous stuff, then it's not a good book. Maybe it would have been if Melville had had a good editor. OTOH some classics are great. I just read "All Quiet
Hindsight is 20/20 (Score:2)
Their algorithm had as much as an 84 percent accuracy rate when applied to already published manuscripts
I could write an algorithm that's 100% accurate selecting yesterday's lottery numbers.
Re: (Score:2)
I could write an algorithm that's 100% accurate selecting yesterday's lottery numbers.
That's why data analysts cross-validate their models. Granted, cross-validation doesn't cure everything (e.g. if the question is already overly specific, or if the analyst double dips in some other way) but it will stop over-fitting and performing at 100%. I downloaded the paper and did a quick search: the authors used a support vector machine for the classification (which effectively allows for fitting of very non-linear boundaries) and they tested it with 5-fold cross-validation. So they given that the
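For anyone who hasn't seen cross-validation in action, here is a rough sketch of the idea using only the Python standard library. To be clear about the assumptions: this is not the paper's code, the "classifier" is a deliberately dumb memorizer rather than a support vector machine, and the book IDs and success labels are made up. The names `k_fold_indices`, `cross_validate`, `train_memorizer`, and `predict_memorizer` are purely illustrative.

```python
import random


def k_fold_indices(n_items, k, seed=0):
    """Shuffle item indices and deal them into k roughly equal folds."""
    indices = list(range(n_items))
    random.Random(seed).shuffle(indices)
    return [indices[i::k] for i in range(k)]


def cross_validate(features, labels, train_fn, predict_fn, k=5, seed=0):
    """Return the mean accuracy on held-out folds, training on the rest each time."""
    accuracies = []
    for held_out in k_fold_indices(len(features), k, seed):
        held = set(held_out)
        train_idx = [i for i in range(len(features)) if i not in held]
        model = train_fn([features[i] for i in train_idx],
                         [labels[i] for i in train_idx])
        correct = sum(predict_fn(model, features[i]) == labels[i] for i in held_out)
        accuracies.append(correct / len(held_out))
    return sum(accuracies) / k


def train_memorizer(xs, ys):
    """A 'yesterday's lottery numbers' model: it just memorizes what it saw."""
    return dict(zip(xs, ys))


def predict_memorizer(model, x):
    """Look the answer up; guess 'not successful' for anything unseen."""
    return model.get(x, False)


# Made-up data: 100 hypothetical book IDs, half labelled successful at random.
rng = random.Random(42)
book_ids = list(range(100))
labels = [True] * 50 + [False] * 50
rng.shuffle(labels)

# Scored on the same data it memorized, the model looks perfect.
model = train_memorizer(book_ids, labels)
train_accuracy = sum(
    predict_memorizer(model, b) == y for b, y in zip(book_ids, labels)
) / len(book_ids)

# Scored on held-out folds, it is no better than guessing.
cv_accuracy = cross_validate(book_ids, labels, train_memorizer, predict_memorizer, k=5)
print(f"accuracy on seen data: {train_accuracy:.0%}, 5-fold CV accuracy: {cv_accuracy:.0%}")
```

The memorizer is the lottery example in miniature: perfect on books it has already seen, coin-flip on books it hasn't. Held-out scoring is what separates the two cases, though as noted above it does nothing about double dipping done outside the folds.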
A block buster? (Score:3)
This is nothing new - or troublesome. (Score:2)
What the algorithm looked at was writing style. That's hardly new. Teachers have been recommending this or that writing style, probably since the preferred medium was stone tablets. Slavish devotion to such recommendations is obviously undesirable, and a few outliers and experiments are necessary if you don't want writing styles to become stultified. But taking some advice about it is nothing new or undesirable. This study said nothing about structure (for which there are also standard recommendations) or s
Extreme selection bias. (Score:2)
If you read the article they're not really examining best sellers at all. A site like Gutenberg has no correlation with modern best sellers.
Film, TV and Internet have all had drastic effects on the market as well. Thus old books aren't really representative.
Obviously (Score:2)
Writers aren't going to spend effort to create a well written book about subjects people aren't interested in.
This is *really* a BAD idea (Score:2)
Get something that, krufted up, will work... and the publishers will use it, rather than have readers decide what should be published. You like the crap packaged as "music" from the members of the RIAA? You'll see that in books, too....
mark
Predicting the past? (Score:2)
"[they believe they have found an algorithm that might] predict which fiction books will be successful. Their algorithm had as much as an 84 percent accuracy rate when applied to already published manuscripts in Project Gutenberg and other sources."
I can predict the success rate of already published books with 100% accuracy.
Backtesting is usually bogus because it means nothing unless the experimenter can precisely enumerate the total number of rules that were formulated and discarded--including those formul
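A toy simulation of the discarded-rules problem, as a sketch under loud assumptions: every "rule" below is nothing but a list of coin-flip guesses, the outcomes are coin flips too, and `best_backtested_rule` is a hypothetical name I made up for illustration. None of this comes from the study.

```python
import random


def accuracy(predictions, outcomes):
    """Fraction of predictions that match the outcomes."""
    return sum(p == o for p, o in zip(predictions, outcomes)) / len(outcomes)


def best_backtested_rule(n_rules, n_past, n_future, seed=0):
    """Try n_rules random rules, keep the best backtest, then score it going forward."""
    rng = random.Random(seed)
    past = [rng.random() < 0.5 for _ in range(n_past)]
    future = [rng.random() < 0.5 for _ in range(n_future)]
    best_past, best_future = -1.0, 0.0
    for _ in range(n_rules):
        # Each rule is a fixed list of coin-flip guesses with no real signal in it.
        rule = [rng.random() < 0.5 for _ in range(n_past + n_future)]
        past_score = accuracy(rule[:n_past], past)
        if past_score > best_past:
            best_past = past_score
            best_future = accuracy(rule[n_past:], future)
    return best_past, best_future


backtest, forward = best_backtested_rule(n_rules=1000, n_past=20, n_future=20)
print(f"best rule on past data: {backtest:.0%}, same rule on new data: {forward:.0%}")
```

With enough discarded rules, the survivor's backtest score looks impressive even though nothing in it can predict anything. That is why the count of rules tried matters as much as the winning score.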
Re:If I had a penny (Score:5, Funny)
Oh, if I had a penny for every time an algorithm aimed to do something...
on (anyAlgorithmProposed) {
give yourself a penny
}
Re:If I had a penny (Score:4, Funny)
Add friendly vampires. If that doesn't work, add werewolves. Alternate version: zombies.
Re: (Score:3, Insightful)
So, a love triangle with a vampire, a werewolf, and a girl with the emotional depth of a zombie?
Re: (Score:2)
I keep thinking it's time for some of the other undead to break out. Where's the best-selling ghoul story? Maybe a skeleton romance (shame the title "Lovely Bones" is already taken). The Mummy had one and a half good movies out of three, so there's probably more potential there. Wistful wights, wandering wraiths, maybe a misunderstood banshee - you can be friends as long as you leave your earplugs in.
Nearly two decades ago I read a comment by an author (sadly can't remember who at the moment) who asked for
Re: (Score:2)
I keep thinking it's time for some of the other undead to break out. Where's the best-selling ghoul story?
I couldn't write a story like that. Well, I could but it would suck because my heart wouldn't be in it. It takes a special kind of ghoul to hack out nonsense that people will pay for that the author has no interest whatever in.
Re: (Score:2)
Lamia story -- I think Tim Powers wrote one. Set in Venice, maybe? and excellent.
Partially. Had Vampires, too. In the unique way that only Tim Powers can do such things.