## Bayesian Filters Predict Sundance

JohnGrahamCumming writes "The LA Times reports on a company's use of Bayesian filtering to predict the winners at the Sundance Film Festival. They use a modified POPFile email filter and claim an 81% success rate."

• #### It goes like this: (Score:4, Funny)

on Tuesday January 24, 2006 @10:02AM (#14548254) Homepage
Gay = +100%
• #### Re:It goes like this: (Score:5, Funny)

on Tuesday January 24, 2006 @10:34AM (#14548482)
Gay cowboys eating pudding = +110%
• #### Re:It goes like this: (Score:1, Troll)

I got your South Park reference, but I guess Slashdot doesn't stoop to humor that low.

...
• #### Re:It goes like this: (Score:2)

Parent contains spoilers!

(I didn't know they were gay! :P)

• #### Re:It goes like this: (Score:1)

So how does the eating of pudding factor in?
• #### Re:It goes like this: (Score:2)

Penis pudding
• #### The Winner! (Score:2, Funny)

by Anonymous Coward
Tortured with health problems? You're one click away from healthy life! An amazing variety of licensed meds at one big store! Click the link and make your first step to constant relief!
• #### Shocking news! (Score:4, Funny)

<big.nothing@bigger.com> on Tuesday January 24, 2006 @10:03AM (#14548268)
So, a company claims that their product (or in this case; algorithm) is good?

STOP THE PRESS!

• #### An algorithm that works (Score:5, Funny)

on Tuesday January 24, 2006 @10:31AM (#14548462)
So, a company claims that their product (or in this case; algorithm) is good?

Well according to their algorithm, certain words such as Africa, America, American, beautiful, black, best, emotional, fascinating, great, inspired, lake, new, riveting, Sundance, sexy, story, subtitles, truth, vision, world should never be used.

My 'kiss of death' film would be:

"The Beautiful Lake: An African Vision of the World"

Description: An emotional story of truth about a man from Africa who comes to America to find himself. Being a skilled carpenter, he builds a new home which is set on a beautiful lake. As we hear anecdotes of his vision of truth, a fascinating story emerges. We also learn about his riveting and inspired adventure to his new home, and we see how it impacts his once black view of the world. A great film for any Sundance enthusiast! (with sexy subtitles)

It is almost guaranteed to bomb, before anyone even sees it!
• #### The Golden Movie (Score:1)

From the article:

Golden: academic, accomplished, bedroom, complex, dialogue, dream, death, focus, girl, human, high, journey, love, mother, narrative, romance, relationship, superbly, sex, ultimately.

Therefore, coming soon to a theater near you:

The Contortionist

This academic work involves an accomplished contortionist, her bedroom, and many complex, dialogue-strewn dreams that focus on girl-girl scenes with animals as well as humans. Everyone is high on life in this journey through love, motherho

• #### Re:An algorithm that works (Score:2, Funny)

I've gone ahead and compiled a similar list with respect to /. posts: Golden: "insensitive clod," "tinfoil hat," "Soviet Russia," "overlords," and "M\$" Kiss of Death: "the honorable Jack Thompson," "the RIAA acted appropriately," etc.
• #### Re:An algorithm that works (Score:2)

It is almost guaranteed to bomb, before anyone even sees it!

That's what they said about "Springtime for Hitler"...
• #### Re:An algorithm that works (Score:1)

Challenge Problem: Write the most kiss-of-death accurate review you can for a film that actually won a Sundance award. Bonus points for avoiding all the words on the golden list.

• #### Re:Shocking news! (Score:4, Insightful)

on Tuesday January 24, 2006 @10:41AM (#14548532)
Yeah, I get so tired of people publishing probability success rates without stating what the baseline is.

For example, I could announce I have an 85% accurate weather prediction system. It's this: predict the sun will shine most of the day. Nowhere does it rain all day more than 15% of the days, so my predictor is 85% accurate.

When you claim an accuracy you need to also give the null model accuracy or it's gibberish.
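
A minimal sketch of that comparison, assuming made-up weather data (85 sunny days, 15 rainy ones). The function names and the data are hypothetical; the only point is that a claimed accuracy should be read next to what "always guess the most common outcome" would score.

```python
from collections import Counter

def accuracy(predictions, actual):
    """Fraction of predictions that match the actual labels."""
    return sum(p == a for p, a in zip(predictions, actual)) / len(actual)

def null_model_accuracy(actual):
    """Accuracy of a 'model' that always predicts the most common label."""
    most_common_count = Counter(actual).most_common(1)[0][1]
    return most_common_count / len(actual)

# Hypothetical data: 100 days, 85 sunny and 15 rainy.
days = ["sun"] * 85 + ["rain"] * 15
always_sun = ["sun"] * len(days)

claimed = accuracy(always_sun, days)
baseline = null_model_accuracy(days)
print(f"claimed {claimed:.0%}, null model {baseline:.0%}")
```

Here the "predictor" and the null model score the same, so the 85% figure carries no information by itself.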
• #### Re:Shocking news! (Score:4, Informative)

on Tuesday January 24, 2006 @10:55AM (#14548640) Homepage
nowhere does it rain all day more than 15% of the days.

Time to brush up on geography. It rains pretty much all the time in Cherrapunji [wikipedia.org].

• #### Re:Shocking news! (Score:2)

True, but averaged over the whole planet I'm still 85% correct.
• #### Re:Shocking news! (Score:1)

For example, I could announce I have an 85% accurate weather prediction system. It's this: predict the sun will shine most of the day. Nowhere does it rain all day more than 15% of the days, so my predictor is 85% accurate.

Actually you can claim that the sun will shine all day long. It may shine on the cloud tops instead of the ground, but it will shine.

That and don't move to Seattle.
• #### Re:Shocking news! (Score:2)

Actually, most weather systems take 4 to 5 days to move through a region, so if you simply predict "Tomorrow will be about the same as today", you will be right 80% of the time.

• #### Re:Shocking news! (Score:1)

Not to mention the 10 years they did the analysis on doesn't seem like a long enough time to draw any reliable conclusions.

And keep in mind that 67.8% of all statistics are made up on the spot. ;)

I'll bet those guys would describe a divining rod as a scientific means to find water.
• #### Fuck films... (Score:3, Insightful)

on Tuesday January 24, 2006 @10:04AM (#14548274)
...let's see it predict STOCK WINNERS.
• #### Re:Fuck films... (Score:1)

No kidding. Predict something that noone else can predict. Predict whether Iran will create weapons with their nuclear power facilities. Predict how long it will be before the rain forests are completely wiped out. Predict when the ozone layer will be totally depleted. Predict when Microsoft will ship a secure version of Windows. :)
• #### Re:Fuck films... (Score:4, Funny)

on Tuesday January 24, 2006 @10:19AM (#14548370)
"Predict something that noone else can predict."

Who IS this Noone [urbandictionary.com] guy? I keep hearing his name all over the place. He must be bigger than Jesus.
• #### Re:Fuck films... (Score:2, Flamebait)

No one (Noone) is bigger than Jesus :)

Did I get the joke or was that unintentional? :)
• #### Re:Fuck films... (Score:1, Interesting)

No, that wasn't the joke. I'm not a Christian.
• #### Re:Fuck films... (Score:2)

Reminds me of one of my favourite jokes, from an extremely obscure UK radio comedy programme - the characters had all been getting mysterious notes, from people like Mr N.O. Body, and Mr E. Nigma, and the final character said he just got a note from his milkman. He checked the note:

"Wait a minute, it's signed Mr Noone! You know what this means!"
"Yes!"
"It's from Peter Noone!"
"No, you idiot! Don't you see? He's Mr No-One!"
"Oh, now, he is, yeah, but he was big in the 60s."

I'll get me coat.
• #### Re:Fuck films... (Score:1)

Actually, according to google he is:

• Google for noone: 144.000.000 results
• google for jesus: 73.100.000 results

cheers, majello

• #### Re:Fuck films... (Score:2)

...that Noone guy is a real looser...

• #### Re:Fuck films... (Score:2)

Yeah, he's rediculous. Its a shame.
• #### Re:Fuck films... (Score:2)

definately
• #### Re:Fuck films... (Score:2)

Next week. 2014. 2015. April 6, 2063 (with a little help for their faulty logic).
• #### Re:Fuck films... (Score:2)

Why stock? People bet on all sorts of things. I would be shocked if you couldn't bet on the Sundance winners. So the real question is, if they had used their predictions as a betting strategy, what would their return on investment be?

That would give a good indicator of how much they're simply predicting the favorites (not much return) or accurately predicting surprises.
• #### Re:Fuck films... (Score:5, Informative)

on Tuesday January 24, 2006 @10:28AM (#14548432) Homepage
There are many examples of using statistics and artificial intelligence in finance (go google), including some applications to predict stock prices. Even a decade ago, books like "Neural Networks in Finance and Investing" and "Artificial Intelligence in the Capital Markets" were already published, along with hordes of books on statistics in finance (think about what Quants do).

Of course, I don't think we can yet predict stock prices with the same 81% accuracy as in this article. And, if anyone could, they would be wise to keep it to themselves.
• #### Re:Fuck films... (Score:1)

Of course, I don't think we can yet predict stock prices with the same 81% accuracy as in this article. And, if anyone could, they would be wise to keep it to themselves.

Surely if someone worked this out they would then make money by flogging the method to other people through infomercials, public speaking and self-help books, wouldn't they?!

I mean, that's what everyone else does that figures out how to get rich. You don't actually do it, you teach everyone else how to do it and make money out of it that
• #### Re:Fuck films... (Score:1)

Of course, I don't think we can yet predict stock prices with the same 81% accuracy as in this article. And, if anyone could, they would be wise to keep it to themselves.

Quite the opposite.

If I advertised a program that predicted stock prices with 81% accuracy, a very large number of people would buy/sell based on its results, making it self-predicting, at least in the short term.

• #### Feedbacked system (Score:2)

If someone were to use AI to predict the stock market, and would invest on it based on those predictions, they would be very successful initially, but would also change the behaviour of the same market up until it would render the model unusable.

I suspect this has happened several times.
• #### Re:Fuck films... (Score:2, Funny)

Sorry, I don't want to have to drill a hole in my head.
• #### Re:Fuck films... (Score:3, Funny)

** REPORT RESULTS: Bayesian Query = 'STOCK WINNERS' **

George W. Bush
Dick Cheney
Darl McBride
• #### Re:Fuck films... (Score:1)

"** REPORT RESULTS: Bayesian Query = 'STOCK WINNERS' **

George W. Bush Dick Cheney Darl McBride"

You seem to have mis-spelled "WIENERS".
• #### Re:Fuck films... (Score:2)

...let's see it predict STOCK WINNERS.

However, you may not be able to afford more than one stock.
• #### Re:Fuck films... (Score:1)

Hey, that sounds like a good idea for a film! I bet people would like to watch other people fucking on the screen! I'm a-gonna win the next Sundance!
• #### Another method to predict the winners (Score:2, Interesting)

Bring a decibel meter and a stopwatch and find the films with the loudest and longest:

1) Laughter
2) Applause
3) Standing Ovations afterward

This simple method will give you a good idea of who will be the winners.
• #### Re:Another method to predict the winners (Score:1)

I thought this was how to measure which director could stack the audience with the most friends! :)
• #### Re:Another method to predict the winners (Score:1)

Bring a decibel meter and a stopwatch and find the films with the loudest and longest:

1) Laughter
That might not be very accurate, as most Sundance Grand Jury winners [imdb.com] are often not among the happiest of movies.
• #### Re:Another method to predict the winners (Score:1)

I call bollocks. Biased/bribed Judges anyone?

That doesn't account for a crap film like Titanic picking up 11 Oscars, for example. Or the first two "Lord of the Rings" movies getting shafted by inferior films either.
• #### Unimpressed (Score:2, Insightful)

"Our engineers were thinking that determining whether a movie is good or bad could be similar to determining whether e-mail is spam or not," said Unspam Chief Executive Prince, 31, who loves the festival and uses it as a recruiting tool. "We had the last 10 years of the festival's film guides, which are like inputs, and then a bunch of outputs, like how many people saw a film, did it win anything at Sundance, did it have commercial success. If you could figure out the pattern between the inputs and the outp
• #### Re:Unimpressed (Score:2, Insightful)

That depends. If it predicts and filters 84% of all spam, then it can't be anything but good. However, if 84% of what it predicts and filters is indeed spam, then 16% was not and was filtered needlessly - that's bad.
• #### Re:Unimpressed (Score:1, Troll)

I'm not a Spam guru so please excuse me if I'm wrong, but isn't 81% a horrible result? Perhaps not for movie prediction but in Spam filtering?

Perhaps they should use spam filtering for weather reporting. That way, the "dart throwing monkeys" will end up with more accurate results than they do now. "There's a 30% chance of rain." I have always wondered if a passing grade in meteorologist college coursework was 30% or better.
• #### Re:Unimpressed (Score:2)

Argh, I hate feeding the trolls... Anyway, meteorological data is very difficult to predict. Simple storms and weather patterns can easily be altered within a couple of days' time. The number of influences on weather is enormous, and simple ideas such as the butterfly effect (not the movie) can create huge effects within a month's time. Yes, the weather channel doesn't have the best success rate, but considering the number of molecules involved in weather fronts, our computers aren't exactly suited to the j
• #### Re:Unimpressed (Score:1)

I'm unimpressed with your own weather-watching skills, not to mention your math. Have you ever taken a lesson in probability? 30% is about 1/3. So a 30% chance of rain means that about 2/3 of the time, it isn't going to rain! You might even get bright, clear skies.
• #### Re:Unimpressed (Score:3, Informative)

The problem is that saying it is "81% successful" is meaningless. Typically one would use a two-fold measure of success for these sorts of application: precision and recall. In the case of spam, the precision of your algorithm would be the number of correctly marked emails over the total number of emails marked, and the recall would be the number of correctly marked emails over the number of emails that are actually spam.
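
The two measures as defined above can be sketched in a few lines. The message IDs and the `precision_recall` helper are hypothetical, chosen only to show how one filter can look good on one measure and poor on the other.

```python
def precision_recall(marked_spam, actually_spam):
    """Precision and recall for a set of messages the filter marked as spam."""
    marked = set(marked_spam)
    actual = set(actually_spam)
    correct = marked & actual  # messages marked as spam that really are spam
    precision = len(correct) / len(marked) if marked else 0.0
    recall = len(correct) / len(actual) if actual else 0.0
    return precision, recall

# Hypothetical message IDs.
marked = {1, 2, 3, 4, 5}           # what the filter flagged
spam = {1, 2, 3, 4, 6, 7, 8, 9}    # what was really spam

p, r = precision_recall(marked, spam)
print(p, r)  # 0.8 0.5
```

With these numbers the filter is right about 80% of what it flags but catches only half of the spam, so a single "success rate" would hide one of the two.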

In terms of search this is perhaps more clear, so consider Google. You issue Google
• #### Filter Mods (Score:5, Funny)

by Anonymous Coward on Tuesday January 24, 2006 @10:12AM (#14548323)
Angsty +2
Depressing +2
Happy or Inspirational -1
Featuring characters of a marginalized societal group +10
Featuring characters of a majority societal group -10
Making those majority characters feel guilty +20
Political Agenda +10
Social Agenda +10
Leftist Social & Political Agenda +50
Non-acting acting +3
Use of black and white film +1
Sense of Humor -5
Comedy film -100
Intellectual +1
Pseudo-intellectual +30
Director dresses in all black +4
Actors dress in all black +10
Actors dress in all black and do interpretive dance to Philip Glass music while speaking German backwards +20
Audience participates and dances with the actors in above scenario +1000
Would actually generate box office revenue -100
Good movie that would appeal to more than a niche audience -20
• #### Kiss of Death? (Score:1, Funny)

by Anonymous Coward
Prince and his crew came up with two lists: words that "make you golden" or are "the kiss of death."

Kiss of death: Africa, America, American, beautiful, black, ...

Prince went on to comment they were surprised to come up with the first racist Bayesian filter in their career.
• #### Re:Kiss of Death? (Score:2)

Your little list includes "America" and "beautiful", so who cares?
• #### Fit your stereotype? (Score:5, Interesting)

on Tuesday January 24, 2006 @10:19AM (#14548374)
From TFA (words in the description that help or hurt it): "Golden: academic, accomplished, bedroom, complex, dialogue, dream, death, focus, girl, human, high, journey, love, mother, narrative, romance, relationship, superbly, sex, ultimately. Kiss of death: Africa, America, American, beautiful, black, best, emotional, fascinating, great, inspired, lake, new, riveting, Sundance, sexy, story, subtitles, truth, vision, world." So, they want complex, academic films about girl-mother relationships with a strong narrative of romance and sex. Nothing about beautiful black people in Africa or America with any sort of interest in visions, truth, or the world, especially if said black people are sexy and live near a great, nay, the best lake.
• #### Re:Fit your stereotype? (Score:1)

Oh get off it. You have a list of words completely taken out of context and you're turning it into some "everyone at Sundance is racist" nonsense.

The real difference between the two lists is that the first list is more concrete and the second is more abstract. I'm not surprised to hear that fascinating, beautiful and emotional are in the list. Those words are the hackneyed descriptions of every art house critic's favorite film. People are probably sick of hearing them and ignore them like a David Manning r
• #### Re:Fit your stereotype? (Score:1)

I did once analyse titles of British TV programmes [membled.com] to find keywords that make me more or less likely to want to watch a show.
• #### Re:Fit your stereotype? (Score:2)

...especially if said black people are sexy and live near a great, nay, the best lake.

People are trying to make a case that some predictor words for winning a Sundance award (America, beautiful, Africa, black, ...etc.) imply that films about black Africans or beautiful America win awards.

What everyone is missing is that these terms are NOT RELATED in the Bayesian filter - they are just words. It is the human brain that is incorrectly associating 'America' with 'beautiful' or 'black' with 'Africa' or 'black
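
To make the "just words" point concrete, here is a simplified naive-Bayes-style sketch. It is not POPFile's or the company's actual code: the tiny training sets, the add-one smoothing, and the function names are all assumptions. Each word gets its own log-odds, and a description's score is simply the sum, so no word knows which other words it appeared with.

```python
import math
from collections import Counter

def word_log_odds(winner_docs, loser_docs):
    """Per-word log-odds of 'winner' vs 'loser', with add-one smoothing."""
    win = Counter(w for d in winner_docs for w in set(d.lower().split()))
    lose = Counter(w for d in loser_docs for w in set(d.lower().split()))
    odds = {}
    for w in win.keys() | lose.keys():
        p_win = (win[w] + 1) / (len(winner_docs) + 2)
        p_lose = (lose[w] + 1) / (len(loser_docs) + 2)
        odds[w] = math.log(p_win / p_lose)
    return odds

def score(description, odds):
    """Sum of independent per-word scores; unknown words contribute nothing."""
    return sum(odds.get(w, 0.0) for w in set(description.lower().split()))

# Hypothetical training descriptions.
winners = ["complex journey", "complex romance"]
losers = ["beautiful lake", "beautiful vision"]

odds = word_log_odds(winners, losers)
print(score("complex journey", odds), score("beautiful lake", odds))
```

Because the score is additive, "beautiful" costs a description the same amount whether it sits next to "lake" or next to "journey".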
• #### Questionable categories (Score:1)

They picked "Somebodies" as a successful drama. Drama? Not really; definitely a flat out comedy. Good stuff. Sundance is cool, but their ticketing system is getting progressively worse. This year I paid \$5 for the chance to pick movies in a ½ hour slot 3 days after the box office opened. Didn't get even one of my first choices; essentially got what was left.
• #### Bayesian for Slashdot (Score:5, Interesting)

<Bhima DOT Pandava AT gmail DOT com> on Tuesday January 24, 2006 @10:27AM (#14548429) Journal

Someone should develop a client side Bayesian Filter / Moderation system for Slashdot.

A sizable portion of people around here are not consistently assholes so it doesn't really make sense to add them to a "foe" list.
Frequently things are in strange topics so it doesn't make sense to ignore whole topics.
Not all new members are trolls so modding all new members down doesn't make sense either.
And the current moderation system is subjected to other people's current peeves and political leanings.

And please don't tell me to do it, I'm an embedded developer not a web developer... I have no idea where to even begin with it.
• #### Re:Bayesian for Slashdot (Score:3, Interesting)

Yeah- I've wanted a site like digg/slashdot that worked like this for a while- users can vote on anything, and then anything you haven't voted on is given a score that is calculated according to how the people who most consistently vote in agreement with you score the story/comment. The site is custom-tailored to what you want- People who like stupid crap will mod up stupid crap and get more stupid crap because other people who like stupid crap will have modded up the same stupid crap and more, while people
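
A rough sketch of the scoring idea described above, assuming votes are simply +1 or -1 and that "agreement" is the average product of two users' votes on the items both have voted on. The user names, comment IDs, and both helper functions are hypothetical.

```python
def agreement(votes_a, votes_b):
    """Average agreement on shared items: +1 always agree, -1 always disagree."""
    shared = votes_a.keys() & votes_b.keys()
    if not shared:
        return 0.0
    return sum(votes_a[i] * votes_b[i] for i in shared) / len(shared)

def personal_score(user, item, all_votes):
    """Score an item from the votes of people who tend to agree with `user`."""
    total = weight_sum = 0.0
    for other, votes in all_votes.items():
        if other == user or item not in votes:
            continue
        weight = agreement(all_votes[user], votes)
        if weight <= 0:
            continue  # ignore people who tend to disagree with this user
        total += weight * votes[item]
        weight_sum += weight
    return total / weight_sum if weight_sum else 0.0

# Hypothetical votes: +1 is a mod up, -1 is a mod down.
votes = {
    "alice": {"c1": 1, "c2": -1},
    "bob": {"c1": 1, "c2": -1, "c3": 1},
    "carol": {"c1": -1, "c2": 1, "c3": -1},
}

print(personal_score("alice", "c3", votes))  # 1.0
```

Alice has never voted on c3, but bob votes like her and liked it, while carol votes the opposite way and is ignored, so the comment gets a high personal score for alice.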
• #### Re:Bayesian for Slashdot (Score:3, Interesting)

Check out http://reddit.com/ [reddit.com]. At least, once it isn't broken. It's a news aggregation site like slashdot/digg, but incorporates some of what you are looking for.
• #### Re:Bayesian for Slashdot (Score:2)

That's pretty cool- I wish it had brief summaries of the articles and threaded conversations. Digg, Slashdot, Reddit... they've all got good pieces, but are lacking some feature that the others have...
• #### Re:Bayesian for Slashdot (Score:5, Interesting)

<wgrother@@@optonline...net> on Tuesday January 24, 2006 @10:49AM (#14548599) Journal
And the current moderation system is subjected to other people's current peeves and political leanings.

Which is what makes it so much fun!

Seriously, it's wonderful that Bayesian filters are useful, but why put blinders on? Slashdot would simply cease to be interesting if you could will away anything you didn't like. Intelligent discourse requires an airing of all sides of an issue and theoretically this can lead to consensus building, if the best parts of all ideas are combined. Of course you're going to get people with very little to say, or very little between the ears, muddying the waters -- the challenge is to take the disparate elements and meld them to something coherent. Superfluous elements will be winnowed out and hopefully the end product is something most people can agree on.

Of course this is Slashdot, the Internet equivalent of a bar brawl. The rough-and-tumble of this kind of forum is what keeps it interesting and more importantly, as much as we are infuriated by those who don't agree with us, makes us think.

• #### Re:Bayesian for Slashdot (Score:2)

I don't think a Bayesian moderation system would necessarily prevent you from seeing any opposing viewpoints. I often moderate up comments that I disagree with if the submitter has an interesting point, or if it is necessary for some intelligent reply to make sense. I often wish /. had a view where I could see all 3+ comments as well as all their parents. That way, if someone makes an insightful reply I can read what they were replying to without having to open the parent link in another tab or browse at -1.
• #### Re:Bayesian for Slashdot (Score:3, Interesting)

I think you are looking at it the wrong way:

Using the current mod system on Slashdot you are using someone else's blinders.
Using the Friend / Foe system you are using a static subset.

Less than 20% of the comments around here are either meaningful, thought provoking, or relevant... I want to see those that truly are interesting and between the current mod system and the outright volume I can't in the amount of time I'm willing to spend reading Slashdot.

Slashdot is not like the Internet equivalent of a bar br
• #### Re:Bayesian for Slashdot (Score:2)

Using the current mod system on Slashdot you are using someone else's blinders.
Using the Friend / Foe system you are using a static subset.

True, though the system is flexible enough that I'm not required to mod categories and/or people up or down. I've determined over time that adding/subtracting points based on relationships here is a double-edged sword. I often actually want to see what people who don't like me are saying, to get some sense of why and to challenge them on a fundamental level, if I can

• #### Re:Bayesian for Slashdot (Score:1)

I was under the impression that bar brawls were physical and considering the average /. reader, we would lose. Perhaps this should be the Internet equivalent of an angry debate.
• #### Re:Bayesian for Slashdot (Score:2)

And please don't tell me to do it, I'm an embedded developer not a web developer... I have no idea where to even begin with it.

But CGIs are embedded scripts! ;)
• #### And the winner is... (Score:2, Funny)

BUY Ch 3ap \/iag r a 0n1i ne - n0 prescr1pti0n r3quir3d!!!!
• #### A better thing (Score:2, Interesting)

This [slashdot.org] was a far better (and open source) application of Bayesian filters
• #### Instructions on completing your Oscar ballot form. (Score:2, Funny)

Does it portray women as victims? +3

Does it star a beautiful actress with ugly makeup +1

Does it deal with weighty issues? +1

Is it science fiction? -3

Does it show how minority groups are oppressed? +2

Does it star people from a minority group who haven't received Oscars for a few years? +2

Did you cry? +2

Was it made by an action movie director turned serious? +2

Does it deal with weighty issues albeit by stringing together a sequence of time-worn cliches? +2

Is it an action movie made by a serious director? -2

Is

• #### Re:Instructions on completing your Oscar ballot fo (Score:2)

That's because Oscar winners are just Lifetime movies with famous people starring in them. I wish they did Oscars for movies for guys.
• #### Re:Instructions on completing your Oscar ballot fo (Score:2)

You didn't like Brokeback Mountain either?

I'll celebrate the big step forward for Hollywood's portrayal of gay issues when they make a gay feelgood movie. Or, you know, a gay Dukes of Hazzard.

Have you seen the movie? +3
• #### Bayesian filter to predict Slashdot's new stories? (Score:3, Insightful)

on Tuesday January 24, 2006 @10:35AM (#14548490)
I'm not sure what kind of crack-simulator Slashdot put into its related stories selector, but some kind of Bayesian filter to figure out the relationship might be helpful.

For example...

Ask Slashdot: State of WLAN Support on Linux?
Related...
IT: Microsoft Spending \$120M To Look Smaller
Games: Defying Review Aggregation
Games: Competitive Gaming Hits the Mainstream

WTF?
• #### Re:Bayesian filter to predict Slashdot's new stori (Score:2, Informative)

Where do you see the word "related" or any of its equivalents? As far as I can tell, every story's position is based on the time it is posted to the front page.
• #### Re:Bayesian filter to predict Slashdot's new stori (Score:2)

Well, if you stick some headlines in a little grey box attached to a "main" story, wouldn't you assume they are related to the main story? (If the "grey box" thing is an attempt to "minimize" items of less-than-wide interest, the interface just sucks.)
• #### Re:Bayesian filter to predict Slashdot's new stori (Score:1)

"Well, if you stick some headlines in a little grey box attached to a "main" story, wouldn't you assume they are related to the main story?"

Maybe at first glance, but if every single one does NOT actually relate to the story above, I would realize that assumption was wrong...
• #### Yes, the new GREY STORY interface sucks (Score:2)

Then I think we agree - the new GREY STORY interface sucks donkey balls.
• #### Re:Yes, the new GREY STORY interface sucks (Score:2)

In other news donkeys start tuning in to Slashdot in record numbers...

• #### Re:Yes, the new GREY STORY interface sucks (Score:1)

Yup.. no argument on that point. They do indeed LOOK like they belong to the story above, which is dumb.

I note they've changed them now to be rectangular instead of with rounded corners to differentiate them from the main stories. I guess that helps somewhat.

If you have any better ideas on how to improve the appearance, CmdrTaco indicated yesterday he is open to suggestions: http://slashdot.org/article.pl?sid=06/01/19/175253 [slashdot.org] (see first update)
• #### Statistical methods? (Score:1)

Although it's possible they omitted data when creating their model so that it could be used later in testing (I didn't see in the article whether this was the case), it is quite possible that the 81% result was based on predicting results that were used in building the model (the article says they used historical data to build the model and then tried to predict historical results to test the model) - this would render the results meaningless. Let's see what it predicts and compare it
• #### Re:Statistical methods? (Score:3, Informative)

Their web site states that the 81% number was "year on year" which I interpret to mean that they took the data for years n - 1 to predict year n.
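
One way to read "year on year" is a walk-forward test: train only on earlier years, then score the next one. The sketch below assumes that reading, and also assumes all earlier years are pooled. The stand-in model (always predict the most common past label) and the data are hypothetical placeholders, not the company's method.

```python
from collections import Counter

def train_majority(examples):
    """Stand-in model: always predicts the most common label seen in training."""
    majority = Counter(label for _, label in examples).most_common(1)[0][0]
    return lambda description: majority

def year_on_year_accuracy(data_by_year, train=train_majority):
    """For each year after the first, train on all earlier years and test on it."""
    years = sorted(data_by_year)
    results = {}
    for i in range(1, len(years)):
        past = [ex for y in years[:i] for ex in data_by_year[y]]
        predict = train(past)
        current = data_by_year[years[i]]
        hits = sum(predict(desc) == label for desc, label in current)
        results[years[i]] = hits / len(current)
    return results

# Hypothetical (description, outcome) pairs per festival year.
data = {
    2003: [("a", "loser"), ("b", "loser"), ("c", "winner")],
    2004: [("d", "loser"), ("e", "winner")],
    2005: [("f", "winner"), ("g", "winner"), ("h", "loser"), ("i", "loser")],
}

results = year_on_year_accuracy(data)
print(results)  # {2004: 0.5, 2005: 0.5}
```

The key property is that no film from the year being scored is ever in the training data for that year's model.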

John.
• #### And that's not all they can do (Score:2)

As well as POPFile's multi-category email filtering, I sell a commercial component [extravalent.com] that does multi-category Bayesian filtering for companies to embed in their own software. Bayesian and other statistical techniques are going to be cropping up everywhere there's text to analyze.

John.
• #### What else? (Score:2)

Wow, document classification with Bayes nets. How fresh is that??! I wonder how many more of these we'll see? I liked this version better: http://www.pitchformula.com/ [pitchformula.com] He took it a step further and actually MADE art based on those kinds of predictions.
• #### Has anyone tried Bayes for... (Score:1, Offtopic)

...predicting Slashdupes?

I was amused by something in the article that said that too many adjectives in the description ("riveting!") is a predictor of a negative outcome for a film. That reminds me of a rule of thumb for restaurants that a friend suggested -- if the name of the dish is full of adjectives, it'll taste bad. Amusingly, I just did a Google search for "restaurant menu adjectives", and most of the hits on the first page were for middle-school lesson plans where kids add adjectives to menus to make the food seem more app
• #### And? (Score:2)

Give me a Bayesian filter that will predict horse races. Now there's something I can use.

What are the odds on Sundance in the 5th?

• #### Pedro For President Meets Gore For President (Score:2)

'I voted for you - did you vote for me' - at least that's what the blog says ;-)

http://efrenramirez.imeem.com/photo/0MCW7w6O/K184B j6EJ60T_ [imeem.com]
• #### media attention (Score:1)

By releasing their results early, they have biased this year's results with media attention. Imagine a judge making a decision between two films that she liked. Does he pick the same one as the computer? If a computer can do her job what does the world need him for? So she picks the movie that was not predicted.

The correct methodology would have been to entrust the results to a third party to be released after the event was over.

By the way, I am aware that my judge is a he/she, it is Sundance afte
• #### Correlation is not causality (Score:2)

What these folks are doing is cute, but simply boils down to seeking correlations between variables. I'm sure I don't need to remind slashdot readers that correlation and causality are not the same thing. Correlations can give you clues, but are not the real meat of the problem.

If variables are correlated, the mechanics of that correlation might be due to some underlying common cause. Without understanding the underlying cause (if it exists), you are simply groping in the dark, hoping the interplay betw

• #### 81% Success (Score:2)

Uh, they made a really common mistake in neural nets. Success rate on the data you built the model from != actual success rate.

81% success when you run it back on the data you built the model from is meaningless. In fact, any number is meaningless when you measure it that way. I could get 100% success just by spitting back out if the name of a movie matches a name in that data set.
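
The point about scoring a model on films it has already seen can be shown with a deliberately useless "model" that only memorizes titles. The film titles, outcomes, and the fallback guess of "loser" are all made up for illustration.

```python
def train_memorizer(examples):
    """'Model' that memorizes title -> outcome and guesses 'loser' otherwise."""
    table = dict(examples)
    return lambda title: table.get(title, "loser")

def accuracy(predict, examples):
    """Fraction of examples whose outcome the model predicts correctly."""
    return sum(predict(title) == outcome for title, outcome in examples) / len(examples)

# Hypothetical films.
seen = [("Film A", "winner"), ("Film B", "loser"), ("Film C", "winner")]
unseen = [("Film D", "winner"), ("Film E", "loser")]

predict = train_memorizer(seen)
on_seen = accuracy(predict, seen)
on_unseen = accuracy(predict, unseen)
print(on_seen, on_unseen)  # 1.0 0.5
```

A perfect score on the films used to build the model says nothing about the films it has never seen, which is the only number that matters for prediction.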

Here's the relevant quote:
"[t]esting the system with known data from previous years, we have established an approximately 81% typical accuracy rate on a year-by-year b
• #### Winning Sundance (Score:2)

From TFA:

"and then a bunch of outputs, like how many people saw a film, did it win anything at Sundance, did it have commercial success. If you could figure out the pattern between the inputs and the outputs, then you could actually predict future winners."

Where Sundance is concerned, you run the filter one way to determine if it wins, and then reverse the good/bad word lists to see if it will have commercial success.
• #### Hey, I'm gay, and I resent commenter #1s Gay+5RATE (Score:1)

Please check it out, mods. MAHALO
• #### What're they feeding it? (Score:2)

If it were my project, I'd probably feed the script through instead of the reviews.

