Managing Last.FM's "Mountain of Data"
Rob Spengler writes "Last.FM co-founder Richard Jones says the biggest asset the company owns is 'hundreds of terabytes of user data.' Jones adds, '... playing with that data is one of the most fun things about working at the company.' Last.FM, for those who have been living on Mars for the last two years, is the largest online radio outlet, with millions of listeners per day. The company surpassed Pandora and others largely due to its unique datamining features: 'Audioscrobbler,' the company's song/artist naming algorithm, can correctly determine a track even with tens of thousands of false entries. Jones says sitting on that much data has even helped police: 'thieves listening to music on an Audioscrobbler-powered media player have helped police in the US, UK, and other countries track down users' stolen laptops.' Does sitting on a mountain of data make Last.FM powerful enough to start making a stand against the record industry? CBS certainly thinks so — they bought the company for £140 (~$200) million last year."
Re:Data is valuable (Score:3, Insightful)
This is the age of communication and nothing is more valuable than information and manipulating that information. How do you manipulate it? To know that, you need another kind of information, which is usually based on statistics on large amounts of data (like Last.FM's database, for example).
So, in today's society, there are three valuable entities: money (manipulated by information, everyone wants it), information (manipulated by more information, any company's dream) and more information (based on statistics, like the Last.FM database) controlling each other in a cascade. Once you have the source you can easily trace it to see how things are flowing, so you may know how to invest your money.
Repeat after me: "I will not disclose the information I have. Information is more valuable than money. If I own a valuable piece of information and I don't make money off it, I'm stupid."
Re:Now What... (Score:5, Insightful)
If you could (accurately) answer that question, then you'd act upon the answer...
Why do you think Google ads are Google's bread and butter as far as cashflow goes? The reason is that Google has a treasure trove of user data, probably more than anyone else, so they can really make contextual ads work. Anyone can write an ad engine, but not everyone has access to mountains and mountains of user data.
You might be surprised at how important context is when you're trying to promote something. Say you're trying to promote an online RPG like Game! [wittyrpg.com]. If you took a random collection of people, probably less than 5% of them would be interested in playing, but if you can target gamers specifically, that number might jump to 50%. If you're paying for every impression, that makes a world of difference.
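To put rough numbers on that, here is a minimal sketch of the arithmetic. The 5% and 50% interest rates are the guesses from above; the price per impression is a made-up figure purely for illustration, and the function name is hypothetical.

```python
def cost_per_interested_viewer(cost_per_impression: float, interest_rate: float) -> float:
    """Average spend needed to put the ad in front of one interested person."""
    if not 0 < interest_rate <= 1:
        raise ValueError("interest_rate must be in (0, 1]")
    return cost_per_impression / interest_rate


# Assumed price in dollars; not a real ad rate.
COST_PER_IMPRESSION = 0.01

# Rough guesses from the comment: under 5% of random people, maybe 50% of gamers.
untargeted = cost_per_interested_viewer(COST_PER_IMPRESSION, 0.05)
targeted = cost_per_interested_viewer(COST_PER_IMPRESSION, 0.50)

print(f"untargeted: ${untargeted:.2f} per interested viewer")
print(f"targeted:   ${targeted:.2f} per interested viewer")
print(f"targeting is about {untargeted / targeted:.0f}x cheaper")
```

Whatever the real price per impression is, it cancels out of the ratio: going from 5% to 50% makes each interested viewer about ten times cheaper to reach.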
So not only do you need to understand your audience, you also need to effectively target them. Now, how do you do that? Data mining of course, and the more data the better.
Pretty much all data has value. Figuring out how to turn that data into money is extremely subjective, might involve some black magic, and definitely requires luck too.
all this data yet so much gets missed (Score:2, Insightful)
So... I've been living on Mars? (Score:3, Insightful)
Last.FM, for those who have been living on Mars for the last two years, is the largest online radio outlet, with millions of listeners per day.
You know, I'm not exactly what you'd call a Luddite, yet I've never heard of Last.FM. Am I the only one? I kind of doubt it.
I have a general gripe about anyone who writes "for those who have been living on Mars" anytime they reference some moderately popular company, service, or product. It smacks of arrogance, as if to say, if you don't have the same interests as I do, you're obviously disconnected from the mainstream.
Or perhaps I'm just annoyed for being called out on being a bit older and out of touch? Bah!
>>goes back to guarding lawn with a shotgun from an old rocking chair...
Surpassing Pandora (Score:5, Insightful)
The company surpassed Pandora and others largely due to its unique datamining features
I would think that being available outside of the USA may have helped quite a bit as well.
Re:So... I've been living on Mars? (Score:3, Insightful)
I've never heard of them either and the bit about living on Mars also irks. And for all the arrogance in that, the summary makes it sound like the internet radio outfit needs fancy algorithms to tell what music they're playing. WTF don't they just program the correct name when they add a new song to their database? I'd read the article, but my shuttle back to Mars is leaving...
Re:Data is valuable (Score:3, Insightful)
Good points. You had me until you said "entity" (do you know what that means? I doubt it) in the place of, I assume, "commodity".
Oh and the repeat after me bit is silly. The "information" you have is worthless on its own. It only becomes valuable when it's coupled with lots of other similar "informations" from other people. By retaining this information you're only preventing someone from making money, without any benefit for yourself, which is arguably dickish. Oh and saying that "information is more valuable than money" is stupid. You can't say that something is superior to what measures it.
Re:So... I've been living on Mars? (Score:2, Insightful)
"IF Y'AINT SEEN THIS THEN Y'AINT SEEN NOTHIN!"
Which is pressed and kneaded as needed to "you have to have been living (under a rock | on Mars | in a laundry hamper) for the last (year | ten years | few decades | all your life) if you haven't heard of (this amazing company that can solve all your problems | this great company who has this incredible product | this stupendous chamois which can soak up over seven thousand times its own weight in water)."
Last.FM is pretty OK, but I would much rather do business with a company which doesn't have a co-founder who calls it "fun" to play with my personal data.
Re:Data is valuable (Score:4, Insightful)
Sounds like a slight variation on those people who have TBs of movies/music/videos/TV episodes/etc. that they will never have the time to watch/listen to.
Re:unique order of songs (Score:3, Insightful)
So your contribution, then, is noise.
But this noise does not affect the signal, which is still there. It's just harder to find.
Nobody ever said mining a mountain of data like this would be a trivial task.