Managing Last.FM's "Mountain of Data"
Rob Spengler writes "Last.FM co-founder Richard Jones says the biggest asset the company owns is 'hundreds of terabytes of user data.' Jones adds, '... playing with that data is one of the most fun things about working at the company.' Last.FM, for those who have been living on Mars for the last two years, is the largest online radio outlet, with millions of listeners per day. The company surpassed Pandora and others largely due to its unique datamining features: 'Audioscrobbler,' the company's song/artist naming algorithm, can correctly determine a track even with tens of thousands of false entries. Jones says sitting on that much data has even helped police: 'thieves listening to music on an Audioscrobbler-powered media player have helped police in the US, UK, and other countries track down users' stolen laptops.' Does sitting on a mountain of data make Last.FM powerful enough to start making a stand against the record industry? CBS certainly thinks so — they bought the company for £140 (~$200) million last year."
Data is valuable (Score:5, Interesting)
unique order of songs (Score:5, Interesting)
what i find most interesting is the order in which certain songs "go together", like listening to a song from Slayer, followed by, say, "someday i suppose" from the bosstones. when composing song lists, i appreciate how similar songs and moods can flow, but also how the contrast of dissimilar songs can SOMETIMES complement each other.
a large database could ferret out such instances that might occur frequently in multiple playlists.
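A rough sketch of what "ferreting out" such pairings could look like, assuming each playlist is just an ordered list of track names. The function name, the threshold parameter, and the sample playlists are all made up for illustration; this is not how Last.FM does it.

```python
from collections import Counter


def frequent_transitions(playlists, min_playlists=2):
    """Return ordered (song, next_song) pairs seen in at least
    `min_playlists` different playlists, most frequent first."""
    counts = Counter()
    for playlist in playlists:
        # A set makes a pair repeated inside one playlist count only once.
        pairs = set(zip(playlist, playlist[1:]))
        counts.update(pairs)
    hits = [(pair, n) for pair, n in counts.items() if n >= min_playlists]
    # Sort by count (descending), then by pair, so the output is deterministic.
    return sorted(hits, key=lambda item: (-item[1], item[0]))


# Hypothetical sample data; the track labels are placeholders.
playlists = [
    ["slayer_song", "someday_i_suppose", "song_c"],
    ["song_d", "slayer_song", "someday_i_suppose"],
    ["someday_i_suppose", "slayer_song", "song_c"],
]

for (first, second), n in frequent_transitions(playlists):
    print(f"{first} -> {second}: in {n} playlists")
```

Order matters here: "slayer_song then someday_i_suppose" and the reverse are counted as different transitions, which seems closer to the "flow" being described than a plain co-occurrence count.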
Now What... (Score:2, Interesting)
I have a similar site that I wrote (pre-Audioscrobbler). Granted it's crap, but I have mountains of data also. Closer to 1 TB than hundreds of TB. The question is, how do you monetize the data?
I just don't see how this data is "worth" 200 million bucks. I have some amazing algorithms to do similar cleaning, caching, and recommendations, but still what is that worth?
This is a fairly legit question. If you can figure it out, I can explain to my wife why I have 3 servers in my closet.
Re:Data is valuable (Score:1, Interesting)
I don't necessarily even use the data for anything; I just like that it's there and that I can play around with it and search through it. I just go to a web service, write a script to harvest the valuable data from it, save it to a DB, and let scripts periodically check whether there's new data, either through my own scripts or RSS.
Back in the Audioscrobbler days Last.FM used to provide full database dumps as well, but it seems they've changed their approach now, saying the data is considered too valuable [www.last.fm].
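For the curious, a minimal sketch of the "save it to a DB and periodically check for new data" part, assuming the source exposes a plain RSS 2.0 feed. The table layout, function name, and sample feed are invented for illustration, and the actual download step is left out so the example runs offline.

```python
import sqlite3
import xml.etree.ElementTree as ET


def store_new_items(conn, rss_text):
    """Parse an RSS 2.0 document and insert items not already stored.
    Returns the number of newly inserted items."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS items "
        "(guid TEXT PRIMARY KEY, title TEXT, link TEXT)"
    )
    new_count = 0
    root = ET.fromstring(rss_text)
    for item in root.iter("item"):
        link = item.findtext("link", default="")
        # Fall back to the link when the feed provides no <guid>.
        guid = item.findtext("guid") or link
        title = item.findtext("title", default="")
        cur = conn.execute(
            "INSERT OR IGNORE INTO items (guid, title, link) VALUES (?, ?, ?)",
            (guid, title, link),
        )
        # rowcount is 0 when the row already existed and was ignored.
        new_count += cur.rowcount
    conn.commit()
    return new_count


# Hypothetical feed contents, standing in for whatever the service returns.
SAMPLE_FEED = """
<rss version="2.0"><channel><title>example</title>
  <item><guid>a1</guid><title>First</title><link>http://example.com/1</link></item>
  <item><guid>a2</guid><title>Second</title><link>http://example.com/2</link></item>
</channel></rss>
""".strip()

conn = sqlite3.connect(":memory:")
print(store_new_items(conn, SAMPLE_FEED))  # first poll: both items are new
print(store_new_items(conn, SAMPLE_FEED))  # second poll: nothing new
```

In practice something like cron would fetch the feed on a schedule and hand the text to this function; the primary key on `guid` is what makes repeated polling harmless.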
Re:unique order of songs (Score:2, Interesting)
What about "songs are mostly played in alphabetical order"? :)
No revolution (Score:5, Interesting)
CBS certainly thinks so -- they bought the company for £140 (~$200) million last year.
Which is why whatever comes of them, at best it will be evolutionary. CBS is part of the old guard RIAA corps; they are just one of the faces of Viacom - all controlled by Sumner Redstone. They may have brought some money to the table, but they brought a whole ton of baggage with them too. Enough baggage to make this privacy freak decide they couldn't be trusted with all that data they've been collecting (for example, if they can track down a stolen laptop, they can track down someone playing an MP3 from an illegally leaked pre-release album).
The real danger (Score:5, Interesting)
Seriously though, I have found using the site to be pretty enjoyable. And the advertisements are actually worth keeping AdBlock turned off for. I found a few new artists, some unsigned, that way. I like all the various widgets and things that can crunch my data. Songbird has a last.fm plugin/addon that makes for very easy integration. It's just really useful. I've also found concerts on the site.
I rarely use the social side of it, except with friends I already know. But that's me.
Re:The real danger (Score:3, Interesting)
Haha, if it gives you any comfort, I'm the same way. With how iTunes/iPod work - incrementing the count when the track finishes - I'm constantly waiting for songs to end before picking another one, or leaving tracks that have silence at the end to finish completely. Really wish it incremented at 75% complete or something.
Re:Data is valuable (Score:1, Interesting)
And when searching for YouTube videos from my IRC bot, it gives a warm feeling knowing the result comes from my local DB instead of YouTube's, no matter how useless that is.
Re:So... I've been living on Mars? (Score:3, Interesting)
Re:Now What... (Score:3, Interesting)
Re:Data is valuable (Score:3, Interesting)
and tell me mr Coward what have you deduced from your pile of information
So what if he has never done one useful thing with it? People like that provide a public service; it's people like that who enabled DejaNews and now Google Groups to reconstruct much of the historical Usenet. If his hobby is data hoarding, then let him hoard. It doesn't cost you a dime, but one day it might possibly be of great benefit.