Slashdot
Managing Last.FM's "Mountain of Data"
Posted by
Soulskill
on Sat Dec 27, 2008 01:49 AM
from the because-it's-there dept.
Rob Spengler writes "Last.FM co-founder Richard Jones says the biggest asset the company owns is 'hundreds of terabytes of user data.' Jones adds, '... playing with that data is one of the most fun things about working at the company.' Last.FM, for those who have been living on Mars for the last two years, is the largest online radio outlet, with millions of listeners per day. The company surpassed Pandora and others largely due to its unique datamining features: 'Audioscrobbler,' the company's song/artist naming algorithm, can correctly determine a track even with tens of thousands of false entries. Jones says sitting on that much data has even helped police: 'thieves listening to music on an Audioscrobbler-powered media player have helped police in the US, UK, and other countries track down users' stolen laptops.' Does sitting on a mountain of data make Last.FM powerful enough to start making a stand against the record industry? CBS certainly thinks so — they bought the company for £140 (~$200) million last year."
Related Stories
Audioscrobbler (Anyone Remember Firefly?) 200 comments
asciirock writes "RJ, a University of Southampton grad student in the UK has just put his final year project online. Audioscrobbler is a free plug-in for Linux XMMS and Windows Winamp2. It tracks every tune you play, cross-references with others in the Audioscrobbler community and serves up recommendations. There's also msging, stats and user homepages. In other words... Firefly lives!"
Submission: Last.FM, Boldly Datamining Like Never Before by Anonymous Coward
This discussion has been archived.
No new comments can be posted.
The Fine Print: The following comments are owned by whoever posted them. We are not responsible for them in any way.
Data is valuable (Score:5, Interesting)
Re: (Score:3, Insightful)
Re: (Score:3, Insightful)
Good points. You had me until you said "entity" (do you know what that means? I doubt it) in the place of, I assume, "commodity".
Oh and the repeat after me bit is silly. The "information" you have is worthless on its own. It only becomes valuable when it's coupled with lots of other similar "informations" from other people. By retaining this information you're only preventing someone from making money, without any benefit for yourself, which is arguably dickish. Oh and saying that "information is more valua
Re:Data is valuable (Score:4, Insightful)
Sounds like a slight variation on those people who have TBs of movies/music/videos/TV episodes/etc. that they will never have the time to watch/listen to.
Re: (Score:3, Interesting)
and tell me, Mr. Coward, what have you deduced from your pile of information
So what if he has never done one useful thing with it? People like that provide a public service; it's people like that who enabled DejaNews and now Google Groups to reconstruct much of the historical Usenet. If his hobby is data hoarding, then let him hoard. It doesn't cost you a dime, but one day it might possibly be of great benefit.
Re:Data is valuable (Score:5, Funny)
Re: (Score:2, Funny)
I disagree. When I was a senior in HS, we had a smoking hot student teacher. I would have paid to get molested by her.
LK
Re:Data is valuable (Score:5, Funny)
As a student, I must respectfully disagree.
Re: (Score:3, Funny)
I sense a potential solution here...
Just sayin'.
unique order of songs (Score:5, Interesting)
what i find most interesting is the order in which certain songs "go together", like listening to a song from Slayer, followed by, say, "someday i suppose" from the bosstones. when composing songlists, i appreciate how similar songs and moods can flow, but also how the contrast of dissimilar songs can SOMETIMES complement each other.
a large database could ferret out such instances that might occur frequently in multiple playlists.
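A minimal sketch of that idea, assuming playlists are available as ordered lists of track identifiers. The function names and the sample track labels are hypothetical, and nothing here is claimed to be how Last.FM actually does it; it only counts how often one track is immediately followed by another across many playlists.

```python
from collections import Counter


def count_transitions(playlists):
    """Count how often track A is immediately followed by track B."""
    counts = Counter()
    for playlist in playlists:
        # Pair each track with the one played right after it.
        for current, following in zip(playlist, playlist[1:]):
            counts[(current, following)] += 1
    return counts


def frequent_transitions(playlists, min_count=2):
    """Return (pair, count) tuples seen at least min_count times, most common first."""
    counts = count_transitions(playlists)
    return [(pair, n) for pair, n in counts.most_common() if n >= min_count]


# Hypothetical sample data: placeholder labels, not real listening logs.
sample_playlists = [
    ["slayer_track", "someday_i_suppose", "track_c"],
    ["track_d", "slayer_track", "someday_i_suppose"],
    ["track_c", "track_d"],
]

for pair, n in frequent_transitions(sample_playlists):
    print(pair, n)
```

Raising `min_count` is the crude way to filter out one-off pairings; with real data, pairs produced by album order or alphabetical playback would also need to be discounted.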
Re: (Score:3, Funny)
To get you into the right mood, think of the impact it could have on mind manipulation
Re: (Score:2, Interesting)
What about "songs are mostly played in alphabetical order"? :)
Re: (Score:3, Insightful)
So your contribution, then, is noise.
But this noise does not affect the signal, which is still there. It's just harder to find.
Nobody ever said mining a mountain of data like this would be a trivial task.
Now What... (Score:2, Interesting)
I have a similar site that I wrote (pre-audioscrobbler). Granted it's crap, but I have mountains of data also. Closer to 1 tb than hundreds of tb. The question is, how do you monetize the data?
I just don't see how this data is "worth" 200 million bucks. I have some amazing algorithms to do similar cleaning, caching, and recommendations, but still what is that worth?
This is a fairly legit question. If you can figure it out, I can explain to my wife why I have 3 servers in my closet.
Re:Now What... (Score:5, Insightful)
If you could (accurately) answer that question, then you'd act upon the answer...
Why do you think Google ads are Google's bread and butter as far as cashflow goes? The reason is that Google has a treasure trove of user data, probably more than anyone else, so they can really make contextual ads work. Anyone can write an ad engine, but not everyone has access to mountains and mountains of user data.
You might be surprised at how important context is when you're trying to promote something. Say you're trying to promote an online RPG like Game! [wittyrpg.com]. If you took a random collection of people, probably less than 5% of them would be interested in playing, but if you can target gamers specifically, that number might jump to 50%. If you're paying for every impression, that makes a world of difference.
So not only do you need to understand your audience, you also need to effectively target them. Now, how do you do that? Data mining of course, and the more data the better.
Pretty much all data has value. Figuring out how to turn that data into money is extremely subjective, might involve some black magic, and definitely requires luck too.
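To put rough numbers on the 5% versus 50% example above, here is a small sketch. The price per impression is a made-up assumption, and the function name is hypothetical; only the two interest rates come from the comment.

```python
def cost_per_interested_viewer(cost_per_impression, interest_rate):
    """Effective cost of reaching one viewer who actually cares about the ad."""
    if not 0 < interest_rate <= 1:
        raise ValueError("interest_rate must be in (0, 1]")
    return cost_per_impression / interest_rate


# Assumption for illustration only: $2 per 1000 impressions.
COST_PER_IMPRESSION = 0.002

untargeted = cost_per_interested_viewer(COST_PER_IMPRESSION, 0.05)  # random audience
targeted = cost_per_interested_viewer(COST_PER_IMPRESSION, 0.50)    # gamers only

print(f"untargeted: ${untargeted:.3f} per interested viewer")
print(f"targeted:   ${targeted:.3f} per interested viewer")
print(f"ratio: {untargeted / targeted:.0f}x")
```

Whatever the actual price is, the ratio depends only on the two interest rates: 0.50 / 0.05 means each interested viewer costs about ten times less with targeting.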
Re: (Score:3, Interesting)
Re: (Score:3, Funny)
figuring out how to turn that data into money ... might involve some black magic, and definitely requires luck too.
So what you are saying is:
1. Data
2. ???
3. Profit!
:~)
Re:Now What... (Score:5, Funny)
Information wants to be free.
Information wants to be a ballerina.
Re:Now What... (Score:5, Funny)
Information wants to be free.
Information wants to be a ballerina.
Then information needs to get her fat ass on a diet or she's never going to fit into that tutu and make Mommy proud!
Re: (Score:2)
That kind of parenting made information a heroin-addicted stripper. Now come over here and rub your data against me for a dollar.
It's so popular... (Score:5, Funny)
The summary wasn't insulting enough, so I think I'll just add a bit extra.
Last.FM is so popular that if you aren't familiar with the service, you must be a drooling, knuckle dragging luddite.
Apparently I'm not one of the cool kids. I'm sad now, and my feelings are hurt.
Re: (Score:3, Funny)
Last.FM is so popular that if you aren't familiar with the service, you must be a drooling, knuckle dragging luddite... a step away from churning your own butter.
Sorry, had to add my own.
all this data yet so much gets missed (Score:2, Insightful)
Re: (Score:2)
I missed a Metric show that I wouldn't have had they, who know I'm a Metric fan, warned me.
They know what I like, and they have info about albums and shows. How hard is it to fire off an actually interesting newsletter once in a while?
No revolution (Score:5, Interesting)
CBS certainly thinks so -- they bought the company for £140 (~$200) million last year.
Which is why whatever comes of them, at best it will be evolutionary. CBS is part of the old guard RIAA corps, they are just one of the faces of Viacom - all controlled by Sumner Redstone. They may have brought some money to the table, but they brought a whole ton of baggage with them too. Enough baggage to make this privacy freak decide they couldn't be trusted with all that data they've been collecting (for example, if they can track down a stolen laptop, they can track down someone playing an MP3 from an illegally leaked pre-release album).
Re: (Score:2)
Oligopoly means minimal competition. You assume that CBS has figured out that the game has changed enough that the RIAA membership is no longer an effective monopoly. Given the goose-egg of evidence to support that theory, I sincerely doubt they have.
So... I've been living on Mars? (Score:3, Insightful)
Last.FM, for those who have been living on Mars for the last two years, is the largest online radio outlet, with millions of listeners per day.
You know, I'm not exactly what you'd call a Luddite, yet I've never heard of Last.FM. Am I the only one? I kind of doubt it.
I have a general gripe about anyone who writes "for those who have been living on Mars" anytime they reference some moderately popular company, service, or product. It smacks of arrogance, as if to say, if you don't have the same interests as I do, you're obviously disconnected from the mainstream.
Or perhaps I'm just annoyed for being called out on being a bit older and out of touch? Bah!
>>goes back to guarding lawn with a shotgun from an old rocking chair...
Re: (Score:2)
Re: (Score:3, Insightful)
I've never heard of them either and the bit about living on Mars also irks. And for all the arrogance in that, the summary makes it sound like the internet radio outfit needs fancy algorithms to tell what music they're playing. WTF don't they just program the correct name when they add a new song to their database? I'd read the article, but my shuttle back to Mars is leaving...
Re: (Score:3, Informative)
Well, Amarok has a config menu entry with a big old icon with the label "last.fm" on it. Everyone who ever used Amarok had to pass his cursor over the label "last.fm", which has been there for a few years, mind you. Other media players also support last.fm, whether through a plugin or even built in. So you may not have been living under a rock, but you sure were quite a bit distracted. For at least the last 6 years or so.
On a side note, I've made a point of turning on the last.fm plugin for a simple reason
Re: (Score:2)
Or maybe... just maybe... that sort of music actually is popular. And now that the service is getting to be more mainstream and less the private playground of geeks the charts are starting to reflect more (current) mainstream artists.
A lot of people actually like that crap. Sad but true.
Re: (Score:2)
This is something similar to what I was thinking, I've never heard of them but maybe it's because I listen to my music from other parts of the world. Read that as Japan, France, Germany, S.Korea and UK not in any particular order either but it's mostly the DJ's and/or the individual mixers I'm listening to these days.
I suppose it's the option to having a mass of indie choices that I can happily give a middle finger to anyone who decides to sell out along the way.
Re: (Score:2)
Even by the standard of press releases it seemed to be a particularly rubbish and arrogant press release (and I'm someone who actually uses last.fm).
I'm not sure what it was doing here. What do the editors think this is - the BBC technology pages or something?
Re: (Score:2, Insightful)
"IF Y'AINT SEEN THIS THEN Y'AINT SEEN NOTHIN!"
Which is pressed and kneaded as needed to "you have to have been living (under a rock | on mars | in a laundry hamper) for the last (year | ten years | few decades | all your life) if you haven'
Re: (Score:3, Interesting)
I read it differently (Score:2)
I always see that from the writer's viewpoint, as if he's saying "Look, I know this isn't news, and I'm just getting around to writing about it a few years later, but I really do have something interesting to say about it! So I will acknowledge its apparent staleness with a jokey aside before I get to the point."
Good thing writing isn't some sort of Rorschach test where we can each imbue it with our own insecurities, eh?
Re: (Score:2)
Last.FM has been covered on Slashdot before. What other reason, other than living on Mars, does one have for not keeping on top of Slashdot news?
Surpassing Pandora (Score:5, Insightful)
The company surpassed Pandora and others largely due to its unique datamining features
I would think that being available outside of the USA may have helped quite a bit as well.
Re: (Score:2)
The real danger (Score:5, Interesting)
Seriously though, I have found using the site to be pretty enjoyable. And the advertisements are actually worth keeping AdBlock turned off for. I found a few new artists, some unsigned, that way. I like all the various widgets and things that can crunch my data. Songbird has a last.fm plugin/addon that makes for very easy integration. It's just really useful. I've also found concerts on the site.
I rarely use the social side of it, except with friends I already know. But that's me.
Re: (Score:3, Interesting)
Haha, if it gives you any comfort, I'm the same way. With how iTunes/iPod work - incrementing the count when the track finishes - I'm constantly waiting for songs to end before picking another one, or leaving tracks that have silence at the end to finish completely. Really wish it incremented at 75% complete or something.
Re: (Score:2)
Re: (Score:2)
> I'm constantly waiting for songs to end before
> picking another one, or leaving tracks that have
> silence at the end to finish completely. Really
> wish it incremented at 75% complete or something.
Amarok submits to Last.fm after playing about half of the track. Yet another reason to use Amarok...
Re: (Score:3, Informative)
Yet another reason to use Amarok...
On his iPod?
GP was talking about the iTunes play counts, not the Last.fm play counts. Every app/plugin I've tried (including the official Last.fm app) either scrobbles at 50% or allows the user to configure the percentage. Yet another reason to be free to use whichever media player one prefers...
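The difference between the two counting rules discussed above can be sketched in a few lines. This is an illustration only: the function name and the `fraction` parameter are hypothetical, and real clients may apply extra rules (minimum track length, for instance) that are not modelled here.

```python
def should_count_play(seconds_played, track_seconds, fraction=0.5):
    """Return True once at least `fraction` of the track has been played.

    fraction=0.5 mirrors the "scrobbles at 50%" behaviour described above;
    fraction=1.0 mirrors a counter that only increments when the track finishes.
    """
    if track_seconds <= 0:
        return False
    return seconds_played >= fraction * track_seconds


# A 4-minute track skipped after 3 minutes:
print(should_count_play(180, 240, fraction=0.5))   # counted by a 50% rule
print(should_count_play(180, 240, fraction=0.75))  # counted by a 75% rule
print(should_count_play(180, 240, fraction=1.0))   # not counted by a finish-only rule
```

Making the fraction a parameter is what "allows the user to configure the percentage" amounts to: the same check covers the 50%, 75%, and finish-only cases.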
Re: (Score:3, Informative)
companies biggest asset is my privacy .. (Score:2)
'Without privacy, there cannot be freedom. And without freedom, there cannot be personal or social growth'
Re: (Score:2)
Not half as creepy as those websites you've been visiting; and let's not get started about your taste in music...
surpassed Pandora ... (Score:3, Informative)
The company surpassed Pandora and others largely due to its unique datamining features: 'Audioscrobbler,'
I'd say they surpassed Pandora only because Pandora locked out all non-US users a while back. For people who just wanted to listen to music and find out about new artists, Pandora was so much better IMO, last.fm has a clunky, overloaded UI and is too much like myspace ...
If Last.FM Is So Smart... (Score:3, Funny)
Then why the hell is it that when I run the "Recommendations [www.last.fm]" stream the algorithm occasionally freaks out and starts pushing one unlistenable noise attack after another at me with tags like brutal death metal [www.last.fm], cybergrind [www.last.fm], czech [www.last.fm], death metal [www.last.fm], deathgrind [www.last.fm], goregrind [www.last.fm], grind [www.last.fm], grindcore [www.last.fm], noisecore [www.last.fm], porngrind [www.last.fm], pornogrind [www.last.fm], etc. No matter how many times I click the "Do Not Want" button the stuff just keeps coming. It's like a neighbour from hell. And then there's the days when I get nothing but lesbian deathcore vegan grind [www.last.fm].
The Last.FM brainfarts seem to persist no matter how many times you try to train the recommendation engine using the like/ban buttons, and the only way to get them to "reset" to something vaguely approximating normality is to log out, log back in, and run the Library [www.last.fm] stream for a while.
Still, even with this weirdness it's still better than Pandora at finding new music I actually like.
Re: (Score:2)
> ...are they just making that up?
They are making pretty much all of it up.