Managing Last.FM's "Mountain of Data"

Posted by Soulskill on Sat Dec 27, 2008 01:49 AM
from the because-it's-there dept.
Rob Spengler writes "Last.FM co-founder Richard Jones says the biggest asset the company owns is 'hundreds of terabytes of user data.' Jones adds, '... playing with that data is one of the most fun things about working at the company.' Last.FM, for those who have been living on Mars for the last two years, is the largest online radio outlet, with millions of listeners per day. The company surpassed Pandora and others largely due to its unique datamining features: 'Audioscrobbler,' the company's song/artist naming algorithm, can correctly determine a track even with tens of thousands of false entries. Jones says sitting on that much data has even helped police: 'thieves listening to music on an Audioscrobbler-powered media player have helped police in the US, UK, and other countries track down users' stolen laptops.' Does sitting on a mountain of data make Last.FM powerful enough to start making a stand against the record industry? CBS certainly thinks so — they bought the company for £140 (~$200) million last year."

Related Stories

[+] Audioscrobbler (Anyone Remember Firefly?) 200 comments
asciirock writes "RJ, a University of Southampton grad student in the UK has just put his final year project online. Audioscrobbler is a free plug-in for Linux XMMS and Windows Winamp2. It tracks every tune you play, cross-references with others in the Audioscrobbler community and serves up recommendations. There's also msging, stats and user homepages. In other words... Firefly lives!"
This discussion has been archived. No new comments can be posted.
The Fine Print: The following comments are owned by whoever posted them. We are not responsible for them in any way.
  • Data is valuable (Score:5, Interesting)

    by DNS-and-BIND (461968) on Saturday December 27 2008, @01:56AM (#26239951) Homepage
    A buddy of mine used to run this matching website for teachers & students. Free for teachers, and the students had to pay a nominal amount to get the teachers' contact info, and after that, it was up to them to arrange for lessons. The site was popular, and he made decent money at it. I bugged him and bugged him to organize parties, and eventually he came around to my way of thinking (he wanted to make some money without his parasite partner getting it). He used the list of emails from his website to send party invitations for a monthly get-together. He made more money from the parties than he did from the website.
    • Re: (Score:3, Insightful)

      How I see it: there are people with tons of money. Literally, tons. You can't use only money to make more money - no matter what you do with it, it just won't multiply sitting its ass on the couch all day, watching TV or in a safe somewhere. So what do you need to give that money more value? The answer is simple: information. The only way to make money multiply is if you know what to do with it. You can write the best software in the world, the best OS with the best tools ever, but if you don't know how to mak
      • Re: (Score:3, Insightful)

        Good points. You had me until you said "entity" (do you know what that means? I doubt it) in the place of, I assume, "commodity".

        Oh and the repeat after me bit is silly. The "information" you have is worthless on its own. It only becomes valuable when it's coupled with lots of other similar "informations" from other people. By retaining this information you're only preventing someone from making money, without any benefit for yourself, which is arguably dickish. Oh and saying that "information is more valua

          • Re: (Score:3, Interesting)

            and tell me mr Coward what have you deduced from your pile of information

            So what if he has never done one useful thing with it? People like that provide a public service; it's people like that who enabled DejaNews and now Google Groups to reconstruct much of the historical usenet. If his hobby is data hoarding, then let him hoard. It doesn't cost you a dime, but one day it might possibly be of great benefit.

    • by Dr_Banzai (111657) on Saturday December 27 2008, @02:59AM (#26240169) Homepage
      I think we have enough teacher-student sex scandals without a matchmaking web site!
  • by Blue Shifted (1078715) on Saturday December 27 2008, @02:14AM (#26240015) Journal

    what i find most interesting is the order certain songs "go together", like listening to a song from Slayer, followed by, say, "someday i suppose" from the bosstones. when composing songlists, i appreciate how similar songs and moods can flow, but also how the contrast of dissimilar songs can SOMETIMES complement each other.

    a large database could ferret out such instances that might occur frequently in multiple playlists.
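One way to picture the "ferreting out" described above is a minimal sketch that counts how many playlists contain the same adjacent pair of songs. Everything here is an assumption for illustration: the function name `frequent_transitions`, the `min_playlists` cutoff, and the sample playlists are invented, and nothing in the thread says Last.FM works this way.

```python
from collections import Counter


def frequent_transitions(playlists, min_playlists=2):
    """Return (song, next_song) pairs that appear in at least `min_playlists` playlists."""
    counts = Counter()
    for playlist in playlists:
        # Use a set so a pair repeated inside one playlist is counted only once.
        pairs = set(zip(playlist, playlist[1:]))
        counts.update(pairs)
    # most_common() sorts by count, highest first.
    return [(pair, n) for pair, n in counts.most_common() if n >= min_playlists]


# Hypothetical playlists, invented purely for this example.
playlists = [
    ["slayer_song", "someday_i_suppose", "song_c"],
    ["song_d", "slayer_song", "someday_i_suppose"],
    ["song_c", "song_d"],
]

result = frequent_transitions(playlists)
for (first, second), n in result:
    print(f"{first} -> {second}: seen in {n} playlists")
```

Counting playlists rather than raw occurrences keeps one listener with a song pair on repeat from dominating the result; a real system would also need to handle inconsistent track names before counting.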

    • Re: (Score:3, Funny)

      Excellent point. Think of the impact this could have on psychology!

      To get you into the right mood, think of the impact it could have on mind manipulation ;) Tinfoil hats for sale! Get your tinfoil hats here [tinyurl.com]!
    • Re: (Score:2, Interesting)

      What about "songs are mostly played in alphabetical order"? :)

      • Re: (Score:3, Insightful)

        So your contribution, then, is noise.

        But this noise does not affect the signal, which is still there. It's just harder to find.

        Nobody ever said mining a mountain of data like this would be a trivial task.

  • Now What... (Score:2, Interesting)

    I have a similar site that I wrote (pre-audioscrobbler). Granted it's crap, but I have mountains of data also. Closer to 1 tb than hundreds of tb. The question is, how do you monetize the data?

    I just don't see how this data is "worth" 200 million bucks. I have some amazing algorithms to do similar cleaning, caching, and recommendations, but still what is that worth?

    This is a fairly legit question. If you can figure it out, I can explain to my wife why I have 3 servers in my closet.

    • Re:Now What... (Score:5, Insightful)

      by Mad Merlin (837387) on Saturday December 27 2008, @03:07AM (#26240193) Homepage

      I have a similar site that I wrote (pre-audioscrobbler). Granted it's crap, but I have mountains of data also. Closer to 1 tb than hundreds of tb. The question is, how do you monetize the data?

      If you could (accurately) answer that question, then you'd act upon the answer...

      Why do you think Google ads are Google's bread and butter as far as cashflow goes? The reason is that Google has a treasure trove of user data, probably more than anyone else, so they can really make contextual ads work. Anyone can write an ad engine, but not everyone has access to mountains and mountains of user data.

      You might be surprised at how important context is when you're trying to promote something. Say you're trying to promote an online RPG like Game! [wittyrpg.com], if you took a random collection of people, probably less than 5% of them would be interested in playing, but if you can target gamers specifically, that number might jump to 50%. If you're paying for every impression, that makes a world of difference.
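To make the 5% versus 50% comparison above concrete, here is a small worked calculation. The cost per impression is a made-up figure and `cost_per_interested_viewer` is a hypothetical helper; only the two interest rates come from the comment.

```python
def cost_per_interested_viewer(cost_per_impression, interest_rate):
    """Average spend needed to reach one interested person."""
    if not 0 < interest_rate <= 1:
        raise ValueError("interest_rate must be in (0, 1]")
    return cost_per_impression / interest_rate


# Assumed price per impression; not a figure from the thread.
cost_per_impression = 0.01

untargeted = cost_per_interested_viewer(cost_per_impression, 0.05)  # random audience
targeted = cost_per_interested_viewer(cost_per_impression, 0.50)    # gamers only

ratio = untargeted / targeted
print(f"untargeted costs about {ratio:.0f}x more per interested viewer")
```

Whatever the actual price per impression, the ratio depends only on the two interest rates, so under these numbers targeting cuts the cost of reaching an interested player by roughly a factor of ten.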

      So not only do you need to understand your audience, you also need to effectively target them. Now, how do you do that? Data mining of course, and the more data the better.

      Pretty much all data has value; figuring out how to turn that data into money is extremely subjective, might involve some black magic, and definitely requires luck too.

      • Re: (Score:3, Interesting)

        Even more impressive is that the guts of the whole last.fm empire was built by a tiny team - a couple of dozen IIRC. They just fired 20% of their staff [theregister.co.uk], incidentally, bringing the numbers down to... 80.
      • figuring out how to turn that data into money ... might involve some black magic, and definitely requires luck too.

        So what you are saying is:

        1. Data
        2. ???
        3. Profit!

        :~)

  • by Anonymous Coward on Saturday December 27 2008, @02:48AM (#26240149)

    The summary wasn't insulting enough, so I think I'll just add a bit extra.

    Last.FM is so popular that if you aren't familiar with the service, you must be a drooling, knuckle dragging luddite.

    Apparently I'm not one of the cool kids. I'm sad now, and my feelings are hurt.

    • Last.FM is so popular that if you aren't familiar with the service, you must be a drooling, knuckle dragging luddite... a step away from churning your own butter.

      Sorry, had to add my own.

  • Last.fm has all this data and yet so much gets missed. For instance: why doesn't last.fm have a feature to email you when a band you like comes out with a new album?
    • Their services are pretty good, but such functionality is indeed missing.
      I missed a Metric show that I wouldn't have missed had they, who know I'm a Metric fan, warned me.

      They know what I like, and they have info about albums and shows, so how hard is it to fire off an actually interesting newsletter once in a while?
  • No revolution (Score:5, Interesting)

    by Jah-Wren Ryel (80510) on Saturday December 27 2008, @03:31AM (#26240273)

    CBS certainly thinks so -- they bought the company for £140 (~$200) million last year.

    Which is why whatever comes of them, at best it will be evolutionary. CBS is part of the old guard RIAA corps, they are just one of the faces of Viacom - all controlled by Sumner Redstone. They may have brought some money to the table, but they brought a whole ton of baggage with them too. Enough baggage to make this privacy freak decide they couldn't be trusted with all that data they've been collecting (for example, if they can track down a stolen laptop, they can track down someone playing an MP3 from an illegally leaked pre-release album).

      • Oligopoly means minimal competition. You assume that CBS has figured out that the game has changed enough that the RIAA membership is no longer an effective monopoly. Given the goose-egg of evidence to support that theory, I sincerely doubt they have.

  • by Dutch Gun (899105) on Saturday December 27 2008, @03:39AM (#26240307)

    Last.FM, for those who have been living on Mars for the last two years, is the largest online radio outlet, with millions of listeners per day.

    You know, I'm not exactly what you'd call a Luddite, yet I've never heard of Last.FM. Am I the only one? I kind of doubt it.

    I have a general gripe about anyone who writes "for those who have been living on Mars" anytime they reference some moderately popular company, service, or product. It smacks of arrogance, as if to say, if you don't have the same interests as I do, you're obviously disconnected from the mainstream.

    Or perhaps I'm just annoyed for being called out on being a bit older and out of touch? Bah!

    >>goes back to guarding lawn with a shotgun from an old rocking chair...

    • Personally I'm surprised that Last.fm is considered a highly visible entity. I thought it was a niche site. And I use it. So. *shrug*
    • Re: (Score:3, Insightful)

      I've never heard of them either and the bit about living on Mars also irks. And for all the arrogance in that, the summary makes it sound like the internet radio outfit needs fancy algorithms to tell what music they're playing. WTF don't they just program the correct name when they add a new song to their database? I'd read the article, but my shuttle back to Mars is leaving...

    • Re: (Score:3, Informative)

      Well, Amarok has a config menu entry with a big old icon with the label "last.fm" on it. Everyone who ever used Amarok had to pass his cursor over the label "last.fm", which has been there for a few years, mind you. Other media players also support last.fm, whether through a plugin or even built in. So you may have not been living under a rock but you sure were quite a bit distracted. For at least the last 6 years or so.

      On a side note, I've made a point of turning on the last.fm plugin for a simple reason

      • Maybe the sudden appearance of trash like Kanye West and Britney Spears on the top of last.fm's charts has something to do with it.

        Or maybe... just maybe... that sort of music actually is popular. And now that the service is getting to be more mainstream and less the private playground of geeks the charts are starting to reflect more (current) mainstream artists.

        A lot of people actually like that crap. Sad but true.

    • This is something similar to what I was thinking, I've never heard of them but maybe it's because I listen to my music from other parts of the world. Read that as Japan, France, Germany, S.Korea and UK not in any particular order either but it's mostly the DJ's and/or the individual mixers I'm listening to these days.

      I suppose it's the option to having a mass of indie choices that I can happily give a middle finger to anyone who decides to sell out along the way.

    • Even by the standard of press releases it seemed to be a particularly rubbish and arrogant press release (and I'm someone who actually uses last.fm).

      I'm not sure what it was doing here. What do the editors think this is - the BBC technology pages or something?

    • Re: (Score:2, Insightful)

      No, none of us have been living on Mars. This is just the latest permutation of viral marketing, it seems. But this one is kind of weird, because it combines all the "bleeding edge stuff" we've seen before with the oldest of old school hawking techniques, which is this:

      "IF Y'AINT SEEN THIS THEN Y'AINT SEEN NOTHIN!"

      Which is pressed and kneaded as needed to "you have to have been living (under a rock | on mars | in a laundry hamper) for the last (year | ten years | few decades | all your life) if you haven'
    • Re: (Score:3, Interesting)

      I've never heard of them either....I've never seen an ad about them, I've never heard them mentioned in the piles of blogs and articles I read daily, and nobody has ever recommended them to me. Pandora, meanwhile, HAS been in all of the above.
    • I always see that from the writer's viewpoint, as if he's saying "Look, I know this isn't news, and I'm just getting around to writing about it a few years later, but I really do have something interesting to say about it! So I will acknowledge its apparent staleness with a jokey aside before I get to the point."

      Good thing writing isn't some sort of Rorschach test where we can each imbue it with our own insecurities, eh?

    • Last.FM has been covered on Slashdot before. What other reason, other than living on Mars, does one have for not keeping on top of Slashdot news?

  • Surpassing Pandora (Score:5, Insightful)

    by Paaskonijn (1220996) on Saturday December 27 2008, @04:19AM (#26240413)

    The company surpassed Pandora and others largely due to its unique datamining features

    I would think that being available outside of the USA may have helped quite a bit as well.

    • There is no world outside of the US. Have you never watched The Truman Show? It's like that. But on a larger scale. All of Iraq is just a big studio in Oregon. You can re-use the same piece of desert loads, and no-one notices.
  • The real danger (Score:5, Interesting)

    by Aerynvala (1109505) on Saturday December 27 2008, @04:44AM (#26240501) Homepage
    with last.fm is how it feeds my OCD issues regarding song playcounts. I nearly lost it when the stupid scrobbler started randomly recording excess playcounts on one album. It screwed with my numbers. Then it stopped counting that album's plays altogether.

    Seriously though, I have found using the site to be pretty enjoyable. And the advertisements are actually worth keeping AdBlock turned off for. I found a few new artists, some unsigned, that way. I like all the various widgets and things that can crunch my data. Songbird has a last.fm plugin/addon that makes for very easy integration. It's just really useful. I've also found concerts on the site.

    I rarely use the social side of it, except with friends I already know. But that's me.
    • Re: (Score:3, Interesting)

      Haha, if it gives you any comfort, I'm the same way. With how iTunes/iPod work - incrementing the count when the track finishes - I'm constantly waiting for songs to end before picking another one, or leaving tracks that have silence at the end to finish completely. Really wish it incremented at 75% complete or something.

      • Oh yes. I actually have a playlist called Playcount that gets changed out any time I need to even up my numbers. And actually, the Audioscrobbler (at least last I looked) didn't properly count a song if you have it on Repeat One. Very annoying.
      • > I'm constantly waiting for songs to end before
        > picking another one, or leaving tracks that have
        > silence at the end to finish completely. Really
        > wish it incremented at 75% complete or something.

        Amarok submits to Last.fm after playing about half of the track. Yet another reason to use Amarok...

        • Re: (Score:3, Informative)

          Yet another reason to use Amarok...

          On his iPod?

          GP was talking about the iTunes play counts, not the Last.fm play counts. Every app/plugin I've tried (including the official Last.fm app) either scrobbles at 50% or allows the user to configure the percentage. Yet another reason to be free to use whichever media player one prefers...
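The "scrobbles at 50% or lets the user configure the percentage" behaviour described above can be sketched as a simple threshold check. `should_scrobble` and its parameters are hypothetical names, not taken from any real plugin, and the sketch ignores any extra rules an actual Audioscrobbler client may apply.

```python
def should_scrobble(played_seconds, track_seconds, threshold=0.5):
    """Return True once the played fraction of the track reaches `threshold`."""
    if track_seconds <= 0:
        # Unknown or zero-length tracks are never counted in this sketch.
        return False
    return played_seconds / track_seconds >= threshold


# A 4-minute (240 s) track with the default 50% threshold.
print(should_scrobble(120, 240))                  # exactly half played
print(should_scrobble(100, 240))                  # skipped before the halfway point

# The same track with a user-configured 75% threshold.
print(should_scrobble(180, 240, threshold=0.75))
```

By contrast, the iTunes play count mentioned earlier in the thread behaves like a threshold of 1.0, which is why skipping near the end of a track never increments it.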

      • Re: (Score:3, Informative)

        I know what you mean, with my smart playlists that keep out the songs I've played in the last 5 days I always let the songs end too. As for the silence thing just edit the properties of a song in iTunes to start/finish at the time code of your choice. It's very convenient to skip intros and such too.
  • So, I got PHORM [wired.com] monitoring my browsing habits and Audioscrobbler monitoring what I listen to. Does anyone here, apart from me, find that just a little bit creepy?

    'Without privacy, there cannot be freedom. And without freedom, there cannot be personal or social growth'
    • Not half as creepy as those websites you've been visiting; and let's not get started about your taste in music...

  • by Lazy Jones (8403) on Saturday December 27 2008, @12:34PM (#26242531) Homepage Journal

    The company surpassed Pandora and others largely due to its unique datamining features: 'Audioscrobbler,'

    I'd say they surpassed Pandora only because Pandora locked out all non-US users a while back. For people who just wanted to listen to music and find out about new artists, Pandora was so much better IMO, last.fm has a clunky, overloaded UI and is too much like myspace ...

  • Then why the hell is it that when I run the "Recommendations [www.last.fm]" stream the algorithm occasionally freaks out and starts pushing one unlistenable noise attack after another at me with tags like brutal death metal [www.last.fm], cybergrind [www.last.fm], czech [www.last.fm], death metal [www.last.fm], deathgrind [www.last.fm], goregrind [www.last.fm], grind [www.last.fm], grindcore [www.last.fm], noisecore [www.last.fm], porngrind [www.last.fm], pornogrind [www.last.fm], etc. No matter how many times I click the "Do Not Want" button the stuff just keeps coming. It's like a neighbour from hell. And then there's the days when I get nothing but lesbian deathcore vegan grind [www.last.fm].

    The Last.FM brainfarts seem to persist no matter how many times you try to train the recommendation engine using the like/ban buttons and the only way to get them to "reset" to something vaguely approximating normality is to log out, log back in, and run the Library [www.last.fm] stream for a while.

    Still, even with this weirdness it's still better than Pandora at finding new music I actually like.