Youtube

How a Computer That 'Drunk Dials' Videos is Exposing YouTube's Secrets (bbc.co.uk) 54

An anonymous reader shares a report: How many YouTube videos are there? What are they about? What languages do YouTubers speak? As of 14 February 2025, the platform will have been running for 20 years. That is a lot of video. Yet we have no idea just how many there really are. Google knows the answers. It just won't tell you.

Experts say that's a problem. For all practical purposes, one of the most powerful communication systems ever created -- a tool that provides a third of the world's population with information and ideas -- is operating in the dark. In part that's because there's no easy way to get a random sampling of videos, according to Ethan Zuckerman, director of the Initiative for Digital Public Infrastructure at the University of Massachusetts at Amherst in the US. You can pick your videos manually or go with the algorithm's recommendations, but an unbiased selection that's worthy of real study is hard to come by.

A few years ago, however, Zuckerman and his team of researchers came up with a solution: they designed a computer program that pulls up YouTube videos at random, trying billions of URLs at a time. You might call the tool a bot, but that's probably overselling it, Zuckerman says. "A more technically accurate term would be 'scraper'," he says. The scraper's findings are giving us a first-time perspective on what's actually happening on YouTube.

[...] The first question was simple. How many videos have people uploaded to YouTube? [...] Zuckerman and his colleagues compared the number of videos they found to the number of guesses it took, and arrived at an estimate: in 2022, they calculated that YouTube housed more than nine billion videos. By mid-2024, that number had grown to 14.8 billion videos, a 60% jump.
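The hits-per-guess reasoning in the summary can be sketched roughly as follows. This is a toy simulation under stated assumptions, not the researchers' actual tool: the function names, the one-million-ID space and the 50,000 "real" videos are all made up for illustration, and the summary does not say how large YouTube's real ID space is.

```python
import random


def estimate_total(hits: int, guesses: int, id_space_size: int) -> float:
    """Scale the observed hit rate up to the whole ID space.

    If IDs are guessed uniformly at random, hits / guesses approximates
    the fraction of the ID space that is occupied by real videos.
    """
    if guesses <= 0:
        raise ValueError("guesses must be positive")
    return hits * id_space_size / guesses


def simulate(id_space_size: int, true_total: int, guesses: int, seed: int = 0) -> float:
    """Build a toy ID space with a known number of valid IDs, then 'drunk dial' it."""
    rng = random.Random(seed)
    # Hypothetical ground truth: which IDs correspond to a real video.
    valid = set(rng.sample(range(id_space_size), true_total))
    # Guess IDs uniformly at random and count how many land on a real video.
    hits = sum(1 for _ in range(guesses) if rng.randrange(id_space_size) in valid)
    return estimate_total(hits, guesses, id_space_size)


if __name__ == "__main__":
    est = simulate(id_space_size=1_000_000, true_total=50_000, guesses=200_000)
    print(f"estimated {est:,.0f} valid IDs (true value in this toy: 50,000)")
```

The estimate only works this simply if the guesses are spread uniformly over the whole ID space; the more guesses are made, the tighter the estimate gets around the true count.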

  • by Pseudonymous Powers ( 4097097 ) on Friday February 14, 2025 @10:31AM (#65166383)
    A computer system that randomly searches URLs to index their content? Isn't that just Google?
    • This whole article is stupid.
    • Google doesn't search randomly. It downloads known pages (home pages for domain names, for example), scrapes them for URLs and then downloads those. Rinse and repeat.

      It is possible to keep pages out of Google's search by not placing links to them in pages that will be crawled. And keep the links out of GMail and other stuff Google tends to exploit.

    • Google doesn't create random URLs, no, it builds lists of likely real URLs based upon root index pages and links from other sites.

      For an idea of what this computer is trying to do, here's the same thing but using telephone numbers: https://www.youtube.com/watch?... [youtube.com]

      • I think you can safely assume that the "drunk dialling" here isn't picking random URLs either. There's a hashing scheme used to access the videos from the youtube web servers, which likely has multiple purposes including load balancing. The valid hash distribution on strings to access videos can be estimated from the list of valid video URLs seen to-date. It's not rocket science.
    • My grand daughter is mildly autistic, and loves YouTube

      The stuff that she manages to find there is absolutely mind-blowing, even with all child safety controls in place

      I remind myself how I used to comb the family encyclopaedia as a youth, so I find some solace in that

    • Isn't that just Google

      If you take the trouble to read the summary, it says "we have no idea just how many [videos] there really are. Google knows the answers. It just won't tell you."

  • by crunchy_one ( 1047426 ) on Friday February 14, 2025 @10:47AM (#65166409)
    I'd hazard a guess that the 60% jump can be attributed mostly to the flood of AI generated garbage content infesting YouTube.
    • the flood of AI generated garbage content infesting YouTube.

      "My cousin's best friend's roommate was making fun of me for being lazy by staying in my room all day, when I caught them conspiring to sell my stuff and kick me out I stopped paying the 10k per month keeping them afloat and left. Days later I had 32 missed calls and numerous panicked texts..."

      Don't get me wrong, I don't mind stories and often listen to audiobooks to help me sleep - if the AI voice is good and the story remains somewhat coherent I'll throw one on if I'm having trouble sleeping, saves mone

    • I'd hazard a guess that the 60% jump can be attributed mostly to the flood of AI generated garbage content infesting YouTube.

      Let's also not overlook the obvious when looking at any trends from 2020 to 2024. That also happens to be when a few billion humans were forced out of their place of employment (creating massive job loss), resulting in millions of amateur YouTubers creating several metric fucktons of content out of boredom and desperation.

      Not really surprising that social media content spiked.

    • I haven't seen one of these AI generated videos, do you have an example?

  • I had a couple of exes that would drunk dial me in the middle of the night. Never underestimate the power of number block in today's cellphones.

  • So why... (Score:5, Insightful)

    by RobinH ( 124750 ) on Friday February 14, 2025 @11:02AM (#65166433) Homepage

    By mid 2024, that number had grown to 14.8 billion videos

    If that's true, then why does YouTube only show me the same 40 or so videos over and over again in my feed?

    • Re:So why... (Score:4, Informative)

      by omnichad ( 1198475 ) on Friday February 14, 2025 @11:41AM (#65166511) Homepage

      Here are some official stats shared last year:
      The average number of views on YouTube videos is 5868. The median is 35
      68%: the proportion of *videos with zero views
      38%: the number of YouTube videos with fewer than 5 views
      44%: the number of YouTube videos with fewer than 100 views
      93%: the number of YouTube videos with less than 1000 views
      34% of YouTube videos concern gaming

      • by jvkjvk ( 102057 )

        Your stats don't stat.

        How can you have 68% with zero views and only 38% with fewer than 5 views, since zero is *definitely* fewer than five?

        Are you just makin' stuff up?

        Source or it is xkcd stat trope time.

        • I'm going to guess that's non-zero. I didn't write it

          • by jvkjvk ( 102057 )

            So those two categories account for 106% of the videos then.

            Either way it appears to be full of shit.

        • from the article (yes I clicked the link, I know I'm not supposed to): 4% had zero views, and fewer than 5 views seems to be about 20% (the graph is not that good, a bit hard to read), and the median is 17 to 32 views. I don't know where the GP found the numbers it is quoting.
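The objection in this sub-thread boils down to one rule: "share of videos with fewer than N views" can only stay flat or grow as N grows. A minimal sketch of that check, assuming "zero views" is read as "fewer than 1 view" (the function name is made up for illustration):

```python
def find_inconsistencies(cumulative):
    """Return adjacent pairs where a cumulative share shrinks as the threshold grows.

    `cumulative` is a list of (threshold, percent) tuples meaning
    "percent of videos have fewer than `threshold` views".
    """
    ordered = sorted(cumulative)
    return [
        (lower, higher)
        for lower, higher in zip(ordered, ordered[1:])
        if higher[1] < lower[1]  # share went down although the threshold went up
    ]


if __name__ == "__main__":
    # Figures as quoted in the parent comment.
    quoted = [(1, 68), (5, 38), (100, 44), (1000, 93)]
    # Figures as read off the article's graph in the reply above (approximate).
    from_article = [(1, 4), (5, 20)]
    print(find_inconsistencies(quoted))
    print(find_inconsistencies(from_article))
```

On the quoted figures the only flagged pair is 68% at zero views versus 38% at fewer than 5, which is the mismatch pointed out above; the figures read off the article's graph pass the check.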

    • the same 40 or so videos over and over again in my feed?

      Probably because, for many people, YouTube is used like cable TV was back in the day - something you put on for noise while you focus on something else. Feeding you something you'd already seen and possibly enjoyed lets you fulfil that need by providing something familiar to have on in the background so you can focus on the other task.

      It's like throwing Star Wars on for the umpteenth time, sometimes you really want to watch it while others it's just something you can have on in the background while doing s

    • If that's true, then why does YouTube only show me the same 40 or so videos over and over again in my feed?

      Are you kidding? I have to resist clicking videos knowing YouTube will flood my feed with more videos of a similar type. Click on any of these and find out:
      Canadian Train Plowing A HUGE Snowdrift https://www.youtube.com/watch?... [youtube.com]
      TOP 10 HARD LANDINGS https://www.youtube.com/watch?... [youtube.com]
      Can 10,000 Lego Bricks Stop a 300-Ton Hydraulic Press? https://www.youtube.com/watch?... [youtube.com]

    • by ddtmm ( 549094 )
      Because you didn't clear your history.
  • by cob666 ( 656740 ) on Friday February 14, 2025 @11:11AM (#65166449)
    That's really just a different flavor of War Dialing and isn't even a relatively new thing, so 'designing a computer program' seems like a bit of an overreach. The old school war dialer called a predefined or sequential list of phone numbers looking for a modem, but this type of URL war dialing is hunting for resources on the same site. If the calls aren't spread out enough, it could look like a DOS attack on the server side. I'm curious to see how they were able to get around that potential problem.
  • by JamesTRexx ( 675890 ) on Friday February 14, 2025 @11:16AM (#65166463) Journal

    How many cat videos are we talking about?

    Asking for a friend.

  • by PPH ( 736903 ) on Friday February 14, 2025 @11:18AM (#65166471)

    Or how many still exist? I've bookmarked "interesting" videos for years. And found that a not insignificant number of them have just disappeared.

    • The number they got to is based on statistical analysis of today. They are trying to figure out how many videos are on YouTube, not how many have been historically uploaded.

  • The researcher mentioned the corresponding example of sampling the US phone number space. However, the distribution of phone numbers is not uniform, at least in part because most area codes are not allocated and it's not clear if phone numbers are uniformly distributed within allocated area codes. Similarly for YouTube videos, it's not clear if YouTube URLs are uniformly distributed. It wouldn't be at all surprising if Google stratified the URL space based on some semantic meaning.

    Maybe the researchers d

  • The article says google won't tell you and "Experts say that's a problem."

    Experts in what? I asked google and it told me:

    "In 2025, about 2.6 million videos are uploaded to YouTube every day, which is equivalent to 518,400 hours of content."
    That's an AI generated answer, of course. So who knows, it could be completely wrong.

    Whatever the data may be, the data for recommendations is in need of a reset. At least I'd like to be able to reset it for my account.
    Every time I look it's the same garbage I didn't want

  • One for each of His names: The Nine Billion Names of God [wikipedia.org].
  • Google will buy this company and add a (beta) feature, which is a new button labeled "I Feel Drunk".

    Well, maybe they already did that. But it's not a "random Youtube" and it's not even a button. It's just called search results, or sometimes, "Generative AI".

    There is by the way a button (tab) on Youtube labeled "New To You" but I'm not sure exactly what it's supposed to be doing. Last time I clicked it, I got a video showing how to make Chloroform. How did they know I was going on a date tonight? Analytics and

  • Nynorsk is the 11th most common language of Youtube Videos, but Svenska doesn't register? Sweden has more than double the population of Norway. Is it because most Swedish Youtubers just use English?
