Open Source

VLC Tops 6 Billion Downloads, Previews AI-Generated Subtitles (techcrunch.com) 59

VLC media player, the popular open-source software developed by nonprofit VideoLAN, has topped 6 billion downloads worldwide and teased an AI-powered subtitle system. From a report: The new feature automatically generates real-time subtitles -- which can then also be translated in many languages -- for any video using open-source AI models that run locally on users' devices, eliminating the need for internet connectivity or cloud services, VideoLAN demoed at CES.

  • by Baron_Yam ( 643147 ) on Thursday January 09, 2025 @09:10AM (#65075051)

    Translation is difficult to do well, and it seems like something you could tune an AI to do with reasonable accuracy.

    I'm going to need an option to exclude one or more languages; I really only need subtitles for those languages I don't speak. And an image processing AI that can look for embedded subtitles and turn off its own subtitles if embedded ones exist would be a good feature.

    Then again, with a separate channel for dialog, why not that fancy new AI auto-dubbing that sounds like the original actor's voice and delivery? Nothing stops the system from working ahead a few seconds to get the audio ready.

    • by Joce640k ( 829181 ) on Thursday January 09, 2025 @09:30AM (#65075105) Homepage

      AI, schmayai.

      What I want is (a) a version that can buffer more than 10 KB of data, so it doesn't all go to hell when I'm watching over WiFi and a scene with rain or sea or whatever comes on. I've got 16 GB of RAM, FFS. The entire file could fit in memory, but nooooo, we can only buffer a few KB because more would be wasteful!

      and (b), a version where I can single-step more than a dozen frames without it crashing.

      and (c), a version that can re-open the same file and continue if the network glitches or if I put my computer to sleep halfway through watching something. The file's still there, use it!

      • I've also used a few systems that won't start until the buffer is full. There ought to be two limits - the minimum required to start, and the maximum memory possible to allocate.

        If the download can happen faster than I watch (and these days, it almost always should), then download the whole damn thing and go offline once done.

        Though a throttle option would be good too, so you don't choke your connection for everyone else during the initial stage.
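A rough sketch of the two-limit idea plus a throttle, in Python. This is not how VLC or any real player buffers; `BufferPolicy`, `simulate`, and every parameter name are made up for illustration, and time is modelled in whole seconds to keep the arithmetic simple.

```python
from dataclasses import dataclass


@dataclass
class BufferPolicy:
    """Hypothetical buffering policy: a start threshold, a memory cap, a throttle."""
    min_start_bytes: int       # playback may begin once this much is buffered
    max_buffer_bytes: int      # never hold more than this in memory
    throttle_bytes_per_s: int  # cap on download rate, to spare the connection

    def __post_init__(self) -> None:
        # A start threshold above the cap could never be reached.
        if not 0 < self.min_start_bytes <= self.max_buffer_bytes:
            raise ValueError("need 0 < min_start_bytes <= max_buffer_bytes")
        if self.throttle_bytes_per_s <= 0:
            raise ValueError("throttle_bytes_per_s must be positive")

    def can_start(self, buffered: int) -> bool:
        return buffered >= self.min_start_bytes

    def next_request(self, buffered: int, remaining: int) -> int:
        """Bytes to fetch in the next one-second tick."""
        room = max(0, self.max_buffer_bytes - buffered)
        return min(room, remaining, self.throttle_bytes_per_s)


def simulate(policy: BufferPolicy, file_size: int, play_rate: int):
    """Return (second playback started, second the download finished)."""
    if play_rate <= 0:
        raise ValueError("play_rate must be positive")
    buffered = downloaded = tick = 0
    started_at = None
    while downloaded < file_size:
        chunk = policy.next_request(buffered, file_size - downloaded)
        downloaded += chunk
        buffered += chunk
        # Start once the threshold is met, or if the whole file has arrived.
        if started_at is None and (policy.can_start(buffered) or downloaded == file_size):
            started_at = tick
        if started_at is not None:
            buffered -= min(buffered, play_rate)  # the player consumes data
        tick += 1
    return started_at, tick
```

With a cap larger than the file, the download finishes as fast as the throttle allows and the connection can then drop; with a small cap, the loop only fetches as room opens up.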

      • by r1348 ( 2567295 )

        I'd like a Google TV version that didn't break UPnP/DLNA browsing, making it completely useless on my TV.

      • VLC sucks. Windows: https://github.com/Aleksoid197... [github.com]. MacOS: https://firecore.com/ [firecore.com] (paid)
    • something you could tune an AI to do with reasonable accuracy.

      Google Translate's been running on an early prototype of LLMs since 2016. And earlier versions of it still used a lot of tech from the AI research space. I'd argue it was never not AI.

    • by Luckyo ( 1726890 )

      YouTube has had that for what, half a decade? This is pre-current gen AI technology.

      And you already have the AI auto-dubbing; just like with subtitles, it still misses a lot, so it's not very good. You can see some of the cutting edge of this technology in things like Fridman's recent Zelensky interview, where they managed to slap on three different audio tracks in three different languages, all with the speaker's own voice, auto-transcribed and translated (helped by a human) using cutting edge current gen AI.

      It's ju

      • by oneiros27 ( 46144 ) on Thursday January 09, 2025 @10:43AM (#65075355) Homepage

        But the YouTube version doesn't run on your local system.

        And as someone who keeps closed captions on all the time... automated transcripts have been around for years now. They're especially bad with people with strong accents, and I've seen a few cases where they corrupt the text so badly as to make it dangerous (home improvement and cooking shows).

        Manual translations would sometimes correct themselves... you'd see a word disappear and then be replaced. Especially during live TV such as news broadcasts.

        • Microsoft and Apple have subtitle generators in their accessibility tools, which I think work locally, and certainly work reasonably well.
          We've had voice recognition since at least the 1990s with things like IBM Via Voice. It has certainly improved a lot since those days, but nevertheless it is not new.

    • by Kisai ( 213879 )

      AI translation is about 50% accurate, at best. It's good enough for most mainstream content.

      - Translation cannot deal with onomatopoeia
      - Translation cannot deal with people talking over each other
      - Translation cannot deal with music

      So that AI translation is not going to work on sex scenes in porn, violent action scenes full of music and bullet/explosion sounds, and it won't work on films/television shows with poorly balanced audio.

      Usually where it fails the hardest is with technical jargon and names. As an example f

    • AI translation is great, and also it's awful, at least what I see from FB and Google. AI spoken language understanding is great, but also it's awful, at least what I see from Google.

      Human subtitling of foreign language video is great, but also it's awful, at least what I see from Netflix.

      It would be nice if VLC worked on providing a better UI, rather than jumping on the latest bandwagon.

      If they really want to use AI, how about fixing low-resolution videos of famous people to be high resolution based on exi

    • AI Translation of audiovisual content is already sub-par for Anything-to-English. Some may find it acceptable in some scenarios, others not.
      But it gets much worse when it translates to other languages with more grammar features like gender distinction or declensions. The reason is quite simple: AI translation lacks context of the scene. For example: "Give it to me" could be "dámela" or "dámelo" in its literal sense or could be "castígame" in a figurative sense. That's just a small example. I

      • Absolutely agreed, but there's a lot of potential yet to explore - so far I believe it's all word recognition and translation, but eventually someone is going to train an AI on data that includes 'previous sentence(s)', 'following sentence(s)' (because this task not only doesn't have to be done in real time, it can actually look ahead), and also 'speaker tone and relative volume'.

        True understanding shouldn't be necessary to get it so good you stop caring about the occasional errors. What it needs is a gian
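To make the "previous sentence(s), following sentence(s), tone and volume" idea concrete, here is a minimal Python sketch of how such training examples might be assembled. Nothing here comes from VLC or any real model; `Cue`, `Example`, `build_examples`, and the tone and volume fields are all hypothetical stand-ins.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Cue:
    """One recognized line of dialog (hypothetical structure)."""
    text: str
    tone: str = "neutral"          # assumed label from some tone classifier
    relative_volume: float = 1.0   # assumed loudness relative to the track average


@dataclass
class Example:
    """A line to translate, bundled with its surrounding context."""
    previous: List[str]
    current: str
    following: List[str]
    tone: str
    relative_volume: float


def build_examples(cues: List[Cue], before: int = 2, after: int = 1) -> List[Example]:
    """Attach up to `before` earlier and `after` later lines to every cue.

    Looking ahead is possible because the work does not have to be done in real time.
    """
    examples = []
    for i, cue in enumerate(cues):
        previous = [c.text for c in cues[max(0, i - before):i]]
        following = [c.text for c in cues[i + 1:i + 1 + after]]
        examples.append(
            Example(previous, cue.text, following, cue.tone, cue.relative_volume)
        )
    return examples
```

For a line like "Give it to me.", the neighbouring lines and the tone label are the kind of signal a translator would need to pick between a literal and a figurative rendering.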

    • by narcc ( 412956 )

      Translation is difficult to do well, and it seems like something you could tune an AI to do with reasonable accuracy.

      Shaka, when the walls fell.

      I'm going to need an option to exclude one or more languages

      ... because you find the existence of options you don't personally want to use offensive?

      I really only need subtitles for those languages I don't speak

      An increasing number of people are using CC because they can't hear the dialog over the background music. [theatlantic.com]

      why not that fancy new AI auto-dubbing that sounds like the original actor's voice and delivery?

      Because even at its best, the results are terrible.

      Nothing stops the system from working ahead a few seconds to get the audio ready.

      Wait ... You want to do that on demand?! I'm not sure I could come up with an approach that's more needlessly wasteful.

  • AI-powered subtitles? Meh, you can just about run Whisper on a pop-up toaster these days and there are lighter solutions that work well enough.

  • by Qbertino ( 265505 ) <moiraNO@SPAMmodparlor.com> on Thursday January 09, 2025 @09:11AM (#65075057)

    ... one of those hero projects of FOSS.

    AFAIK the project lead turned down a million dollar offer to run ads on VLC 15 years ago or so and decided to keep the project clean.

    I hope he's doing well. Awesome software.

    • And yet the video adjustments still don't work. Just try playing a video then tweaking the brightness.

      I like VLC, but only because I've yet to find anything better.

    • I just had a shiver as you brought back memories of the stuff that would have an installer that would install crapware on your machine (gator, browser toolbars, etc)

      I think some of the 'download' websites 20-ish years ago repackaged things to do that sort of shit, too.

      (I think cnet was one of those... I think cdrom.com was clean; tucows and download.com I don't remember)

  • My toaster doesn't need AI.
    Nor does my thermostat.
    and especially not my car.

    I have an idea: build a car with hand crank windows, carburetors, and a gasoline engine. The type of car you can fix yourself... not dependent on electronic chips...
    Go fund me.
    • Okay, but this is a media player. It's already software. Heck it's already using your GPU when it can.

      • Yeah, but this is still bloat. I want a media player to play media files using as little resources as possible. Yes, I understand that decoding a 4K h265 file requires a lot of power, but if I'm playing a SD file, then it should not use lots of RAM or CPU.

        I use VLC infrequently - when I want to test various HLS streams and when I want to play some weird format file that other players do not want to play. Otherwise I use mpc-hc if I'm on Windows - it has a nice interface and seems to not use a lot of resourc

        • I would hope this module doesn't allocate any resources unless turned on. I know basic competence seems to be a lot to ask from software in 2025, but I still trust VLC to pull it off.

        • by haruchai ( 17472 )

          Too bad it's Windows-only
          also this: "MPC-HC is not under development since 2017. Please switch to something else."

          • I still use my Grandad's hammer. Still works as well as it did when he bought it in about 1930.

            In the case of software, if it does what it's intended to, and does it well, why change it? "Just because"?

            This is how good software becomes shit. Idiots can't leave well enough alone and all of a sudden your simple tool, which did one thing really well, now has a built in email client and a shiny new UI (where you can't find anything and all the features you use are now 92 clicks away in a "hamburglar" menu i

          • So, when I watch an episode of Babylon 5, which is not under development since 1998 everything should be OK.
            Also, it works well, why develop it further, possibly adding bloat?

        • Yes, I understand that decoding a 4K h265 file requires a lot of power, but if I'm playing a SD file, then it should not use lots of RAM or CPU.

          Why do you believe this? I would expect newer codecs to use LESS RAM and CPU. Otherwise, what's the point of them?

          • The point of newer codecs is to reduce the file size while keeping most of the quality.
            MPEG2 uses less CPU than h264, but at the same quality, an h264 file is smaller.

            So, a 4K video is either going to take up a lot of space, or require a lot of CPU, and that is OK. What's not OK is when the player uses lots of CPU just for itself, even if playing an MP3 or a DivX file.

    • Yeah. I have an old car with a carburetor, crank windows etc. It has a somewhat modern tape deck with microprocessors though.
      The biggest problem for me is rust - steel rusts, especially when exposed to salt from the road. I have to get some rust holes patched once in a while.

      But yeah, I would only consider buying a brand new car if they made one using plans from the 70s or 80s with the only allowable alterations being better, corrosion resistant alloys and better rust-proofing.

      • You'd buy a 70s era death trap with a carb and points that need adjusting? Reliable fuel injection was perfected in the 1980s. You'd be hard pressed to find a vehicle today that won't make 100k with only oil changes.

        • Gapping plugs and timing lights, what's not to love? :)

        • by DarkOx ( 621550 )

          Says someone who has never once had to deal with 80s vintage fuel injection after the 80s.

          Essentially no diagnostic capability, and mechanical things like MAF sensors that are hard to verify and may be wildly erratic in terms of performance when worn. 300 feet of hose, all of which is dried up and cracking, and if any of it leaks the entire system runs like crap and you will be replacing all of it at once because, again, no diagnostics. Then the thing still won't run right because one of those injectors has bad pa

        • What about lasting many decades? My car is over 40 years old and is still running. Would a car bought today last that long? How expensive are its parts?

          I actually have a car with points and it's not that big of a deal to adjust or replace them. They are also cheap. My main car has electronic ignition though. Still with a carb.

    • Re:No Thanks. (Score:5, Insightful)

      by JamesTRexx ( 675890 ) on Thursday January 09, 2025 @09:55AM (#65075193) Journal

      Yeah, modern codecs are already a large enough burden on my Core 2 Duo when playing movies; I don't need unneeded features choking the life out of it.

      I use mpv instead of VLC because it's easier and more basic in use for me.

      • by Tx ( 96709 )

        The downside of VLC's cross-platform nature is it's never been that optimised on any one platform. It's a fantastic "swiss army knife" player in terms of features and capabilities, though.

    • My toaster doesn't need AI. Nor does my thermostat. and especially not my car. I have an idea: build a car with hand crank windows, carburetors, and a gasoline engine. The type of car you can fix yourself... not dependent on electronic chips... Go fund me.

      Buy a classic car. They still exist. And some of them are available for less than new cars. Get a body style that's not considered sporty and they're downright cheap. Some of them were purchased by Boomers right before they hit the no-driving age and stuck in a garage somewhere for the last thirty to forty years. Hell, my Dad just bought a SS Malibu from 69 that had less than 10k on it. And you can work on those things without having to dismantle the entire engine compartment to do it. No need to build a ne

    • I'd buy that car.

      • That is basically what Volkswagen was. I thought I wouldn't be the only one who would want low tech. We could very well get one from Tata.

        Who would have believed that music on vinyl would return to great popularity?
    • My toaster does need AI. It can never get toast right -- either it burns or is not toasted. Ok, maybe a slightly lower tech solution might be available, like in toasters of the 60s.

      • The $15 (CDN) toaster, with plastic sides, that I bought after the wifey gave me the heave-ho around 20 years ago is still making excellent toast... the up/down lever is slightly wobbly from use, but you can't really make better toast. I can't see how adding silicon chips and software is going to help ME.

        I'll look up the brand for you :-)
    • Crank windows? My car has crank linux.

  • The AI-generated translation would be a great addition. But I'm not using it if it uses Google as a backend.

    • TFS is only like two sentences.

      using open-source AI models that run locally on users' devices

  • by geekmux ( 1040042 ) on Thursday January 09, 2025 @10:39AM (#65075337)

    for any video using open-source AI models that run locally on users' devices, eliminating the need for internet connectivity or cloud services..

    Adobe Reader v5.x was approx. 5MB in size. It was used to read PDFs.

    Adobe Reader v24.x is over 575MB in size. It is used to read PDFs.

    As companies offer offline AI-enabled services, what are we to expect from software bloat? What is the real value-add of AI services being offline? How many versions of the same AI engine are we to expect to be found coming online and outdated, “chatting” with each other? Will the T-1000 even talk to the T-800, or is that Boomer-nator too old school (and racist) to be in the same AI social circles?

    • Re: (Score:2, Troll)

      by AlvySinger ( 900304 )

      The singularity will actually be when technology becomes pointless. Have AI write your emails! Have AI read your emails! What's the point? Why not just come up with a better means of communicating and models for working?

      Previously, Google developed an agent to make telephone appointments for you. Nice for the user, a misery for the poor bugger talking to an AI. Why not just make IT ubiquitous through open APIs, etc. that make this direct and simple?

      Technology used to solve problems. Now it's turning into a

  • A job *ripe* for AI (Score:4, Interesting)

    by Tony Isaac ( 1301187 ) on Thursday January 09, 2025 @02:55PM (#65076261) Homepage

    Have you ever watched closed captioning on live TV events? It's always laggy and inaccurate. They miss a ton of stuff that's said, and it's three sentences behind. I would *much* rather see AI closed captioning, if I'm in a place where the volume can't be turned up.

  • And it still cannot pause on a mouse or touchpad/touchscreen click. This is the single reason I am not using VLC for everyday video watching.

  • Let's hope VLC does a better job than the live captions option that Microsoft Teams has. I switch on that option to see how hilariously it mangles the English language :-) It's rare for it to get one complete sentence right and this is from a trillion dollar company. It's frankly embarrassing and completely useless as a text record of your Teams meeting.
