YouTube Now Controls Its Hardware Roadmap (techspot.com) 29
An anonymous reader shares a report: Partha Ranganathan came to realize about seven years ago that Moore's law was dead. No longer could the Google engineering VP expect chip performance to double roughly every 18 months without major cost increases, and that was a problem considering he helped Google construct its infrastructure spending budget each year. Faced with the prospect of getting a chip twice as fast every four years, Ranganathan knew they needed to mix things up. Ranganathan and other Google engineers looked at the overall picture and realized transcoding (for YouTube) was consuming a large fraction of compute cycles in its data centers. The off-the-shelf chips Google was using to run YouTube weren't all that good at specialized tasks like transcoding. YouTube's infrastructure uses transcoding to compress video down to the smallest possible size for your device, while presenting it at the best possible quality.
What they needed was an application-specific integrated circuit, or ASIC -- a chip designed to do a very specific task as effectively and efficiently as possible. Bitcoin mining rigs, for example, use ASICs designed for that sole purpose. "The thing that we really want to be able to do is take all of the videos that get uploaded to YouTube and transcode them into every format possible and get the best possible experience," said Scott Silver, VP of engineering at YouTube. It didn't take long to sell upper management on the idea of ASICs. After a 10-minute meeting with YouTube chief Susan Wojcicki, the company's first video chip project was approved. Google started deploying its Argos Video Coding Units (VCUs) in 2018, but didn't publicly announce the project until 2021. At the time, Google said the Argos VCUs delivered a performance boost of anywhere from 20 to 33 times compared to traditional server hardware running well-tuned transcoding software. Google has since flipped the switch on thousands of second-gen Argos chips in servers around the world, and at least two follow-ups are already in the pipeline.
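To make the transcoding workload concrete, here is a minimal sketch of the kind of software transcode ladder the Argos chips replace, assuming ffmpeg with libx264 is installed; the rendition list and settings are illustrative placeholders, not YouTube's actual pipeline.

    import subprocess

    # Hypothetical rendition ladder -- illustrative only, not YouTube's actual settings.
    RENDITIONS = [
        ("1080p", "1920x1080", "5000k"),
        ("720p", "1280x720", "2500k"),
        ("480p", "854x480", "1000k"),
    ]

    def transcode(src):
        # One software encode (libx264) per rendition; this per-output cost is
        # what a dedicated transcoding ASIC moves into silicon.
        for name, size, bitrate in RENDITIONS:
            subprocess.run(
                ["ffmpeg", "-y", "-i", src,
                 "-c:v", "libx264", "-preset", "slow",
                 "-b:v", bitrate, "-s", size,
                 "-c:a", "aac", "-b:a", "128k",
                 name + ".mp4"],
                check=True,
            )

    transcode("upload.mp4")

Every upload runs through something like this once per output format, which is why moving it into dedicated hardware pays off at YouTube's scale.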
Moore's law in Monty Python? (Score:2)
People keep saying Moore's Law is dead, but like the guy in the Monty Python movie, it keeps saying "I'm not dead yet!"
https://www.intel.com/content/... [intel.com]
Eventually it'll happen, I'm sure.
Re: (Score:3, Informative)
That's transistor count though, not performance. Somewhere around 2008, Moore's law switched from scale-up to scale-out, because the scale-up doubling period had more than doubled to about 36 months per doubling of performance; then half a decade later scale-out stopped being viable too (you can't put 128 'i' cores in an office workstation...), so we've really slowed down. Today's i9-12900 isn't twice as fast as 2020's i9-10900, for example; it's only about 35% faster.
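Running the numbers on those claims (a sketch; the 35% figure is the poster's, the rest is arithmetic):

    # Annualized improvement implied by different doubling periods, vs. the
    # poster's 35%-in-two-years CPU example.
    def annual_growth(factor, years):
        return factor ** (1.0 / years) - 1

    print(f"double every 18 months: {annual_growth(2, 1.5):.0%}/yr")    # ~59%/yr
    print(f"double every 36 months: {annual_growth(2, 3.0):.0%}/yr")    # ~26%/yr
    print(f"1.35x over 2 years:     {annual_growth(1.35, 2.0):.0%}/yr") # ~16%/yr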
I think you mean 1965 (Score:5, Informative)
> That's transistor count though, not performance. somewhere around 2008 moore's law switched
Somewhere around 1965, Gordon Moore made his famous observation about *transistor density*. It never was about performance.
Maybe you mean somewhere around 2008 somebody wasn't paying attention and thought that Moore had said something about performance 43 years earlier?
Kinda like a lot of people think ESR said "there will never be any bugs", when in fact he said that when bugs are found, having a lot of people looking at the bug will help find the best solution quickly.
Re:I think you mean 1965 (Score:4)
I suppose people can make their own, possibly unrelated statements, and somehow believe that whatever thought they have is somehow related to Moore's Law.
Gordon Moore said something very specific. Moore's Law is what Moore said. He defined a specific rate of growth, and a specific metric. Any other metric may be interesting or not interesting, but it's not Moore's law.
Moore's Law, what Moore said, is the doubling of density for the process which results in the lowest cost per gate. That's Moore's law. It may not be AC's law or Ted's law or Bob's law, but it's Moore's law.
Took them long enough (Score:1)
The TV and streaming video industries have been using specialized transcoders for over 20 years.
Re: (Score:2)
Yeah, but that hardware is stupid expensive and Google is a cloud-scale operation. I see why they did their own thing. I can't find it now, but at one point I recall Google was courting AMD or Nvidia for specialized encoders and ended up not liking their output quality enough. Their own Argos VCUs rival x264/x265 and commercial encoders, while AMD's and Nvidia's offerings are 'budget' quality.
I do some HLS encoding, and I cannot get NVENC or Quick Sync to output the same size/quality as YouTube, but I can get x264 to.
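For reference, a sketch of the comparison being described: the same clip through the software encoder (libx264) and a hardware encoder (NVENC) at similar quality settings, so the output sizes can be compared. The quality and preset values are illustrative assumptions, not a tuned recipe.

    import subprocess

    # Encode one clip with a software encoder and a hardware encoder at
    # comparable quality targets, to compare file sizes afterwards.
    for codec, quality, out in [
        ("libx264", ["-preset", "slow", "-crf", "21"], "x264.mp4"),
        ("h264_nvenc", ["-preset", "p7", "-cq", "21"], "nvenc.mp4"),
    ]:
        subprocess.run(
            ["ffmpeg", "-y", "-i", "input.mp4", "-c:v", codec] + quality
            + ["-an", out],  # drop audio so file size reflects video alone
            check=True,
        )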
Re:Took them long enough (Score:5, Interesting)
YouTube has some interesting scaling issues; as of 2020, YouTube was receiving 500+ hours of video every minute. Those uploads arrive in many different formats and resolutions, and every video needs to be converted into many different format+resolution tuples. Most streaming services get less data and serve far fewer streams (both simultaneous streams and distinct videos).
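A back-of-the-envelope on those numbers; the codec and resolution lists here are illustrative guesses, not YouTube's actual matrix.

    from itertools import product

    # Rough scale of the fan-out: every upload becomes many (codec, resolution) outputs.
    codecs = ["h264", "vp9", "av1"]
    resolutions = ["2160p", "1440p", "1080p", "720p", "480p", "360p", "240p", "144p"]
    renditions = list(product(codecs, resolutions))
    print(len(renditions), "outputs per upload")  # 24 with these lists

    upload_hours_per_day = 500 * 60 * 24  # 500+ hours arriving every minute
    print(upload_hours_per_day * len(renditions), "hours of video to encode per day")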
This is not a new story, but it's still quite an achievement.
Moore’s Law (Score:5, Insightful)
Re: (Score:2)
Right, but it's been used with the implication that doubling the transistors doubles the work done. Not really the case, but it worked for almost 50 years!
Any idea when these will hit Google's Cloud? (Score:2)
Or are they too proprietary to let out of the organization?
Re: (Score:2)
Re: (Score:1)
After all, how many people need a cloud server to transcode video?
Quite a number, probably. Beyond services like Twitch (which wouldn't be using GC, but that's beside the point), Pornhub, Vimeo, Dailymotion, TikTok, etc., services like Zoom, WebEx, etc. can benefit from fast, high-quality transcoding. So could anything in a bandwidth-constrained environment attempting to do real-time video.
Re: (Score:2)
Big difference between real-time transcoding and video file transcoding.
Re: (Score:2)
So could anything with a bandwidth constrained environment attempting to do real-time video.
In constrained environments, isn't it easier and cheaper not to do real-time transcoding? Limiting the formats to the common ones seems a much better solution in a constrained environment than using a potentially expensive cloud service.
FPGAs? (Score:4, Interesting)
Do these ASICs do the whole encoding in hardware, or do they just accelerate the grunt work?
Wouldn't re-programmable FPGAs allow way more updatability and flexibility?
Re: (Score:3, Informative)
Link to a review of those chips one year ago. https://www.tomshardware.com/n... [tomshardware.com]
Re: (Score:3)
Yes, I was surprised ASICs were cost-effective. I would have thought FPGAs would be the way to go... they'd probably get 1/2 to 1/4 the performance of an ASIC, but I think they'd be much less than 1/2 to 1/4 the price.
Re: (Score:3)
Perhaps it takes fewer ASICs than you thought for them to come out cheaper than FPGAs.
I think what you're forgetting is that more chips mean more servers and more power. If FPGAs literally deliver 1/4 the performance, then Google will need 4 times the power budget for video transcoding, 4 times the rack space, 4 times the cooling equipment...
Re: (Score:2)
I guess not; I don't know the relative costs, but I'd assume that ASICs are hugely expensive up front and cheaper in volume. It surprises me that a few thousand is a high-enough volume to make ASICs competitive, but maybe things have changed since I was in the hardware world quite a few years ago.
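For intuition, a toy break-even under the thread's assumptions (roughly four FPGAs to match one ASIC's throughput); every dollar figure below is a made-up placeholder, not a real price.

    # Toy FPGA-vs-ASIC break-even. All numbers are hypothetical placeholders
    # to show the shape of the calculation.
    ASIC_NRE = 20_000_000   # one-time design + mask costs (hypothetical)
    ASIC_UNIT = 50          # per-chip cost at volume (hypothetical)
    FPGA_UNIT = 2_000       # per-FPGA cost (hypothetical)
    FPGA_PER_ASIC = 4       # FPGAs to match one ASIC's throughput (per this thread)

    for n in (1_000, 2_500, 10_000):
        asic = ASIC_NRE + n * ASIC_UNIT
        fpga = n * FPGA_PER_ASIC * FPGA_UNIT
        print(f"{n:>6} units: ASIC ${asic:,} vs FPGA ${fpga:,}")

With these placeholders the crossover lands near a few thousand units, before even counting the extra power, rack space, and cooling mentioned above; real NRE and unit costs could move it a lot in either direction.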
Re: (Score:2)
FPGAs with enough gates and high enough clock frequencies to do what they need are pretty expensive, not least because you end up paying for a load of hardware you don't actually need. Being generic devices, they offer all sorts of stuff that Google won't use. Plus they have to pay the manufacturer of the FPGA their profit margin, which covers their R&D etc.
The price of getting ASICs fabbed has come down a lot. Considering the scale YouTube operates on, I can believe that making their own ASICs is cheaper.
Re:FPGAs? (Score:4, Informative)
Re: (Score:2)
FPGAs are for low volume. They usually run very hot compared to ASICs.
Re: (Score:2)
x264 can easily beat hardware H.264 encoders in efficiency, but at a huge cost in added CPU time. It has had a very long time to improve.
I'm not sure whether H.265, VP9, or AV1 have similarly mature software encoders. It wouldn't surprise me to learn that these codecs have become so complicated, and so inherently slow, that spending time iterating on software encoders isn't a good use of time.
Re: (Score:2)
Wouldn't re-programmable FPGAs allow way more updatability and flexibility?
At the cost of performance. That's generally the tradeoff between FPGAs and ASICs. FPGAs are ... well ... field-programmable. ASICs are designed once for ... well ... application-specific purposes.
Imagine that... (Score:2)
Purpose-specific hardware is better suited for its specific purpose than general-compute hardware that isn't designed specifically for that purpose.
Weird.
Also pushed work to users (Score:3)
Google Photos got excellent video stabilization, for instance.
YouTube kept that proprietary for a while, so you would upload to YouTube preferentially, but when cycles got tight everybody got stabilization (and monetization).
I could tell things had dramatically sped-up! (Score:3)
The processing time on my uploaded videos has drastically dropped from a year ago. I wondered what the difference was.
Now I G.I. Jooooooe!
someone at Google had an epiphany (Score:2)
So many engineers, and only now do they figure it out...