
 



Media Software News Technology

Video Inpainting Software Deletes People From HD Video Footage 124

cylonlover writes "In a development sure to send conspiracy theorists into a tizzy, researchers at the Max Planck Institute for Informatics (MPII) have developed video inpainting software that can effectively delete people or objects from high-definition footage. The software analyzes each video frame and calculates what pixels should replace a moving area that has been marked for removal. In a world first, the software can compensate for multiple people overlapped by the unwanted element, even if they are walking towards (or away from) the camera."
This discussion has been archived. No new comments can be posted.

  • Summary Fail (Score:2, Informative)

    by Anonymous Coward

    Background has to be static for it to work.

    Nevertheless, an interesting accomplishment.

    • by Knuckles ( 8964 )

      Background has to be static for it to work.

      Nevertheless, an interesting accomplishment.

      Surely the person has to move far enough across the static background to reveal, in at least one frame, what's behind the person? I mean, if I'm standing in front of a dwarf for the whole duration of the video, how is the software to know about the dwarf?

      • It doesn't know, but if this works like Photoshop's content aware fill, it can convincingly fake the rest of the wall. That being said, it's my experience that older, manual methods, usually work better.
        • by Knuckles ( 8964 )

          It doesn't know, but if this works like Photoshop's content aware fill, it can convincingly fake the rest of the wall. That being said, it's my experience that older, manual methods, usually work better.

          Yeah, figured as much but, it's early here :)

        • Re: (Score:2, Insightful)

          by Anonymous Coward

          That being said, it's my experience that older, manual methods, usually work better.

          Yes, but how fast can you manually remove someone from 30 seconds of video? You're talking about manually touching up over 700 pictures. How many can you do in a day? This software would probably do that 30 seconds in 30 seconds.

      • Re:Summary Fail (Score:5, Informative)

        by jimshatt ( 1002452 ) on Friday March 15, 2013 @04:08AM (#43180569)
        But, then, how is anyone else to know about the dwarf. From the viewer's perspective the dwarf doesn't exist. For that matter, dwarfs might not even exist at all!
        If you look at the video, though, the background doesn't have to be static. Objects moving over other moving objects can be removed as well. But, yeah, they have to be visible at some point.
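
        To make the "visible at some point" condition concrete, here is a minimal sketch, assuming a stationary camera and a static background, that recovers the background as a per-pixel temporal median. This is not the MPII algorithm (which, per the video, also copes with moving occluded objects); `estimate_background` and the toy frames are hypothetical, chosen only for illustration.

```python
from statistics import median


def estimate_background(frames):
    """Per-pixel temporal median over equally sized grayscale frames.

    Each frame is a list of rows; each row is a list of intensities.
    """
    height, width = len(frames[0]), len(frames[0][0])
    return [
        [median(frame[y][x] for frame in frames) for x in range(width)]
        for y in range(height)
    ]


# Hypothetical 1x5 "video": the background is 10 everywhere and a bright
# object (200) moves one pixel to the right in each frame.
frames = []
for t in range(5):
    row = [10] * 5
    row[t] = 200
    frames.append([row])

background = estimate_background(frames)

# Same scene, but the object never moves off pixel 2.
static_frames = [[[10, 10, 200, 10, 10]] for _ in range(5)]
stuck = estimate_background(static_frames)
```

        In the first clip every pixel is uncovered in four of the five frames, so the median recovers the background value everywhere. In the second clip the object never moves, so the pixel behind it keeps the object's value: nothing in the footage reveals what it hides.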
      • Zoom and enhance.

    • Re:Summary Fail (Score:5, Informative)

      by grumbel ( 592662 ) <grumbel+slashdot@gmail.com> on Friday March 15, 2013 @04:32AM (#43180635) Homepage

      Background has to be static for it to work.

      Nope [youtube.com]

      • Re: (Score:2, Interesting)

        by EdZ ( 755139 )
        Yes, it does, as the video demonstrates. Their algorithm requires a static background (i.e. a stationary camera) and handles any moving foreground objects as user-selected special cases. If the background were to move, they'd need to either motion-compensate the entire footage first (assuming the camera only changed orientation, and not position, so parallax was not an issue), or perform an exhaustive search over the entire footage (which is the specific situation their algorithm is trying to avoid!).
        • Re:Summary Fail (Score:5, Informative)

          by Psyborgue ( 699890 ) on Friday March 15, 2013 @07:20AM (#43181279) Journal
          Absolutely false [youtube.com]. Check the Max Planck page.
        • Why do you claim it's impossible when that's the very thing that's demonstrated in the first few seconds of the video link you responded to?

          I know people don't RTFA, but not to look at the link of the post you're responding to?

        • by SpzToid ( 869795 )

          Using the free Windows software from Microsoft, it is trivial for me to stitch many photos together to make a stunning panorama photo. For example I once sat and made one while a boat pulled in to dock alongside me, to pick up a few passengers and then it pulled away. I probably shot about a dozen photos during this short time, all the while the boat was directly within my panorama. The resulting photo looks amazing, like there's 3-4 boats, (with the same people!). In fact I think I had to shoot quickly to

      • by tgd ( 2822 )

        Background has to be static for it to work.

        Nope [youtube.com]

        That's still a static background, just the camera position and field of view isn't static. The view of the background isn't static, but the background is. If there were moving people in the background, and you wanted to remove someone in between, you've got a problem (because you don't know what the people were doing when they were blocked).

    • by Barryke ( 772876 )

      How about the swinging arm of the guy behind the bench? That is background that is not static, yet it is redrawn with the arm motion neatly interpolated.

      Of course a lot of the background should be static, but it nearly always is unless you're looking at movie SFX or sea footage.

  • by Anonymous Coward on Friday March 15, 2013 @02:19AM (#43180187)

    Researchers have developed video inpainting to remove the character Jar Jar Binks from the Star Wars Prequels.

    • by Barryke ( 772876 )

      This would be a cool use case. I don't mind Jar Jar so much though; there are lots of stranger things in Star Wars, but it'd make a good technology demo.

    • They'd have to develop a similar algorithm for audio though.

      I think I'd find a mute Jar Jar far easier to tolerate than an invisible one.

    • by Dabido ( 802599 )
      ... and replace him with walkie talkies so as not to scare the children!
  • Doesn't the new Galaxy S4 have a similar feature, if I read correctly? Although only for photos.

  • Reflections (Score:5, Interesting)

    by Anonymous Coward on Friday March 15, 2013 @02:41AM (#43180263)

    I liked the fact that you could still see the pedestrians in the reflections of the display window in the video of the musicians, even though they had been erased from the front end. Like the vampire test, but the other way around.

  • oh, great (Score:5, Funny)

    by roman_mir ( 125474 ) on Friday March 15, 2013 @02:46AM (#43180279) Homepage Journal

    So combine this tech with Google Glass and identify people you just don't want to see ever again, and you may end up walking right into them without even knowing.

    • Re: (Score:2, Interesting)

      Rainbows End, by Vernor Vinge. What you described is just the beginning.

    • Already done in a Charlie Stross novel... people are walking the streets but appear as pixellated blurs... Anonymity in the crowd taken to the extreme... Also reminiscent of Peter Watts - Blindsight. We already knew that eyes are unreliable indicators at best, let's not worry until someone is editing memories...
      • Already done in a Charlie Stross novel... people are walking the streets but appear as pixellated blurs... Anonymity in the crowd taken to the extreme... ..

        Sounds like the scramble suits in Philip K Dick's A Scanner Darkly.

    • by geekmux ( 1040042 ) on Friday March 15, 2013 @05:21AM (#43180807)

      So combine this tech with Google Glass and identify people you just don't want to see ever again, and you may end up walking right into them without even knowing.

      "Hey dude, what's u...Ow! Dude, what the hell, you just ran into me!"

      "Oh hey, sorry about that. I had you flagged as spam. Didn't even see you there."

      • I would have to flag some as ham, some as bull, some as chicken.

          If it's possible to replace them with avatars, they would all look like supermodels. All of a sudden it is just me and supermodels around me. Actually I begin to like the idea. But they would have to be dressed or I'd be crushed by incoming traffic.

        For programmers in offices they could replace colleagues with naked supermodels. I wonder what that would do to productivity.

      • Imagine the spam restaurant, a visual spam filter, and Google Glass. Enter the restaurant, order food, eat everything you see. "But you've barely touched the food!" "What do you mean, I've eaten everything!" Epic.

        You could also get rid of the Vikings, if you're not that much into the Nordic stuff. (Just don't do that when they turn violent: You'll never see it coming.)

    • Re: (Score:2, Funny)

      by brillow ( 917507 )

      Oh, when your eyes get hacked and you're unable to see people who don't want to be seen.

      • by EdZ ( 755139 )
        And people complain about CCTV. Just wait until they hear about Interceptors!
    • So combine this tech with Google Glass and identify people you just don't want to see ever again, and you may end up walking right into them without even knowing.

      We all know it would not be removing them but more likely replacing them with some attractive person. For /. probably w/o clothes, or if for Reddit, they'd be turned into cats.

  • by eric31415927 ( 861917 ) on Friday March 15, 2013 @02:46AM (#43180281)

    Ten years ago, I predicted a "nudie button," which, instead of removing people from live video, would simply remove their clothing (through interpolation). I do not endorse the use of such a button on your TV's remote control, I merely predict its future existence.

    • I endorse this!

    • There is a pseudo-app for photos: Nudifier
      https://itunes.apple.com/us/app/nudifier/id554023264?mt=8 [apple.com]

      Someone just needs to take it to the next step. :-)

    • Re:What's next? (Score:5, Interesting)

      by docmordin ( 2654319 ) on Friday March 15, 2013 @03:42AM (#43180485)

      What you proposed isn't that far-fetched, as I ended up having to contrive and implement the equivalent of this, i.e., passive, automated estimation of body shape under clothing, either from a single image or from multiple video frames, for some work I did in action recognition that required a fairly accurate representation of the person's proportions. Others, e.g., A. O. Balan and M. J. Black, "The naked truth: Estimating body shape under clothing," in Proceedings of the European Conference on Computer Vision (ECCV), 2008, pp. 15–29, have come up with solutions too.

      • Make sure always to add inserts under your clothes in the most unexpected (and some obvious) places. Estimate this, mofo.

      • by HtR ( 240250 )

        Sounds like the best research project ever! Tell me, when you set up video experiments, what do you use as a control group? Do you just tape a bunch of test subjects walking around naked?
        Wait a second - do you pick your own test subjects, or do you have to take anyone who volunteers?
        JK.

        • As you noted, the project was fun to undertake, even though it was only a sub-component to a much larger endeavor. I may yet go back, visit it, and submit an extension as a nice stand-alone article.

          To answer your questions, though, I relied on a pool of around seventy subjects, equally distributed across genders and with a tri-modal distribution for age, many of whom were nudists that had heard about the data collection through some friends of mine. I also had a couple of adventurous fellow students and pe

    • ...make sure you turn it off before going into a Walmart or KFC etc.
      If that happens, I predict some nerds with Google Glasses gouging their eyes out...

    • Heheh, I want to see the video with a swarm of hang-gliders removed. Or Indy-car racing sans cars "And as we enter the final lap the head of Al Unser Junior pulls into the lead..."

  • by Anonymous Coward on Friday March 15, 2013 @03:28AM (#43180421)

    If we extrapolate this, perhaps we won't be able to trust video as evidence any longer, so there's no reason to have all these surveillance cameras around.

  • by Anonymous Coward

    This course [coursera.org] on Coursera describes the basics of that technology. It's in its final week but I'm sure it will be reissued later on. They have other courses about computer vision announced for the next months.

    The course was pretty interesting but you don't really have to do any programming to get a grade (programming assignments are optional). Lucky for me, because I have a job and no time to spend on lengthy programming assignments, but one can't become an expert of that subject just by listening at the les

  • by kevingolding2001 ( 590321 ) on Friday March 15, 2013 @03:44AM (#43180491)
    Joseph Stalin would have loved this.
  • by Anonymous Coward on Friday March 15, 2013 @03:44AM (#43180497)

    People have been doing this for far more than the last 5 years. It is a trivial application of so-called 'optical flow' where motion vectors are used to identify independently moving objects within a scene.
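
    A rough sketch of that idea, using exhaustive block matching as a crude stand-in for optical flow (not what any particular film pipeline or the MPII software does). It estimates one motion vector per block, assumes the most common vector is the background/camera motion, and flags the blocks that disagree. `block_motion`, `moving_blocks`, `make_frame` and the 6x6 test frames are hypothetical, made up for illustration.

```python
from collections import Counter


def block_motion(prev, curr, block=2, search=1):
    """Map each block origin (by, bx) in `prev` to the displacement
    (dy, dx) that minimizes the sum of absolute differences in `curr`."""
    h, w = len(prev), len(prev[0])
    vectors = {}
    for by in range(0, h - block + 1, block):
        for bx in range(0, w - block + 1, block):
            best = None
            for dy in range(-search, search + 1):
                for dx in range(-search, search + 1):
                    y0, x0 = by + dy, bx + dx
                    # Skip candidates that fall outside the frame.
                    if y0 < 0 or x0 < 0 or y0 + block > h or x0 + block > w:
                        continue
                    sad = sum(
                        abs(prev[by + j][bx + i] - curr[y0 + j][x0 + i])
                        for j in range(block)
                        for i in range(block)
                    )
                    # On equal SAD, prefer the smaller displacement.
                    key = (sad, abs(dy) + abs(dx))
                    if best is None or key < best[0]:
                        best = (key, (dy, dx))
            vectors[(by, bx)] = best[1]
    return vectors


def moving_blocks(vectors):
    """Blocks whose vector differs from the most common (dominant) one."""
    dominant, _ = Counter(vectors.values()).most_common(1)[0]
    return sorted(pos for pos, vec in vectors.items() if vec != dominant)


def make_frame(obj_x):
    """6x6 gradient background with a 2x2 object (value 200) at column obj_x."""
    frame = [[10 * y + x for x in range(6)] for y in range(6)]
    for y in (2, 3):
        for x in (obj_x, obj_x + 1):
            frame[y][x] = 200
    return frame


prev_frame, curr_frame = make_frame(2), make_frame(3)
vectors = block_motion(prev_frame, curr_frame)
flagged = moving_blocks(vectors)
```

    Here the block holding the object gets the vector (0, 1), one pixel to the right. The block it moves into, (2, 4), is flagged as well: its background is covered in the next frame, so it has no true match. That occluded area is the part an inpainting step would have to fill in.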

    One interesting application (seen, for instance, in the Will Smith film "I Am Legend") takes video footage of a real environment, and converts the footage into a virtual static 'texture' for the background elements. Artists can then repaint over this 'texture' to add damage to buildings etc. The new texture can now be reapplied to the original footage, so the moving shot appears to show the artistic changes in visual context. Clearly this method will not stand up to the same scrutiny as remodelling buildings in CGI, and inserting them into a virtual set, but it works well for backgrounds.

    Films today frequently use a so-called skybox- a 360 panorama stitched from multiple still photos shot on location. This skybox allows a virtual background to be 'projected' behind the actors (say when they are pretending to be on top of a tall building or mountain) that can track the rotational movement of the camera.

    The idea of element extraction forms the basis of various camera enhanced video games found on the current consoles. Usually, the technique is the reverse of the example in the article, where it is the background that is removed so that the player may be isolated and inserted into a virtual scene.

    Slashdot needs editors that know something about technology, but that isn't going to happen while the owners of Slashdot use the tech stories to draw readers to the constant anti-Iranian warmongering propaganda that appears here almost daily.

    • by geekmux ( 1040042 ) on Friday March 15, 2013 @05:28AM (#43180827)

      ...The idea of element extraction forms the basis of various camera enhanced video games found on the current consoles. Usually, the technique is the reverse of the example in the article, where it is the background that is removed so that the player may be isolated and inserted into a virtual scene.

      Uh, yeah, and now that we've revealed that removing someone is "ancient technology", it is exactly this reversed scenario that most should fear today and is ripe for abuse in a corrupt world.

      One day, you think it's cool that you've been "painted" into a video game...until you realize that same technology can "paint" you right into Exhibit A: The murder scene.

      How long before innocent people are framed? Judges can't even understand how the internet works. You think they're going to grasp this and give you a fair trial?

      • by Anonymous Coward

        easy, just paint the judge into the same scene as evidence it can be faked

      • by cffrost ( 885375 ) on Friday March 15, 2013 @08:23AM (#43181709) Homepage

        One day, you think it's cool that you've been "painted" into a video game...until you realize that same technology can "paint" you right into Exhibit A: The murder scene.

        How long before innocent people are framed? Judges can't even understand how the internet works. You think they're going to grasp this and give you a fair trial?

        Already, people are routinely convicted based on bullshit forensic pseudoscience: PBS Frontline: The Real CSI [pbs.org] [torrent] [kat.ph]

    • The camera games, so far as I know, still require a background plate to be taken as calibration before the game starts so it can generate a difference mask. Apple's Photo Booth software on the Mac also does this, though it only works if your clothing differs enough from the background (otherwise you become transparent). If your background or lighting doesn't change, it works great. This seems to be much more advanced than diff mask, shift map, or simple optical flow techniques in that they're regenerating occluded
      • The photobooth trick is called a 'difference matte' or 'difference key.' It's found in most high-end video editing software, though not always by the same name. I wrote one myself too - http://birds-are-nice.me/video/bluescreen.shtml [birds-are-nice.me]

        As you can see from my effort, it isn't as easy as you'd think to get good results.
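
        For reference, the core of a difference key fits in a few lines. A minimal sketch, assuming a clean background plate shot from the same camera position and frames stored as rows of (R, G, B) tuples; this is not the code behind the linked page, and `difference_matte`, the threshold of 30, and the sample pixels are hypothetical.

```python
def difference_matte(plate, frame, threshold=30):
    """Binary mask: 1 where `frame` differs from the background `plate`
    by more than `threshold` (absolute difference summed over R, G, B)."""
    return [
        [
            1 if sum(abs(a - b) for a, b in zip(p_px, f_px)) > threshold else 0
            for p_px, f_px in zip(p_row, f_row)
        ]
        for p_row, f_row in zip(plate, frame)
    ]


# Hypothetical 1x4 RGB images: a grey wall as the clean plate.
plate = [[(120, 120, 120)] * 4]
frame = [[
    (121, 119, 120),  # wall plus slight sensor noise
    (200, 40, 40),    # red shirt
    (125, 118, 122),  # grey shirt, close to the wall colour
    (20, 20, 20),     # dark trousers
]]

mask = difference_matte(plate, frame)
```

        The grey-shirt pixel falls under the threshold and is treated as background, which is the "you become transparent" failure described above. Lowering the threshold far enough to catch it risks letting sensor noise through as well.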

        • I think your filtering is a little too strong in your example. Noisy camera? You use a median filter?
          • Very noisy camera. I was trying to best replicate the conditions typical of a zero-budget production: No proper backdrop, a too-small studio (thus shadows cast behind the actor), no professional lighting, a cheap consumer camera. I ran the video through virtualdub's temporal smooth, that took the worst off. The only places the filter really struggles are the very dark regions - those near-black folds in the trousers, the shadows under the arms - where noise overwhelms the color component. There's a lot of f

            • Dark areas, and motion blur - you can see that clearly in the final seconds. It's a binary mask. The solution to this, I feel, is in physics rather than any change to the algorithm: Just set up a couple of those cheap 100W DIY worklights. No more near-0,0,0 blacks, and the motion blur is reduced as the camera automatically reduces 'shutter'* speed.

              * That term is going to be with us long after the physical shutter is gone. It's easier than saying 'CCD sampling interval duration.'

    • I love how you turned your whole personal vibe around there at the last second.

  • by thoughtlover ( 83833 ) on Friday March 15, 2013 @04:14AM (#43180585)
    Software of this type has existed for a long time. It's commonly used for rig removal, but can be used to remove any object that is 'outlined' for removal. Next-and-last frame comparison is what 'batch clones-out' the outlined object. It's the same tech that Boujou used (vector analysis, per-pixel tracking via next-current-last frame comparison), but that app is/was used more for creating a virtual camera path for a 3D environment... Mokey was pretty good at this type of object removal, too (it's called Mocha, now - www.imagineersystems.com ). This type of software is pretty common and many companies make their own in-house if they have the need, I'd think. Remember how they removed Denzel Washington's character from the remake of The Manchurian Candidate? The real-time aspect is where we're going... like The Running Man with Schwarzenegger. They used this type of tech in it, but in near real-time. That's scary.
    • by dFaust ( 546790 )

      It's interesting you mention Boujou - the same company (2d3) announced basically this software, with some mighty impressive demo videos, at SIGGRAPH about 11 years ago.

      As far as I know, though, it never saw the light of day.

    • I'm just interested if there are any open-source versions of this available. Mainly, I want to remove TV logos and ads from shows I want to save on my PC.

  • by Gordonjcp ( 186804 ) on Friday March 15, 2013 @04:26AM (#43180611) Homepage

    ... I'd pretend I was one of those deaf-mutes.

  • by Lumpy ( 12016 )

    I have seen this done in After Effects 5 years ago. In fact I remember a script/template that was floating around that tried to automate it quite well.

  • The sponsor that pays for the broadcast paints over the stadium ads for the sponsor's competition. For example: in an auto race, the TV sponsor, Budweiser, paints over the Miller car.
  • The Laughing Man (Score:5, Interesting)

    by SpectreBlofeld ( 886224 ) on Friday March 15, 2013 @08:49AM (#43181947)

    Anyone reminded of the Laughing Man from Ghost in the Shell: Stand Alone Complex?

    A hacker who was able to hack the cybernetic vision of others in real-time to make himself invisible...

  • ** We're sorry, the author of this thoughtcrime has been vaporized. **
  • I believe this is just an extension of the research done by this [garfieldmi...rfield.net] guy.

  • Now I have installed an Active People Filter to remove people from all those Internet videos and movies and DVDs I watch. Along with speech and vocal removal filters for music, speech removal.

    The filters have learned what people look like and do a fair job at stamping them out of still images too.

    Now my life is like a series of paintings in still-life. Sitcoms are rooms of silent furniture and no stupid laughter, the Olympics and football a breathtaking vista of grand spaces and odd sporting equipment lyin

  • What does this have to do with the number of pixels of the video format? Why even mention the words "HD"? Is this software unable to operate at higher compression formats or smaller aspect ratios?
