Predator Outdoes Kinect At Object Recognition 205
mikejuk writes "A real breakthrough in AI allows a simple video camera and almost any machine to track objects in its view. All you have to do is draw a box around the object you want to track, and the software learns what it looks like at different angles and under different lighting conditions as it tracks it. This means no training phase — you show it the object and it tracks it. And it seems to work really well! The really good news is that the software has been released as open source, so we can all try it out! This is how AI should work."
Wow, what a great idea. (Score:3)
Re: (Score:2)
Re: (Score:2)
While possible, it would be more complex than that. It would also have to account for wind, distance, speed, windows, etc.
That depends on the size of the gun you use!
Re: (Score:2)
It would also have to account for wind, distance, speed, windows etc.
Well, it is also available for Linux and OS X, so Windows shouldn't be a problem
Re: (Score:2)
We all know your automatic evil guns of death should be written on QNX. But after the corporate committee is done with it, it will probably just be a Silverlight plug-in.
Re: (Score:2)
Easy.
Re: (Score:2)
Re: (Score:2)
People have trained email spam filters to play chess [sourceforge.net]. This is well within the realm of possibility.
Re: (Score:2)
... which is all readily available in one fashion or another. The hard work has been done; someone would just need to piece it together with this new software.
* Temperature and humidity sensors: http://www.digitemp.com/
* Windage: http://nslu2windsensor.sfe.se/
* Roll it all together with ballistics calculation: http://sourceforge.net/projects/balcomp/
That said, it's not like windage, temperature, and humidity really come into play all that much at under 200 yards. I suspect the software and servo precision a
Re: (Score:2)
But do they have a sultry voice like the Aperture Science Turret [youtube.com]?
Re: (Score:2)
Oh, I'm sure someone is profiting...
http://www.youtube.com/watch?v=xZ7FKuYPsQ0 [youtube.com]
Re: (Score:3)
Why bother with all this when a Bluetooth (cell phone) listener with a ranged weapon is so much less complex?
Yo Dawg (Score:2)
Watched TFV on TFA, very interesting. Something to play with soon I think.
yeah (Score:2)
Re: (Score:2)
No kidding, and then the debate really heats up:
1. Robots want to be able to marry > Marriage is between a fleshling and a fleshling (cyborgs or flesh-covered robots allowed too in Massachusetts)
2. FemBots want to be able to choose to have an EMP burst > EMPs are nuclear-based malicious malfunctions!
3. Robots want to "open-source" themselves; no debate ensues, but it's only legal in the outskirts around Las Vegas.
Won't someone think of the child-bots?
Very nice. (Score:3)
Very nice.
There are other systems which do this, though. This looks like an improvement on the LK tracker in OpenCV.
This could be used to handle focus follow in video cameras. Many newer video cameras recognize faces as focus targets, but don't stay locked onto the same face. A better lock-on mechanism would help.
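In lock-on terms, the core operation is just "re-find this patch in the next frame." Here's a toy sketch in plain Python using a brute-force sum-of-squared-differences search — to be clear, this is nothing like Kalal's actual algorithm (which layers learning on top), just the simplest possible matcher to show the idea:

```python
# Toy "lock-on" step: re-find a small template patch in the next frame by
# minimising sum-of-squared-differences (SSD). This is NOT Kalal's TLD
# algorithm, just the most naive matcher possible, for illustration.

def ssd(patch_a, patch_b):
    """Sum of squared differences between two equally sized patches."""
    return sum((a - b) ** 2
               for row_a, row_b in zip(patch_a, patch_b)
               for a, b in zip(row_a, row_b))

def extract(frame, top, left, h, w):
    """Cut an h-by-w patch out of a frame (list of rows)."""
    return [row[left:left + w] for row in frame[top:top + h]]

def lock_on(frame, template):
    """Exhaustively search the frame for the best-matching position."""
    th, tw = len(template), len(template[0])
    best = None
    for top in range(len(frame) - th + 1):
        for left in range(len(frame[0]) - tw + 1):
            score = ssd(extract(frame, top, left, th, tw), template)
            if best is None or score < best[0]:
                best = (score, top, left)
    return best[1], best[2]

# A bright 2x2 blob at row 3, column 2 of an otherwise dark frame.
frame = [[0] * 6 for _ in range(6)]
frame[3][2] = frame[3][3] = 9
frame[4][2] = frame[4][3] = 9
template = [[9, 9], [9, 9]]
```

A real tracker would of course search only near the last known position and update the template over time; that updating is where the interesting part of TLD lives.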
Neat (Score:2)
That was a very nice demonstration, and well done to Zdenek Kalal. That said, there are a bunch of trackers out there, and what I find is that none of them do well in a noisy environment where there are lots of similar items. Security cameras have to work in rain, snow, fog, and low-light conditions. So Zdenek, if you are listening, how real-world can you go with this?
Re: (Score:2)
towards the end there is an example of it tracking a car on the freeway - i think that might fit the bill
Re: (Score:2)
Why still fooling with ONE camera? (Score:3)
Shouldn't we be developing AI to use two? I mean, we have two eyes (most of us, condolences to those who do not, no disrespect intended) and we recognize objects, depth of field, and rates of change within three dimensions using them.
Re: (Score:3)
Re: (Score:2)
Re: (Score:2)
Shouldn't we be developing AI to use two?
Why? One camera is cheaper to purchase and maintain than two and this software seems to do just fine with one.
Re: (Score:2)
4 billion years of evolution, and 99% of living creatures have a pair of eyes. Even flies, with compound eyes, have a pair of them. There seems to be something useful - such as a wider field of view - to having two, rather than one. Humans and most primates have stereoscopic vision, but that's a relatively rare event in nature.
Re: (Score:3)
"99% of living creatures have a pair of eyes."
Most of those eyes - flies included - are not used for stereoscopic perception. They have two eyes because one eye typically covers less than half the visual field. Most animals' eyes are pointed away from each other, with very little or no visual overlap anywhere.
Depth perception mostly does not need stereoscopy. If it did, one-eyed people would hardly be able to walk or feed themselves, never mind drive a car or other things.
Stereovision is good mainly for pre
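For the curious, here's roughly why one eye suffices: motion parallax. If you translate sideways at speed v, a static point at depth Z sweeps across your view at angular velocity w = v / Z, so depth can be recovered as Z = v / w. A quick numeric sanity check (made-up numbers, nothing to do with the article's algorithm):

```python
import math

# Motion parallax: a single moving camera can recover depth without stereo.
# For a camera translating sideways at speed v, a static point straight
# ahead at depth Z sweeps across the image at angular velocity w = v / Z,
# so Z can be estimated as v / w. Numbers below are arbitrary.

def angular_velocity(v, depth, dt=1e-4):
    """Numerically differentiate the bearing to a point straight ahead."""
    a0 = math.atan2(0.0, depth)      # bearing at t = 0
    a1 = math.atan2(v * dt, depth)   # bearing after moving v*dt sideways
    return (a1 - a0) / dt

v, true_depth = 1.5, 10.0
w = angular_velocity(v, true_depth)
estimated_depth = v / w
```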
Re: (Score:2)
Did you want to Drop Some Knowledge on slashdot, or did you not read the last sentence in full? What part of "relatively rare" was unclear?
Re: (Score:2)
Re: (Score:2)
Re: (Score:2)
Okay, let's go with fish... two eyes. Birds... still two. Insects? Reptiles? Spiders? What only has one eye?
True, some of them don't use them for stereo vision, but so far as natural selection is concerned, two or more eyes seems to be the winner if you're going to have eyes at all.
Re: (Score:2)
My point is that emulating lifeforms is a naive way of making something. I believe most lifeforms have multiple eyes for redundancy. You know, the whole thing with two lungs, two kidneys, two balls. There are a lot of lessons to learn from studying biology, but until our drones need to worry about bears trying to eat them, one camera will probably do. Especially when it's just looking at an office hallway.
Re: (Score:2)
Re: (Score:2)
Re: (Score:2)
Re: (Score:2)
Not a breakthrough (Score:5, Informative)
This isn't a breakthrough. Much of the technology for tracking objects in this way has been out for about a decade. See this Wikipedia article for one technique for doing this:
http://en.wikipedia.org/wiki/Scale-invariant_feature_transform [wikipedia.org]
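For anyone who didn't click through: the scale-invariance trick in SIFT comes from finding extrema of a difference-of-Gaussians (DoG) in scale space. A 1-D toy sketch of just that idea — real SIFT works in 2-D over a whole pyramid of scales, so treat this as the principle, not the algorithm:

```python
import math

# 1-D sketch of SIFT's difference-of-Gaussians (DoG) keypoint idea: blur a
# signal at two nearby scales and look for local extrema of the difference.
# Real SIFT does this in 2-D across a full scale pyramid; this only
# illustrates the scale-space principle.

def gaussian_blur(signal, sigma):
    radius = int(3 * sigma) + 1
    kernel = [math.exp(-0.5 * (i / sigma) ** 2)
              for i in range(-radius, radius + 1)]
    norm = sum(kernel)
    out = []
    for i in range(len(signal)):
        acc = 0.0
        for k, w in enumerate(kernel):
            j = min(max(i + k - radius, 0), len(signal) - 1)  # clamp edges
            acc += w * signal[j]
        out.append(acc / norm)
    return out

def dog_keypoints(signal, sigma1=1.0, sigma2=1.6):
    dog = [a - b for a, b in zip(gaussian_blur(signal, sigma1),
                                 gaussian_blur(signal, sigma2))]
    # Indices where |DoG| is a strict local maximum.
    return [i for i in range(1, len(dog) - 1)
            if abs(dog[i]) > abs(dog[i - 1]) and abs(dog[i]) > abs(dog[i + 1])]

# A single bump at index 10 should produce a DoG extremum at index 10.
signal = [0.0] * 21
signal[10] = 1.0
```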
Re: (Score:3)
Re: (Score:2)
So the breakthrough is... cheap, fast PCs? That's not a breakthrough. Seems like nice code, though, looking for an application (for example, the problem with a Minority Report UI isn't actually that it's hard to do, it's that it's hard to imagine any PC work where it would be the way to go).
Re: (Score:2)
He's doing object tracking in realtime without using a GPU. And it's learning and improving its performance while doing that tracking, not being trained ahead of time like most SURF/SIFT implementations.
Re:Not a breakthrough (Score:4, Funny)
Indeed. I've worked on some military programs that track and intercept, umm... things... for various purposes... that use this very same image-based tracking algorithm. But instead of painting a red dot or drawing trails, it steers a, umm... vehicle... that... uh... delivers candy.
Yea. Candy.
Euphemism aside, he's done a very nice job of integrating it with commercial hardware and software. It's still impressive.
Re:Not a breakthrough (Score:4, Insightful)
Re: (Score:2)
If we invent a matter replicator and only use it for creating delicious topping for ice cream
I see you've been visiting Heston Blumenthal [wikipedia.org].
Matlab required? (Score:2)
How usefully open-source can it be with a commercial library requirement?
Re: (Score:2)
It's not that bad; the code can be ported to a useful language and distributed. It's an extra step but it's far from worthless (as far as software goes).
Re: (Score:2)
I've found it's considerably easier to implement something described in a paper (most people's algorithm papers are TERRIBLE) when you've got a reference implementation, even if it is ugly MatLab code, to work from.
A nice C implementation would be fantastic though. Maybe someone will merge it back into OpenCV.
Totally different things. (Score:2)
Kinect is how you feed data to an image recognition/tracking algorithm; Predator is that algorithm. The software side of Kinect has support for efficiently tracking items, but that is so you have the most CPU left for a game. That was the trade-off.
Kinect hardware can do something very useful that Predator can't -- it can tell how far away something is (and thus judge position or size more accurately).
The predator algorithim (and other ones no doubt under development) using the two sets of data from a Ki
Re: (Score:2)
The predator algorithim (and other ones no doubt under development) using the two sets of data from a Kinect camera will still be superior to an algorithm using just one set of data.
This is what I'm thinking as well. I've done a bit of Kinect data stream/parsing experiments with other input types (like adding a touch screen to record "impact" data while the kinect detects telemetry) and I think adding predator will be pretty damn useful.
I can't really go into the really killer kinect tracking shit I've been working on (NDA) but predator might solve a few issues I've been having.
Exciting!
Nothing new or great (Score:2, Interesting)
As a person who does research on object tracking on a day-to-day basis, and who has seen implementations and performance of many trackers (including this one) on real-world problems (including gaming), I can say this is in no way a new approach, nor one that outperforms the many others published in recent computer vision conferences.
From TFA:
"It is true that it isn't a complete body tracker, but extending it do this shouldn't be difficult."
Going from this to body tracking is a HUGE step, it's not a really ea
Not open source (Score:2)
Re: (Score:2)
Re: (Score:2)
It's not contradictory, it's just incorrect terminology.
By "commercial" they mean "non-GPLv3", which is wrong, since GPLv3-licensed code can be used in commercial products just fine. Conversely, you might choose not to use the GPLv3 even for a non-commercial project...
But it's a common error, since the overlap is rather large.
Re: (Score:2)
The author originally released it as open source under the GPL but then withdrew the link from his site when he realised how much attention it was getting. Some people have released the original GPL copy on github for others to use and distribute.
"This is how AI should work"? (Score:2)
Now why did you have to go and say that? Don't you know they hate it when you tell them what they're supposed to do?
Wouldn't be surprised if the robot uprising took place tonight. At least, I know who pushed them over the edge.
I had a Terminator joke... (Score:2)
but I'm just going to take it to Fark.
Re: (Score:2)
but I'm just going to take it to Fark.
Take it to this guy [kickstarter.com].
To be or not to be? (gunshots) Not to be.
Predator to DMX? Any Ideas? (Score:2)
Re:Um (Score:4, Informative)
Re: (Score:3)
Not the distance between dots. The camera sees exactly the same dot density regardless of depth because the projector and the camera are on the same plane (it doesn't matter if the surface is near or far, since dots will have the same angular distance when viewed from the camera). What it does measure is horizontal displacement vs. a reference image. This works because the camera and the projector are horizontally offset.
Re: (Score:2)
What it does measure is horizontal displacement vs. a reference image.
That's just another way of saying that some of the dots are displaced horizontally more than others, i.e. the horizontal distance between them is different. If the camera is positioned just to the right of the IR emitter, then two dots closer together = the rightmost dot is closer to the camera, whereas two dots farther apart = the rightmost dot is farther away.
Re: (Score:2)
Yes, dot distance does correspond to the slope of the depth, but then you'd have to integrate the resulting measurements to recover depth (that, and the dot field isn't a regular pattern, it's pseudorandom, so there's no trivial way to know how far apart dots are supposed to be in the first place). This isn't what the Kinect does; it directly correlates dot clusters to a reference image and measures absolute displacement against it.
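For concreteness: once you have the absolute displacement of a dot against the reference image, depth falls out of standard structured-light triangulation, Z = f * b / d. Sketch below — the focal length and baseline are illustrative numbers, not Kinect's actual calibration:

```python
# Structured-light depth from absolute dot displacement. A dot calibrated
# against a reference plane shifts horizontally in the camera image by a
# disparity d; standard triangulation then gives the depth as
#   Z = f * b / d
# where f is the focal length in pixels and b the projector-camera
# baseline in metres. The numbers below are illustrative, not Kinect's
# real calibration constants.

def depth_from_disparity(disparity_px, focal_px, baseline_m):
    if disparity_px <= 0:
        raise ValueError("dot must shift toward the projector side")
    return focal_px * baseline_m / disparity_px

# A dot shifted 20 px, with f = 580 px and b = 0.075 m.
z = depth_from_disparity(20.0, 580.0, 0.075)
```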
Re: (Score:2)
And displacement is not distance in your world, because ..?
Re: (Score:2)
Displacement is distance, but (distance|displacement) between different dots in the captured image isn't the same as (distance|displacement) between a dot and its counterpart in a reference image.
Put another way, the Kinect sees almost the same image when it looks at a wall 1m from it and a wall 5m from it (other than brightness, but that also isn't part of the algorithm because it varies depending on the material). The only difference is that one is horizontally shifted from the other. It is this absolute
Re: (Score:2)
It sees horizontal dot displacement... as compared to an internal calibration reference image. There's an internally stored reference image (either as an actual image, a set of dot coordinates, or perhaps just a calibration matrix if the dot pattern is fully predictable - we don't know specifically how it's stored). If you watch a video of the IR dot image it's very easy to see how the dots move left and right as objects move near and far, but there's no way to tell the precise depth of a point in the image
Re: (Score:2)
The simple trig will give you the nominal separation at the surface. Then, when you run the trig in reverse to calculate the pixel separation at the camera... you end up with exactly the same separation regardless of depth. The camera has a radial field of view just like the projector, and pixels correspond to a fixed angle. Since the camera and the projector are on the same plane, it cancels out. Ignoring brightness (which is unreliable) and dot size (which isn't accurate enough), a flat far surface looks
Re: (Score:2)
You sound like you a parroting something you heard but don't really understand it.
Stop it.
Yes, I'm sure the developer of libfreenect doesn't know what he's talking about. You, on the other hand, post an inflammatory comment with no basis.
Re: (Score:2)
If you take a projector, be it dots or multimedia, and a camera right on top of it, the projection as seen by the camera will be exactly the same size within the camera's field of view regardless of how far away the wall is. If you move the wall back, the projection increases in size at the wall, but the wall (with the projection) decreases in size as seen from the fixed camera. The two effects cancel out.
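You can check that cancellation numerically: with the camera at the projector's position, the angle the camera sees between two projected dots equals the angle between the projector's own rays, no matter where the wall is. A quick sketch:

```python
import math

# Numeric check of the cancellation argument: when the camera sits at the
# same point as the projector, the angle it sees between two projected dots
# equals the angle between the projector's rays, regardless of wall depth.

def seen_angle(ray_angle_1, ray_angle_2, wall_depth):
    # Where the two rays hit a flat wall at the given depth...
    x1 = wall_depth * math.tan(ray_angle_1)
    x2 = wall_depth * math.tan(ray_angle_2)
    # ...and the angle between those hit points as seen from the origin.
    return math.atan2(x2, wall_depth) - math.atan2(x1, wall_depth)

a, b = math.radians(5), math.radians(7)
near = seen_angle(a, b, 1.0)   # wall at 1 m
far = seen_angle(a, b, 5.0)    # wall at 5 m
```

Move the camera off the projector's axis (as the Kinect does) and the two no longer cancel — which is exactly why the offset is what makes depth measurable.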
Re: (Score:2)
Judging from the video, it should be no problem to track individual limbs to generate a skeleton of the user. The big plus of this thing is that you don't need any special hardware at all, only a webcam is needed. Moving complexity from the hardware to the software is a big plus in the industry, because it makes the whole system much cheaper.
Re:Um (Score:5, Interesting)
it should be no problem to track individual limbs to generate a skeleton of the user
I'm not so sure about that. He is using a tracking algorithm paired to a template matching algorithm. His claim is that, although both methods have high error rates, their errors are mostly orthogonal to each other. In other words, one method works better sometimes, the other method works better sometimes, and combined, they do a pretty good job. In his videos he's left out scenes where there is a large area of near constant intensity. I'm curious how his method deals with this as there aren't enough details to track, nor are there enough features to template match. Also, with arms and legs, if the texture is generally the same between the two (say you are wearing sweatpants and a sweatshirt of the same color), then there really isn't enough information for the tracker to work with in order to distinguish a leg from an arm. Straight arms and straight legs will both match the template, the tracker will likely struggle with the relatively large area of constant intensity.
That's not to detract from Kalal's research - this is really good work - I just want to point out that it very likely suffers from a few Achilles heels not mentioned in his video.
Now pair this method with the kinect, and you might see a real improvement.
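The "mostly orthogonal errors" claim is easy to illustrate: if two components fail independently, the combined failure rate is roughly the product of the individual rates. A toy simulation — the failure probabilities here are made up, not measurements of Kalal's system:

```python
import random

# Toy illustration of combining two trackers whose errors are (assumed)
# independent: if each fails on 20% of frames, falling back to whichever
# one succeeds drops the failure rate to roughly 0.2 * 0.2 = 4%. The
# probabilities are invented, not measured from any real tracker.

random.seed(0)
P_FAIL = 0.2
FRAMES = 100_000

both_fail = 0
for _ in range(FRAMES):
    tracker_ok = random.random() >= P_FAIL    # short-term tracker
    detector_ok = random.random() >= P_FAIL   # template detector
    if not tracker_ok and not detector_ok:
        both_fail += 1

combined_failure_rate = both_fail / FRAMES
```

Of course the catch, as noted above, is that on uniform-texture regions the errors stop being independent: both components starve for features at the same time, and the product argument breaks down.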
Re: (Score:2)
Arms have these things we like to call "hands" dangling from them, which would probably make identification of arms vs legs easier - unless you're going to go with the idea that people will use this while wearing not only clothing of a uniform color,
Re: (Score:2)
Re: (Score:2)
Um...err... NO!!!! (Score:5, Informative)
No - wrong on all counts.
- Kinect doesn't have stereo cameras (it has an IR camera for depth perception and a visible light camera for other usage)
- Kinect doesn't use the visible light camera for body recognition. Recognition is based on the depth map provided by the IR grid projector and IR camera.
- Kinect doesn't operate like a laser rangefinder (it operates via structured light displacements, not via light pulse reflection times)
- Kinect doesn't track a wireframe (it tracks independent body parts)
How you got modded as "4 - informative" is beyond me. The blind leading the blind.
The way Kinect works is by projecting a dense grid of IR dots (i.e. structured light) on the scene, then using its IR camera (horizontally offset from the grid projector) to pick up the reflected dot pattern.
Due to depth differences in the scene, and the offset of the IR camera from the IR projector, the reflected dot pattern is not evenly spaced - the dots are horizontally displaced based on depth. To understand this, consider shining two parallel beams of light at a) a flat surface, and b) a surface angled at 45 degrees away from the light source. If you took a step sideways away from the light beams and looked at their reflections off the two surfaces, the dot (beam) separation on the flat surface would be the same as the true beam separation, but the dot separation on the angled surface would be increased by an amount you could calculate using simple trig.
In order to operate in real-time with low cost, a dedicated chip processes the IR camera image and converts the dot displacements into the corresponding depth map.
The clever, and somewhat counter-intuitive, part is how Kinect then turns this depth map into a body part map. The basic idea is that it probabilistically maps local clusters of depths to body parts (via having been trained on a huge manually body-part-labelled image set), then converts these local probabilities into larger-scale body part labels (i.e. if 60% of the local clusters in a region say "hand", then the region is labelled as a hand). This way it doesn't track overall body position or a wireframe, but rather independently tracks body parts (which is why it has no trouble correctly tracking multiple partially occluded people in frame).
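That local-votes-to-region-label step can be sketched in a few lines. The labels and the 60% threshold are illustrative, matching the description above rather than Microsoft's actual classifier:

```python
from collections import Counter

# Sketch of the per-region voting step described above: each local depth
# cluster carries a body-part guess, and the region takes whichever part
# wins at least 60% of the votes, or "unknown" otherwise. Labels and the
# threshold are illustrative only.

def label_region(cluster_votes, threshold=0.6):
    counts = Counter(cluster_votes)
    part, n = counts.most_common(1)[0]
    return part if n / len(cluster_votes) >= threshold else "unknown"

votes = ["hand", "hand", "hand", "forearm", "hand"]  # 80% say "hand"
```

The real system's per-cluster guesses come from a trained classifier over depth features, but once you have the votes, aggregation really is about this simple.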
Re: (Score:3)
Re: (Score:2)
What bugs me is that with the size and shape of the Kinect, why doesn't it use stereo cameras?
Because stereo cameras would not help with object detection as currently implemented.
Re: (Score:2)
They both do things the other can't--I doubt this system does nearly as well for actual depth tracking, for instance. But, put the two together, and you would end up with something even more amazing.
Re: (Score:2)
Yes, BOTH Slashdot and the linked article are making the completely unsubstantiated claim.
Me saying that Slashdot is doing it doesn't imply that the linked article isn't also. I thought Slashdot was supposed to be full of smart people, why is this basic logic so hard to figure out?
Re: (Score:2)
No, the linked article made that claim.
You really fail that hard at reading comprehension? Nowhere in TFA does it claim Predator outdoes Kinect at object tracking. It doesn't even mention 'object tracking' much less say it does it better than kinect.
Re:I for one.. (Score:5, Interesting)
Re:I for one.. (Score:5, Informative)
The source code was already released. https://github.com/zk00006/OpenTLD [github.com]
There are a few more repos here.. http://www.google.co.th/#q=site:github.com+%22TLD+is+an+algorithm+for+tracking+of+unknown+objects%22&hl=en&filter=0 [google.co.th]
Re: (Score:2)
GPL, but it depends on Matlab. I wonder if anyone has got it working under Octave.
In Soviet Russia... (Score:3)
Re: (Score:2)
Could this be The Year of the Open Source UAV Attack Drone?
Re: (Score:2, Funny)
Re: (Score:2)
Warning: Goatse ahead.
And yes, it could probably be adapted to scan links on pages being viewed in a browser for images similar to goatse, and color all goatse trolls red, eliminating the need for posts like this one...
Re: (Score:2)
Amusingly enough my #a(href*=goatse.) ABP filter caught that one.
Re: (Score:2)
Warning: webmistressrachel likes to jump to conclusions without following links.
I saw "audigoatse" and thought .... ok, this I gotta see. Turns out it's perfectly SFW.
Re: (Score:2)
No worries. I did the same thing, except I was morbidly curious about how exactly an Audi was involved with goatse .... my imagination is doubtless NSFW.
Re: (Score:2)
I see you've never looked at the sticker price on an Audi if you can still muster such morbid curiosity.
Re: (Score:2)
/zen
Re: (Score:2)
Re: (Score:2)
It's a breakthrough in the price of AI..
Re: (Score:3)
I mean they even spelled it properly and everything.
Re:Great video, but (Score:5, Informative)
who can't spell
I guess you're fluent in Czech?
Re: (Score:3)
Re: (Score:2)
tons and tons of exotic, FORTRAN-like code that is shared from grad student to grad student
Re: (Score:2)
The "magic" is that Matlab has a lot of very fast and powerful built-in matrix operations. Can be reproduced, yes. Easily, no.
However, Matlab also has a compiler package that creates stand-alone executable files...
Re: (Score:2)
Re: (Score:2)
My recollection was that you could, but maybe it was a different compiler package than the one you're familiar with. Or maybe I'm remembering wrong.
Re: (Score:2)
Sure. All you'd have to do is use the compiler package to produce an executable and then decompile it into whatever language you prefer. Pretty it up some and call it good. Like I said, nothing "magic" about Matlab, just a bunch of useful built-in functions that he chose not to re-invent.
Re: (Score:2)
This video was uploaded in January, and it's on slashdot NOW?
You mean people didn't IMMEDIATELY take notice of a video posted by a Czech student for his phd thesis? Good God, who's manning the internet?!?!
Re: (Score:2)
Perception of humanity as threat - STILL WAITING ON AI (but won't be long after)
Nice timer here :) http://www.aperturescience.com/a/b/c/d/g/h/abcdgh/ [aperturescience.com]