
Predator Outdoes Kinect At Object Recognition

mikejuk writes "A real breakthrough in AI allows a simple video camera and almost any machine to track objects in its view. All you have to do is draw a box around the object you want to track and the software learns what it looks like at different angles and under different lighting conditions as it tracks it. This means no training phase — you show it the object and it tracks it. And it seems to work really well! The really good news is that the software has been released as open source so we can all try it out! This is how AI should work."
Comments Filter:
  • Re:Um (Score:4, Informative)

    by SorcererX ( 818515 ) on Thursday April 14, 2011 @02:39PM (#35820526) Homepage
    The Kinect doesn't have stereo cameras. It has one color camera (which isn't really used for much), an IR projector that projects IR dots all over the scene, and an IR camera. The IR camera uses the pixel distance between the dots to find the depth. The depth image you then get is used as input to the algorithm that detects the body parts, their orientation, etc.
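
    As a very rough illustration of that dot-to-depth step, here is a minimal Python sketch of the triangulation involved. The focal length and baseline below are made-up numbers for illustration, not Kinect's actual calibration:

        import numpy as np

        # Depth from structured-light disparity via similar triangles:
        # z = f * b / d, where d is how far a dot shifted (in pixels)
        # relative to a known reference pattern. Values are hypothetical.
        FOCAL_PX = 580.0    # IR camera focal length in pixels (assumed)
        BASELINE_M = 0.075  # projector-to-camera offset in metres (assumed)

        def disparity_to_depth(disparity_px):
            d = np.asarray(disparity_px, dtype=float)
            return np.where(d > 0, FOCAL_PX * BASELINE_M / d, np.inf)

        print(disparity_to_depth([20.0, 40.0]))  # larger shift = nearer dot
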
  • Not a breakthrough (Score:5, Informative)

    by Dachannien ( 617929 ) on Thursday April 14, 2011 @02:51PM (#35820644)

    This isn't a breakthrough. Much of the technology for tracking objects in this way has been out for about a decade. See this Wikipedia article for one technique for doing this:

    http://en.wikipedia.org/wiki/Scale-invariant_feature_transform [wikipedia.org]
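
    For a feel of what SIFT-style matching looks like in practice, a minimal sketch using OpenCV's implementation (needs opencv-python 4.4 or later, where SIFT is patent-free; the frame filenames are placeholders):

        import cv2

        # Detect scale-invariant keypoints in two frames and match their
        # descriptors - the building block behind this kind of tracking.
        img1 = cv2.imread("frame1.png", cv2.IMREAD_GRAYSCALE)
        img2 = cv2.imread("frame2.png", cv2.IMREAD_GRAYSCALE)

        sift = cv2.SIFT_create()
        kp1, des1 = sift.detectAndCompute(img1, None)
        kp2, des2 = sift.detectAndCompute(img2, None)

        # Brute-force matching with Lowe's ratio test to drop ambiguous matches.
        matcher = cv2.BFMatcher(cv2.NORM_L2)
        good = [m for m, n in matcher.knnMatch(des1, des2, k=2)
                if m.distance < 0.75 * n.distance]
        print(len(good), "confident matches")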

  • Re:Great video, but (Score:5, Informative)

    by hotkey ( 969493 ) on Thursday April 14, 2011 @03:14PM (#35820880)

    who can't spell

    I guess you're fluent in Czech?

  • Re:I for one.. (Score:5, Informative)

    by LingNoi ( 1066278 ) on Thursday April 14, 2011 @04:06PM (#35821496)

    The source code was already released. https://github.com/zk00006/OpenTLD [github.com]

    There are a few more repos here: http://www.google.co.th/#q=site:github.com+%22TLD+is+an+algorithm+for+tracking+of+unknown+objects%22&hl=en&filter=0 [google.co.th]
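
    OpenCV's contrib package also includes a TLD implementation, so the summary's draw-a-box-and-track workflow can be sketched in a few lines. Assumptions: in OpenCV 4.5+ the tracker lives under cv2.legacy (older 3.x builds exposed cv2.TrackerTLD_create directly), and the video path is a placeholder:

        import cv2

        # "Draw a box around the object you want to track": select an ROI on
        # the first frame, then let the TLD (Predator) tracker follow it.
        cap = cv2.VideoCapture("video.mp4")  # placeholder path
        ok, frame = cap.read()

        box = cv2.selectROI("select object", frame)  # user draws the box
        tracker = cv2.legacy.TrackerTLD_create()
        tracker.init(frame, box)

        while True:
            ok, frame = cap.read()
            if not ok:
                break
            found, (x, y, w, h) = tracker.update(frame)
            if found:
                cv2.rectangle(frame, (int(x), int(y)),
                              (int(x + w), int(y + h)), (0, 255, 0), 2)
            cv2.imshow("TLD", frame)
            if cv2.waitKey(1) == 27:  # Esc quits
                break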

  • Um...err... NO!!!! (Score:5, Informative)

    by SpinyNorman ( 33776 ) on Thursday April 14, 2011 @04:48PM (#35822064)

    No - wrong on all counts.

    - Kinect doesn't have stereo cameras (it has an IR camera for depth perception and a visible light camera for other uses)

    - Kinect doesn't use the visible light camera for body recognition. Recognition is based on the depth map provided by the IR grid projector and IR camera.

    - Kinect doesn't operate like a laser rangefinder (it operates via structured light displacements, not via light pulse reflection times)

    - Kinect doesn't track a wireframe (it tracks independent body parts)

    How you got modded as "4 - informative" is beyond me. The blind leading the blind.

    The way Kinect works is by projecting a dense, evenly-spaced grid of IR dots (i.e. structured light) onto the scene, then using its IR camera (horizontally offset from the grid projector) to pick up the reflected dot pattern.

    Due to depth differences in the scene, and the offset of the IR camera from the IR projector, the reflected dot pattern is not evenly spaced - the dots are horizontally displaced based on depth. To understand this, consider shining two parallel beams of light at a) a flat surface, and b) a surface angled at 45 degrees away from the light source. If you took a step sideways away from the light beams and looked at their reflections off the two surfaces, the dot (beam) separation on the flat surface would be the same as the true beam separation, but the dot separation on the angled surface would be increased, by an amount you could calculate using simple trig.
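
    A quick numeric check of that thought experiment, under a simple pinhole-camera model; all the numbers are arbitrary, the point is just that the angled surface widens the apparent dot separation:

        # Pinhole model: x_image = f * (x_world - camera_offset) / depth.
        f = 1.0       # focal length (arbitrary units)
        offset = 0.2  # the sideways step away from the beams
        s = 0.1       # true separation of the two parallel beams
        z = 1.0       # distance to the flat surface

        def image_x(x_world, depth):
            return f * (x_world - offset) / depth

        flat_sep = image_x(s, z) - image_x(0.0, z)        # both dots at depth z
        angled_sep = image_x(s, z + s) - image_x(0.0, z)  # 2nd dot s deeper (45 deg)

        print(round(flat_sep, 4), round(angled_sep, 4))   # 0.1 vs ~0.1091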

    In order to operate in real-time with low cost, a dedicated chip processes the IR camera image and converts the dot displacements into the corresponding depth map.

    The clever, and somewhat counter-intuitive, part is how Kinect then turns this depth map into a body part map. The basic idea is that it probabilistically maps local clusters of depths to body parts (having been trained on a huge, manually body-part-labelled image set), then converts these local probabilities into larger-scale body part labels (i.e. if 60% of the local clusters in a region say "hand", then the region is labelled as a hand). This way it doesn't track overall body position or a wireframe, but rather independently tracks body parts (which is why it has no trouble correctly tracking multiple partially occluded people in frame).
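
    A minimal sketch of that local-vote idea; the probability vectors here are random stand-ins for what the trained classifier would actually emit:

        import numpy as np

        PARTS = ["hand", "forearm", "head"]
        rng = np.random.default_rng(0)

        # One probability vector per local depth cluster in a candidate region
        # (rows sum to 1); in the real system these come from the classifier.
        cluster_probs = rng.dirichlet(np.ones(len(PARTS)), size=10)

        votes = cluster_probs.argmax(axis=1)  # each cluster's best guess
        counts = np.bincount(votes, minlength=len(PARTS))
        share = counts.max() / len(votes)

        # The "60% of clusters say hand" rule of thumb from the comment above.
        label = PARTS[counts.argmax()] if share >= 0.6 else "uncertain"
        print(label, share)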
