Details And Proposals For Object Detection


Object Detection

Here we have the most complicated situation you can have: we want to detect moving objects in the video stream of a moving camera.

Usually, moving objects are detected from cameras that are not in motion. In that situation you can subtract successive frames to get differential images. Unchanged regions of the image become dark; the remaining regions are the edges of objects in motion. You can then use the differential image as a mask on the current image to extract the edges to detect. You still have to combine all the edges correctly to recover the real objects, but if you know what kind of object to expect (cars, for example), this is possible to do.
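The stationary-camera case can be sketched in a few lines. This is only an illustration, assuming grayscale frames stored as nested lists of integers; the names `difference_mask`, `apply_mask` and the `threshold` parameter are made up for this example and are not part of the project.

```python
def difference_mask(prev_frame, curr_frame, threshold=10):
    """Return a binary mask: 1 where a pixel changed by more than threshold."""
    return [
        [1 if abs(c - p) > threshold else 0 for p, c in zip(prev_row, curr_row)]
        for prev_row, curr_row in zip(prev_frame, curr_frame)
    ]


def apply_mask(frame, mask):
    """Keep only the pixels of frame that the mask marks as changed."""
    return [
        [value if m else 0 for value, m in zip(frame_row, mask_row)]
        for frame_row, mask_row in zip(frame, mask)
    ]


# A bright pixel moves one step to the right between two frames.
prev_frame = [[0, 0, 0], [0, 200, 0], [0, 0, 0]]
curr_frame = [[0, 0, 0], [0, 0, 200], [0, 0, 0]]

mask = difference_mask(prev_frame, curr_frame)
moving = apply_mask(curr_frame, mask)
```

Note that the mask marks both the place the object left and the place it arrived at, which is one reason the edges still have to be combined into real objects afterwards.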

This simple approach does not work if the camera itself is in motion. In the real world it would be very complicated to build a system that works perfectly here. But we don't start in the real world!

To understand what to do, let's have a look at the World Viewer. Our simulation contains all objects of the fictitious world. These objects are bodies, and their visible exterior is defined by textures mapped onto their surfaces. The World Viewer renders an image of all the bodies as seen from a given point.

To be able to detect moving objects in a moving camera's images, you have to:

  • Calculate the (previously) detected object's estimated position in the next image's coordinates.
  • Render a fake image of the object at that location as you'd expect to see it from the new position.
  • Render the current image.
  • Hand the vision engine these two images and have it do a diff to determine motion.
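As a rough illustration of the first step, the sketch below predicts where a previously detected object should appear in the next image. It assumes a simple pinhole camera that looks along the +z axis without rotation; the function name `project_point`, the focal length and the image center are hypothetical values chosen for the example, not taken from the World Viewer.

```python
def project_point(point, camera_pos, focal_length=100.0, center=(64.0, 64.0)):
    """Project a 3D world point into image coordinates (pinhole model).

    Assumes the camera looks along +z and is not rotated.
    Returns None if the point lies behind the camera.
    """
    x = point[0] - camera_pos[0]
    y = point[1] - camera_pos[1]
    z = point[2] - camera_pos[2]
    if z <= 0:
        return None
    u = center[0] + focal_length * x / z
    v = center[1] + focal_length * y / z
    return (u, v)


# Object position known from the previous timestep.
object_pos = (1.0, 0.0, 10.0)

# Where it appeared from the old camera position ...
old_uv = project_point(object_pos, camera_pos=(0.0, 0.0, 0.0))

# ... and where we expect it after the camera has moved.
new_uv = project_point(object_pos, camera_pos=(1.0, 0.0, 5.0))
```

If the object shows up somewhere other than `new_uv` in the real next image, the difference is due to the object's own motion and not to the camera's.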

In the real world this is hard work, but in our situation we get it perfectly by doing almost nothing!


The image we want to fake here is simply the one the World Viewer renders, at least for the part of the world that has not moved! To get what we would have in reality (with a perfect system), we only need one trick: combine the static part of the world as seen from the new timestep's position with the image of the dynamic part of the world at the last timestep's positions, but seen from the new position.

This is what we would get in reality. Now we can compare it with the image of the whole world as seen from the new timestep's position. Again we can get the differential images and find the moving objects.
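The trick can be tried out on a toy world. The sketch below is not the World Viewer; it assumes a made-up 2D renderer in which every object is a single pixel and the camera position is just an integer offset. All names (`render`, `expected_image`, `moving_pixels`) are hypothetical.

```python
def render(objects, camera, width=8, height=4):
    """Toy renderer: draw one-pixel objects as seen from a camera offset."""
    image = [[0] * width for _ in range(height)]
    cam_x, cam_y = camera
    for (x, y), value in objects:
        u, v = x - cam_x, y - cam_y
        if 0 <= u < width and 0 <= v < height:
            image[v][u] = value
    return image


def expected_image(static_objs, dynamic_prev, camera_new):
    """Static world plus dynamic objects at their OLD positions, NEW camera."""
    return render(static_objs + dynamic_prev, camera_new)


def moving_pixels(expected, actual):
    """Return (u, v) coordinates where the two images differ."""
    return [
        (u, v)
        for v, (exp_row, act_row) in enumerate(zip(expected, actual))
        for u, (e, a) in enumerate(zip(exp_row, act_row))
        if e != a
    ]


static_objs = [((2, 1), 50), ((6, 2), 50)]
dynamic_prev = [((3, 1), 200)]   # object position at the last timestep
dynamic_now = [((4, 1), 200)]    # object position at the new timestep
camera_new = (1, 0)              # the camera has moved as well

expected = expected_image(static_objs, dynamic_prev, camera_new)
actual = render(static_objs + dynamic_now, camera_new)
changed = moving_pixels(expected, actual)
```

Because both images are rendered from the new camera position, the static objects cancel out and only the object that really moved is left in `changed`.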

Simple, isn't it?

This is precisely the opposite of real life: we start with the perfect system and, once everything works, we can add uncertainty. We can simulate imperfect motion sensors, noise, lighting changes and so on, all the hard things we would have in reality. But we can do it step by step and in a controlled way, to find out how to handle each of them well.
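One controlled way to add such uncertainty is to put noise on the rendered image and see how many false alarms the comparison produces. The sketch below assumes Gaussian pixel noise and 8-bit grayscale values; `add_noise`, `count_changed`, `sigma` and `threshold` are illustrative names, not part of the project.

```python
import random


def add_noise(image, sigma, seed=None):
    """Add Gaussian noise to every pixel and clamp to the range 0..255."""
    rng = random.Random(seed)
    return [
        [min(255, max(0, round(value + rng.gauss(0.0, sigma)))) for value in row]
        for row in image
    ]


def count_changed(expected, actual, threshold):
    """Count pixels whose difference exceeds the threshold."""
    return sum(
        1
        for exp_row, act_row in zip(expected, actual)
        for e, a in zip(exp_row, act_row)
        if abs(e - a) > threshold
    )


clean = [[100] * 8 for _ in range(4)]

# Perfect system: no noise, so nothing is reported as moving.
perfect = add_noise(clean, sigma=0.0, seed=1)

# Imperfect system: noisy image, compared with two different thresholds.
noisy = add_noise(clean, sigma=5.0, seed=1)
alarms_strict = count_changed(clean, noisy, threshold=0)
alarms_tolerant = count_changed(clean, noisy, threshold=50)
```

Raising `sigma` in small steps and watching the false-alarm count shows how tolerant the comparison has to be before the noise stops looking like motion.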