Evaluation of Color-Based Techniques for Robotic Positioning Tasks

G. Hermann, D. Greboval, H. Kihl, J.P. Urban
MIPS, Université de Haute Alsace, 68093 Mulhouse, France

ABSTRACT

This paper evaluates the potential of color histogram techniques in the context of object tracking through a sequence of images. The method retained is based on a histogram similarity measure at each position in the image. The experimental results on a variety of real-world image sequences are promising.

Keywords: Color histograms, Color similarity, Multicolored objects, Object tracking

1. INTRODUCTION

Our work focuses on neural network learning algorithms applied to visually guided robot positioning tasks. Typically, the robot-arm has to interact with known objects in its environment, relying on active vision techniques to locate the object. The learning algorithms correlate the extracted image features with arm and camera movements to issue control movements that position the end-effector appropriately. Until recently, we did not address the image processing problem in a realistic environment. We concentrated on the learning algorithms and limited the difficulty by using a light bulb to identify the robot's end-effector. With a single target in such a well-conditioned environment, it is straightforward to obtain pixel-precision target tracking from appropriately thresholded grey-level intensity images.

A real-world problem we are about to address is the control of a robot-arm able to reach out to known objects. The video camera mounted on the end-effector now delivers images containing known objects that can be viewed from various angles and distances and under various illuminations. The vision system must identify and localize the requested object, and then follow it while the arm performs the appropriate positioning movement.

The robot has to react to the image feature extracted from the current image. We therefore limit ourselves to techniques that can reasonably be implemented in real time (video rate) on today's workstations. An overview of the literature leads rapidly to color histogram [1, 2, 3] or cross-correlation techniques [4, 5]. In this paper, we consider that the object of interest is composed of a discriminant combination of colors, representing a unique color signature in the scene, even if individual colors can be found elsewhere in the scene. This is a reasonable hypothesis for the range of our applications, where the use of color markers, if needed, is acceptable.

Referring to Swain and Ballard's reference paper [1], Section 2 introduces the classical histogram-based techniques, an approach well liked for its robustness and light computational cost. As an example, the case of color markers on a robot-arm illustrates the discussion. Section 3 introduces a Color Histogram Similarity (CHS) technique for object tracking. The approach is illustrated with sequences that contain robot-arm movements, where markers of different combinations of colors and shapes (circular, conic, arm-badge) are fixed on a 6 d.o.f. robot-arm.

In Section 4 we test the color histogram similarity algorithm on a number of experimental image sequences. These scenes are representative of situations where the feature extraction should be robust. They contain indoor and outdoor natural color object sequences (e.g., a character in a toy box, ball juggling, ...). The sequences are taken under different lighting conditions, to emphasize the effect of parameters such as illumination variation, object deformation through changing orientation, and partial occlusion of the markers. The technique is evaluated in terms of robustness (not losing the object) and precision (distance between the desired and the real trajectory). We conclude with the perspectives that this technique offers within the context of our applications.

SCI 2004 – July 18-21, 2004, Orlando, Florida (USA)


The 8th World Multi-Conference on SYSTEMICS, CYBERNETICS AND INFORMATICS

2. COLOR HISTOGRAM TECHNIQUES

The color histogram represents the color density distribution and is obtained by counting the number of times each color occurs in a discretized image. The histogram of a reference image of the object of interest is a feature that is robust to changes in orientation, size, and angle of view, since spatial information is not taken into account. However, histograms are sensitive to varying lighting conditions, and much effort is put into defining color constancy spaces and efficient color quantization techniques.

Most standard video cameras deliver images in the 24-bit RGB color space. Color quantization along the main RGB axes can be uniform or non-uniform. RGB is a format that is sensitive to color variations and is perceptually non-uniform. Color constancy spaces are obtained from linear or non-linear transformations of the RGB space. The simplest of these spaces is the normalized RGB space, noted rgb, which achieves invariance to intensity (R+G+B) by normalizing each plane. Adaptive quantization techniques are also proposed [6]. A more sophisticated space is HSV. This space corresponds to the projection of the RGB space along its main diagonal, which is a function of intensity. The Hue (H) component defines a category of colors corresponding to color wavelengths and is close to human color perception. When lighting is weak, the hue parameter is less pertinent. More elaborate spaces are described in [7, 8].

Identification

Swain & Ballard [1] introduce the Histogram Intersection algorithm, an indexing technique based on a measure of histogram similarity. Histogram M of a reference image of the object is compared with histogram I of the image in which the object is to be identified. The object covers an important part of the image:

$$H = \frac{\sum_{i=1}^{l} \min(I_i, M_i)}{\sum_{i=1}^{l} M_i} \qquad (1)$$

where l is the number of histogram bins. The algorithm is simple and robust, and defines an L1 metric for normalized histograms. When the object changes in scale, the Bhattacharyya coefficient is favored [2].

Localization

Considering a known object, the tracking problem can be defined as a localization problem in each image of the sequence. We refer again to Swain & Ballard [1] and their classic Back-Projection algorithm (BP). It is obtained from the histogram ratio of M over I, for each bin i:

$$R_i = \min\left(\frac{M_i}{I_i}, 1\right) \qquad (2)$$

For every image pixel, the value of the ratio of the color considered is back-projected to form an image. The dominant colors of the object, if they are only present in the object, will be given a high rank. If an object color is massively present in the background, it will contribute weakly to the formation of the back-projected image. Finally, the image will contain high-density zones corresponding to the possible object locations.

Various methods are proposed to estimate the maximum densities. Swain et al. [1] propose a convolution technique. The Mean-Shift approach [9] can be adapted [10], or a simpler technique is to project the respective densities on rows and columns and retain the maxima [11].

Robot-arm Example

The case of a color-marked, black-painted robot-arm illustrates the discussion. The image in Figure 1a shows the arm with 3 different color markers, each uniquely defined by its color combination: A (blue, green, red, magenta), B (green, magenta), C (blue, red). The purpose is to localize A, the 4-color disk, from a reference image of the disk taken in slightly different conditions. The back-projected image is represented in Figure 1b. The positions of the markers appear clearly, due to a dark uniform background and the presence of reference colors in all markers. Marker A is easily localized with a maximum density estimator.

The back-projection image is pertinent as long as the dominant colors of the reference object remain dominant in the image to be searched. When this is no longer the case, it becomes difficult to interpret. The presence of a representative object color in various places in the scene will diminish the value of the back-projection ratio and lead to the formation of density zones that do not correspond to the object of interest.

Figure 1. Robotic arm with 3 color markers (A, B, C), where A is the one of interest. a. Scene image, with the initialization image shown in its top-left corner. b. Corresponding back-projection image, with the estimated localization obtained by maximum density estimation.
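The histogram intersection measure and the back-projection ratio introduced in this section can be sketched in a few lines of NumPy. This is only an illustrative sketch, not the authors' implementation: the function names are hypothetical, a uniform (8,8,8) RGB quantization is assumed, bins that are empty in the image histogram are given a ratio of 0 (an assumption; no pixel of the image falls in such a bin anyway), and the localization step uses the simple row/column projection mentioned above.

```python
import numpy as np

def quantize_rgb(image, bins_per_channel=8):
    """Map each 8-bit RGB pixel to one bin index (uniform quantization)."""
    step = 256 // bins_per_channel
    q = image.astype(np.int64) // step
    return (q[..., 0] * bins_per_channel + q[..., 1]) * bins_per_channel + q[..., 2]

def color_histogram(bin_image, n_bins):
    """Count how many pixels fall into each color bin."""
    return np.bincount(bin_image.ravel(), minlength=n_bins).astype(np.float64)

def histogram_intersection(I, M):
    """Sum of bin-wise minima, normalized by the model histogram M."""
    return np.minimum(I, M).sum() / M.sum()

def back_project(bin_image, I, M):
    """Replace each pixel by the ratio min(M_i / I_i, 1) of its color bin."""
    ratio = np.zeros_like(M)
    nonzero = I > 0                      # assumption: empty image bins get ratio 0
    ratio[nonzero] = np.minimum(M[nonzero] / I[nonzero], 1.0)
    return ratio[bin_image]

def localize_by_projection(bp):
    """Project the back-projected image on rows and columns, keep the maxima."""
    row = int(np.argmax(bp.sum(axis=1)))
    col = int(np.argmax(bp.sum(axis=0)))
    return row, col
```

With `bins = quantize_rgb(image)`, the histograms `I` and `M` are built with `color_histogram(..., 512)` from the scene and from the reference image, and `localize_by_projection(back_project(bins, I, M))` returns a candidate (row, column) position.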


Comparing histograms to determine maximum similarity in the vicinity of high density zones will solve the problem. We therefore favor a similarity measure at every position of the image. In the following section we introduce the method that we will use for our experiments.

3. HISTOGRAM-BASED TRACKING

The method we retain to robustly track a target in a sequence of images is based on a measure of histogram similarity. For each pixel position (m, n) in the image, the model histogram M is compared with the histogram I(m, n) of a window of interest of the same size as the object reference image. The object is localized where the similarity measure is maximum. We will refer to this method as the Color Histogram Similarity (CHS).

Robot-arm example

To position the robot-arm in its 3-dimensional workspace, we add a conic color marker to the end-effector and a color badge on axis 2 (see Figure 2). The experiment consists of tracking both markers during the robot movement. Under good lighting conditions, it is an easy task for both the back-projection and the histogram similarity algorithms to track the markers through the sequence of images.

A more challenging sequence

Figure 2 shows four images of an arm movement sequence taken in unfriendly conditions. The background is green (one of the marker's colors), there is a neon light source from the corridor behind the robot, and lighting is weak. This last parameter makes the colors of the conic marker hard to distinguish.
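The per-position comparison described in this section can be sketched as a brute-force scan. The text does not state which similarity measure is used, so this sketch assumes the normalized histogram intersection of Section 2; the function names are hypothetical, the input is an image of quantized color-bin indices, and (m, n) is taken here to be the top-left corner of the window, which is also an assumption.

```python
import numpy as np

def chs_surface(bin_image, model_hist, win_h, win_w, n_bins):
    """Similarity between the model histogram and the histogram of the
    window placed at every position (m, n) of a bin-index image."""
    rows = bin_image.shape[0] - win_h + 1
    cols = bin_image.shape[1] - win_w + 1
    surface = np.zeros((rows, cols))
    model_sum = model_hist.sum()
    for m in range(rows):
        for n in range(cols):
            window = bin_image[m:m + win_h, n:n + win_w]
            hist = np.bincount(window.ravel(), minlength=n_bins)
            # assumed similarity measure: normalized histogram intersection
            surface[m, n] = np.minimum(hist, model_hist).sum() / model_sum
    return surface

def localize(surface):
    """Position (m, n) where the similarity surface is maximum."""
    m, n = np.unravel_index(np.argmax(surface), surface.shape)
    return int(m), int(n)
```

Because the whole histogram is compared, a window that contains only one of the marker's colors scores lower than the window holding the full color combination, which is the property the method relies on.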

Figure 2. Four images from a robotic arm sequence. The task is to track the conic end-effector and the arm-badge on axis 2.

Figure 3. Localization of the color cone in image 10 of the sequence, based on an initialization in image 1. a) Back-projection image with failed estimated localization. b) Color histogram similarity surface: the cone is successfully localized.

On the sequence of Figure 2, the back-projection method fails, because the back-projected image (Figure 3a) is no longer pertinent: there are large grey areas but no high-density zones that would point to a possible solution. In particular, the cone does not stand out in this image. In contrast, the color histogram similarity method clearly leads to the cone, showing superior robustness for this sequence.

4. EXPERIMENTS

We run a number of experiments to further test the robustness of this simple approach. Since the method is sensitive to varying lighting conditions, we present two series of image sequences:
- a first series of constant illumination sequences, where the focus is on the robustness of the back-projection and histogram similarity methods;
- a second series of sequences with varying illumination, where we compare several constancy spaces.

Experimental Setup

The color histogram of the object to be tracked is initialized from the first image. The size of the search window is fixed at this time and is not adapted to the size of the objects. Other methods to define models are proposed in [12, 13]. The objects to be tracked are quite small in the image and their size does not change drastically, except for the toy box character. The image sequences are taken with a Sony DCR-PC120E camcorder at 25 images/s. It delivers a 24-bit RGB image of resolution 576 × 720.

Constant illumination sequences

Four sequences are presented in Figure 4:
a. In the toy box sequence, the static scene is rich in colors and the hand-held camera makes a very non-linear 140° rotation movement around the box. The size and the angle of view of the character therefore change drastically.
b. In the juggling sequence, the black/orange ball changes position quickly in the scene and is sometimes occluded.
c. The diabolo player scene is a difficult outdoor sequence because of the very distracting grass and trees in the background.
d. In the walking man sequence, the target is in the distance and therefore really small.

The four sequences are processed in the RGB space quantized into 128 bins. The tracking results are given in Table 1. The comparison criterion is the distance between the estimated position (BP or CHS) and the desired one, which is determined manually; small errors are therefore not significant.

a. Toy box sequence (model initialized on the driver character in image 1; images 56 and 112 shown). A car driver character is tracked in a rich, bright color environment. The scene is static and the camera moves around the box, considerably changing the character's angle of view (about 140°).

b. Juggling sequence (images 1 and 130 shown). The black/orange ball is to be tracked in this scene. There is some orange on the poster behind the juggler.

c. Diabolo player sequence (images 1 and 112 shown). A green and purple diabolo is to be tracked in an outdoor scene with lots of green grass and trees in the background.

Table 1. BP and CHS comparison for constant illumination sequences. The table contains mean errors (m), standard deviations (Std), and maximum errors (Max) in pixels.

Sequence | BP m | BP Std | BP Max | CHS m | CHS Std | CHS Max
Toy box | 23 | 9.86 | 51 | 5 | 2.7 | 10
Ball juggling | 9.5 | 14.8 | 61 | 5.8 | 5.9 | 35
Diabolo player | - | - | - | 2.4 | 2.3 | 12.8
Walking man | 1.69 | 0.9 | 4.24 | 2.4 | 1.6 | 5.9
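The three statistics reported per sequence (m, Std, Max) can be computed from the estimated and the manually determined positions as in the following sketch. The function name is hypothetical, the error is assumed to be the Euclidean pixel distance, and the standard deviation is NumPy's default population form; the paper does not say which form was used.

```python
import numpy as np

def tracking_errors(estimated, reference):
    """Mean, standard deviation and maximum of the per-image position error.

    estimated, reference: sequences of (row, col) positions, one per image.
    """
    est = np.asarray(estimated, dtype=float)
    ref = np.asarray(reference, dtype=float)
    dist = np.linalg.norm(est - ref, axis=1)   # assumed: Euclidean distance in pixels
    return {"m": dist.mean(), "Std": dist.std(), "Max": dist.max()}
```

For example, `tracking_errors([(0, 0), (3, 4)], [(0, 0), (0, 0)])` evaluates two images whose errors are 0 and 5 pixels.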

The CHS algorithm detects the right object position in all cases but one: in the juggling sequence, the black/orange ball is lost once and an orange mark on the poster behind the juggler is detected instead. The BP algorithm fails on the diabolo player sequence and performs poorly on the ball juggling sequence.

Varying illumination sequences

Three sequences are presented in Figure 5:
a. An indoor Lego car sequence with weak illumination.
b. The Doktor Diesel sequence, where natural back-light comes from the right and the illumination changes while the character is rotated.
c. The Lego outdoor sequence, containing strong light-shadow crossings.

Table 2. Color space comparison

Quantization | (a) Lego car, weak lighting | (b) Doktor Diesel | (c)' Lego car outdoor, init. in sun | (c)'' Lego car outdoor, init. in shadow
RGB (8,8,8), uniform | 1 loss | ok | failure | failure
RGB (128), min variance | ok | ok | failure | failure
rg (8,8), uniform | ok | ok | ok | ok
HS (8,8), uniform | ok | ok | ok | ok
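The two intensity-independent representations compared in Table 2 can be obtained from RGB as sketched below. This is a sketch under assumptions: the function names are hypothetical, rg is taken as the first two normalized-RGB planes, and H and S come from Python's standard `colorsys` conversion, which may differ in detail from the transformation used for the experiments. The (8,8) uniform quantization follows the table.

```python
import colorsys
import numpy as np

def rgb_to_rg(image):
    """Normalized chromaticities r = R/(R+G+B) and g = G/(R+G+B)."""
    rgb = image.astype(np.float64)
    total = rgb.sum(axis=-1, keepdims=True)
    total[total == 0] = 1.0              # black pixels: avoid division by zero
    return (rgb / total)[..., :2]

def rgb_to_hs(image):
    """Hue and saturation in [0, 1], dropping the value (intensity) plane."""
    flat = (image.astype(np.float64) / 255.0).reshape(-1, 3)
    hs = np.array([colorsys.rgb_to_hsv(r, g, b)[:2] for r, g, b in flat])
    return hs.reshape(image.shape[:-1] + (2,))

def quantize_2d(channels, bins=8):
    """Uniform (bins x bins) quantization of two planes with values in [0, 1]."""
    idx = np.minimum((channels * bins).astype(int), bins - 1)
    return idx[..., 0] * bins + idx[..., 1]
```

A pixel and the same pixel at twice the intensity, for instance (100, 50, 50) and (200, 100, 100), map to the same rg and HS values and therefore to the same histogram bin, which is why these spaces tolerate illumination changes better than raw RGB.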

d. Walking man scene (images 1 and 180 shown). A man wearing a blue and black jacket walks down the street in the distance.

Figure 4. A series of constant illumination sequences.

a. Lego car indoor sequence (images 1 and 112 shown). The color marker is made of red, green and blue Lego bricks. There is little illumination in the corridor.

b. Doktor Diesel sequence (images 10 and 130 shown). The toy is moved around. The natural light coming through the window from the rear right produces a back-light effect.

c. Lego car outdoor sequence (images 1 and 56 shown). This scene was taken on a sunny day and contains a challenging light-shadow crossing.

Figure 5. A series of sequences with varying illumination.

Using the CHS algorithm, we compare various color spaces. Table 2 presents the tracking quality for the sequences in Figure 5. The sequences of the first series showed that the RGB color space works well in many cases. Under varying light conditions, the rg or HS color spaces are to be preferred. In conclusion, the CHS algorithm, used in a color constancy space if necessary, gives surprisingly good results.

5. CONCLUSION

We have tested a color histogram similarity approach to track an object of interest through a sequence of images. The robustness of localization in a variety of real-world scenes and conditions shows potential. The algorithm is easy to implement and to use. No empirical parameter tuning is required.

The algorithm must now be optimized for real-time video-rate implementation. In this paper we focused on the robustness of the approach rather than on algorithmic optimization. Our preliminary tests make us optimistic about a real-time video-rate implementation. The CHS algorithm for a 576 × 720 pixel image using a 20 × 20 pixel model takes about 300 ms on a standard PC workstation. In this test the image is entirely scanned, and we do not take advantage of the fact that the images form a sequence. The next step is to introduce the approach in a visual servoing scheme where the tracked object's image position is used to control a robot-arm to reach a target defined in the image space.

REFERENCES

[1] M. J. Swain, D. H. Ballard, Color indexing, International Journal of Computer Vision, 7(1), 1991, 11-32.
[2] D. Comaniciu, V. Ramesh, P. Meer, Real-time tracking of non-rigid objects using mean shift, IEEE CVPR 2000.
[3] R. Schettini, G. Ciocca, S. Zuffi, A survey of methods for colour image indexing and retrieval in image databases, Color Imaging Science: Exploiting Digital Media (R. Luo, L. MacDonald, eds.), J. Wiley, 2001.
[4] M. Cahn von Seelen, Adaptive correlation tracking of targets with changing scale, Reconnaissance, Surveillance, and Target Acquisition for the Unmanned Ground Vehicle, Morgan Kaufmann Publishers, San Francisco, CA, 1997, 313-322.
[5] D. A. Montera et al., Object tracking through adaptive correlation, Optical Engineering, 33(1), 1994, 294-302.
[6] N. Papamarkos et al., Adaptive color reduction, IEEE Transactions on Systems, Man, and Cybernetics – Part B: Cybernetics, 32(1), February 2002.
[7] B. Funt, G. Finlayson, Color constant color indexing, IEEE Transactions on Pattern Analysis and Machine Intelligence, 17(5), 1995, 522-529.
[8] T. Gevers, A. W. M. Smeulders, Color-based object recognition, Pattern Recognition, 32, 1999, 453-464.
[9] D. Comaniciu, P. Meer, Mean shift: a robust approach toward feature space analysis, IEEE Transactions on Pattern Analysis and Machine Intelligence, 24(5), May 2002.
[10] G. Jaffré, A. Crouzil, Non-rigid object localization from color model using mean shift, IEEE ICIP 2003.
[11] J. I. Agbinya, D. Rees, Multi-object tracking in video, Real-Time Imaging, 5, 1999, 295-304.
[12] A. Koschan et al., Color active shape models for tracking non-rigid objects, Pattern Recognition Letters, 24, 2003, 1751-1765.
[13] Q. Liu, S. Ma, H. Lu, Head tracking using shapes and adaptive color histograms, Journal of Computer Science and Technology, 17(6), November 2002, 859-864.
