Optimized two-frequency phase-measuring-profilometry light-sensor temporal-noise sensitivity

Jielin Li,* Laurence G. Hassebrook, and Chun Guan

Department of Electrical Engineering, University of Kentucky, 453 AH, Lexington, Kentucky 40506-0046

Received July 16, 2002; accepted August 1, 2002

Temporal frame-to-frame noise in multipattern structured light projection can significantly corrupt depth measurement repeatability. We present a rigorous stochastic analysis of phase-measuring-profilometry temporal noise as a function of the pattern parameters and the reconstruction coefficients. The analysis is used to optimize the two-frequency phase measurement technique. In phase-measuring profilometry, a sequence of phase-shifted sine-wave patterns is projected onto a surface. In two-frequency phase measurement, two sets of pattern sequences are used. The first, low-frequency set establishes a nonambiguous depth estimate, and the second, high-frequency set is unwrapped, based on the low-frequency estimate, to obtain an accurate depth estimate. If the second frequency is too low, then depth error is caused directly by temporal noise in the phase measurement. If the second frequency is too high, temporal noise triggers ambiguous unwrapping, resulting in depth measurement error. We present a solution for finding the second frequency at which the depth variance due to intensity noise is at its minimum. © 2003 Optical Society of America

OCIS codes: 110.6880, 120.2650, 120.5800, 150.5670, 120.4290.

1. INTRODUCTION

The structured light (SL) illumination approach to active vision1,2 is one of the most important techniques in current three-dimensional (3D) shape measurement. SL decoding techniques model the optical paths associated with emission and detection of reflected SL patterns to compute range-data correspondence by triangulation. Compared with passive vision systems such as stereo vision, SL techniques overcome the fundamental ambiguities associated with passive approaches, especially in a low-texture environment. SL also has the advantages of computational simplicity and high precision. It has been used in the fields of biomedical topology,3 quality control,4 and telecollaboration.5

There are a variety of different SL patterns, which include single-stripe,6,7 multistripe,8 gradient,9,10 binary,11 sine-wave,12 and various specialty patterns.1,2 These patterns are typically implemented in one of three ways: single frame,13 lateral-shift multiframe,14 and encoded multiframe.11,15–17 The multiframes are projected in sequence, color encoded,15 or combined into a single composite pattern.13 Hybrid methods, which include the two-frequency phase-measuring-profilometry (PMP) method,18 on which our study is focused, are used to further enhance specific performance aspects. We have chosen to use the two-frequency PMP for measuring human face topology because (1) it is resistant to target-surface albedo variations, (2) it is resistant to ambient-light contributions, (3) it yields nonambiguous depth measurement, (4) with a second, high frequency, it has the potential for attaining a high degree of accuracy in the depth reconstruction, and (5) the phase reconstruction is a pixelwise operation independent of the target object. Unlike the binary encoding methods, the PMP technique is not limited by the number of bit patterns used. However, in contrast to the binary methods, the PMP method depends on the pattern projector and the image-capture technology having an adequate range of intensity values. That is, if the dynamic range and the intensity resolution of the camera and the frame grabber are not large enough, then PMP is sensitive to saturation caused by highly specular surfaces. The same is true for ambient light if its intensity is significant with respect to the pattern intensity. That is, if the captured intensity saturates, then the phase cannot be accurately recovered. So PMP is best suited for mattelike surfaces with albedo variations and ambient-light contributions within the digitization range of the camera and frame-grabber technology. An overview of the PMP method used in this study is given in Section 2.

A primary concern in SL techniques is depth measurement error. For example, a rigorous study by Trobina19 mathematically modeled the measurement error due to calibration errors by using a coded-light approach. As another example, Daley and Hassebrook20 applied information theory to SL binary image encoding in order to use binary entropy to find the maximum spatial stripe frequency and thus estimate the maximum bit resolution in depth reconstruction. A key component of estimating the depth error is the calibration process and the resulting reconstruction coefficients. The calibration/reconstruction method used in this study is detailed in Section 3 and is similar to the procedure of Trobina,19 which has its origins in the stereoscopic calibration methods presented by Faugeras and Toscani21 and Tsai.22 Our apparatus, as described in Section 3, had no significant radial distortion, so a classic pinhole model is used to predict the perspective distortion. A common theme in calibration error analysis19,23,24 is an emphasis on the effects of spatial and systematic errors from the calibration process. In contrast, we focus our attention on the effect of temporal noise corrupting each captured pattern projection in the PMP process.

Therefore our analysis determines the PMP method's repeatability and standard deviation (STD). Errors in calibration accuracy result in systematic errors that do not vary from measurement to measurement; so, given that there are no gross errors in the calibration process, these systematic errors have no significant effect on the repeated-measurement variance. However, the error analysis in Section 4 does incorporate the reconstruction coefficients. This analysis is new to PMP methods but is similar to, and inspired by, a quantization analysis by Behrooz Kamgar-Parsi and Behzad Kamgar-Parsi.25 The resulting analysis yields the phase STD of the PMP method, which is used to predict an optimum second frequency. A numerical model is presented for predicting the STD as a function of the temporal sensor noise, the spatial pattern frequency, the sine-wave projection-pattern amplitude, and the number of pattern shifts used. The phase STD model, the optimum-frequency estimate, and the numerical performance model provide the system designer with valuable insight into the two-frequency PMP method. Our design considerations are given at the end of Section 4, and the conclusions are given in Section 5, followed by acknowledgments.

Fig. 2. Coordinates in active range finder.

2. PHASE-MEASURING-PROFILOMETRY RANGE-FINDING METHOD

All SL techniques are based on triangulation between the camera and the projected light. For example, in Fig. 1, a light projector projects a light-stripe pattern onto the object. Shown in Fig. 1 is a light stripe, but other patterns, such as grids, binary bars, and sine-wave fringes, can also be projected. The light pattern hits the object and appears deformed when observed from another angle. Capturing the deformed image with a CCD camera, we calculate the range profile of the illuminated object by analyzing the deformation.

Fig. 1. Geometry of single-stripe SL range finder.

To extract the depth information from the deformation, we need to know the geometric relationship among the camera, the projector, and the world coordinate system. The procedure used to obtain these parameters is known as camera calibration. As shown in Fig. 2, there are three coordinate systems: (1) the 3D world coordinate P^w = (X^w, Y^w, Z^w), which is the physical coordinate in the object space and whose origin and orientation are determined by the observer's convenience; (2) the camera coordinate P^c = (x^c, y^c), which is a two-dimensional (2D) pixel coordinate on the image plane of the CCD camera; and (3) the projector coordinate P^p = (x^p, y^p), which is also a 2D pixel coordinate. We refer to the camera and projector combination as a 3D sensor. The objective of 3D sensor calibration is to find the mapping transformation from the 3D world coordinate to the 2D camera and projector coordinates. The procedure of range reconstruction is to find the 3D world coordinate from the 2D camera and projector coordinates.

Fig. 3. Base-frequency projections onto a Space Shuttle model for N = 4.

As described in Section 1, PMP is an attractive SL method for several reasons. It requires as few as three projection frames and no point matching or image enhancement to obtain the fringe distortion, which makes it suitable for a pipelined or parallel-processing implementation. In PMP, the projected light pattern is a sine wave that is shifted several times, and the projected intensity is expressed as

\[ I_n(x^p, y^p) = A^p + B^p \cos(2\pi f y^p - 2\pi n/N), \tag{1} \]

where A^p and B^p are constants of the projector, f is the frequency of the sine wave, and (x^p, y^p) is the projector coordinate. The subscript n represents the phase-shift index, and the total number of phase shifts is N. From the viewpoint of the camera, the received image is distorted by the topology and is expressed as

\[ I_n(x^c, y^c) = A(x^c, y^c) + B(x^c, y^c) \cos[\phi(x^c, y^c) - 2\pi n/N], \tag{2} \]

where φ(x^c, y^c) represents the phase of the sine wave. If the projected sine pattern is shifted by a factor of 2π/N for N times, the phase distortion φ(x^c, y^c) is retrieved by14

\[ \phi(x^c, y^c) = \arctan\!\left[\frac{\sum_{n=1}^{N} I_n(x^c, y^c)\sin(2\pi n/N)}{\sum_{n=1}^{N} I_n(x^c, y^c)\cos(2\pi n/N)}\right]. \tag{3} \]

It can be shown from Eqs. (1) and (2) that the projector coordinate y^p can be recovered as

\[ y^p = \phi(x^c, y^c)/(2\pi f). \tag{4} \]

Once the phase is obtained, the depth of the object can easily be obtained by a geometric computation.26 The base frequency is the f that gives one cycle across the field of view and therefore yields a nonambiguous depth value. An example of the base-frequency projections for N = 4 is shown in Fig. 3. However, the unit frequency is noisy, so a second, higher frequency is used to improve the accuracy of the depth value.18
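Since the phase retrieval of Eq. (3) is a pixelwise operation, it maps directly onto array code. The following is a minimal NumPy sketch, not the authors' implementation; the function name pmp_phase and the stacked-frame input format are our own assumptions.

```python
import numpy as np

def pmp_phase(frames):
    """Wrapped phase map from N phase-shifted PMP images [Eq. (3)].

    frames: (N, H, W) array of captured images I_n(x^c, y^c), n = 1..N.
    Returns the phase in (-pi, pi] at every camera pixel.
    """
    N = frames.shape[0]
    n = np.arange(1, N + 1)
    S = np.tensordot(np.sin(2 * np.pi * n / N), frames, axes=1)  # numerator of Eq. (3)
    C = np.tensordot(np.cos(2 * np.pi * n / N), frames, axes=1)  # denominator of Eq. (3)
    return np.arctan2(S, C)  # four-quadrant arctangent recovers the full phase range
```

Equation (4) then gives the projector coordinate as pmp_phase(frames) / (2 * np.pi * f).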

3. SYSTEM CALIBRATION AND RECONSTRUCTION

Calibration allows versatility in the actual apparatus yet results in accurate reconstruction of the target surface. The apparatus, the calibration formulas, and the reconstruction matrices are now given in detail.

The apparatus consisted of a Pulnix TMC-7 color camera and a Texas Instruments digital-light-processor (DLP) projector. The camera produces composite color video, which is captured by a Data Translation DT3153 frame grabber, yielding a 640 × 480 pixel array with 24-bit color and 8-bit intensity resolution. The Texas Instruments DLP development-kit projector has an 800 × 600 micromechanical mirror array. The camera lens is a TOYO Optics TV zoom lens, and the projector lens is the zoom lens that came with the DLP kit. The camera is set up directly above the projector with an angle of 29.13° between their optical axes, which intersect at the target plane, 1.575 m from the projector/camera unit. The camera was adjusted to be approximately perpendicular to the target plane, which in turn is aligned parallel with the X^w–Y^w world coordinate plane.

The camera field of view at the target plane is 618 mm wide and 508 mm high. The projector field of view is 840 mm wide and 635 mm high. The ratios of projector pixel to camera pixel are 0.9195 horizontally and 1.0 vertically. The camera pixel size on the target plane is 0.97 mm/pixel along the horizontal and 1.06 mm/pixel along the vertical. The angular resolution of the camera is 0.0327 deg/pixel, and that of the projector is 0.0368 deg/pixel, both along the horizontal. The radial distortion was insignificant for both the camera and the projector lenses, with a maximum radial barrel distortion along the outer boundaries of their fields of view, with respect to their optical centers, of less than 0.67%. We chose this apparatus setup to scan human busts and faces, but the theory should apply to many other configurations and applications.

Because the radial distortion is low, we are able to use the "pinhole lens" to model the perspective distortion. The pinhole lens model is a common geometric model for camera imaging systems and is shown in Fig. 4. Had there been significant radial distortion,27,28 the optics of the camera and/or the projector would need to be calibrated off line, and a spatial-correction lookup table could be used to correct the received image of the camera and/or the projected image of the projector.

Fig. 4. Perspective projection model.

A. Singular-Value-Decomposition-Based System Calibration

We use a standard singular-value-decomposition-based (SVD-based) approach to the camera and projector calibration.21–24 The transforms from world to camera coordinates are given by

\[ x^c = \frac{m_{11}^{wc} X^w + m_{12}^{wc} Y^w + m_{13}^{wc} Z^w + m_{14}^{wc}}{m_{31}^{wc} X^w + m_{32}^{wc} Y^w + m_{33}^{wc} Z^w + m_{34}^{wc}}, \tag{5a} \]

\[ y^c = \frac{m_{21}^{wc} X^w + m_{22}^{wc} Y^w + m_{23}^{wc} Z^w + m_{24}^{wc}}{m_{31}^{wc} X^w + m_{32}^{wc} Y^w + m_{33}^{wc} Z^w + m_{34}^{wc}}, \tag{5b} \]

where the perspective matrix is

\[ \mathbf{M}^{wc} = \begin{bmatrix} m_{11}^{wc} & m_{12}^{wc} & m_{13}^{wc} & m_{14}^{wc} \\ m_{21}^{wc} & m_{22}^{wc} & m_{23}^{wc} & m_{24}^{wc} \\ m_{31}^{wc} & m_{32}^{wc} & m_{33}^{wc} & m_{34}^{wc} \end{bmatrix}. \tag{6} \]

As stated by Trucco and Verri,29 the perspective matrix M^wc in Eq. (6) has only 11 independent entries, which can be determined through a homogeneous linear system from at least six world-camera point matches. If more than six points can be obtained, then the matrix can be estimated

through least-squares techniques. If we assume that there are M matched points, the denominators of Eqs. (5) can be moved to the left-hand side, and Eqs. (5) can be rewritten in linear form as Am = 0, where A is a 2M × 12 matrix structured as

\[ \mathbf{A} = \begin{bmatrix}
X_1^w & Y_1^w & Z_1^w & 1 & 0 & 0 & 0 & 0 & -x_1^c X_1^w & -x_1^c Y_1^w & -x_1^c Z_1^w & -x_1^c \\
0 & 0 & 0 & 0 & X_1^w & Y_1^w & Z_1^w & 1 & -y_1^c X_1^w & -y_1^c Y_1^w & -y_1^c Z_1^w & -y_1^c \\
X_2^w & Y_2^w & Z_2^w & 1 & 0 & 0 & 0 & 0 & -x_2^c X_2^w & -x_2^c Y_2^w & -x_2^c Z_2^w & -x_2^c \\
0 & 0 & 0 & 0 & X_2^w & Y_2^w & Z_2^w & 1 & -y_2^c X_2^w & -y_2^c Y_2^w & -y_2^c Z_2^w & -y_2^c \\
\vdots & \vdots & \vdots & \vdots & \vdots & \vdots & \vdots & \vdots & \vdots & \vdots & \vdots & \vdots \\
X_M^w & Y_M^w & Z_M^w & 1 & 0 & 0 & 0 & 0 & -x_M^c X_M^w & -x_M^c Y_M^w & -x_M^c Z_M^w & -x_M^c \\
0 & 0 & 0 & 0 & X_M^w & Y_M^w & Z_M^w & 1 & -y_M^c X_M^w & -y_M^c Y_M^w & -y_M^c Z_M^w & -y_M^c
\end{bmatrix} \tag{7} \]

and

\[ \mathbf{m} = [\,m_{11}^{wc} \;\; m_{12}^{wc} \;\; m_{13}^{wc} \;\; \cdots \;\; m_{34}^{wc}\,]^T. \tag{8} \]

Since A has rank 11, the vector m can be recovered from the SVD

\[ \mathbf{A} = \mathbf{U}\mathbf{D}\mathbf{V}^T, \tag{9} \]

where U is a 2M × 2M matrix whose columns are orthogonal vectors, D is a positive diagonal matrix of singular values, and V is a 12 × 12 matrix whose columns are orthogonal. The only nontrivial solution corresponds to the last column of V, and that is the solution for the parametric matrix M^wc.

To find these calibration points, we use a 16-point pattern marked off on an aluminum frame. The world (X^w, Y^w, Z^w) coordinates of the points, in millimeters, are (0, 254, 645), (0, 381, 645), (0, 508, 645), (0, 635, 645); (128, 127, 0), (128, 254, 0), (128, 381, 0), (128, 508, 0); (305, 127, 0), (305, 254, 0), (305, 381, 0), (305, 508, 0); and (430, 254, 645), (430, 381, 645), (430, 508, 645), (430, 635, 645). The coordinate accuracy is ±2 mm. The projector and camera coordinates are manually found by moving a projected spot onto a target point and then evaluating the camera coordinate. With these 16 points, the transformation matrix M^wc is calculated by solving Am = 0.

For the world-to-projector transformation matrix M^wp, we assume a similar perspective projection, in which the camera coordinate (x^c, y^c) is replaced by the projector coordinate (x^p, y^p), and a similar perspective matrix,

\[ \mathbf{M}^{wp} = \begin{bmatrix} m_{11}^{wp} & m_{12}^{wp} & m_{13}^{wp} & m_{14}^{wp} \\ m_{21}^{wp} & m_{22}^{wp} & m_{23}^{wp} & m_{24}^{wp} \\ m_{31}^{wp} & m_{32}^{wp} & m_{33}^{wp} & m_{34}^{wp} \end{bmatrix}, \tag{10} \]

is obtained for the corresponding points in world and projector coordinates. It should be noted that, because a DLP projector is being used, the x^p coordinates are known and thus can contribute to the calculation of the denominator coefficients of the projector y^p transform in Eq. (11) in Subsection 3.B. However, the x^p transformation is not used anywhere else. If the x^p values were not known, they would be replaced by additional known values of y^p in the SVD matrix A.
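As a concrete illustration of Eqs. (7)–(9), the sketch below assembles A from the matched points and extracts m as the singular vector associated with the smallest singular value (the last row of V^T in NumPy's convention). This is a generic direct linear solver under our own naming, not the authors' code.

```python
import numpy as np

def calibrate_perspective(world, pixels):
    """Solve Am = 0 for a 3x4 perspective matrix [Eqs. (7)-(9)].

    world:  (M, 3) world points (X^w, Y^w, Z^w), with M >= 6.
    pixels: (M, 2) matched camera (x^c, y^c) or projector coordinates.
    """
    rows = []
    for (X, Y, Z), (x, y) in zip(world, pixels):
        rows.append([X, Y, Z, 1, 0, 0, 0, 0, -x * X, -x * Y, -x * Z, -x])
        rows.append([0, 0, 0, 0, X, Y, Z, 1, -y * X, -y * Y, -y * Z, -y])
    A = np.asarray(rows, dtype=float)   # 2M x 12 matrix of Eq. (7), rank 11
    _, _, Vt = np.linalg.svd(A)         # A = U D V^T [Eq. (9)]
    m = Vt[-1]                          # last column of V: the nontrivial null vector
    return m.reshape(3, 4)              # perspective matrix M^wc or M^wp
```

Applied once to the 16 camera matches and once to the projector matches, the same routine yields both M^wc and M^wp.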

B. Range Reconstruction

Once the world-to-camera transformation matrix M^wc and the world-to-projector transformation matrix M^wp are obtained, the range of the object, i.e., the world coordinate of the object, can be computed by solving a 3 × 3 linear system. The point (X^w, Y^w, Z^w) is projected onto the camera plane through M^wc and onto the projection plane through M^wp. During a scan, the image position (x^c, y^c) is known, and the y phase position y^p of the projection pattern is determined from the detected phase through Eq. (4). The unknown is the world coordinate (X^w, Y^w, Z^w), which satisfies Eqs. (5) and

\[ y^p = \frac{m_{21}^{wp} X^w + m_{22}^{wp} Y^w + m_{23}^{wp} Z^w + m_{24}^{wp}}{m_{31}^{wp} X^w + m_{32}^{wp} Y^w + m_{33}^{wp} Z^w + m_{34}^{wp}}. \tag{11} \]

Rearranging Eqs. (5) and (11) and letting

\[ \mathbf{C} = \begin{bmatrix}
m_{11}^{wc} - m_{31}^{wc} x^c & m_{12}^{wc} - m_{32}^{wc} x^c & m_{13}^{wc} - m_{33}^{wc} x^c \\
m_{21}^{wc} - m_{31}^{wc} y^c & m_{22}^{wc} - m_{32}^{wc} y^c & m_{23}^{wc} - m_{33}^{wc} y^c \\
m_{21}^{wp} - m_{31}^{wp} y^p & m_{22}^{wp} - m_{32}^{wp} y^p & m_{23}^{wp} - m_{33}^{wp} y^p
\end{bmatrix}, \tag{12} \]

\[ \mathbf{D} = \begin{bmatrix}
m_{34}^{wc} x^c - m_{14}^{wc} \\
m_{34}^{wc} y^c - m_{24}^{wc} \\
m_{34}^{wp} y^p - m_{24}^{wp}
\end{bmatrix}, \tag{13} \]

we can compute the world coordinate as

\[ \mathbf{P}^w = [X^w \;\; Y^w \;\; Z^w]^T = \mathbf{C}^{-1}\mathbf{D}. \tag{14} \]

Thus the world coordinate of an image point is obtained by solving a 3 × 3 linear system. In the range reconstruction, the world point P^w in Eq. (14) is regarded as the intersection of observation line AB and projection line CB in Fig. 2, so it is unique; if for any reason there is no solution to Eq. (14), the point is invalid. An invalid point may occur when there is shadowing, saturation, or low pattern signal energy.
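In code, the reconstruction is a direct transcription of Eqs. (12)–(14), with a singular C signaling an invalid pixel as noted above. The function below is our own sketch; Mwc and Mwp are the 3 × 4 matrices from the calibration, and yp is obtained from the detected phase through Eq. (4).

```python
import numpy as np

def reconstruct_point(Mwc, Mwp, xc, yc, yp):
    """World point P^w = C^{-1} D for one camera pixel [Eqs. (12)-(14)]."""
    C = np.vstack([
        Mwc[0, :3] - xc * Mwc[2, :3],   # camera x^c row of Eq. (12)
        Mwc[1, :3] - yc * Mwc[2, :3],   # camera y^c row
        Mwp[1, :3] - yp * Mwp[2, :3],   # projector y^p row
    ])
    D = np.array([
        xc * Mwc[2, 3] - Mwc[0, 3],     # Eq. (13)
        yc * Mwc[2, 3] - Mwc[1, 3],
        yp * Mwp[2, 3] - Mwp[1, 3],
    ])
    try:
        return np.linalg.solve(C, D)    # Eq. (14)
    except np.linalg.LinAlgError:
        return None                     # invalid point: shadowing, saturation, ...
```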

4. INFLUENCE OF INTENSITY NOISE ON RANGE RECONSTRUCTION

Compared with other SL algorithms, such as light-stripe, binary-bar, and gray-code projection, the PMP algorithm uses fewer frames for a given precision.

However, projecting a sine-wave light pattern requires the projector to support multiple gray levels, and the reconstructed phase is sensitive to intensity noise in the captured image. The noise sources include ambient light, shadowing, projector illumination noise, camera/projector flicker, camera noise, and quantization error in the frame grabber and the projector. To model the intensity noise, we add noise to the A(x^c, y^c) term in Eq. (2) such that

\[ A_n = A(x^c, y^c) = A^0(x^c, y^c) + \Delta A_n(x^c, y^c), \tag{15} \]

where A^0(x^c, y^c) is the ideal background intensity and does not change during the test, and ΔA_n(x^c, y^c) is the random intensity noise. To simplify our analysis, we assume that the ΔA_n(x^c, y^c) are independent over n and are second-order processes. By substituting Eq. (15) into the numerator of Eq. (3) and considering a single pixel (x^c, y^c), we can rewrite the numerator as

\[ S(A_n) = \sum_{n=1}^{N} I_n \sin(2\pi n/N) = \sum_{n=1}^{N} A_n \sin(2\pi n/N) + \frac{N}{2} B \sin\phi. \tag{16} \]

Similarly, the denominator of Eq. (3) can be rewritten as

\[ C(A_n) = \sum_{n=1}^{N} I_n \cos(2\pi n/N) = \sum_{n=1}^{N} A_n \cos(2\pi n/N) + \frac{N}{2} B \cos\phi. \tag{17} \]

Inserting Eqs. (16) and (17) into Eq. (3), we have

\[ \phi = \arctan\!\left[\frac{S(A_n)}{C(A_n)}\right]. \tag{18} \]

To estimate the change of φ with respect to ΔA_n, we use the gradient

\[ \Delta\phi = \sum_{n=1}^{N} \left.\frac{\partial\phi}{\partial A_n}\right|_{A_n = A_n^0} \Delta A_n, \tag{19} \]

where

\[ \left.\frac{\partial\phi}{\partial A_n}\right|_{A_n = A_n^0} = \frac{1}{1 + [S(A_n^0)/C(A_n^0)]^2}\, \left.\frac{S(A_n^0)\,\partial C(A_n)/\partial A_n - C(A_n^0)\,\partial S(A_n)/\partial A_n}{C^2(A_n)}\right|_{A_n = A_n^0}, \tag{20} \]

with

\[ S(A_n^0) = \frac{N}{2} B \sin\phi, \tag{21} \]

\[ C(A_n^0) = \frac{N}{2} B \cos\phi. \tag{22} \]

From Eqs. (16) and (17), we obtain

\[ \frac{\partial S(A_n)}{\partial A_n} = \sin(2\pi n/N), \tag{23} \]

\[ \frac{\partial C(A_n)}{\partial A_n} = \cos(2\pi n/N). \tag{24} \]

Substituting Eqs. (21)–(24) into Eq. (20) and using trigonometric identities, we obtain

\[ \left.\frac{\partial\phi}{\partial A_n}\right|_{A_n = A_n^0} = \frac{2}{NB} \sin(\phi - 2\pi n/N). \tag{25} \]

So Eq. (19) is rewritten as

\[ \Delta\phi = \frac{2}{NB} \sum_{n=1}^{N} \sin(\phi - 2\pi n/N)\, \Delta A_n. \tag{26} \]

From Eq. (26), we can see that the reconstructed phase error Δφ depends on the phase φ and on the intensity noise. However, as we show in the following, the phase error variance is independent of φ. Since the noise is assumed to have zero mean, similar to the analysis by Behrooz Kamgar-Parsi and Behzad Kamgar-Parsi,25 the variance of the phase error σ_φ² can be represented as

\[ \sigma_\phi^2 = \left(\frac{2}{NB}\right)^2 \sigma^2 \sum_{n=1}^{N} \sin^2(\phi - 2\pi n/N) = \frac{2\sigma^2}{N B^2}. \tag{27} \]

From Eq. (27), we can see that the phase error STD σ_φ increases linearly with the intensity noise σ and decreases with N, the total number of images used, and with the fringe modulation B. This relationship is demonstrated for σ_φ versus N in Fig. 5 and for σ_φ versus B^p in Fig. 6, where B^p is the fringe modulation of the projector, related to B by B = αB^p with α a reflection constant less than or equal to 1. Each STD in Figs. 5 and 6 is calculated by using 50 sample values. Thus the phase error variance is demonstrated to be independent of the actual phase value.

Fig. 5. Phase STD change with increase of N, with scanning frequency f = 1, B^p = 128, and σ = 2.8309.

Fig. 6. Phase STD change with increase of B^p, with scanning frequency f = 1, N = 4, and σ = 2.8309.
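The prediction of Eq. (27) is easy to check by simulation: corrupt the ideal pattern of Eq. (2) at a single pixel with zero-mean Gaussian noise, recover the phase through Eq. (3), and compare the sample STD with sqrt(2/N)·σ/B. A hedged sketch with arbitrary test values of our own choosing:

```python
import numpy as np

rng = np.random.default_rng(0)
N, B, sigma, phi = 4, 100.0, 2.8309, 0.7         # arbitrary single-pixel test values
A = 128.0                                        # background intensity A^0
n = np.arange(1, N + 1)
ideal = A + B * np.cos(phi - 2 * np.pi * n / N)  # noise-free I_n of Eq. (2)

trials = ideal[:, None] + rng.normal(0.0, sigma, size=(N, 100_000))
S = np.sin(2 * np.pi * n / N) @ trials           # numerator of Eq. (3)
C = np.cos(2 * np.pi * n / N) @ trials           # denominator of Eq. (3)
phase = np.arctan2(S, C)

print("measured phase STD :", phase.std())
print("Eq. (27) prediction:", np.sqrt(2.0 / N) * sigma / B)
```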

As shown in Eq. (4), the phase is linear with respect to y^p. From Eq. (11), we can find the relationship between phase and reconstructed range. Taking a partial derivative with respect to y^p in Eq. (11), we have

\[ \mathbf{C}\mathbf{P}_{y^p}^w + \mathbf{C}_1 \mathbf{P}^w = \mathbf{D}_1, \tag{28} \]

where

\[ \mathbf{P}_{y^p}^w = \left[\frac{\partial X^w}{\partial y^p} \;\; \frac{\partial Y^w}{\partial y^p} \;\; \frac{\partial Z^w}{\partial y^p}\right]^T = [X_e \;\; Y_e \;\; Z_e]^T, \tag{29} \]

\[ \mathbf{C}_1 = \begin{bmatrix} 0 & 0 & 0 \\ 0 & 0 & 0 \\ -m_{31}^{wp} & -m_{32}^{wp} & -m_{33}^{wp} \end{bmatrix}, \tag{30} \]

\[ \mathbf{D}_1 = [0 \;\; 0 \;\; m_{34}^{wp}]^T. \tag{31} \]

Combining Eq. (28) with Eq. (14), we have

\[ \mathbf{P}_{y^p}^w = \mathbf{C}^{-1}(\mathbf{D}_1 - \mathbf{C}_1 \mathbf{C}^{-1}\mathbf{D}). \tag{32} \]

So the reconstructed range error can be represented in terms of the phase error as

\[ \Delta\mathbf{P}^w = [\mathrm{d}X^w \;\; \mathrm{d}Y^w \;\; \mathrm{d}Z^w]^T = \mathbf{P}_{y^p}^w\, \Delta\phi/(2\pi f). \tag{33} \]

From Eqs. (26) and (33), we find that the reconstructed error of the 3D coordinate can be approximated as a function of the intensity error such that

\[ \Delta\mathbf{P}^w = \frac{\mathbf{P}_{y^p}^w}{\pi f N B} \sum_{n=1}^{N} \sin(\phi - 2\pi n/N)\, \Delta A_n. \tag{34} \]

The intensity error is assumed to be zero-mean Gaussian noise, so it can be shown that the STD of the reconstruction error is

\[ [\sigma_{X^w} \;\; \sigma_{Y^w} \;\; \sigma_{Z^w}] = [\,|X_e| \;\; |Y_e| \;\; |Z_e|\,] \frac{\sigma}{\sqrt{2N}\,\pi f B}. \tag{35} \]

Fig. 7. Reconstructed world-coordinate STD change with increase of frequency when N = 4 and B^p = 128.

An experimental demonstration of position error is shown in Fig. 7. All three sets of experimental results follow the theoretical relationships given in Eq. (35). Note that σ_{X^w} is lowest because it is orthogonal to the depth distortion along Y^w. The deviation along Z^w is highest because the angle between the camera and the projector gives a lower resolution in the depth dimension.

The reconstruction error depends on N, B, and f, so by increasing any of these parameters one can reduce the range error. Since N increases the scanning time, and B is limited by the system setup and difficult to improve, the most practical approach is to increase f, i.e., to increase the sine-wave frequency. This is the reason that a two-frequency sine-wave projection is used: the unit-frequency sine wave avoids ambiguity in phase unwrapping, and the high-frequency sine wave provides the higher resolution.18 However, we need to determine the optimal higher frequency. Knowing the noise level, we can find the optimal frequency numerically. This is demonstrated in Fig. 8, where the simulated phase deviation, indicated by the curves, is compared with experimental measurements, indicated by the circles and the squares.

Fig. 8. Two data sets of simulated and experimental normalized unwrapped-phase STD with increase of frequency for σ_φ = 0.025346 and σ_φ = 0.047689. Experimental sets are scanned with N = 4; B^p = 48 and B^p = 88 for circles and squares, respectively. f1 and f2 are the optimal frequencies predicted by the mathematical model.
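Equations (32) and (35) combine into a per-pixel error predictor. The sketch below reuses the C and D construction of Eqs. (12) and (13); the function name and argument list are our own, and sigma is the intensity-noise STD used in Eq. (27).

```python
import numpy as np

def range_error_std(Mwc, Mwp, xc, yc, yp, sigma, N, B, f):
    """Predicted STDs of (X^w, Y^w, Z^w) for one pixel [Eqs. (32) and (35)]."""
    C = np.vstack([Mwc[0, :3] - xc * Mwc[2, :3],
                   Mwc[1, :3] - yc * Mwc[2, :3],
                   Mwp[1, :3] - yp * Mwp[2, :3]])              # Eq. (12)
    D = np.array([xc * Mwc[2, 3] - Mwc[0, 3],
                  yc * Mwc[2, 3] - Mwc[1, 3],
                  yp * Mwp[2, 3] - Mwp[1, 3]])                 # Eq. (13)
    C1 = np.zeros((3, 3))
    C1[2] = -Mwp[2, :3]                                        # Eq. (30)
    D1 = np.array([0.0, 0.0, Mwp[2, 3]])                       # Eq. (31)
    Pyp = np.linalg.solve(C, D1 - C1 @ np.linalg.solve(C, D))  # Eq. (32)
    return np.abs(Pyp) * sigma / (np.sqrt(2 * N) * np.pi * f * B)  # Eq. (35)
```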

Fig. 9. Range data of a face under (a) unit frequency and (b) two frequencies.

The simulation is based on the simulated intensity

\[ I_n(x^c, y^c) = Q_{256}[A(x^c, y^c)] + Q_{256}[B(x^c, y^c)] \cos[2\pi f\, Q_{N^p}(y^p N^p)/N^p - 2\pi n/N], \tag{36} \]

where Q_K(·) denotes quantization to K levels, modeling the 8-bit intensity resolution and the N^p discrete projector rows. Simulated phase values are obtained through Eq. (3) from the simulated intensity values of Eq. (36). Noise with variance equal to that of the experimental measurements is added to the simulated phase values before phase unwrapping.

It can be seen in Fig. 8 that the STD of the unwrapped phase, normalized by the phase STD σ_φ, decreases for frequencies close to the unit frequency. This is because the higher frequency has a small unwrapped STD, in agreement with Eq. (35). However, as the frequency continues to increase, the wavelength becomes short compared with the unit-frequency noise floor, which causes an increase in ambiguous phase unwrapping. Eventually, the wavelength becomes relatively small at the higher frequency, so its phase-ambiguity error contributes less to the total noise; the phase noise of the unit frequency begins to dominate, and the curves level off to the unit-frequency STD. Therefore we can use the frequency at which the STD of the unwrapped phase reaches its valley as our optimal frequency f_opt.

It is also noted from Fig. 8 that, for both experimental sets, the simulated data match the experimental data well before the optimal frequency is reached. The differences between experimental and simulated data after the optimal frequency are possibly caused by the actual noise being nonstationary. However, the optimal frequencies predicted by the simulation are good approximations of the experimental values. Furthermore, the unwrapped phase can be modeled mathematically, and the optimal frequency can be obtained numerically. During the process of two-frequency phase unwrapping, phase fringes are counted by

\[ N_f = \frac{\phi_1 f - \phi_2}{2\pi}, \tag{37} \]

where φ_1 and φ_2 are the phases measured by the unit frequency and the higher frequency, respectively. Therefore the unwrapped phase error is Δφ_2/f, where Δφ_2 is the phase error from the higher frequency. However, N_f may be miscounted as the noise increases. Since the range of the phase is (−π, π], based on our unwrapping algorithm, a phase-unwrapping error occurs when |Δφ_1| > π/f, where Δφ_1 is the phase error from the unit frequency. As f increases, the unwrapped phase error approaches Δφ_1. The phase error of the unwrapped phase can then be modeled as

\[ \Delta\phi_u = \begin{cases} \Delta\phi_2/f & \text{when } |\Delta\phi_1| \le \pi/f \\ \Delta\phi_1 & \text{when } \pi/f < |\Delta\phi_1| \le \pi \end{cases}, \tag{38} \]

where Δφ_u is the unwrapped phase error. Therefore the variance of the unwrapped phase is

\[ E\{\Delta\phi_u^2\} = E\{E\{\Delta\phi_u^2 \,|\, \Delta\phi_1\}\} = \int_{-\infty}^{\infty}\!\int_{-\infty}^{\infty} \Delta\phi_u^2\, f(\Delta\phi_u, \Delta\phi_1)\, \mathrm{d}(\Delta\phi_1)\, \mathrm{d}(\Delta\phi_u). \tag{39} \]
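In code, the fringe counting of Eq. (37) and the resulting unwrap read roughly as follows; phi1 and phi2 are wrapped phase maps from Eq. (3) at the unit frequency and at frequency f, and rounding N_f to the nearest integer is our reading of the counting step, not the authors' literal code.

```python
import numpy as np

def unwrap_two_frequency(phi1, phi2, f):
    """Two-frequency unwrap of the high-frequency phase [Eq. (37)].

    phi1: wrapped phase at the unit frequency (nonambiguous but noisy)
    phi2: wrapped phase at the higher frequency f
    """
    Nf = np.round((phi1 * f - phi2) / (2 * np.pi))  # integer fringe count, Eq. (37)
    return (2 * np.pi * Nf + phi2) / f              # phase on the unit-frequency scale
```

When the count is correct, the result carries only the attenuated error Δφ_2/f; a miscount leaves an error on the order of Δφ_1, which is exactly the two cases of Eq. (38).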

Since Δφ_1 and Δφ_2 are zero-mean Gaussian distributed, then, assuming that σ_φ1 = σ_φ2 = σ_φ, the variance of the unwrapped phase is

\[ \sigma_{\phi_u}^2 = P_1\!\left(\frac{\pi}{f}\right) \frac{\sigma_\phi^2}{f^2} + \sigma_\phi^2 - P_2\!\left(\frac{\pi}{f}\right), \tag{40} \]

where

\[ P_1(x) = \frac{1}{\sqrt{2\pi}\,\sigma_\phi} \int_{-x}^{x} \exp\!\left(-\frac{y^2}{2\sigma_\phi^2}\right) \mathrm{d}y, \]

\[ P_2(x) = \frac{1}{\sqrt{2\pi}\,\sigma_\phi} \int_{-x}^{x} y^2 \exp\!\left(-\frac{y^2}{2\sigma_\phi^2}\right) \mathrm{d}y. \]
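Equation (40) is straightforward to evaluate numerically. The sketch below computes P_1 and P_2 by quadrature, literally following their definitions, and scans a frequency grid for the minimizer, which is one way to realize the numerical solution called for in the procedure below; the names and the grid are our own choices.

```python
import numpy as np
from scipy.integrate import quad

def unwrapped_phase_var(f, sigma_phi):
    """sigma_phi_u^2 of Eq. (40) for a candidate second frequency f."""
    w = 1.0 / (np.sqrt(2.0 * np.pi) * sigma_phi)
    g = lambda y: w * np.exp(-y**2 / (2.0 * sigma_phi**2))
    x = np.pi / f
    P1 = quad(g, -x, x)[0]                        # probability of a correct fringe count
    P2 = quad(lambda y: y**2 * g(y), -x, x)[0]
    return P1 * sigma_phi**2 / f**2 + sigma_phi**2 - P2

sigma_phi = 0.025346                              # example value from Fig. 8
freqs = np.arange(2, 201)
f_opt = freqs[np.argmin([unwrapped_phase_var(f, sigma_phi) for f in freqs])]
print("optimal second frequency:", f_opt)
```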

As is indicated in Fig. 8, the optimal higher frequencies f1 and f2 are the frequencies at which ∂σ_φu²/∂f = 0 and the variance of the unwrapped phase, and hence of the reconstructed world coordinates, reaches its minimum in Eq. (40).

To demonstrate how this theory can be applied to an actual human face, we adopt the following procedure for estimating σ², σ_φ², and the optimum frequency:

1. Set the projector parameters in Eq. (1) to the maximum values, A^p = B^p = 128, and adjust the aperture to prevent saturation.
2. Capture a series of images of a typical target object.
3. Use a small area of the target object (either a typical area or a difficult area) and find the temporal STD of each pixel by using the captured frames. Average the STDs within the small area to estimate σ².
4. The mean values of the small area represent the attenuated A^p + B^p + ambient light, so project and capture a black area to obtain the ambient light alone. Then subtract the ambient light from A^p + B^p + ambient light, and divide by 2 to get the received A and B values. Choose the number of shifts N based on the maximum allowable scan time.
5. Use N, B, σ², and Eq. (27) to get σ_φ².
6. Numerically determine the φ(x^c, y^c) values by substituting the I_n(x^c, y^c) of Eq. (36) into Eq. (3), at both the base frequency and a second frequency f, with B(x^c, y^c) = B and A(x^c, y^c) = A.
7. Add noise with variance σ_φ² to φ(x^c, y^c) for the base frequency and for the second frequency. Unwrap these phase values based on Eq. (37), and estimate the STD of φ_u(x^c, y^c). Numerically plot the STD for a range of frequencies.
8. Numerically solve Eq. (40) for the optimum frequency. Use the optimum frequency, as well as the curve found in step 7, to decide on an optimized frequency given the application and performance objectives.

The procedure is applied to a human face, "Timothy." With the use of a sample area on the forehead, the optimum frequency is determined to be 22; the face is scanned at a second frequency of 20 to allow for gradient modulation of the projected frequency. The two-frequency reconstruction is presented in Figs. 9 and 10. In Fig. 9, the depth data are visualized as a specular surface so that depth variation is highlighted. The depth noise is obvious for the unit frequency, as shown in Fig. 9(a), and a significant improvement is obtained by using two frequencies, as shown in Fig. 9(b). To demonstrate the practicality of the process, we map the intensity values onto the world coordinates and view them from an angle, as shown in Fig. 10. Some dark regions, such as the eye pupils and the nostrils, are interpolated.

Fig. 10. Reconstructed world coordinates of a face with intensity values.

In summary, we provide a procedure for obtaining an optimum frequency as well as a deviation-versus-frequency curve such as that shown in Fig. 8. A PMP designer can choose a suboptimal frequency that will allow for an acceptable modulation range. Furthermore, a key parameter in system design is the triangulation angle. Typically, the larger the triangulation angle, the lower the measurement STD, so the PMP designer would like to make the triangulation angle as large as possible. However, the larger the triangulation angle, the more the shadowing, the less the reflected light, and the larger the scan-head geometry. Once these trade-offs are made and the maximum angle is determined, the high-frequency projection pattern is determined from Eqs. (3) and (36) to allow for the expected frequency modulations caused by the target-object depth-gradient extrema. If the camera, the projector, and the typical target-object surface can be set up to measure the anticipated imaging noise, then the absolute phase-error values can be calculated, and the calibration coefficients can be found and used to determine the error in repeated measurements.

5. CONCLUSIONS

In this paper we presented a detailed procedure for two-frequency PMP optimization. The PMP algorithm's performance is dependent on the temporal intensity noise. To our knowledge, we are the first to introduce a rigorous mathematical analysis of PMP temporal noise and its effect on the unwrapped phase as well as on the reconstructed world coordinates. Furthermore, based on the two-frequency phase-unwrapping algorithm, we developed both a practical simulation model of the unwrapped phase noise and, for the first time, a mathematical model for determining the optimal higher frequency. Both approaches approximate the experimental data. Hence, given a basic measurement of the intensity noise variance, PMP designers can determine an optimized higher-frequency value by using these numerical and mathematical models.


ACKNOWLEDGMENTS

Partial funding for this research was provided by NASA cooperative agreement NCC5-222 through Western Kentucky University and by National Science Foundation grant EPS-9874764. Our thanks go to Timothy Hassebrook for being the subject of the face scan.

Corresponding author Laurence G. Hassebrook can be reached by e-mail at [email protected] or by mail at the address on the title page.

*Present address, Cisco Systems, Inc., 170 West Tasman Drive, San Jose, California 95134-1706.

REFERENCES

1. F. Chen, G. M. Brown, and M. Song, "Overview of three-dimensional shape measurement using optical methods," Opt. Eng. 39, 10–22 (2000).
2. J. Batlle, E. Mouaddib, and J. Salvi, "Recent progress in coded structured light as a technique to solve the correspondence problem: a survey," Pattern Recogn. 31, 963–982 (1998).
3. X. Y. Su and W. S. Zhou, "Complex object profilometry and its application for dentistry," in Clinical Applications of Modern Imaging Technology II, L. J. Cerullo, K. S. Heiferman, Hong Liu, H. Podbielska, A. O. Wist, and L. J. Eamorano, eds., Proc. SPIE 2132, 484–489 (1994).
4. G. Sansoni, F. Docchio, U. Minoni, and L. Biancardi, "Adaptive profilometry for industrial applications," in Laser Applications to Mechanical Industry, S. Martellucci and A. N. Chester, eds. (Kluwer Academic, Norwell, Mass., 1993), pp. 351–365.
5. R. Raskar, G. Welch, M. Cutts, A. Lake, L. Stesin, and H. Fuchs, "The office of the future: a unified approach to image-based modeling and spatially immersive displays," presented at SIGGRAPH 98, Orlando, Fla., July 19–24, 1998.
6. G. Schmaltz, "A method for presenting the profile curves of rough surfaces," Naturwissenschaften 18, 315–316 (1932).
7. Y. Shirai and M. Suwa, "Recognition of polyhedrons with a range finder," in Proceedings of the International Joint Conference on Artificial Intelligence (Morgan Kaufmann, San Francisco, Calif., 1971), pp. 80–87.
8. P. M. Will and K. S. Pennington, "Grid coding: a preprocessing technique for robot and machine vision," Artif. Intell. 2, 319–329 (1971).
9. B. Carrihill and R. Hummel, "Experiments with intensity ratio depth sensor," Comput. Vision Graph. Image Process. 32, 337–358 (1985).
10. D. S. Goodman and L. G. Hassebrook, "Surface contour measuring instrument," IBM Tech. Discl. Bull. 27(4B), 2671–2673 (1984).
11. J. L. Posdamer and M. D. Altschuler, "Surface measurement by space-encoded projected beam systems," Comput. Vision Graph. Image Process. 18, 1–17 (1982).
12. D. M. Meadows, W. O. Johnson, and J. B. Allen, "Generation of surface contours by moiré patterns," Appl. Opt. 9, 942 (1970).
13. G. Goli, Chun Guan, L. G. Hassebrook, and D. L. Lau, "Video rate three dimensional data acquisition using composite light structure patterns," Univ. of Kentucky ECE Tech. Rep. CSP-02-002 (May 30, 2002).
14. V. Srinivasan, H. C. Liu, and M. Halioua, "Automated phase-measuring profilometry: a phase mapping approach," Appl. Opt. 24, 185–188 (1985).
15. K. L. Boyer and A. C. Kak, "Color-encoded structured light for rapid active ranging," IEEE Trans. Pattern Anal. Mach. Intell. PAMI-9, 14–28 (1987).
16. L. G. Hassebrook, R. C. Daley, and W. Chimitt, "Application of communication theory to high speed structured light illumination," in Three-Dimensional Imaging and Laser-Based Systems for Metrology and Inspection III, K. G. Harding and D. J. Svetkoff, eds., Proc. SPIE 3204, 102–113 (1997).
17. J. M. Huntley and H. O. Saldner, "Shape measurement by temporal phase unwrapping: comparison of unwrapping algorithms," Meas. Sci. Technol. 8, 986–992 (1997).
18. H. Zhao, W. Chen, and Y. Tan, "Phase-unwrapping algorithm for the measurement of three-dimensional object shapes," Appl. Opt. 33, 4497–4500 (1994).
19. M. Trobina, "Error model of a coded-light range sensor," Tech. Rep. BIWI-TR-164, ETH-Zentrum (September 21, 1995), pp. 1–35.
20. R. C. Daley and L. G. Hassebrook, "Channel capacity model of binary encoded structured light-stripe illumination," Appl. Opt. 37, 3689–3696 (1998).
21. O. D. Faugeras and G. Toscani, "The calibration problem for stereo," in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition '86 (Institute of Electrical and Electronics Engineers, New York, 1986), pp. 15–20.
22. R. Y. Tsai, "A versatile camera calibration technique for high-accuracy 3D machine vision metrology using off-the-shelf TV cameras and lenses," IEEE Trans. Rob. Autom. RA-3, 323–344 (1987).
23. R. J. Valkenburg and A. M. McIvor, "Accurate 3D measurement using a structured light system," Image Vision Comput. 16, 99–110 (1998).
24. R. W. DePiero and M. M. Trivedi, "3-D computer vision using structured light: design, calibration and implementation issues," Adv. Comput. 43, 243–278 (1996).
25. Behrooz Kamgar-Parsi and Behzad Kamgar-Parsi, "Evaluation of quantization error in computer vision," IEEE Trans. Pattern Anal. Mach. Intell. 11, 929–939 (1989).
26. W. S. Zhou and X. Y. Su, "A direct mapping algorithm for phase-measuring profilometry," J. Mod. Opt. 41, 89–94 (1994).
27. F. L. Pedrotti and L. S. Pedrotti, Introduction to Optics, 2nd ed. (Prentice-Hall, Englewood Cliffs, N.J., 1993).
28. J. Weng, P. Cohen, and M. Herniou, "Camera calibration with distortion models and accuracy evaluation," IEEE Trans. Pattern Anal. Mach. Intell. 14, 965–980 (1992).
29. E. Trucco and A. Verri, Introductory Techniques for 3-D Computer Vision (Prentice-Hall, Englewood Cliffs, N.J., 1998), Chap. 6, pp. 123–138.