
Pattern Recognition Wood Crack Detection Carlos Becker, Rocio Cabrera, Guillaume Lemaitre, Ali Mirzaei, Mojdeh Rastgoo & Alexandru Rusu

I. Introduction

Wood timber defect detection is an important topic for wood production companies, since the number and severity of defects determine the quality of the wood and consequently its price. Defects such as knots, cracks and the heterogeneity of the wood fiber are very important for the wood processing company in order to distinguish between different woods and to employ the timbers for different applications. The applications are strongly linked to the mechanical strength and durability of the wood, which can be determined by the amount of defects in it. Fully automated production lines require an unmanned inspection system. These systems extensively use machine vision algorithms to inspect the production lines, which process logs and timber at speeds as high as 13 meters of timber per second. If the defects are detected properly, the waste and the misclassification rate can be reduced.

Many researchers are actively searching for an accurate and efficient way to detect wood defects. Finn et al. [21] adopt an algorithm based on color and texture for wood defect detection; the recognition mainly depends on color and texture characteristics, but the system is vulnerable to thick dots on wood surfaces and the error rates are relatively high. In addition, Meinlschmidt et al. [1] adopt infrared thermal imaging technology to study wood: in their experiments, the materials under inspection are heated with radiators in order to reveal wood defects. Many studies indicate that no feature extraction method is absolutely better than another, and an appropriate combination of them is expected to lead to better results. In this report both segmentation and feature extraction methods will be presented, including supervised and unsupervised techniques. The first section presents a closer look at the problem of crack detection, defining what a crack looks like and its characteristics.

II. Wood Cracks and Wood Crack Detection

A wood timber image is shown in Fig. 1, where cracks and year rings are highlighted. Cracks are somewhat similar to annual rings in terms of color and intensity. Normally, year rings can be extracted using an edge detection method, such as the Canny edge detector, together with morphological operators; some preprocessing steps such as contrast enhancement and filtering are also required. However, the wooden timber pattern varies a lot, so a simple approach will not cover all the cases encountered on the production line. Attenuation or suppression of the background and annual rings, as well as discrimination between cracks and other defects such as knots and superficial sawing machine marks, must be taken into account, since these sometimes share features with cracks in terms of intensity and color. Therefore, the idea of applying pattern recognition is to use features to discriminate between cracks and other defects, increasing the chance of capturing and localizing the cracks if appropriate features are selected.

Figure 1: Crack and year rings defects - Courtesy of LuxScan Technologies

Selecting an appropriate feature is essential to the detection process; indeed, feature selection is the hardest part of the procedure.

III. Available Images and Ground Truth

The data provided by Luxscan is a set of wood timber acquisitions. Each acquisition is made of four images or channels, named Sc, Ir, Rd and 3d. An example can be seen in figure 2.

Figure 2: Imaging channels for a given acquisition.


Unfortunately, only some basic guidelines on how to detect cracks were provided but no proper ground truth was given. Therefore, we decided to generate our own ground truth, based on what we think are cracks or not. However, it is very likely that this ground truth is highly unreliable, given that we do not have the necessary experience to differentiate cracks from non-cracks and other wood defects.

Figure 3: Example of the generated ground truth. Red traces are only for visual reference.

An example of the generated ground truth can be seen in figure 3, where the green brush represents the crack area. A Matlab script was written to extract the green areas from the images and place this information in a structure that can later be used to train and/or test the developed methods.

IV. Segmentation

A. Crack detection using wavelet descriptor and K-means clustering

1) K-means classifier: K-means is one of the simplest unsupervised learning algorithms that solve the well known clustering problem. The procedure classifies a given data set through a certain number of clusters (assume k clusters) fixed a priori. The main idea is to define k centroids, one for each cluster. These centroids should be placed carefully, because different locations lead to different results; the best choice is to place them as far away from each other as possible. The next step is to take each point of the data set and associate it to the nearest centroid. When every point has been assigned, the first step is completed and an early grouping is done. At this point, k new centroids are re-computed as the barycentres of the clusters resulting from the previous step. With these k new centroids, a new binding is made between the data set points and the nearest new centroid, and the loop continues. The movement of the centroids is monitored at each iteration; if it becomes very small, the calculation stops. Finally, this algorithm aims at minimizing an objective function, in this case a squared error function:

\[
J = \sum_{j=1}^{k} \sum_{i=1}^{n} \left\| x_i^{(j)} - c_j \right\|^2
\]

where $\| x_i^{(j)} - c_j \|^2$, a chosen distance measure between a data point $x_i^{(j)}$ and the cluster centre $c_j$, indicates the distance of the $n$ data points from their respective cluster centres. This algorithm is shown in Fig. 4.

Figure 4: K-means algorithm
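As a minimal illustration of this clustering step, the kmeans function of the Matlab Statistics Toolbox implements the iterative procedure described above. The feature matrix X and the image dimensions below are placeholders for whatever features are extracted in the following subsections.

```matlab
% Minimal k-means sketch (assumes the Statistics Toolbox is available).
% X is an N-by-d matrix: one row per pixel, one column per feature.
k = 4;                                    % number of clusters, fixed a priori
[labels, centroids] = kmeans(X, k, ...
    'Replicates', 5, ...                  % re-run with different initial centroids
    'MaxIter', 200);                      % iterate until assignments stabilise or MaxIter

% Reshape the labels back into an image to visualise the segmentation,
% assuming the rows of X were taken in column-major (Matlab) order.
segmented = reshape(labels, imageHeight, imageWidth);
imagesc(segmented); axis image;
```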

After applying the k-means classifier to a set of data, the result can be grouped as in Fig. 5.

Figure 5: Data classification result by k-means

As seen from Fig. 5, data can be classified into different classes based on the distance between the centroid and each individual point. In this process a feature must be selected for classification, otherwise the image will be over-segmented by a plain K-means algorithm.

2) Feature extraction and application: Two kinds of features are used here: wavelet features and color histogram features.

• Wavelet features: If we consider, for example, the image shown in Fig. 6(a) and apply the wavelet transform to it, we obtain the results presented in Fig. 6(b).




Figure 6: Original image and Daubechies (db1) result

Now imagine selecting a window of size n × n around a given pixel of the original image, to which we apply two steps of the wavelet transform. The vertical, horizontal and diagonal detail coefficients are computed by the wavelet transform; for every pixel these values are arranged in a triplet vector [v h d], where v, h and d stand for the vertical, horizontal and diagonal coefficients respectively. The norm of this triplet vector is taken as the feature for the corresponding pixel. The rest of this section briefly explains the fundamentals of the wavelet transform used to extract these three coefficients. Every image in the wavelet space can be expressed by the equation below:

\[
f(t,s) = \sum_{k,l \in \mathbb{Z}} a_{J,k,l}\, \phi_{J,k,l}(t,s)
 + \sum_{j \ge J} \sum_{k,l \in \mathbb{Z}} b^{h}_{j,k,l}\, \Psi^{h}_{j,k,l}(t,s)
 + \sum_{j \ge J} \sum_{k,l \in \mathbb{Z}} b^{v}_{j,k,l}\, \Psi^{v}_{j,k,l}(t,s)
 + \sum_{j \ge J} \sum_{k,l \in \mathbb{Z}} b^{d}_{j,k,l}\, \Psi^{d}_{j,k,l}(t,s)
\]

where:

\[
a_{J,k,l} = \sum\sum f(t,s)\, \phi_{J,k,l}(t,s), \qquad
b^{h}_{j,k,l} = \sum\sum f(t,s)\, \Psi^{h}_{j,k,l}(t,s),
\]
\[
b^{v}_{j,k,l} = \sum\sum f(t,s)\, \Psi^{v}_{j,k,l}(t,s), \qquad
b^{d}_{j,k,l} = \sum\sum f(t,s)\, \Psi^{d}_{j,k,l}(t,s)
\]

The functions Ψ and φ depend on the chosen wavelet, for example Daubechies, Haar, etc. Finally, the feature is extracted by choosing a wavelet function such as Daubechies (db1) or Haar, a window of size n × n (a neighbourhood of n pixels) and a scale j (for example j = 2):

\[
D = \sqrt{(b^{h}_{2,k,l})^2 + (b^{v}_{2,k,l})^2 + (b^{d}_{2,k,l})^2} \tag{1}
\]

This adaptive wavelet feature is applied to the image so that an appropriate scale can be chosen. From prior knowledge, at higher scales the detail information is mixed with noise, while at lower scales the information also contains the background. Since cracks and annual rings have almost the same structure, this choice of scale attenuates the confusion between these two phenomena. Moreover, the detail coefficients are computed from all the pixels in the window, so noisy pixels can propagate noise into the feature; limiting the analysis to a fixed window and scale also reduces this effect. That is why the feature extracted in this way is effective.
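A possible Matlab sketch of this per-pixel wavelet feature is given below, assuming the Wavelet Toolbox; the file name, the window size and the interpretation of Equation (1) as the norm of all level-2 detail coefficients in the window are assumptions, and the double loop is kept deliberately simple rather than optimised.

```matlab
% Sketch of the wavelet feature of Equation (1): each pixel's n-by-n window is
% decomposed to scale 2 with db1 (Haar) and the norm of the level-2 detail
% coefficients is used as the feature value.
I = im2double(imread('timber.png'));   % hypothetical input image
n = 8;                                 % window size (must allow 2 decomposition levels)
half = n/2;
D = zeros(size(I));
for r = 1+half : size(I,1)-half
    for c = 1+half : size(I,2)-half
        w = I(r-half+1:r+half, c-half+1:c+half);       % local n-by-n window
        [C, S] = wavedec2(w, 2, 'db1');                % 2-level decomposition
        [H, V, Dg] = detcoef2('all', C, S, 2);         % level-2 detail coefficients
        D(r,c) = sqrt(sum(H(:).^2) + sum(V(:).^2) + sum(Dg(:).^2));   % Equation (1)
    end
end
% D can then be fed, pixel-wise, to the k-means clustering step.
```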

• Color histogram: According to Luxscan, the Rd, Sc and Ir channels can be combined as Red-Green-Blue channels respectively. This way, it is possible to obtain an image that is similar to a color image of the timber; one example is shown in Fig. 7. The order of the color channels is important when creating the RGB image, since each acquisition channel corresponds to a different beam wavelength.

Figure 7: Mapping (Rd,Sc,Ir) to (R,G,B)

For this feature, the RGB image is converted to the CIE Lab color space; applying the k-means algorithm in this color space gives better results. A three-dimensional histogram is extracted from the CIE Lab image. The K seeds for the K-means segmentation are automatically determined using the hill-climbing algorithm [2] on the three-dimensional CIE Lab histogram of the image. The hill-climbing algorithm can be seen as a search window being run across the space of the d-dimensional histogram to find the largest bin within that window. Fig. 8 explains the algorithm for a one-dimensional case. Since the CIE Lab feature space is three-dimensional, each bin in the color histogram has $3^d - 1 = 26$ neighbours, where d is the number of dimensions of the feature space. The number of peaks obtained indicates the value of K, and the values of these bins form the initial seeds. Since the K-means algorithm clusters pixels in the CIE Lab feature space, an 8-neighbour connected-components algorithm is run to connect the pixels of each cluster spatially.
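A possible Matlab sketch of this color-histogram seeding is shown below. It assumes the Image Processing and Statistics toolboxes, that Rd, Sc and Ir are the already-loaded channel images, and it simplifies the hill-climbing of [2] to picking the regional maxima of the 3-D histogram; bin counts and the use of bin lower edges as seeds are assumptions.

```matlab
% Sketch of the color-histogram feature: CIE Lab conversion, 3-D histogram,
% peak detection as a stand-in for hill-climbing, and seeded k-means.
rgb = cat(3, Rd, Sc, Ir);                 % map (Rd,Sc,Ir) to (R,G,B), as in Fig. 7
lab = rgb2lab(im2double(rgb));
X   = reshape(lab, [], 3);                % one row per pixel: [L a b]

nBins = 8;                                % 8 bins per axis -> 8^3 histogram
edgesL = linspace(0, 100, nBins+1);
edgesA = linspace(-128, 127, nBins+1);
edgesB = linspace(-128, 127, nBins+1);
binIdx = [discretize(X(:,1), edgesL), discretize(X(:,2), edgesA), discretize(X(:,3), edgesB)];
H = accumarray(binIdx, 1, [nBins nBins nBins]);   % 3-D color histogram

% Regional maxima of the histogram play the role of the hill-climbing peaks:
peaks = imregionalmax(H) & (H > 0);
[pl, pa, pb] = ind2sub(size(H), find(peaks));
Lc = edgesL(pl); Ac = edgesA(pa); Bc = edgesB(pb);
seeds = [Lc(:) Ac(:) Bc(:)];              % one seed per peak (bin lower edges)
K = size(seeds, 1);

labels = kmeans(X, K, 'Start', seeds);    % K-means seeded by the histogram peaks
segmented = reshape(labels, size(Rd));
```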


Figure 8: Finding peaks in a histogram using a search window like (b) for a one dimensional histogram

3) Results: Fig. 9 and Fig. 10 show the results for the color histogram and wavelet feature descriptors respectively. In fact, the color histogram can be used to detect big cracks more efficiently; for example, a histogram of 8 bins was used to extract the result shown in Fig. 9(a). This method is very fast even for large images. For a window size of 3 × 3 and j = 2, the Haar wavelet function gives rise to the result shown in Fig. 10, in which the positions of the cracks are highlighted in blue.

Figure 9: Algorithm result for color histogram feature descriptor

Figure 10: Results for wavelet feature descriptor

Some areas are over-detected at the bottom-left of the image, and this feature takes time to compute for large images. The color histogram was applied to a randomly selected set of images and the accuracy of the method was evaluated. Fig. 11 shows these results, where the horizontal axis represents the image number in the set and the vertical axis the accuracy. In this experiment the ground truth was provided manually.

Figure 11: Accuracy measurement of the k-means algorithm

B. Crack detection using thresholding and morphological operations

Among the oldest and most popular methods are those based on thresholds, using histogram analysis [3], adaptive thresholding [4], or Gaussian modelling [5]. These techniques are simple but in some cases not very efficient: the results might show some false detections, which can however be filtered out. Other methods are based on morphological operations [5], [6], [7]; their results contain fewer false detections but are highly dependent on the choice of the parameters. The approach presented here tries to merge the thresholding techniques with the


morphological tools, as will be presented further on. The method also focuses on the computational speed of the detection algorithm, as it has to be designed for industrial applications. It will be shown that the algorithm runs very fast, but with some problems from the accuracy point of view. In order to accomplish the crack detection task, several steps are required: preprocessing, thresholding and post-processing.

The preprocessing stage involves filtering and other mechanisms to convert the image into the form best suited for the thresholding task; in this case it operates on the intensities of the pixels. First, a median filter is applied to reduce the noise in the images, and the contrast is improved using the imadjust Matlab function. An important detail of this approach is that the images are taken using different sensors (Ir and Sc): the images are processed individually and their results are merged at the post-processing stage, taking into account the different type of information given by each sensor. For example, in the image from the laser scattering sensor (Sc) the cracks appear very bright, while in the infrared laser lighting image (Ir) the cracks appear very dark. The latter image is therefore complemented, so that the cracks appear bright (as in the Sc image). An example of the images after the preprocessing stage is shown in Figure 12.
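A minimal sketch of this preprocessing stage is shown below, assuming the Image Processing Toolbox; the variable names Isc and Iir and the median filter window are assumptions.

```matlab
% Sketch of the preprocessing stage for the Sc and Ir channels.
Isc = im2double(Isc);
Iir = im2double(Iir);

Isc = medfilt2(Isc, [3 3]);        % noise reduction
Iir = medfilt2(Iir, [3 3]);

Isc = imadjust(Isc);               % contrast enhancement
Iir = imadjust(Iir);

Iir = imcomplement(Iir);           % cracks are dark in Ir; complement so they become bright
```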

(a) Sc Image

(b) Ir Image

(c) Complemented Ir Image

Figure 12: Images after the preprocessing stage (median filtering and contrast adjustment)

Thresholding is the most important phase of the process. After the preprocessing stage, the next phase is to effectively identify the cracks in the blocks of wood. Thresholding is one of the most popular and simple ways of identifying cracks on pieces of wood in image processing. The basic principle of thresholding is to identify a value for a given property of an image which allows the classification and segregation of pixels into categories; it is the method and the condition of the threshold that differ from procedure to procedure, and there are many different ways to identify thresholds.

One of them is to find the global image threshold developed by Otsu, using the image histogram. This is performed with the graythresh Matlab function and is followed by the conversion of the grayscale image to a black and white image, where pixels with intensities higher than the threshold are set to white and the remaining pixels to black. In practice, for a better binarization of the images, the determined thresholds were empirically increased by 70% and 60% (depending on the sensor used for the image acquisition). After the binarization, the information in the two images is fused in order to obtain an image representing the cracks in the studied piece of wood: a pixel is marked as crack in the output image only if it is marked as crack at the same position in both binary images (this operation is equivalent to the intersection of the two binary images). Figure 13 shows the two binary images and the merged version (representing the cracks in the wood).

(a) Binary Sc Image

(b) Binary Ir Image

(c) Intersection image (the white regions represent the cracks)

Figure 13: The result of the thresholding stage and the fusion of the information from the two sensors
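A possible sketch of the thresholding and fusion stage follows, assuming the preprocessed images Isc and Iir from the previous sketch; the exact scaling of the Otsu thresholds depends on the sensor and the data, so the factors below only mirror the 70%/60% increase mentioned in the text.

```matlab
% Sketch of the thresholding and fusion stage (Image Processing Toolbox).
tSc = graythresh(Isc);                   % Otsu threshold per sensor
tIr = graythresh(Iir);

% Empirical increase of the thresholds (assumed factors, per the report).
BWsc = imbinarize(Isc, min(1, 1.7 * tSc));
BWir = imbinarize(Iir, min(1, 1.6 * tIr));

% Fusion: a pixel is a crack only if both sensors agree (intersection).
BWcrack = BWsc & BWir;
```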

By analysing the first two images in Figure 13 it can be concluded that the binarization process is very noisy, as it depends on the chosen threshold. Even though both images contain a lot of false positives, these are suppressed in the output image by the information fusion stage. Two morphological operations, erosion and dilation, are then applied successively to the fused image in order to connect adjacent regions and to suppress very small detected regions (which can be considered false positives). Further on, the resulting cracks are displayed on the original image and compared with the ground truth. The cracks are displayed in Figure 14 in two ways: one drawing a bounding box around each detected region and one highlighting the determined crack pixels. In Figure 14(b) it can be seen that the crack regions are detected, but the adjacent regions are not connected. A good way to improve the algorithm is to connect adjacent regions, either by following the lines (using the Hough transform) or by discarding the small detected regions (which are false positives).

(a) Ground truth

(b) Bounding boxes around the detected regions

(c) Highlighted crack pixels

Figure 14: A visual comparison between the ground truth and the detected crack regions

Overall, taking into account the computation time (less than 1 second per image) and the simplicity of the algorithm, it can be concluded that it performs well and can be improved for industrial applications. Further on, post-processing steps which can improve the accuracy of the algorithm are discussed.

One of the reasons for the discontinuities of the cracks in the processed image (Figure 14) can be the threshold used for the binarization, which makes the identification of the continuity between crack pixels important. This is done by considering a 5 x 5 neighbourhood around each pixel and counting the crack and non-crack pixels: if the number of crack pixels in the neighbourhood is higher than a minimum threshold, the centre pixel is classified as part of a crack. The considered area around each pixel is not smaller than 5 x 5 because otherwise the probable gap between two adjacent crack regions might not be filled. By using this method, the continuity of the cracks is retained and the crack pixels are better identified for further processing (this is equivalent to a dilation morphological operation).

The next post-processing stage is the representation of the crack. Different methods were studied, such as the scan line method [8], the watershed method [8] and the Hough transform. In this part, only the scan line method is discussed, as the Hough transform is presented in other sections of this report. The scan line method, as described in [8], uses scan lines in the horizontal and vertical directions; the distance between the scan lines varies with the image dimensions. In the case of a grayscale image, the method involves dividing the image into small regions, gathering the local maxima, and identifying the potential crack points and the neighbouring maxima to estimate the probability of a crack. In the case of the binary image considered here, the scan lines are used only to identify the set of coordinates of the white pixels. In this stage, the discontinuity between cracks is not considered since it is supposed to be eliminated during the connectivity establishment procedure.
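The morphological clean-up and the 5 × 5 continuity check described above can be sketched as follows; the structuring element sizes and the minimum neighbour count are assumptions, and BWcrack is the fused binary image from the earlier sketch.

```matlab
% Sketch of the morphological post-processing (Image Processing Toolbox).
BW = imerode(BWcrack, strel('disk', 1));      % suppress very small regions
BW = imdilate(BW, strel('disk', 2));          % reconnect adjacent regions

% 5x5 continuity check: a pixel becomes a crack pixel if enough crack pixels
% fall inside the 5x5 window centred on it (minCount is an assumed parameter).
minCount = 6;
neighbourCount = conv2(double(BW), ones(5), 'same');
BWcontinuous = neighbourCount >= minCount;
```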

Figure 15: An example of image with scan line and crack pixels

In Figure 15 the blue line represents the scan line, while the white pixels are crack points. During scanning, the crack pixels along a scan line are stored in the form of two clusters, identified by the discontinuity of the pixel coordinates. The next step is to identify the midpoint of each cluster in the horizontal and vertical directions; the midpoints form the centres of the crack regions for each cluster, so tracing the rest of the crack, either horizontally or vertically, becomes much easier. An example of how the clusters are represented along a scan line is shown in Figure 16, where the numbers indicate the y coordinates of the pixels (in the case of a horizontal scan line).

Figure 16: An example of a horizontal scan line

The clusters themselves do not have to be separated, since their continuity is identified by the neighbouring values in the array. Therefore, the midpoint of each cluster becomes an appropriate benchmark to identify the region and a possible crack. The next step is to match the midpoints of the clusters of one row with the midpoints of the clusters of the next row, and so on; the process goes from top to bottom in the case of the horizontal scan lines, and from left to right for the vertical ones. The last stage of the post-processing step is to classify the crack using the Hough transform, which is considered one of the most accurate methods to identify shapes in an image. This method is not discussed here, as it is detailed in another subsection of this report.

To conclude, this section discussed thresholding and morphological operations for wood crack detection and possible improvements. It has been shown that the results depend on the thresholds used and on the amount of noise in the image. The main disadvantage of this method is that, if there are no cracks in a test image, false positives will still be produced (as the threshold is based on the histogram of the studied images). An idea to overcome this shortcoming is to use


a global threshold based on all the used images. Overall, it has been shown that, in some cases, good detection of cracks can be achieved in a very short time, thanks to the simplicity of the method.

C. Fuzzy c-means clustering

Fuzzy c-means clustering is an unsupervised clustering method first proposed by Dunn [9] and improved by Bezdek [10]. Its principle is very similar to the k-means clustering [11] presented before; the difference between the two methods lies in the fact that k-means returns only the cluster to which each feature belongs, while fuzzy c-means returns, for each feature, a degree of belonging to every cluster. As with k-means clustering [11], fuzzy c-means is composed of the following steps:

Algorithm 1 Fuzzy c-means algorithm
Choose the number of clusters
Randomly assign to each point the coefficients of belonging to the clusters
repeat
  Compute the centroid of each cluster as presented in equation (3)
  For each point, compute the coefficients giving the degree of belonging to each cluster as presented in equation (4)
until the algorithm has converged (coefficients' change < ε)

For each point, the sum of the coefficients is equal to one:

\[
\forall x : \sum_{k=1}^{\text{nb clusters}} u_k(x) = 1 \tag{2}
\]

where k is the index of the cluster and x is the feature considered. The centroid of each cluster is computed as:

\[
c_j = \frac{\sum_{x} u_{ij}^{m}\, x_i}{\sum_{x} u_{ij}^{m}} \tag{3}
\]

where c_j is the centroid of cluster j. This centroid is the mean of all points of the cluster, weighted by u_{ij}, the degree of belonging to the cluster, which is defined as:

\[
u_{ij} = \frac{1}{\sum_{k} \left( \dfrac{\| x_i - c_j \|}{\| x_i - c_k \|} \right)^{\frac{2}{m-1}}} \tag{4}
\]

where m is the fuzzification parameter, with m > 1.

1) Using fuzzy c-means to segment cracks: The method used to segment cracks was originally applied in medical imaging to detect retinal vessels [12]. The algorithm is as follows:

Algorithm 2 Crack detection using fuzzy c-means clustering
I ← read an image
Apply a median filter on I to estimate and remove the background
Create a feature vector X composed of the gray level values of the pixels of I
Apply fuzzy c-means clustering to the vector X with k = 4
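A minimal sketch of Algorithm 2 is given below, assuming the Fuzzy Logic Toolbox function fcm; the file name, the median filter window and the interpretation of the background step as a median subtraction are assumptions.

```matlab
% Sketch of Algorithm 2: background removal, gray-level feature, fuzzy c-means.
I = im2double(imread('timber_ir.png'));        % hypothetical input image

bg = medfilt2(I, [25 25]);                      % large median filter = background estimate
If = I - bg;                                    % background-subtracted image

X = If(:);                                      % one gray-level feature per pixel
k = 4;                                          % number of clusters (6 for the 3d sensor)
[centers, U] = fcm(X, k);                       % fuzzy c-means: centroids and memberships

[~, labels] = max(U, [], 1);                    % assign each pixel to its most likely cluster
segmented = reshape(labels, size(I));
imagesc(segmented); axis image;
```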

2) Results and discussion: The number of clusters is set to four for the Ir, Rd and Sc sensors. Because of the horizontal lines present in the 3d images, the number of clusters has to be larger there and was fixed at six. This parameter can be changed and is the only way for the user to influence the results. Algorithm 2 was run on the four types of images (3d, Ir, Rd, Sc); results are presented in figure 17. The problem of this method, as with all unsupervised methods, is that the meaning of each cluster is not defined by the user, so the cluster which represents cracks cannot be found automatically. Because of the preprocessing step (median filtering), it is also difficult to characterize the centroid of the "cracks" cluster. Hence, it is difficult to evaluate the accuracy of this method, which needs a human operator to select the right cluster; future work should try to make this selection automatic. Regarding the results of the individual sensors (figure 17), the Sc sensor is the most appropriate, since only cracks appear in white, which avoids false alarms (figure 17(e)). In the Rd and Ir images, cracks appear black, but other parts of the wood have the same gray level properties, which causes false alarms (figures 17(c) and 17(d)). The image from the 3d sensor is corrupted by horizontal lines (which could easily be corrected by calibration: scanning a perfect surface so that only the acquisition artifacts appear), and the results are corrupted by these lines as well. More results are presented in appendix A.

D. Crack detection using phase congruency method

Cracks are usually thin and are characterized by a change of the gray level values at their location; hence they can be detected as edges. To detect edges, a simple filter could be used as previously (Canny, Sobel, Laplacian). However, these methods find only contours (figure 18(b)), and morphological operations are then needed to fill the detected objects. Feature detection using phase congruency, on the other hand, finds a solid edge, as shown in figure 18(c). The method used to compute the phase congruency was proposed by Kovesi [13], who uses wavelets, and more precisely a logarithmic Gabor filter bank, in order to obtain information about the frequency content at a specific spatial location [13]. Another advantage of phase congruency is that it is supposed to be invariant to illumination and contrast.


(a) Groundtruth

(b) 3d Sensor

(c) Ir Sensor

(d) Rd Sensor

(e) Sc Sensor

Figure 17: Results of fuzzy c-means clustering for the four different sensors.

(a) Original Image

(b) Canny edges detector

(c) Image features by phase congruency

Figure 18: Feature detection using different methods. Figure 18(b) shows the result of the Canny detector, which detects only contours. Figure 18(c) shows the result obtained using phase congruency: a solid edge is found.

The method proposed to detect cracks is as follows:

Algorithm 3 Crack detection using phase congruency
I ← read an image
Apply a median filter on I to estimate and remove the background
Compute the phase congruency to extract the features
Binarise the image using Otsu's threshold
Remove regions that are too small

Results are presented in figure 19. Each image from the different sensors was segmented using algorithm 3. It appears that a combination of these results can be performed, and the following algorithm was proposed:

Algorithm 4 Combination algorithm
Read the segmented images from Ir (ISir), Sc (ISsc) and 3d (IS3d), and also the original Sc image (Isc)
Compute a mask image Imask ← (ISir AND ISsc) OR IS3d
Combine Imask with Isc as Icomb = Isc × Imask
Binarise the image, knowing that cracks are white on Isc
Remove regions that are too small

The result of this combination is shown in figure 19(f).
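A sketch of Algorithms 3 and 4 is given below. The phase congruency map is assumed to be computed with Kovesi's publicly available Matlab code, called here as phasecong (an assumption about the local setup), and the minimum region size of 30 pixels is an assumed parameter.

```matlab
% Sketch of Algorithms 3 and 4 (Image Processing Toolbox + Kovesi's code).
pcIr = phasecong(Iir);  pcSc = phasecong(Isc);  pc3d = phasecong(I3d);

% Algorithm 3: binarise each phase congruency map and remove small regions.
ISir = bwareaopen(imbinarize(pcIr, graythresh(pcIr)), 30);
ISsc = bwareaopen(imbinarize(pcSc, graythresh(pcSc)), 30);
IS3d = bwareaopen(imbinarize(pc3d, graythresh(pc3d)), 30);

% Algorithm 4: combine the segmentations and the original Sc image.
Imask = (ISir & ISsc) | IS3d;
Icomb = im2double(Isc) .* Imask;               % cracks are white on Isc
BW = imbinarize(Icomb, graythresh(Icomb));
BW = bwareaopen(BW, 30);                       % remove regions that are too small
```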

It appears that all detected cracks were real cracks; however, not all cracks appear with this segmentation method. By modifying some parameters all cracks could be detected, but in that case false alarms would appear too. Using the ground truth, a confusion matrix was built in order to compute the accuracy and the precision: accuracy represents the proportion of correct predictions overall, while precision is the proportion of positive cases which were correct. The accuracy of this algorithm was 0.95 while the precision was 0.64. The low precision score could be due to the problem of the created ground truth; in reality, the precision should be better. More results are presented in appendix A.

E. Hessian Matrix and Hough Transform Analysis

Frangi et al. [14] developed an approach for vessel enhancement on angiographic images. They examined the usefulness of the multi-scale second order local structure of an image in order to enhance tubular structures, e.g. vessels.

1) Hessian Matrix Analysis: Considering an image, L, the second order Taylor expansion in the neighbourhood of a point x_o is given by Equation 5.

\[
L(x_o + \delta x_o, s) \approx L(x_o, s) + \delta x_o^{T} \nabla_{o,s} + \delta x_o^{T} H_{o,s}\, \delta x_o \tag{5}
\]

where:
• ∇_{o,s} is the gradient vector of the image computed in x_o at scale s
• H_{o,s} is the Hessian matrix of the image computed in x_o at scale s.

The Hessian matrix is the square matrix of second-order partial derivatives of a function, e.g. an image. Concepts of linear scale space theory were used in order to define differentiation as a convolution with derivatives of Gaussians (G), as in Equation 6.

\[
\frac{\partial}{\partial x} L(x, s) = s^{\gamma}\, \frac{\partial}{\partial x}\, L(x) * G(x, s) \tag{6}
\]


(a) Groundtruth

(b) 3d Sensor

(c) Ir Sensor

(d) Rd Sensor (e) Sc Sensor

(f) Combine image

Figure 19: Cracks segmentation using phase congruency

The γ parameter allows the responses at different scales to be compared; it acts as a normalization parameter, and if there is no preferred scale, γ should be set to unity. As mentioned earlier, the Hessian yields information about the second order derivatives of the image. If the Hessian of an image filtered with a Gaussian kernel is analysed, it provides information about the contrast between the two regions (inside and outside) over a range (−s, s), as can be intuitively inferred from Figure 20. Recalling the Taylor expansion of the image L in Equation 5, it can be seen that the second order directional derivative is given by the third term on the right hand side of the equation.

\[
\delta x_o^{T} H_{o,s}\, \delta x_o = \left( \frac{\partial}{\partial \delta x_o} \right) \left( \frac{\partial}{\partial \delta x_o} \right) L(x_o, s) \tag{7}
\]

Figure 20: Second order derivative of a Gaussian function.

Frangi et al. [14] exploited eigenvalue analysis to extract the main second order directional components from the image Hessian. The smallest of these values can be interpreted as the direction along the tubular structure sought, because the contrast along the structure is lower than in the transverse direction. This information is computed directly in this approach, contrary to other more computationally expensive approaches, such as those requiring the computation of the filter response in several orientations [14]. In [14], eigenvalues for three orthonormal directions were computed; they are scale invariant up to a determined scale when mapped by the Hessian matrix. λ_k is defined to be the eigenvalue with the k-th smallest magnitude; its corresponding eigenvector therefore indicates the direction along the vessel, while the two remaining eigenvectors define an orthogonal plane [14]. A spherical neighbourhood, N_{x_o}, centred at x_o with unitary radius is mapped by the Hessian matrix onto an ellipsoid whose axes correspond to its eigenvectors' directions, as shown in Figure 21 [14].

Figure 21: Second order ellipse describing the principal local curvature directions.

An ideal tubular structure in a three-dimensional image is defined as having |λ1| ≈ 0, |λ1| ≪ |λ2| and |λ2| ≈ |λ3| [14]. Figure 22 shows the other possible structural interpretations for different combinations of eigenvalues. A dissimilarity measure to distinguish between image structures was derived by Frangi et al.; it is based on two geometric ratios that can be computed from the eigenvalues of the image Hessian [14]. The first ratio, R_B, helps distinguish non-blob-like structures but cannot distinguish between line-like and plate-like patterns; it is shown in Equation 8.


The second ratio, R_A, helps make this last distinction and is shown in Equation 9. Both geometric ratios have been defined for grey-level images and are grey-level invariant [14].

\[
R_B = \frac{\text{Volume}/(4\pi/3)}{(\text{LargestCrossSectionArea}/\pi)^{3/2}} = \frac{|\lambda_1|}{\sqrt{|\lambda_2 \lambda_3|}} \tag{8}
\]

\[
R_A = \frac{\text{LargestCrossSectionArea}/\pi}{(\text{LargestAxisSemiLength})^{2}} = \frac{|\lambda_2|}{|\lambda_3|} \tag{9}
\]

Figure 22: Possible patterns in 2D and 3D according to eigenvalues (N − noisy, small, L − low, H − high).

In order to compute a measure of structureness and identify non-structural regions in the image, such as the background of an angiography, with no structural information and small eigenvalues, a term S was derived; it is shown in Equation 10.

\[
S = \| H \|_F = \sqrt{\sum_{j \le D} \lambda_j^{2}} \tag{10}
\]

For regions with a high structural value S, the degree of vesselness can be computed; it is zero when λ2 > 0 or λ3 > 0, and otherwise it is computed using Equation 11 for 3D images and Equation 12 for 2D images. Here, the parameters α, β and c are thresholds controlling the sensitivity of the line filter [14].

\[
V_o(s) = \left( 1 - e^{-\frac{R_A^2}{2\alpha^2}} \right) e^{-\frac{R_B^2}{2\beta^2}} \left( 1 - e^{-\frac{S^2}{2c^2}} \right) \tag{11}
\]

\[
V_o(s) = e^{-\frac{R_B^2}{2\beta^2}} \left( 1 - e^{-\frac{S^2}{2c^2}} \right) \tag{12}
\]

Figure 23: Vessel enhancement result on angiographic image.

Finally, the vessel scale is obtained by computing the maximal vesselness response from Equation 13.

\[
V_o(\gamma) = \max_{s_{min} \le s \le s_{max}} V_o(s, \gamma) \tag{13}
\]

Figure 24: Left: Original image with cracks annotations. Right: Crack enhancement after image processing.
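A minimal single-scale 2-D sketch of this vesselness measure (Equation 12) is shown below. The Gaussian-derivative implementation, the scale and the sensitivity thresholds are assumptions, not the authors' exact settings, and the γ-normalization and the multi-scale maximum of Equation 13 are omitted for brevity.

```matlab
% Sketch of a single-scale 2-D vesselness filter in the spirit of Frangi et al. [14].
I = im2double(Icrack);          % Icrack: grayscale wood image (assumed variable)
s = 2;  beta = 0.5;  c = 0.1;   % scale and sensitivity thresholds (assumed)

G  = fspecial('gaussian', 6*s+1, s);
Is = imfilter(I, G, 'replicate');             % Gaussian smoothing at scale s
[Ix,  Iy ] = gradient(Is);
[Ixx, Ixy] = gradient(Ix);
[~,   Iyy] = gradient(Iy);

% Eigenvalues of the 2x2 Hessian [Ixx Ixy; Ixy Iyy], ordered |l1| <= |l2|.
tmp = sqrt((Ixx - Iyy).^2 + 4*Ixy.^2);
l1 = 0.5*(Ixx + Iyy - tmp);
l2 = 0.5*(Ixx + Iyy + tmp);
swap = abs(l1) > abs(l2);
[l1(swap), l2(swap)] = deal(l2(swap), l1(swap));

Rb = abs(l1) ./ max(abs(l2), eps);            % 2-D analogue of the ratio in Eq. (8)
S  = sqrt(l1.^2 + l2.^2);                     % structureness, Eq. (10)
V  = exp(-Rb.^2/(2*beta^2)) .* (1 - exp(-S.^2/(2*c^2)));   % Eq. (12)
V(l2 > 0) = 0;    % bright-structure convention of [14]; invert I first for dark cracks
```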

A result of the implementation of this algorithm for an angiography is shown in Figure 23; as can be observed, the blood vessels are greatly enhanced after image processing. For the purpose of this project, the fact that cracks commonly resemble tubular structures is exploited: the method developed by Frangi et al. is used on the wood images as a pre-processing step in order to enhance structures that are likely to be cracks. A sample of the results obtained is shown in Figure 24.

2) Hough Transform Analysis: The Hough transform was developed by Paul Hough in 1962 as a technique for identifying the locations and orientations of certain types of features in an image. The transform consists in parameterizing the description of a feature at any given location in the original image space. A set of lines in a two-dimensional space can be described by a two-parameter family [15]; if one of the values is fixed, then the line can be represented by a single number in the parameter space.

Figure 25 illustrates the parameters required to define a line; they consist of an angle θ from the horizontal axis and a distance ρ from the origin. The line can then be represented by Equation 14.

\[
x \cos\theta + y \sin\theta = \rho \tag{14}
\]

Given a two-dimensional image, L, represented as a set of points [x1, y1, ..., xn, yn], the points are transformed into the θ-ρ parameter space; a point in this parameter space then defines the line passing through collinear figure points with a common point of intersection [15]. Several properties of these graphs are worth mentioning:
• A point in the image plane corresponds to a sinusoidal curve in the parameter plane [15]
• A point in the parameter plane corresponds to a straight line in the picture plane [15]
• Points lying on the same straight line in the image plane correspond to curves through a common point in the parameter plane [15]


Figure 25: Line Parameters.

• Points lying on the same curve in the parameter plane correspond to lines through the same point in the image plane [15]

For the purpose of this project, the fact that cracks can be described as a combination of linear segments, once the image has been filtered with the approach developed by Frangi et al., is exploited. Once the image has been transformed into the parameter space, the most relevant linear features can be extracted; they are expected to have a higher probability of representing a crack in the original image, and from this approach the exact location of the crack can be computed. The Matlab functions hough, houghlines and houghpeaks were used in order to detect lines in the previously filtered image. Figure 26 shows the detection of cracks for the image of Figure 24: five out of six labelled cracks were accurately detected; the only missed crack, corresponding to the second top-most bounding box in the annotated image, was not detected because of its low contrast with respect to the image background. One of the disadvantages of this approach is the number of false positive detections; the horizontal lines corresponding to wood patterns are also detected as cracks by this method. This has been found to be a problem with several other methods, due to the high similarity between these wood features. Nevertheless, it can be said that the algorithm performed appropriately on this image. Further examples can be found in the Results section of this report.

Figure 26: Left: Original image with cracks annotations. Right: Crack detection over original image.
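A short sketch of this line detection step is given below, using the standard Hough transform functions of the Image Processing Toolbox; the binarization and the peak/line parameters are assumptions, and V is taken to be the crack-enhanced image from the previous step.

```matlab
% Sketch of the line detection with hough / houghpeaks / houghlines.
BW = imbinarize(mat2gray(V), graythresh(mat2gray(V)));

[H, theta, rho] = hough(BW);
peaks = houghpeaks(H, 20, 'Threshold', 0.3 * max(H(:)));     % up to 20 strongest lines
lines = houghlines(BW, theta, rho, peaks, 'FillGap', 10, 'MinLength', 40);

imshow(I); hold on;
for k = 1:numel(lines)
    xy = [lines(k).point1; lines(k).point2];
    plot(xy(:,1), xy(:,2), 'LineWidth', 2);                  % overlay detected segments
end
```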

F. Results and Discussion

The results of the Hessian Matrix and Hough Transform Analysis approach are shown in Appendix B. Results for the independent IR, SC, RD and 3D imaging modalities are shown in Figures 36, 37, 38 and 39, respectively. As can be seen, the infrared (IR) images tend to give a higher performance than the rest of the imaging modalities: in the leftmost image of Figure 36, 4 out of 7 labelled cracks were detected; in the central image, 5 out of 7; in the rightmost image, the majority of the detected features are false positives. From visual inspection of the individual images it can be seen that the non-detected cracks are those whose contrast with the image background is lower than that of the horizontal lines that are part of the wood texture. The images that yielded the worst results were those of the 3D imaging modality, where none of the cracks in either of the images were detected; this result is somewhat expected due to the low contrast between the crack location and the background of the image. The SC and RD imaging modalities both had an intermediate performance compared to the IR and 3D modalities. As can be seen in the rightmost image of Figure 37, the algorithm also performs worse when the texture of the wood surface contains linear structures with high contrast with respect to the background. On the other hand, images like the central and rightmost images in Figure 38 also give a low performance because the cracks are so small and of such low contrast that they are barely visible to the naked eye; in these images, the main high-contrast linear elements are the wood knots, which are erroneously signalled by the algorithm as cracks.

The main advantage of this method is its fast computation and simple approach; approaches requiring multi-orientation filter responses are generally more costly. Nonetheless, its main disadvantage is its high number of false positives, mainly due to its global nature. A local approach could help filter out the false positive detections through a better crack characterization. Another interesting approach could be to merge the information provided by each of the imaging modalities before filtering the image and detecting the linear features; perhaps this would also help counteract the false detections found when processing the imaging modalities independently.

G. Sobel Filter Analysis

Sobel operators are widely used in image processing, particularly for edge detection. The Sobel operator is a discrete differentiator which computes an approximation of the gradient of the image intensity in the x and y directions. The two matrices shown in expressions (15) and (16) are the basic Sobel masks for the x and y gradients.


Figure 27: Left: Original image with cracks annotations. Center: Result of Median Filter (Background). Right: Subtracted result.

Figure 28: Left: Original image with cracks annotations. Center: Detected edge with sobel. Right: Crack detection over Edge Image.

Convolution of these two masks with the original image leads to two gradient images, one for x and one for y. The edge image is the square root of the sum of the squared gradients, Equation (17) [16].

\[
G_x = \begin{bmatrix} -1 & 0 & +1 \\ -2 & 0 & +2 \\ -1 & 0 & +1 \end{bmatrix} \tag{15}
\]

\[
G_y = \begin{bmatrix} -1 & -2 & -1 \\ 0 & 0 & 0 \\ +1 & +2 & +1 \end{bmatrix} \tag{16}
\]

\[
G = \sqrt{G_x^{2} + G_y^{2}} \tag{17}
\]

In order to detect the cracks in the wood images, a 5 × 5 Sobel mask was used. The mask is the result of a generalized Sobel function which determines the gradient masks for vertical and horizontal lines; the combination of the vertical and horizontal gradient images leads to the final edge image. A window size of five was chosen since it suited the tested images best; this parameter could be adjusted depending on the application. In this project the Sobel filter was tried on the 3D images of the dataset, since they contain less noise than the others and require less post-processing. The sequence of the Sobel algorithm is the following:

1) Median Filter: The images were first processed with a median filter in order to estimate the background. Almost all the images in the dataset contain some horizontal lines in the background, considered to be effects of the illumination and of sensor artifacts. They can be removed by taking the median of the lines: the median image is computed and subtracted from the original image, which results in an image with fewer artifacts in the background. Figure 27 shows, from the left, the original image, the result of the median filter, and the subtracted result.

2) Sobel Filter: In this stage the Sobel filter is applied to the image.

3) Image thresholding: The edge image is further processed by thresholding; the Otsu threshold from the Matlab functions was used to find a suitable level.

4) Morphological Operations: In order to remove the remaining noise in the thresholded image, further morphological operations were applied to the binary image, using the Matlab morphological operations majority, bridge and bwareaopen. The majority operation sets each pixel value based on the majority of the pixels in its 3 by 3 neighbourhood. bwareaopen removes noise regions with fewer than 10 connected pixels; the value 10 was chosen according to the characteristics of the dataset and should be adjusted for other applications. Finally, bridge was used to fill the holes in the segmented cracks.

Figure 28 shows the results obtained with the Sobel operator: the left image shows the original image with the cracks highlighted, the middle image shows the edge image after the Sobel filter, and the right image shows the final segmentation, in which four out of six cracks are highlighted properly. The obtained results show that the Sobel operator has a good, although far from perfect, performance. In the evaluation of the results, none of the false cracks were detected by the Sobel operator; however, in some cases not all the cracks in an image were detected. This method also has the disadvantage that it can only be applied to one part of the dataset, the 3D images: applying Sobel to the other images (Ir, Sc and Rd) detects cracks but also highlights the texture of the wood at the same time. It should also be considered that the 3D images contain less noise than the other three channels, but of course also fewer details.
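The full Sobel pipeline described above can be sketched as follows (Image Processing Toolbox). The 5 × 5 mask of the report is replaced here by the standard 3 × 3 masks of Equations (15)-(16), and the median-filter window size is an assumption.

```matlab
% Sketch of the Sobel pipeline on a 3D-channel image.
I  = im2double(I3d);
bg = medfilt2(I, [15 15]);                 % background estimate (window size assumed)
If = I - bg;

Gx = conv2(If, [-1 0 1; -2 0 2; -1 0 1], 'same');
Gy = conv2(If, [-1 -2 -1; 0 0 0; 1 2 1], 'same');
G  = sqrt(Gx.^2 + Gy.^2);                  % Equation (17)

BW = imbinarize(mat2gray(G), graythresh(mat2gray(G)));   % Otsu threshold
BW = bwmorph(BW, 'majority');              % majority vote in 3x3 neighbourhoods
BW = bwareaopen(BW, 10);                   % remove components smaller than 10 pixels
BW = bwmorph(BW, 'bridge');                % bridge small gaps in the cracks
```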

H. Gabor Filter

The Gabor filter, introduced by Dennis Gabor [17], is a linear filter that can be used for edge detection and has been found to perform particularly well on texture images [17]. A two-dimensional Gabor filter can be used to filter the images in order to detect the edges.


Figure 29: From left sequentially: Gabor results on Ir, Rd, Sc and 3D images

The filter is based on a Gaussian kernel function modulated by a sinusoidal plane wave [17]. Its behaviour is very similar to that of wavelets, since different filters can be generated from the main filter by dilation and rotation [17]. The Gabor expression is shown in Equation (18): it is basically a Gaussian, with variances along the x and y axes, modulated by a sinusoid. θ represents the orientation of the Gabor filter, while f is the frequency of the sinusoidal function; x′ and y′ are the rotated coordinates obtained from Equations (19) and (20).

\[
G(x, y, \theta, f) = \exp\left( -0.5 \left[ \left( \frac{x'}{\sigma_{x'}} \right)^{2} + \left( \frac{y'}{\sigma_{y'}} \right)^{2} \right] \right) \cos(2\pi f x') \tag{18}
\]

\[
x' = x \cos\theta + y \sin\theta \tag{19}
\]

\[
y' = y \cos\theta - x \sin\theta \tag{20}
\]

In this project the Gabor filter was chosen with an orientation of 2π/6, a sinusoidal frequency of 16, a variance of 2 along x and a variance of 4 along y. This Gabor filter was applied to all four types of images in the dataset. Figure 29 shows the results of the chosen Gabor filter on the four images: from left to right, the Ir image, the Rd image, the Sc image and the 3D image. The crack appears black in all the images except the Rd image, where it appears white. Based on this fact, the four results can be combined according to Equation (21). Analysing the combined image showed that cracks appear with negative intensities while non-crack regions have very high intensity values; thresholding the combined image at level 0 therefore leads to a binary image in which the cracks are segmented. Figure 30 shows the segmented crack in the bone colour map.

\[
\text{Combined} = Ir + Sc + 3D + (1 - Rd) \tag{21}
\]
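A possible sketch of the Gabor filtering and of the combination of Equation (21) is given below. The kernel is built directly from Equations (18)-(20); the interpretation of "frequency of 16" as 1/16 cycles per pixel, the kernel support and the variable names of the four channel images are assumptions.

```matlab
% Sketch of the Gabor-based detection and the combination rule of Equation (21).
theta = 2*pi/6;  f = 1/16;  sx = 2;  sy = 4;     % parameters following the report
[x, y] = meshgrid(-15:15, -15:15);
xp = x*cos(theta) + y*sin(theta);                % Equation (19)
yp = y*cos(theta) - x*sin(theta);                % Equation (20)
g  = exp(-0.5*((xp/sx).^2 + (yp/sy).^2)) .* cos(2*pi*f*xp);   % Equation (18)

filt = @(I) imfilter(im2double(I), g, 'replicate');
Gir = filt(Iir);  Gsc = filt(Isc);  G3d = filt(I3d);  Grd = filt(Ird);

combined = Gir + Gsc + G3d + (1 - Grd);          % Equation (21)
BW = combined < 0;            % threshold at level 0, following the report's observation
```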

Considering the overall results on the dataset, this method works reasonably well; however, it is quite sensitive to the texture of the wood in some cases, and in general it produces more false negatives than the other edge-detection-based methods.

V. Classification Based on Local Features

This method makes use of local features by evaluating whether a given image patch of fixed size contains a crack or not.

Figure 30: Detected cracks from thresholded combined image

Figure 31: Positive random patches selected from ground truth (green)

The basic idea is to approach the problem with a classifier, extracting features from image patches and training a Support Vector Machine (SVM) to learn to distinguish between crack and non-crack patches from a training set. This type of approach requires a precise ground truth, which unfortunately was not available, as discussed previously. However, we decided to try this approach to study its feasibility and whether appropriate features could be found.

A. Method Overview

This technique first extracts a set of positive and negative image patches (typically 100 by 100 pixels each) from the provided images. The positive image patches are extracted according to the cracks marked in the ground truth and the negative ones are cropped from regions where no crack is present. The positions of the patches are taken randomly to avoid introducing any patterns that could affect the final accuracy of the classifier. An example is shown in Figure 31. The sampled positive and negative image patches are divided into a train set and a test set; the train set is used to train an SVM classifier with the set of features described in the next section. The division into the two sets is done randomly, taking into account the images each patch belongs to.

B. Features

Cracks usually follow a line-like shape, and in most cracks those lines are approximately straight. Therefore, we propose to employ the line operator [18] to detect the presence of linear structures in the image patches. The line operator has been used successfully in applications such as retina segmentation [19] and mammographic image analysis [18].


Figure 32: Line operator example[18]

Moreover, evaluating a feature globally on each patch is advantageous when the images to be analyzed (the Ir, Sc, 3d and Rd channels) are not perfectly aligned. This is the case of the data provided by Luxscan, where there is a slight offset between the channels; this offset is neglected by this type of analysis, whereas an N-dimensional pixel-wise analysis would require a nearly perfect alignment.

1) Line Operator: The line operator is realized by convolving the input image with a kernel designed to detect a line at a given orientation (see figure 32). By convolving the image with kernels at different orientations (angle discretization) it is possible to detect the presence of lines and their orientation. Apart from the method used to combine the result of each line orientation, there are parameters that can be tuned in the convolution kernel: the kernel size, the line width and the number of orientations used to discretize the possible line angles.

2) Generating Features: Features are generated by computing a line orientation histogram for each image channel on a given patch; the histograms are then merged to generate a rotation-invariant feature. The process can be summarized as follows:
1) For each channel, convolve the image with the line operator at the different orientations. Generate a histogram for each channel whose bins are the line operator orientations, taking as the orientation of a given region the orientation that yielded the maximum response.
2) Find the dominant orientation in a reference histogram (the Ir channel seems to be a good indicator) and use it to rotate all the histograms by k bins, where k is the position of the predominant orientation in the reference histogram. This creates a rotation-invariant descriptor, provided there is a single predominant orientation and it is detected properly in the reference histogram.

Therefore, when choosing 16 bins (orientations), the dimension of the descriptor for a given patch is 64. It is important to choose the right line operator parameters (kernel size, line width), since they determine how accurately the cracks can be detected.
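A sketch of the per-channel line-operator histogram and of the SVM training is shown below, assuming the Image Processing and Statistics toolboxes. The way the oriented kernels are built, the variable patch, and the feature matrix X and label vector y are all assumptions used for illustration rather than the report's exact implementation.

```matlab
% Sketch of the line-operator feature for one patch and SVM training.
nOrient = 16;  ksize = 7;                              % 16 orientations, 7x7 kernels
base = zeros(ksize);  base(ceil(ksize/2), :) = 1;      % 1-pixel-wide horizontal line
kernels = cell(1, nOrient);
for o = 1:nOrient
    k = imrotate(base, (o-1)*180/nOrient, 'bilinear', 'crop');
    kernels{o} = k - mean(k(:));                       % zero-mean oriented line kernel
end

% Orientation histogram of one 100x100 patch from one channel (e.g. Ir):
resp = zeros([size(patch) nOrient]);
for o = 1:nOrient
    resp(:,:,o) = imfilter(patch, kernels{o}, 'replicate');
end
[~, bestOrient] = max(resp, [], 3);                    % dominant orientation per pixel
h = histcounts(bestOrient(:), 0.5:1:nOrient+0.5);      % 16-bin orientation histogram

% Repeating this for the 4 channels, rotating the histograms so that the Ir peak
% is in bin 1, and concatenating them gives a 64-dimensional descriptor per patch.
% With one such row per patch in X and labels y (crack / no crack):
svm = fitcsvm(X, y, 'KernelFunction', 'rbf', 'Standardize', true);
```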

C. Results

Different line operator kernel parameters were tested, evaluating their effectiveness on the test set mentioned earlier. The best results were obtained for a kernel size of 7x7 and a line width of one pixel; with these parameters an area under the curve (AUC) of 0.98 and an average accuracy of 0.93 were obtained, and at the operating point where the accuracy is 0.93, the recall was 0.82. This is an impressive result considering that the ground truth is likely to be partially wrong, but it is very sensitive to the line operator parameters. Unfortunately, these figures only evaluate patch classification and do not properly assess the performance for crack detection. Even though no statistics were computed for that specific purpose, some interesting results are presented in appendix C; it can be seen that the proposed technique is able to find the cracks, even though in some cases there are many false positives. The disadvantage of this method is that there are many parameters to tune, and the processing can be too computationally intensive for real-time operation. Furthermore, the line operator only looks for straight lines, so some important cracks could be missed. Regarding further work, it would be interesting to analyze the results obtained with a multi-scale line operator, to capture lines of different widths, in contrast with our method which uses a single constant line width. Moreover, the results should be analyzed formally with a measure that truly represents the performance of the classifier with respect to every crack in the images.

VI. Conclusion and discussion

In this paper, several approaches were presented to solve the crack detection problem, and each method was discussed separately, considering its advantages and disadvantages. The performance of the suggested methods can be summarized as follows. Unsupervised clustering methods (section IV-C) lack robustness; these types of methods could be improved by introducing a pre-processing stage that removes the image artefacts and features that make the clustering fail. Regarding the segmentation methods (sections IV-D, IV-E, IV-G, IV-H), all of them are based on edge detection or edge-like features, which are the easiest and most intuitive way to describe, and therefore to try to find, cracks. The main strength of these methods is the short computation time due to the simplicity of the approach; however, they usually lead to over-segmentation or under-segmentation. Hence, combining the results of several methods is not an easy task, and special emphasis must be placed on the algorithm used to combine them so that the strength of each method can be properly exploited. The last section, about supervised classification, presented a framework combining line detection and histogram concatenation.


Figure 33: Feature generation with the line operator

This method was the one providing the best results with respect to accuracy and recall statistics. However, this technique is computationally expensive and may not be directly applicable to real-time operation. Future work includes combining segmentation and supervised classification techniques. One option would be to first produce an over-segmented image to constrain the area where classification should be applied; then, some of the operators presented in sections IV-E, IV-G, IV-H and V could be used to generate combined features, leading to a more robust classification technique.

References

[1] P. Meinlschmidt, "Thermographic detection of defects in wood and wood-based materials," Wilhelm-Klauditz-Institut (WKI), Fraunhofer Institute for Wood Research, Braunschweig, in 14th International Symposium of Nondestructive Testing of Wood, Hannover, Germany, May 2nd-4th, 2005.
[2] T. Ohashi, Z. Aghbari, and A. Makinouchi, "Hill-climbing algorithm for efficient color-based image segmentation," in IASTED International Conference on Signal Processing, Pattern Recognition, and Applications, pp. 17-22, 2003.
[3] M. M. A, T. C. L. B, and O. C. P. B, "Automated pavement imaging program (APIP) for pavement cracks classification and quantification - a photogrammetric approach."
[4] E. H. H. A. and E. M, "Surface defects detection for ceramic tiles using image processing and morphological techniques," Proceedings of World Academy of Science, Engineering and Technology (PWASET), 2005.
[5] H. Koutsopoulos, "Primitive-based classification of pavement cracking images," 1993.
[6] M. Niskanen, O. Silven, and H. Kauppinen, "A robust approach for automatic detection and segmentation of cracks in underground pipeline images," 2005.
[7] N. Tanaka and K. Uematsu, "A crack detection method in road surface images using morphology," in MVA, pp. 154-157, 1998.
[8] T. E., "Semi-automated detection of defects in road surfaces."
[9] J. Dunn, "Well separated clusters and optimal fuzzy partitions," Journal of Cybernetics, 1974.
[10] J. C. Bezdek, Pattern Recognition with Fuzzy Objective Function Algorithms. Norwell, MA, USA: Kluwer Academic Publishers, 1981.

[11] J. B. MacQueen, "Some methods for classification and analysis of multivariate observations," in Proc. of the Fifth Berkeley Symposium on Mathematical Statistics and Probability (L. M. L. Cam and J. Neyman, eds.), vol. 1, pp. 281-297, University of California Press, 1967.
[12] Y. A. Tolias and S. M. Panas, "A fuzzy vessel tracking algorithm for retinal images based on fuzzy clustering," IEEE Transactions on Medical Imaging, vol. 17, pp. 263-273, 1998.
[13] P. Kovesi, "Image features from phase congruency," 1999.
[14] A. F. Frangi, W. J. Niessen, K. L. Vincken, and M. A. Viergever, "Multiscale vessel enhancement filtering," 1998.
[15] R. O. Duda and P. E. Hart, "Use of the Hough transformation to detect lines and curves in pictures," 1972.
[16] "Wikipedia: Sobel operator."
[17] "Wikipedia: Gabor filter."
[18] R. Zwiggelaar, T. Parr, and C. Taylor, "Finding orientated line patterns in digital mammographic images," in Proceedings of the 7th British Machine Vision Conference, pp. 715-724, 1996.
[19] E. Ricci and R. Perfetti, "Retinal blood vessel segmentation using line operators and support vector classification," IEEE Transactions on Medical Imaging, vol. 26, no. 10, pp. 1357-1365, 2007.
[20] B. Augereau, B. Tremblais, M. Khoudeir, and V. Legeay, "A differential approach for fissures detection on road surface images," 5th International Conference on Quality Control by Artificial Vision, Le Creusot, France, May 2001.
[21] M. Niskanen, O. Silven, and H. Kauppinen, "Color and texture based wood inspection with non-supervised clustering," 2001.
[22] "Gabor matlab script in matlabcentral, file no 5237."


Appendix A Results: phase congruency and fuzzy c-means

(a) Ground truth for crack face image

(d) Ground truth for crack 101008 image

(g) Ground truth for crack side image

(b) Segmentation for crack face image using phase congruency

(e) Segmentation for crack 101008 image using phase congruency

(h) Segmentation for crack side image using phase congruency

Figure 34: Results of segmentation method using phase congruency and fuzzy c-means

(c) Segmentation for crack face image using fuzzy c-means

(f) Segmentation for crack 101008 image using fuzzy c-means

(i) Segmentation for crack side image using fuzzy c-means


(a) Ground truth for crack pine image

(d) Ground truth for false crack image

(g) Ground truth for resin crack image

(b) Segmentation for crack pine image using phase congruency

(e) Segmentation for false crack image using phase congruency

(c) Segmentation for crack pine image using fuzzy c-means

(f) Segmentation for false crack image using fuzzy c-means

(h) Segmentation for resin crack image using phase congruency

Figure 35: Results of segmentation method using phase congruency and fuzzy c-means

(i) Segmentation for resin crack image using fuzzy c-means


Appendix B Results. Hessian Matrix and Hough Transform Analysis. A. IR Images

Figure 36: Sample IR Image Results

B. SC Images

Figure 37: Sample SC Image Results

C. RD Images

Figure 38: Sample RD Image Results

D. 3D Images


Figure 39: Sample 3D Image Results


Appendix C Results: Classification with the Line Operator

Figure 40: Results of crack detection with line operator on positive sample

Figure 41: Results of crack detection with line operator on positive sample


Figure 42: Results of crack detection with line operator on positive sample

Figure 43: Results of crack detection with line operator on negative sample. No crack detected.

Figure 44: Results of crack detection with line operator on negative sample. No crack detected.