Image Renaissance Using Discrete Optimization

Cédric Allène, ENPC-CERTIS and ESIEE-A2SI, France

Nikos Paragios, ECP-MAS, France

Abstract

In this paper we propose a novel technique for image completion that addresses image renaissance through a graph-based matching process. To this end, a number of candidate seeds with content similar to that of the area to be inpainted are considered. They are selected through a particle filter method and then positioned over the missing area. Markov Random Fields are used to formalize inpainting as a labeling estimation problem, while a combinatorial approach, the α-expansion process, is used to recover the optimal partition of patches that completes the missing area. Promising results in image and texture completion demonstrate the potential of the proposed method.

Figure 1. Image Renaissance (top left: input image, top right: optimal partition, bottom left: reconstructed image, bottom right: 250% zoom)

1. Introduction

Image inpainting [4] consists of completing missing or damaged parts of an image. This need was first addressed in painting restoration and has since spread to other domains of computational vision such as photographs and films [13]. Prior art in image completion consists of variational [2, 4, 6, 9, 12, 15, 16] and statistical [7, 8, 10, 11, 14, 17, 18] methods. One can refer to [3] for a comprehensive survey of state-of-the-art inpainting methods.

Our approach is graph-based and aims to combine variational and statistical methods. It is based on the concept of progressive stitching. To this end, we assume that one can find portions of the missing information elsewhere in the image. Candidates are determined using a multiple-hypotheses propagation approach that seeks patches whose content is similar to that of the region to be inpainted. Once they have been determined, we reformulate image renaissance as a min-cut problem within a graph. To this end, we introduce an MRF-based cost function that accounts for the similarity between the existing image and the patches to be added, as well as for the distance from the boundaries of the inpainted region. Such a cost function propagates, in a non-explicit fashion, the information along the isophotes. The lowest potential of this cost function is determined through the equivalence between MRFs and the max-flow/min-cut problem in graphs, using the α-expansion algorithm.

The remainder of this paper is organized as follows. In Section 2 we introduce the theoretical concept of our approach. Our implementation is described in Section 3, with the candidate seed selection process and the graph-based optimization. Finally, discussion and experimental results are given in Section 4.

2. Image Completion

Let us consider a damaged image $I' : \Omega \to \mathbb{R}$ that is a replica of the original image $I : \Omega \to \mathbb{R}$, with the exception of a mask $S$ that refers to the area to be inpainted. The task of inpainting consists of creating a new image $F$ such that:

$$F(x) = \begin{cases} I'(x), & x \in \Omega - S \\ I(x), & x \in S \end{cases} \tag{1}$$

The central idea of our approach relies on the fact that the missing content is present in some other areas of the image:

$$F(x) = \begin{cases} I'(x), & x \in \Omega - S \\ I'(x'), & x \in S,\ x' \in \Omega - S : I'(x') = I(x) \end{cases} \tag{2}$$

Let us consider $n$ image patches $\{L_1, L_2, \ldots, L_n\}$, with $L_i \subset \Omega - S$, positioned over $S$, the region to be inpainted. The problem of inpainting then consists of selecting, for every pixel of $S$, the value among the $n$ corresponding pixels of the selected patches that best approximates the original data. One can see such a task as a labeling problem, where the label $\omega(x) \in [1, n]$ is attributed to the pixel $x$, so that $F(x) = L_{\omega(x)}(x)$.

However, in the absence of constraints within the inpainted region $S$, such a problem cannot be solved. That is why, in order to capture the dynamics of the area surrounding the missing content, we introduce the concept of good continuation: by relaxing the constraint on the borders of $S$ we obtain a new working region $S'$, with $S \subset S'$, where data and, as a consequence, constraints are present. Then, in order to solve the labeling problem, the distance between the selected patches and the existing data must be minimized. To this end we propose the following function, which introduces a continuation constraint of appearance between the existing and the new content:

$$E(\omega) = \sum_{x \in S' - S} \rho\left(L_{\omega(x)}, I', x\right) \tag{3}$$
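The labeling view described above can be illustrated with a minimal Python sketch. It assumes that each patch has already been positioned over the image and is stored as a full-size array, that labels are 0-based (the text uses 1 to n), and that images are plain nested lists; the names `compose`, `damaged`, `mask`, `patches` and `labels` are hypothetical and are not taken from the paper.

```python
from typing import List

Image = List[List[float]]


def compose(damaged: Image, mask: List[List[bool]],
            patches: List[Image], labels: List[List[int]]) -> Image:
    """Build F: keep the damaged image outside S, and inside S copy the
    pixel of the patch selected by the label of that pixel."""
    out = [row[:] for row in damaged]
    for r, mask_row in enumerate(mask):
        for c, inside in enumerate(mask_row):
            if inside:  # pixel belongs to the region S to be inpainted
                out[r][c] = patches[labels[r][c]][r][c]
    return out


# Toy example: a 2x3 image with two missing pixels and two candidate patches.
damaged = [[1.0, 2.0, 0.0], [4.0, 0.0, 6.0]]
mask = [[False, False, True], [False, True, False]]
patches = [[[9.0] * 3 for _ in range(2)], [[7.0] * 3 for _ in range(2)]]
labels = [[0, 0, 1], [0, 0, 0]]
result = compose(damaged, mask, patches, labels)
```

Only the labels of pixels inside the mask matter here; the remaining entries of `labels` are ignored by this sketch.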

with $\rho(G, H, x)$ being the correlation of intensities between the two image patches. The difference is evaluated at the level of a local region (neighbourhood) rather than at the pixel level. Within our approach we assume a local normalized SSD score in the vertical and the horizontal direction, weighted by a Gaussian distribution:

$$\rho(G, H, x) = \frac{1}{Z} \sum_{m=-W}^{W} \frac{1}{\sqrt{2\pi}\,\sigma} \exp\left(-\frac{|m|^2}{2\sigma^2}\right) \left|G(x+m) - H(x+m)\right| \tag{4}$$

with $(G, H)$ being the two images or patches concerned and $Z$ a normalization factor. The image and patch derivatives can be used to further support this constraint. To this end, within the context of texture synthesis in [14], a term that is inversely proportional to the norm of the gradient was considered, making pixels with substantial derivatives more significant in the labeling process:

$$E(\omega) = \sum_{x \in S' - S} \exp\left(\delta\left(L_{\omega(x)}, I', x\right)\right) \tag{5}$$

with

$$\delta(G, H, x) = \frac{\rho(G, H, x)}{|\nabla G(x)|^2 + |\nabla H(x)|^2} \tag{6}$$

This function respects the constraints imposed by the known data but could create discontinuities between the selected patches. To address this, local smoothness constraints on the label domain are considered (neighbouring pixels should have the same label):

$$E(\omega) = \sum_{x \in S' - S} \exp\left(\delta\left(L_{\omega(x)}, I', x\right)\right) + \gamma \sum_{x \in S'} \sum_{y \in N(x)} V(\omega(x), \omega(y)) \tag{7}$$

with $\gamma$ being a constant balancing the contribution of the two terms and $V$, known as the local potential function, generally having the following form:

$$V(\omega(x), \omega(y)) = \begin{cases} +\alpha_{\mathrm{diff}}, & \omega(x) \neq \omega(y) \\ 0, & \omega(x) = \omega(y) \end{cases} \tag{8}$$

with $\alpha_{\mathrm{diff}} > 0$. Such a function aims to create a partition of the labeling space such that the existing part of the image remains the same, while creating a continuation from the known part of the image to the inpainted region. One can seek the lowest (possibly sub-optimal) cost of this function using several techniques of varying complexity: iterated conditional modes, highest confidence first, mean-field and simulated annealing, and the max-flow/min-cut approach [5] (which is our case).

The most important limitation of this approach is that inpainting is done through a stitching process on the boundaries of the inpainted region, while only weak smoothness constraints are used to fill in the missing information. In order to overcome this, let us consider a term depending on similarities between patches at their own boundaries in the area to be inpainted:

$$E(\omega) = \underbrace{\alpha \sum_{x \in S' - S} \exp\left(\delta\left(L_{\omega(x)}, I', x\right)\right)}_{\text{image}} + \underbrace{\beta \sum_{x \in S} \sum_{x' \in N(x)} \exp\left(\frac{\sigma^2}{D^2(x, \partial S)}\, \delta\left(L_{\omega(x)}, L_{\omega(x')}, x\right)\right)}_{\text{patches}} + \underbrace{\gamma \sum_{x \in S'} \sum_{y \in N(x)} V(\omega(x), \omega(y))}_{\text{smoothness}} \tag{9}$$

with $\alpha \gg \beta > \gamma$, $D(x, \partial S)$ being the minimum Euclidean distance between the pixel $x$ and the interface that delineates the boundaries of the region to be inpainted, and $\sigma$ being a constant parameter weighting the importance of progressive reconstruction.

The idea behind such a cost function is the following: we assume the optimization process to be a temporal process. Then, given the current completed image, we can introduce good continuation constraints between the content already completed (up to this moment) and the content to be added. It is natural to align such a time-dependent process with the distance from the borders of the original inpainted region. Such a constraint can be enforced through the regularization of the cost function according to the distance from the boundaries between missing and existing content.

One can interpret these terms in the following fashion. The first term constrains the patches to match the image content on the borders of the region to be inpainted, a rather hard constraint that is reflected in the choice of the $\alpha$ coefficient. The second term determines the content within the inpainted region, while the third term imposes smoothness on the label space of the reconstructed image (which is not equivalent to smoothness in the image itself).
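As an illustration of the two local measures used in the cost function, the following Python sketch evaluates the Gaussian-weighted difference of Eq. (4) and the gradient-normalized score of Eq. (6) at a single pixel. Several choices here are assumptions that the text does not specify: the window is a cross of horizontal and vertical offsets, pixels outside the image are clamped to the border, $Z$ is taken as the sum of the Gaussian weights, gradients are central differences, and a small `eps` is added to the denominator to avoid division by zero in flat regions. The function names are hypothetical.

```python
import math
from typing import List, Tuple

Image = List[List[float]]


def px(img: Image, r: int, c: int) -> float:
    """Read a pixel, clamping coordinates at the image border (assumption)."""
    r = min(max(r, 0), len(img) - 1)
    c = min(max(c, 0), len(img[0]) - 1)
    return img[r][c]


def rho(G: Image, H: Image, x: Tuple[int, int],
        W: int = 2, sigma: float = 1.0) -> float:
    """Eq. (4): Gaussian-weighted absolute difference around pixel x."""
    r, c = x
    total = norm = 0.0
    for m in range(-W, W + 1):
        w = math.exp(-m * m / (2.0 * sigma ** 2)) / (math.sqrt(2.0 * math.pi) * sigma)
        # Vertical offset, then horizontal offset (the centre is visited twice).
        for dr, dc in ((m, 0), (0, m)):
            total += w * abs(px(G, r + dr, c + dc) - px(H, r + dr, c + dc))
            norm += w
    return total / norm  # Z is assumed to be the sum of the weights


def grad_sq(img: Image, x: Tuple[int, int]) -> float:
    """Squared gradient norm from central differences (assumption)."""
    r, c = x
    gy = (px(img, r + 1, c) - px(img, r - 1, c)) / 2.0
    gx = (px(img, r, c + 1) - px(img, r, c - 1)) / 2.0
    return gx * gx + gy * gy


def delta(G: Image, H: Image, x: Tuple[int, int], eps: float = 1e-6) -> float:
    """Eq. (6), with eps guarding against a zero denominator (our addition)."""
    return rho(G, H, x) / (grad_sq(G, x) + grad_sq(H, x) + eps)


# Toy example: a horizontal ramp and the same ramp shifted by one grey level.
G = [[float(c) for c in range(7)] for _ in range(7)]
H = [[v + 1.0 for v in row] for row in G]
x = (3, 3)
rho_val = rho(G, H, x)
delta_val = delta(G, H, x)
```

In flat regions the denominator of Eq. (6) is close to zero, so the exponential used in Eqs. (5), (7) and (9) can overflow in floating point; a practical implementation would probably need to bound the score, which the text does not discuss.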

3. Implementation

Our method can be decomposed into two steps. The first is the selection of candidate seeds, which serve as effective patches for the image reconstruction. The second is the combinatorial selection of the labeling partition over the area to be inpainted. The first step is necessary to limit the number of patches that have to be examined in the second step.

3.1. On the Selection of Candidate Seeds

We call seeds the local image content having good similarity with the visual information available at the boundary of the region to be inpainted. One can consider the problem of seed extraction as a tracking problem in the image where, given a starting position, the objective is to recover an image region that can be used to replace the missing area. The statistical interpretation of such an objective refers to the introduction of a probability density function (pdf) that uses previous states to predict possible new positions for the added seeds, while image features are used to evaluate the quality of these predictions. Particle filters [1] are sequential Monte-Carlo techniques that can be used to estimate the Bayesian posterior probability density function with a set of samples $\omega_t$, conditional on successive observations.

In practice, given the boundaries between the region to be inpainted and the existing content, a uniform sampling rule along its borders allows us to consider a number of regions of varying size centered at the boundary points. Then, we apply a random perturbation to the position of the center of each region. Given a new position and the new characteristics of the local image content, the quadratic distance between the new region and the original one is computed and used to update the weight of the particle. These weights guide the re-sampling process (more samples for particles with large weights) as well as the random perturbations (inversely proportional to these weights). After a certain number of iterations, the fraction of the particles having the best weights is retained and superimposed on the region to be inpainted.
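The perturb, weight and re-sample loop described above can be sketched as follows. This is a simplified illustration under stated assumptions and not the authors' implementation: a single fixed-size template (the known content near one boundary point) is matched against the image, the mask is ignored, weights are computed as `exp(-distance / tau)`, and the perturbation scale is `step / (1 + n_particles * weight)`. The names `select_seeds` and `region_distance` and all parameter values are hypothetical.

```python
import math
import random
from typing import List, Tuple

Image = List[List[float]]
Pos = Tuple[int, int]


def region_distance(img: Image, template: Image, p: Pos, h: int) -> float:
    """Quadratic distance between the template and the region centred at p."""
    return sum((img[p[0] + i][p[1] + j] - template[i + h][j + h]) ** 2
               for i in range(-h, h + 1) for j in range(-h, h + 1))


def select_seeds(img: Image, template: Image, n_particles: int = 60,
                 n_iter: int = 15, keep: float = 0.1, tau: float = 1.0,
                 step: float = 4.0, seed: int = 0) -> List[Tuple[Pos, float]]:
    rng = random.Random(seed)
    h = len(template) // 2
    rows, cols = len(img), len(img[0])

    def clamp(v: float, size: int) -> int:
        return min(max(int(round(v)), h), size - 1 - h)

    parts = [(rng.randint(h, rows - 1 - h), rng.randint(h, cols - 1 - h))
             for _ in range(n_particles)]
    weights = [1.0 / n_particles] * n_particles
    for _ in range(n_iter):
        # Random perturbation, smaller for particles with larger weights.
        moved = []
        for (r, c), w in zip(parts, weights):
            scale = step / (1.0 + n_particles * w)
            moved.append((clamp(r + rng.gauss(0.0, scale), rows),
                          clamp(c + rng.gauss(0.0, scale), cols)))
        # Weight update from the quadratic distance to the template.
        raw = [math.exp(-region_distance(img, template, p, h) / tau) for p in moved]
        total = sum(raw)
        weights = ([v / total for v in raw] if total > 0
                   else [1.0 / n_particles] * n_particles)
        # Re-sampling: particles with large weights are drawn more often.
        idx = rng.choices(range(n_particles), weights=weights, k=n_particles)
        parts = [moved[i] for i in idx]
        picked = [weights[i] for i in idx]
        norm = sum(picked)
        weights = [w / norm for w in picked]
    # Keep the best fraction of the distinct surviving positions.
    ranked = sorted(set(parts), key=lambda p: region_distance(img, template, p, h))
    n_keep = max(1, int(keep * len(ranked)))
    return [(p, region_distance(img, template, p, h)) for p in ranked[:n_keep]]


# Toy example: a periodic texture and a 3x3 template cut around pixel (5, 5).
img = [[float((r % 4) + 2 * (c % 4)) for c in range(16)] for r in range(16)]
template = [row[4:7] for row in img[4:7]]
seeds = select_seeds(img, template)
```

Because the sketch ignores the mask, a practical version would also need to reject candidates that overlap the region to be inpainted or that coincide with the template's own position.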

3.2. Combinatorial Optimization

Once candidate seeds have been determined, we have to determine which pixels of each patch will finally be pasted into the reconstructed area. The cost function is minimized through a combinatorial approach based on the graph-cut framework. The α-expansion algorithm [5] is an iterative process that often converges to a local minimum through successive binary graph cuts. Within each step, among the $n$ possible labels (which represent copies of the image $I'$ with an offset), the label $L_N$ corresponding to the unused seed having the best weight is selected. An optimal cut between this label and the current output $I'$ is computed. This process thus improves the image $I'$ at each iteration until it becomes stable.

A graph is constructed from equation (9): the first term is applied to the t-links, while the second is applied to the n-links (the third term is applied implicitly in the algorithm). Once the graph has been built, the graph-cut algorithm is applied. The cut gives the boundary that offers the best transition between the patch considered, the known image and the partially reconstructed area. The result thus indicates, for each pixel, whether we have to copy the corresponding pixel of the patch concerned or keep the current one. If the pixel in the zone to be inpainted was previously empty, the one from the patch is copied directly. The output image is then updated between successive cuts.
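The succession of binary cuts can be sketched in Python as follows. This is an illustrative simplification and not the authors' implementation: the energy is reduced to a per-pixel data cost plus the Potts potential of Eq. (8) on a 1D chain of pixels, rather than the full cost of Eq. (9); the binary subproblem is converted to a minimum cut with the standard construction for submodular pairwise terms; and the maximum flow is computed with a basic breadth-first augmenting-path search. Labels are visited in index order, which assumes the seeds are already sorted by weight. All names and the toy costs are hypothetical.

```python
from collections import defaultdict, deque


def min_cut_source_side(cap, s, t):
    """Augmenting-path max flow; returns the nodes on the source side of a min cut."""
    while True:
        parent = {s: None}
        queue = deque([s])
        while queue and t not in parent:
            u = queue.popleft()
            for v, c in cap[u].items():
                if c > 1e-12 and v not in parent:
                    parent[v] = u
                    queue.append(v)
        if t not in parent:
            return set(parent)  # nodes still reachable in the residual graph
        path, v = [], t
        while parent[v] is not None:
            path.append((parent[v], v))
            v = parent[v]
        push = min(cap[u][v] for u, v in path)
        for u, v in path:
            cap[u][v] -= push
            cap[v][u] += push


def expansion_move(labels, alpha, data_cost, neighbours, V):
    """Best move where each pixel keeps its label (source side) or takes alpha."""
    n = len(labels)
    keep = [data_cost(p, labels[p]) for p in range(n)]
    take = [data_cost(p, alpha) for p in range(n)]
    edges = []
    for p, q in neighbours:
        a = V(labels[p], labels[q])  # both keep
        b = V(labels[p], alpha)      # p keeps, q takes alpha
        c = V(alpha, labels[q])      # p takes alpha, q keeps
        take[p] += c - a             # assumes V(alpha, alpha) == 0
        take[q] -= c
        edges.append((p, q, b + c - a))
    cap = defaultdict(lambda: defaultdict(float))
    for p in range(n):
        low = min(keep[p], take[p])  # shift so that capacities are non-negative
        cap['s'][p] += take[p] - low
        cap[p]['t'] += keep[p] - low
    for p, q, w in edges:
        cap[p][q] += w
    source_side = min_cut_source_side(cap, 's', 't')
    return [labels[p] if p in source_side else alpha for p in range(n)]


def energy(labels, data_cost, neighbours, V):
    return (sum(data_cost(p, l) for p, l in enumerate(labels))
            + sum(V(labels[p], labels[q]) for p, q in neighbours))


def alpha_expansion(labels, n_labels, data_cost, neighbours, V, max_sweeps=10):
    best = energy(labels, data_cost, neighbours, V)
    for _ in range(max_sweeps):
        improved = False
        for alpha in range(n_labels):
            cand = expansion_move(labels, alpha, data_cost, neighbours, V)
            e = energy(cand, data_cost, neighbours, V)
            if e < best:
                labels, best, improved = cand, e, True
        if not improved:
            break
    return labels, best


# Toy example: 6 pixels on a chain, 3 candidate labels, Potts weight 2.
cost = [[0, 5, 5], [1, 4, 5], [5, 0, 5], [5, 1, 4], [5, 5, 0], [4, 5, 1]]
data_cost = lambda p, l: cost[p][l]
V = lambda a, b: 0 if a == b else 2
neighbours = [(p, p + 1) for p in range(5)]
start = [0] * 6
final_labels, final_energy = alpha_expansion(start, 3, data_cost, neighbours, V)
```

The construction requires each pairwise term to satisfy V(a, alpha) + V(alpha, b) >= V(a, b), which holds for the Potts potential; a pairwise term derived from patch similarity, such as the second term of Eq. (9), may need additional care.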

4. Discussion

In this paper we have introduced a graph-based combinatorial approach to image renaissance. The method is based on the concept of stitching, where multiple candidate patches are superimposed over the missing area, forming a multi-source labeling problem solvable by graphs. Parts of the missing region located in the neighbourhood of image information are reconstructed first, followed by patches in areas where the inference problem is completely ill-posed.

Our algorithm was implemented in C++ on a machine with a 2.8 GHz P4 processor and 1 GB of RAM. Running time is about a couple of minutes, depending on the size of the region to be inpainted and the number of seeds calculated. The final result is nearly reached after a small number of α-expansions. Promising experimental results demonstrate the potential of our approach [Fig. (1), (2)].

Computational complexity is the main limitation of the proposed framework. The cost is polynomial in the number of patches, and therefore particular attention is paid to the seed selection. Despite our efforts, we did not find a metric allowing us to compare our results with other methods in a quantitative way. We would like to point out that our method is expected to perform better on textured images than methods using elastica models [Fig. (2-1)]. Video and structural inpainting are the most prominent future research directions.

Figure 2. (1) Grass, (2) Photo, (3) Puppet; left: input image, middle: optimal partition, right: reconstruction

References

[1] S. Arulampalam, S. Maskell, N. Gordon, and T. Clapp. A Tutorial on Particle Filters for On-line Non-linear/Non-Gaussian Bayesian Tracking. IEEE Transactions on Signal Processing, 50(2):174–188, 2002.
[2] C. Ballester, M. Bertalmio, V. Caselles, G. Sapiro, and J. Verdera. Filling-in by Joint Interpolation of Vector Fields and Grey Levels. IEEE Transactions on Image Processing, 10:1200–1211, 2001.
[3] M. Bertalmio, V. Caselles, G. Haro, and G. Sapiro. PDE-based Image and Surface Inpainting. Mathematical Models in Computer Vision: The Handbook, 2005.
[4] M. Bertalmio, G. Sapiro, L.-T. Cheng, and S. Osher. Image Inpainting. In ACM SIGGRAPH, pages 417–424, 2000.
[5] Y. Boykov, O. Veksler, and R. Zabih. Fast Approximate Energy Minimization via Graph Cuts. IEEE Transactions on Pattern Analysis and Machine Intelligence, 23, 2001.
[6] T. Chan and J. Shen. Mathematical Models of Local Non-texture Inpaintings. SIAM Journal on Applied Mathematics, 62:1019–1043, 2001.
[7] A. Efros and W. Freeman. Image Quilting for Texture Synthesis and Transfer. In Proc. SIGGRAPH, ACM Press, pages 341–346, 2001.
[8] A. Efros and T. Leung. Texture Synthesis by Non-parametric Sampling. In IEEE International Conference on Computer Vision, pages 1033–1038, 1999.
[9] S. Esedoglu and J. Shen. Digital Inpainting Based on the Mumford-Shah-Euler Image Model. European J. Appl. Math., 13:353–370, 2002.
[10] S. Geman and D. Geman. Stochastic Relaxation, Gibbs Distributions, and the Bayesian Restoration of Images. IEEE Transactions on Pattern Analysis and Machine Intelligence, 6:721–741, 1984.
[11] D. Heeger and J. Bergen. Pyramid-Based Texture Analysis/Synthesis. In Proceedings of ACM SIGGRAPH 95, ACM Press, pages 229–238, 1995.
[12] G. Kanizsa. Gramática de la Visión. Paidos, 1986.
[13] A. Kokaram. On Missing Data Treatment for Degraded Video and Film Archives: a Survey and a New Bayesian Approach. IEEE Transactions on Image Processing, 13:397–415, 2004.
[14] V. Kwatra, A. Schödl, I. Essa, G. Turk, and A. Bobick. Graphcut Textures: Image and Video Synthesis Using Graph Cuts. In Proceedings of SIGGRAPH, 2003.
[15] S. Masnou and J. Morel. Level-lines Based Disocclusion. In IEEE International Conference on Image Processing, pages 259–263, Chicago, IL, 1998.
[16] M. Nitzberg, D. Mumford, and T. Shiota. Filtering, Segmentation, and Depth. Springer-Verlag, Berlin, 1993.
[17] J. Sun, L. Yuan, J. Jia, and H.-Y. Shum. Image Completion with Structure Propagation. SIGGRAPH, 2005.
[18] L. Wei and M. Levoy. Fast Texture Synthesis Using Tree-structured Vector Quantization. In Proceedings of SIGGRAPH, pages 479–488, 2000.