Layered Depth Images for Multi-View Coding

Vincent Jantet
ENS-Cachan, Antenne de Bretagne, Campus de Ker Lann, 35170 Bruz – France
INRIA Rennes, Bretagne Atlantique, Campus de Beaulieu, 35042 Rennes – France

Ph.D. Thesis defense, Rennes, 2012
Collaboration with IRISA, INSA and Brittany Region in Futurim@ge project

Vincent Jantet (ENS-Cachan – FR)

Layered Depth Images for Multi-View Coding

Ph.D. defense, 2012

1 / 43

Applicative context

Functionalities
3DTV: Depth perception through simulated stereo vision
FVV: Live viewpoint selection

[Figures: 3DTV; FVV]

3D video processing scheme

Server side: Real World → Acquisition (Multi-cam; Z-cam; ...) → Representation (Z-Map; Mesh; ...) → Transmission
Client side: Transmission → Rendering (Projection; ...) → Displaying (TV Screen; Hologram; ...) → Virtual World

Each choice has an impact on the following steps.

Thesis objectives

Intermediate representation: Compact; Bit-rate scalable
Virtual view rendering: Fast; Accurate

[Figure: processing chain. Server side: Acquisition, Representation, Transmission. Client side: Rendering, Displaying]

SoA: Rendering-optimized representations

Multi-View Videos [DTM96]
Plenoptic Function (Light Ray) [AB91]
Microfacet Billboarding [YSK+02]
...

Advantages: Photo-realistic rendering
Limitations: Huge amount of data

[Figures: Multi-View Video; Plenoptic Function; Microfacet Billboarding]

SoA: Transmission-optimized representations

2D plus depth video (2D+Z) [ISO07]
Layered Depth Image (LDI) [SGHS98]
Billboard Cloud [DDSD03]
Polygon Mesh
...

Advantages: Compact representation
Limitations: Hard to construct from a real scene

[Figures: 2D+Z; LDI; Billboard Cloud; Polygon Mesh]

Contributions

LDI representation: Compact representation (naturally removes correlations)
JPF rendering method: Point-based projection method which handles artifacts

[Figure: processing chain. Server side: Acquisition, Representation (LDI, built with JPF), Transmission (MVC, MV+Depth). Client side: Rendering (JPF), Displaying]

Table of contents

1. View synthesis (JPF)
2. Layered Depth Image (LDI)
3. LDI-based multi-view compression
4. Conclusions

1. View synthesis (JPF)
   Projection algorithm
   Joint Projection Filling (JPF)
   Rendering results

View synthesis: Classical Warping algorithm

View synthesis methods use a projection algorithm (warping).

Warping algorithm: geometrical projection
Input: Texture + Depth map (from the reference viewpoint) + Camera parameters
Output: Texture + Depth map (seen from the new viewpoint)

[Figure: Reference view → Warping → Virtual view]
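The warping step above can be sketched for a single image row. This is a minimal illustration, not the implementation from the thesis: it assumes rectified cameras, so that the projection reduces to a horizontal shift by a disparity `focal * baseline / depth`, and it resolves conflicts with a classical z-buffer. The function name, the parameters `focal` and `baseline`, the shift direction, and the nearest-pixel rounding are all assumptions made for the example.

```python
import numpy as np

def forward_warp_row(texture, depth, focal, baseline):
    """Forward-warp one row to a virtual view (assumed rectified setup)."""
    width = len(depth)
    warped_tex = np.zeros(width, dtype=float)
    warped_z = np.full(width, np.inf)  # z-buffer; inf marks a hole
    for x in range(width):
        # Assumed disparity model: closer pixels (small depth) shift more.
        disparity = focal * baseline / depth[x]
        xv = int(round(x + disparity))
        # Keep the closest sample when several pixels land on the same column.
        if 0 <= xv < width and depth[x] < warped_z[xv]:
            warped_z[xv] = depth[x]
            warped_tex[xv] = texture[x]
    return warped_tex, warped_z

tex, z = forward_warp_row([10, 20, 30, 40, 50, 60],
                          [2.0, 2.0, 2.0, 1.0, 1.0, 2.0],
                          focal=2.0, baseline=1.0)
# Column 4 stays a hole (z is inf): the foreground moved further than the
# background and uncovered an area that the reference view never saw.
```

Columns left at `inf` are exactly the disocclusions and cracks that the later steps of a synthesis pipeline have to fill.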

View Synthesis: Warping common artifacts

Disocclusions: Occluded areas which become visible
Cracks: Small holes due to sampling
Ghosting: Boundary pixels with mixed foreground/background color

[Figures: Disocclusions; Cracks; Ghosting]

View Synthesis: Classical scheme

1. Forward Warping: Loses pixel connectivity
2. Filtering: Fills cracks and avoids ghosting
3. Backward Warping: Retrieves color from the reference view
4. Depth Inpainting: Fills disocclusions with mixed FG/BG depth
5. Depth-aided Inpainting: Fills disocclusions with mixed FG/BG texture

[Figure: classical view synthesis pipeline, from the Reference view to the Virtual view, with the five numbered steps]

View Synthesis: Classical scheme

Limitations
Forward Warping: Loses connectivity
Depth Inpainting: Cannot retrieve structure
Texture Inpainting: May fill BG with FG texture
Errors are amplified along the process

Need for an accurate method to synthesize the virtual depth map
Introducing a new Joint Projection Filling method

[Figures: Forward Proj.; Dir. inpaint.]

JPF: Joint Projection Filling [Jantet et al., 3D Research]

McMillan [McM95]: Pixel scanning order to avoid the use of a z-buffer
Contribution: Also provides pixel connectivity information

Projection without z-buffer: background pixels are projected before foreground pixels

[Figure: projection from the reference view to the virtual view, with the process direction]

JPF: For rectified views

Consider two consecutive pixels p and q of the reference view, projected onto p' and q' in the new view. Along the process direction:

$$
\begin{cases}
q'_x = p'_x + 1 & \text{No artifact} \\
q'_x < p'_x + 1 & \text{Overlap} \\
q'_x > p'_x + 1 & \text{Disocclusion}
\end{cases}
$$

[Figure: Ref. View (p, q) and New View (p', q') along the process direction, showing the overlap case]
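The case analysis for rectified views can be turned into a small row-wise sketch. It is only an illustration under stated assumptions: disparities are integers, a pixel at column `x` lands at `x - disparity[x]`, and a left-to-right scan is taken as the occlusion-compatible order for that shift, so that later pixels may overwrite earlier ones without a z-buffer. The function name and the choice to fill a disocclusion with the depth of `q` (which is the background pixel under this ordering) are assumptions consistent with "fills disocclusions with background", not a transcription of the thesis code.

```python
import numpy as np

def jpf_depth_row(depth, disparity):
    """Project one depth row and fill disocclusions during the projection."""
    width = len(depth)
    virtual = np.full(width, np.nan)
    frontier = -1  # rightmost column written so far (last projected pixel)
    for x in range(width):  # assumed occlusion-compatible scanning order
        xv = x - int(disparity[x])
        if xv > frontier + 1:
            # Disocclusion: q landed beyond the frontier. Under this scan
            # order q is background, so its depth fills the gap.
            lo = max(frontier + 1, 0)
            hi = min(xv, width)
            virtual[lo:hi] = depth[x]
        if 0 <= xv < width:
            # No artifact or overlap: later pixels overwrite earlier ones.
            virtual[xv] = depth[x]
        frontier = max(frontier, xv)
    return virtual

row = jpf_depth_row([9, 9, 9, 1, 1, 9, 9, 9], [0, 0, 0, 2, 2, 0, 0, 0])
# row -> [9, 1, 1, 9, 9, 9, 9, 9]: the foreground (depth 1) moved two
# columns to the left and the uncovered columns received background depth.
```

Tracking the frontier rather than only the previous pixel is what lets the same loop cope with gaps that open after an overlap.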

JPF Generalized: For non-rectified views

$P^{q'}$: the last pixel projected on row $q'_y$

$$
\begin{cases}
q'_x \le P^{q'}_x + 1 & \text{No artifact} \\
q'_x > P^{q'}_x + 1 & \text{Disocclusion}
\end{cases}
$$

[Figure: process direction, with p', q', $P^{q'}$ and the disocclusion area]

JPF: Results

[Figures: Forward Proj.; Navier-Stokes; Directional inpaint.; JPF Proj. (with the process direction)]

The JPF method synthesizes sharp boundaries and thin fingers well.

JPF: Conclusion

Advantages
One-step projection, without post-processing
Handles cracks and disocclusions during the projection
Fills disocclusions with background
Preserves geometrical structures

Limitations
Hard to implement on GPU
Introduces stretching artifacts if used for texture projection
Should be used as part of a full view synthesis method

View Synthesis: Proposed scheme

1. JPF
2. Backward Warping
3. Depth-Aided Inpainting

The JPF method replaces: Forward Projection; Depth Filtering; Depth Inpainting
The synthesized depth is used for: Backward Warping; Depth-Aided Inpainting

[Figure: proposed view synthesis pipeline, from the Reference view to the Virtual view, with the three numbered steps]

View Synthesis rendering results

[Figures: Disocclusions; SoA inpaint. (1); Daribo's DAI (2); JPF proj.; Full-Z DAI (3)]

Inconsistent virtual depth map ⇒ Texture artifacts
JPF synthesizes a correct depth map, which helps DAI

(1) Navier-Stokes inpainting [BBS01]
(2) Daribo's Depth Aided Inpainting [DP10]
(3) Full-Z Depth Aided Inpainting [Jantet et al., 2011a] (inspired by Daribo)

Conclusions on Virtual view synthesis

From a single input video + depth:
JPF: Synthesizes the virtual depth map
DAI: Recreates missing texture
Realistic synthesized video, but:
Introduces temporal flickering
Incoherence between two synthesized views

From multi-view + depth:
Could retrieve real disocclusion textures
⇒ Introducing LDI

2. Layered Depth Image (LDI)
   Definition
   Classical LDI construction
   Incremental-LDI construction (I-LDI)
   Object-based classification (O-LDI)

LDI: Layered Depth Image

Set of pixels, from a reference viewpoint, organized in layers

[Figures: 1st layer (visible pixels); 2nd layer (occluded pixels); 3rd layer (...); ...]

Advantages
Disocclusion: Can be filled with real texture
Camera freedom: The virtual camera can move inside a large area
Compactness: Eliminates some correlated pixels and reduces data size

LDI: Classical construction scheme [SGHS98]

All input views are warped onto a reference viewpoint, and then merged together.
Merging policy: Eliminates duplicated pixels

[Figure: Naive LDI construction scheme. View i, ..., View j → Proj. (onto the reference viewpoint) → Merging → LDI]
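A layered depth image and its merging policy can be sketched as a small data structure. This is an illustrative assumption, not the structure used in the thesis or in [SGHS98]: each reference-view pixel holds a list of `(z, color)` samples sorted front to back, and a warped sample is treated as a duplicate when its depth is within a hypothetical `z_threshold` of a sample already stored at that pixel.

```python
from dataclasses import dataclass, field

@dataclass
class LayeredDepthImage:
    width: int
    height: int
    z_threshold: float = 0.1  # assumed duplicate-detection tolerance
    # (x, y) -> list of (z, color) samples, sorted front to back
    pixels: dict = field(default_factory=dict)

    def insert(self, x, y, z, color):
        """Merge one warped sample; return False if it is a duplicate."""
        samples = self.pixels.setdefault((x, y), [])
        if any(abs(z - zs) < self.z_threshold for zs, _ in samples):
            return False  # same surface already stored: eliminate it
        samples.append((z, color))
        samples.sort(key=lambda s: s[0])
        return True

    def layer(self, k):
        """Return the k-th layer as a sparse {(x, y): (z, color)} map."""
        return {xy: s[k] for xy, s in self.pixels.items() if len(s) > k}

ldi = LayeredDepthImage(width=4, height=1)
ldi.insert(0, 0, 5.0, "bg")   # visible from one view
ldi.insert(0, 0, 1.0, "fg")   # closer sample: becomes the 1st layer
ldi.insert(0, 0, 5.05, "bg")  # duplicate of the background sample, dropped
```

With this layout, `ldi.layer(0)` holds the visible pixels and `ldi.layer(1)` the occluded ones, which mirrors the layer description of the definition slide.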

Classical LDI limitations

Redundancies
Many pixels in many layers, partially empty
Scattered pixel distribution
Introducing Incremental LDI construction

Compression artifacts
Large depth discontinuities
Motion in multiple layers
Boundaries in multiple layers
Hard to compress
Introducing Object-based LDI representation

[Figures: Scattered distribution; Compressed depth map]

I-LDI: Incremental-LDI construction [Jantet et al., 3DTV]

Iterate for each input view:
Use the current I-LDI to synthesize one acquired viewpoint
Compare with the captured view to compute the disocclusion texture
Insert the textures back into the I-LDI

[Figure: I-LDI → View synthesis (viewpoint i) → Disocclusions extraction (against view i) → Insertion → I-LDI]
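The incremental loop can be sketched on a single row. This is a toy model under explicit assumptions, not the thesis implementation: views are rectified, a sample at reference column `x` with depth `z` appears in a view with camera offset `shift` at column `x - disparity(z, shift)`, disparities are rounded to integers, and "comparing with the captured view" is reduced to finding the columns that the current I-LDI does not cover. The names `disparity` and `build_ildi` are hypothetical.

```python
def disparity(z, shift):
    """Assumed toy model: integer disparity proportional to shift / depth."""
    return round(shift / z)

def build_ildi(views, width):
    """views: list of (shift, row), with row[x] = (z, color).

    The first view is taken as the reference viewpoint (shift 0).
    Returns a list of (x_ref, z, color) samples.
    """
    _, ref_row = views[0]
    ildi = [(x, z, c) for x, (z, c) in enumerate(ref_row)]
    for shift, row in views[1:]:
        # 1. Synthesize viewpoint i: which columns does the I-LDI cover?
        covered = {x - disparity(z, shift) for x, z, _ in ildi}
        # 2. Uncovered columns are the disocclusions of this viewpoint.
        for xv in range(width):
            if xv not in covered:
                z, c = row[xv]
                # 3. Warp the captured pixel back and insert it.
                ildi.append((xv + disparity(z, shift), z, c))
    return ildi

ref_row = [(8.0, "a"), (8.0, "b"), (1.0, "F"), (1.0, "G"), (8.0, "e"), (8.0, "f")]
side_row = [(1.0, "F"), (1.0, "G"), (8.0, "c"), (8.0, "d"), (8.0, "e"), (8.0, "f")]
ildi = build_ildi([(0, ref_row), (2, side_row)], width=6)
# Only the two background pixels hidden behind the foreground ("c", "d")
# are added; every pixel already explained by the I-LDI is skipped.
```

Because only disoccluded pixels are inserted, the extra layers stay small and spatially compact, which is the behaviour the comparison with the classical LDI highlights.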

I-LDI vs LDI Comparison

LDI frames: many pixels in many layers, with scattered distribution
[Figures: 1st layer; 2nd layer; 3rd layer; 4th layer; ...]

I-LDI frames: fewer pixels and fewer layers, with compact distribution
[Figures: 1st layer; 2nd layer; 3rd layer; 4th layer; ...]

Classical LDI limitations

Compression artifacts
Large depth discontinuities
Motion in multiple layers
Boundaries in multiple layers
Hard to compress

Introducing Object-based LDI

[Figures: 1st layer; 2nd layer; Compressed depth map; Synthesized virtual view (depth compression artifacts)]

O-LDI: Object-based LDI [Jantet et al., ICIP]

Organizes pixels into layers to enhance depth continuity

[Figures: Classical LDI depth layers (Visible layer; Occluded layer); Object-based LDI depth layers (Foreground; Background)]

Method based on a region growing algorithm
Region R is initialized with the pixels where Z_FG and Z_BG are already defined
For each pixel q outside R:
  Extrapolate Z_FG and Z_BG
  Classify q
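The region growing steps listed above can be sketched on one row. The sketch makes several assumptions that the slide does not specify: the seed region contains the pixels that have both a first and a second layer, Z_FG and Z_BG are extrapolated by copying them from the already classified neighbour, and q is assigned to whichever of the two extrapolated depths is closer to its own depth. The function name and the "FG"/"BG" labels are hypothetical.

```python
from collections import deque

def classify_row(z_first, z_second):
    """z_first[x]: first-layer depth; z_second[x]: second-layer depth or None."""
    n = len(z_first)
    z_fg, z_bg, label = [None] * n, [None] * n, [None] * n
    queue = deque()
    # Region R: pixels where both Z_FG and Z_BG are already defined.
    for x in range(n):
        if z_second[x] is not None:
            z_fg[x], z_bg[x], label[x] = z_first[x], z_second[x], "FG"
            queue.append(x)
    while queue:
        p = queue.popleft()
        for q in (p - 1, p + 1):
            if 0 <= q < n and label[q] is None:
                # Extrapolate Z_FG and Z_BG from the neighbour (assumption).
                fg, bg = z_fg[p], z_bg[p]
                # Classify q by its nearest extrapolated depth.
                if abs(z_first[q] - fg) <= abs(z_first[q] - bg):
                    label[q], z_fg[q], z_bg[q] = "FG", z_first[q], bg
                else:
                    label[q], z_fg[q], z_bg[q] = "BG", fg, z_first[q]
                queue.append(q)
    return label

labels = classify_row([8, 8, 1, 1, 1, 8, 8],
                      [None, None, None, 8, None, None, None])
# labels -> ['BG', 'BG', 'FG', 'FG', 'FG', 'BG', 'BG']
```

A breadth-first queue is used so that each pixel is classified from its nearest classified neighbour; a real implementation would work on the 2D pixel grid and could use a smoother extrapolation.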

O-LDI: Classification

[Figures: Initializing; Processing; Results. Legend: Foreground; Unclassified; Background]

O-LDI: Background inpainting

[Figure: Background inpainting]

Principle
Exemplar-based inpainting from Criminisi [CPT03]
Robust but time-consuming method
Preserves texture and structure

O-LDI: Fast mesh-based rendering

Continuous layers can be rendered as meshes
The foreground mesh is partially transparent

[Figures: Object-based LDI; Mesh rendering]

O-LDI: Rendering results

[Figures: Disocclusions; Fast SoA inpainting; O-LDI rendering]

Online inpainting limitations
Fast inpainting introduces:
  Artifacts
  Stretching
  Temporal flickering

O-LDI advantages
Robust offline inpainting
Time-coherent rendering
Multi-view coherent rendering

O-LDI Conclusions

O-LDI Advantages
Static background over time
Compatible with fast mesh-based rendering
Depth continuity improves rendering quality
Removes unnecessary boundaries ⇒ Should improve compression

O-LDI Limitations
No backward compatibility with 2D decoding schemes

3. LDI-based multi-view compression

MVD and LDI compression schemes

MVD compression (MVC): Input views V1, V3, V5, V7 → Compression (MVC) → Compressed views V1', V3', V5', V7' → Rendering (VSRS) → Final view V6''
LDI compression (MVC): Input LDI LDI4 → Compression (MVC) → Compressed LDI LDI4' → Rendering (DIBR) → Final view V6''

"Breakdancing" multi-view video

[Figure: rate-distortion curves on the "Breakdancing" MVD dataset. Axes: PSNR (dB) and SSIM (%) versus Bitrate (Mbit/s). Curves: MPEG (MVC/VSRS), with MVC on V.1-3-5-7 and VSRS for V.6; LDI coded with MVC, I-LDI coded with MVC and O-LDI coded with MVC, with LDI from V.4-3-5 and rendering of V.6]

4. Conclusions

View synthesis conclusions

JPF: Joint Projection Filling method
Projection with an occlusion-compatible pixel scanning order
Handles cracks
Fills disocclusions with background
Preserves geometrical structures

Virtual view synthesis method with Full-Z Depth Aided Inpainting
First synthesizes the virtual z-map, which then helps synthesize the virtual view
Preserves sharp boundaries
Realistic disocclusion filling

Intermediate representation conclusions

I-LDI: Incremental Layered Depth Image
Iterative LDI construction to avoid correlations between layers
Fewer layers
Fewer pixels
Compact distribution

O-LDI: Object-based Layered Depth Image
Pixel reorganisation to enhance depth continuity
Static background
No depth discontinuities ⇒ No compression artifacts
Compatible with mesh-based rendering

Perspectives

Handle depth map inconsistencies
Unrealistic depth maps degrade rendering quality

Improve temporal coherence
During LDI construction
During view projection
During Depth Aided Inpainting

Use a more efficient compression scheme
Consider MPEG 3D-HEVC
Explore dedicated depth map compression schemes

Publications

[Bosc et al., 2010] Bosc, E., Jantet, V., Morin, L., Pressigout, M., & Guillemot, C. (2010). Vidéo 3D: quel débit pour la profondeur? In CORESA.
[Bosc et al., 2011] Bosc, E., Jantet, V., Pressigout, M., Morin, L., & Guillemot, C. (2011). Bit-rate allocation for multi-view video plus depth data. In 3DTV.
[Jantet et al., 2011a] Jantet, V., Guillemot, C., & Morin, L. (2011a). Joint projection filling method for occlusion handling in depth-image-based rendering. 3D Research, 2, 1–13.
[Jantet et al., 2011b] Jantet, V., Guillemot, C., & Morin, L. (2011b). Object-based layered depth images for improved virtual view synthesis in rate-constrained context. In ICIP.
[Jantet et al., 2009] Jantet, V., Morin, L., & Guillemot, C. (2009). Incremental-LDI for multi-view coding. In 3DTV.
[Jantet et al., 2010] Jantet, V., Morin, L., & Guillemot, C. (2010). Génération, compression et rendu de LDI. In CORESA.
[Sourimant et al., 2009] Sourimant, G., Colleu, T., Jantet, V., & Morin, L. (2009). Recalage GPS / SIG / vidéo, et synthèse de textures de bâtiments. In CORESA.
[Sourimant et al., 2011] Sourimant, G., Colleu, T., Jantet, V., Morin, L., & Bouatouch, K. (2011). Toward automatic GIS–video initial registration. Annals of Telecommunications, 67, 1–13.

[AB91] Edward H. Adelson and James R. Bergen. ...
... Modeling and rendering architecture from ...