Self-supervised Learning From High Dimensional Data for Autonomous Off-Road Driving

Ayse Naz Erkan1 Raia Hadsell1

Marc'Aurelio Ranzato1 Urs Muller2

Pierre Sermanet1,2 Yann LeCun1 Koray Kavukcuoglu1

(1) Courant Institute of Mathematical Sciences, New York University
(2) Net-Scale Technologies, Morganville, NJ

Problem: Autonomous, Vision-based Navigation in Complex Off-Road Environments
Stereo-based navigation uses simple heuristics to identify pixels as ground or obstacle.
Stereo is insufficient:
● sparse, noisy, and short-range (0-12 meters)
● pure stereo navigation is myopic, like driving in fog

The Platform: LAGR Mobile Robot



Challenge: Vision-based Navigation for Mobile Robots
Why is it hard?
● Extreme environmental variability
● Visual complexity: shadows, clutter
● Hilly, bumpy, uneven terrain
● Real-time constraints on processing
● Tricks: collapsible vegetation, hidden obstacles
● Position estimation errors: wheel slip, GPS
● Planning with uncertainty
● Lighting variability: glare, time of day



Challenges for machine learning solutions:
● supervised learning limits the variability of environments
● online learning is adaptive, but has no memory
● large image patches are necessary for accurate learning, which makes the input high-dimensional
● generalization from near-range to far-range (inverse size/distance)
● planning with uncertainty from classifiers
● concept drift
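The inverse size/distance relation in the list above follows from the pinhole camera model: the image height of an object is proportional to 1/distance. The sketch below illustrates that relation and the resampling factor that would undo it. It is an illustration only, not the poster's implementation; the function names, the focal length, and the reference distance are hypothetical.

```python
def apparent_height_px(height_m, distance_m, focal_px):
    """Pinhole projection: image height (pixels) of an object height_m tall."""
    return focal_px * height_m / distance_m


def normalizing_scale(distance_m, reference_m):
    """Resampling factor that makes an object at distance_m appear
    the same size as it would at reference_m."""
    return distance_m / reference_m


# Hypothetical numbers: a 1 m obstacle seen with a 500 px focal length.
near = apparent_height_px(1.0, 5.0, 500.0)
far = apparent_height_px(1.0, 50.0, 500.0)
far_normalized = far * normalizing_scale(50.0, 5.0)
```

After rescaling, the far object has the same pixel height as the near one, so a classifier trained on near-range windows would see far-range objects at a familiar size.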

LAGR (Learning Applied to Ground Robots) 

DARPA program, 2005-2008: 8 competing research labs develop navigation for a fixed platform
Periodic testing in unfamiliar terrain
CMU & NREC designed the platform and baseline software:
● 4 color cameras (2 stereo pairs, 640x480)
● GPS receiver for global navigation
● 2 front bumper switches
● Onboard IMU (inertial measurement unit)
● 4 onboard Linux computers:
   2 "eye" machines (dual core, 2 GHz)
   1 "planning" machine (single core, 1 GHz)
   1 low-level control computer (single core)

The Solution: Online Self-Supervised Learning    

Strategy: Online Near-to-Far Learning
● Inputs: large windows in image
● Labels: heuristics from stereo module
● Classifier: unsupervised auto-encoder + online logistic regression

[Figure: input image; stereo labels (0-12 m); classifier prediction (5-80 m)]
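The near-to-far strategy can be sketched in a few lines: fit a logistic regression on the windows that stereo has labeled, then apply it to every window, including those beyond stereo range. This is a minimal illustration, not the authors' code. It assumes feature vectors are already extracted, that stereo labels are encoded as +1 (obstacle) and -1 (ground) with 0 for unlabeled windows, and that the learning rate, decay strength, and step count are hypothetical values.

```python
import math


def sigmoid(z):
    """Numerically stable logistic function g(z) = 1 / (1 + exp(-z))."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


def dot(w, d):
    return sum(wi * di for wi, di in zip(w, d))


def train_on_frame(w, feats, labels, lr=0.1, decay=1e-3, steps=20):
    """Fit w on the windows of one frame that carry a stereo label."""
    labeled = [(d, y) for d, y in zip(feats, labels) if y != 0]
    if not labeled:
        return list(w)
    w = list(w)
    for _ in range(steps):
        grad = [decay * wi for wi in w]  # L2 penalty (assumed strength)
        for d, y in labeled:
            # gradient of -log g(y * w.d) is -y * g(-y * w.d) * d
            coeff = -y * sigmoid(-y * dot(w, d)) / len(labeled)
            for j, dj in enumerate(d):
                grad[j] += coeff * dj
        w = [wi - lr * gj for wi, gj in zip(w, grad)]
    return w


def predict(w, feats):
    """Obstacle probability g(w.d) for every window, labeled or not."""
    return [sigmoid(dot(w, d)) for d in feats]


# Toy frame: four near windows with stereo labels, two far windows without.
feats = [[2.0, 0.1], [1.5, -0.2], [-2.0, 0.1], [-1.8, 0.3],
         [2.2, 0.0], [-2.1, 0.1]]
labels = [1, 1, -1, -1, 0, 0]
w = train_on_frame([0.0, 0.0], feats, labels)
far_probs = predict(w, feats[4:])
```

Calling `train_on_frame` once per frame with the previous weights as the starting point gives the online behavior: the classifier tracks the current terrain but keeps no explicit memory of earlier frames.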



[Diagram: processing pipeline. Input: calibrated stereo images. Stages: distance-normalized image pyramid, auto-encoder network $F_W$, stereo-based obstacle detector. Output: training set of labeled feature vectors.]

Distance-Normalized Image Pyramid
● Input image is normalized such that the size of an object is independent of its distance from the robot
● Allows consistent processing of windows at different scales
● Distance normalization allows learning using large, context-rich windows

2 Layer Auto-Encoder Network
● Robust feature extraction
● Trained offline
● 100,000 training images from log files
● Input patches $X_{1 \ldots p}$ (YUV, 13x24x3) are mapped to feature vectors $D = F_W(X)$
● Kernels (2 layers) learned by Auto-Encoder: 20x7x6 in first layer; 300x6x5 in second layer
[Diagram: each layer has an encoder and a decoder. In the second layer the input is the first-layer code, $Y' = Z$; the encoder $F'_{enc}(Y')$ produces the code $Z'$, and the decoder $F'_{dec}(Z')$ reconstructs the input with error $\| Y' - F'_{dec}(Z') \|^2$.]

Online Learning: Logistic Regression with gradient descent
The online classifier is trained at each frame using gradient descent on the 100-dimensional features extracted via the unsupervised auto-encoder network.
● Weights $W$ are trained with a cross-entropy loss function
● Regularization: decay to default weights, L2 regularization
Training set (samples, labels): $\{ (D^i, y^i) \}_{i=1}^{n}$, with $y^i \in \{-1, +1\}$
Loss: $L = -\sum_{i=1}^{n} \log g(y^i \cdot W' D^i) + R(W)$
Learning: $\frac{\partial L}{\partial W} = -\sum_{i=1}^{n} y^i \cdot g(-y^i \cdot W' D^i) \, D^i + \frac{\partial R}{\partial W}$
Inference: $y = g(W' D)$, where $g(z) = \frac{1}{1 + e^{-z}}$

Online Ensemble Learning: Mixture of Experts Architecture
A single online classifier exhibits fast learning and fast forgetting. Architectures that combine high-capacity slow learners with low-capacity, highly adaptive controllers could solve this memory problem. Online mixture of experts is one such architecture.
[Diagram: Input, three Experts and a Controller, ∑, Output]
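A mixture-of-experts forward pass can be sketched as follows. The poster does not specify how the controller weights the experts or how the ensemble is trained, so this is a generic illustration under stated assumptions: each expert is a logistic classifier, the controller is a softmax gate over the same input, and the output is the gate-weighted sum of expert outputs. All names and weights are hypothetical.

```python
import math


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


def softmax(scores):
    """Normalize scores to positive weights that sum to 1."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(w, d):
    return sum(wi * di for wi, di in zip(w, d))


def mixture_predict(expert_weights, controller_weights, d):
    """Return (output, gates): the gate-weighted sum of expert outputs
    and the controller's weight for each expert."""
    gates = softmax([dot(v, d) for v in controller_weights])  # assumed softmax gate
    outputs = [sigmoid(dot(w, d)) for w in expert_weights]
    return sum(g * o for g, o in zip(gates, outputs)), gates


# Three hypothetical experts; the controller strongly favors the first one
# for this input, so the mixture output follows that expert.
experts = [[3.0, 0.0], [0.0, 1.0], [-1.0, 0.0]]
controller = [[50.0, 0.0], [0.0, 0.0], [0.0, 0.0]]
out, gates = mixture_predict(experts, controller, [1.0, 0.0])
```

In this arrangement the experts can change slowly and retain what they learned in earlier terrain, while the controller adapts quickly to pick the expert that suits the current frame.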

Results: Evaluation of Learning and Driving Performance
ex. A: road following and man-made obstacle detection
ex. B: difficult ground recognition, multi-color and shadows
ex. C: very long range vision to the horizon
[Figure panels for each example: Input image, Stereo Labels, Classifier Output]
[Figure: driving paths from Start, No Learning vs. With Learning]
Direct path to goal ends in cul-de-sac
Short-range stereo (