EINRUL: Efficient Implicit Neural Reconstruction Using LiDAR

1Harbin Institute of Technology, 2The University of Hong Kong, 3Hong Kong University of Science and Technology, 4DJI


Abstract

Modeling scene geometry with implicit neural representations offers advantages in accuracy, flexibility, and memory usage. Previous approaches have demonstrated impressive results using color or depth images, but still struggle with poor lighting conditions and large-scale scenes. Methods that take a global point cloud as input require accurate registration and ground-truth coordinate labels, which limits their application scenarios.

In this paper, we propose a new method that uses sparse LiDAR point clouds and rough odometry to efficiently reconstruct a fine-grained implicit occupancy field within a few minutes. We introduce a new loss function that supervises directly in 3D space, without 2D rendering, avoiding information loss. We also refine the poses of the input frames in an end-to-end manner, producing consistent geometry without global point cloud registration.

To the best of our knowledge, our method is the first to reconstruct an implicit scene representation from LiDAR-only input.* Experiments on synthetic and real-world datasets, covering both indoor and outdoor scenes, show that our method is effective, efficient, and accurate, achieving results comparable to existing methods that use dense input.

* As of the submission of this paper (Sept. 2022)

Method

EINRUL takes sparse LiDAR frames with rough poses as input and optimizes an implicit representation of the scene efficiently and accurately. The rough poses are refined simultaneously, jointly with the scene representation, to guarantee consistent geometry (see the sketch below). The use of LiDAR allows the method to be applied in a wide range of scenarios.
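To make the joint optimization concrete, the sketch below shows one way it could be set up in PyTorch: a small occupancy MLP is trained together with per-frame SE(3) pose corrections applied on top of the rough odometry. All names here (OccField, skew, exp_so3, to_world) and the parameterization are our own illustrative assumptions, not the authors' released code.

import torch
import torch.nn as nn

class OccField(nn.Module):
    """Small MLP mapping a 3D point to an occupancy probability."""
    def __init__(self, hidden: int = 128):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(3, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
            nn.Linear(hidden, 1),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return torch.sigmoid(self.net(x)).squeeze(-1)

def skew(k: torch.Tensor) -> torch.Tensor:
    """(3,) vector -> (3, 3) skew-symmetric matrix, gradient-friendly."""
    z = torch.zeros((), dtype=k.dtype, device=k.device)
    return torch.stack([
        torch.stack([z, -k[2], k[1]]),
        torch.stack([k[2], z, -k[0]]),
        torch.stack([-k[1], k[0], z]),
    ])

def exp_so3(w: torch.Tensor) -> torch.Tensor:
    """Rodrigues' formula: axis-angle (3,) -> rotation matrix (3, 3)."""
    theta = w.norm().clamp_min(1e-8)
    K = skew(w / theta)
    I = torch.eye(3, dtype=w.dtype, device=w.device)
    return I + torch.sin(theta) * K + (1.0 - torch.cos(theta)) * (K @ K)

# Per-frame learnable SE(3) corrections on top of the rough odometry,
# optimized end-to-end together with the occupancy field.
num_frames = 10                                  # toy value
dw = nn.Parameter(torch.zeros(num_frames, 3))    # rotation corrections
dt = nn.Parameter(torch.zeros(num_frames, 3))    # translation corrections
field = OccField()
opt = torch.optim.Adam([
    {"params": field.parameters(), "lr": 1e-3},
    {"params": [dw, dt], "lr": 1e-4},
])

def to_world(points: torch.Tensor, R0: torch.Tensor, t0: torch.Tensor,
             i: int) -> torch.Tensor:
    """Rough pose (R0, t0) of frame i composed with its learned correction."""
    R = exp_so3(dw[i]) @ R0
    t = t0 + dt[i]
    return points @ R.T + t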

The key challenge of the method is to reconstruct a dense representation (an implicit model) of the scene from rather sparse input. To address it, we supervise directly in 3D space without 2D rendering, achieving strong supervision while remaining unbiased and occlusion-aware.
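The exact loss is not spelled out here, but one plausible reading of direct 3D supervision is a carving-style objective: samples along a LiDAR ray in front of the measured return are labelled free, and samples in a thin band around the return are labelled occupied. The hypothetical ray_loss below sketches that idea; n_free, n_hit, and eps are illustrative knobs, not the paper's values.

import torch
import torch.nn.functional as F

def ray_loss(field, origin, direction, depth,
             n_free: int = 32, n_hit: int = 4, eps: float = 0.05):
    """origin, direction: (3,) world-frame ray; depth: measured range."""
    # Free-space samples strictly in front of the measured return.
    t_free = torch.rand(n_free) * (depth - eps)
    # Samples in a thin band around the measured surface point.
    t_hit = depth + (torch.rand(n_hit) - 0.5) * eps
    pts = origin + torch.cat([t_free, t_hit])[:, None] * direction
    target = torch.cat([torch.zeros(n_free), torch.ones(n_hit)])
    occ = field(pts)                      # occupancy probabilities in (0, 1)
    # Supervision acts on 3D samples directly; no 2D depth rendering.
    return F.binary_cross_entropy(occ, target)

With to_world from the previous sketch, each optimization step would transform a frame's points into the world frame, evaluate ray_loss over a batch of rays, and backpropagate into both the field and the pose corrections.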

Since EINRUL converts sparse depth data into a dense representation, depth completion that leverages multi-view information comes as a byproduct.
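As a hedged illustration of how depth completion might fall out of the learned field, the sketch below marches an arbitrary query ray, converts per-sample occupancy into first-hit (termination) weights, and returns the expected depth. The sampling scheme and the render_depth name are assumptions for illustration only.

import torch

@torch.no_grad()
def render_depth(field, origin, direction, near=0.1, far=10.0, n=128):
    """Expected depth of a query ray under the learned occupancy field."""
    t = torch.linspace(near, far, n)
    pts = origin + t[:, None] * direction
    occ = field(pts)                                 # (n,) in (0, 1)
    # Probability the ray survives all samples before sample i.
    trans = torch.cumprod(
        torch.cat([torch.ones(1), 1.0 - occ[:-1]]), dim=0)
    w = trans * occ                                  # termination weights
    return (w * t).sum() / w.sum().clamp_min(1e-8)   # expected depth

Because the field fuses observations from all frames, even rays through pixels that never received a LiDAR return yield a depth estimate, which is what makes the multi-view completion possible.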

Results

ScanNet


Replica


9-Synthetic-Scenes


Outdoor AVIA


Depth Completion



[Figure: depth completion example showing the color image, GT depth, and a pseudo-LiDAR vs. rendered depth comparison.]

[Figure: LiDAR point cloud, rendered depth, and rendered mesh.]

BibTeX

@InProceedings{yan2023efficient,
      author = {Yan, Dongyu and Lyu, Xiaoyang and Shi, Jieqi and Lin, Yi},
      title = {Efficient Implicit Neural Reconstruction Using LiDAR},
      booktitle = {2023 IEEE International Conference on Robotics and Automation (ICRA)},
      year = {2023}
}