My research interests include computer vision and machine learning, with a focus on depth estimation, depth completion, depth super-resolution, and NeRF rendering. These tasks are crucial for applications
such as self-driving, robotic vision, and related 3D visual perception. I am also fascinated by the task of 3D occupancy prediction.
Degradation Oriented and Regularized Network for Real-World Depth Super-Resolution
Zhengxue Wang*,
Zhiqiang Yan* ✉,
Jinshan Pan,
Guangwei Gao,
Kai Zhang,
Jian Yang ✉
arXiv, 2024, project page
For the first time, we introduce a Degradation Oriented and Regularized Network (DORNet) designed for real-world depth super-resolution, addressing the challenges posed by unconventional and unknown degradations.
The core concept involves estimating implicit degradation representations to achieve effective RGB-D fusion. This degradation learning process is self-supervised.
DCDepth: Progressive Monocular Depth Estimation in Discrete Cosine Domain
Kun Wang,
Zhiqiang Yan,
Junkai Fang,
Wanlu Zhu,
Xiang Li,
Jun Li ✉,
Jian Yang ✉
NeurIPS, 2024, project page
DCDepth transforms depth patches into the discrete cosine domain to estimate frequency coefficients, modeling local depth correlations.
The frequency transformation separates depth information into low-frequency (core structure) and high-frequency (details) components.
The progressive strategy predicts low-frequency components first for global context, then refines details with the high-frequency components.
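As a rough illustration of the frequency decomposition described above (not DCDepth's network, which predicts the coefficients rather than computing them from ground truth), the sketch below applies an orthonormal 2D DCT to a toy depth patch and reconstructs it coarse-to-fine. The helper names, the toy patch, and the `u + v` ordering of frequencies are assumptions made for this example.

```python
import math

def dct_matrix(n):
    """Orthonormal DCT-II basis, one basis vector per row."""
    rows = []
    for k in range(n):
        scale = math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)
        rows.append([scale * math.cos(math.pi * (2 * i + 1) * k / (2 * n))
                     for i in range(n)])
    return rows

def matmul(a, b):
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def transpose(a):
    return [list(col) for col in zip(*a)]

def dct2(patch):
    """2D DCT of a square patch: C @ X @ C^T."""
    c = dct_matrix(len(patch))
    return matmul(matmul(c, patch), transpose(c))

def idct2(coeffs):
    """Inverse 2D DCT: C^T @ Y @ C (C is orthonormal)."""
    c = dct_matrix(len(coeffs))
    return matmul(matmul(transpose(c), coeffs), c)

def keep_low_frequencies(coeffs, max_order):
    """Zero every coefficient whose index sum u + v exceeds max_order."""
    return [[value if u + v <= max_order else 0.0 for v, value in enumerate(row)]
            for u, row in enumerate(coeffs)]

def rmse(a, b):
    diffs = [(x - y) ** 2 for ra, rb in zip(a, b) for x, y in zip(ra, rb)]
    return math.sqrt(sum(diffs) / len(diffs))

# Hypothetical 4x4 depth patch in metres: a smooth surface with one depth edge.
patch = [
    [2.0, 2.1, 2.2, 4.0],
    [2.0, 2.1, 2.2, 4.0],
    [2.1, 2.2, 2.3, 4.1],
    [2.1, 2.2, 2.3, 4.1],
]
coeffs = dct2(patch)

# Coarse-to-fine: admit progressively higher frequencies and track the error.
errors = [rmse(patch, idct2(keep_low_frequencies(coeffs, order)))
          for order in range(7)]
```

The first entry of `errors` corresponds to keeping only the DC term (the patch mean), and each later entry admits one more band of higher frequencies; with all bands included the patch is recovered up to floating-point error.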
Tri-Perspective View Decomposition for Geometry-Aware Depth Completion
Zhiqiang Yan,
Yuankai Lin,
Kun Wang,
Yupeng Zheng,
Yufei Wang,
Zhenyu Zhang,
Jun Li ✉,
Jian Yang ✉
CVPR, 2024, oral, project page
TPVD decomposes the 3D point cloud into three views to capture the fine-grained 3D geometry of scenes. TPV Fusion and GSPN are proposed to refine the depth.
Furthermore, we build a novel depth completion dataset named TOFDC, acquired by the time-of-flight (TOF) sensor and the color camera on smartphones.
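To make the idea of decomposing a point cloud into three views concrete, here is a minimal sketch that rasterises points onto three orthogonal planes. It is not the TPVD implementation: the view names, axis conventions, grid size, and the function `project_to_three_views` are all assumptions chosen for illustration.

```python
def project_to_three_views(points, grid_size, bounds):
    """Count points per cell on three orthogonal planes.

    points: iterable of (x, y, z) tuples; bounds: (low, high) shared by all axes.
    """
    low, high = bounds

    def to_cell(value):
        # Map a coordinate to a grid index, clamping points on or outside the bounds.
        index = int((value - low) / (high - low) * grid_size)
        return min(max(index, 0), grid_size - 1)

    views = {name: [[0] * grid_size for _ in range(grid_size)]
             for name in ("top", "front", "side")}
    for x, y, z in points:
        i, j, k = to_cell(x), to_cell(y), to_cell(z)
        views["top"][i][k] += 1    # drop y: x-z plane
        views["front"][i][j] += 1  # drop z: x-y plane
        views["side"][k][j] += 1   # drop x: z-y plane
    return views

# Hypothetical toy cloud inside the unit cube.
cloud = [(0.1, 0.1, 0.1), (0.9, 0.1, 0.1), (0.9, 0.9, 0.9)]
views = project_to_three_views(cloud, grid_size=2, bounds=(0.0, 1.0))
```

Each 2D view loses one axis, so points that are distinct in 3D can share a cell in one view while remaining separated in another; combining the three views is what retains the 3D geometry.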
Scene Prior Filtering for Depth Map Super-Resolution
Zhengxue Wang*,
Zhiqiang Yan* ✉,
Ming-Hsuan Yang,
Jinshan Pan,
Ying Tai,
Guangwei Gao,
Jian Yang ✉
arXiv, 2024, project page
To address texture interference and edge inaccuracy in guided depth super-resolution (GDSR), SPFNet is the first to introduce surface normal and semantic map priors from large-scale models.
As a result, SPFNet achieves state-of-the-art performance.
RigNet++: Semantic Assisted Repetitive Image Guided Network for Depth Completion
Zhiqiang Yan,
Xiang Li,
Le Hui,
Zhenyu Zhang,
Jun Li ✉,
Jian Yang ✉
arXiv, 2024
Building on RigNet, RigNet++ introduces the large-scale model SAM in the semantic guidance branch to supply depth with a semantic prior.
In the image guidance branch, RigNet++ designs a dense repetitive hourglass network (DRHN) to provide powerful contextual instruction for depth
prediction. In addition, RigNet++ proposes a region-aware spatial propagation network (RASPN) for further depth refinement based on the
semantic prior constraint.
Distortion and Uncertainty Aware Loss for Panoramic Depth Completion
Zhiqiang Yan,
Xiang Li,
Kun Wang,
Shuo Chen ✉,
Jun Li ✉,
Jian Yang
ICML, 2023
The standard MSE or MAE loss is commonly used in limited field-of-view depth completion. It treats each pixel equally, under the basic assumption that all pixels contribute equally during optimization.
However, this assumption does not hold for panoramic data because of its latitude-wise distortion and the high uncertainty near textures and edges.
To handle these challenges, this paper proposes the distortion and uncertainty aware loss (DUL) that consists of a distortion-aware loss and an uncertainty-aware loss.
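The latitude-wise distortion can be illustrated with a simple weighted loss. The sketch below assumes an equirectangular panorama and weights each row by the cosine of its latitude, which is proportional to the area that row's pixels cover on the sphere. This is a common weighting chosen for illustration, not necessarily the formulation used in DUL, and it does not cover the uncertainty-aware term. The function names are hypothetical.

```python
import math

def latitude_weights(height):
    """Per-row weights for an equirectangular image: cos(latitude) at row centres."""
    return [math.cos((0.5 - (row + 0.5) / height) * math.pi)
            for row in range(height)]

def distortion_weighted_mse(pred, target):
    """MSE in which rows near the poles count less than rows near the equator."""
    weights = latitude_weights(len(pred))
    weighted_sum = 0.0
    weight_total = 0.0
    for w, pred_row, target_row in zip(weights, pred, target):
        for p, t in zip(pred_row, target_row):
            weighted_sum += w * (p - t) ** 2
            weight_total += w
    return weighted_sum / weight_total

# Toy 4x2 panorama: the same unit error placed near a pole or near the equator.
target = [[1.0, 1.0] for _ in range(4)]
error_near_pole = [[2.0, 2.0], [1.0, 1.0], [1.0, 1.0], [1.0, 1.0]]
error_near_equator = [[1.0, 1.0], [2.0, 2.0], [1.0, 1.0], [1.0, 1.0]]

loss_pole = distortion_weighted_mse(error_near_pole, target)
loss_equator = distortion_weighted_mse(error_near_equator, target)
```

A plain MSE would score both predictions identically; with the latitude weights, the error near the pole is penalised less because those pixels are stretched copies of a small region of the sphere.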
DesNet: Decomposed Scale-Consistent Network for Unsupervised Depth Completion
Zhiqiang Yan,
Kun Wang,
Xiang Li,
Zhenyu Zhang,
Jun Li ✉,
Jian Yang ✉
AAAI, 2023, oral
DesNet first introduces a decomposed scale-consistent learning strategy, which decomposes absolute depth into relative depth prediction and global scale estimation so that each component benefits from being learned individually.
Extensive experiments show the superiority of DesNet on the KITTI benchmark, where it ranks 1st and surpasses the second-best method by more than 12% in RMSE.
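The decomposition of absolute depth into a relative map and a global scale can be sketched in a few lines. The example below assumes the mean depth is used as the global scale, which is a choice made for illustration only; DesNet learns both parts with networks, and the helper names here are hypothetical.

```python
def decompose(depth):
    """Split an absolute depth map into (relative depth, global scale)."""
    values = [d for row in depth for d in row]
    scale = sum(values) / len(values)  # assumption: mean depth as the global scale
    relative = [[d / scale for d in row] for row in depth]
    return relative, scale

def recompose(relative, scale):
    """Absolute depth is the relative map multiplied by the global scale."""
    return [[r * scale for r in row] for row in relative]

# Toy depth map in metres, and the same scene at twice the metric scale.
depth = [[2.0, 4.0], [6.0, 8.0]]
doubled = [[2.0 * d for d in row] for row in depth]

relative, scale = decompose(depth)
relative_doubled, scale_doubled = decompose(doubled)
restored = recompose(relative, scale)
```

The relative map is unchanged when the whole scene is rescaled, while the scale factor absorbs the change, so the two quantities can be estimated separately and multiplied back together.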
Multi-Modal Masked Pre-Training for Monocular Panoramic Depth Completion
Zhiqiang Yan*,
Xiang Li*,
Kun Wang,
Zhenyu Zhang,
Jun Li ✉,
Jian Yang ✉
ECCV, 2022
For the first time, we enable masked pre-training in a convolution-based multi-modal task, instead of a Transformer-based single-modal task.
In addition, we introduce panoramic depth completion, a new task that facilitates 3D reconstruction.
Learning Complementary Correlations for Depth Super-Resolution with Incomplete Data in Real World
Zhiqiang Yan,
Kun Wang,
Xiang Li,
Zhenyu Zhang,
Guangyu Li ✉,
Jun Li ✉,
Jian Yang
TNNLS, 2022
Motivated by practical applications, this paper introduces a new task, incomplete depth super-resolution (IDSR),
which recovers dense, high-resolution depth from incomplete, low-resolution input.
Regularizing Nighttime Weirdness: Efficient Self-supervised Monocular Depth Estimation in the Dark
Kun Wang*,
Zhenyu Zhang*,
Zhiqiang Yan,
Xiang Li,
Baobei Xu,
Jun Li ✉,
Jian Yang ✉
ICCV, 2021, project page
RNW introduces a nighttime self-supervised monocular depth estimation framework.
Low visibility leads to weak textures, while varying illumination breaks the brightness-consistency assumption.
To address these problems, RNW proposes the novel Priors-Based Regularization, Mapping-Consistent Image Enhancement, and
Statistics-Based Mask.
Selected Honors and Awards
- 2023.10, National Scholarship (Top 2%), NJUST
- 2022.10, Hua Wei Scholarship (Top 1%), NJUST
Academic Service
- Conference reviewer: CVPR, ICCV, ECCV, NeurIPS, ICML, ICLR, AAAI, ICRA, 3DV, ACCV
- Journal reviewer: TIP, TCSVT, TIV
This webpage is forked from Junkai Fan. Thanks to him!