TY - GEN
T1 - Cascade Residual Learning
T2 - 16th IEEE International Conference on Computer Vision Workshops, ICCVW 2017
AU - Pang, Jiahao
AU - Sun, Wenxiu
AU - Ren, Jimmy S.J.
AU - Yang, Chengxi
AU - Yan, Qiong
N1 - Publisher Copyright: © 2017 IEEE.
PY - 2018/1/19
Y1 - 2018/1/19
AB - With recent developments in convolutional neural networks (CNNs), matching dense correspondence from a stereo pair has been cast as a learning problem, with performance exceeding traditional approaches. However, it remains challenging to generate high-quality disparities for the inherently ill-posed regions. To tackle this problem, we propose a novel cascade CNN architecture composed of two stages. The first stage advances the recently proposed DispNet by equipping it with extra up-convolution modules, leading to disparity images with more detail. The second stage explicitly rectifies the disparity initialized by the first stage; it couples with the first stage and generates residual signals across multiple scales. The sum of the outputs from the two stages gives the final disparity. We show that learning the residual, as opposed to directly learning the disparity at the second stage, provides more effective refinement and also benefits the training of the overall cascade network. Experiments show that our cascade residual learning scheme provides state-of-the-art performance for matching stereo correspondence. At the time of submission, our method ranked first in the KITTI 2015 stereo benchmark, surpassing prior work by a noteworthy margin.
UR - https://www.scopus.com/pages/publications/85044898026
U2 - 10.1109/ICCVW.2017.108
DO - 10.1109/ICCVW.2017.108
M3 - Conference contribution
AN - SCOPUS:85044898026
T3 - Proceedings - 2017 IEEE International Conference on Computer Vision Workshops, ICCVW 2017
SP - 878
EP - 886
BT - Proceedings - 2017 IEEE International Conference on Computer Vision Workshops, ICCVW 2017
Y2 - 22 October 2017 through 29 October 2017
ER -