Welcome to the homepage of Yihong WU              Chinese version





Robot Vision Group 


National Laboratory of Pattern Recognition 


Institute of Automation 


Chinese Academy of Sciences


P.O. Box 2728, Beijing, 100080

P.R. China


Email: yhwu at nlpr.ia.ac.cn

Tel:   +86-10-82544697

Fax:  +86-10-82544594


**National Laboratory of Pattern Recognition, Institute of Automation, Chinese Academy of Sciences**


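1. SLAM based on RGB-D cameras
2. Deep-learning-based visual localization (place recognition, feature matching)
3. Deep-learning-based point cloud tasks
4. SLAM fusing LiDAR, vision, and inertial navigation
5. Visual-inertial odometry
6. LiDAR-based SLAM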

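Recruiting **interns in 3D vision**. Requirements:
1. A rigorous scientific attitude, **adherence to academic ethics, good physical and mental health, and compliance with laws and regulations**.
2. Undergraduate, master's, or doctoral students in engineering with a strong interest in programming and academic research;
3. Basic knowledge of computer vision or deep learning;
4. **Permission from your university supervisor** and a guaranteed internship period of more than half a year.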

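1. Continue your existing research direction with guidance from the supervisor, or take part in the group's research projects;
2. Guidance on publishing high-level papers (contributing to the writing or writing independently), for example at top robotics conferences such as ICRA and IROS, or top vision conferences such as CVPR, ICCV, and ECCV.
3. Internship salary.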

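1. Ample computing resources, **including personal workstations and servers**.
2. In-depth participation in forward-looking projects commissioned by well-known companies.
3. Internship certificates and recommendation letters can be provided; outstanding interns are given priority for master's and doctoral study opportunities in the laboratory.
4. Location: No. 95 Zhongguancun East Road, Haidian District, Beijing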




3DV 2020 Tutorial: Visual Localization in the Age of Deep Learning

ACCV 2018 Tutorial: Developments of 3 Dimensional Computer Vision Since 2017         PDF


Research Demos (Pose tracking):

1. Vision-based mobile augmented reality





2. SLAM for a general moving object





cylinder SLAM


3. SLAM for scenes





4. SLAM relocalization






5. SLAM with IMU





6. SLAM devices

Real-time and online, outdoor and indoor

SLAM device


Research Interests

1.  Camera calibration and pose determination, image matching, three-dimensional reconstruction, vision geometry, SLAM, vision on mobile devices.

2.  Geometric invariants and applications in computer vision and pattern recognition.  

3.  Polynomial elimination and applications in computer vision.



Selected Publications

  1. Y. Wu, Y.F. Li, and Z. Hu. Detecting and Handling Unreliable Points for Camera Parameter Estimation. International Journal of Computer Vision (IJCV), Vol. 79, No. 2, pp. 209-223, 2008. PDF
  2. Y. Wu and Z. Hu. Invariant Representations of a Quadric Cone and a Twisted Cubic. IEEE Transactions on Pattern Analysis and Machine Intelligence (PAMI), Vol. 25, No. 10, pp. 1329-1332, 2003. PDF
  3. Y. Wu and Z. Hu. Geometric Invariants and Applications under Catadioptric Camera Model. The 10th International Conference on Computer Vision (ICCV), pp. 1547-1554, 2005.
  4. Y. Wu, H. Zhu, Z. Hu, and F. Wu. Camera Calibration from the Quasi-Affine Invariance of Two Parallel Circles. The 8th European Conference on Computer Vision (ECCV), pp. 190-202, 2004.
  5. P. Gurdjos, P. Sturm, and Y. Wu. Euclidean Structure from N>=2 Parallel Circles: Theory and Algorithms. The 10th European Conference on Computer Vision (ECCV), pp. 238-252, 2006. PDF
  6. Y. Wu, X. Li, F. Wu, and Z. Hu. Coplanar Circles, Quasi-Affine Invariance and Calibration. Image and Vision Computing, Vol. 24, Iss. 4, pp. 319-326, 2006.
  7. Y. Wu and Z. Hu. PnP Problem Revisited. Journal of Mathematical Imaging and Vision, Vol. 24, No. 1, pp. 131-141, 2006. PDF
  8. H. Li and Y. Wu. Automated Short Proof Generation with Cayley and Bracket Algebras I. Incidence Geometry. Journal of Symbolic Computation, Vol. 36, Iss. 5, pp. 717-762, 2003.
  9. H. Li and Y. Wu. Automated Short Proof Generation with Cayley and Bracket Algebras II. Conic Geometry. Journal of Symbolic Computation, Vol. 36, Iss. 5, pp. 763-809, 2003.
  10. H. Li and Y. Wu. Filtered-Graded Transfer of Groebner Basis Computation in Solvable Polynomial Algebra. Communications in Algebra, Vol. 28, No. 1, pp. 15-32, 2000.
  11. B. Zhang, Y. Li, and Y. Wu. Self-Recalibration of a Structured Light System via Plane-Based Homography. Pattern Recognition, Vol. 40, Iss. 4, pp. 1368-1377, 2007.
  12. F. Wu, F. Duan, Z. Hu, and Y. Wu. A New Linear Algorithm for Calibrating Central Catadioptric Cameras. Pattern Recognition, 41(10): 3166-3172, 2008.
  13. Zhijun Dai, Yihong Wu, Fengjun Zhang, and Hongan Wang. A Novel Fast Method for L_infinity Problems in Multiview Geometry. ECCV, pp. 116-129, 2012.
  14. Yihong Wu, Zhanyi Hu, and Youfu Li. Radial Distortion Invariants and Lens Evaluation under a Single-Optical-Axis Omnidirectional Camera. Computer Vision and Image Understanding, 126: 11-27, 2014. PDF
  15. Wei Liu, Yihong Wu, Fusheng Guo, and Zhanyi Hu. An Efficient Approach of 2D to 3D Video Conversion Based on Piece-wise Structure from Motion. The Visual Computer, Vol. 31, Iss. 1, pp. 55-68, January 2015. PDF
  16. Youji Feng, Lixin Fan, and Yihong Wu. Fast Localization in Large Scale Environments Using Supervised Indexing of Binary Features. IEEE Transactions on Image Processing, Vol. 25, No. 1, pp. 343-358, 2016.   PDF
  17. Youji Feng, Yihong Wu, and Lixin Fan. Real-time SLAM Relocalization with On-line Learning of Binary Feature Indexing. Machine Vision and Applications, 2017.
  18. Xiaomei Zhao, Yihong Wu, Guidong Song, Zhenye Li, Yazhuo Zhang, Yong Fan. A Deep Learning Model Integrating FCNNs and CRFs for Brain Tumor Segmentation. Medical Image Analysis, Vol. 43, pp. 98-111, 2018.
  19. Yihong Wu, Fulin Tang, and Heping Li. Image Based Camera Localization: an Overview. Invited paper by Visual Computing for Industry, Biomedicine and Art, 2018. PDF
  20. Haoren Wang, Juan Lei, Ao Li, and Yihong Wu. A Geometric based Point Cloud Reduction Method for Mobile Augmented Reality. Journal of Computer Science and Technology, 2018.
  21. Yihong Wu, Haoren Wang, Fulin Tang, and Zhiheng Wang. Efficient Conic Fitting with a Polar-N-Direction Geometric Distance. Pattern Recognition, 2019.
  22. Fulin Tang, Heping Li, Yihong Wu. FMD Stereo SLAM: Fusing MVG and Direct Formulation Towards Accurate and Fast Stereo SLAM. ICRA, 2019.
  23. Lang Wu, Yihong Wu. Similarity Hierarchy Based Place Recognition by Deep Supervised Hashing for SLAM. IROS 2019.
  24. Xiaomei Zhao, Yihong Wu. Automatically Extract Semi-transparent Motion-blurred Hand from a Single Image. IEEE Trans. on Signal Processing Letters, Vol. 26, No. 11, pp. 1598-1602, 2019.
  25. Fulin Tang, Yihong Wu, Xiaohui Hou, Haibin Ling. 3D Mapping and 6D Pose Computation for Real Time Augmented Reality on Cylindrical Objects. IEEE Transactions on Circuits and Systems for Video Technology, Vol. 30, Issue 9, pp. 2887-2899, 2020.
  26. Hao Wei, Fulin Tang, Chaofan Zhang, Yihong Wu. Highly Efficient Line Segment Tracking with an IMU-KLT Prediction and a Convex Geometric Distance Minimization. ICRA 2021.
  27. Bingxi Liu, Fulin Tang, Yujie Fu, Yanqun Yang, Yihong Wu. A Flexible and Efficient Loop Closure Detection based on Motion Knowledge. ICRA 2021.
  28. Yujia Zhai, Fulin Tang, Weijun Li, Jian Xu, Yihong Wu. 3OFLR-SLAM: Visual SLAM with 3D-assisting Optical Flow and Local-RANSAC. ACPR, 2021.
  29. Pengju Zhang, Chaofan Zhang, Zheng Rong, Yihong Wu. Learning to Decompose and Restore Low-light Images with Wavelet Transform. ACM MM Asia, 2021.
  30. Zewen Xu, Zheng Rong, Yihong Wu. A Survey: Which Features are Required for Dynamic Visual SLAM? Visual Computing for Industry, Biomedicine, and Art, 2021.
  31. Hao Wei, Fulin Tang, Zewen Xu, Chaofan Zhang, Yihong Wu. A Point-Line VIO System with Novel Feature Hybrids and with Novel Line Predicting-Matching. IEEE Robotics and Automation Letters, 2021. 
  32. Pengju Zhang, Chaofan Zhang, Bingxi Liu, and Yihong Wu. Leveraging Local and Global Descriptors in Parallel to Search Correspondences for Visual Localization. Pattern Recognition, 2021.
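  33. Xiaoxiao Long, Xinjing Cheng, Hao Zhu, Pengju Zhang, Haomin Liu, Jun Li, Lintao Zheng, Qingyong Hu, Hao Liu, Xun Cao, Ruigang Yang, Yihong Wu, Guofeng Zhang, Yebin Liu, Kai Xu, Yulan Guo, Baoquan Chen. Recent Progress in 3D Vision (in Chinese). Journal of Image and Graphics, 2021, 26(06): 1389-1428.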




All Publications







Work Experience

Oct. 2008-Present

Professor, Institute of Automation, Chinese Academy of Sciences  

Aug. 2003-Oct. 2008

Associate professor, Institute of Automation, Chinese Academy of Sciences 

Nov. 2007-Jan. 2008  

Visiting associate professor, City University of Hong Kong

Apr. 2005-Oct. 2006

Senior research associate / Research fellow, Department of Manufacturing Engineering and Engineering Management, City University of Hong Kong, collaborating with Prof. Youfu Li

Jun. 2001-Jul. 2003

Postdoctoral researcher, Institute of Automation, Chinese Academy of Sciences, Collaborating with Prof. Zhanyi Hu










Sep. 1998-Jun. 2001

Ph.D., Mathematics Mechanization Research Center, Institute of Systems Science, Chinese Academy of Sciences. Geometric invariants and applications. Supervisor: Prof. Hongbo Li

Sep. 1995-Jul. 1998

M.S., Dept. of Mathematics, Shaanxi Normal University. Computational algebra. Supervisor: Prof. Huishi Li

Sep. 1991-Jun. 1995

B.S., Dept. of Mathematics, Shanxi Yanbei Normal Institute. Mathematics education




Teaching Experience




Fall 2001, Projective geometry, Institute of Automation, Chinese Academy of Sciences

Spring 2002-2008, Projective geometry and 3D vision, Graduate School, Chinese Academy of Sciences

Spring 2017-present, Robot navigation, University of Chinese Academy of Sciences


Area Chair of CVPR 2021

Associate Editor of Pattern Recognition

Area Chair of ICPR 2018

Area Chair of PRCV 2021

SPC of IJCAI 2019/2020/2021

Editorial Board Member of ACTA AUTOMATICA SINICA

Editorial Board Member of Journal of CAD & CG

Editorial Board Member of Journal of Frontiers of Computer Science and Technology

Editorial Board Member of the Open Artificial Intelligence /Computer Science Journal

PC Member of VISUAL 2007

PC Member of ICCV 2007

PC Member of PCM 2007

PC Member of WCICA 2008

PC Member of CVPR 2008  

Session Chair of ACCV 2007

Session Chair of RIUPEEEC 2005

PC Member of KES 2009

Reviewer of ICCV 2009/ACCV 2009



Research Projects as PI

Single view based metrology (863)

A study on the theory and algorithm of camera self-calibration (NSFC)

Geometric invariant computation and camera pose determination from n perspective points (NSFC)

Image based 3-dimensional reconstruction (973)

Camera parameter computation from video sequence (Key NSFC)

Omnidirectional camera calibration (IA)

Study on image invariants and applications under multiple camera models (NSFC)

Image-based modeling for complex and large scale environment (NLPR)

Visual SLAM (Nokia RC in Finland)

Non-planar object tracking (Samsung)

Camera pose tracking (Samsung)

VR pose tracking (Huawei)



The First Workshop on Community Based 3D Content and Its Applications in Mobile Internet Environments, in conjunction with ACCV 2009



The Second Workshop on Community Based 3D Content and Its Applications, in conjunction with ICME 2012





ACCV 2018 Tutorial

Developments of 3 Dimensional Computer Vision Since 2017


Virtual reality (VR), augmented reality (AR), robotics, and autonomous driving have recently attracted much attention from both academia and industry. Three-dimensional (3D) computer vision plays an important role in these fields. A moving robot needs autonomous localization and navigation, and cameras offer the most flexible and low-cost approach to map building and localization. Augmenting reality in images requires camera pose determination, or localization. Viewing a virtual environment requires computing the corresponding viewing angle. Furthermore, cameras are ubiquitous: people carry camera-equipped mobile phones every day, and some AR applications already run on mobile phones. 3D computer vision therefore has broad applications.


This tutorial presents the developments in 3D computer vision over the past two years. It introduces important work since 2017 in image matching, camera localization (including camera pose determination and simultaneous localization and mapping, SLAM), and 3D reconstruction.


The tutorial has five parts:

1. Preface

Some fundamental knowledge of 3D vision is introduced, along with some events related to 3D vision in the past two years.


2. Developments in image matching

Some important work on image feature detectors and descriptors since 2017 is introduced, as well as important work on image matching and two datasets from the same period. Among these, deep-learning-based methods are increasingly common.


3. Developments in camera localization

A complete classification of image-based camera localization, organized as a tree structure, is given, and the important directions are pointed out. Developments since 2017 in camera localization are introduced for both known and unknown environments. Work in known environments concerns PnP; work in unknown environments concerns SLAM. The SLAM work includes general geometric SLAM, learning SLAM, semantic SLAM, and marker SLAM. Besides SLAM with traditional cameras, there is also SLAM work with event cameras and RGBD cameras.


4. Developments in 3D reconstruction

This part introduces work since 2017 on structure from motion (SFM) based 3D reconstruction, learning 3D reconstruction, and RGBD 3D reconstruction or RGBD SLAM.


5. Trends of 3D vision

I will share my views on trends in 3D vision.






3DV 2020 Tutorial:

 Visual Localization in the Age of Deep Learning

 JST 10:00----13:00    Nov. 28
 CST 09:00----12:00    Nov. 28
 GMT 01:00----04:00    Nov. 28
 EST 20:00----23:00    Nov. 27
 PST 17:00----20:00    Nov. 27

A short description:

Deep learning is a subset of machine learning and one of the foundations of artificial intelligence. Recent years have seen its remarkable development and wide application. It has been used in almost every task in computer vision and has proven powerful. 3D computer vision, developed under the Marr framework, is flourishing again thanks to deep learning. Visual localization, a key task in 3D computer vision, has many applications in AR, VR, robot navigation, and driverless cars. Deep learning has been used in visual localization in various ways, in both end-to-end and non-end-to-end approaches. The main topics in visual localization include image feature extraction and description, image matching, image retrieval, 2D-3D matching, RANSAC, PnP, feature tracking, loop closure detection, SLAM, and bundle adjustment. All of these have been combined with deep learning.

The tutorial consists of five parts. The first part gives an overview of visual localization and shares some personal viewpoints on using deep learning in 3D vision. The second part introduces important literature on image feature extraction, description, and matching with deep learning. The third part reports work on camera pose determination from known 3D structure with deep learning, including image retrieval, 2D-3D matching, RANSAC, and PnP. The fourth part reports work on SLAM with deep learning, including feature tracking, loop closure detection, bundle adjustment, etc. As we will see, every aspect of visual localization has been combined with deep learning, which reflects free exploration of how to use it. The last part shares some personal views on existing problems in using deep learning; for example, some public datasets do not clearly specify training and test sets. Future trends in visual localization are also discussed.



The tutorial has five parts:

1. An overview of visual localization

Some fundamental knowledge of 3D vision and an overview of image-based localization are introduced. A complete classification of image-based localization, organized as a tree structure, is given. What may be the best way to use deep learning in 3D vision is also analyzed.


2. Image feature representation and matching with deep learning

Some important literature on image feature detectors and descriptors with deep learning is introduced, along with important literature on image matching with deep learning and on datasets.


3. Camera pose determination from known 3D structure knowledge with deep learning

Studies with deep learning that deserve attention are pointed out. The developments in camera localization in known environments are introduced. PnP and RANSAC using deep learning are also presented.


4. SLAM with deep learning

This part introduces some learning-based SLAM methods, recent datasets, and challenges. It also covers feature tracking in videos, loop closure detection, bundle adjustment, etc.


5. Problems and trends in view geometry with deep learning

Some personal views are shared on problems and trends in combining view geometry with deep learning.



Last Updated  by Yihong Wu, September 2020