Machine Vision Group

National Laboratory of Pattern Recognition

Institute of Automation, Chinese Academy of Sciences

  

Latest Research Progress

UCAS "Computer Vision" Course Lecture Notes (Chapters 1-3) [Chapter 1][Chapter 2][Chapter 3]
Zhanyi Hu

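Abstract: These are Chapters 1-3 of the lecture notes for "Computer Vision", a spring-semester course for master's students at the University of Chinese Academy of Sciences (UCAS), written by Research Professor Zhanyi Hu.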

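With the spread of the Internet, the way people read textbooks and reference literature has changed fundamentally. Few people now seem to study a textbook closely from cover to cover; instead, readers tend to search online for the specific material they need. For ease of reading and reference, the "Computer Vision" lecture notes are therefore released chapter by chapter.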

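Almost every university now has researchers working on computer vision, yet many students, and teachers as well, have never taken a systematic computer vision course, particularly one covering the material that predates the "deep learning boom". In the author's view, many computer vision researchers today seem scarcely aware of David Marr, the founder of the field, or of the computational theory of vision he proposed. To offer some reference and help to those interested, and as one researcher's way of giving back to society, the lecture notes will be posted on the group's homepage as they are completed, free for everyone to download and read (http://vision.ia.ac.cn/zh/progress.html).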

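These lecture notes summarize some of the author's insights from more than 30 years of research in computer vision; criticism and corrections are welcome. The author has long been supported by the National Natural Science Foundation of China, the Ministry of Science and Technology, the Chinese Academy of Sciences, and UCAS, and gratefully acknowledges this support.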

  

  
Statistics of Visual Responses to Object Stimuli from Primate AIT Neurons to DNN Neurons
Qiulei Dong, Hong Wang, Zhanyi Hu
Neural Computation 2017

Deep neural networks (DNNs) now achieve image object categorization performance comparable to that of humans, but their exceptionally good categorization ability is not well understood. Recently, a goal-driven paradigm was proposed for understanding the visual object recognition pathway [DiCarlo et al. 2016]. It advocates that, by controlling only the last layer's categorization performance during the learning phase of a hierarchical linear-nonlinear network, not only can the last layer's output quantitatively predict IT neuron responses, but the intermediate layers can also automatically predict the responses of intermediate visual areas such as V4. In this work, we explore whether DNN neurons possess image object representational statistics similar to those of monkey IT neurons, in particular when the network becomes deeper and the number of image categories becomes larger, using VGG19, a typical deep network of 19 layers. Lehky et al. [2011, 2014] systematically investigated the response statistics of monkey IT neurons with three measures: single-neuron response selectivity, population response sparseness, and the intrinsic dimensionality of the neural object representation. We use the same three measures to evaluate DNN neuron responses to images in ImageNet, which contains about a million images in 1000 categories. Our results show that VGG19 neurons have response statistics to image objects that differ considerably from those of the IT neurons in [Lehky et al. 2011, 2014], which seems to indicate that a good hierarchical categorization network does not necessarily require response statistics similar to those of IT neurons.
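As a rough illustration of the three measures named in the abstract, the sketch below computes them on a response matrix of shape (neurons, stimuli). The abstract does not give the formulas, so the choices here are assumptions: excess kurtosis stands in for both selectivity and sparseness, and a PCA-based count of components stands in for intrinsic dimensionality. All function names and the 90% variance threshold are hypothetical and not taken from the paper.

```python
import numpy as np


def excess_kurtosis(x, axis):
    """Excess kurtosis along `axis` (about 0 for Gaussian data)."""
    x = np.asarray(x, dtype=float)
    centered = x - x.mean(axis=axis, keepdims=True)
    variance = (centered ** 2).mean(axis=axis)
    fourth_moment = (centered ** 4).mean(axis=axis)
    return fourth_moment / variance ** 2 - 3.0


def single_neuron_selectivity(responses):
    """One value per neuron: how peaked its responses are across stimuli.

    Assumption: kurtosis is used as the selectivity index.
    """
    return excess_kurtosis(responses, axis=1)


def population_sparseness(responses):
    """One value per stimulus: how peaked the responses are across neurons.

    Assumption: kurtosis is used as the sparseness index.
    """
    return excess_kurtosis(responses, axis=0)


def pca_dimensionality(responses, variance_fraction=0.9):
    """Number of principal components needed to reach `variance_fraction`.

    Assumption: a simple PCA stand-in for intrinsic dimensionality.
    Each stimulus is treated as a point in neuron space.
    """
    points = np.asarray(responses, dtype=float).T
    points = points - points.mean(axis=0, keepdims=True)
    singular_values = np.linalg.svd(points, compute_uv=False)
    explained = singular_values ** 2
    cumulative = np.cumsum(explained) / explained.sum()
    return int(np.searchsorted(cumulative, variance_fraction) + 1)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Synthetic stand-in for recorded or simulated responses.
    demo = rng.normal(size=(100, 500))
    print(single_neuron_selectivity(demo).mean())
    print(population_sparseness(demo).mean())
    print(pca_dimensionality(demo))
```

With real data, `responses` would hold the activations of one network layer (or recorded firing rates) for a common stimulus set, so that the same statistics can be compared between model and cortex.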

 
Commentary: Using Goal-Driven Deep Learning Models to Understand Sensory Cortex
Qiulei Dong, Hong Wang, Zhanyi Hu
Frontiers in Computational Neuroscience 2018

Abstract: Recently, a goal-driven approach to modeling sensory cortex was proposed in (Yamins and DiCarlo, Nature Neuroscience 2016). The basic idea is to first optimize a hierarchical convolutional neural network (HCNN) to perform an ethologically relevant task and then, once the network parameters have been fixed, to compare the outputs of different layers of the network with neural data. The success of this approach is exemplified by the results in (Yamins et al., PNAS 2014), where a 4-layer HCNN called HMO was used to predict IT neuron spikes in response to image object stimuli. In Hong et al. (Nature Neuroscience 2016), under the same approach, a 6-layer HCNN was trained on ImageNet and successfully predicted category-orthogonal object properties along the ventral stream. In this commentary, we show that, because of the inherent divergent feature learning phenomenon in HCNN learning exposed by Li et al. (ICLR 2016), the goal-driven approach should be used with special care in understanding sensory cortex; in other words, its generality for modeling cortex should not be overestimated.
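The comparison step described above (fix the network, then compare a layer's outputs with neural data) can be sketched as follows. This is a minimal sketch under stated assumptions, not the procedure of the cited papers: it assumes a ridge regression from layer features to recorded responses, scored by held-out Pearson correlation per neuron. The function names, the `alpha` penalty, and the simple train/test split are hypothetical.

```python
import numpy as np


def ridge_fit(features, targets, alpha=1.0):
    """Closed-form ridge regression; returns weights (n_features, n_targets)."""
    n_features = features.shape[1]
    gram = features.T @ features + alpha * np.eye(n_features)
    return np.linalg.solve(gram, features.T @ targets)


def layer_predictivity(layer_features, neural_responses, n_train, alpha=1.0):
    """Held-out correlation between predicted and recorded responses.

    layer_features:   (n_stimuli, n_units) outputs of one fixed network layer
    neural_responses: (n_stimuli, n_neurons) recorded responses
    The first n_train stimuli are used for fitting, the rest for scoring.
    """
    x_train, x_test = layer_features[:n_train], layer_features[n_train:]
    y_train, y_test = neural_responses[:n_train], neural_responses[n_train:]

    # Center with training statistics only, so the test set stays untouched.
    x_mean, y_mean = x_train.mean(axis=0), y_train.mean(axis=0)
    weights = ridge_fit(x_train - x_mean, y_train - y_mean, alpha)
    predicted = (x_test - x_mean) @ weights + y_mean

    # Pearson correlation per neuron on the held-out stimuli.
    pred_c = predicted - predicted.mean(axis=0)
    true_c = y_test - y_test.mean(axis=0)
    denom = np.sqrt((pred_c ** 2).sum(axis=0) * (true_c ** 2).sum(axis=0))
    return (pred_c * true_c).sum(axis=0) / denom


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Synthetic stand-ins: responses are a noisy linear readout of features.
    features = rng.normal(size=(200, 10))
    readout = rng.normal(size=(10, 3))
    responses = features @ readout + 0.1 * rng.normal(size=(200, 3))
    print(layer_predictivity(features, responses, n_train=150))
```

Running the same scoring for several layers of one network gives a layer-by-layer predictivity profile; the commentary's caution is that such profiles may depend on which features a particular training run happens to learn.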