The Deep Model (TDM)

This is my thesis project. We use deep neural network models to explain experimental findings related to face, object, and scene recognition. We first summarize the power of deep neural networks for encoding and decoding neural responses across a variety of tasks. [J Neuroscience 2015 paper] We then use TDM to model the contributions of central versus peripheral vision in scene, object, and face recognition. [CogSci 2016 Paper] Finally, we propose and validate a hypothesis for why peripheral vision holds an advantage in scene recognition. [In Review]

Basic-level Categorization Facilitates Visual Object Recognition

We propose a network optimization strategy inspired both by the developmental trajectory of children's visual object recognition and by Bar (2003), who hypothesized that basic-level information is carried along the fast magnocellular pathway through the prefrontal cortex (PFC) and then projected back to inferior temporal cortex (IT), where subordinate-level categorization is achieved. We instantiate this idea by first training a deep CNN on basic-level object categorization, and then training it on subordinate-level categorization. Applying this strategy to AlexNet (Krizhevsky et al., 2012) on the ILSVRC 2012 dataset raises top-5 accuracy from 80.13% to 82.14%, demonstrating the effectiveness of the method. We also show that subsequent transfer learning on smaller datasets gives superior results. [ICLR 2016 Workshop Paper] [Download pretrained models]
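The basic-then-subordinate idea can be sketched in miniature, without a CNN or ILSVRC. The toy below (all data and hyperparameters are illustrative, not the paper's setup) trains a linear softmax classifier on coarse labels first, then initializes each subordinate class with its basic-level parent's weights and fine-tunes on the fine labels:

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic hierarchy: 2 basic classes, each with 2 subordinate classes.
centers = np.array([[-3.0, -1.0], [-3.0, 1.0],   # basic class 0
                    [ 3.0, -1.0], [ 3.0, 1.0]])  # basic class 1
fine_y = np.repeat(np.arange(4), 100)
basic_y = fine_y // 2
X = centers[fine_y] + 0.3 * rng.standard_normal((400, 2))

def softmax(z):
    z = z - z.max(axis=1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

def train(X, y, W, b, lr=0.5, steps=300):
    n, k = len(X), W.shape[1]
    onehot = np.eye(k)[y]
    for _ in range(steps):
        p = softmax(X @ W + b)
        g = (p - onehot) / n          # cross-entropy gradient
        W -= lr * X.T @ g
        b -= lr * g.sum(axis=0)
    return W, b

# Stage 1: basic-level (2-way) categorization.
W1, b1 = train(X, basic_y, np.zeros((2, 2)), np.zeros(2))

# Stage 2: subordinate-level (4-way) categorization; each fine class
# inherits its basic-level parent's weights as initialization.
W2 = np.repeat(W1, 2, axis=1)
b2 = np.repeat(b1, 2)
W2, b2 = train(X, fine_y, W2, b2)

acc = (softmax(X @ W2 + b2).argmax(axis=1) == fine_y).mean()
print(f"subordinate-level accuracy: {acc:.2f}")
```

In the actual experiments the same principle applies to AlexNet's full weight stack, with the final classification layer re-sized between stages.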

Modeling Object Recognition Pathway: A Deep Hierarchical Model

To recognize objects, the human visual system processes information through a network of hierarchically organized brain regions. In this work, we use a hierarchical ICA algorithm to automatically learn the visual features that account for responses in early visual cortex. We then continue modeling the object recognition pathway using Gnostic Fields. The resulting biologically inspired model not only develops representations similar to those in primary visual cortex, but also performs well on standard computer vision object recognition benchmarks. [CogSci 2015 Paper]
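For readers unfamiliar with ICA, the core operation can be illustrated in a few lines. This is a minimal symmetric FastICA (tanh nonlinearity) on a classic two-source blind separation problem; it is not the hierarchical ICA model of the paper, just a sketch of the principle of recovering statistically independent components:

```python
import numpy as np

rng = np.random.default_rng(0)

# Two independent non-Gaussian sources (the classic ICA setting).
t = np.linspace(0, 8, 2000)
S = np.vstack([np.sign(np.sin(3 * t)),          # square wave
               rng.uniform(-1, 1, t.size)])     # uniform noise
A = np.array([[1.0, 0.6], [0.5, 1.0]])          # mixing matrix
X = A @ S                                       # observed mixtures

# Whiten: zero mean, unit covariance.
X = X - X.mean(axis=1, keepdims=True)
d, E = np.linalg.eigh(np.cov(X))
Z = E @ np.diag(d ** -0.5) @ E.T @ X

# Symmetric FastICA with tanh nonlinearity.
W = rng.standard_normal((2, 2))
for _ in range(200):
    WZ = W @ Z
    g, gp = np.tanh(WZ), 1 - np.tanh(WZ) ** 2
    W = (g @ Z.T) / Z.shape[1] - np.diag(gp.mean(axis=1)) @ W
    # Decorrelate: W <- (W W^T)^{-1/2} W
    d2, E2 = np.linalg.eigh(W @ W.T)
    W = E2 @ np.diag(d2 ** -0.5) @ E2.T @ W

recovered = W @ Z
# Each recovered component should correlate strongly with one source.
corr = np.abs(np.corrcoef(np.vstack([S, recovered]))[:2, 2:])
print(corr.round(2))
```

Applied to whitened natural-image patches instead of mixed signals, the same procedure yields localized, oriented, bandpass filters resembling V1 simple-cell receptive fields.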

Modeling the Relationship Between Face and Object Recognition

Are face and object recognition independent? A recent study by Gauthier et al. (2014) suggests they are not, and that the relationship is moderated by experience with the object categories. Using a neurocomputational model, we show that, as in the data, the shared variance between the performance on faces and the performance on subordinate-level object categorization increases as experience grows. Our analysis of the hidden unit representations suggests that FFA contains a "spreading" transform that moves similar objects apart in representational space. [CogSci 2014 Paper][J Cog Neuro 2016 Paper]

A Computational Model of the Development of Face Processing

Research on the development of contrast sensitivity shows that human infants can only perceive low spatial frequency information in visual stimuli, and that their acuity improves gradually with age. In addition, the right hemisphere develops earlier than the left. Here we show that these constraints, coupled with the infant's drive to individuate faces, lead naturally to the right-hemisphere and low-spatial-frequency bias for face processing. We propose a neurocomputational model that accounts for this developmental trend in face and object recognition, using a modular neural network based on Dailey and Cottrell (1999).  [CogSci 2013 Paper]
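The acuity constraint is straightforward to simulate: an infant's limited contrast sensitivity can be approximated by low-pass filtering the stimulus, with the frequency cutoff rising over developmental time. A minimal sketch (the cutoff schedule and random "stimulus" are illustrative, not the model's actual training regime):

```python
import numpy as np

rng = np.random.default_rng(0)

def low_pass(image, cutoff):
    """Keep only spatial frequencies below `cutoff` (cycles/image)."""
    F = np.fft.fftshift(np.fft.fft2(image))
    h, w = image.shape
    yy, xx = np.mgrid[-h // 2:h - h // 2, -w // 2:w - w // 2]
    mask = np.sqrt(xx ** 2 + yy ** 2) <= cutoff
    return np.real(np.fft.ifft2(np.fft.ifftshift(F * mask)))

# Random texture standing in for a face stimulus.
img = rng.standard_normal((64, 64))

# Simulated developmental schedule: acuity (cutoff) rises with age,
# so progressively more high-frequency detail reaches the model.
for cutoff in (4, 8, 16, 32):
    blurred = low_pass(img, cutoff)
    print(cutoff, round(float(np.var(blurred)), 3))
```

Raising the cutoff monotonically restores image detail (variance), so a network trained on this schedule sees coarse configural information first and fine detail later.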

Real-time Automatic Object Detector Using Saliency

Can we use a camera or cellphone to detect specific objects automatically, without being an expert in computer vision or pattern recognition? That's what we do! To obtain an accurate object region, we use discriminant saliency to automatically crop the region of interest (ROI). After building the positive and negative datasets, we train the object detector with the Adaptive ECBoost algorithm. We combine temporal consistency with a Kalman filter to reduce the false positive rate. We also embed a whole class of detectors (speed limit signs, cups, chairs, balloons, etc.) on an Android cellphone to achieve real-time performance. See demo here. [arXiv paper]
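The temporal-consistency idea can be sketched with a constant-velocity Kalman filter over the detected object's center: detections whose innovation (distance from the predicted position) is implausibly large are rejected as likely false positives, while the track coasts. This is a generic illustration under assumed noise parameters, not the exact filter used on the phone:

```python
import numpy as np

# Constant-velocity Kalman filter over an object's (x, y) center.
dt = 1.0
F = np.array([[1, 0, dt, 0], [0, 1, 0, dt],
              [0, 0, 1, 0], [0, 0, 0, 1]], float)   # state transition
H = np.array([[1, 0, 0, 0], [0, 1, 0, 0]], float)   # observe position only
Q = 0.01 * np.eye(4)                                # process noise (assumed)
R = 1.0 * np.eye(2)                                 # measurement noise (assumed)

x = np.array([0.0, 0.0, 1.0, 0.0])   # state: x, y, vx, vy
P = np.eye(4)

def step(x, P, z, gate=5.0):
    # Predict.
    x, P = F @ x, F @ P @ F.T + Q
    # Gate: reject detections with too-large Mahalanobis innovation.
    S = H @ P @ H.T + R
    nu = z - H @ x
    if nu @ np.linalg.solve(S, nu) > gate ** 2:
        return x, P, False                # coast; detection rejected
    # Update.
    K = P @ H.T @ np.linalg.inv(S)
    return x + K @ nu, (np.eye(4) - K @ H) @ P, True

# Object drifts right at ~1 px/frame; frame 3 has a spurious detection.
detections = [np.array([1.0, 0.1]), np.array([2.1, 0.0]),
              np.array([50.0, 40.0]),   # false positive
              np.array([4.0, -0.1])]
for z in detections:
    x, P, accepted = step(x, P, z)
    print(accepted, x[:2].round(2))
```

The same gating generalizes to full bounding boxes by tracking width and height alongside the center.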


Courses and Projects

2011-2012 Pattern Recognition and Machine Learning, Object Recognition, DSP, Parameter Estimation, Random Processes
  Convex Optimization Project: Robust Face Recognition via Sparse Representation
  AI-Learning Project: Logistic Regression, Conditional Random Field, Latent Dirichlet Allocation, Deep Learning
2012-2013 Statistical Learning, Design and Analysis of Algorithms
  Wavelets and Filter Banks Project: Object Recognition in Modular Neural Network via Efficient Filter Banks



2015 Fall CSE 190 Neural Networks
2016 Spring CSE 150 Artificial Intelligence