Qualcomm Multimedia R&D, Fall Internship, 2016

I did a Fall internship at Qualcomm Multimedia R&D on deep-learning-based object detection, working with Gokce Dane and Vasudev Bhaskaran.

Our paper "LCDet: Low-complexity fully convolutional Neural Networks for Object Detection in Embedded Systems" is appearing in CVPRW 2017, and another manuscript, "Low-complexity Object Detection with Deep Convolutional Neural Network for Embedded System", has been accepted for oral presentation at the SPIE Symposium.

Google Research and Machine Intelligence, Summer Internship, 2016

I had an awesome summer internship at Google's Research and Machine Intelligence with Maxwell Collins and Matthew Brown!

This work is about pose-conditioned person instance segmentation, available here.

Microsoft Research, Summer Internship, 2015


I am glad to have pursued a summer 2015 research internship in the Graphics group of Microsoft Research, Redmond, with Brian Guenter. The project was about self-calibrating eye tracking for virtual reality systems.

This work received the Best Paper Award in WACV 2017.

Deep Learning: Video Object Detection

In this project, we try to detect temporally consistent moving or static objects in video.


Recurrent and Convolutional Neural Networks: My ongoing research is about weakly supervised end-to-end learning for video object detection. My two months of working with Python, Theano and Lasagne have been enriching and enjoyable so far. Our BMVC 2016 work uses an RNN to improve object detection.

Code for this RNN-based video object detection is available on GitHub.

I presented a poster on this work at the NIPS WiML workshop in December 2016.

My project on deep-learning-based video object detection is as follows. We quantify spatio-temporal edge content to generate temporally consistent object proposals in videos. We then cluster those video object proposals in a streaming spatio-temporal volume to enable propagation of object class labels. For recognition, we fine-tune the classifier for the YouTube-Objects dataset with the Caffe deep learning toolbox.

This work is published in WACV 2016. The arXiv preprint is available. Source code for streaming clustering of Video Object Proposals (VOP) is available on GitHub.


Beyond Semantic Image Segmentation

We explore multiclass image-sequence segmentation. We have evaluated several multiclass image segmentation methods and looked at improving video multiclass segmentation.

We proposed an improved framework for dense-CRF-based multiclass video segmentation with higher-order clique potentials. We presented our poster at the 2015 CVPR workshop, WiCV. The extended abstract is available here. This work also appeared in ISOCC 2015. Source code for the improved video inference can be found on GitHub.

Earlier, we implemented our framework for dense-CRF-based multiclass video segmentation, which improves accuracy over multiclass image segmentation with no additional time overhead. This work is based on "Large-Scale Semantic Co-Labeling of Image Sets" by Jose Alvarez et al. Source code for this can be found on GitHub.


StreamGBH+: Improved Streaming Hierarchical Video Segmentation

In this project, we segment a video in a hierarchical, streaming fashion. We augment streamGBH, from the University at Buffalo, with bilateral filtering and motion features, and achieve significant improvement in segmentation quality and accuracy. We call this approach streamGBH+. Our work on streamGBH+ is published in WACV 2014.

Source code can be found on GitHub.


Hierarchical Motion Layer Segmentation

We proposed an online hierarchical motion-layer-based segmentation framework. We begin with oversegmentations and group them iteratively into layers of a hierarchy based on geometric distances between the regions. The oversegmentations can take the form of image blocks, superpixels, or any other segmentation. The geometric distance is evaluated as a directed divergence. Our work on the online hierarchical motion segmentation framework is published in WACV 2014.

Source code can be found on GitHub.


Prior Industry R&D Projects

I worked in the Advanced System Technology group of STMicroelectronics for over six years. The following are some of the topics I worked on: 2D-to-3D video conversion; object tracking; tracking-based compression; advanced video transcoding and rate control algorithms.

I worked on the TraceViewer and MP4 analyzer of Vega at Interra Systems.