Research: Object Proposals and Video Object Detection

Ongoing work

I am currently working on weakly-supervised video object detection through sequence modeling.


Recurrent and Convolutional Neural Networks: My ongoing research is on weakly-supervised end-to-end learning for video object detection. Our BMVC 2016 work explores using an RNN to improve object detection in video.

Code for this RNN-based video object detector is available on GitHub.


I will be at the WiML poster session at NIPS 2016 this year to talk about this work in detail.

Video Object Detection

We propose a method for generating Video Object Proposals (VOPs) by considering the spatial and temporal edge content in a video volume. We show that these VOPs can be used to learn a better video object detector by fine-tuning an AlexNet model on them. On the Youtube-Objects dataset, the detector trained with video object proposals achieves state-of-the-art detection accuracy.

We also propose an alternative test-time detection framework for faster, temporally consistent detection. It propagates labels through spatio-temporal clustering of the VOPs in a streaming fashion.
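
To give a rough feel for streaming label propagation, here is a minimal Python sketch. It is not the published method: it assumes a simple greedy rule that links each new proposal to the cluster whose most recent box overlaps it most (by IoU), and lets a cluster inherit the first class label any of its members receives. The names StreamingClusterer, iou_threshold and max_gap are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass
from typing import List, Optional, Sequence, Tuple

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2)


def iou(a: Box, b: Box) -> float:
    """Intersection-over-union of two axis-aligned boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


@dataclass
class Cluster:
    last_box: Box                # most recent proposal assigned to the cluster
    last_frame: int              # frame index of that proposal
    label: Optional[str] = None  # class label propagated along the cluster


class StreamingClusterer:
    """Greedy, frame-by-frame clustering of proposals (illustrative assumption)."""

    def __init__(self, iou_threshold: float = 0.5, max_gap: int = 1):
        self.iou_threshold = iou_threshold
        self.max_gap = max_gap
        self.clusters: List[Cluster] = []

    def update(
        self,
        frame_idx: int,
        boxes: Sequence[Box],
        labels: Optional[Sequence[Optional[str]]] = None,
    ) -> List[Optional[str]]:
        """Assign this frame's proposals to clusters; return propagated labels."""
        if labels is None:
            labels = [None] * len(boxes)
        propagated = []
        for box, label in zip(boxes, labels):
            best, best_iou = None, self.iou_threshold
            for cluster in self.clusters:
                gap = frame_idx - cluster.last_frame
                # Only clusters last updated in a recent earlier frame qualify,
                # so each cluster takes at most one proposal per frame.
                if not 0 < gap <= self.max_gap:
                    continue
                overlap = iou(box, cluster.last_box)
                if overlap >= best_iou:
                    best, best_iou = cluster, overlap
            if best is None:
                # No sufficiently overlapping cluster: start a new one.
                best = Cluster(last_box=box, last_frame=frame_idx)
                self.clusters.append(best)
            else:
                best.last_box, best.last_frame = box, frame_idx
            if best.label is None and label is not None:
                best.label = label
            propagated.append(best.label)
        return propagated


clusterer = StreamingClusterer()
# Frame 0: a detector (not shown) has labelled only the first proposal.
frame0 = clusterer.update(0, [(0, 0, 10, 10), (50, 50, 60, 60)], ["cat", None])
# Frame 1: no detector is run; labels come from the clusters alone.
frame1 = clusterer.update(1, [(1, 1, 11, 11), (100, 100, 110, 110)])
```

In this sketch the detector only needs to run on some frames. Later proposals pick up their label from the cluster they join, which is where the speed-up and the temporal consistency would come from under these assumptions.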


The work was published at WACV 2016, and the latest version is available on arXiv. Source code for the streaming clustering of Video Object Proposals (VOPs) is available on GitHub.


Object detection on Youtube-Objects


[1] EdgeBoxes, ECCV14

[2] R-CNN, CVPR14

[3] Youtube-Objects dataset

[4] GRU

[5] "Context Matters: Refining Object Detection in Video with Recurrent Neural Networks", BMVC16