I am currently working on weakly-supervised video object detection through sequence modeling.
Recurrent and Convolutional Neural Network : My current on-going research is about weakly-supervised end-to-end learning for video object detection. Our BMVC 2016 work is about the idea of using RNN to improve object detection.
Code for this RNN-based video object detection is available at Github.
I will be at the poster session of WiML in NIPS 2016 this year to talk about this work in details.
We propose a method for generating Video Object Proposals (VOP) by considering the spatial and temporal edge contents in a video volume. We show that these VOP can learn a better video object detector through fine-tuning AlexNet model on those proposals. Youtube-Video dataset with video object proposals achieves state-of-the art detection accuracy.
We also propose an alternative test time detection framework for faster temporally-consistent detection through propagating labels by spatio-temporal clustering of those VOPs in a streaming fashion.
 EdgeBoxes, ECCV14
 R-CNN, CVPR14
 Youtube-Objects dataset
 "Context Matters: Context Matters : Refining Object Detection in Video with Recurrent Neural Networks", BMVC 16