This is the project page for the following paper:

Improving Object Detection with Deep Convolutional Networks via Bayesian Optimization and Structured Prediction
Yuting Zhang, Kihyuk Sohn, Ruben Villegas, Gang Pan, Honglak Lee
IEEE Conference on Computer Vision and Pattern Recognition (CVPR), June 2015. doi: 10.1109/CVPR.2015.7298621
Oral presentation & 1st Winner of CV Community Top Paper Award: CVPR 2015 (OpenCV’s People’s Vote Winning Papers) [link]
[paper (main, supp.)] [arXiv] [project (code & model)] [slides 7M (high-res 45M)] [poster]

Object detection systems based on the deep convolutional neural network (CNN) have recently made groundbreaking advances on several object detection benchmarks. While the features learned by these high-capacity neural networks are discriminative for categorization, inaccurate localization is still a major source of error for detection. Building upon high-capacity CNN architectures, we address the localization problem by 1) using a search algorithm based on Bayesian optimization that sequentially proposes candidate regions for an object bounding box, and 2) training the CNN with a structured loss that explicitly penalizes the localization inaccuracy. In experiments, we demonstrate that each of the proposed methods improves the detection performance over the baseline method on the PASCAL VOC 2007 and 2012 datasets. Furthermore, the two methods are complementary and significantly outperform the previous state-of-the-art when combined.

@inproceedings{zhang2015improving,
  author={Yuting Zhang and Kihyuk Sohn and Ruben Villegas and Gang Pan and Honglak Lee},
  booktitle={{IEEE} Conference on Computer Vision and Pattern Recognition ({CVPR})},
  title={Improving Object Detection with Deep Convolutional Networks via {Bayesian} Optimization and Structured Prediction},
  year={2015},
  month={June},
  doi={10.1109/CVPR.2015.7298621}
}

1. Code & Models

You can obtain the code and more information at my GitHub repository: 

You can also obtain the code in a tarball at this site (may not include the latest updates):

  • stable-v1: [tar.gz (6.3M)]

2. How does the Bayesian optimization based fine-grained search (FGS) work?

2.1. An illustrative example for FGS 
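
As a rough illustration of the idea in the abstract (a search based on Bayesian optimization that sequentially proposes candidate regions for a bounding box), the sketch below runs a small Gaussian-process search over box coordinates. It is a toy under stated assumptions, not the released code: the score function is simply the IoU with a hidden box, standing in for a CNN detection score; the kernel, its length scale, the expected-improvement acquisition, the Gaussian sampling around the current best box, and all function names (`iou`, `gp_posterior`, `fine_grained_search`, ...) are hypothetical choices made for this example.

```python
import math
import numpy as np

def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    iy = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = ix * iy
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union

def rbf(A, B, length=0.2):
    """Squared-exponential kernel between the rows of A and B (assumed kernel)."""
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return np.exp(-0.5 * d2 / length ** 2)

def gp_posterior(X, y, Xs, noise=1e-4):
    """GP posterior mean and std at Xs, using a constant prior mean of mean(y)."""
    mu0 = y.mean()
    K = rbf(X, X) + noise * np.eye(len(X))
    Ks = rbf(X, Xs)
    mean = mu0 + Ks.T @ np.linalg.solve(K, y - mu0)
    var = 1.0 - np.einsum("ij,ij->j", Ks, np.linalg.solve(K, Ks))
    return mean, np.sqrt(np.clip(var, 1e-12, None))

def expected_improvement(mean, std, best):
    """Expected improvement over the best observed score (maximization)."""
    z = (mean - best) / std
    cdf = 0.5 * (1.0 + np.vectorize(math.erf)(z / math.sqrt(2.0)))
    pdf = np.exp(-0.5 * z ** 2) / math.sqrt(2.0 * math.pi)
    return (mean - best) * cdf + std * pdf

def fine_grained_search(score_fn, init_boxes, n_iter=20, n_cand=200, seed=0):
    """Sequentially propose one new box per iteration and score it."""
    rng = np.random.default_rng(seed)
    X = np.array(init_boxes, dtype=float)
    y = np.array([score_fn(b) for b in X])
    for _ in range(n_iter):
        best_box = X[np.argmax(y)]
        # Sample candidate boxes in a local neighborhood of the current best box.
        cand = best_box + rng.normal(scale=0.05, size=(n_cand, 4))
        # Discard degenerate boxes (non-positive width or height).
        cand = cand[(cand[:, 2] > cand[:, 0]) & (cand[:, 3] > cand[:, 1])]
        if len(cand) == 0:
            continue
        # Fit the GP to all scored boxes and pick the most promising candidate.
        mean, std = gp_posterior(X, y, cand)
        pick = cand[np.argmax(expected_improvement(mean, std, y.max()))]
        X = np.vstack([X, pick])
        y = np.append(y, score_fn(pick))
    return X[np.argmax(y)], y.max(), y

# Toy usage: boxes are in normalized image coordinates.
hidden_box = (0.30, 0.25, 0.70, 0.80)   # stand-in for the true object location
init = [(0.1, 0.1, 0.6, 0.6), (0.4, 0.3, 0.9, 0.9), (0.2, 0.2, 0.5, 0.9)]
best_box, best_score, history = fine_grained_search(lambda b: iou(b, hidden_box), init)
print("best box:", np.round(best_box, 3), "score:", round(float(best_score), 3))
```

Each iteration spends one expensive score evaluation on the box the surrogate model considers most promising, so the best score found can only stay the same or improve as the search proceeds. In a real detector the score would come from the trained network rather than from a known box.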


2.2. A real example for FGS


3. Detection result showcases