Video Object Segmentation

This paper tackles the task of semi-supervised video object segmentation, i.e., the separation of an object from the background in a video, given the mask of the first frame. We present One-Shot Video Object Segmentation (OSVOS), based on a fully-convolutional neural network architecture that is able to successively transfer generic semantic information, learned on ImageNet, to the task of foreground segmentation, and finally to learning the appearance of a single annotated object of the test sequence (hence one-shot). Although all frames are processed independently, the results are temporally coherent and stable. We perform experiments on two annotated video segmentation databases, which show that OSVOS is fast and improves the state of the art by a significant margin (79.8% vs 68.0%).
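The one-shot procedure described above (start from pretrained weights, fine-tune on the single annotated first frame, then segment every frame independently) can be illustrated with a toy sketch. This is not the OSVOS implementation: the fully-convolutional network is replaced here by a per-pixel logistic classifier on intensity, and all names (`fine_tune`, `segment_video`, `parent_params`) as well as the synthetic frames are assumptions made purely for illustration.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def predict_mask(params, frame):
    """Segment a single frame on its own; no other frame is consulted."""
    w, b = params
    return [[sigmoid(w * px + b) > 0.5 for px in row] for row in frame]


def fine_tune(params, frame, mask, steps=500, lr=0.5):
    """Adapt the starting weights to the one annotated frame (the 'one shot').

    Plain gradient descent on the mean logistic loss over all pixels.
    """
    w, b = params
    pixels = [(px, float(m))
              for row, mask_row in zip(frame, mask)
              for px, m in zip(row, mask_row)]
    n = len(pixels)
    for _ in range(steps):
        grad_w = grad_b = 0.0
        for x, y in pixels:
            err = sigmoid(w * x + b) - y
            grad_w += err * x
            grad_b += err
        w -= lr * grad_w / n
        b -= lr * grad_b / n
    return (w, b)


def segment_video(parent_params, frames, first_mask):
    """Fine-tune once on frame 0, then process every frame independently."""
    params = fine_tune(parent_params, frames[0], first_mask)
    return [predict_mask(params, frame) for frame in frames]


# Hypothetical stand-in for weights obtained from generic pretraining.
parent_params = (0.0, 0.0)

# Synthetic 1x4 grayscale frames: a bright object (0.9) on a dark background (0.1).
frames = [
    [[0.1, 0.9, 0.9, 0.1]],
    [[0.9, 0.9, 0.1, 0.1]],
    [[0.1, 0.1, 0.9, 0.9]],
]
first_mask = [[0, 1, 1, 0]]  # annotation given for the first frame only

masks = segment_video(parent_params, frames, first_mask)
```

Because `predict_mask` takes only the fine-tuned parameters and the current frame, the frames could be processed in any order or in parallel, which mirrors the frame-independent processing mentioned in the abstract.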

Publications

2018

Deep Perm-Set Net: Learn to Predict Sets with Unknown Permutation and Cardinality Using Deep Neural Networks.
S. Hamid Rezatofighi, Roman Kaskman, Farbod T. Motlagh, Qinfeng Shi, Daniel Cremers, Laura Leal-Taixe, and Ian Reid.
arXiv:1805.00613, 2018.
[pdf]

Lifting Layers: Analysis and Applications.
Peter Ochs, Tim Meinhardt, Laura Leal-Taixe, and Michael Moeller.
European Conference on Computer Vision (ECCV), 2018.
[pdf]

Deep Appearance Maps.
Maxim Maximov, Tobias Ritschel, and Mario Fritz.
arXiv:1804.00863, 2018.
[pdf]

LIME: Live Intrinsic Material Estimation.
Abhimitra Meka, Maxim Maximov, Michael Zollhoefer, Avishek Chatterjee, Hans-Peter Seidel, Christian Richardt, and Christian Theobalt.
Conference on Computer Vision and Pattern Recognition (CVPR), 2018.
[pdf]

Discrete-Continuous ADMM for Transductive Inference in Higher-Order MRFs.
Emanuel Laude, Jan-Hendrik Lange, Jonas Schuepfer, Csaba Domokos, Laura Leal-Taixe, Frank R. Schmidt, Bjoern Andres, and Daniel Cremers.
Conference on Computer Vision and Pattern Recognition (CVPR), 2018.
[pdf]

2017

Learning Proximal Operators: Using Denoising Networks for Regularizing Inverse Imaging Problems.
Tim Meinhardt, Michael Moeller, Caner Hazirbas, and Daniel Cremers.
IEEE International Conference on Computer Vision (ICCV), 2017.
[pdf] [code]

Fusion of Head and Full-Body Detectors for Multi-Object Tracking.
R. Henschel, L. Leal-Taixe, D. Cremers, and B. Rosenhahn.
Computer Vision and Pattern Recognition Workshops (CVPRW), 2017.
[pdf]

Tracking the Trackers: An Analysis of the State of the Art in Multiple Object Tracking.
L. Leal-Taixe, A. Milan, K. Schindler, D. Cremers, I. Reid, and S. Roth.
arXiv:1704.02781, 2017.
[pdf]

Deep Depth from Focus.
C. Hazirbas, L. Leal-Taixe, and D. Cremers.
arXiv:1704.01085, 2017.
[pdf] [challenge]

One-Shot Video Object Segmentation.
S. Caelles, K.-K. Maninis, J. Pont-Tuset, L. Leal-Taixe, D. Cremers, and L. Van Gool.
IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2017.
[pdf] [code]

Image-based localization using LSTMs for structured feature correlation.
F. Walch, C. Hazirbas, L. Leal-Taixe, T. Sattler, S. Hilsenbeck, and D. Cremers.
IEEE International Conference on Computer Vision (ICCV), 2017.
[pdf] [challenge]

Video Object Segmentation Without Temporal Information.
K.-K. Maninis, S. Caelles, Y. Chen, J. Pont-Tuset, L. Leal-Taixe, D. Cremers, and L. Van Gool.
Transactions on Pattern Analysis and Machine Intelligence (TPAMI), 2017.
[pdf]

2016

Tracking with multi-level features.
R. Henschel, L. Leal-Taixe, B. Rosenhahn, and K. Schindler.
arXiv:1607.07304, 2016.
[pdf]

Learning by tracking: siamese CNN for robust target association.
L. Leal-Taixe, C. Canton-Ferrer, and K. Schindler.
IEEE Conference on Computer Vision and Pattern Recognition Workshops (CVPRW). DeepVision: Deep Learning for Computer Vision, 2016.
[pdf]

MOT16: A benchmark for multi-object tracking.
A. Milan, L. Leal-Taixe, I. Reid, S. Roth, and K. Schindler.
arXiv:1603.00831, 2016.
[pdf] [challenge]

2015

Continuous Pose Estimation with a Spatial Ensemble of Fisher Regressors.
M. Fenzi, L. Leal-Taixe, J. Ostermann, and T. Tuytelaars.
IEEE International Conference on Computer Vision (ICCV), 2015.
[pdf]

Joint Tracking and Segmentation of Multiple Targets.
A. Milan, L. Leal-Taixe, K. Schindler, and I. Reid.
IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2015.
[pdf] [code]

MOTChallenge 2015: Towards a Benchmark for Multi-Target Tracking.
L. Leal-Taixe, A. Milan, I. Reid, S. Roth, and K. Schindler.
arXiv:1504.01942, 2015.
[pdf] [challenge]

Automatic tracking of vessel-like structures from a single starting point.
D.A.B. Oliveira, L. Leal-Taixe, R.Q. Feitosa, and B. Rosenhahn.
Computerized Medical Imaging and Graphics, 2015.
[pdf]

Pose Estimation of Object Categories in Videos Using Linear Programming.
M. Fenzi, L. Leal-Taixe, K. Schindler, and B. Rosenhahn.
IEEE Winter Conference on Applications of Computer Vision (WACV), 2015.
[pdf]

2014

Efficient multiple people tracking using minimum cost arborescences.
R. Henschel, L. Leal-Taixe, and B. Rosenhahn.
German Conference on Pattern Recognition (GCPR), 2014.
[pdf]

Learning an image-based motion context for multiple people tracking.
L. Leal-Taixe, M. Fenzi, A. Kuznetsova, B. Rosenhahn, and S. Savarese.
IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2014.
[pdf] [code]

Multiple object tracking with context awareness.
L. Leal-Taixe.
PhD Thesis, 2014.
[pdf]

2013

Class generative models based on feature regression for pose estimation of object categories.
M. Fenzi, L. Leal-Taixe, B. Rosenhahn, and J. Ostermann.
IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2013.
[pdf]

Real-time sign language recognition using a consumer depth camera.
A. Kuznetsova, L. Leal-Taixe, and B. Rosenhahn.
IEEE International Conference on Computer Vision (ICCV) Workshops. 3rd Workshop on Consumer Depth Cameras for Computer Vision (CDC4CV), 2013.
[pdf]

Pedestrian interaction in tracking: the social force model and global optimization methods.
L. Leal-Taixe and B. Rosenhahn.
Modeling, Simulation and Visual Analysis of Crowds: A multidisciplinary perspective, Springer Berlin Heidelberg, 2013.
[pdf]

2012

Outdoor and Large-Scale Real-World Scene Analysis.
F. Dellaert, J.-M. Frahm, M. Pollefeys, L. Leal-Taixe, and B. Rosenhahn.
Springer Berlin Heidelberg, 2012.
[pdf]

3D Object Recognition and Pose Estimation for Multiple Objects using Multi-Prioritized RANSAC and Model Updating.
M. Fenzi, R. Dragon, L. Leal-Taixe, B. Rosenhahn, and J. Ostermann.
German Conference on Pattern Recognition (GCPR), 2012.
[pdf]

Branch-and-price global optimization for multi-view multi-target tracking.
L. Leal-Taixe, G. Pons-Moll, and B. Rosenhahn.
IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2012.
[pdf] [code] [poster] [video]

Exploiting pedestrian interaction via global optimization and social behaviors.
L. Leal-Taixe, G. Pons-Moll, and B. Rosenhahn.
Outdoor and Large-Scale Real-World Scene Analysis, Springer Berlin Heidelberg, 2012.
[pdf]

Three dimensional tracking of exploratory behavior of barnacle cyprids using stereoscopy.
S. Maleschlijski, G. H. Sendra, A. Di Fino, L. Leal-Taixe, I. Thome, A. Terfort, N. Aldred, M. Grunze, A. S. Clare, B. Rosenhahn, and A. Rosenhahn.
Biointerphases, 2012.
[pdf]

Data-driven Manifolds for Outdoor Motion Capture.
G. Pons-Moll, L. Leal-Taixe, J. Gall, and B. Rosenhahn.
Outdoor and Large-Scale Real-World Scene Analysis, Springer Berlin Heidelberg, 2012.
[pdf]

2011

Everybody needs somebody: Modeling social and grouping behavior on a linear programming multiple people tracker.
L. Leal-Taixe, G. Pons-Moll, and B. Rosenhahn.
IEEE International Conference on Computer Vision (ICCV) Workshops. 1st Workshop on Modeling, Simulation and Visual Analysis of Large Crowds, 2011.
[pdf] [code] [video]

A stereoscopic approach for three dimensional tracking of marine biofouling microorganisms.
S. Maleschlijski, L. Leal-Taixe, S. Weisse, A. Di Fino, N. Aldred, A. S. Clare, G. H. Sendra, B. Rosenhahn, and A. Rosenhahn.
Microscopic Image Analysis with Applications in Biology (MIAAB), 2011.
[pdf]

Efficient and robust shape matching for model based human motion capture.
G. Pons-Moll, L. Leal-Taixe, T. Truong, and B. Rosenhahn.
German Conference on Pattern Recognition (GCPR), 2011.
[pdf]

Outdoor human motion capture using inverse kinematics and von Mises-Fisher sampling.
G. Pons-Moll, A. Baak, J. Gall, L. Leal-Taixe, M. Mueller, H.-P. Seidel, and B. Rosenhahn.
IEEE International Conference on Computer Vision (ICCV), 2011.
[pdf] [supplementary]

Understanding what we cannot see: automatic analysis of 4D digital in-line holographic microscopy data.
L. Leal-Taixe, M. Heydt, A. Rosenhahn, and B. Rosenhahn.
Video Processing and Computational Video, Springer Berlin Heidelberg, 2011.
[pdf]

2010

Classification of swimming microorganisms motion patterns in 4D digital in-line holography data.
L. Leal-Taixe, M. Heydt, S. Weisse, A. Rosenhahn, and B. Rosenhahn.
German Conference on Pattern Recognition (GCPR), 2010.
[pdf] [video]

2009

Automatic tracking of swimming microorganisms in 4D digital in-line holography data.
L. Leal-Taixe, M. Heydt, A. Rosenhahn, and B. Rosenhahn.
IEEE Workshops on Motion and Video Computing (WMVC), 2009.
[pdf]

Automatic segmentation of multi-stain histology images of arteries.
L. Leal-Taixe.
Master's Thesis, 2009.
[pdf]

Automatic segmentation of arteries in multi-stain histology images.
L. Leal-Taixe, A. U. Coskun, B. Rosenhahn, and D. Brooks.
World Congress on Medical Physics and Biomedical Engineering, 2009.
[pdf]