Dense Semantic Image Segmentation with Objects and Attributes
Shuai Zheng1 Ming-Ming Cheng1 Jonathan Warrell1 Paul Sturgess1 Vibhav Vineet1 Carsten Rother2 Philip H. S. Torr1
1Torr-Vision Group, University of Oxford 2CV-Lab, TU Dresden
The concepts of objects and attributes are both important for describing images precisely since more informative verbal descriptions often contain both adjectives and nouns (e.g. ‘I see a shiny red chair’). In this project, we formulate the problem of joint visual attribute and object class image segmentation as a dense multi-labeling problem, where each pixel in an image can be associated with both an object-class and a set of visual attribute labels.
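The dense multi-labeling formulation above can be sketched as a simple data structure: each pixel carries exactly one object-class label plus a binary vector of attribute labels. The array shapes and label counts below are purely illustrative, not the paper's actual configuration.

```python
import numpy as np

H, W = 4, 6            # illustrative image size
NUM_ATTRIBUTES = 11    # e.g. the 11 aNYU visual attributes

# One object-class label per pixel...
object_labels = np.zeros((H, W), dtype=np.int64)

# ...and a *set* of attribute labels per pixel, stored as one binary
# mask per attribute (a pixel may carry several attributes at once).
attribute_labels = np.zeros((H, W, NUM_ATTRIBUTES), dtype=bool)

# Example: pixel (0, 0) is object class 3 ('chair', say) and carries
# attributes 0 and 10 ('Wood' and 'Shiny' in the aNYU numbering).
object_labels[0, 0] = 3
attribute_labels[0, 0, [0, 10]] = True
```

Note the contrast with single-label segmentation: the object channel is a categorical map, while the attribute channel is a multi-label mask, which is why the two cannot be collapsed into one label per pixel.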
Fig. 1 Goal of semantic image segmentation.
Fig. 2 Illustration of factorial-CRF-based semantic segmentation for object classes and attributes. (a) Input image. (b) Ground-truth mask for object classes. (c) Attribute masks. (d) Comparison of CRF topologies: a grid CRF, a fully connected CRF, and a hierarchical fully connected CRF.
Dataset for Semantic Image Segmentation for Objects and Attributes
Fig. 3 Extra annotations on the NYU dataset (aNYU: attribute-augmented NYU dataset), and attribute annotations on the aPascal and CORE datasets.
Fill in a simple survey so that we can notify you when we release an update.
aNYU dataset for semantic image segmentation with objects and visual attributes
aNYU is a dataset that augments the NYU v2 dataset with 11 additional visual attributes:
1: Wood(Material) 2: Painted(Material) 3: Paper(Material) 4: Glass(Material) 5: Brick(Material) 6: Metal(Material) 7: Flat(Shape) 8: Plastic(Material) 9: Textured(Material) 10: Glossy(Surface) 11: Shiny(Surface).
We have released this dataset (1449 images in total, with the train/validation split given below). Alternatively, you can randomly shuffle the 1449 images and take the first 725 for training, the next 100 for validation, and the remaining 624 for testing. You can also use the AttriMarker tool to annotate other attributes you are interested in.
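The random 725/100/624 split described above can be sketched as follows. The seed and the use of integer image IDs are placeholders; substitute your own list of image filenames.

```python
import random

def split_anyu(image_ids, seed=0):
    """Randomly split the 1449 aNYU images into 725 training,
    100 validation, and 624 test images, as described above.
    The seed is an arbitrary choice for reproducibility."""
    ids = list(image_ids)
    random.Random(seed).shuffle(ids)
    train, val, test = ids[:725], ids[725:825], ids[825:]
    return train, val, test

train, val, test = split_anyu(range(1449))
```

Using a fixed seed makes the split reproducible across runs, which matters when comparing results against the released train/val split.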

Resource | md5 | size | Note
---|---|---|---
AttriMarker.zip | 9fc5265950e027a529ee680977112f2f | 7.5MB | Tool to annotate the attributes for semantic segmentation
AttriMarkersmi.zip | 01edd7ffa1d4d2c4c776b366a1fe4ee8 | 7.3MB | Tool to annotate the attributes for semantic segmentation; see UserGuide4AttriMarker.pdf
aNYU.tar.gz | 6660ea0d900c51ec14e0122352aa92ae | 111MB | Data
traintestsplit_aNYU_ImageSpirit.tar.gz | 3a149fafecc82dc11c621c7de87d54bf | 40KB | Train/Val split
NB: in the annotation images, the R channel is a random value used only for visualization, the G channel stores the object_class_id, and the B channel stores the attribute_id. In the generated annotation file, the first column is the region_id, the second column is the object_class_id, and the remaining columns are attribute_ids.
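The channel layout above can be decoded with a few lines of NumPy. This sketch assumes the annotation image has already been loaded into an H x W x 3 uint8 RGB array (with any image library); the function name is our own.

```python
import numpy as np

def decode_annotation(rgb):
    """Decode an aNYU annotation image given as an H x W x 3 uint8
    RGB array: R is a random visualization value (ignored here),
    G holds the object_class_id, and B holds the attribute_id."""
    rgb = np.asarray(rgb)
    object_ids = rgb[:, :, 1]     # G channel: per-pixel object class
    attribute_ids = rgb[:, :, 2]  # B channel: per-pixel attribute id
    return object_ids, attribute_ids
```

For example, load `xxx.png` with PIL via `np.asarray(Image.open(path).convert("RGB"))` and pass the array in.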
CORE dataset for semantic image segmentation with objects and visual attributes
In this paper, we use the CORE dataset to evaluate dense semantic image segmentation for objects and visual attributes (attributes: 1) Bare Metal, 2) Feathers, 3) Fur/Hair, 4) Glass, 5) Painted Metal/Plastic, 6) Rubber, 7) Scales, 8) Skin, 9) Wood). This dataset contains 1059 images in total, with the train/validation split given below: 465 training images and 594 validation images.
CORE.rar (md5: 8ec5f2357bf3d2301a4a9018f5ca984e, 117MB)
trantestsplit_CORE.tar.gz (md5: 034819430f21ff16b09820b78fd69a3f, 1MB)
aPascal dataset for semantic image segmentation with objects and visual attributes
In this paper, we transfer the aPascal detection dataset to a new aPascal segmentation dataset by inspecting the ground-truth masks of the train/validation sets of the Pascal VOC 2007-2012 datasets. The result is a new aPascal dataset for evaluating dense semantic image segmentation for objects and visual attributes. This dataset contains 639 images in total, with the train/validation split given below: 326 training images and 313 validation images. Region-level attributes in this dataset are those annotated in the original aPascal dataset.
aPascal.tar(md5: bf3ae4591ae270992a2d2e7eb177339d, 107MB)
Train/Val split: link1, link2.
Code: https://github.com/bittnt/ImageSpirit
Note: you might need to use the ALE library to create the pixel-wise unary potentials.
[1] Shuai Zheng, Ming-Ming Cheng, Jonathan Warrell, Paul Sturgess, Vibhav Vineet, Carsten Rother, Philip H. S. Torr. “Dense Semantic Image Segmentation with Objects and Attributes”, IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2014. [bib][slides][poster][video]
[2] Shuai Zheng, Ming-Ming Cheng, Wen-Yan Lin, Vibhav Vineet, Paul Sturgess, Nigel Crook, Niloy Mitra, Philip H. S. Torr. “ImageSpirit: Verbal Guided Image Parsing”, ACM Transactions on Graphics, 2014. (* indicates equal contribution.) [bib][project][youtube][youku]
Related Works
[0] Ľubor Ladický, Chris Russell, Pushmeet Kohli, Philip H.S. Torr. Associative Hierarchical Random Fields. IEEE PAMI. 2014. (The Automatic Labeling Environment Library) [Stable Code][ALE1.01].
[1] A. Adams, J. Baek, and A. Davis. Fast High-Dimensional Filtering Using the Permutohedral Lattice, Eurographics 2010.
[2] Philipp Krähenbühl and Vladlen Koltun. Efficient Inference in Fully Connected CRFs with Gaussian Edge Potentials. NIPS 2011.
[3] A. Farhadi, I. Endres, D. Hoiem, and D.A. Forsyth, “Describing Objects by their Attributes”, CVPR 2009.
[4] Genevieve Patterson, James Hays. SUN Attribute Database: Discovering, Annotating, and Recognizing Scene Attributes. CVPR 2012.
[5] Ruiqi Guo and Derek Hoiem. Support Surface Prediction in Indoor Scenes ICCV, 2013.
[6] Sean Bell, Kavita Bala, Noah Snavely. Intrinsic Images in the Wild. ACM TOG, 2014.
[7] Joseph Tighe and Svetlana Lazebnik “SuperParsing: Scalable Nonparametric Image Parsing with Superpixels,” International Journal of Computer Vision, 2013.
[8] Derek Hoiem, Alexei A. Efros, and Martial Hebert. Geometric Context from a Single Image. ICCV 2005.
Acknowledgment
This project is supported by EPSRC grant EP/I001107/2 “Scene Understanding using New Global Energy Models”, the ERC HELIOS Advanced Investigator Award 2013-2018 “Towards Total Scene Understanding using Structured Models”, and a Google Research Award 2012-2013. Carsten Rother was awarded an ERC Consolidator Grant.
In the paper “Dense Semantic Image Segmentation with Objects and Attributes”, why do you need to handle attributes at different levels? Would a single pixel-level treatment not be enough?
Good morning. I’m a student from Ecuador doing research on object recognition, and I found your work really interesting and useful. I really appreciate that this information is available on your site, but it would help my project if you could share the Windows executable and, if possible, the source code. My project is about the semantic web and digital television. I have just started working on this, so I will be waiting for your answer.
Thank you very much.
Geovanny
Hello, I am a fourth-year undergraduate student from Xi’an, China, and I have recently been reading your CVPR 2014 paper on semantic segmentation. May I make a bold request: I would like to study the code for this paper. Would you mind sending me a copy?
I would be very grateful!
One more small question: I cannot open your homepage under normal conditions; it only opens when I use a VPN. Is this a problem with your server, or is the site being blocked?
Best wishes!
Gao Junyu