Dense Semantic Image Segmentation with Objects and Attributes

Shuai Zheng¹, Ming-Ming Cheng¹, Jonathan Warrell¹, Paul Sturgess¹, Vibhav Vineet¹, Carsten Rother², Philip H. S. Torr¹

¹Torr-Vision Group, University of Oxford    ²CV-Lab, TU Dresden

The concepts of objects and attributes are both important for describing images precisely since more informative verbal descriptions often contain both adjectives and nouns (e.g. ‘I see a shiny red chair’). In this project, we formulate the problem of joint visual attribute and object class image segmentation as a dense multi-labeling problem, where each pixel in an image can be associated with both an object-class and a set of visual attribute labels.


Fig. 1 Goal of the semantic image segmentation.


Fig. 2 Illustration of Factorial-CRF-based Semantic Segmentation for object classes and attributes. (a) shows the input image. (b) shows the ground truth mask image for object classes. (c) shows the attribute masks. (d) compares various CRF topologies, including a grid CRF, a fully connected CRF, and a hierarchical fully connected CRF.

Dataset for Semantic Image Segmentation for Objects and Attributes


Fig. 3 Extra annotation on the NYU dataset (aNYU: Attributes-augmented NYU Dataset), and attribute annotations on the aPascal and CORE datasets.

Fill in a simple survey to let us notify you when we have a new update.

aNYU dataset for semantic image segmentation with objects and visual attributes
aNYU is a dataset that augments the NYU v2 dataset with 11 additional visual attributes:

1: Wood (Material), 2: Painted (Material), 3: Paper (Material), 4: Glass (Material), 5: Brick (Material), 6: Metal (Material), 7: Flat (Shape), 8: Plastic (Material), 9: Textured (Material), 10: Glossy (Surface), 11: Shiny (Surface).

We have released this dataset (1449 images in total) together with a train/validation split, listed below. Alternatively, you can randomly shuffle the 1449 images and take the first 725 images for training, the next 100 images for validation, and the remaining 624 images for testing. You can also use the AttriMarker tool to annotate other attributes you are interested in.
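The random 725/100/624 split described above can be sketched in a few lines of Python. This is a minimal sketch, not the procedure used to build the released split: the function name `split_anyu`, the fixed seed, and the use of integer ids 1 to 1449 in place of real image names are all assumptions made for illustration.

```python
import random


def split_anyu(image_ids, seed=0, n_train=725, n_val=100):
    """Shuffle the ids and cut them into train / validation / test lists."""
    ids = list(image_ids)
    # A dedicated Random instance keeps the shuffle reproducible for a given seed.
    random.Random(seed).shuffle(ids)
    train = ids[:n_train]
    val = ids[n_train:n_train + n_val]
    test = ids[n_train + n_val:]  # whatever remains after train and val
    return train, val, test


# Hypothetical ids 1..1449; replace them with the real image names.
train, val, test = split_anyu(range(1, 1450))
print(len(train), len(val), len(test))  # 725 100 624
```

A fixed seed is used so that the same split can be regenerated later; results obtained with a random split are not directly comparable to results on the released split.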

This is a screenshot of the Qt-based AttriMarker software.
Tool to annotate the attributes for semantic segmentation (md5: 9fc5265950e027a529ee680977112f2f, 7.5MB)
Tool to annotate the attributes for semantic segmentation (md5: 01edd7ffa1d4d2c4c776b366a1fe4ee8, 7.3MB), with UserGuide4AttriMarker.pdf
aNYU.tar.gz (md5: 6660ea0d900c51ec14e0122352aa92ae, 111MB): Data
traintestsplit_aNYU_ImageSpirit.tar.gz (md5: 3a149fafecc82dc11c621c7de87d54bf, 40KB): Train/Val Split

NB: The color channels of the annotation encode the labels (R: random value for visualization, G: object_class_id, B: attribute_id). In the generated annotation, the first column is the region_id, the second column is the object_class_id, and the third and later columns are the attribute_ids.
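The column and channel layout described in the note above could be read as follows. This is only a sketch under stated assumptions: it assumes each annotation row holds integer columns separated by whitespace or commas, which the page does not specify, and the names `parse_region_line` and `decode_pixel` as well as the example values are hypothetical.

```python
def parse_region_line(line):
    """Parse one row: region_id, object_class_id, then zero or more attribute_ids.

    Assumption: columns are integers separated by whitespace or commas.
    """
    fields = [int(tok) for tok in line.replace(",", " ").split()]
    if len(fields) < 2:
        raise ValueError(f"expected at least 2 columns, got {len(fields)}")
    region_id, object_class_id, *attribute_ids = fields
    return {
        "region_id": region_id,
        "object_class_id": object_class_id,
        "attribute_ids": attribute_ids,
    }


def decode_pixel(rgb):
    """Return (object_class_id, attribute_id) from one (R, G, B) pixel.

    The R channel is only a random value for visualization, so it is ignored.
    """
    _, g, b = rgb
    return g, b


# Made-up example values, for illustration only.
print(parse_region_line("3 5 1 10"))
print(decode_pixel((200, 5, 10)))
```

Check the actual file layout in the released archive before relying on this parser.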

CORE dataset for semantic image segmentation with objects and visual attributes
In this paper, we use the CORE dataset for evaluating dense semantic image segmentation for objects and visual attributes (including Bare Metal, Feathers, Fur/Hair, Glass, Painted Metal/Plastic, Rubber, Scales, Skin, and Wood). This dataset contains 1059 images in total, split into 465 training images and 594 validation images.

CORE.rar (md5: 8ec5f2357bf3d2301a4a9018f5ca984e, 117MB)

trantestsplit_CORE.tar.gz (md5: 034819430f21ff16b09820b78fd69a3f, 1MB)

aPascal dataset for semantic image segmentation with objects and visual attributes
In this paper, we convert the aPascal detection dataset into a new aPascal dataset for segmentation by using the ground truth masks of the train/validation sets of the Pascal 2007-2012 datasets. The resulting dataset can be used to evaluate dense semantic image segmentation for objects and visual attributes. It contains 639 images in total, split into 326 training images and 313 validation images. The region-level attributes in this dataset are those annotated in the original aPascal dataset.

aPascal.tar (md5: bf3ae4591ae270992a2d2e7eb177339d, 107MB)

Train/Val split: link1, link2.
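The md5 values listed with the archives on this page can be used to check a download before unpacking it. The following is a minimal sketch using Python's standard `hashlib` module; the helper names `md5_of_file` and `verify`, and the assumption that the archives are saved under their original file names, are illustrative only.

```python
import hashlib

# md5 values copied from the dataset listings on this page.
EXPECTED_MD5 = {
    "aNYU.tar.gz": "6660ea0d900c51ec14e0122352aa92ae",
    "CORE.rar": "8ec5f2357bf3d2301a4a9018f5ca984e",
    "aPascal.tar": "bf3ae4591ae270992a2d2e7eb177339d",
}


def md5_of_file(path, chunk_size=1 << 20):
    """Compute the md5 hex digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify(path, name):
    """Return True if the file at `path` matches the md5 listed for `name`."""
    return md5_of_file(path) == EXPECTED_MD5[name]
```

For example, `verify("downloads/aNYU.tar.gz", "aNYU.tar.gz")` (with a hypothetical download path) should return True for an intact copy of the archive.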


Note: you might need to use the ALE library to create the pixel-wise unary potentials.

[1] Shuai Zheng, Ming-Ming Cheng, Jonathan Warrell, Paul Sturgess, Vibhav Vineet, Carsten Rother, Philip H. S. Torr. “Dense Semantic Image Segmentation with Objects and Attributes”, IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2014. (accepted) [bib][slides][poster][video]

[2] Shuai Zheng*, Ming-Ming Cheng*, Wen-Yan Lin, Vibhav Vineet, Paul Sturgess, Nigel Crook, Niloy Mitra, Philip H. S. Torr. “ImageSpirit: Verbal Guided Image Parsing”, ACM Transactions on Graphics, 2014. (* indicates equal contribution.) [bib][project][youtube][youku]

Related Works

[0] Ľubor Ladický, Chris Russell, Pushmeet Kohli, Philip H.S. Torr. Associative Hierarchical Random Fields. IEEE PAMI. 2014. (The Automatic Labeling Environment Library) [Stable Code][ALE1.01].

[1] A. Adams, J. Baek, and A. Davis. Fast High-Dimensional Filtering Using the Permutohedral Lattice, Eurographics 2010.

[2] Philipp Krähenbühl and Vladlen Koltun. Efficient Inference in Fully Connected CRFs with Gaussian Edge Potentials. NIPS 2011.

[3] A. Farhadi, I. Endres, D. Hoiem, and D.A. Forsyth, “Describing Objects by their Attributes”, CVPR 2009.

[4] Genevieve Patterson, James Hays. SUN Attribute Database: Discovering, Annotating, and Recognizing Scene Attributes. CVPR 2012.

[5] Ruiqi Guo and Derek Hoiem. Support Surface Prediction in Indoor Scenes. ICCV 2013.

[6] Sean Bell, Kavita Bala, Noah Snavely. Intrinsic Images in the Wild. ACM TOG, 2014.

[7] Joseph Tighe and Svetlana Lazebnik. “SuperParsing: Scalable Nonparametric Image Parsing with Superpixels”, International Journal of Computer Vision, 2013.

[8] Derek Hoiem, Alexei A. Efros, and Martial Hebert. Geometric Context from a Single Image. ICCV 2005.


This project is supported by EPSRC EP/I001107/2 “Scene Understanding using New Global Energy Models”, ERC HELIOS 2013-2018 Advanced Investigator Award “Towards Total Scene Understanding using Structured Models”, and Google Research Award 2012-2013. Carsten Rother was awarded an ERC Consolidator Grant.

3 thoughts on “Dense Semantic Image Segmentation with Objects and Attributes”

  • at

    In the paper “Dense Semantic Image Segmentation with Objects and Attributes”, why is it necessary to handle attributes at different levels? Would it not be enough to use only the pixel level?

  • at

    Good morning. I’m a student from Ecuador. I’m doing research about object recognition and found your work really interesting and useful. I really appreciate that information is available in your site, but it would be helpful for my project if you could share with me the Windows Executable and the source code if it is possible. My project is about semantic web and digital television. I have just started working on this so I will be waiting for your answer.

    Thank you very much.

