The Playment survey on task difficulty suggests that semantic segmentation is one of the most difficult tasks for our annotators. This inspired us to automate ground-truth annotation, both to reduce the effort required from our workforce and to use our resources more efficiently. To that end, we decided to leverage existing interactive segmentation methods to produce an a priori segmentation map, so that our annotators can correct the machine's decisions at will instead of labeling from scratch. This makes the approach an example of our human-in-the-loop philosophy for machine learning.
To begin with, the task involves assigning a label to every pixel of an input image. More often than not our images are high resolution, typically ranging from 720p to 4K.
Our automated solution is a deep learning implementation of a state-of-the-art interactive segmentation model. The approach relies on prior knowledge of the extreme points of the object of interest instead of a bounding box around the object. This information is injected as an additional input channel: a heatmap with peaks at the extreme points. Whereas traditional methods segment the full image end to end, this approach focuses on the object of interest to achieve better segmentation performance.
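To make the extra heatmap channel concrete, here is a minimal, dependency-free sketch. It assumes a 2D Gaussian is placed at each extreme point and that overlapping Gaussians are combined by taking the maximum; the function names, the `sigma` value, and the nested-list image layout are illustrative assumptions, not our production code.

```python
import math


def extreme_point_heatmap(height, width, points, sigma=10.0):
    """Return a height x width map with a 2D Gaussian centered on each (x, y) point."""
    heatmap = [[0.0] * width for _ in range(height)]
    for px, py in points:
        for y in range(height):
            for x in range(width):
                d2 = (x - px) ** 2 + (y - py) ** 2
                value = math.exp(-d2 / (2.0 * sigma ** 2))
                # Keep the strongest response where Gaussians overlap (assumed rule).
                if value > heatmap[y][x]:
                    heatmap[y][x] = value
    return heatmap


def add_heatmap_channel(rgb_image, heatmap):
    """Append the heatmap as a fourth channel to an H x W x 3 nested-list image."""
    return [
        [list(pixel) + [heat] for pixel, heat in zip(img_row, heat_row)]
        for img_row, heat_row in zip(rgb_image, heatmap)
    ]


# Hypothetical 20 x 20 black image with four clicks: left, right, top, bottom.
points = [(2, 10), (17, 9), (9, 1), (10, 18)]
image = [[(0, 0, 0)] * 20 for _ in range(20)]
stacked = add_heatmap_channel(image, extreme_point_heatmap(20, 20, points, sigma=2.0))
```

Each pixel of `stacked` now carries four values (R, G, B, heat), which is the kind of four-channel input a network of this type would consume. A real pipeline would use array operations rather than Python loops for images at 720p and above.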
The deep learning architecture is inspired by landmark works: it uses a ResNet backbone with atrous convolutions followed by a pyramid scene parsing module. Both the atrous convolutions introduced in the DeepLab works and PSPNet are designed to handle images at multiple scales, which makes the model robust to the input size. Although interactive segmentation approaches based on distance maps have been proposed, the current approach outperforms them in terms of intersection over union (IoU).
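For readers unfamiliar with the metric, IoU is the number of pixels shared by the predicted and ground-truth masks divided by the number of pixels in either. The sketch below is a plain-Python illustration on binary masks stored as nested lists; the function name and the convention of returning 1.0 when both masks are empty are assumptions made for this example.

```python
def iou(pred, target):
    """Intersection over union of two binary masks of the same shape."""
    intersection = 0
    union = 0
    for pred_row, target_row in zip(pred, target):
        for p, t in zip(pred_row, target_row):
            p, t = bool(p), bool(t)
            intersection += p and t
            union += p or t
    # Assumed convention: two empty masks count as a perfect match.
    return intersection / union if union else 1.0


pred = [[1, 1],
        [0, 0]]
target = [[1, 0],
          [1, 0]]
score = iou(pred, target)  # one shared pixel out of three labeled pixels
```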
The model requires a minimum of 4 extreme points to produce a segmentation map, and in our experience 4 points are what we use to define an object. Unseen classes are ubiquitous in our data. Cross-dataset evaluation studies have demonstrated that the solution is class-agnostic (the table below shows better performance with extreme points on the GrabCut dataset, whose classes are unseen with respect to the PASCAL training dataset). This is why it is our preferred approach: it lets us automate the annotation of unseen classes in the images.
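As an illustration of what the four extreme points are, the following sketch derives them from an existing binary mask, which is one way such clicks could be simulated when evaluating on a labeled dataset. The function name, the (x, y) ordering, and the tie-breaking behavior (the first pixel found in row-major order wins) are assumptions for this example; in the tool itself the points come from the annotator.

```python
def extreme_points(mask):
    """Return [left, right, top, bottom] as (x, y) tuples for a binary mask."""
    coords = [(x, y) for y, row in enumerate(mask) for x, v in enumerate(row) if v]
    if not coords:
        raise ValueError("mask contains no foreground pixels")
    left = min(coords, key=lambda c: c[0])
    right = max(coords, key=lambda c: c[0])
    top = min(coords, key=lambda c: c[1])
    bottom = max(coords, key=lambda c: c[1])
    return [left, right, top, bottom]


# Hypothetical diamond-shaped object on a 5 x 5 grid.
mask = [
    [0, 0, 1, 0, 0],
    [0, 1, 1, 1, 0],
    [1, 1, 1, 1, 1],
    [0, 1, 1, 1, 0],
    [0, 0, 1, 0, 0],
]
clicks = extreme_points(mask)
```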
Automating this tool has significantly reduced the time spent annotating images for semantic segmentation, and it has helped us deliver reliable annotations that meet Playment's quality assurance standards.
Our Semi-automatic Segmentation tool ultimately makes it simpler for our workforce to produce pixel-perfect annotations much faster, so we can continue to label all of your semantic segmentation data at best-in-class accuracy and cost. We are excited to continue exploring other ML techniques to improve our data labeling platform and annotation tools. Schedule your 1-1 demo to learn more from our experts today!