Semantic Segmentation of Herbarium Specimens Using Deep Learning Techniques

Abstract

Automated identification of herbarium species is of great interest, as many herbarium collections are still unidentified while others need to be updated in line with recent taxonomic knowledge. One challenge in the automated identification of these species is visual noise such as plant information labels, color codes, and other scientific annotations, which are placed at varying locations on the herbarium mounting sheet. This noise needs to be removed before species identification models are applied, as it can significantly affect the models' performance. In this work we propose using a deep learning semantic segmentation model to remove the background noise from herbarium images. Two semantic segmentation models, DeepLab version three plus (DeepLabv3+) and Full-Resolution Residual Networks (FRRN-A), were applied and evaluated in this study. The results indicate that FRRN-A performed slightly better, with a mean Intersection over Union (IoU) of 99.2% on the test set compared to 98.1% for DeepLabv3+. The pixel-wise accuracy for the two classes (herbarium specimen and background) was 99.5% and 99.7%, respectively, with the FRRN-A model, while DeepLabv3+ segmented the herbarium specimen and the background with pixel-wise accuracies of 98.4% and 99.6%, respectively. This work suggests that deep learning semantic segmentation can be successfully applied as a pre-processing step to remove visual noise from herbarium images before classification models are applied.
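
To make the reported metrics concrete, the following is a minimal sketch of how mean IoU and per-class pixel-wise accuracy can be computed for a two-class mask, and how a predicted mask might then be used to blank out the background. It is not the authors' evaluation code: the function names, the label convention (0 = background, 1 = specimen), the reading of per-class pixel accuracy as the fraction of each class's true pixels that are labeled correctly, and the white fill value are all assumptions made for illustration.

```python
from typing import Dict, List, Sequence

# Assumed label convention (not stated in the abstract):
# 0 = background, 1 = herbarium specimen.
Mask = Sequence[Sequence[int]]


def confusion_counts(truth: Mask, pred: Mask, num_classes: int = 2) -> List[List[int]]:
    """counts[t][p] = number of pixels of true class t predicted as class p."""
    counts = [[0] * num_classes for _ in range(num_classes)]
    for truth_row, pred_row in zip(truth, pred):
        for t, p in zip(truth_row, pred_row):
            counts[t][p] += 1
    return counts


def segmentation_metrics(truth: Mask, pred: Mask, num_classes: int = 2) -> Dict[str, object]:
    """Per-class IoU, mean IoU, and per-class pixel-wise accuracy."""
    counts = confusion_counts(truth, pred, num_classes)
    ious, accuracies = [], []
    for c in range(num_classes):
        true_positive = counts[c][c]
        actual = sum(counts[c])                                    # pixels truly in class c
        predicted = sum(counts[r][c] for r in range(num_classes))  # pixels predicted as c
        union = actual + predicted - true_positive
        ious.append(true_positive / union if union else float("nan"))
        accuracies.append(true_positive / actual if actual else float("nan"))
    return {
        "iou": ious,
        "mean_iou": sum(ious) / len(ious),
        "pixel_accuracy": accuracies,
    }


def remove_background(image, mask: Mask, fill: int = 255):
    """Keep pixels where the mask marks the specimen; replace the rest with `fill`."""
    return [
        [pixel if m == 1 else fill for pixel, m in zip(image_row, mask_row)]
        for image_row, mask_row in zip(image, mask)
    ]


# Tiny hand-made example: one specimen pixel is mislabeled as background.
truth = [[0, 0, 1, 1],
         [0, 1, 1, 1]]
pred = [[0, 0, 1, 1],
        [0, 0, 1, 1]]
image = [[10, 20, 30, 40],
         [50, 60, 70, 80]]

metrics = segmentation_metrics(truth, pred)
cleaned = remove_background(image, pred)
```

In this toy case the single mislabeled specimen pixel lowers the IoU of both classes, since it counts against the union of each, but it lowers the pixel-wise accuracy of the specimen class only. This is why the two metrics can diverge, as they do in the figures reported above.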

Publication
Lecture Notes in Electrical Engineering