Reconstruction of damaged herbarium leaves using deep learning techniques for improving classification accuracy

Abstract

The leaf is one of the organs most commonly used for species identification. Traditionally, botanists identify a plant by manually analyzing the features of individual dried or fresh leaves. Recent advances in computer vision have helped automate the identification of plant families and species from digital images of leaves. However, most existing studies have focused on datasets of fresh, intact leaves. A large amount of data on preserved plants, in the form of digitized herbarium specimens, has not been used effectively for automated identification because many specimens contain damaged leaves. In this study, deep learning techniques are proposed as a tool for reconstructing damaged herbarium leaves, which increases the number of individual leaf samples and so maximizes the usefulness of digitized specimens for automated plant identification. The reconstruction results of two different families of convolutional neural networks (CNNs) are compared on data from ten plant families: Anacardiaceae, Annonaceae, Dipterocarpaceae, Ebenaceae, Euphorbiaceae, Malvaceae, Phyllanthaceae, Polygalaceae, Rubiaceae and Sapotaceae. Using the reconstructed leaf images improved the performance of the automated identification task by more than 20% compared with using the original data (i.e., images of specimens with damaged leaves). This work suggests that deep learning techniques can be used to reconstruct damaged leaves even on a challenging herbarium leaf dataset.

Publication
Ecological Informatics