ResDense-focal-DeepLabV3+ enabled litchi branch semantic segmentation for robotic harvesting

COMPUTERS AND ELECTRONICS IN AGRICULTURE (2023)

Abstract
Robotic harvesting of litchi should clamp and shear the fruit-bearing branches, because the fruits themselves are small and easily damaged. Accurate segmentation of litchi branches is therefore critical. This study first applied the generic DeepLabV3+ model to semantically segment litchi branches. To improve segmentation performance, a novel backbone, ResDense, which integrates the ResNet and DenseNet networks, was proposed to replace the original DeepLabV3+ backbone. In addition, the cross-entropy loss function in DeepLabV3+ was replaced by an improved focal loss, yielding the ResDense-focal-DeepLabV3+ (RDf-DeepLabV3+) network for accurate segmentation of litchi branches in orchard environments. To investigate the feature extraction capability of the proposed RDf-DeepLabV3+, the dataset was divided into three levels of complexity (simple, medium, and complex) to train, validate, and test the model. To demonstrate the advantage of the proposed model, three other models, ResNet-CE, DenseNet, and Xception, were used for a comprehensive comparison. The results showed that RDf-DeepLabV3+ outperformed the others. Specifically, the proposed model achieved a mean Intersection over Union (mIoU) of 0.797, which was 0.9%, 1.8%, and 12.9% higher than ResNet-CE, DenseNet, and Xception, respectively. On "simple" images, the proposed model achieved an mIoU of 0.848, only a slight improvement over the others. On "medium" images, its mIoU was 0.811, which was 0.7%, 2.7%, and 14.8% higher than those of ResNet-CE, DenseNet, and Xception, respectively. On "complex" images, its mIoU was 0.770, an improvement of 0.6%, 1.3%, and 15.3%, respectively. Moreover, the proposed model's training and testing speeds were about 7.7% faster than those of Xception. Overall, the RDf-DeepLabV3+ developed in this study achieved the best semantic segmentation results on litchi branches among the compared models.
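
The abstract relies on two quantities: a focal loss used in place of cross-entropy, and mIoU as the evaluation metric. The NumPy sketch below illustrates both under stated assumptions. The abstract does not specify the authors' "improved" focal loss, so `focal_loss` here is the standard binary focal loss of Lin et al. with its usual defaults (`gamma=2.0`, `alpha=0.25`), not the paper's variant. The function names, the two-class (background/branch) setup, and the toy masks are hypothetical and chosen only for illustration.

```python
import numpy as np


def focal_loss(probs, labels, gamma=2.0, alpha=0.25, eps=1e-7):
    """Standard binary focal loss, averaged over all pixels.

    Assumption: this is the generic formulation, not the paper's
    improved focal loss, whose details the abstract does not give.

    probs  : predicted branch probabilities in [0, 1]
    labels : ground-truth mask, 1 = branch, 0 = background
    """
    probs = np.clip(np.asarray(probs, dtype=np.float64), eps, 1.0 - eps)
    labels = np.asarray(labels, dtype=np.float64)
    # p_t is the probability assigned to the true class of each pixel.
    p_t = np.where(labels == 1, probs, 1.0 - probs)
    # alpha_t weights branch pixels by alpha and background by 1 - alpha.
    alpha_t = np.where(labels == 1, alpha, 1.0 - alpha)
    # (1 - p_t) ** gamma down-weights pixels that are already well classified.
    return float(np.mean(-alpha_t * (1.0 - p_t) ** gamma * np.log(p_t)))


def mean_iou(pred, target, num_classes=2):
    """Mean Intersection over Union across the classes that occur."""
    pred = np.asarray(pred)
    target = np.asarray(target)
    ious = []
    for c in range(num_classes):
        intersection = np.logical_and(pred == c, target == c).sum()
        union = np.logical_or(pred == c, target == c).sum()
        if union > 0:  # skip classes absent from both masks
            ious.append(intersection / union)
    return float(np.mean(ious))


# Toy 2x2 masks (hypothetical data): 1 = branch, 0 = background.
pred_mask = np.array([[0, 1], [1, 1]])
true_mask = np.array([[0, 1], [0, 1]])
print("mIoU:", mean_iou(pred_mask, true_mask))
print("focal loss:", focal_loss([0.9, 0.1], [1, 0]))
```

With `gamma=0` and `alpha=0.5`, `focal_loss` reduces to half the ordinary binary cross-entropy, which makes the relationship between the two losses easy to check. Increasing `gamma` shrinks the contribution of easy pixels, which is the usual motivation for focal loss when thin structures such as branches occupy few pixels.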
Key words
Deep learning, DenseNet, Focal loss, Model optimization, ResNet