Title |
Deep Learning-based Bengali Handwritten Grapheme Classification for Kaggle Bengali.AI Challenge |
Authors |
이채현(Chaehyeon Lee) ; 최재협(Jaehyeop Choi) ; 정희철(Heechul Jung) |
DOI |
https://doi.org/10.5573/ieie.2020.57.9.67 |
Keywords |
Deep learning; handwritten character classification; CNN; data augmentation |
Abstract |
A new dataset for Bengali handwriting recognition was recently released on Kaggle, and an international challenge called Bengali.AI was held on it. In this paper, we share the results of our participation in that competition. Because Bengali characters have a more complex structure than those of other languages, recognizing them is more challenging. The task requires classifying three components individually from a given handwritten Bengali image: the grapheme root, the vowel diacritic, and the consonant diacritic. We propose a method that improves recognition performance on this handwritten dataset. To compare the quality of different models and find the strategy that yields the best accuracy, we trained several models based on three modern architectures (GhostNet, EfficientNet, and SENet). We also analyzed four data augmentation methods: Mixup, Cutout, CutMix, and GridMask. Our best accuracy, 93.74%, was achieved with an ensemble of one EfficientNet-B5 and two SE-ResNeXt-50 networks trained with GridMask data augmentation; this result ranks in the top 3.1% of the 2,059 teams that participated in the Kaggle Bengali.AI challenge. |
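The abstract describes an ensemble of three networks that each predict three components per image, but it does not say how the member outputs are combined. The sketch below illustrates one common choice, averaging per-component softmax probabilities across models and taking the argmax. The averaging rule, the function names, and the class counts (168 grapheme roots, 11 vowel diacritics, 7 consonant diacritics, taken from the public competition description rather than from this abstract) are all assumptions made for illustration, and random logits stand in for real network outputs.

```python
import numpy as np

# Assumed class counts per component (from the public competition
# description, not stated in the abstract).
NUM_CLASSES = {
    "grapheme_root": 168,
    "vowel_diacritic": 11,
    "consonant_diacritic": 7,
}


def softmax(logits, axis=-1):
    """Numerically stable softmax along the given axis."""
    shifted = logits - logits.max(axis=axis, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=axis, keepdims=True)


def ensemble_predict(per_model_logits):
    """Average softmax probabilities over models, separately per component.

    per_model_logits: list with one dict per model, mapping a component
    name to a (batch, n_classes) array of logits.
    Returns a dict mapping each component name to a (batch,) array of
    predicted class indices.
    """
    predictions = {}
    for component in NUM_CLASSES:
        probs = np.mean(
            [softmax(model[component]) for model in per_model_logits], axis=0
        )
        predictions[component] = probs.argmax(axis=1)
    return predictions


# Hypothetical stand-in for three trained networks (for example one
# EfficientNet-B5 and two SE-ResNeXt-50): random logits for 4 images.
rng = np.random.default_rng(0)
fake_models = [
    {name: rng.normal(size=(4, n)) for name, n in NUM_CLASSES.items()}
    for _ in range(3)
]
preds = ensemble_predict(fake_models)
```

Here `preds` holds one class index per image for each of the three components. Averaging probabilities rather than raw logits keeps each model's vote on the same scale, which matters when the members come from different architectures; weighted averaging would be an equally plausible variant.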