Learning-Based Ordering Characters on Ancient Document.
Hyeonjin LeeRock-Hyun BaekHyun-Chul ChoiPublished in: Computational intelligence and neuroscience (2022)
Digitalizing and translating a scanned document image entails detecting the characters using a detector and translating the characters in the order they were detected with a translator. However, it is impossible to translate these characters correctly because the detector often detects them in any order. As a result, since it is critical to organize the recognized characters for proper translation, we propose ordering characters from documents with multiple variations using the strength of the learning-based model that learns the necessary operations from the data. In this task, it is difficult to order the characters written on antique handwritten documents that have deviations such as a bent or split line, as opposed to official records that have lines placed uprightly one by one. Because dealing with these many variants using a human-designed algorithm is problematic, we arrange characters printed on papers with diverse variations by taking advantage of a training model that can learn the appropriate function from data. Our method outputs both line id and y -axis and combines them to assign the sequential index. It is difficult to train using simply local regions because sequential character indexes in a large range include long-range dependencies. To solve this problem, we use network architecture to expand the receptive field as wide as possible. The network must learn to give various indexes to characters in similar places for each document because the number and area of characters vary for each document. We offer the ground truth assign method based on the absolute position to assign similar indexes to characters in similar places. Furthermore, even if the network uses absolute ground truth, the network may assign the incorrect line if the center coordinates of characters are biased in one direction. As a result, we employed the Region of Interest (ROI) from the pretrained coordinate layer, which contains position and trend information. We used the modified edit distance to compare the similarity of character indexes from the ground truth and our technique. In addition, we computed the modified fisher criterion to assess the degree of the clustering line. Consequently, our edit distance is just 0.43 times that of the human-designed algorithm, and our fisher criterion is 1.46 times that of the human-designed algorithm, improving the performance of human-designed algorithm.