- Task 1 - Detection
- Task 2 - Detection-Linking
- Task 3 - Detection-Recognition
- Task 4 - Detection-Recognition-Linking
method: MapText Detection Strong Pipeline 2024-05-06
Authors: Yu Xie, Jielei Zhang, Ziyue Wang, Yuchen He, Yihan Meng, Weihang Wang, Peiyi Li, Longwen Gao, Qian Qiao
Affiliation: Bilibili Inc.
Email: xieyu04@bilibili.com
Description: For the MapText detection task, we used ViTAE-v2 to extract global features within an encoder-decoder network architecture (DeepSolo). Data augmentation included cropping, scaling, and saturation and contrast adjustment. The model was pre-trained on available real datasets (TextOCR, TotalText, IC15, MLT2017). Post-processing was also applied.
Zhang, Q., Xu, Y., Zhang, J., & Tao, D. (2023). ViTAEv2: Vision transformer advanced by exploring inductive bias for image recognition and beyond. International Journal of Computer Vision, 131(5), 1141-1162.
Ye, M., Zhang, J., Zhao, S., Liu, J., Liu, T., Du, B., & Tao, D. (2023). DeepSolo: Let transformer decoder with explicit points solo for text spotting. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (pp. 19348-19357).
method: dino_map 2024-04-29
Authors: Rajat Kumar Singh, Himani Shrotriya, Shivshankar Reddy, Himanshu Bhatt
Affiliation: American Express
Email: rajatks@outlook.com
Description: We trained Mask DINO for both maps. To further improve performance, we crop each image into 4 parts with some overlap, run prediction on the original image and on all 4 cropped images, and combine the outputs.
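
The crop-and-combine step described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the description does not say how the outputs are combined, so the sketch assumes axis-aligned boxes with scores and merges them by greedy IoU-based non-maximum suppression. The `detect` callable, the `overlap` fraction, and the `iou_threshold` value are all hypothetical.

```python
from typing import Callable, List, Tuple

# (x1, y1, x2, y2, score); boxes are a simplification of the model's real output.
Box = Tuple[float, float, float, float, float]
Window = Tuple[int, int, int, int]


def quadrant_windows(width: int, height: int, overlap: float = 0.2) -> List[Window]:
    """Return 4 overlapping crop windows arranged as a 2x2 grid."""
    tile_w = round(width * (1 + overlap) / 2)
    tile_h = round(height * (1 + overlap) / 2)
    return [
        (0, 0, tile_w, tile_h),
        (width - tile_w, 0, width, tile_h),
        (0, height - tile_h, tile_w, height),
        (width - tile_w, height - tile_h, width, height),
    ]


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def nms(boxes: List[Box], iou_threshold: float = 0.5) -> List[Box]:
    """Greedy NMS: keep the highest-scoring box among heavily overlapping ones."""
    kept: List[Box] = []
    for box in sorted(boxes, key=lambda b: b[4], reverse=True):
        if all(iou(box, k) < iou_threshold for k in kept):
            kept.append(box)
    return kept


def predict_with_tiles(
    width: int,
    height: int,
    detect: Callable[[Window], List[Box]],  # hypothetical: crops the window, runs the model
    overlap: float = 0.2,
    iou_threshold: float = 0.5,
) -> List[Box]:
    """Predict on the full image and on 4 overlapping crops, then merge."""
    windows = [(0, 0, width, height)] + quadrant_windows(width, height, overlap)
    merged: List[Box] = []
    for wx1, wy1, _, _ in windows:
        for x1, y1, x2, y2, score in detect((wx1, wy1, _, _)):
            # Shift window-local coordinates back into the original image frame.
            merged.append((x1 + wx1, y1 + wy1, x2 + wx1, y2 + wy1, score))
    return nms(merged, iou_threshold)
```

Here `detect` is expected to return boxes in the coordinate frame of the crop it was given, which is why each box is shifted by the window origin before merging. A real pipeline for this task would more likely merge polygons or masks than boxes.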
method: MapTest 2024-04-29
Authors: Hongen Liu
Affiliation: Tianjin University
Date | Method | Quality | F-score | Tightness | Precision | Recall
---|---|---|---|---|---|---
2024-05-06 | MapText Detection Strong Pipeline | 76.13% | 92.01% | 82.75% | 94.19% | 89.92%
2024-04-29 | dino_map | 73.38% | 87.34% | 84.02% | 87.21% | 87.47%
2024-04-29 | MapTest | 73.09% | 89.34% | 81.82% | 90.47% | 88.23%
2024-04-29 | dino_mvit | 72.41% | 86.66% | 83.56% | 89.21% | 84.25%
2024-04-29 | MapTextSpotter | 70.62% | 86.71% | 81.45% | 92.61% | 81.51%
2024-04-27 | ensem | 64.25% | 75.05% | 85.61% | 94.36% | 62.30%
2024-03-26 | Baseline TESTR Checkpoint | 55.13% | 69.29% | 79.57% | 71.85% | 66.90%
2024-03-26 | DS-LP | 53.85% | 75.17% | 71.63% | 71.76% | 78.93%
2024-05-04 | MapText Using EasyOCR | 42.67% | 58.33% | 73.16% | 69.29% | 50.36%
2024-04-29 | MapDet | 32.70% | 47.23% | 69.23% | 53.64% | 42.19%
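
The table does not define its columns, but the reported numbers appear consistent with two simple relationships: F-score as the harmonic mean of Precision and Recall, and Quality as the product of F-score and Tightness. The sketch below checks this against a few rows. The formulas are inferred from the values, not taken from an official definition, and the function names are illustrative.

```python
def f_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (all values in percent)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def quality(f: float, tightness: float) -> float:
    """Assumed relationship: Quality = F-score x Tightness (values in percent)."""
    return f * tightness / 100.0


# (method, precision, recall, tightness, reported F-score, reported Quality)
rows = [
    ("MapText Detection Strong Pipeline", 94.19, 89.92, 82.75, 92.01, 76.13),
    ("dino_map", 87.21, 87.47, 84.02, 87.34, 73.38),
    ("ensem", 94.36, 62.30, 85.61, 75.05, 64.25),
]

for name, p, r, t, reported_f, reported_q in rows:
    f = f_score(p, r)
    q = quality(f, t)
    print(f"{name}: F={f:.2f} (reported {reported_f}), Q={q:.2f} (reported {reported_q})")
```

Small differences in the last digit are expected, because the table lists values already rounded to two decimals.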