Inactive evaluations
method: Semi Supervised Learning for OOV Text Recognition - NHN Cloud
Date: 2023-06-27
Authors: Yeongyu Kim, Jeasung Park
Affiliation: NHN Cloud
Email: yeg.kim@nhn.com
Description: Semi-supervised learning can improve classification performance by exploiting unlabeled raw images. We investigated the effect of a consistency or contrastive loss for training on unlabeled images, and used the standard cross-entropy loss for training on labeled data. The training dataset provided by the OOV organizers and synthetic data (MJ, ST) were used as labeled data, and word images cropped by a text detector from open benchmark datasets (TextVQA, ST-VQA, ...) were used as unlabeled data.
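The combination of a supervised cross-entropy term with a consistency term on unlabeled data can be sketched as follows. This is a minimal illustration and not the authors' implementation: the mean-squared consistency measure, the two-view setup, the `weight` parameter, and all function names are assumptions made for the example.

```python
import math

def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def cross_entropy(logits, target_index):
    """Supervised loss for one labeled sample."""
    return -math.log(softmax(logits)[target_index])

def consistency_loss(logits_view_a, logits_view_b):
    """Mean squared difference between the predicted distributions
    for two augmented views of the same unlabeled image (assumed form)."""
    p = softmax(logits_view_a)
    q = softmax(logits_view_b)
    return sum((a - b) ** 2 for a, b in zip(p, q)) / len(p)

def semi_supervised_loss(labeled, unlabeled, weight=1.0):
    """labeled: list of (logits, target_index) pairs.
    unlabeled: list of (logits_view_a, logits_view_b) pairs.
    weight: hypothetical balance between the two terms."""
    supervised = sum(cross_entropy(l, y) for l, y in labeled) / len(labeled)
    unsupervised = sum(consistency_loss(a, b) for a, b in unlabeled) / len(unlabeled)
    return supervised + weight * unsupervised
```

When the two views of every unlabeled image produce identical predictions, the consistency term vanishes and the total reduces to the supervised cross-entropy.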
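method: Open-Vocabulary Scene Text Recognition via Pseudo-Image Labeling and Margin Loss
Date: 2023-03-04
Authors: Xuhua Ren, Lu Wang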
Email: renxuhua1993@gmail.com
Description: Scene Text Recognition is an important component in various vision and language tasks. Recognizing out-of-vocabulary (OOV) words remains a challenge, and some studies suggest distinguishing between in-vocabulary (IV) and OOV words. To address this issue, we present two novel contributions. First, we propose a novel pseudo-label generation module that combines character detection and image inpainting modules to produce substantial training data. Second, we introduce an approach that optimizes the geodesic distance margins to reduce the impact of noisy samples in pseudo-labels on model convergence during training.
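The description does not say how the geodesic distance margin is formulated. One common way to impose a margin on the geodesic (angular) distance between a feature and its class centre is an additive angular margin, sketched below. This is a named substitute for illustration only, not the authors' loss, and `angular_margin_logit`, `margin`, and `scale` are hypothetical names.

```python
import math

def angular_margin_logit(cosine, margin, scale=1.0):
    """Return scale * cos(theta + margin), where theta = arccos(cosine).

    cosine: cosine similarity between a normalized feature and the
            normalized weight of its target class.
    margin: additive angular margin in radians (assumed hyperparameter).
    """
    # Clamp to guard against values slightly outside [-1, 1].
    theta = math.acos(max(-1.0, min(1.0, cosine)))
    # Cap the angle at pi so the logit stays monotonic in theta.
    return scale * math.cos(min(math.pi, theta + margin))
```

A positive margin lowers the target-class logit, so a sample has to sit closer to its class centre on the hypersphere to reach the same score.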
method: Optimized Transformer for OOV Text Recognition - NHN Cloud
Date: 2023-06-16
Authors: Yeongyu Kim
Affiliation: NHN Cloud
Email: yeg.kim@nhn.com
Description: In the OOV (Out of Vocabulary) task, even words whose labels do not appear in the training data must be recognized. We use adaptive positional encoding and our own Macaron-style transformer encoder. A permutation algorithm was applied to the decoder to make the most of the label combinations in the training data. Synthetic data (MJ, ST) is used along with the provided OOV training data.
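The description gives no detail on the permutation algorithm. One plausible reading, assumed here, is permutation-based decoding: the decoder is trained with several factorization orders of the same label, each enforced by its own attention mask. The sketch below builds such a mask. The function name and the mask convention are hypothetical.

```python
def permutation_mask(order):
    """Build an attention mask for one decoding order.

    order: a permutation of character positions, e.g. [2, 0, 1].
    Returns an n x n list of booleans where mask[i][j] is True when
    position i may attend to position j, i.e. when j is decoded
    before i in the given order.
    """
    n = len(order)
    mask = [[False] * n for _ in range(n)]
    for step, target in enumerate(order):
        for seen in order[:step]:
            mask[target][seen] = True
    return mask

# The left-to-right order gives the usual causal mask.
causal = permutation_mask([0, 1, 2])
# The reversed order lets early positions attend to later ones.
reversed_mask = permutation_mask([2, 1, 0])
```

Training on several orders for the same word exposes the decoder to more context combinations than a single left-to-right pass, which is one way to read "making the most of the label combinations".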
Date | Method | CRW | ED | IV CRW | IV ED | OOV CRW
---|---|---|---|---|---|---
2023-06-27 | Semi Supervised Learning for OOV Text Recognition - NHN Cloud | 71.92% | 92166 | 83.10% | 36494 | 60.74%
2023-03-04 | Open-Vocabulary Scene Text Recognition via Pseudo-Image Labeling and Margin Loss | 70.98% | 100594 | 82.81% | 42608 | 59.15%
2023-06-16 | Optimized Transformer for OOV Text Recognition - NHN Cloud | 70.82% | 98646 | 81.92% | 38529 | 59.71%
2024-02-26 | Self-Supervised Learning for OOV Text Recognition - HuiGuan | 70.38% | 108657 | 81.92% | 49290 | 58.84%
2022-07-21 | OCRFLY_V2 | 70.31% | 123947 | 81.02% | 46048 | 59.61%
2023-02-27 | HuiGuanV2 | 70.28% | 110990 | 81.73% | 49889 | 58.83%
2022-07-21 | oov3decode | 70.22% | 94259 | 81.58% | 40175 | 58.86%
2022-07-21 | Vision Transformer Based Method | 70.00% | 94701 | 81.36% | 40187 | 58.64%
2022-07-21 | dat | 69.90% | 96513 | 80.78% | 40082 | 59.03%
2022-07-20 | ocrfly | 69.83% | 131232 | 80.63% | 53243 | 59.03%
2022-07-21 | ggui | 69.80% | 96597 | 80.74% | 40171 | 58.86%
2022-07-21 | spring | 69.74% | 96477 | 80.74% | 40115 | 58.74%
2022-07-21 | DataMatters | 69.68% | 96544 | 80.71% | 40177 | 58.65%
2022-07-20 | Cropped Recognition | 69.65% | 108766 | 80.63% | 44958 | 58.68%
2022-07-21 | MaskOCR | 69.63% | 108894 | 80.60% | 44971 | 58.65%
2022-07-20 | SCATTER | 69.58% | 113482 | 79.72% | 43890 | 59.45%
2022-07-20 | Summer | 68.77% | 103211 | 79.48% | 42118 | 58.06%
2022-07-18 | let me see see | 68.46% | 116503 | 80.81% | 51165 | 56.11%
2022-07-20 | Using only real data | 68.28% | 118185 | 79.28% | 48517 | 57.27%
2023-04-07 | test1 | 68.21% | 123384 | 79.73% | 56472 | 56.68%
2022-08-11 | Baseline - SCATTER_v2 | 66.68% | 128219 | 77.98% | 52535 | 55.38%
2022-07-18 | PTViT | 66.29% | 120449 | 77.52% | 49410 | 55.06%
2022-07-20 | demo | 65.86% | 124347 | 77.25% | 48907 | 54.47%
2022-08-11 | Baseline - CLOVA_v2 | 64.97% | 138479 | 75.98% | 54346 | 53.96%
2022-10-19 | attn | 64.02% | 144275 | 76.47% | 64446 | 51.57%
2022-07-19 | TRBA_CocoValid_InfRotation2.0_SpaceRemove | 63.98% | 132781 | 77.76% | 60693 | 50.20%
2022-07-19 | HuiGuan | 63.73% | 162870 | 74.77% | 68926 | 52.69%
2022-10-18 | ctc | 63.51% | 141100 | 75.63% | 63866 | 51.39%
2022-07-18 | exp5_merge | 54.87% | 143070 | 70.93% | 57786 | 38.81%
2022-07-20 | EOCR: Ensemble Optical Character Recognition | 46.66% | 350166 | 55.30% | 113317 | 38.02%
2022-07-17 | BASELINE - Official Clova | 44.47% | 365566 | 52.61% | 114101 | 36.34%
2022-07-19 | NNRC | 38.54% | 405603 | 45.36% | 136384 | 31.73%
2022-07-19 | NN | 37.17% | 426074 | 43.38% | 144032 | 30.97%
2022-07-18 | Cluster Character Loss in Scene Text Recognition | 31.06% | 552570 | 47.40% | 202087 | 14.73%
2022-07-20 | Transformer for multi-language OCR | 0.00% | | 0.00% | | 0.00%
2022-07-21 | TEST | 0.00% | | 0.00% | | 0.00%
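The page does not define the CRW and ED columns. The sketch below assumes CRW is the percentage of correctly recognized words (exact string match) and ED is the summed Levenshtein edit distance over all words. Case handling and any normalization used by the official evaluation are not stated, so none is applied here. The script also checks an observation from the table: in the rows sampled, the first CRW value matches the mean of the IV and OOV CRW values to within rounding. All function and variable names are hypothetical.

```python
def edit_distance(a, b):
    """Levenshtein distance between two strings (row-by-row DP)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution
        prev = curr
    return prev[-1]

def score(predictions, ground_truth):
    """Return (CRW in percent, total edit distance) under the assumed definitions."""
    pairs = list(zip(predictions, ground_truth))
    correct = sum(p == g for p, g in pairs)
    total_ed = sum(edit_distance(p, g) for p, g in pairs)
    return 100.0 * correct / len(pairs), total_ed

# (first CRW, IV CRW, OOV CRW) copied from four rows of the table.
SAMPLED_ROWS = [
    (71.92, 83.10, 60.74),
    (70.98, 82.81, 59.15),
    (70.82, 81.92, 59.71),
    (54.87, 70.93, 38.81),
]

def first_crw_is_mean(rows, tolerance=0.011):
    """True if the first CRW equals the IV/OOV mean within rounding error."""
    return all(abs(first - (iv + oov) / 2) <= tolerance for first, iv, oov in rows)
```

Because the first CRW column behaves like an unweighted mean of the two subsets, a method can rank higher by improving either IV or OOV accuracy, which is worth keeping in mind when comparing rows.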