- Task 1 (Digits)
- Task 2A (Borg)
- Task 2B (Copiale)
- Task 3A (BNF)
- Task 3B (Ramanacoil)
method: Character Detection and Classification with Transformer Architecture (2024-05-07)
Authors: Raphael Baena, Syrine Kalleli, Mathieu Aubry
Affiliation: ENPC Imagine
Description: We employ a transformer-based architecture that detects characters in parallel, ensuring fast and accurate predictions. For each character, it provides a bounding box and its likelihood, which are then used for Optical Character Recognition (OCR). Notably, this approach does not rely on any language prior.
We first pre-trained the architecture on synthetic data consisting of text lines with characters from various fonts. We use standard classification and bounding box positioning losses for this process.
Then, we can finetune the architecture on real datasets. Unlike synthetic data, these datasets do not include ground truth bounding boxes, but only text transcriptions. Therefore, we can't use the same training losses as before. Instead, we use the pre-trained model to detect the characters' bounding boxes. We then organize the characters based on these bounding boxes and compute the Connectionist Temporal Classification (CTC) loss. During fine-tuning, our approach demonstrates the ability to learn the bounding boxes of new characters.
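The key fine-tuning step described above is to order the detected characters by their bounding boxes so their labels form a sequence that a CTC loss can score against the transcription. A minimal sketch of that ordering step, assuming a single text line read left to right (the data layout and function name are illustrative, not the authors' actual code):

```python
# Sketch of the reading-order step: each detection is a class label plus
# a bounding box (x_min, y_min, x_max, y_max). Sorting by the horizontal
# center of each box yields the label sequence fed to the CTC loss.

def boxes_to_sequence(detections):
    """detections: list of (label, (x_min, y_min, x_max, y_max))."""
    # Assumes one text line; order boxes by their horizontal center.
    ordered = sorted(detections, key=lambda d: (d[1][0] + d[1][2]) / 2)
    return [label for label, _ in ordered]

# Example: three detections arriving in arbitrary order.
dets = [("b", (30, 0, 50, 20)), ("a", (0, 0, 20, 20)), ("c", (60, 0, 80, 20))]
print(boxes_to_sequence(dets))  # ['a', 'b', 'c']
```

Multi-line pages would first need boxes grouped into lines (e.g. by vertical overlap) before this per-line sort applies.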
method: cnn_encoder_strings_round2epoch70_0307 (2024-03-11)
Authors: Jiaqianwen
Description: Starting from cnn_encoder_chars_998_post as the pretrained model, we change the dictionary and fine-tune the model parameters.
method: Baseline - Long Short-Term Memory - Variant B (2024-06-03)
Authors: The HR-Ciphers 2024 organizers
Affiliation: Computer Vision Center
Description: A Long Short-Term Memory (LSTM) Recurrent Neural Network model, inspired by Baró et al., "Optical Music Recognition by Long Short-Term Memory Networks", GREC 2017.
Date | Method | CER
---|---|---
2024-05-07 | Character Detection and Classification with Transformer Architecture | 0.0676
2024-03-11 | cnn_encoder_strings_round2epoch70_0307 | 0.0710
2024-06-03 | Baseline - Long Short-Term Memory - Variant B | 0.0742
2024-05-10 | Sequence-to-Sequence model trained on multiple datasets for Handwriting Rec. of Historical Ciphers | 0.0760
2024-06-03 | Baseline - Long Short-Term Memory - Variant A | 0.0791
2024-05-09 | Baseline - Sequence to Sequence | 0.0956
2024-03-06 | cnn_encoder_chars_998_post | 0.1548
2024-03-01 | Optimizing Handwriting Recognition for Complex Text with Joint Symbol Counting | 0.2555
2024-03-05 | layer3-step3 | 0.2562
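The CER column above is the Character Error Rate: the character-level Levenshtein (edit) distance between the predicted and reference transcriptions, normalized by the reference length. A minimal sketch of that computation (function names are illustrative; the exact evaluation script may differ in details such as whitespace handling):

```python
def levenshtein(ref, hyp):
    """Character-level edit distance via the standard DP recurrence."""
    prev = list(range(len(hyp) + 1))  # distances from empty prefix of ref
    for i, r in enumerate(ref, 1):
        curr = [i]
        for j, h in enumerate(hyp, 1):
            cost = 0 if r == h else 1
            # deletion, insertion, substitution/match
            curr.append(min(prev[j] + 1, curr[j - 1] + 1, prev[j - 1] + cost))
        prev = curr
    return prev[-1]

def cer(ref, hyp):
    """Edit distance normalized by reference length."""
    return levenshtein(ref, hyp) / len(ref)

print(cer("abcd", "abcf"))  # 0.25 (one substitution over four characters)
```

So a CER of 0.0676 means roughly 6.8 character edits per 100 reference characters.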