Authors: Simon Corbillé, Elisa H Barney Smith

Affiliation: Machine Learning, Luleå Tekniska Universitet

Description: 1 - Data specification
The images are resized and padded to a fixed size in pixels (chosen from the mean height and width of the dataset). The training data is split randomly into a training set (80%) and a validation set (20%). During training, we apply affine augmentation to the training data.
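A minimal sketch of this preprocessing, assuming PyTorch/torchvision; the target size (64 x 1024), the affine parameter ranges and the name full_dataset are illustrative placeholders, not our actual configuration.

    import torchvision.transforms.functional as TF
    from torchvision import transforms
    from torch.utils.data import random_split

    TARGET_H, TARGET_W = 64, 1024  # placeholder fixed size (from mean height/width)

    def resize_and_pad(img):
        # Resize the line image to the target height (keeping the aspect
        # ratio), then pad on the right up to the fixed width.
        _, h, w = img.shape
        new_w = min(max(1, round(w * TARGET_H / h)), TARGET_W)
        img = TF.resize(img, [TARGET_H, new_w])
        return TF.pad(img, [0, 0, TARGET_W - new_w, 0])

    # Random 80/20 train/validation split; full_dataset is a placeholder Dataset.
    n_train = int(0.8 * len(full_dataset))
    train_set, val_set = random_split(full_dataset, [n_train, len(full_dataset) - n_train])

    # Affine augmentation, applied to the training split only.
    train_augment = transforms.RandomAffine(
        degrees=2, translate=(0.02, 0.05), scale=(0.9, 1.1), shear=5)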
We found empirically that training on a combination of the cipher datasets improves recognition performance. For task 2A, we train the model on a combination of the Borg and BNF datasets. For task 2B, we train the model on a combination of Borg, Copiale and BNF. For task 3A, we train the model on a combination of Copiale and BNF. For task 3B, we train the model on a combination of the Borg, Copiale and Ramanacoil datasets and keep only the classes with more than 10 samples in the training set.
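For example, the task 3B mixture could be assembled as below, assuming each cipher is exposed as a PyTorch Dataset yielding (image, transcript) pairs; the dataset class names are hypothetical.

    from collections import Counter
    from torch.utils.data import ConcatDataset

    # Hypothetical Dataset classes, one per cipher.
    train_3b = ConcatDataset([BorgDataset(), CopialeDataset(), RamanacoilDataset()])

    # Keep only the classes with more than 10 training samples.
    counts = Counter()
    for i in range(len(train_3b)):
        _, transcript = train_3b[i]
        counts.update(transcript)  # transcript: sequence of symbol labels
    kept_classes = {sym for sym, n in counts.items() if n > 10}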

2 - Method
We use a Sequence-to-Sequence model, one of the state-of-the-art architectures for handwriting recognition. It is composed of an encoder, an attention component and a decoder. The encoder uses a CRNN architecture: convolutional layers extract spatial features and LSTM layers extract temporal features. The attention module focuses the decoder on a specific part of the features extracted by the encoder to predict the transcription character by character.
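A compact sketch of such a CRNN encoder, assuming PyTorch; the layer sizes and input height (64 pixels) are illustrative placeholders, and the attention decoder is only summarized in the comments.

    import torch.nn as nn

    class CRNNEncoder(nn.Module):
        def __init__(self, n_classes, hidden=256):
            super().__init__()
            # Convolutional layers extract spatial features from the line image.
            self.conv = nn.Sequential(
                nn.Conv2d(1, 64, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
                nn.Conv2d(64, 128, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2))
            # Bidirectional LSTMs model the horizontal (temporal) dimension.
            self.lstm = nn.LSTM(128 * 16, hidden, num_layers=2,
                                bidirectional=True, batch_first=True)
            self.ctc_head = nn.Linear(2 * hidden, n_classes + 1)  # +1: CTC blank

        def forward(self, x):                     # x: (B, 1, 64, W) grayscale
            f = self.conv(x)                      # (B, 128, 16, W // 4)
            f = f.permute(0, 3, 1, 2).flatten(2)  # (B, W // 4, 128 * 16)
            h, _ = self.lstm(f)                   # (B, W // 4, 2 * hidden)
            # The attention decoder (not shown) attends over h at each step
            # to emit one character at a time.
            return h, self.ctc_head(h)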
The model is trained with a hybrid loss (CTC loss for the encoder and cross-entropy loss for the decoder).
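A hedged sketch of this hybrid objective, assuming PyTorch; the equal weighting and the reuse of the same padded targets for both terms are simplifying assumptions (in practice the decoder targets also carry start/end markers).

    import torch.nn.functional as F

    def hybrid_loss(ctc_logits, input_lens, dec_logits, targets, target_lens,
                    blank=0, pad=0, alpha=0.5):
        # CTC loss on the encoder logits; F.ctc_loss expects (T, B, C) log-probs.
        log_probs = ctc_logits.log_softmax(-1).permute(1, 0, 2)
        ctc = F.ctc_loss(log_probs, targets, input_lens, target_lens, blank=blank)
        # Cross-entropy on the decoder logits, ignoring padded positions.
        ce = F.cross_entropy(dec_logits.reshape(-1, dec_logits.size(-1)),
                             targets.reshape(-1), ignore_index=pad)
        return alpha * ctc + (1 - alpha) * ce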

3 - Results
We evaluate our model on the validation set with the Character Error Rate (CER) metric. Here a character can be a letter (A, B, C, …), a symbol (Libra, Saturn, …) or a letter with a diacritic. Note that the number of samples in the validation set is low, so the results should be interpreted with caution.
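CER is the symbol-level edit distance between the prediction and the reference, divided by the reference length. A minimal self-contained implementation, treating each symbol (e.g. Libra) as one token:

    def cer(reference, hypothesis):
        # Levenshtein distance between token sequences, single-row DP.
        r, h = list(reference), list(hypothesis)
        d = list(range(len(h) + 1))
        for i, rc in enumerate(r, 1):
            prev, d[0] = d[0], i
            for j, hc in enumerate(h, 1):
                prev, d[j] = d[j], min(d[j] + 1,           # deletion
                                       d[j - 1] + 1,       # insertion
                                       prev + (rc != hc))  # substitution
        return d[len(h)] / max(len(r), 1)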

We obtain:
Task 2A: 7.23% CER
Task 2B: 0.75% CER
Task 3A: 1.55% CER
Task 3B: 4.12% CER

We can note:
- Symbols are clearly separated and the writing is of good quality.
- Task 2A contains images with a fold at the beginning or the end of the line.
- In task 3B, the lines are not clearly segmented and can contain parts of the previous and/or next line.

Authors: The HR-Ciphers 2024 organizers

Affiliation: Computer Vision Center

Description: A Long Short-Term Memory (LSTM) Recurrent Neural Network model inspired by Baró et al., "Optical Music Recognition by Long Short-Term Memory Networks", GREC 2017.

[Figure: ranking graphic]