method: Upstage KR (2023-04-01)
Authors: Yunsu Kim, Seung Shin, Bibek Chaudhary, Sanghoon Kim, Dahyun Kim, Sehwan Joo
Affiliation: Upstage
Description: In addressing hierarchical text detection, we implement a two-step approach. First, we perform multi-class semantic segmentation in which the classes are word, line, and paragraph regions. Then, we use the predicted probability map to extract and organize these entities hierarchically. Specifically, we utilize an ensemble of UNets with ImageNet-pretrained EfficientNetB7/MiT-B4 backbones to extract class masks. Connected components are identified in the predicted mask to separate words from each other, and likewise for lines and paragraphs. Then, word_i is assigned as a child of line_j if line_j has the highest IoU with word_i among all lines; the same procedure is applied between lines and paragraphs.
For training, we erode the target entities and dilate the predicted entities, and we ensure that target entities maintain a gap between each other. We use a symmetric Lovász loss. We pretrain our models on the SynthText dataset.
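The grouping step lends itself to a short illustration. The sketch below is not the authors' code; it assumes thresholded per-class probability maps (the names word_prob, line_prob, para_prob and the 0.5 threshold are hypothetical), uses scipy connected-component labelling, and takes plain mask IoU as the matching score from the description.

```python
import numpy as np
from scipy import ndimage


def extract_components(mask):
    """Split a binary class mask into one boolean mask per connected component."""
    labeled, n = ndimage.label(mask)
    return [labeled == i for i in range(1, n + 1)]


def mask_iou(a, b):
    """IoU between two boolean masks of the same shape."""
    union = np.logical_or(a, b).sum()
    return np.logical_and(a, b).sum() / union if union else 0.0


def assign_children(children, parents):
    """For each child mask, pick the index of the parent mask with the highest IoU
    (word -> line, line -> paragraph)."""
    return {
        ci: int(np.argmax([mask_iou(c, p) for p in parents]))
        for ci, c in enumerate(children)
    }


# Usage with thresholded probability maps (threshold 0.5 is an assumption):
# words = extract_components(word_prob > 0.5)
# lines = extract_components(line_prob > 0.5)
# paragraphs = extract_components(para_prob > 0.5)
# word_to_line = assign_children(words, lines)
# line_to_paragraph = assign_children(lines, paragraphs)
```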
method: Upstage KR (2023-03-30)
Authors: Yunsu Kim, Seung Shin, Bibek Chaudhary, Sanghoon Kim, Dahyun Kim, Sehwan Joo
Affiliation: Upstage
Description: In addressing hierarchical text detection, we implement a two-step approach. First, we perform multi-class semantic segmentation in which the classes are word, line, and paragraph regions. Then, we use the predicted probability map to extract and organize these entities hierarchically. Specifically, we utilize an ensemble of UNets with ImageNet-pretrained EfficientNetB7/MiT-B4 backbones to extract class masks. Connected components are identified in the predicted mask to separate words from each other, and likewise for lines and paragraphs. Then, word_i is assigned as a child of line_j if line_j has the highest IoU with word_i among all lines; the same procedure is applied between lines and paragraphs.
For training, we erode the target entities and dilate the predicted entities, and we ensure that target entities maintain a gap between each other. We use a symmetric Lovász loss. We pretrain our models on the SynthText dataset.
method: hiertext_submit_0401_curve_199_v2 (2023-04-01)
Authors: Zhong Humen, Tang Jun, Yang Zhibo, Song Xiaoge
Affiliation: Alibaba DAMO OCR Team
Email: zhonghumen@gmail.com
Description: Our method is a single end-to-end model designed for hierarchical text detection. The model follows the pipeline of DETR-like methods and adds a hierarchical decoder so that it can detect more text instances with fewer queries, reducing computational cost.
The model uses an ImageNet-pretrained Swin-S backbone and is trained only on the HierText training set. Single-scale inference is used at test time. No external or synthetic data is used.
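The submission does not detail the decoder, so the following is only a rough sketch of the general idea of a hierarchical DETR-style decoder: a small set of coarse (line/paragraph-level) queries is decoded first, and each coarse query is then expanded into a few word-level queries, which keeps the total query count low. All module names, layer counts, and the offset-embedding expansion are assumptions, not the submitted model.

```python
import torch
import torch.nn as nn


class HierarchicalDecoder(nn.Module):
    """Toy two-stage decoder: coarse queries for higher-level entities, then
    word-level queries derived from each coarse query (illustrative only)."""

    def __init__(self, d_model=256, num_coarse=100, words_per_coarse=4, nhead=8):
        super().__init__()
        self.coarse_queries = nn.Embedding(num_coarse, d_model)
        # Learned offsets that turn one coarse query into several word queries.
        self.word_offsets = nn.Embedding(words_per_coarse, d_model)
        layer = nn.TransformerDecoderLayer(d_model, nhead, batch_first=True)
        self.coarse_decoder = nn.TransformerDecoder(layer, num_layers=3)
        self.word_decoder = nn.TransformerDecoder(layer, num_layers=3)

    def forward(self, memory):
        # memory: (batch, hw, d_model) flattened image features from the encoder.
        b = memory.size(0)
        coarse = self.coarse_queries.weight.unsqueeze(0).expand(b, -1, -1)
        coarse = self.coarse_decoder(coarse, memory)  # line/paragraph-level outputs
        # Expand each coarse query into `words_per_coarse` word-level queries.
        words = coarse.unsqueeze(2) + self.word_offsets.weight.view(1, 1, -1, coarse.size(-1))
        words = words.flatten(1, 2)
        words = self.word_decoder(words, memory)      # word-level outputs
        return coarse, words


# feats = torch.randn(2, 1024, 256)                  # toy encoder output
# coarse_out, word_out = HierarchicalDecoder()(feats)
```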
Date | Method | Word PQ | Word Fscore | Word Precision | Word Recall | Word Tightness | Line PQ | Line Fscore | Line Precision | Line Recall | Line Tightness | Paragraph PQ | Paragraph Fscore | Paragraph Precision | Paragraph Recall | Paragraph Tightness
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---
2023-04-01 | Upstage KR | 0.7980 | 0.9188 | 0.9473 | 0.8920 | 0.8685 | 0.7640 | 0.8834 | 0.9132 | 0.8556 | 0.8648 | 0.7454 | 0.8615 | 0.8740 | 0.8494 | 0.8652
2023-03-30 | Upstage KR | 0.7948 | 0.9138 | 0.9496 | 0.8807 | 0.8697 | 0.7657 | 0.8797 | 0.9089 | 0.8523 | 0.8704 | 0.7479 | 0.8591 | 0.8711 | 0.8474 | 0.8705
2023-04-01 | hiertext_submit_0401_curve_199_v2 | 0.7671 | 0.8818 | 0.9271 | 0.8408 | 0.8699 | 0.7143 | 0.8332 | 0.8932 | 0.7807 | 0.8573 | 0.6397 | 0.7483 | 0.8125 | 0.6935 | 0.8548
2023-03-31 | hiertext_submit_curve_199 | 0.7671 | 0.8818 | 0.9271 | 0.8408 | 0.8699 | 0.7121 | 0.8314 | 0.8882 | 0.7814 | 0.8565 | 0.6406 | 0.7497 | 0.8135 | 0.6953 | 0.8545
2023-03-28 | hiertext_submit_0328 | 0.7663 | 0.8811 | 0.9254 | 0.8408 | 0.8698 | 0.7115 | 0.8309 | 0.8865 | 0.7819 | 0.8563 | 0.6383 | 0.7476 | 0.8115 | 0.6930 | 0.8538
2023-11-14 | 123 | 0.7663 | 0.8811 | 0.9254 | 0.8408 | 0.8698 | 0.7115 | 0.8309 | 0.8865 | 0.7819 | 0.8563 | 0.6383 | 0.7476 | 0.8115 | 0.6930 | 0.8538
2023-04-01 | Global and local instance segmentations for hierarchical text detection | 0.7616 | 0.9072 | 0.9345 | 0.8816 | 0.8395 | 0.6850 | 0.8222 | 0.8024 | 0.8431 | 0.8331 | 0.6255 | 0.7511 | 0.7400 | 0.7625 | 0.8328
2023-04-02 | DeepSE hierarchical detection model | 0.7530 | 0.8849 | 0.9350 | 0.8399 | 0.8510 | 0.6943 | 0.8243 | 0.8265 | 0.8221 | 0.8423 | 0.6851 | 0.8139 | 0.8169 | 0.8110 | 0.8417
2023-03-24 | NVTextSpotter | 0.7369 | 0.8707 | 0.9510 | 0.8029 | 0.8463 | 0.6776 | 0.8042 | 0.9387 | 0.7035 | 0.8425 | 0.6551 | 0.7804 | 0.8182 | 0.7460 | 0.8394
2023-03-31 | Multi Class Deformable Detr for Hierarchal Text Detection | 0.7320 | 0.8889 | 0.9065 | 0.8720 | 0.8235 | 0.6901 | 0.8413 | 0.8483 | 0.8345 | 0.8202 | 0.6380 | 0.7807 | 0.7707 | 0.7909 | 0.8173
2023-03-17 | NVTextSpotter | 0.7215 | 0.8550 | 0.9522 | 0.7758 | 0.8439 | 0.4803 | 0.5911 | 0.8283 | 0.4596 | 0.8126 | 0.6343 | 0.7599 | 0.8161 | 0.7109 | 0.8348
2023-04-01 | Clova DEER | 0.7175 | 0.9195 | 0.9309 | 0.9083 | 0.7803 | 0.6985 | 0.8900 | 0.9126 | 0.8686 | 0.7848 | 0.6531 | 0.8350 | 0.8378 | 0.8322 | 0.7822
2023-04-02 | Ensemble of three task-specific Clova DEER | 0.7154 | 0.9203 | 0.9382 | 0.9031 | 0.7774 | 0.6964 | 0.8904 | 0.9175 | 0.8649 | 0.7821 | 0.6529 | 0.8370 | 0.8417 | 0.8323 | 0.7801
2023-03-31 | Hierarchical Transformers for Text Detection | 0.7044 | 0.8609 | 0.8847 | 0.8383 | 0.8182 | 0.6930 | 0.8523 | 0.8783 | 0.8278 | 0.8131 | 0.6346 | 0.7840 | 0.7784 | 0.7897 | 0.8094
2023-04-01 | SCUT-HUAWEI | 0.7008 | 0.8958 | 0.8979 | 0.8937 | 0.7823 | 0.6770 | 0.8620 | 0.9046 | 0.8233 | 0.7853 | 0.5314 | 0.6906 | 0.7403 | 0.6472 | 0.7696
2023-12-28 | Hi-SAM | 0.6430 | 0.8286 | 0.8766 | 0.7856 | 0.7760 | 0.6696 | 0.8530 | 0.9109 | 0.8020 | 0.7850 | 0.5909 | 0.7597 | 0.8152 | 0.7113 | 0.7779
2022-11-24 | DQ-DETR | 0.6101 | 0.7727 | 0.8064 | 0.7417 | 0.7896 | 0.2696 | 0.3591 | 0.2681 | 0.5439 | 0.7507 | 0.1838 | 0.2472 | 0.1599 | 0.5441 | 0.7436
2022-11-10 | nn_l | 0.5362 | 0.7526 | 0.8823 | 0.6561 | 0.7125 | 0.5089 | 0.6978 | 0.8886 | 0.5744 | 0.7292 | 0.3758 | 0.5222 | 0.7347 | 0.4050 | 0.7197
2023-09-11 | nn_g_cloud | 0.5234 | 0.7386 | 0.8455 | 0.6558 | 0.7086 | 0.4914 | 0.6864 | 0.7973 | 0.6027 | 0.7158 | 0.3863 | 0.5495 | 0.6117 | 0.4989 | 0.7030
2023-05-15 | nn_fixed | 0.5204 | 0.7384 | 0.8464 | 0.6548 | 0.7048 | 0.4921 | 0.6880 | 0.8749 | 0.5669 | 0.7153 | 0.3484 | 0.5006 | 0.4835 | 0.5190 | 0.6959
2023-05-15 | nn_adaptive | 0.5204 | 0.7384 | 0.8464 | 0.6548 | 0.7048 | 0.4909 | 0.6859 | 0.8851 | 0.5599 | 0.7156 | 0.3867 | 0.5517 | 0.6101 | 0.5035 | 0.7009
2022-11-10 | nn | 0.5116 | 0.7316 | 0.8575 | 0.6379 | 0.6993 | 0.4577 | 0.6402 | 0.8583 | 0.5104 | 0.7150 | 0.2072 | 0.2965 | 0.6184 | 0.1950 | 0.6988
2023-09-11 | nn_g13 | 0.5108 | 0.7227 | 0.8618 | 0.6223 | 0.7067 | 0.4828 | 0.6718 | 0.8438 | 0.5581 | 0.7186 | 0.3820 | 0.5418 | 0.6471 | 0.4659 | 0.7050
2023-05-12 | adaptive_clustering | 0.4830 | 0.6680 | 0.8541 | 0.5484 | 0.7230 | 0.4323 | 0.5976 | 0.8427 | 0.4629 | 0.7234 | 0.3483 | 0.4929 | 0.5364 | 0.4560 | 0.7067
2023-05-12 | fixed_clustering | 0.4830 | 0.6680 | 0.8541 | 0.5484 | 0.7230 | 0.4266 | 0.5909 | 0.8114 | 0.4647 | 0.7220 | 0.3284 | 0.4651 | 0.4684 | 0.4618 | 0.7060
2022-08-09 | Unified Detector (CVPR 2022 version) | 0.4821 | 0.6151 | 0.6754 | 0.5647 | 0.7838 | 0.6223 | 0.7991 | 0.7964 | 0.8019 | 0.7787 | 0.5360 | 0.6858 | 0.7604 | 0.6245 | 0.7817
2023-02-06 | HierText official ckpt | 0.4799 | 0.6135 | 0.6719 | 0.5645 | 0.7822 | 0.6220 | 0.7998 | 0.8000 | 0.7996 | 0.7777 | 0.5351 | 0.6856 | 0.7654 | 0.6208 | 0.7805
2023-03-01 | test | 0.2745 | 0.4175 | 0.5182 | 0.3495 | 0.6576 | 0.2561 | 0.3904 | 0.5150 | 0.3143 | 0.6559 | 0.1632 | 0.2452 | 0.3561 | 0.1870 | 0.6657
2023-02-19 | UnifiedDetector | 0.1686 | 0.2350 | 0.3523 | 0.1763 | 0.7175 | 0.2048 | 0.2909 | 0.4668 | 0.2113 | 0.7041 | 0.1076 | 0.1569 | 0.2408 | 0.1163 | 0.6856
2023-02-06 | a | 0.0000 | 0.0000 | 0.0024 | 0.0000 | 0.5362 | 0.0001 | 0.0001 | 0.0025 | 0.0001 | 0.5129 | 0.0001 | 0.0002 | 0.0021 | 0.0001 | 0.5089
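As a reading aid for the table: in the HierText evaluation, PQ follows the panoptic-quality formulation, Tightness is the mean IoU over matched (true-positive) pairs, and F-score is the harmonic mean of Precision and Recall, so PQ equals F-score times Tightness. This identity can be checked directly against the rows above (e.g. 0.9188 × 0.8685 ≈ 0.7980 in the first row). A sketch of the relationship, with TP the set of matched prediction/ground-truth pairs, FP the unmatched predictions, and FN the unmatched ground truths:

```latex
\begin{align}
  \mathrm{PQ} &= \frac{\sum_{(p,g)\in\mathrm{TP}} \mathrm{IoU}(p,g)}
                      {|\mathrm{TP}| + \tfrac{1}{2}|\mathrm{FP}| + \tfrac{1}{2}|\mathrm{FN}|}, \\
  \mathrm{F} &= \frac{|\mathrm{TP}|}{|\mathrm{TP}| + \tfrac{1}{2}|\mathrm{FP}| + \tfrac{1}{2}|\mathrm{FN}|},
  \qquad
  \mathrm{Tightness} = \frac{1}{|\mathrm{TP}|}\sum_{(p,g)\in\mathrm{TP}} \mathrm{IoU}(p,g), \\
  \mathrm{PQ} &= \mathrm{F} \times \mathrm{Tightness}.
\end{align}
```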