- Task 1 - Single Page Document VQA
- Task 2 - Document Collection VQA
- Task 3 - Infographics VQA
- Task 4 - MP-DocVQA
method: qwen7b-retrieval (2025-03-18)
Authors: yc
Affiliation: CVC
Description: Uses the 7B Qwen model with retrieval.
method: Qwen2.5-VL-7B-AWQ-lite (2025-02-27)
Authors: yc
Affiliation: CVC
Description: Halves the min_pixels and max_pixels settings.
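The "lite" entry above halves the min_pixels and max_pixels image-resolution budget. A minimal sketch of how such a pixel budget constrains image size, assuming a Qwen2-VL-style "smart resize" (keep aspect ratio, snap both sides to multiples of the 28-pixel patch factor, fit the area into [min_pixels, max_pixels]); the default budgets below are illustrative assumptions, not the submission's actual values:

```python
import math

FACTOR = 28  # patch size x spatial merge, assumed from Qwen2-VL conventions

def smart_resize(h, w, min_pixels=4 * 28 * 28, max_pixels=1280 * 28 * 28):
    """Return (height, width) snapped to multiples of FACTOR whose area
    lies within [min_pixels, max_pixels], preserving aspect ratio."""
    hb = round(h / FACTOR) * FACTOR
    wb = round(w / FACTOR) * FACTOR
    if hb * wb > max_pixels:
        # shrink: scale down, then floor to the factor grid
        beta = math.sqrt((h * w) / max_pixels)
        hb = math.floor(h / beta / FACTOR) * FACTOR
        wb = math.floor(w / beta / FACTOR) * FACTOR
    elif hb * wb < min_pixels:
        # grow: scale up, then ceil to the factor grid
        beta = math.sqrt(min_pixels / (h * w))
        hb = math.ceil(h * beta / FACTOR) * FACTOR
        wb = math.ceil(w * beta / FACTOR) * FACTOR
    return hb, wb
```

Halving max_pixels roughly halves the visual-token count per page, trading accuracy for speed and memory, which matches the small ANLS gap between the lite and full 7B runs in the table below.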
method: qwen2vl-2b ensemble (2024-12-19)
Authors: Wang KeLong
Description: One Qwen2VL-2B model is trained for the MP-DocVQA classification task, and another is trained for the SP-DocVQA VQA task. The results from the four models are integrated through Qwen2VL-2B.
ANLS is computed over the answers; Accuracy is the page-prediction accuracy (%); the Page 0–Page 19 columns report ANLS broken down by the position of the page containing the answer.

Date | Method | ANLS | Accuracy | Page 0 | Page 1 | Page 2 | Page 3 | Page 4 | Page 5 | Page 6 | Page 7 | Page 8 | Page 9 | Page 10 | Page 11 | Page 12 | Page 13 | Page 14 | Page 15 | Page 16 | Page 17 | Page 18 | Page 19
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---
2025-03-18 | qwen7b-retrieval | 0.8763 | 81.6298 | 0.9259 | 0.8618 | 0.8162 | 0.8186 | 0.8120 | 0.8306 | 0.7558 | 0.8225 | 0.8740 | 0.8152 | 0.7596 | 0.7700 | 0.7529 | 0.7219 | 0.5714 | 0.5455 | 0.7899 | 0.5884 | 0.8077 | 0.6750
2025-02-27 | Qwen2.5-VL-7B-AWQ-lite | 0.8752 | 50.7870 | 0.9292 | 0.8544 | 0.8266 | 0.8306 | 0.7497 | 0.8548 | 0.7098 | 0.8329 | 0.7614 | 0.7981 | 0.7194 | 0.8100 | 0.7036 | 0.8095 | 0.6769 | 0.5839 | 0.8007 | 0.7000 | 0.9231 | 0.5000
2024-12-19 | qwen2vl-2b ensemble | 0.8501 | 85.9534 | 0.8937 | 0.8363 | 0.7980 | 0.7965 | 0.7735 | 0.7903 | 0.6698 | 0.8718 | 0.8133 | 0.7004 | 0.7618 | 0.7229 | 0.8027 | 0.7433 | 0.7207 | 0.7031 | 0.7790 | 0.7050 | 0.8846 | 1.0000
2025-03-14 | Qwen-retrieval | 0.8458 | 50.7870 | 0.8909 | 0.8348 | 0.7964 | 0.8228 | 0.7337 | 0.7931 | 0.6906 | 0.7829 | 0.8432 | 0.7045 | 0.7570 | 0.8000 | 0.7433 | 0.5667 | 0.6429 | 0.6294 | 0.7355 | 0.7084 | 0.7769 | 0.7105
2025-03-02 | Qwen2.5-VL-3B-Instruct-AWQ | 0.8405 | 50.7870 | 0.8961 | 0.8220 | 0.7964 | 0.7812 | 0.7307 | 0.8131 | 0.6666 | 0.7933 | 0.7622 | 0.6950 | 0.7073 | 0.6400 | 0.6381 | 0.6933 | 0.5952 | 0.4172 | 0.7572 | 0.7600 | 0.8154 | 0.7500
2024-08-21 | Snowflake Arctic-TILT 0.8B | 0.8122 | 50.7870 | 0.8639 | 0.7967 | 0.7551 | 0.7312 | 0.7105 | 0.7837 | 0.6916 | 0.7239 | 0.7793 | 0.6648 | 0.7817 | 0.6445 | 0.7003 | 0.6393 | 0.7202 | 0.6364 | 0.7355 | 0.5650 | 0.6923 | 1.0000
2024-01-16 | GRAM | 0.8032 | 19.9841 | 0.8380 | 0.7854 | 0.7528 | 0.7908 | 0.7452 | 0.7922 | 0.7459 | 0.7229 | 0.7464 | 0.7102 | 0.8120 | 0.6905 | 0.7589 | 0.6473 | 0.5714 | 0.5909 | 0.7454 | 0.6367 | 0.8846 | 1.0000
2024-01-16 | GRAM C-Former | 0.7812 | 19.9841 | 0.8152 | 0.7659 | 0.7363 | 0.7569 | 0.7164 | 0.7238 | 0.7407 | 0.7180 | 0.7587 | 0.8003 | 0.7624 | 0.6771 | 0.7713 | 0.6772 | 0.5798 | 0.6172 | 0.6394 | 0.5664 | 0.8327 | 0.9250
2024-02-18 | ScreenAI 5B | 0.7711 | 77.8840 | 0.8304 | 0.7394 | 0.7261 | 0.7407 | 0.6100 | 0.7213 | 0.6454 | 0.6389 | 0.6573 | 0.7500 | 0.7262 | 0.7429 | 0.6295 | 0.5147 | 0.5932 | 0.6818 | 0.5383 | 0.5900 | 0.6154 | 0.9605
2025-01-15 | mPLUG-DocOwls | 0.6932 | 50.7870 | 0.7618 | 0.6636 | 0.6403 | 0.6219 | 0.5282 | 0.5507 | 0.5361 | 0.5960 | 0.6388 | 0.6015 | 0.6342 | 0.5000 | 0.5922 | 0.4351 | 0.4612 | 0.4938 | 0.5152 | 0.5333 | 0.6368 | 0.7105
2023-10-03 | (OCR-Free) Retrieval-based Baseline | 0.6199 | 81.5501 | 0.6755 | 0.5954 | 0.5802 | 0.5611 | 0.4986 | 0.4989 | 0.5760 | 0.4991 | 0.6062 | 0.6652 | 0.5665 | 0.3438 | 0.4470 | 0.4171 | 0.3713 | 0.5909 | 0.4321 | 0.2575 | 0.7308 | 0.9605
2023-03-28 | Hi-VT5 | 0.6184 | 79.6374 | 0.6571 | 0.6055 | 0.5907 | 0.5450 | 0.5259 | 0.5431 | 0.6747 | 0.6113 | 0.5971 | 0.7997 | 0.5291 | 0.3694 | 0.5466 | 0.3373 | 0.4144 | 0.3879 | 0.4835 | 0.4001 | 0.6187 | 1.0000
2023-02-14 | (Baseline) Longformer base concat | 0.5287 | 71.1696 | 0.6293 | 0.4746 | 0.4495 | 0.4371 | 0.3966 | 0.3889 | 0.4451 | 0.3883 | 0.4805 | 0.5049 | 0.2860 | 0.1888 | 0.0861 | 0.1600 | 0.1726 | 0.2448 | 0.1486 | 0.1912 | 0.1154 | 0.6625
2023-02-14 | (Baseline) T5 base concat | 0.5050 | 0.0000 | 0.7122 | 0.4390 | 0.2567 | 0.2081 | 0.1498 | 0.1533 | 0.2186 | 0.1415 | 0.1301 | 0.3135 | 0.1108 | 0.0829 | 0.0866 | 0.0774 | 0.0873 | 0.0481 | 0.1648 | 0.2240 | 0.0000 | 0.3875
2023-02-14 | (Baseline) BigBird ITC base concat | 0.4929 | 67.5433 | 0.6506 | 0.4529 | 0.3729 | 0.2883 | 0.1890 | 0.1726 | 0.1681 | 0.1962 | 0.1887 | 0.2957 | 0.1802 | 0.0800 | 0.0829 | 0.0595 | 0.0238 | 0.1993 | 0.0778 | 0.1400 | 0.0769 | 0.2375
2023-02-14 | (Baseline) LayoutLMv3 base - concat | 0.4538 | 51.9426 | 0.6624 | 0.3962 | 0.2020 | 0.1105 | 0.1609 | 0.0494 | 0.1165 | 0.0467 | 0.0596 | 0.3198 | 0.0980 | 0.0800 | 0.0433 | 0.1131 | 0.0000 | 0.0455 | 0.0978 | 0.1467 | 0.0385 | 0.2105
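The ANLS figures above are Average Normalized Levenshtein Similarity scores, the standard DocVQA metric. A minimal sketch of the per-question score, assuming the usual 0.5 rejection threshold (function names are illustrative): the prediction is compared against every accepted gold answer, the best normalized similarity is kept, and similarities below the threshold are zeroed so near-random strings score 0.

```python
def levenshtein(a: str, b: str) -> int:
    """Classic dynamic-programming edit distance between two strings."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,          # deletion
                           cur[j - 1] + 1,       # insertion
                           prev[j - 1] + (ca != cb)))  # substitution
        prev = cur
    return prev[len(b)]

def anls_score(prediction: str, gold_answers, tau: float = 0.5) -> float:
    """Per-question ANLS: best (1 - normalized edit distance) over the
    gold answers, zeroed when the distance exceeds the threshold tau."""
    best = 0.0
    for gold in gold_answers:
        p, g = prediction.strip().lower(), gold.strip().lower()
        nl = levenshtein(p, g) / max(len(p), len(g), 1)
        best = max(best, 1.0 - nl if nl < tau else 0.0)
    return best
```

The leaderboard ANLS is then the mean of `anls_score` over all test questions; lowercasing and whitespace stripping are common normalizations, though the exact preprocessing of the official evaluator may differ.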