Abstract: Document Visual Question Answering (DocVQA) offers a promising approach to extracting insights from large document corpora. However, existing benchmarks focus on evaluating multi-modal ...