PyMuPDF Tutorial - Search News

Extraction of User-Defined Information from PDF

Abstract: Exporting selected textual data from PDF formats is a challenging task due to the diverse structures of these documents. This project introduces a tool for efficient extraction of ...

GitHub

Multimodal RAG & Evaluation Pipeline

This project implements a clean, modular pipeline for technical PDFs: ingestion → index → RAG → evaluation. It extracts text, tables, and images, builds a vector index, answers questions with grounded ...

GitHub
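
The stages named in the snippet above (ingestion → index → retrieval → evaluation) can be sketched in a few lines. This is only an illustrative toy, not the project's code: every function name here is hypothetical, page text is passed in as plain strings (in practice it would come from a PDF extractor such as PyMuPDF), and a bag-of-words cosine similarity stands in for a real embedding-based vector index. Table and image extraction and answer generation are left out.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def ingest(pages):
    """Turn raw page texts into chunks, skipping empty pages.

    Assumption: one chunk per page; page numbers are 1-based.
    """
    return [{"page": i + 1, "text": t} for i, t in enumerate(pages) if t.strip()]


def build_index(chunks):
    """Pair each chunk with a term-count vector (a stand-in for embeddings)."""
    return [(chunk, Counter(tokenize(chunk["text"]))) for chunk in chunks]


def cosine(a, b):
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(count * b[term] for term, count in a.items() if term in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(index, question, k=1):
    """Return the k chunks most similar to the question."""
    query = Counter(tokenize(question))
    ranked = sorted(index, key=lambda item: cosine(query, item[1]), reverse=True)
    return [chunk for chunk, _ in ranked[:k]]


def evaluate(index, labelled):
    """Fraction of (question, expected_page) pairs whose top hit is the expected page."""
    hits = sum(
        1 for question, page in labelled
        if retrieve(index, question, k=1)[0]["page"] == page
    )
    return hits / len(labelled)


# Made-up sample pages, for illustration only.
PAGES = [
    "Installation requires Python and the pip package manager.",
    "Table 2 lists torque limits for each motor model.",
    "",
    "Figure 4 shows the wiring diagram for the controller.",
]

if __name__ == "__main__":
    index = build_index(ingest(PAGES))
    print(retrieve(index, "torque limits motor model")[0]["page"])
    print(evaluate(index, [("wiring diagram controller", 4), ("pip installation", 1)]))
```

Keeping each stage as a separate function mirrors the modular layout the snippet describes: the toy index could be swapped for a real vector store without touching ingestion or evaluation.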

hypeprlane/complex-pdf-rag

A comprehensive PDF processing pipeline that extracts structured data from complex PDFs, including OCR text, tables, images, and rich context-aware metadata using Large Language Models (LLMs).
