🚀 Just shipped a PDF text and image extraction tool.

I built a full-stack system that converts PDFs into structured outputs you can actually work with. The goal: make it simple to extract both text and visual content from large documents to feed your LLM easily.

What it does
📄 Extracts text from PDFs and converts it into clean Markdown (headers, paragraphs, tables)
🖼 Detects and exports figures and tables as separate images
📦 Supports batch uploads with a live progress tracker
⬇️ One-click Download All to export everything as a ZIP

Tech stack
🖥 Frontend: Next.js 14 (App Router), TypeScript, React, Tailwind, deployed on Vercel
🐍 Backend: Python + Flask with a sequential job queue for reliable multi-file processing, deployed on Hugging Face Spaces
🔗 Architecture: Next.js API proxy routes backend calls and keeps the HF Space private and secure
📑 PDF processing: PyMuPDF4LLM for text extraction + DocLayout-YOLO for layout detection

Challenges I ran into
🧩 Tables and figures split across pages → built logic to detect bounding boxes across pages and stitch them into a single image
📝 Pairing images with their captions → added spatial matching between figures and nearby caption blocks
⚙️ Handling multi-file uploads safely → implemented a sequential background queue

🎥 You can try a live demo here: https://lnkd.in/dGhQwa6N

#DataEngineering #Python #NextJS #PDFProcessing #DataExtraction #FullStackDevelopment #BuildInPublic
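For anyone curious what "spatial matching between figures and nearby caption blocks" can look like, here is a minimal sketch. It is not the code from this project: the `Box` type, the `match_caption` helper, and the `max_gap` and `min_overlap` thresholds are all hypothetical, and it assumes the layout detector has already produced figure and caption boxes in the same page coordinates (origin at the top-left, y growing downward).

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Box:
    """Axis-aligned box in page coordinates (hypothetical type, y grows downward)."""
    x0: float
    y0: float
    x1: float
    y1: float


def horizontal_overlap(a: Box, b: Box) -> float:
    """Fraction of the narrower box's width that both boxes share."""
    shared = min(a.x1, b.x1) - max(a.x0, b.x0)
    narrower = min(a.x1 - a.x0, b.x1 - b.x0)
    return max(0.0, shared) / narrower if narrower > 0 else 0.0


def match_caption(figure: Box, captions: list[Box],
                  max_gap: float = 40.0, min_overlap: float = 0.5) -> Optional[int]:
    """Return the index of the nearest caption just below or above the figure.

    The thresholds are illustrative guesses, not values taken from the project.
    """
    best_index, best_gap = None, max_gap
    for i, cap in enumerate(captions):
        # Skip captions that are not roughly in the same column as the figure.
        if horizontal_overlap(figure, cap) < min_overlap:
            continue
        # Vertical distance when the caption sits below the figure, else above it.
        if cap.y0 >= figure.y1:
            gap = cap.y0 - figure.y1
        else:
            gap = figure.y0 - cap.y1
        # A negative gap means the boxes overlap vertically, so it is not a caption.
        if 0 <= gap <= best_gap:
            best_index, best_gap = i, gap
    return best_index
```

Calling `match_caption(figure_box, caption_boxes)` gives the index of the best candidate, or `None` when nothing is close enough. A real pipeline would probably also need to handle side captions and multi-column pages.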

Very impressive, Saif. Wouldn't it be better to credit what or who helped, for example?

Great work, Saif, so impressive ❤️❤️

👏🏻👏🏻👏🏻👏🏻👏🏻👏🏻

Just shipped. That's the move. Next.js, TypeScript, Tailwind. The stack that gets out of the way. Python backend for the heavy lifting. Simple. What was the hardest part? PDF parsing or the integration?
