Name: PDF Text and Image Extraction Tool Built with Next.js and Python | Saif Saeed posted on the topic | LinkedIn
Uploaded: 2026-03-09T15:26:19.947Z
Duration: 1 min 5 s
Channel: Saif Saeed

Saif Saeed

1mo

🚀 Just shipped a PDF text and image extraction tool.

I built a full stack system that converts PDFs into structured outputs you can actually work with. The goal: make it simple to extract both text and visual content from large documents to feed your LLM easily.

What it does
📄 Extracts text from PDFs and converts it into clean Markdown (headers, paragraphs, tables)
🖼 Detects and exports figures and tables as separate images
📦 Supports batch uploads with a live progress tracker
⬇️ One-click Download All to export everything as a ZIP

Tech stack
🖥 Frontend: Next.js 14 (App Router), TypeScript, React, Tailwind, deployed on Vercel
🐍 Backend: Python + Flask with a sequential job queue for reliable multi-file processing, deployed on Hugging Face Spaces
🔗 Architecture: Next.js API proxy routes backend calls and keeps the HF Space private and secure
📑 PDF processing: PyMuPDF4LLM for text extraction + DocLayout-YOLO for layout detection

Challenges I ran into
🧩 Tables and figures split across pages → built logic to detect bounding boxes across pages and stitch them into a single image
📝 Pairing images with their captions → added spatial matching between figures and nearby caption blocks
⚙️ Handling multi-file uploads safely → implemented a sequential background queue

🎥 You can try a live demo here: https://lnkd.in/dGhQwa6N

#DataEngineering #Python #NextJS #PDFProcessing #DataExtraction #FullStackDevelopment #BuildInPublic
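The caption-pairing step mentioned in the post could look roughly like the sketch below. This is an illustration only, not the author's code: the `Box` type, the `match_captions` helper, and the `max_gap` threshold are hypothetical. It assumes layout detection has already produced figure and caption bounding boxes in page coordinates, with y growing downward.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Box:
    """Axis-aligned bounding box on one page (hypothetical type)."""
    page: int
    x0: float
    y0: float
    x1: float
    y1: float


def horizontal_overlap(a: Box, b: Box) -> float:
    """Width of the horizontal overlap between two boxes, 0 if none."""
    return max(0.0, min(a.x1, b.x1) - max(a.x0, b.x0))


def vertical_gap(figure: Box, caption: Box) -> float:
    """Distance between the boxes along y; 0 if they touch or overlap."""
    if caption.y0 >= figure.y1:  # caption sits below the figure
        return caption.y0 - figure.y1
    if caption.y1 <= figure.y0:  # caption sits above the figure
        return figure.y0 - caption.y1
    return 0.0


def match_captions(figures, captions, max_gap=40.0):
    """Pair each figure with the nearest unused caption on the same page.

    A caption is a candidate only if it overlaps the figure horizontally
    and lies within `max_gap` units above or below it (assumed threshold).
    Figures with no candidate map to None.
    """
    pairs = {}
    used = set()
    for fig in figures:
        best: Optional[Box] = None
        best_gap = max_gap
        for cap in captions:
            if cap in used or cap.page != fig.page:
                continue
            if horizontal_overlap(fig, cap) <= 0:
                continue
            gap = vertical_gap(fig, cap)
            if gap <= best_gap:
                best, best_gap = cap, gap
        if best is not None:
            used.add(best)
        pairs[fig] = best
    return pairs
```

Greedy nearest-neighbour matching like this is simple but order-dependent. A real pipeline might also weigh reading order or caption text such as "Figure 3".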

9 Comments

youssef hussain 1mo

Very impressive, Saif. Wouldn't it be better to credit what or who helped, for example?

1 Reaction

Mahmoud Mohamed 1mo

Great work, Saif, so impressive ❤️❤️

1 Reaction

Hager Henidy 1mo

👏🏻👏🏻👏🏻👏🏻👏🏻👏🏻

1 Reaction

Hager Henidy 1mo

So impressive♥️♥️

1 Reaction

Mohamed El-Sayed 4w

Amazing 👏

Abdelrahman Haroun 1mo

Amazing work ❤️

1 Reaction

Aziz Ali 1mo

Just shipped. That's the move. Next.js, TypeScript, Tailwind. The stack that gets out of the way. Python backend for the heavy lifting. Simple. What was the hardest part? PDF parsing or the integration?


More Relevant Posts

Hafiz Abdullah Yasir
1mo
🚀 𝗡𝗲𝘄 𝗣𝗥 𝗠𝗲𝗿𝗴𝗲𝗱 Just tackled a fun logical challenge: finding the intersection of two arrays. The goal was to identify elements present in both input arrays. I approached this using JavaScript. My strategy involved iterating through the first array and checking for the existence of each element in the second array. To optimize this lookup, I leveraged a Set data structure, which provides average O(1) time complexity for checking membership. During the 🐞 𝗗𝗲𝗯𝘂𝗴𝗴𝗶𝗻𝗴 𝗣𝗿𝗼𝗰𝗲𝘀𝘀, I found dry runs and visualizing the data flow particularly helpful. Stepping through the code with a debugger allowed me to pinpoint exactly where my logic was diverging from the expected output. A 📚 𝗞𝗲𝘆 𝗟𝗲𝗮𝗿𝗻𝗶𝗻𝗴 for me was the significant performance improvement gained by using a Set for lookups compared to nested loops or Array.prototype.includes within a loop. Check out the implementation and contribute to the discussion here: https://lnkd.in/dvQbUFGK How do you typically ⚙️ 𝗔𝗽𝗽𝗿𝗼𝗮𝗰𝗵 array intersection problems? 📦 Repo: https://lnkd.in/dvQbUFGK #Algorithm #JavaScript #ProblemSolving #DataStructures #Set #CodingChallenge #Developer #Tech #InterviewQuestion #LogicalThinking
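The Set-based lookup described in the post can be sketched as follows. The original implementation is in JavaScript and is not shown here, so this is a Python rendering of the same idea under assumptions: the `intersection` name and the choice to drop duplicates while keeping first-appearance order are mine.

```python
def intersection(first, second):
    """Return elements of `first` that also occur in `second`.

    Duplicates are dropped and the order of first appearance in `first`
    is kept (an assumed convention; the post does not specify one).
    """
    lookup = set(second)  # average O(1) membership checks
    seen = set()
    result = []
    for item in first:
        if item in lookup and item not in seen:
            seen.add(item)
            result.append(item)
    return result
```

Building the set costs O(m) once, so the whole pass is O(n + m) on average. Scanning the second array for every element of the first is O(n * m).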
1 Comment
Azwer Mateen
4w
I’m excited to share a project I’ve been working on! I developed a full-stack Car Price Prediction System that estimates the market value of a vehicle based on its features.

The Tech Stack:
🐍 Backend: Python & Flask
📊 Data Science: Pandas & Scikit-Learn (Linear Regression)
💻 Frontend: HTML5, Bootstrap 5, & JavaScript (AJAX)

Key Challenges Solved:
Data Cleaning: Processed a raw dataset to handle missing values and inconsistent naming.
Dynamic UI: Built a dependent dropdown system using JavaScript so users only see models corresponding to the selected brand.
Asynchronous Prediction: Used AJAX to deliver real-time predictions without refreshing the page.

Check out the demo below! I'd love to hear your thoughts on how to improve the model accuracy or the UI experience.

Link to the GitHub repo: https://lnkd.in/dHCUggPY

#Python #DataScience #WebDev #MachineLearning #Flask #PortfolioProject

1 Comment
chris mccabe
1mo Edited
I built a LeetCode-style coding playground inside a browser, with zero backend. No server. No containers. No API calls. Just your browser running real Python and JavaScript.

The Technical Interview Prep tool now lets you:

Write and run actual code against test cases, instantly
- Python runs via WebAssembly (Pyodide): a full Python interpreter in your browser
- JavaScript executes in isolated Web Workers: fast and sandboxed
- 19 classic interview problems with auto-generated function stubs
- Real test runners with pass/fail results, stdout capture, and execution timing

Practice the way you'll be tested
- Two Sum, Valid Parentheses, Binary Search, Merge Intervals: the hits are all here
- Switch between Python and JavaScript with one click
- See input, expected output, and your actual output side by side
- Get instant feedback, with no waiting for a server round-trip

The guided framework that sets it apart
This isn't just a code editor. Each problem walks you through a 10-step structured approach:
Restate → Clarify → I/O/Constraints → Brute Force → Trade-offs → Optimal → Time O(?) → Space O(?) → Edge Cases → Reflect
Pattern recognition drills, complexity quizzes, flashcards, and a mock interview timer round it out.

The fun part? The hardest bug wasn't an algorithm. It was the Alpine.js DevTools Chrome extension silently blocking our Web Workers from loading. Hours of debugging CSP headers, blob URLs, and service workers... only to find a browser extension was the culprit.

Link in the comments

Day 34(ish) of building a tool every day for 100 days

#SoftwareEngineering #InterviewPrep #WebDevelopment #Python #JavaScript #CodingInterview #TechCareers #BuildInPublic
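A test runner with pass/fail results, stdout capture, and execution timing, as listed in the post, might be sketched like this in plain Python. It is not the tool's actual code: `run_tests`, the result dictionary keys, and the sample `two_sum` solution are hypothetical. It uses only the standard library, so it makes no assumptions about how Pyodide is wired in.

```python
import contextlib
import io
import time


def run_tests(func, cases):
    """Run `func` against (args, expected) pairs and collect one result per case."""
    results = []
    for args, expected in cases:
        buffer = io.StringIO()
        start = time.perf_counter()
        try:
            # Anything the solution prints is captured instead of shown.
            with contextlib.redirect_stdout(buffer):
                actual = func(*args)
            error = None
        except Exception as exc:  # report the failure instead of crashing the runner
            actual, error = None, repr(exc)
        elapsed_ms = (time.perf_counter() - start) * 1000
        results.append({
            "passed": error is None and actual == expected,
            "expected": expected,
            "actual": actual,
            "stdout": buffer.getvalue(),
            "error": error,
            "elapsed_ms": elapsed_ms,
        })
    return results


def two_sum(nums, target):
    """Sample solution under test: indices of two numbers that sum to target."""
    seen = {}
    for i, n in enumerate(nums):
        if target - n in seen:
            return [seen[target - n], i]
        seen[n] = i
    print("no pair found")
    return []
```

Calling `run_tests(two_sum, [(([2, 7, 11, 15], 9), [0, 1])])` returns a one-element list whose entry carries the expected value, the actual value, and any captured output, which is enough to render them side by side.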
1 Comment
A3 IT SKILLS SOLUTIONS

1,908 followers
1mo
Behind the Screen: What it actually takes to build for the web 💻 This visual perfectly summarizes the modern developer's ecosystem. It’s never "just a website." It’s a combination of: Front-end: HTML/CSS/JS & UI Design (The face of the project). Back-end: Python, APIs, and Data (The brain). Infrastructure: Cloud Platforms & Version Control (The backbone). The complexity is high, but the reward of seeing it all come together is higher. 🚀 #Python #Javascript #CloudComputing #CodingLife #WebDesign
1 Comment
DIVYAM GUPTA
1mo
Built something today because nothing out there worked the way I needed. I came up with a new sorting approach, “Zipper Sort”, and wanted to visualize it step by step. Existing tools didn’t cut it (too abstract, not real execution), so I built my own 👇

🚀 ALGO_TYPEWRITER: a browser-based visualizer that runs real C/C++ sorting code and animates it live.

💡 What it does
• Paste C/C++ → hit RUN
• Watch comparisons (green) & swaps (bars slide, not resize)
• Pause/resume, control speed, adjust array size
• Get stats: time, comparisons, swaps + inferred complexity

⚙️ How it works
• Regex-based transpiler → C/C++ → JS
• Proxy layer logs comparisons/swaps (no code modification)
• Action log replayed as smooth animations

📊 Complexity is measured empirically, not hardcoded.

🔥 Built this to test one idea… ended up building a full system.

Would love to hear your thoughts on Zipper Sort and whether tools like this would help you understand algorithms better.

#buildinpublic #algorithms #cpp #javascript #webdev #learningbydoing

Link to the GitHub repo housing this project: https://lnkd.in/dmdHfWQf
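The "log actions, then replay them" idea can be illustrated with a small Python sketch. It departs from the post in two stated ways. The post's tool uses a JavaScript Proxy so the sorting code needs no changes, while this sketch uses an explicit wrapper with `less` and `swap` methods. Zipper Sort is not described in the post, so bubble sort stands in for it. `TracedArray`, `bubble_sort`, and `replay` are hypothetical names.

```python
class TracedArray:
    """List wrapper that records every comparison and swap (hypothetical helper)."""

    def __init__(self, values):
        self._values = list(values)
        self.log = []  # entries look like ("compare" | "swap", i, j)

    def __len__(self):
        return len(self._values)

    def less(self, i, j):
        """Log and evaluate values[i] < values[j]."""
        self.log.append(("compare", i, j))
        return self._values[i] < self._values[j]

    def swap(self, i, j):
        """Log and perform a swap of positions i and j."""
        self.log.append(("swap", i, j))
        self._values[i], self._values[j] = self._values[j], self._values[i]

    def snapshot(self):
        return list(self._values)


def bubble_sort(arr):
    """Stand-in algorithm that only touches the array via less() and swap()."""
    n = len(arr)
    for end in range(n - 1, 0, -1):
        for i in range(end):
            if arr.less(i + 1, i):
                arr.swap(i, i + 1)


def replay(initial, log):
    """Rebuild the array states from the action log, one frame per swap."""
    state = list(initial)
    frames = [list(state)]
    for action, i, j in log:
        if action == "swap":
            state[i], state[j] = state[j], state[i]
            frames.append(list(state))
    return frames
```

Counting the `"compare"` and `"swap"` entries for several input sizes is one way to estimate complexity empirically, as the post describes.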

1 Comment
Oleg A.
4w
I compared the same logic in JS and Rust. The result? The "complex" Rust version wasn't just drastically faster; it was actually shorter and cleaner.

If you’ve worked with JavaScript, Python, or Java, you’ve likely encountered the classic problem of counting how many times each character appears in a string. In JavaScript, the typical approach looks like this:

    if (map.has(ch)) { map.set(ch, map.get(ch) + 1); } else { map.set(ch, 1); }

While this seems straightforward, there’s a hidden performance flaw: The Double Lookup & Value Copying. This snippet requires extra work from the engine:
1️⃣ map.get(ch): Calculates the hash, traverses memory, finds the bucket, and extracts a copy of the number.
2️⃣ + 1: Creates a brand-new number primitive in memory.
3️⃣ map.set(ch, ...): Calculates the hash again, traverses memory again, finds the same bucket, and copies the new number back into it.

Now, let's see how Rust handles the same logic:

    *counts.entry(ch).or_insert(0) += 1;

This isn't just syntactic sugar; it utilizes Rust's Entry API, designed for maximum hardware efficiency. Here’s why it’s blazingly fast:
- It calculates the hash exactly once.
- It locates the memory bucket exactly once.
- It returns a &mut (a direct mutable pointer/reference) right to that memory slot.

The += operator modifies the primitive value in-place without copying it out or needing a .set() method to put it back. This results in code that reads like a high-level script but executes with the speed of a systems language. Zero-cost abstractions at their finest!

#Rust #JavaScript #Programming #Performance #SoftwareEngineering #WebDev #RustLang
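The post mentions Python alongside JavaScript, so here is the same character-count problem in Python for comparison. This is an illustrative sketch with hypothetical function names. It makes no performance claims and is not a benchmark of the JavaScript or Rust versions above.

```python
from collections import Counter, defaultdict


def count_with_get(text):
    """Explicit read-then-write, closest in shape to the JavaScript version."""
    counts = {}
    for ch in text:
        counts[ch] = counts.get(ch, 0) + 1
    return counts


def count_with_defaultdict(text):
    """Missing keys start at 0, so the loop body is a single update."""
    counts = defaultdict(int)
    for ch in text:
        counts[ch] += 1
    return dict(counts)


def count_with_counter(text):
    """The standard library's purpose-built counting type."""
    return dict(Counter(text))
```

All three return the same mapping. They differ only in how much of the "insert if missing, then increment" bookkeeping is written out by hand.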
51 Comments
Arthur Chukwuka
1mo
Currently building a Geographic Information System (GIS)-based application, and it has pushed me slightly outside my usual stack. I mostly work with JavaScript (React, modern frontend tools), but for this project I need OpenCV, which works better with Python. So I’m learning Flask to integrate the computer vision component with the web app.

Interesting challenges so far:
• Structuring communication between the Python backend and the JS frontend
• Managing performance when processing images
• Keeping the architecture simple while mixing technologies

One thing I’m learning again is that real-world projects often require flexibility beyond your primary stack. If anyone has experience combining Flask + frontend frameworks, I’d appreciate any tips or best practices.
Subha Sundar Das
1mo
🚀 𝗝𝘂𝘀𝘁 𝗘𝘅𝗽𝗲𝗿𝗶𝗺𝗲𝗻𝘁𝗶𝗻𝗴 𝘄𝗶𝘁𝗵 𝗧𝗲𝘅𝘁𝘂𝗮𝗹... 𝗮𝗻𝗱 𝗶𝘁’𝘀 𝗶𝗻𝘁𝗲𝗿𝗲𝘀𝘁𝗶𝗻𝗴 👀

Lately, I’ve been playing around with Textual (Python TUI framework), no big project yet, just pure experimentation. And honestly… it feels different.

💡 Building UI without a browser
💡 No React, no Angular
💡 Just Python + terminal

Still early for me, but a few things stood out:
• Super fast to spin up
• Clean UI with CSS-like styling
• Everything in one language (Python)
• Runs anywhere, even over SSH

Not saying it replaces web or desktop apps… But for internal tools, dashboards, or admin panels, this could be really useful. For now, I’m just exploring and testing ideas. Let’s see where it goes.

𝗔𝗻𝘆𝗼𝗻𝗲 𝗲𝗹𝘀𝗲 𝘁𝗿𝗶𝗲𝗱 𝗧𝗲𝘅𝘁𝘂𝗮𝗹 𝘆𝗲𝘁?

#Python #Textual #LearningInPublic #DevExperiment #FullStackDeveloper
Clark Thompson
1mo Edited
I went back and cleaned up an old project today, a stock trend scanner I originally built as a Python desktop app like 18 months ago. At the time, it worked, but it was very much a “get it working” type build. PySide UI, everything kind of bundled together, not something you’d confidently show to someone.

So instead of building something new, I treated it like a refactor exercise:
- kept the core scan logic
- pulled it apart into a proper backend (FastAPI)
- rebuilt the UI as a web dashboard (Next.js + Tailwind)
- reorganized the repo so it actually reads like a real project

No over-engineering, no full rewrite.

Big takeaway for me: A lot of “old” projects aren’t bad, they’re just badly presented. Refactoring them forces you to think more like:
1) a product engineer (what should stay vs go)
2) a reviewer (does this repo make sense at a glance)
3) a user (does this flow actually feel usable)

If you’ve got old projects sitting around, it’s worth revisiting them. There’s usually more value there than you think.
