Extract Reddit Data with Reddit API JSON Endpoint

3mo

Found a great workflow for RAG pipelines and quick data extraction today.

If you need structured data from Reddit but don't want to deal with OAuth flows, just append .json to the thread URL. It returns the post and its comments as JSON.

My workflow: Reddit URL + .json --> Python/Go script --> LLM context window.

Result: quick sentiment analysis and topic extraction without the HTML-scraping overhead.

Note: Be a good citizen of the web. Set a unique User-Agent string to respect their API guidelines, and pace your requests so you stay within the 10 req/min rate limit.

#DevTips #Python #Automation #LLM #DataEngineering #RAG #API #Golang
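A minimal sketch of that workflow in Python, using only the standard library. The User-Agent value, the function names, and the example URL in the usage note are placeholders of mine, not from the post. The parsing assumes the commonly seen response shape (a two-element list: the post listing, then the comment listing, with comments marked as kind "t1"), so check it against a real response before relying on it.

```python
import json
import urllib.request
from urllib.parse import urlsplit, urlunsplit

# Hypothetical identifier; replace with something unique to your script.
USER_AGENT = "example-thread-reader/0.1 (contact: you@example.com)"


def to_json_url(thread_url: str) -> str:
    """Return the thread URL with '.json' appended to its path."""
    parts = urlsplit(thread_url)
    path = parts.path.rstrip("/")
    if not path.endswith(".json"):
        path += ".json"
    # Drop the query string and fragment (tracking parameters and so on).
    return urlunsplit((parts.scheme, parts.netloc, path, "", ""))


def fetch_thread(thread_url: str, timeout: float = 10.0):
    """Download and decode the JSON for one thread (performs a network call)."""
    request = urllib.request.Request(
        to_json_url(thread_url), headers={"User-Agent": USER_AGENT}
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)


def flatten_comments(listing: dict) -> list[str]:
    """Collect comment bodies depth-first from a comment listing."""
    bodies = []
    for child in listing.get("data", {}).get("children", []):
        if child.get("kind") != "t1":  # skip non-comment entries such as "more"
            continue
        data = child["data"]
        bodies.append(data.get("body", ""))
        replies = data.get("replies")
        if isinstance(replies, dict):  # an empty string means no replies
            bodies.extend(flatten_comments(replies))
    return bodies


def thread_to_context(payload: list) -> str:
    """Turn [post listing, comment listing] into plain text for an LLM prompt."""
    post = payload[0]["data"]["children"][0]["data"]
    lines = [post.get("title", ""), post.get("selftext", "")]
    lines.extend(flatten_comments(payload[1]))
    return "\n\n".join(line for line in lines if line)
```

Usage would look like `thread_to_context(fetch_thread("https://www.reddit.com/r/<subreddit>/comments/<id>/<slug>/"))`. If you fetch several threads, add a pause between calls (for example `time.sleep(6)`) to stay within the rate limit mentioned above.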


More Relevant Posts

Devi Sree
2mo Edited
Project: Built and deployed an end-to-end Sentiment Analysis web application using modern ML and backend tools.

Model: DistilBERT
Dataset source: Kaggle 🔗 https://lnkd.in/g-4v3TUv
Tech stack: Python, Hugging Face Transformers, DistilBERT, FastAPI, uvicorn, HTML, CSS, JavaScript, Git & GitHub

🔗 GitHub Repository: https://lnkd.in/g9XymNHv

#machinelearning #python #huggingface #sentimentanalysis #fastapi #uvicorn
GyaanSetu AI (Artificial Intelligence)

2mo
I Built A Pip-Installable RAG Chatbot

You can chat with any document in 3 lines of Python. I turned my document Q&A chatbot into a proper Python package with a clean API and CLI.

Here's how it works:
- You install the package using pip
- You import the DocumentQA class
- You use the class to index your documents and ask questions

You can use it inside a Flask API, run it from the command line, or import it into a Jupyter notebook. You can ask questions like "What are the payment terms?" and get answers from your documents.

I pulled the RAG pipeline into a clean DocumentQA class. The key design decisions include:
- Auto-detecting file types from extensions
- Building conversation memory into the class
- Creating a dual mode that works as general chat without any files, and RAG mode with files

You can try it out by running pip install docqa-rag and then using the docqa command with your document.

Source: https://lnkd.in/gqQErNVG
Sourav Singh
2mo
🎥 Week 5 | Phase 0 Foundation (Python) | Mini Project video is up.

Every LLM test needs data. Prompts. Expected outputs. Thousands of them.

This week I built dataset-loader, a tool to load, validate, and save test datasets.

What it does:
→ Load JSON files (single dataset)
→ Load JSONL files (line-by-line, standard for LLM fine-tuning data)
→ Validate test case structure
→ Save results back to file

File handling. JSON parsing. pathlib for cleaner paths. The stuff you need when working with real datasets.

If you're following along, try building it yourself first. The video's there if you get stuck or want a different perspective. Link in comments.

#GenAITesting #LLMTesting #LearnInPublic #Python
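For readers who want a starting point, here is a rough sketch of the four steps listed in the post. It is not the code from the video: the function names and the required keys ("prompt" and "expected") are assumptions chosen for illustration.

```python
import json
from pathlib import Path

# Assumed test-case schema; the post does not say which fields are required.
REQUIRED_KEYS = {"prompt", "expected"}


def load_dataset(path):
    """Load test cases from a .json file (one document) or a .jsonl file (one case per line)."""
    path = Path(path)
    text = path.read_text(encoding="utf-8")
    if path.suffix == ".jsonl":
        # JSONL: parse each non-empty line as its own JSON value.
        return [json.loads(line) for line in text.splitlines() if line.strip()]
    data = json.loads(text)
    # Accept either a list of cases or a single case object.
    return data if isinstance(data, list) else [data]


def validate_cases(cases):
    """Return a list of error messages; an empty list means every case passed."""
    errors = []
    for index, case in enumerate(cases):
        if not isinstance(case, dict):
            errors.append(f"case {index}: expected an object, got {type(case).__name__}")
            continue
        missing = REQUIRED_KEYS - case.keys()
        if missing:
            errors.append(f"case {index}: missing {sorted(missing)}")
    return errors


def save_jsonl(cases, path):
    """Write cases back to disk, one JSON object per line."""
    lines = (json.dumps(case, ensure_ascii=False) for case in cases)
    Path(path).write_text("\n".join(lines) + "\n", encoding="utf-8")
```

Returning a list of error messages instead of raising on the first problem is a design choice: with thousands of cases it is usually more useful to see every malformed entry in one pass.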

Santhosh Reddy
2mo
Built a Python-based web scraper that collects top news headlines from a public website and stores them in a text file. The project uses HTTP requests to fetch HTML content and BeautifulSoup to parse and extract headline data automatically. This helped me practice web scraping, HTML parsing, and basic file handling in Python. GitHub: https://lnkd.in/gi56cKgZ #Python #WebScraping #BeautifulSoup #Automation #Learning

GitHub - santhoshreddy28/news-headline-scraper: This is a simple Python script that scrapes top news headlines from a public website and saves them into a text file. Instead of opening a browser and scrolling endlessly, this script does the boring work automatically. github.com
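The parse-and-save half of such a scraper can be sketched as follows. This is not the code from the linked repository: it swaps BeautifulSoup for the standard library's html.parser so it runs without extra installs, and it assumes headlines live in <h2> tags, which varies from site to site.

```python
from html.parser import HTMLParser
from pathlib import Path


class HeadlineParser(HTMLParser):
    """Collect the text inside <h2> tags (assumed headline markup)."""

    def __init__(self):
        super().__init__()
        self.headlines = []
        self._in_headline = False
        self._parts = []

    def handle_starttag(self, tag, attrs):
        if tag == "h2":
            self._in_headline = True
            self._parts = []

    def handle_data(self, data):
        # Text inside nested tags (such as links) also arrives here.
        if self._in_headline:
            self._parts.append(data)

    def handle_endtag(self, tag):
        if tag == "h2" and self._in_headline:
            # Join the fragments and collapse runs of whitespace.
            text = " ".join("".join(self._parts).split())
            if text:
                self.headlines.append(text)
            self._in_headline = False


def extract_headlines(html):
    """Return the headline strings found in an HTML document."""
    parser = HeadlineParser()
    parser.feed(html)
    parser.close()
    return parser.headlines


def save_headlines(headlines, path):
    """Write one headline per line to a text file."""
    Path(path).write_text("\n".join(headlines) + "\n", encoding="utf-8")
```

The HTML string passed to extract_headlines would come from an HTTP request to the news site, as the post describes.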

Onkar Lapate
2mo
How does the asyncio event loop actually work?

TL;DR: It's a loop that checks: "Which tasks are ready to run right now?"

The event loop keeps doing the following:
1. Check which tasks are ready
2. Run one task until it hits 'await'
3. Task pauses, switch to the next ready task
4. Repeat

What happens when code hits await some_operation()?
-> Task says: "I'm waiting for I/O"
-> Event loop skips it, runs another task
-> When I/O completes, task becomes ready again
-> Loop picks it up and resumes

Catch: the event loop runs in one thread. This is why blocking calls freeze everything.

I'm deep-diving into Python internals and performance. Follow along and share your experiences in the comments.

#Python #PythonInternals #SoftwareEngineering #BackendDevelopment #Asyncio
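A small sketch that makes the switching visible. The worker and main names are illustrative, and asyncio.sleep(0) stands in for real I/O: it only hands control back to the event loop once.

```python
import asyncio


async def worker(name, log, steps):
    """Record each step, then yield to the event loop at the await."""
    for i in range(steps):
        log.append(f"{name}:{i}")
        await asyncio.sleep(0)  # pause here so another ready task can run


async def main():
    log = []
    # Both workers run as tasks on the same single-threaded loop.
    await asyncio.gather(worker("a", log, 2), worker("b", log, 2))
    return log


log = asyncio.run(main())
print(log)  # ['a:0', 'b:0', 'a:1', 'b:1']
```

The interleaved output shows each task running only until its next await. Replacing the await with a blocking call such as time.sleep(1) would keep the loop stuck inside one worker, which is the freeze described above.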
Gopinath S
2mo
🚀 Day 44 of #100DaysOfCode

📌 Problem: 762 – Prime Number of Set Bits in Binary Representation

Today I solved an interesting bit manipulation problem on LeetCode that combines binary representation with prime number logic.

🔎 Problem Summary: Given two integers left and right, count how many numbers in that range (inclusive) have a prime number of set bits (1s) in their binary form.

💡 Key Insight: Numbers ≤ 10⁶ fit in 20 bits, so the number of set bits is at most 20. We only need to check the primes up to 20: {2, 3, 5, 7, 11, 13, 17, 19}

For each number, count the set bits using num.bit_count() (available in Python 3.10+).

✅ Python Solution:

    class Solution:
        def countPrimeSetBits(self, left: int, right: int) -> int:
            # Primes that a set-bit count of at most 20 can equal.
            primes = {2, 3, 5, 7, 11, 13, 17, 19}
            count = 0
            for num in range(left, right + 1):
                if num.bit_count() in primes:
                    count += 1
            return count

⏱ Time Complexity: O(n), where n = right - left + 1
🧠 Concepts Used: Bit Manipulation | Prime Numbers | Set

Every day I'm getting more comfortable with binary operations and optimization techniques.

#LeetCode #Day42 #CodingJourney #Python #ProblemSolving #BitManipulation #100DaysOfCode
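On interpreters older than Python 3.10, where int.bit_count() is not available, the same idea can be sketched with bin(). The function name below is mine, and the two checks use small ranges worked out by hand.

```python
# Primes that a set-bit count of at most 20 can equal.
PRIMES = {2, 3, 5, 7, 11, 13, 17, 19}


def count_prime_set_bits(left, right):
    """Count numbers in [left, right] whose binary form has a prime number of 1s."""
    # bin(6) is '0b110'; counting '1' characters gives the number of set bits.
    return sum(bin(num).count("1") in PRIMES for num in range(left, right + 1))


# 6=110, 7=111, 9=1001, 10=1010 qualify; 8=1000 has a single set bit.
print(count_prime_set_bits(6, 10))   # 4
# 10, 11, 12, 13, 14 qualify; 15=1111 has four set bits.
print(count_prime_set_bits(10, 15))  # 5
```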
Joshua Odmark
2mo
Check out this blog post I wrote this morning. It shows a great way to improve the success rate of Claude Code when fetching things on the Internet, especially for web scraping! It involves injecting bash and Python code into Claude Code's context, and it happens automatically in the background for WebFetch calls. https://lnkd.in/gwCMeHSm
Bijoy Sharma
2mo
Recently, I was working on a problem that required dynamically constructing a string. My initial implementation was straightforward and functionally correct. At first glance, it seemed perfectly acceptable. However, upon reviewing the logic more carefully, I revisited how Python handles strings internally. Since strings in Python are immutable, each concatenation inside a loop creates a new string object. This means that with every iteration, memory is reallocated and the existing content is copied over. As input size increases, this results in repeated allocations and copying — leading to unnecessary overhead and potential quadratic time complexity. While this may not be noticeable for small inputs, it becomes increasingly inefficient in production environments where code runs frequently or processes large datasets. To optimize the solution, I refactored the implementation to accumulate values in a list and join them at the end. This approach avoids repeated string creation and achieves linear time complexity, improving both performance and memory efficiency. It was a small refactor, but a meaningful one. Moments like this are a good reminder that understanding language internals — even for simple operations — can significantly impact the quality and efficiency of the code we write. #Python3 #Performance #CleanCode #SoftwareEngineering #Optimization
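The refactor described above might look like the sketch below. The function names and the sample input are illustrative, since the post does not show the original code.

```python
def build_with_concat(parts):
    """Original approach: each += may allocate a new string and copy the old one."""
    result = ""
    for part in parts:
        result += part
    return result


def build_with_join(parts):
    """Refactored approach: collect pieces in a list, then join once at the end."""
    pieces = []
    for part in parts:
        pieces.append(part)
    return "".join(pieces)


words = [f"item{i}," for i in range(5)]
# Both approaches produce the same string; only the cost of building it differs.
print(build_with_concat(words) == build_with_join(words))  # True
```

CPython can sometimes resize the string in place for the += pattern, so the quadratic behavior does not always show up in practice. The join version avoids depending on that interpreter-specific optimization.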
Arturo Javier Borbon Rojas
3mo
Weekly Challenge 2: Sum Two Numbers Optimized.

Yesterday I shared a Brute Force solution for the Two Sum problem. It worked, but it is slow (O(n^2)). Today, let's optimize it using a Hash Map.

The strategy: instead of using nested loops to compare everything against everything, we use memory to our advantage. As we iterate through the list, we calculate the "complement" (Target - Current) and ask, "Have I seen this number before?"

The result is a highly efficient Python implementation of the 'Two Sum' problem using a Hash Map (Dictionary). Unlike the Brute Force approach, this script solves the problem in a single pass (O(n)) by storing visited numbers and checking for their complement in constant time on average.

Check the code on my GitHub: https://lnkd.in/eq5cQvWT

#python #Optimization #Algorithms #DataStructures #BigO
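A minimal sketch of the single-pass strategy. This is not the code from the linked repository, and the function name and return convention (a pair of indices, or None when no pair exists) are assumptions.

```python
def two_sum(nums, target):
    """Return indices (i, j) with nums[i] + nums[j] == target, or None if no pair exists."""
    seen = {}  # maps each visited value to its index
    for index, value in enumerate(nums):
        complement = target - value
        # "Have I seen this number before?" is an average O(1) dictionary lookup.
        if complement in seen:
            return seen[complement], index
        seen[value] = index
    return None


print(two_sum([2, 7, 11, 15], 9))  # (0, 1)
```

Checking for the complement before storing the current value keeps an element from being paired with itself, for example when the target is 6 and the list contains a single 3.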
