Reddit Has a Public JSON API Most Scrapers Ignore

Source: DEV Community

Most Reddit scrapers parse HTML and break on every redesign. But Reddit has a public JSON API hiding in plain sight.

The Secret: Just Add .json

Append .json to any Reddit URL:

https://www.reddit.com/r/programming/hot.json
https://www.reddit.com/search.json?q=web+scraping
https://www.reddit.com/r/programming/comments/abc123/title.json

What You Get

Structured JSON with 20+ fields per post:

- title, author, score, upvote_ratio
- num_comments, flair, awards
- selftext (full post body)
- url, domain, is_video, thumbnail
- created_utc, permalink

Plus full comment trees with nested replies.

Why It's Better

- Never breaks on redesigns: the JSON API is separate from the UI
- Complete data: includes fields not visible in the UI
- Faster: JSON is lighter than HTML
- Pagination: use the after parameter

Caveats

- You need a proper User-Agent header
- Rate limit: don't exceed 1 req/sec
- Cloud scraping needs a residential proxy (Reddit blocks datacenter IPs)

I built a Reddit scraper based on this approach, free on Apify Store (search knotless
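To make the approach concrete, here is a minimal Python sketch using only the standard library. It appends .json to a URL, sends a User-Agent header, waits one second between requests, and follows the after cursor. It is an illustration under stated assumptions, not the scraper mentioned above: the helper names (to_json_url, extract_posts, iter_posts), the USER_AGENT string, and the FIELDS selection are hypothetical. The response layout it relies on (a listing with data.children and data.after, where posts have kind "t3") should be checked against a live response before you depend on it.

```python
import json
import time
import urllib.parse
import urllib.request

# Hypothetical User-Agent; replace it with one that identifies your own script.
USER_AGENT = "example-json-reader/0.1 (contact: you@example.com)"

# A subset of the per-post fields listed in the article.
FIELDS = ("title", "author", "score", "upvote_ratio", "num_comments",
          "selftext", "url", "domain", "is_video", "thumbnail",
          "created_utc", "permalink")


def to_json_url(url, after=None):
    """Append .json to the URL path, keeping the query string intact."""
    parts = urllib.parse.urlsplit(url)
    path = parts.path.rstrip("/")
    if not path.endswith(".json"):
        path += ".json"
    query = urllib.parse.parse_qsl(parts.query)
    if after:
        query.append(("after", after))  # pagination cursor from the previous page
    return urllib.parse.urlunsplit(
        (parts.scheme, parts.netloc, path, urllib.parse.urlencode(query), ""))


def extract_posts(listing):
    """Return (posts, after) from a decoded listing response."""
    data = listing.get("data", {})
    posts = [
        {field: child["data"].get(field) for field in FIELDS}
        for child in data.get("children", [])
        if child.get("kind") == "t3"  # assumed: "t3" marks a post
    ]
    return posts, data.get("after")


def fetch_json(url):
    """GET a URL with the User-Agent header set and decode the JSON body."""
    request = urllib.request.Request(url, headers={"User-Agent": USER_AGENT})
    with urllib.request.urlopen(request, timeout=30) as response:
        return json.load(response)


def iter_posts(url, max_pages=3, delay=1.0, fetch=fetch_json):
    """Yield posts page by page, sleeping `delay` seconds between requests."""
    after = None
    for page in range(max_pages):
        if page:
            time.sleep(delay)  # stay at or below 1 request per second
        posts, after = extract_posts(fetch(to_json_url(url, after)))
        yield from posts
        if not after:  # no cursor means there are no further pages
            break


# Example (performs real network requests, so it is left commented out):
# for post in iter_posts("https://www.reddit.com/r/programming/hot"):
#     print(post["score"], post["title"])
```

The fetch function is passed in as a parameter so the pagination logic can be exercised against canned responses without touching the network. Comment trees have a different, nested layout and are not handled here.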