Reddit Scraper | Extract Reddit Data Effortlessly

Reddit Collector

Scrape all posts made under a specific subreddit.

PROBLEM STATEMENT

Our client wanted to download all data (including post text and media) on demand for any given subreddit, for use in data analysis and machine learning. The app had to be fast and reliable so that a large amount of data could be collected without any missing pieces.

SOLUTION

The app crawled all posts made under a specific subreddit within a given time period. For each post scraped, the extracted data includes the title, timestamp, post text, permalink, category, votes, media type (audio/video), and the media itself. The textual data was saved in MongoDB, while the media files were saved on the client's server.
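As a rough illustration of the extraction step, the sketch below maps one raw post to the fields listed above and filters by time period. It is not the project's actual code. It assumes the input is a post dict shaped like Reddit's public JSON listing (keys such as `created_utc`, `selftext`, `link_flair_text`, `score`, `is_video`, `url`); treating the post flair as the "category" is a guess, audio detection is left out, and the names `to_record` and `in_period` are hypothetical.

```python
from datetime import datetime, timezone


def to_record(post):
    """Map one raw post dict to the fields described above (assumed key names)."""
    created = datetime.fromtimestamp(post["created_utc"], tz=timezone.utc)
    is_video = bool(post.get("is_video"))
    return {
        "title": post.get("title", ""),
        "timestamp": created.isoformat(),
        "post_text": post.get("selftext", ""),
        "permalink": "https://www.reddit.com" + post.get("permalink", ""),
        # Assumption: the post flair stands in for the "category" field.
        "category": post.get("link_flair_text"),
        "votes": post.get("score", 0),
        # Only video is detected here; audio handling is not sketched.
        "media_type": "video" if is_video else None,
        "media_url": post.get("url") if is_video else None,
    }


def in_period(record, start, end):
    """Return True if the record's timestamp falls in [start, end)."""
    ts = datetime.fromisoformat(record["timestamp"])
    return start <= ts < end


# Hypothetical sample post, used only to show the shape of the output.
sample_post = {
    "title": "Example post",
    "created_utc": 1609459200,  # 2021-01-01 00:00:00 UTC
    "selftext": "Body text",
    "permalink": "/r/example/comments/abc123/example_post/",
    "link_flair_text": "Discussion",
    "score": 42,
    "is_video": False,
    "url": "https://www.reddit.com/r/example/comments/abc123/example_post/",
}
sample_record = to_record(sample_post)
```

Each resulting dict is flat and JSON-like, so it could be stored as a single document in a MongoDB collection, while any `media_url` would be downloaded separately to file storage.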

Input

No of inputs:

Output

The extracted data was saved as follows:


Tools & Technologies

Python (Scrapy)

MongoDB

Cronjob
