Instagram Flyer Finder

AI-Powered Instagram Scraper for Automated Event Detection & De-duplication

PROBLEM STATEMENT

The client needed an Instagram scraper that scans images from the posts and stories of selected Instagram handles and classifies them as posters or flyers. These images had to be analyzed by an AI tool, such as the Google Vision API or ChatGPT (OpenAI API), to extract the details of each flyer (event name, event date, etc.).

Based on these details, the system had to identify upcoming events and save them to the database. The client also wanted the scraper to filter out past events and handle duplicates, since different users can post about the same event.
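The past-event and duplicate handling described above can be sketched as follows. This is a minimal illustration, not the project's actual code: the Event fields, the dedup_key normalization rule, and the function names are all assumptions made for the example.

```python
import re
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Event:
    name: str
    event_date: date
    source_handle: str  # Instagram handle that posted the flyer


def dedup_key(event: Event) -> tuple[str, date]:
    """Build a key that is identical for reposts of the same event."""
    # Assumed rule: lowercase the name and collapse punctuation and
    # whitespace, so "Summer Fest!" and "summer  fest" compare equal.
    normalized = re.sub(r"[^a-z0-9]+", " ", event.name.lower()).strip()
    return (normalized, event.event_date)


def upcoming_unique(events: list[Event], today: date) -> list[Event]:
    """Drop past events and keep the first occurrence of each event."""
    seen: set[tuple[str, date]] = set()
    kept: list[Event] = []
    for event in events:
        if event.event_date < today:
            continue  # past event: skip
        key = dedup_key(event)
        if key in seen:
            continue  # same event already posted by another handle
        seen.add(key)
        kept.append(event)
    return kept
```

Matching on a normalized name plus date is only one possible rule. Flyers for the same event with differently worded titles would need fuzzier matching.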

SOLUTION

I started by finding an API, Rocket API, that could provide all the required information. Then I wrote a Python script to retrieve images based on the criteria specified by the client. The scraper was optimized to store only the images that met those criteria.

To do this, each image was passed to the OpenAI API with a specific prompt to confirm that it met the criteria before it was stored in the S3 bucket. The details of each stored image were then saved in a MySQL database. Images that failed the check were discarded on the spot instead of being saved first and deleted later. This optimization reduced both storage costs and API credit usage.
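The classify-before-store flow can be sketched roughly as below. The prompt text, the JSON reply format, and all function names are hypothetical, since the write-up does not give the real ones. The OpenAI call, the S3 upload, and the MySQL insert are passed in as plain callables so the sketch stays self-contained.

```python
import json
from typing import Callable, Optional

# Hypothetical prompt; the actual prompt used in the project is not given.
FLYER_PROMPT = (
    "Is this image an event flyer or poster? Reply with JSON only: "
    '{"is_flyer": true or false, "event_name": string or null, '
    '"event_date": "YYYY-MM-DD" or null}'
)


def parse_reply(reply_text: str) -> Optional[dict]:
    """Return the event details if the reply describes a flyer, else None."""
    try:
        data = json.loads(reply_text)
    except json.JSONDecodeError:
        return None  # unparseable reply: treat as "not a flyer"
    if not isinstance(data, dict) or not data.get("is_flyer"):
        return None
    if not data.get("event_name") or not data.get("event_date"):
        return None
    return {"event_name": data["event_name"], "event_date": data["event_date"]}


def process_image(
    image_bytes: bytes,
    classify: Callable[[bytes, str], str],  # would wrap the OpenAI API call
    store_image: Callable[[bytes], str],    # would upload to S3, return a URL
    save_record: Callable[[dict], None],    # would insert a row into MySQL
) -> bool:
    """Classify first; upload and record only when the image qualifies."""
    details = parse_reply(classify(image_bytes, FLYER_PROMPT))
    if details is None:
        return False  # discarded in memory, nothing was uploaded
    details["image_url"] = store_image(image_bytes)
    save_record(details)
    return True
```

Because the image is held in memory until the model's reply is parsed, a rejected image never reaches storage, so there is nothing to delete afterwards.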

Input

List of Instagram handles

Output

The details of all upcoming events posted by the given Instagram handles, with the flyer images saved in S3 and the event details saved in a MySQL database.

Tools & Technologies

Python

Rocket API

OpenAI API (ChatGPT)

AWS S3

MySQL

Cron job
