This weekend, I finally managed to retrieve Instagram data.
It was difficult enough that "managed" really is the right word. Instagram doesn't provide APIs unless you have a business account, and it thoroughly blocks scraping, so it's practically impossible for individuals to make use of Instagram data.
At first, I tried to fetch the data directly with curl, but that didn't work because the content is loaded dynamically. External tools like rss.app could only be used for a limited period. I had no environment of my own for running a Python scraping library, so I tried running one on pythonanywhere, but its IP was already blocked.
After countless attempts and a lot of thought, I solved it by combining the php-webdriver library, chromedriver, and an Instagram mirror. I then finished the job by scheduling it to run once a day.
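For anyone curious what that combination looks like, here is a rough sketch rather than my actual code. It assumes php-webdriver is installed through Composer and chromedriver is already running on its default port 9515. The mirror URL, the account name, and the `img` selector are placeholders I made up for illustration; a real mirror will have its own URLs and markup.

```php
<?php
// Minimal sketch: load a page in headless Chrome via chromedriver
// and collect image URLs once the dynamic content has rendered.
require __DIR__ . '/vendor/autoload.php';

use Facebook\WebDriver\Chrome\ChromeOptions;
use Facebook\WebDriver\Remote\DesiredCapabilities;
use Facebook\WebDriver\Remote\RemoteWebDriver;
use Facebook\WebDriver\WebDriverBy;
use Facebook\WebDriver\WebDriverExpectedCondition;

// Hypothetical values: replace with the mirror and account you use.
$mirrorUrl = 'https://mirror.example/some_public_account';
$selector  = 'img';

// Run Chrome without a visible window, as on a typical server.
$options = new ChromeOptions();
$options->addArguments(['--headless', '--no-sandbox', '--disable-gpu']);

$capabilities = DesiredCapabilities::chrome();
$capabilities->setCapability(ChromeOptions::CAPABILITY, $options);

// chromedriver listens on port 9515 unless started with another port.
$driver = RemoteWebDriver::create('http://localhost:9515', $capabilities);

try {
    $driver->get($mirrorUrl);

    // Wait up to 10 seconds for at least one matching element to appear.
    // This is the step plain curl cannot do, since it never runs the page's scripts.
    $driver->wait(10)->until(
        WebDriverExpectedCondition::presenceOfElementLocated(
            WebDriverBy::cssSelector($selector)
        )
    );

    $urls = [];
    foreach ($driver->findElements(WebDriverBy::cssSelector($selector)) as $element) {
        $src = $element->getAttribute('src');
        if ($src !== null && $src !== '') {
            $urls[] = $src;
        }
    }

    echo json_encode($urls, JSON_PRETTY_PRINT | JSON_UNESCAPED_SLASHES), PHP_EOL;
} finally {
    // Always close the browser session, even if the wait times out.
    $driver->quit();
}
```

Running it once a day can then be left to cron, for example with an entry like `0 4 * * * php /path/to/fetch.php`, where both the time and the path are only examples.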
Only public Instagram data is retrieved, and you can see it in the Your Story module at https://achor.net/module/urstory.
I think this was the most challenging task I've tackled lately while being deeply immersed in programming. Having completed it, I feel confident that I can scrape data from anywhere in a PHP environment.
- achor