I have scraped a lot of links from Instagram and Threads using Selenium with Python. It was a good learning experience. I will keep running the script for a few more days to see how many more media links I can collect from both sites.
However, the problem is that the media isn’t tagged, so we don’t know what type of media each link points to. I wonder if there is an AI tool or something similar that can categorize these random media links into an organized list.
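As a first step before reaching for any AI, the links can at least be grouped by file type. Here is a minimal sketch in Python, assuming each link ends in a recognizable file extension such as .jpg or .mp4 (I have not checked that every scraped link does). The function name categorize_links is made up for this example. It only looks at the extension and cannot tell what is actually shown in a picture or video.

```python
import mimetypes
from collections import defaultdict
from urllib.parse import urlparse


def categorize_links(links):
    """Group URLs by the media type guessed from the file extension in the path."""
    groups = defaultdict(list)
    for link in links:
        link = link.strip()
        if not link:
            continue  # skip blank lines
        # Use only the path so a query string after the filename is ignored.
        path = urlparse(link).path
        mime, _ = mimetypes.guess_type(path)
        # "image/jpeg" -> "image"; no recognizable extension -> "unknown"
        category = mime.split("/")[0] if mime else "unknown"
        groups[category].append(link)
    return dict(groups)
```

Calling categorize_links(open("links.txt")) on the downloaded links file should return a dictionary with keys such as "image", "video" and "unknown", each holding the matching links.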
If you want to download all the media from the links, you can run the following commands:
# This command downloads the file containing all the links
wget -O links.txt https://gist.githubusercontent.com/Ghodawalaaman/f331d95550f64afac67a6b2a68903bf7/raw/7cc4cc57cdf5ab8aef6471c9407585315ca9d628/gistfile1.txt
# This command downloads the media from the links file saved by the command above
wget -i links.txt
I was also thinking about how to store all of this. There are two ways to do it. The first is to keep only the links.txt file and download the content when it is needed. The second is to download the content from the links and save it to a hard drive. The second method will consume more space, so the first method is the better one imo.
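The first method could be combined with a small local cache, so a file is downloaded only the first time it is needed. This is a rough sketch and not something I have run against the real links. The names cached_path, fetch_on_demand and the media_cache folder are made up for this example. Keep in mind that if a link stops working at some point, only the files already in the cache will still be available.

```python
import hashlib
import os
import urllib.request
from urllib.parse import urlparse


def cached_path(url, cache_dir):
    """Return a stable local filename for a URL: a hash of the URL plus its extension."""
    ext = os.path.splitext(urlparse(url).path)[1]
    name = hashlib.sha256(url.encode("utf-8")).hexdigest()[:16] + ext
    return os.path.join(cache_dir, name)


def fetch_on_demand(url, cache_dir="media_cache",
                    downloader=urllib.request.urlretrieve):
    """Download a URL only if it is not already cached; return the local path."""
    path = cached_path(url, cache_dir)
    if not os.path.exists(path):
        os.makedirs(cache_dir, exist_ok=True)
        downloader(url, path)  # urlretrieve(url, filename) saves the file to disk
    return path
```

With this, the disk only holds the media that was actually opened, while links.txt stays the full index.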
I hope you liked it :)