Old 03-23-2016, 11:18 AM  
johnnyloadproductions
Account Shutdown
 
Industry Role:
Join Date: Oct 2008
Location: Gone
Posts: 3,611
4chan /b/, 2 days, 56 GB, 121830 media files.

.jpeg, .png, .gif, .webm

I'm starting to realize the power of well-thought-out bots.

Before anyone gets on me about being liberal with scraping and content: I thought it would be awesome to hash the images in all of these formats, build a database of those hashes, then find out which images are popular (they would appear multiple times) and see whether they come from a sponsor of some kind.
As far as I know, Google's image search works along broadly similar lines, and this could be useful for affiliates and/or programs.
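To make the hashing idea concrete, here is a minimal sketch in Python. It is not the bot itself: the function names, the SHA-256 choice and the folder layout are all assumptions for illustration. It walks a download folder, hashes every media file and lists the hashes that show up more than once.

```python
import hashlib
from collections import Counter
from pathlib import Path

# File types mentioned in the post (".jpg" added as an assumed alias for ".jpeg").
MEDIA_SUFFIXES = {".jpeg", ".jpg", ".png", ".gif", ".webm"}


def file_digest(path, algorithm="sha256", chunk_size=1 << 20):
    """Hash a file in chunks so large .webm files never sit fully in memory."""
    h = hashlib.new(algorithm)
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def popular_media(root, min_count=2):
    """Return (digest, count) pairs for files seen at least min_count times."""
    counts = Counter()
    for path in Path(root).rglob("*"):
        if path.is_file() and path.suffix.lower() in MEDIA_SUFFIXES:
            counts[file_digest(path)] += 1
    return [(d, n) for d, n in counts.most_common() if n >= min_count]
```

Calling `popular_media("downloads")` on a hypothetical `downloads` folder would give the most repeated files first. A plain file hash only matches byte-identical copies, so a resized or re-encoded repost counts as a different image; catching those would need some kind of perceptual hash instead.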

Meaning? Perhaps it's a good idea to promote them?

This is my bot-sourced, crowd-sourced way of telling what's hot without anyone directly telling me. Weeeeeeeeeee!!!!!!

Simple script. It's brute force, which means that if a thread is seen again it overwrites all of that thread's files again, so it has probably pulled 200 GB of files in total. Will modify in the future.
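The script isn't posted here, so this is only a sketch of what the "modify in the future" part could look like: check whether a file is already on disk before fetching it again. `save_if_new` and the `fetch` callable are hypothetical names, and `fetch` stands in for whatever actually downloads the bytes.

```python
from pathlib import Path


def save_if_new(dest_dir, filename, fetch):
    """Write fetch() to dest_dir/filename unless that file already exists.

    fetch is a zero-argument callable returning bytes (a stand-in for the
    real download call). Returns True if a download happened, else False.
    """
    dest = Path(dest_dir) / filename
    if dest.exists():
        return False  # seen before: skip the download entirely
    dest.parent.mkdir(parents=True, exist_ok=True)
    # Write to a temporary name first so an interrupted download
    # is not mistaken for a finished file on the next run.
    tmp = dest.with_name(dest.name + ".part")
    tmp.write_bytes(fetch())
    tmp.replace(dest)
    return True
```

With a check like this, revisiting a thread only costs bandwidth for files posted since the last pass, assuming file names within a thread stay stable.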

Cool stuff, you are happy that I shared!!!!!

The other idea is setting up a lookup: if you have one image and can't quite figure out the rest of the series, search for its hash and it would pull up a thread that just may have those images.
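The lookup idea boils down to a reverse index from image hash to the threads the image appeared in. A rough sketch, assuming SHA-256 over the raw file bytes and integer thread IDs (the `HashIndex` class is made up for illustration and keeps everything in memory rather than in a real database):

```python
import hashlib
from collections import defaultdict


class HashIndex:
    """In-memory map from image hash to the thread IDs it was seen in."""

    def __init__(self):
        self._threads = defaultdict(set)

    def add(self, thread_id, data):
        """Record that the image bytes in data appeared in thread_id."""
        digest = hashlib.sha256(data).hexdigest()
        self._threads[digest].add(thread_id)
        return digest

    def threads_for(self, data):
        """Return the sorted thread IDs where this exact image was seen."""
        digest = hashlib.sha256(data).hexdigest()
        return sorted(self._threads.get(digest, ()))
```

Usage would be something like `index.add(1001, image_bytes)` while scraping, then `index.threads_for(mystery_bytes)` to find threads that may hold the rest of the series. As with any exact hash, it only finds byte-identical copies of the image.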

FUCKING AWESOME IDEAS!!!!
