05-19-2015, 01:39 PM
aka123
Confirmed User
 
Join Date: Jul 2014
Location: 64 00 N, 26 00 E
Posts: 4,450
Quote:
Originally Posted by AdultSites
I am a lot more familiar with this, and it is still not an easy thing. I mean, it is easy, but you can't get 100% of the pages without scraping the site or scraping the SERPs, and there is no guarantee that this will give you 100% of the pages either.

For a page to be known, it needs to be linked to from some other place on the Internet, which could be a search engine. If there is not a single link to a page, there is no way for people to know that it exists, except for the people who own the site or have access to its admin.

There are multiple methods of scraping. SEO Spider by Screaming Frog and Xenu's Link Sleuth are not good enough for large websites (over 3,000,000 URLs). People do some tweaking, and it works better, but there is no perfect solution to this.

So, in general, to scrape large websites, you need to do it programmatically, and it takes time too.

So getting an idea of the structure of a website is not an easy thing, and like I am saying, I am very familiar with anything that can or needs to be done for this.
I don't think that you need to scrape sites to be able to link to them. Have you tried browsing the site with a browser (like Indian Explorer)? I know it is hard, but it can be done. I am almost sure about it.
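
For reference, the "do it programmatically" approach from the quoted post can be sketched roughly as below. This is only a minimal illustration, not anyone's actual tool: the names `LinkParser`, `fetch_http` and `discover_urls`, and the `max_pages` limit, are made up for this example. It follows links breadth-first from a start page and stays on the same host, so it also shows the quoted point that a page nobody links to never gets found.

```python
from collections import deque
from html.parser import HTMLParser
from urllib.parse import urljoin, urldefrag, urlparse
from urllib.request import urlopen


class LinkParser(HTMLParser):
    """Collects href values from <a> tags (hypothetical helper)."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


def fetch_http(url):
    """Default fetcher: returns the page HTML, or None on any error."""
    try:
        with urlopen(url, timeout=10) as resp:
            return resp.read().decode("utf-8", errors="replace")
    except Exception:
        return None


def discover_urls(start_url, fetch=fetch_http, max_pages=1000):
    """Breadth-first crawl of same-host links, starting at start_url.

    Returns the set of URLs seen. max_pages caps the number of fetches,
    which is an assumption added here to keep the sketch bounded.
    """
    host = urlparse(start_url).netloc
    seen = {start_url}
    queue = deque([start_url])
    fetched = 0
    while queue and fetched < max_pages:
        url = queue.popleft()
        html = fetch(url)
        fetched += 1
        if html is None:
            continue
        parser = LinkParser()
        parser.feed(html)
        for href in parser.links:
            # Resolve relative links and drop any #fragment part.
            absolute, _fragment = urldefrag(urljoin(url, href))
            if urlparse(absolute).netloc == host and absolute not in seen:
                seen.add(absolute)
                queue.append(absolute)
    return seen
```

Calling `discover_urls("http://example.com/")` would crawl over the network; passing your own `fetch` function (for example a dict lookup) lets you try it offline. A real crawler for millions of URLs would also need rate limiting, robots.txt handling and persistent storage, none of which is shown here.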