GoFuckYourself.com - Adult Webmaster Forum


AdultSites 02-13-2015 03:46 AM

How to map / get idea of a large site?
 
There is a website that I need to map to get its directory structure, and it is around 1,000,000 pages. I have no access to the server or the CMS. Not all of the pages show up in Google or other search engines, and even if they did, that would not help much.

I probably need a sitemap, but from what I remember, it is hard to generate a sitemap for a site that big. What would be the best way to get this done? Something free would be good too. I need the website structure and all URLs.

Thanks.

clickity click 02-13-2015 12:18 PM

Just use something like HTTrack, which will create a directory on your PC with all the files. If you don't need images, you can disable scanning for them.
Most sites have a sitemap as well, usually at something like site.com/sitemap.xml

Might be worth checking first.
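
If the site does publish a sitemap, a rough sketch like the one below could turn it into a list of URLs plus a per-directory page count. It assumes you have already saved the sitemap XML yourself and pass its text in. The function names `sitemap_locs` and `directory_counts` and the sample URLs are made up for illustration. The sitemap protocol caps each file at 50,000 URLs, so a site this large would most likely expose a sitemap index pointing at many child sitemaps; in that case the first function returns the child sitemap URLs instead of page URLs.

```python
import xml.etree.ElementTree as ET
from collections import Counter
from urllib.parse import urlparse


def _local(tag):
    """Strip the XML namespace from a tag like '{ns}urlset'."""
    return tag.rsplit("}", 1)[-1]


def sitemap_locs(xml_text):
    """Return (kind, locs); kind is 'urlset' or 'sitemapindex'."""
    root = ET.fromstring(xml_text)
    locs = []
    for entry in root:            # each <url> or <sitemap> entry
        for child in entry:       # only the entry's own <loc>, not nested ones
            if _local(child.tag) == "loc" and child.text:
                locs.append(child.text.strip())
    return _local(root.tag), locs


def directory_counts(urls, depth=1):
    """Count URLs under each leading directory prefix, up to `depth` levels.

    Simplification: the last path segment is always treated as the page,
    so 'site.com/videos/' is counted under '/' rather than '/videos'.
    """
    counts = Counter()
    for url in urls:
        segments = [s for s in urlparse(url).path.split("/") if s]
        dirs = segments[:-1]
        counts["/" + "/".join(dirs[:depth])] += 1
    return counts


# Hypothetical sample data, not taken from any real site.
SAMPLE = """<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url><loc>https://site.com/</loc></url>
  <url><loc>https://site.com/videos/a.html</loc></url>
  <url><loc>https://site.com/videos/b.html</loc></url>
  <url><loc>https://site.com/models/x/bio.html</loc></url>
</urlset>"""

if __name__ == "__main__":
    kind, urls = sitemap_locs(SAMPLE)
    for directory, pages in sorted(directory_counts(urls).items()):
        print(directory, pages)
```

Raising `depth` gives a finer breakdown of the same list, which may be enough to get an idea of the structure without downloading every page.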

Harmon 02-13-2015 12:22 PM

So basically, and let me get this straight... you are 2 posts deep into this forum and you are trying to copy somebody else's business?

You'll fit right in.:upsidedow

The Porn Nerd 02-13-2015 12:27 PM

Quote:

Originally Posted by Harmon (Post 20392363)
So basically, and let me get this straight... you are 2 posts deep into this forum and you are trying to copy somebody else's business?

You'll fit right in.:upsidedow

:1orglaugh:1orglaugh:1orglaugh:1orglaugh

sandman! 02-13-2015 01:11 PM

:1orglaugh:1orglaugh:1orglaugh:1orglaugh:1orglaugh


Quote:

Originally Posted by Harmon (Post 20392363)
So basically, and let me get this straight... you are 2 posts deep into this forum and you are trying to copy somebody else's business?

You'll fit right in.:upsidedow


Bladewire 02-13-2015 03:41 PM

Quote:

Originally Posted by Harmon (Post 20392363)
So basically, and let me get this straight... you are 2 posts deep into this forum and you are trying to copy somebody else's business?

You'll fit right in.:upsidedow

Quote:

Originally Posted by The Porn Nerd (Post 20392368)
:1orglaugh:1orglaugh:1orglaugh:1orglaugh

Quote:

Originally Posted by sandman! (Post 20392428)
:1orglaugh:1orglaugh:1orglaugh:1orglaugh:1orglaugh


AdultKing 02-13-2015 03:49 PM

Quote:

Originally Posted by Harmon (Post 20392363)
So basically, and let me get this straight... you are 2 posts deep into this forum and you are trying to copy somebody else's business?

You'll fit right in.:upsidedow

:2 cents::2 cents:

This forum only attracts assholes now.

Barry-xlovecam 02-13-2015 05:48 PM

learn to hack :P

Quote:

ssh user@target
user@target password:

omitted


scp user@target:

omitted


ssh again

user@target:~$/path to document route/$ rm remote-tree.csv
user@target:~$ cd~
user@target:~$ nano .bash_history
You want root so you can clean your connections from the system logs
su {or sudo}
??? look it up l4m3r :P
nano pico vi whatever ...
delete your tracks ^x save
^d

http://i.ytimg.com/i/Fe_w2aL0SAkEUdL...1.jpg?v=9e966d

And don't use your mother's IP :1orglaugh

epitome 02-13-2015 05:53 PM

How the fuck was the name AdultSites not registered on GFY until 2015?

Seems like such an obvious choice.

freecartoonporn 02-13-2015 11:34 PM

Are you scraping Wikipedia?

Post the URL so I can give some input.

AdultSites 02-14-2015 04:28 AM

Quote:

Originally Posted by clickity click (Post 20392360)
Just use something like HTTrack, which will create a directory on your PC with all the files. If you don't need images, you can disable scanning for them.
Most sites have a sitemap as well, usually at something like site.com/sitemap.xml

Might be worth checking first.

This program does not seem to be perfect, but I may be doing it wrong too. I will look at the settings and see what I can make of it. It looks like Teleport Pro and other download-manager programs. They don't seem to be 100% reliable. They should just work, but there seems to be a lot of BS with that. In the end I end up checking on these programs a lot instead of having everything run in the background. Again, I may be doing it wrong, as I did not pay much attention to this; I was just testing.

clickity click 02-14-2015 05:55 AM

What exactly are you trying to do? What is the site? You do know you can't just copy a site and have it running unless it's HTML only.

Roald 02-14-2015 06:55 AM

Quote:

Originally Posted by AdultKing (Post 20392595)
:2 cents::2 cents:

This forum only attracts assholes now.

This forum has always attracted assholes. Nothing new, really.


All times are GMT -7.
