#1
Confirmed User
Join Date: Jul 2006
Posts: 808
Robots.txt
I am looking into the SEO of my sites and someone mentioned a robots.txt file that I should have. I am of course going to speak to my best friend Mr Google... but wondered if anyone could take 5 mins to explain what it is and what it should consist of?
#2
Confirmed User
Join Date: Jan 2004
Location: Wisconsin
Posts: 4,517
In a nutshell, robots.txt is used to tell spiders / crawlers what NOT to scan.
Not every robot obeys it however.

User-agent: *
Disallow: /

would tell ALL robots not to look at any pages. I recommend NOT using the example above.
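A more typical setup only blocks the directories you actually want hidden and leaves everything else open. A minimal sketch (the /cgi-bin/ and /members/ paths here are just placeholders for whatever you want kept out of the index):

User-agent: *
Disallow: /cgi-bin/
Disallow: /members/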
#3
Adult Content Provider
Join Date: May 2005
Location: Europe
Posts: 18,243
I tell the robots the address to my sitemap, done!
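In robots.txt that is just one extra line pointing at wherever the sitemap lives, for example (placeholder URL):

Sitemap: http://www.example.com/sitemap.xml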
#4
I like Dutch Girls
Join Date: Feb 2003
Location: dutchteencash.com
Posts: 21,684
Google it and read some stuff; it can do a lot for your sites, for sure.
#5
Confirmed User
Join Date: Jul 2006
Posts: 808
Hmmm...
I understand, however some bots are undesirable... where is there a list of the bots that are not wanted? Someone said that some are bad bots that, for example, harvest emails, and others that are site rippers... but it appears these are better blocked by editing your .htaccess file rather than the robots.txt file... so again I would ask: where do you find an up-to-date list of bad bots/site rippers to put into your .htaccess?
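From what I have read so far, the usual .htaccess approach seems to be matching on the User-Agent header and denying the request, something like the sketch below (the two agent names are only examples, not an up-to-date bad-bot list):

SetEnvIfNoCase User-Agent "HTTrack" bad_bot
SetEnvIfNoCase User-Agent "EmailCollector" bad_bot
<Limit GET POST HEAD>
Order Allow,Deny
Allow from all
Deny from env=bad_bot
</Limit>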
#6
Confirmed User
Join Date: Apr 2007
Posts: 983
Robots.txt files are pretty much useless unless you're trying to get Google NOT to index a page. Google will crawl your site and index your content either way, so you can just leave the robots.txt file out. You really don't need it.
#7
Confirmed User
Join Date: Mar 2004
Location: Great White North
Posts: 5,794
To allow all robots complete access:

User-agent: *
Disallow:

To exclude all robots from the server:

User-agent: *
Disallow: /

To exclude all robots from parts of a server:

User-agent: *
Disallow: /private/
Disallow: /images-saved/
Disallow: /images-working/

To exclude a single robot from the server:

User-agent: Named Bot
Disallow: /
#8
Too lazy to set a custom title
Join Date: Feb 2003
Location: NJ
Posts: 13,336
You should hook this up for your site:
http://www.google.com/webmasters/tools/
#9
So Fucking Banned
Join Date: Apr 2001
Location: the beach, SoCal
Posts: 107,089
Use it to keep spiders out of certain parts of your site.
#10
Confirmed User
Join Date: May 2008
Location: BROOKLYN!!!
Posts: 3,474
They are pretty simple files and can do a lot for your site. I also suggest using an .xml sitemap and having Google spider it often.
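A bare-bones sitemap.xml only needs a few elements per URL. A minimal sketch (the URLs and values are just placeholders):

<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url>
    <loc>http://www.example.com/</loc>
    <lastmod>2008-06-01</lastmod>
    <changefreq>weekly</changefreq>
  </url>
  <url>
    <loc>http://www.example.com/tour/</loc>
  </url>
</urlset>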
#11
Confirmed User
Join Date: Jul 2006
Posts: 808
Thanks guys...
Great advice, keep it coming... I was thinking of an xml sitemap, just not sure how many things will be on it as we're a paysite without that many pages. How do I get Google to look at it lots?
#12
Too lazy to set a custom title
Join Date: May 2004
Location: West Coast, Canada.
Posts: 10,217
This is what I put on all of my sites
User-agent: Fasterfox
Disallow: /

User-agent: *
Disallow:

Sitemap: http://...../.....txt

The sitemap is just a text listing of all the urls in UTF-8. Sitemap is a recent addition to the robots.txt spec and the big 3 all grab it now. Much easier if you have lots of sites to do this than fuck around with the google stuff.

http://www.sitemaps.org/protocol.php#informing

The Fasterfox thing is due to some FF plugin or something that causes "false" hits to your site because it preloads pages.
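The text version really is that simple: one absolute URL per line, UTF-8 encoded, nothing else (placeholder URLs):

http://www.example.com/
http://www.example.com/tour/
http://www.example.com/join/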
#13
Confirmed User
Join Date: Jul 2006
Posts: 808
Ok...
So the general consensus is we should create some form of sitemap... either xml or txt... and then put a line in robots.txt to make sure the bots read/crawl the sitemap. Does anyone add anything to keep the bad ones out... or is that done mainly via the .htaccess file? Either way, is there a good example for either that shows which ones should be included?
#14
Registered User
Join Date: Feb 2008
Location: Mona = "female monkey" in Spanish
Posts: 1,940
If you submit your sitemap to Google Webmaster Tools, then they'll spider everything regularly... amongst lots of other cool stuff.
#15
Registered User
Join Date: Feb 2008
Location: Mona = "female monkey" in Spanish
Posts: 1,940
#16
. . .
Join Date: Apr 2007
Location: NY
Posts: 13,724
I add this to all of my sites' robots.txt:

User-agent: ia_archiver
Disallow: /

That keeps archive.org from spidering and keeping a copy of your site forever.
#17
Confirmed User
Join Date: Jul 2006
Posts: 808
Question
Can I ask why this is a bad thing? Isn't ia_archiver the Alexa one?
#18
Too lazy to set a custom title
Join Date: Dec 2003
Posts: 11,089
http://www.alexa.com/site/help/webmasters
#19
Confirmed User
Join Date: Jul 2006
Posts: 808
Question
Can anyone show me an example of an xml sitemap for a paysite that they use to encourage the spiders to crawl the pages?
#20
scriptmaster
Join Date: May 2006
Location: Serbia
Posts: 5,237
#21
Confirmed User
Join Date: Oct 2007
Posts: 102
#22
Confirmed User
Join Date: Jul 2006
Posts: 808
Hey
Thanks for that Wizzart... I take it that's what you use to stop bad bots and site rippers? Does it work OK in robots.txt, as I was told a lot of them ignore the robots.txt file... and it's better to put it in the .htaccess file instead?
#23
Confirmed User
Join Date: Apr 2007
Posts: 983
#24
Confirmed User
Join Date: Feb 2003
Location: Here There and Everywhere
Posts: 5,477
Google would help