GoFuckYourself.com - Adult Webmaster Forum

Fucking Around & Business Discussion: archive of shut-down Geocities released as 1TB torrent (https://gfy.com/showthread.php?t=995076)

CYF 10-29-2010 04:29 PM

archive of shut-down Geocities released as 1TB torrent
 
http://www.techdirt.com/articles/201...-torrent.shtml

Quote:

Archive Of Geocities Released As A 1TB Torrent
from the archiving-history dept

In early 2009, Yahoo announced that it was going to put Geocities out of its misery and finally shut down the site entirely, even as it was still getting 11 million unique visitors per month. Soon after the announcement, we heard about some projects to try to archive the entire site (with some claims that it couldn't be done in time). The actual shutdown occurred almost exactly a year ago, and yet a group calling itself The Archive Team is apparently releasing its entire Geocities archive, blinking, flashing "under construction" signs and all, as a nearly 1 TB torrent. They don't think they got everything, but do believe they archived "a significant percentage" of the site.

GrouchyAdmin 10-29-2010 04:34 PM

I can't think of a better way to waste a terabyte than old geocities pages. Boy, howdy.

BIGTYMER 10-29-2010 04:36 PM

Good stuff man! :)

signupdamnit 10-29-2010 04:37 PM

The thing is someday this information might be considered near priceless to someone. Hard to believe, but true.

Like this:

Quote:

ruth (profile), Oct 29th, 2010 @ 8:29am

This is where I am. I don't have the space for all that. All I care about is my friend's profile. I would love to read it. She was an avid Geocities user...she took her life about 3 yrs ago & would be nice to reflect back on what she wrote. If I could just download her profile only. But not all of geocities.

Highest Def 10-29-2010 04:39 PM

Quote:

Originally Posted by GrouchyAdmin (Post 17652788)
I can't think of a better way to waste a terabyte than old geocities pages. Boy, howdy.

:1orglaugh Headed to Best Buy now.

Funny story, but as usual, the comments on that article are where it gets REAL funny.

trevesty 10-29-2010 04:41 PM

Quote:

Originally Posted by signupdamnit (Post 17652810)
The thing is someday this information might be considered near priceless to someone. Hard to believe, but true.

Yep, exactly. Someone on here who maybe started on geocities then launches the next Google.

rowan 10-29-2010 05:19 PM

I do search engine research and data mining so I'm interested in this. Wouldn't it be kind of geeky cool to have a search engine dedicated to indexing stale websites from geocities? :D

Question is whether 1TB is the size of the raw data, or the (heavily?) compressed archive...

$5 submissions 10-29-2010 05:21 PM

11 million uniques a month.

camgirlshide 10-29-2010 05:26 PM

Of course we need to consider the legal implications of putting this content back on the web.

iSpyCams 10-29-2010 05:27 PM

I wonder what it would take to get it back up, and if it would even be legal to put it back up.

rowan 10-29-2010 05:29 PM

Looks like it's already being done:

http://www.reocities.com/neighborhoods/
http://www.oocities.com/
http://www.archive.org/web/geocities.php

signupdamnit 10-29-2010 05:38 PM

Quote:

Originally Posted by rowan (Post 17652921)
I do search engine research and data mining so I'm interested in this. Wouldn't it be kind of geeky cool to have a search engine dedicated to indexing stale websites from geocities? :D

Question is whether 1TB is the size of the raw data, or the (heavily?) compressed archive...

I'd kill to get raw access to archive.org's database. Talk about the ultimate in data mining. They used to offer it and I recall seriously considering it and reading some of the details but the links to more information about it seem dead.

Quote:

Can I search the Archive?

Using the Internet Archive Wayback Machine, it is possible to search for the names of sites contained in the Archive (URLs) and to specify date ranges for your search. We hope to implement a full text search engine at some point in the future.

Quote:

Who has access to the collections? What about the public?

Anyone can access our collections through our website archive.org. The web archive can be searched using the Wayback Machine.

The Archive makes the collections available at no cost to researchers, historians, and scholars. At present, it takes someone with a certain level of technical knowledge to access collections in a way other than our website, but there is no requirement that a user be affiliated with any particular organization.
http://www.archive.org/about/faqs.php

http://www.archive.org/web/researche...nded_users.php (dead link, but if I remember right it gave a lot of information on how the collection worked and how to gain access after requesting it. I believe they took it offline mainly due to funding and technical issues; at least that's what I think the page claimed when I last read it.)

They also tried to grab a lot of Geocities stuff: http://www.archive.org/web/geocities.php

Hopefully they merge this new collection with their own.

</rambling about archive.org since someone got me thinking of it from another topic>

rowan 10-29-2010 05:41 PM

Copying the entire archive.org database would present some technical challenges! That's like 30 partial archives of the entire internet :D

PAR 10-29-2010 05:46 PM

Download it..
Toss it on a server..
Add banner..
Let people make simple blogs on the same url.

signupdamnit 10-29-2010 05:47 PM

Quote:

Originally Posted by rowan (Post 17652976)
Copying the entire archive.org database would present some technical challenges! That's like 30 partial archives of the entire internet :D

No kidding! But finally implementing the full text search would be the ultimate accomplishment, and with regexp as well it would be badass. I'm surprised Google hasn't picked it up, although it'd probably bring up all sorts of new copyright and privacy issues as all that shit comes back alive. But man... all that information. :pimp

garce 10-29-2010 05:48 PM

archive.org has more of my old work archived than I do.

Elli 10-29-2010 05:51 PM

Quote:

Originally Posted by garce (Post 17652989)
archive.org has more of my old work archived than I do.

you can say that again!

Chris 10-29-2010 06:08 PM

my mom started an adoption search service on geocities and now runs a fulltime private investigation firm that specializes in adoptions .. all from her geocities site :P

BlackCrayon 10-29-2010 07:24 PM

i wish i could get my site out of that mess.

Angry Jew Cat - Banned for Life 10-29-2010 08:32 PM

Quote:

Originally Posted by signupdamnit (Post 17652810)
The thing is someday this information might be considered near priceless to someone. Hard to believe, but true.

Like this:

Develop a way to sift through all that crap efficiently, and you could have yourself an excellent database to mine content from :2 cents:


All times are GMT -7.
