#1 |
Confirmed User
Industry Role:
Join Date: Dec 2003
Location: Wisconsin
Posts: 3,574
|
Need an email scraper
I need to contact a shitload of people - (read: CONTACT, not spam) - who all happen to be members of a site that lists their email addresses and phone numbers. Is there a program that will search the site and collect the email addresses for me?
|
#2 |
Adult Content Provider
Industry Role:
Join Date: May 2005
Location: Europe
Posts: 18,243
|
So you are going to write personal messages to each and every single one of them? If not, it's spam.
#3 |
Too lazy to set a koala
Industry Role:
Join Date: Jan 2007
Location: CZ/EU forever!
Posts: 16,139
|
If it's your site, you have access to the database and that's easy. If it's not your site, you are going to spam no matter what you call it.
#4 |
Confirmed User
Industry Role:
Join Date: Dec 2003
Location: Wisconsin
Posts: 3,574
|
I'm not here for your opinion. I'm here to find a solution to my need.
If anyone knows of a program, feel free to email me - wolfyman at gmail. |
#5 |
Guest
Posts: n/a
|
^[a-z0-9._%+-]+@[a-z0-9.-]+\.[a-z]{2,4}$
|
#6 |
Too lazy to set a custom title
Join Date: Oct 2001
Location: Spartaaaaaaaaa
Posts: 14,136
|
Use HTTrack to download the site (grab HTML only), then run an email extractor on the HTML files. I'm sure there are better ways; google "email extractor" and see what you come up with.
|
#7 | |
Confirmed User
Industry Role:
Join Date: Dec 2003
Location: Wisconsin
Posts: 3,574
|
Is that a search query? Where would I use that?
Googling "site:example.com ^[a-z0-9._%+-]+@[a-z0-9.-]+\.[a-z]{2,4}$" returned nothing. What am I missing?
#8 |
Confirmed User
Join Date: Aug 2006
Location: Online (duh)
Posts: 167
|
LOL. That's a regular expression. Why don't you just hire some cheap coder on Rent-a-Coder and have him write you a script which does it? Probably cost like $50 max.
__________________
HeatSeek: The iTunes of Porn ! $30 PPS, conversion ratios 10x better than content sites. |
#10 |
Guest
Posts: n/a
|
|
#11 | |
Confirmed User
Industry Role:
Join Date: Dec 2003
Location: Wisconsin
Posts: 3,574
|
Killswitch, it doesn't appear to work when I use it in Google. How would I use that "regex"?
#12 | |
Guest
Posts: n/a
|
Quote:
|
|
#13 |
sex dwarf
Join Date: May 2002
Posts: 17,860
|
It's a regular expression. Used to identify patterns - in this case, email addresses.
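A minimal sketch of how a pattern like the one posted earlier gets applied, written in Python since the thread names no language for this step. The helper names `is_email` and `find_emails`, the sample text, and the case-insensitive flag are assumptions for illustration. The point it shows: with the `^` and `$` anchors the pattern only checks whether a whole string is one address, which is why pasting it into a search engine does nothing; without the anchors it can pick addresses out of text you already have.

```python
import re
from typing import List

# The pattern from the thread. The ^ and $ anchors make it match only when
# the whole string is a single address, so it validates rather than searches.
WHOLE = re.compile(r"^[a-z0-9._%+-]+@[a-z0-9.-]+\.[a-z]{2,4}$", re.IGNORECASE)

# Same pattern without the anchors (an assumption, not from the thread),
# for finding addresses inside a longer piece of text.
ANYWHERE = re.compile(r"[a-z0-9._%+-]+@[a-z0-9.-]+\.[a-z]{2,4}", re.IGNORECASE)


def is_email(candidate: str) -> bool:
    """Return True if the entire string looks like one email address."""
    return WHOLE.match(candidate) is not None


def find_emails(text: str) -> List[str]:
    """Return every substring of text that looks like an email address."""
    return ANYWHERE.findall(text)


# Made-up sample text using reserved example domains.
sample = "Contact alice@example.com or bob@example.org for details."
print(is_email("alice@example.com"))
print(is_email(sample))
print(find_emails(sample))
```

Note that the `{2,4}` at the end limits the top-level domain to four letters, so longer ones are not matched by this particular pattern.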
__________________
/(bb|[^b]{2})/ |
#14 | |
Confirmed User
Industry Role:
Join Date: Dec 2003
Location: Wisconsin
Posts: 3,574
|
I've retrieved all the email addresses on the site, thanks to a cool mofo that I'll happily give credit to if he wants it.
For the sake of knowledge only, since my mission has been accomplished: how do I apply that regex to a task like this? I don't have control of the site, so any PHP pages I build would be outside of the domain and would not have access to any databases. What am I missing? |
|
#15 |
Confirmed User
Join Date: Jul 2007
Posts: 7,687
|
i am sure you can buy it from internet classifieds.
|
#16 |
Unregistered Abuser
Industry Role:
Join Date: Oct 2007
Posts: 15,547
|
If what you're doing is not spam, then I don't know what is.
#17 |
Now choke yourself!
Industry Role:
Join Date: Apr 2006
Posts: 12,085
|
I still prefer this regex.
#18 |
►SouthOfHeaven
Join Date: Jun 2004
Location: PlanetEarth MyBoardRank: GerbilMaster My-Penis-Size: extralarge MyWeapon: Computer
Posts: 28,609
|
I would have to see the site, but if you put your email on a public site, aren't you basically asking for unsolicited mail? Kinda like putting your # on a bathroom stall and then saying every pervert who contacts you is "spamming". The only exception to the rule, I would think, would be if it's implied on the site that the mail is to be used for a specific purpose, or if it's whois info that isn't posted by the user.
__________________
hatisblack at yahoo.com |
#19 |
So Fucking Banned
Join Date: Jan 2009
Posts: 2,377
|
I need a poop scraper.
|
#20 |
(felis madjewicus)
Industry Role:
Join Date: Jul 2006
Location: In Mom & Dad's Basement
Posts: 20,368
|
were you able to sort the first site out?
|
#21 | |
Guest
Posts: n/a
|
Quote:
|
|
#22 | |
sex dwarf
Join Date: May 2002
Posts: 17,860
|
Quote:
Personally, I'd go for a language other than PHP for this, but really, it can be done in pretty much any programming language. Set a bot like that loose on a big directory, and you'll eventually build up a list of millions of email addresses. Of course, others do the same thing, so the email addresses won't exactly be fresh.
Keep in mind that site owners might have email harvester traps. These pages generate a list of random invalid email addresses along with dynamic links back to themselves, so a harvester bot that isn't protected against them will keep collecting new invalid addresses forever.
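A rough sketch of the site-owner side of a harvester trap like the one described, in Python. Everything here is an assumption for illustration: the `trap_page` function, the `/trap/` URL scheme, and the page layout come from no real product. It only shows the two ingredients mentioned: random undeliverable addresses, and links that lead to more pages of the same kind.

```python
import random
import string


def trap_page(seed: int, n_addresses: int = 5, n_links: int = 3) -> str:
    """Build one HTML page of fake addresses plus links to more trap pages."""
    # Seeded generator, so the same seed always renders the same page.
    rng = random.Random(seed)

    def word(length: int) -> str:
        return "".join(rng.choice(string.ascii_lowercase) for _ in range(length))

    # .invalid is a reserved top-level domain, so these can never be delivered.
    addresses = [f"{word(8)}@{word(10)}.invalid" for _ in range(n_addresses)]
    # Hypothetical URL scheme: each link carries a new seed, so a crawler
    # that follows them never runs out of pages.
    links = [f"/trap/{rng.randrange(10**9)}" for _ in range(n_links)]

    lines = [f'<a href="mailto:{a}">{a}</a>' for a in addresses]
    lines += [f'<a href="{href}">more</a>' for href in links]
    return "<html><body>\n" + "\n".join(lines) + "\n</body></html>"


print(trap_page(42))
```

A real deployment would serve this from whatever path the site excludes in robots.txt, so well-behaved crawlers never see it.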
__________________
/(bb|[^b]{2})/ |
|
#23 |
(felis madjewicus)
Industry Role:
Join Date: Jul 2006
Location: In Mom & Dad's Basement
Posts: 20,368
|
Perl allows you to be one lazy ass coder. Jump on CPAN and install the Net::Scan::Extract module
Code:
use Net::Scan::Extract qw( :all );
my @emails = Extract_Email($source);
print "$_\n" for @emails;
|