GoFuckYourself.com - Adult Webmaster Forum
#1 |
Confirmed User
Join Date: Mar 2004
Posts: 5,116
|
A program that can extract data from a list of URLs?
Does anyone know a program that can extract data from a list of URLs? I want to be able to set where in the document it should start grabbing data and where it should stop. Example:
Start from: <a href="
Stop at: ">
Is there any program that can do that?
#2 |
there's no $$$ in porn
Industry Role:
Join Date: Jul 2005
Location: icq: 195./568.-230 (btw: not getting offline msgs)
Posts: 33,063
|
perl is your friend.
#3 |
Confirmed User
Join Date: Feb 2004
Location: United Kingdom
Posts: 575
|
Ya, I was gonna say Perl as well. Check out m{href\=\"(.*?)\"} or the HTML::TokeParser module. Have fun!!!
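
For anyone who would rather not touch Perl, here is a minimal sketch of the same regex idea in PHP (chosen only because the code later in this thread is PHP). The function name extract_hrefs and the sample HTML are made up for illustration, and the pattern is the one above with the optional backslashes dropped. It assumes the attribute values are wrapped in double quotes.

```php
<?php
// Hypothetical helper (not from the thread): returns every value found
// between href=" and the next double quote.
function extract_hrefs(string $html): array
{
    // (.*?) is non-greedy, so each match stops at the first closing quote.
    // The "i" flag lets HREF and href both match.
    preg_match_all('/href="(.*?)"/i', $html, $matches);

    // $matches[1] holds the captured group for every match, in order.
    return $matches[1];
}

// Made-up sample input for illustration.
$html = '<a href="testurl1.com">crap<a HREF="testurl2.com">more crap';
print_r(extract_hrefs($html));
```

A regex like this is fine for quick link grabbing, but it will miss single-quoted or unquoted attributes, which is where a real HTML parser starts to pay off.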
#4 |
Confirmed User
Join Date: Sep 2003
Posts: 8,713
|
Regex is your friend..
#5 | |
Confirmed User
Join Date: Jan 2005
Posts: 422
|
Quote:
#6 |
Confirmed User
Join Date: Mar 2004
Posts: 5,116
|
I could program something in Visual Basic... but I know there have to be some applications out there already for this. (And I'm also lazy.)
#7 |
Confirmed User
Join Date: Feb 2004
Location: If i was up your ass you'd know
Posts: 3,695
|
Perl was designed to do exactly that. PHP is pretty good for it as well.
#8 |
see you later, I'm gone
Industry Role:
Join Date: Oct 2002
Posts: 14,111
|
Here is a way to do it without regex:

    <?php
    // buffer is a variable to hold the data we are working on
    $buffer='';

    // set vars for the beginning of what we want to parse and the end of what we want to parse
    $begin_pattern='<a href="';
    $end_pattern='">';

    // set up var for data being extracted. This could be an array or string to write to a file, whatever
    // here I am just using it to echo the data extracted
    $dataout='';

    // set file2read to point at the path and file that the list is stored in
    $file2read='testfile.txt';

    // open the file
    $filein=fopen('testfile.txt','r');

    // suck the entire file into a variable
    while (!feof($filein)){
      $buffer=$buffer . fgets($filein);
    }

    // close the file
    fclose($filein);

    // check to make sure we got something out of the file
    if ($buffer>''){
      // do this while any occurrences of the beginning pattern are still in the data
      while( substr_count(strtolower($buffer),$begin_pattern)>0 ){
        // trim the data to just past the next beginning pattern occurrence
        $buffer=substr($buffer, strpos(strtolower($buffer),$begin_pattern)+strlen($begin_pattern));
        // pull the data in from where we trimmed the data to the occurrence of the next end pattern
        $dataout=substr($buffer,0,strpos($buffer,$end_pattern));
        // trim the buffer by the length of the data we pulled
        $buffer=substr($buffer,strlen($dataout));
        // output the data we pulled - could go into an array here or write it to a file, whatever
        echo $dataout . '<br>';
      }
    }
    ?>

It takes a file that looks like this:

    <a href="testurl1.com">crapcrapcrap<a href="testurl2.com">morecrapmorecrap<a href="testurl3.com">yesevenmore<a href="testurl4.com">awholelottacrap<a href="testurl5.com"><a href="testurl6.com"><a href="testurl7.com"><a href="testurl8.com"><a href="testurl9.com"><a href="testurl10.com"><a href="testurl11.com"><a href="testurl12.com"><a href="testurl13.com"><a href="testurl14.com"><a href="testurl15.com"><a href="testurl16.com"><a href="testurl17.com"><a href="testurl18.com"><a href="testurl19.com"><a href="testurl20.com"><a href="testurl21.com"><a href="testurl22.com"><a href="testurl23.com"><a href="testurl24.com"><a href="testurl25.com">

and outputs it like this:

    testurl1.com
    testurl2.com
    testurl3.com
    testurl4.com
    testurl5.com
    testurl6.com
    testurl7.com
    testurl8.com
    testurl9.com
    testurl10.com
    testurl11.com
    testurl12.com
    testurl13.com
    testurl14.com
    testurl15.com
    testurl16.com
    testurl17.com
    testurl18.com
    testurl19.com
    testurl20.com
    testurl21.com
    testurl22.com
    testurl23.com
    testurl24.com
    testurl25.com

__________________
All cookies cleared! |
#9 |
<&(©¿©)&>
Industry Role:
Join Date: Jul 2002
Location: Chicago
Posts: 47,882
|
if you want a custom solution, icq: 33375924
#10 |
Confirmed User
Join Date: Mar 2004
Posts: 5,116
|
Wow, thank you VERY much sarettah! =))
#11 | |
see you later, I'm gone
Industry Role:
Join Date: Oct 2002
Posts: 14,111
|
Quote:
You're welcome. However, looking back, I have a code error in there from when I was making it "friendly". Where I have:

    // open the file
    $filein=fopen('testfile.txt','r');

Change it to:

    // open the file
    $filein=fopen($file2read,'r');

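
Since the original question was about a list of URLs rather than one local file, here is a rough sketch of how the same start/stop idea could be wrapped in a function and run over several pages. Everything named here (extract_between, extract_from_urls, the $fetch callable, the fake fetcher) is made up for illustration and is not from the thread. It assumes PHP 7.4 or newer and non-empty start and stop strings, and the default fetcher, file_get_contents(), only loads remote URLs when allow_url_fopen is enabled.

```php
<?php
// Hypothetical helper (not from the thread): collect every substring that
// sits between $begin and $end, matching $begin case-insensitively.
function extract_between(string $text, string $begin, string $end): array
{
    $found = [];
    $offset = 0;
    // stripos() returns false once no further $begin is left.
    while (($start = stripos($text, $begin, $offset)) !== false) {
        $start += strlen($begin);
        $stop = strpos($text, $end, $start);
        if ($stop === false) {
            break; // no closing pattern: stop instead of guessing
        }
        $found[] = substr($text, $start, $stop - $start);
        $offset = $stop + strlen($end);
    }
    return $found;
}

// Hypothetical driver: $fetch is any callable that returns a page body for
// a URL. It defaults to file_get_contents().
function extract_from_urls(array $urls, string $begin, string $end, ?callable $fetch = null): array
{
    $fetch = $fetch ?? 'file_get_contents';
    $results = [];
    foreach ($urls as $url) {
        $page = $fetch($url);
        // A failed fetch yields false; record an empty list for that URL.
        $results[$url] = is_string($page) ? extract_between($page, $begin, $end) : [];
    }
    return $results;
}

// Usage with a stand-in fetcher so the sketch runs without network access.
$fake = fn(string $url): string => '<a href="' . $url . '/one">x<A HREF="' . $url . '/two">y';
print_r(extract_from_urls(['testurl1.com'], '<a href="', '">', $fake));
```

Passing the fetcher in as a callable keeps the extraction logic testable offline; swapping in cURL or a cached copy of the pages would not require touching extract_between().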
#12 |
Registered User
Join Date: Aug 2005
Posts: 3,570
|
hey sweet