Old 02-19-2014, 09:37 AM   #1
Jel
Confirmed User
 
Join Date: Feb 2007
Posts: 6,904
how do I do this...

text file 1 has 5000 unique lines; text file 2 has 500 unique lines

I want to remove from file 1 each *entire line* that contains data from file 2, leaving me 4500 lines

!I don't know php!

!I barely know excel (well, open office) but if it's doable in that I can follow simpleton instructions !

Ideally an online place similar to textmechanic.com, where I load one, load the other, and hey presto.

thanks in advance, and I might even thank you again afterwards for double the fun
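(For anyone reading with shell access to a Linux/Mac box or a server, one line of grep does exactly this; the file names below are just placeholders for the two text files:

grep -v -F -f file2.txt file1.txt > file1-cleaned.txt

-f file2.txt reads every line of file 2 as a pattern, -F treats them as fixed strings rather than regexes, and -v keeps only the lines of file 1 that contain none of them. One caveat: a blank line in file 2 would match every line, so strip empty lines from it first.)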
Old 02-19-2014, 09:47 AM   #2
klinton
So Fucking Banned
 
Join Date: Apr 2003
Location: online
Posts: 8,766
how about simple file splitting?

like into 10 parts? that gives you 500 lines each, which you can extract later
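(The splitting step itself is one command on any Unix-like system -- a sketch, with the file name as a placeholder:

split -l 500 file1.txt part_

which writes part_aa, part_ab, ... at 500 lines apiece. Note it only chunks the file; it doesn't remove the matching lines.)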
Old 02-19-2014, 09:58 AM   #3
DamianJ
Too lazy to set a custom title
 
Join Date: Jul 2006
Location: A magical land
Posts: 15,808
fdupes -r /some/directory/path > /some/directory/path/fdupes.log

or if you want a GUI:

http://download.cnet.com/Remove-Dele...-75741543.html

Google is awesome.
Old 02-19-2014, 10:02 AM   #4
robwod
Confirmed User
 
Join Date: Nov 2005
Posts: 2,539
If I understand correctly, you want to simply look up all the data in list #2 to find a match on list #1, then remove that record?

A slightly inefficient method would be to use a single Excel workbook: place list #1 in one sheet (tab) and list #2 in another. Give the data in the second sheet a range name, then use VLOOKUP in an adjacent empty cell on the first line of list #1 to perform a lookup. Set the condition that if a match is found, "Y" is inserted into that cell.

At that point, drag the VLOOKUP formula down the column to apply it to all records in the first list. Then you can simply sort list #1 by the VLOOKUP column and highlight/delete all the rows with a "Y" on them.

Someone else can likely provide a more efficient method, but this should work if you have exact match records. Use F1 inside Excel for help on using the vlookup function or a simple Google query will show you all sorts of usage examples.
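(For what it's worth, one possible form of that lookup cell -- cell and range names here are purely illustrative, with A1 the first cell of list #1 and List2 the range name given to the second sheet -- would be =IF(ISNA(VLOOKUP(A1,List2,1,FALSE)),"","Y"), dragged down the column.)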

ETA: see? I knew someone else would post something more efficient :D
__________________
NSFW

Last edited by robwod; 02-19-2014 at 10:03 AM..
Old 02-19-2014, 10:04 AM   #5
Jel
Confirmed User
 
Join Date: Feb 2007
Posts: 6,904
Quote:
Originally Posted by robwod View Post
If I understand correctly, you want to simply look up all the data in list #2 to find a match on list #1, then remove that record?
correct mate

Quote:
Originally Posted by robwod View Post
A slightly inefficient method would be to use a single Excel workbook: place list #1 in one sheet (tab) and list #2 in another. Give the data in the second sheet a range name, then use VLOOKUP in an adjacent empty cell on the first line of list #1 to perform a lookup. Set the condition that if a match is found, "Y" is inserted into that cell.

At that point, drag the VLOOKUP formula down the column to apply it to all records in the first list. Then you can simply sort list #1 by the VLOOKUP column and highlight/delete all the rows with a "Y" on them.

Someone else can likely provide a more efficient method, but this should work if you have exact match records. Use F1 inside Excel for help on using the vlookup function or a simple Google query will show you all sorts of usage examples.
I'm praying there's an easier way to do this :D
Old 02-19-2014, 12:34 PM   #6
DamianJ
Too lazy to set a custom title
 
Join Date: Jul 2006
Location: A magical land
Posts: 15,808
Quote:
Originally Posted by Jel View Post
I'm praying there's an easier way to do this :D
I've already posted two.

Dur.
Old 02-19-2014, 01:07 PM   #7
Jel
Confirmed User
 
Join Date: Feb 2007
Posts: 6,904
gaymian - your GUI suggestion doesn't do what I need it to (nor is it a full version).
Old 02-19-2014, 03:28 PM   #8
Barry-xlovecam
It's 42
 
Join Date: Jun 2010
Location: Global
Posts: 18,083
Quote:
~:$cd /home/user/directory-where-the-files-are/
/home/user/directory-where-the-files-are/:$cat file1 file2 > bigfile.csv
/home/user/directory-where-the-files-are/:$sort bigfile.csv | uniq > sortedfile.csv
Use your LINUX webserver in SSH or a LINUX computer terminal for this.

Google has some excel solutions (for the point and click crowd)...
https://www.google.com/search?q=remo...la8&oe=utf-8

Quote:
wc -l bigfile.csv
47784 bigfile.csv
wc -l sortedfile.csv
29466 sortedfile.csv
My 'bigfile.csv' has 47784 lines. Sorting out 18318 duplicate lines took less than 2 seconds -- touch that, Excel -- eat my dust!
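(Worth noting: cat + sort + uniq de-duplicates the combined file rather than subtracting file 2 from file 1. If the lines in the two files matched exactly -- the thread later clarifies they only share a phrase -- a sorted set difference would be closer to the goal. A sketch assuming bash, with placeholder file names:

comm -23 <(sort file1) <(sort file2) > only-in-file1

comm -23 prints the lines that appear only in the first sorted input.)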
Old 02-19-2014, 03:35 PM   #9
myleene
Confirmed User
 
Join Date: Oct 2013
Location: Canada
Posts: 866
Notepad++ can do this in 5 seconds with TextFX.

http://stackoverflow.com/questions/3...ows-in-notepad

(*) Lines will have to be sorted (either ascending or descending) if you use the first method, though.


Otherwise... Use a regex.

Last edited by myleene; 02-19-2014 at 03:37 PM..
Old 02-19-2014, 03:36 PM   #10
stoka
Confirmed User
 
 
Join Date: Dec 2005
Posts: 956
useful stuff
Old 02-19-2014, 03:57 PM   #11
OneHungLo
So Fucking Banned
 
Join Date: May 2001
Location: Your mom's front hole
Posts: 40,906
http://textmechanic.com/Delimited-Column-Extractor.html

edit...I saw that you tried textmechanic.com...Are you sure you can't do it there?

Last edited by OneHungLo; 02-19-2014 at 03:59 PM..
Old 02-19-2014, 04:40 PM   #12
Seth Manson
Please dont fuck animals
 
Join Date: Jul 2010
Location: Henderson, NV
Posts: 3,988
I would use UltraCompare. It's part of the UltraEdit suite. Pretty sure you can get a free trial.
Old 02-19-2014, 05:22 PM   #13
Jel
Confirmed User
 
Join Date: Feb 2007
Posts: 6,904
regex, wc>@blah etc - waaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaay over my head...

onehunglo - don't think it does what I want - the lines on files 1 & 2 aren't the same, so aren't duplicates as such.

eg file 1 consists of:

line1: info data blahblah OH-LOOK-THIS-PHRASE-1 etc etc more data on this line
line2: info data blahblah OH-LOOK-THIS-PHRASE-2 etc etc more data on this line
line3: info data blahblah OH-LOOK-THIS-PHRASE-3 etc etc more data on this line
--
line5000: info data blahblah OH-LOOK-THIS-PHRASE-5000 etc etc more data on this line

file 2 consists of:
line1: OH-LOOK-THIS-PHRASE-1
line2: OH-LOOK-THIS-PHRASE-2
line3: OH-LOOK-THIS-PHRASE-3
--
line450: OH-LOOK-THIS-PHRASE-450

what I need:
input file 1, input file 2, and have it search file 1 for OH-LOOK-THIS-PHRASE-X and remove that entire line.
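(That is exactly what the grep sketch back in the first post does: grep -v -F -f file2 file1. An equivalent single awk call, in case grep isn't handy -- file names are placeholders:

awk 'NR==FNR { phrases[$0]; next } { for (p in phrases) if (index($0, p)) next; print }' file2.txt file1.txt > kept.txt

The first block loads every line of file 2 as a phrase; the second prints a line of file 1 only if none of those phrases appears anywhere in it.)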
Old 02-20-2014, 05:30 AM   #14
Jel
Confirmed User
 
Join Date: Feb 2007
Posts: 6,904
bump for this shift in case there's an easy way while I wait for my coder bloke to get back from holiday (and try ultracompare)
Old 02-20-2014, 05:41 AM   #15
DamianJ
Too lazy to set a custom title
 
Join Date: Jul 2006
Location: A magical land
Posts: 15,808
Quote:
Originally Posted by Jel View Post
gaymian
Oh that is genius! You took the first bit of my name and changed it to GAY!

HiLARious!

Last edited by DamianJ; 02-20-2014 at 05:42 AM..
Old 02-20-2014, 07:03 AM   #16
Jel
Confirmed User
 
Join Date: Feb 2007
Posts: 6,904
almost as funny as you following dvt around and using:

bummer boy
spacker
likes bumming men
looking for men to bum
lollington
ad hominem
now I know why people got the hump with me and markham
etc

over and over and over and over. Why don't you fuck off out of this biz thread wankstain, it's annoying keep having to 'view post' when you post in my biz threads.

You're a fucking cock, you don't like me, I don't like you, so fuck off. Or don't, and go look up the word provocation, and how it's a viable defence in assault cases.

Cue 'wahwah' remarks, 'oh are you THREATENING me???' variations, 'hmm, bannable offence' bollocks, or any other of your predictable and tedious as fuck posts. don't forget the grammer mistaikes as well, mr im-too-stupid-to-realise-i-act-like-the-precocious-11-year-old-kid-that-thinks-he-is-funny-but-everyone-thinks-is-an-out-and-out-cunt
Old 02-20-2014, 07:56 AM   #17
pinkz
Mr 1%
 
Join Date: May 2005
Location: Nomad
Posts: 1,397
http://textmechanic.com/Big-File-Too...ate-Lines.html
__________________
$$$$ Video Secrets $$$$
Old 02-20-2014, 08:04 AM   #18
Jel
Confirmed User
 
Join Date: Feb 2007
Posts: 6,904
they aren't duplicate lines - see post #13

still to fire up ultracompare to see if that does the trick