
Removing records from csv file

Status
Not open for further replies.

ianicr

IS-IT--Management
Nov 4, 2003
230
GB
I have a CSV file of about 140,000 lines of address and telephone data, and another file of 50,000 phone numbers that need to be removed from the first file. I've tried grep -v -f file2 file1 >file3, but this takes forever. Is there an easier way to do it, other than splitting the second file into smaller chunks and then grepping with the smaller files?
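
One likely reason the grep above is slow is that every line of file2 is treated as a regular expression. A minimal sketch, assuming file2 holds plain digit strings with no blank lines: the standard -F option tells grep to treat each pattern as a fixed string, which is usually much faster for large pattern files. Note that this matches a number anywhere in the line, not only in the phone field, and that a blank line in file2 would match (and so remove) every record. The second record in the demo data is invented.

```shell
# Demo data; in practice file1 and file2 already exist.
cd "$(mktemp -d)"
printf '%s\n' \
  'Mr,J,Bird,439 Stevens Road,Atecks Green,DERBY,,DE12 4GK,01332999999' \
  'Ms,A,Finch,1 High Street,,DERBY,,DE1 1AA,01332123456' > file1
printf '%s\n' 01332123456 01332987654 > file2

# -F: treat each line of file2 as a fixed string, not a regex
# -v: print only the lines of file1 that match none of them
grep -v -F -f file2 file1 > file3
cat file3
```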
 
Sample records from both files, please.

vlad
+----------------------------+
| #include<disclaimer.h> |
+----------------------------+
 
This awk program is not tested and assumes that the records in the second file contain only a phone number. Change the phfld variable to the field in the first file containing the phone number.

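BEGIN {
    FS = ","      # file1 is comma-separated
    # load the phone numbers to remove from file2, one per line
    while ((getline < "file2") > 0) a[$0] = 1
    phfld = 1     # number of the file1 field that holds the phone number
}
{
    # print the record unless its phone number was listed in file2
    if (!a[$phfld]) print
}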
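
The same lookup-table idea can also be written so that awk reads both files from the command line instead of calling getline in BEGIN. This is only a sketch, assuming the phone number is the last field of each file1 record and that file2 is not empty; the array name drop is my own choice, and the second demo record is invented.

```shell
# Demo data; in practice file1 and file2 already exist.
cd "$(mktemp -d)"
printf '%s\n' \
  'Mr,J,Bird,439 Stevens Road,Atecks Green,DERBY,,DE12 4GK,01332999999' \
  'Ms,A,Finch,1 High Street,,DERBY,,DE1 1AA,01332123456' > file1
printf '%s\n' 01332123456 01332987654 > file2

# NR == FNR is true only while the first file named (file2) is being read:
# remember each number there, then print the file1 records whose last
# field is not one of the remembered numbers.
awk -F, 'NR == FNR { drop[$0]; next } !($NF in drop)' file2 file1 > file3
cat file3
```

Using ($NF in drop) rather than !drop[$NF] avoids creating an empty array entry for every record that is looked up.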

CaKiwi

"I love mankind, it's people I can't stand" - Linus Van Pelt
 
Sample records are as follows:
file1:
Mr,J,Bird,439 Stevens Road,Atecks Green,DERBY,,DE12 4GK,01332999999

The other file is just a file of numbers like:
01332123456
01332987654

Sorry, I'm not familiar with awk. Do I just save the program to a file and run awk -f programname filename?
 
Yes.
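
Spelled out, that could look roughly like the following. The program name remove.awk and the second demo record are made up for illustration; phfld is set to 9 because the phone number is the ninth comma-separated field of the sample record.

```shell
# Demo data; in practice file1 and file2 already exist.
cd "$(mktemp -d)"
printf '%s\n' \
  'Mr,J,Bird,439 Stevens Road,Atecks Green,DERBY,,DE12 4GK,01332999999' \
  'Ms,A,Finch,1 High Street,,DERBY,,DE1 1AA,01332123456' > file1
printf '%s\n' 01332123456 01332987654 > file2

# Save the program under a name of your choice (remove.awk is hypothetical).
cat > remove.awk <<'EOF'
BEGIN {
    FS = ","
    while ((getline < "file2") > 0) a[$0] = 1
    phfld = 9
}
{
    if (!a[$phfld]) print
}
EOF

# awk -f programname filename; redirect stdout to keep the result
awk -f remove.awk file1 > file3
cat file3
```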

CaKiwi

 
For your sample data, phfld should be 9. If the phone number is always the last field, you could instead use

if (!a[$NF]) print
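
To see why both choices pick out the same field, here is a quick check on the sample record: NF is the number of fields in the current record and $NF is the last one. The variable name out is only there for the demonstration.

```shell
# Count the fields of the sample record and show the last one.
out=$(echo 'Mr,J,Bird,439 Stevens Road,Atecks Green,DERBY,,DE12 4GK,01332999999' |
  awk -F, '{ print NF, $NF }')
echo "$out"
# prints: 9 01332999999
```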



CaKiwi

 
Just a quick question. Is there a way, using the awk script above, to write the records it has removed into another file?

I'm very new to awk, but it seems like an excellent utility to learn. Does anyone know of any good awk tutorials or books?
 
To write the records removed into another file, add an else to the if, like so:
{
    if (!a[$NF]) print
    else print > "file3"
}
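
Put together as one command, the split might look like this. It is a sketch only: the output names kept.csv and removed.csv are hypothetical (chosen so they do not clash with the file3 used earlier in the thread), and the second demo record is invented.

```shell
# Demo data; in practice file1 and file2 already exist.
cd "$(mktemp -d)"
printf '%s\n' \
  'Mr,J,Bird,439 Stevens Road,Atecks Green,DERBY,,DE12 4GK,01332999999' \
  'Ms,A,Finch,1 High Street,,DERBY,,DE1 1AA,01332123456' > file1
printf '%s\n' 01332123456 01332987654 > file2

awk -F, '
BEGIN {
    # load the numbers to remove, one per line of file2
    while ((getline line < "file2") > 0) a[line] = 1
}
{
    if (!a[$NF]) print                  # kept records go to stdout
    else print > "removed.csv"          # removed records go to their own file
}' file1 > kept.csv

cat kept.csv
cat removed.csv
```

Inside an awk program, print > "name" truncates the file the first time it is opened and appends on later writes, so removed.csv ends up holding every removed record, not only the last one.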

I use the O'Reilly book "sed & awk" by Dougherty and Robbins. There are a lot of resources online, including the gawk manual.

CaKiwi

 