Fields are in different locations

Status
Not open for further replies.

rscotty

IS-IT--Management
Mar 27, 2002
US
I am trying to use awk to grab records from a text file. However, one field is a moving target: in one record it is at location 54, and in the next it is at location 60. Is there a way to pattern-match on, say, a Social Security number format (111-11-1111) and grab that field along with a few other fields from the same record?

Thanks for any help!

rscotty
 
Use the "match" function to find an RE [your SSN pattern] within a given record.
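
As a minimal sketch of that suggestion: match() returns the 1-based position of the first match (0 if there is none) and sets RSTART and RLENGTH, so substr() can pull the matched text out wherever it sits in the record. The function name find_ssn and the sample record are only illustrative, and the pattern is spelled out digit by digit on the assumption that not every awk supports {n} repetition.

```shell
# find_ssn: hypothetical helper; reads records on stdin and prints
# the position and text of the first SSN-shaped string in each one.
find_ssn() {
  awk '{
    # match() returns 0 and sets RSTART to 0 when nothing matches
    if (match($0, /[0-9][0-9][0-9]-[0-9][0-9]-[0-9][0-9][0-9][0-9]/))
      print RSTART, substr($0, RSTART, RLENGTH)
    else
      print 0, "no match"
  }'
}

printf 'a,b,111-12-1234,c\n' | find_ssn
```

Checking the return value before calling substr() matters, because a start position of 0 would otherwise hand back the beginning of the record rather than an SSN.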

Please post a sample file with an explanation of what needs to be done.

vlad
 

12,SCOTT,RAY,2,3,4,5,6,111-12-1234,0,0,0,0
13,SCOTT,RAYMOND,2,3,4,5,6,7,8,222-11-9876,0,0,0,0

I need to grab the following info from the file:
Field1, Field4, Field9 from the first record
and Field1, Field4, Field11 from the second record, and so on. Most of the records have the fields in the same location, but in enough of them the SSN is in a different location. It's a bear!

Thanks for looking at this for me!

Rscotty
 
Use an extended regular expression to find your target. Then print out your fields and the target range. I assume in this example you want fields 1, 4, and the ssn to be comma delimited.

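BEGIN {
    FS = ","
    OFS = ","
    # Interval expressions such as {3} need a POSIX-compliant awk;
    # older versions of gawk require --re-interval or --posix.
    ssn = "[0-9]{3}-[0-9]{2}-[0-9]{4}"
}
{
    # match() returns the start of the SSN, or 0 if none is found
    print $1, $4, substr($0, match($0, ssn), 11)
}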
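
A hedged variant of the script above, assuming the awk in use may not support interval expressions such as {3}. When match() finds nothing it returns 0, and substr() starting at 0 just returns the leading characters of the record. This sketch spells the pattern out, prints field 4 as the question asked, and skips records with no SSN. The function name ssn_fields is hypothetical.

```shell
# ssn_fields: hypothetical helper; reads comma-delimited records on stdin
ssn_fields() {
  awk '
  BEGIN {
    FS = OFS = ","
    ssn = "[0-9][0-9][0-9]-[0-9][0-9]-[0-9][0-9][0-9][0-9]"
  }
  # Print only when the pattern was actually found (match() > 0)
  match($0, ssn) { print $1, $4, substr($0, RSTART, RLENGTH) }
  '
}

printf '%s\n' \
  '12,SCOTT,RAY,2,3,4,5,6,111-12-1234,0,0,0,0' \
  '13,SCOTT,RAYMOND,2,3,4,5,6,7,8,222-11-9876,0,0,0,0' | ssn_fields
```

Using RSTART and RLENGTH instead of a hard-coded length of 11 keeps the substr() call tied to whatever the pattern actually matched.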

Cheers,
ND [smile]

bigoldbulldog@hotmail.com
 
Thanks for your help. However, the match is not picking up the SSN properly. It finds the first 11 digits in the record and prints those instead. Each record also has other info in it that I don't want, and most of that info is digits.

Thank you,
rscotty
 
Hi rscotty,

If your first 4 fields can be counted on to be static, then this should work for you. I tested it on your short snippet of input and it seems to do what you want. I set it up to be comma delimited, of course!


awk 'BEGIN { FS = "," }

{
    # Fields 1-4 are fixed; scan the remaining fields for an SSN
    for (i = 5; i <= NF; i++)
        if ($i ~ /[0-9][0-9][0-9]-[0-9][0-9]-[0-9][0-9][0-9][0-9]/) {
            printf("%s,%s,%s\n", $1, $4, $i)
        }
}' input > output
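
A slightly stricter sketch of the same loop, under the assumption that the SSN always fills a whole field: anchoring the pattern with ^ and $ rejects longer digit strings such as 1111-12-12345, and next stops at the first hit so each record prints at most once. The function name ssn_by_field is hypothetical.

```shell
# ssn_by_field: hypothetical helper; reads comma-delimited records on stdin
ssn_by_field() {
  awk 'BEGIN { FS = OFS = "," }
  {
    # Fields 1-4 are assumed fixed, so start scanning at field 5
    for (i = 5; i <= NF; i++)
      if ($i ~ /^[0-9][0-9][0-9]-[0-9][0-9]-[0-9][0-9][0-9][0-9]$/) {
        print $1, $4, $i
        next   # one SSN per record is enough
      }
  }'
}

printf '12,SCOTT,RAY,2,3,4,5,6,111-12-1234,0,0,0,0\n' | ssn_by_field
```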


Hope this helps!


flogrr
flogr@yahoo.com

 
That worked! Thanks so much for everyone's help. This is a great forum that I will tell all my IT and programmer friends about.

rscotty
 