Problems with FS " "

Status
Not open for further replies.

Orale

Technical User
Nov 8, 2006
5
US
Hello everyone,

I have 3 files that I need to combine into 1 single file.
The first file, myfirst, looks like:
00001
00002
00003
00004.... and whatever numbers this file contains are the ones I need to process against.

The second file, mysecond, looks like:
"00001","a description","character","size","F",.....
"00002","a, description","character","size","F",.....
"00003","a description","character","size","F",.....
"00004","a, description","character","size","F",.....
...
...
I need to process myfirst against mysecond: whenever the first field matches, awk should write the compared field, the second field and the fourth field to mynewfile.
The issue I'm having is that I started out using the comma as the FS, which created a problem because the second field sometimes contains a comma even though it is 1 field, so it throws off my logic below:

awk -F, '
NR==FNR{a[$1]=$0;next}
{for(i in a)if($1~i){print a[i]" " $2"--"$4 ;next}}
' myfirst mysecond >>newfile

When the above is run, the fields are written incorrectly whenever an extra comma is encountered in the second field of the second file.
I tried changing "-F," to "-F" "" and all I get is the comparing number and a bunch of "--" throughout the new file.
I'm expecting:
00001 a description -- size
and I'm getting:
00001 a -- character
or
00001 ---
Could you please tell me how to solve for that?

Finally, mythird file looks like:
"00001","11.234","03/18/1998"
"00002","11.234","03/18/1998"
"00003","11.234","03/18/1998"
"00004","11.234","03/18/1998"
...
...

Once the first problem is resolved, could you tell me how to make the 2nd field of mythird the last field in my newfile (matching on the first field between the two files, of course)?

Thank you in advance for all your help and sorry for the long note.

Orale!
 
Oh Boy......

Does anybody know how to fix the issue I'm having, or could anybody guide me?
I'm OK with just some tips on how to deal with the double quotes ("). I don't need the full script or my whole issue resolved, but a few tips might put me on the right path to figure it out.
If not, are there any links or books on awk that you guys would recommend?

Thanks again,
Orale
 
Well, you can't have a field delimiter character that also occurs within the text of a field. awk simply splits the text at that character and increments the field count for the record at hand.

If you have to deal with this kind of source data, you will have to 'prime' the data first, e.g. with sed, to switch to a different field delimiter that does not occur in the data. As an example, to change the delimiter to the ':' character, try this:

sed 's/^"//; s/"$//; s/","/:/g' <mysecond >mysecond.primed
awk -F: 'your program here' myfirst mysecond.primed
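
To make the priming step concrete, here is a minimal end-to-end sketch of it. The sample rows are copied from the question (with the elided trailing fields left off), the temporary directory and the array name want are only for illustration, and it assumes that ':' never occurs inside the data and that every line starts and ends with a double quote.

```shell
# Work in a scratch directory so nothing existing gets overwritten.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Sample data modelled on the question; file names are the poster's own.
printf '%s\n' 00001 00002 > myfirst
cat > mysecond <<'EOF'
"00001","a description","character","size","F"
"00002","a, description","character","size","F"
"00003","a description","character","size","F"
EOF

# Prime the data: drop the outer quotes, turn each "," separator into ':'.
sed 's/^"//; s/"$//; s/","/:/g' < mysecond > mysecond.primed

# Remember the wanted keys from myfirst, then print fields 1, 2 and 4
# of every primed record whose first field is one of those keys.
awk -F: '
NR == FNR { want[$1]; next }
$1 in want { print $1 " " $2 " -- " $4 }
' myfirst mysecond.primed > newfile

cat newfile
```

With this sample data, newfile ends up with "00001 a description -- size" and "00002 a, description -- size", so the comma inside the description no longer shifts the fields. The sketch uses an exact lookup ($1 in want) rather than the regex match from the question; that is a substitution on my part, not something the original script did.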


HTH,

p5wizard
 
If every field is surrounded by quotes, then try this:
Code:
FS = "\",\""
That separator leaves one quote at the start and one at the end of each line, so for every line you'll also need this:
Code:
gsub( /^"|"$/, "" )
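
Put together, the two pieces above could look like the sketch below, which also pulls the 2nd field of mythird onto the end of each output line as asked in the original post. This is only one possible arrangement: the array names want and extra and the scratch directory are made up for illustration, the sample rows are taken from the question without the elided trailing fields, and it assumes every field in mysecond and mythird is quoted.

```shell
# Work in a scratch directory so nothing existing gets overwritten.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Sample data modelled on the question; file names are the poster's own.
printf '%s\n' 00001 00002 > myfirst
cat > mysecond <<'EOF'
"00001","a description","character","size","F"
"00002","a, description","character","size","F"
"00003","a description","character","size","F"
EOF
cat > mythird <<'EOF'
"00001","11.234","03/18/1998"
"00002","11.234","03/18/1998"
"00003","11.234","03/18/1998"
EOF

# mythird is read before mysecond so its values are known when printing.
awk '
BEGIN { FS = "\",\"" }             # split on the three characters ","
{ gsub(/^"|"$/, "") }              # strip outer quotes; $0 is re-split
FILENAME == "myfirst" { want[$1]; next }
FILENAME == "mythird" { if ($1 in want) extra[$1] = $2; next }
$1 in want { print $1 " " $2 " -- " $4 " " extra[$1] }
' myfirst mythird mysecond > newfile

cat newfile
```

For the sample rows this writes "00001 a description -- size 11.234" and "00002 a, description -- size 11.234". Because the separator is the whole sequence quote-comma-quote, a bare comma inside a description is left alone.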
 
How can I read only the string content inside " ", for example to read "there is a wild dog", using an awk one-liner? Thank you for your answer.
 
Thank you guys, that's exactly what I needed.
 