Substituting fields in one file with the same fields from another

Status
Not open for further replies.

starlite79

Technical User
Aug 15, 2008
89
US
Hi everyone.

I'm fairly new to AWK, but from what I've read I think it can do some powerful things if I know what I'm doing.

I would like to be able to substitute two fields in one file (a Fortran program I wrote) with the same fields from a C program that is very similar.

Here is one line from each file:

C file line:
vvd(test(-0.1), 6.183185307179586477, 1e-12, "test", "a", status);

Fortran file line:
CALL VVD ( test ( -0.1D0 ), 6.183185307179587D0,
: 1D-12, 'test', 'A', STATUS )

I began by writing a simple awk program that makes the comma the field separator and prints the second field, like so:

BEGIN { FS = "," # make comma the field separator
}
$1 ~ /VVD/ { print $2
}

The C file has records that take only one line. The Fortran program, on the other hand, has records that span multiple lines (marked by the continuation character :) to conform to my colleague's 72-character requirement. I am only concerned with the second and third fields, and I need to know how to make ; the record separator for one file and something else (maybe a newline?) for the other.

Any help would be very much appreciated.
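
One possible way to give each file its own record separator, offered only as a sketch: awk processes var=value operands on the command line just before it opens the next file, so RS can be set separately for each input. The file names, the src variable and the sample lines below are made up for illustration.

```sh
tmp=$(mktemp -d)
printf 'vvd(a, 1.0, 1e-12);\nvvd(b, 2.0, 1e-12);\n' > "$tmp/test.c"
printf '      CALL VVD ( A, 1D0,\n     :           1D-12 )\n' > "$tmp/test.f"

out=$(awk '
{
    sub(/^\n/, "")                  # drop the newline left over after each ";"
    if ($0 != "") print src ": " $0 # skip the empty record at the end of the C file
}' src=C RS=';' "$tmp/test.c" src=F RS='\n' "$tmp/test.f")

echo "$out"
rm -r "$tmp"
```

With these settings each C statement arrives as one record (without its semicolon), while the Fortran file is still read line by line, so continuation lines have to be joined separately.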
 
Thanks again. I was able to match the indices using 9 substitutions (I'll have to think about how to do it with fewer; right now I just want something that works) and to get D0 appended to the precision when it is not there.
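
As a rough sketch of how the index matching might be done without a chain of substitutions: split the name on the brackets and add 1 to each index. The helper name c2f and the sample name rm[0][2] are hypothetical, and the sketch assumes every subscript is a plain integer.

```sh
out=$(echo 'rm[0][2]' | awk '
# c2f: turn a C-style subscripted name into Fortran style,
# adding 1 to every index (hypothetical helper)
function c2f(s,    n, parts, i, idx, res, sep) {
    n = split(s, parts, /[][]+/)      # "rm[0][2]" -> "rm", "0", "2", ""
    if (n < 2) return s               # no subscripts: leave the name alone
    res = parts[1] "("
    sep = ""
    for (i = 2; i <= n; i++) {
        if (parts[i] == "") continue  # empty piece after the last "]"
        idx = parts[i] + 1            # C counts from 0, Fortran from 1
        res = res sep idx
        sep = ","
    }
    return res ")"
}
{ print c2f($0) }')

echo "$out"   # rm(1,3)
```

Because each index is incremented exactly once, the order-of-substitution problem (2 before 1 before 0) does not arise.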

I'm trying to work out on my own how to get the comma back when it is missing after the precision value. I did a print of "n" and noticed it is missing only when (at least in the code I'm testing) n is 2. This is also when the third field is on the next line. I can't think of a good way to split the line other than to use the "one or more commas."

I tried using if (n == "2") to redefine prec[ind] with a comma attached at the end, but that didn't work. I'm trying to work within the conditional oldtol == "" clause. No matter where I attempt to redefine prec[ind], it is overwritten by the split.

I also tried to remove the horrible ^M characters that appear in my new file, but it doesn't work. Do they show up in yours? For me they appear when the tolerance IS on the same line as the precision (and there is no comma missing then).

Here is the piece of the code I think I would need to fix:

Code:
        oldtol=a[3]
        # if the tolerance was not on this line, it
        # must be on the next
        if (oldtol == "") {
                print
                getline
                n=split($0,a,", +")
                oldtol=a[1]
                # remove line continuation and white space
                sub("[[:space:]:]*","",oldtol)
        }
        if (prec[ind] !~ ",") {
              prec[ind]=prec[ind]","
        }
        sub(oldtol,tol[ind])
        print
        next
}
# remove ^M characters in the new file
sub("\r$","")
# print any other lines in the Fortran input
FNR != NR { print }

Well, I gave it quite a few tries and still do not get it.
Am I at least in the ballpark?
 
Hi again. I removed the ^M characters and reduced the index substitutions to six. I'm still having trouble with getting that comma back in.

Here is the code in its present form:
Code:
# match vvd preceded by white space
/^[ ]*vvd/ {
        # split the line on a comma followed by one or more spaces
        split($0,a,", +")
        ind=tolower(a[1])
        # strip off the function name and bracket
        sub(".*vvd[(]","",ind)
        # match C array syntax with Fortran array syntax
        sub("[]][]]",",",ind)
        sub("[[]","(",ind)
        sub("[]][[]",",",ind)
        sub("[]]",")",ind)
        # change C array indices to match Fortran indices
        sub("2,","3,",ind)
        sub("1,","2,",ind)
        sub("0,","1,",ind)
        sub(",2",",3",ind)
        sub(",1",",2",ind)
        sub(",0",",1",ind)
        prec[ind]=a[2]
        tol[ind]=a[3]
        # change exponent syntax
        sub("e","D",tol[ind])
        sub("e","D",prec[ind])
        next # skip to the next record
}
# match CALL VVD anywhere on a line
/CALL VVD/ {
        # split the line on a comma followed by one or more spaces
        n=split($0,a,", +")
        ind=a[1]
        # remove the function call
        sub(".*CALL VVD [(]","",ind)
        # remove spaces
        gsub(" ","",ind)
        # remove decimal
        sub("D0","",ind)
        # remove underscore
        sub("_","",ind)
        ind=tolower(ind)
        # if precision does not have "D" in it, append "D0"
        if (prec[ind] !~ "D") {
              prec[ind]=prec[ind]"D0"
        }
        sub(a[2],prec[ind])
        if (n == "2") {
              prec[ind]=prec[ind]","
        }
        oldtol=a[3]
        # if the tolerance was not on this line, it
        # must be on the next
        if (oldtol == "") {
                print
                getline
                n=split($0,a,", +")
                oldtol=a[1]
                # remove line continuation and white space
                sub("[[:space:]:]*","",oldtol)
        }
        sub(oldtol,tol[ind])
        # remove ^M characters in the new file
        sub("\r$","")
        print
        next
}
# print any other lines in the Fortran input
FNR != NR { print }

Any hints would be greatly appreciated.
 
You just need to substitute it into the output line *after* you append the comma. Also I think you can remove the sub("[]][]]",",",ind) as you never need to match ]].

Annihilannic.
 
Ok, I'll give that a try. I'd like to take your word about removing that sub, but I thought it was what converted the "][" to a comma. I'll have to look back at my code.

Thanks again for everything! Maybe this thread can now rest in peace :).
 
Ugh. I thought I understood what you said, but it isn't working :(. Can I ask for your help again?

I placed the code after the remove ^M line thinking it would replace the right a[2], but I was wrong.

Could you tell me where exactly the code should go?

Code:
# match vvd preceded by white space
/^[ ]*vvd/ {
        # split the line on a comma followed by one or more spaces
        split($0,a,", +")
        ind=tolower(a[1])
        # strip off the function name and bracket
        sub(".*vvd[(]","",ind)
        # match C array syntax with Fortran array syntax
        sub("[]][]]",",",ind)
        sub("[[]","(",ind)
        sub("[]][[]",",",ind)
        sub("[]]",")",ind)
        # change C array indices to match Fortran indices
        sub("2,","3,",ind)
        sub("1,","2,",ind)
        sub("0,","1,",ind)
        sub(",2",",3",ind)
        sub(",1",",2",ind)
        sub(",0",",1",ind)
        prec[ind]=a[2]
        tol[ind]=a[3]
        # change exponent syntax
        sub("e","D",tol[ind])
        sub("e","D",prec[ind])
        next # skip to the next record
}
# match CALL VVD anywhere on a line
/CALL VVD/ {
        # split the line on a comma followed by one or more spaces
        n=split($0,a,", +")
        ind=a[1]
        # remove the function call
        sub(".*CALL VVD [(]","",ind)
        # remove spaces
        gsub(" ","",ind)
        # remove decimal
        sub("D0","",ind)
        # remove underscore
        sub("_","",ind)
        ind=tolower(ind)
        # if precision does not have "D" in it, append "D0"
        if (prec[ind] !~ "D") {
              prec[ind]=prec[ind]"D0"
        }
        sub(a[2],prec[ind])
        # --- lines added in this attempt ---
        if (n == "2") {
              prec[ind]=prec[ind]","
        }
        sub(a[2],prec[ind])
        # --- end of added lines ---
        oldtol=a[3]
        # if the tolerance was not on this line, it
        # must be on the next
        if (oldtol == "") {
                print
                getline
                n=split($0,a,", +")
                oldtol=a[1]
                # remove line continuation and white space
                sub("[[:space:]:]*","",oldtol)
        }
        sub(oldtol,tol[ind])
        # remove ^M characters in the new file
        sub("\r$","")
        print
        next
}
# print any other lines in the Fortran input
FNR != NR { print }
 
You still have the sub(a[2],prec[ind]) in there twice.

Annihilannic 16 Oct 08 20:08 said:
The second sub() won't find a match because the first one has already replaced the matching string with a new precision.

Let's say you have a string, "one two three", and you substitute four for two, e.g. sub("two","four"), and then you decide you want it to have a comma after it... if you try and sub("two","four,") it won't work, because the string is already "one four three" and the sub() will find no match.

One thing that may not be immediately apparent is that I did not specify which string the sub() should modify... usually that's the third parameter of the sub() function, but if you check man awk you'll see that by default, if no string is specified, it operates on $0, i.e. the current input line.

So just remove the first sub(), so that it isn't substituted into the output line until you've either appended the comma or not, depending on whether you need to.
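
To see this concretely, here is a minimal sketch using the sample string from the explanation above (the shell wrapper and the variable names early, late and new are just for illustration). The first variant substitutes early and then fails to add the comma. The second appends the comma to the replacement first and substitutes only once:

```sh
# Variant 1: substitute, then try to add the comma.
# The second sub() finds no "two" left in $0 and changes nothing.
early=$(echo "one two three" | awk '{ sub("two", "four"); sub("two", "four,"); print }')

# Variant 2: build the replacement (comma included) first,
# then substitute once into $0, the default target of sub().
late=$(echo "one two three" | awk '{ new = "four"; new = new ","; sub("two", new); print }')

echo "$early"   # one four three
echo "$late"    # one four, three
```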

Annihilannic.
 
Thanks! I wasn't seeing that logic until you explained it. Now I feel silly for missing it.

 