dougcranston
Technical User
I have a pipe delimited file with 20+ fields per record.
What I need to do is match a "portion" of one field against a portion of another field, and where an exception exists write the record out to a file for investigation and correction.
The fields are not fixed length, and the key character used to establish a position reference can be repeated multiple times in the same field.
And worse yet, the number of those "key characters" can vary from one field to another in the same record.
So the issue I cannot seem to address is that the two fields contain multiple hyphens, and that the last hyphen marks the start of the data I want to check for.
Ex. for simplicity, 3 fields:
Texas-Austin-Museum|300000|Texas-Austin-County-Mueseum
So what I would like to do is compare from Field 1 is "Museum" to Field 3 "Mueseum" which would fail due to spelling, and then I would print the record out as an error.
Index(x,n) will only identify the say the first hyphen, ex. Index($1,"-") would give me the first hyphens position, but not the last which I could then use Substr to establish the starting position in that field.
Essentially I have a 20+ field file, and I need to identify records where certain fields are not exactly the same. The file contains in excess of 10000 rows, and currently, requires manual validation 3-4 times a month. Being able to mechanically do the pattern matching would reduce the time spent trying to identify issues, and alot of eye strain, and increase accuracy in the long run.
Any suggestions would be appreciated.
Thanks in advance.
What I need to do is match a "portion" of one field against a portion of another field, and where an exception exists write the record out to a file for investigation and correction.
The fields are not fixed length, and the key character used to establish a position reference can be repeated multiple times in the same field.
And worse yet, the number of those "key characters" can vary from one field to another in the same record.
So the issue I cannot seem to address is that the two fields contain multiple hyphens, and that the last hyphen marks the start of the data I want to check for.
Ex. for simplicity, 3 fields:
Texas-Austin-Museum|300000|Texas-Austin-County-Mueseum
So what I would like to do is compare from Field 1 is "Museum" to Field 3 "Mueseum" which would fail due to spelling, and then I would print the record out as an error.
Index(x,n) will only identify the say the first hyphen, ex. Index($1,"-") would give me the first hyphens position, but not the last which I could then use Substr to establish the starting position in that field.
Essentially I have a 20+ field file, and I need to identify records where certain fields are not exactly the same. The file contains in excess of 10000 rows, and currently, requires manual validation 3-4 times a month. Being able to mechanically do the pattern matching would reduce the time spent trying to identify issues, and alot of eye strain, and increase accuracy in the long run.
Any suggestions would be appreciated.
Thanks in advance.