How to delete partial matching lines, not one after another?

Status
Not open for further replies.

ok1397

Technical User
Feb 7, 2005
US
Hello everyone, I need some help with a text file. I need to compare the first 40 characters of each line and, if they match those of an earlier line, delete that earlier line. The problem is that the lines with matching first 40 characters are not adjacent to each other. Is this possible? Any help will be greatly appreciated...
 
My suggestion is to create an 'out' file to hold your final data.

Use two strings to hold your input, one for the current line and one for the previous line. If the current line matches the previous line over the required number of characters, don't write it to the output file. Otherwise, write it to the file. Then assign the current line to the previous line, read new data into the current line, and repeat.
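
A minimal sketch of this previous/current approach, written in Python rather than Aspect for brevity. The function name drop_adjacent_repeats and the width parameter are made up for illustration. Note that it only catches a match on the line immediately before, so it would not help when the matching lines are separated by other lines.

```python
def drop_adjacent_repeats(lines, width=40):
    """Skip any line whose first `width` characters equal those of the line just before it."""
    out = []
    previous = None  # nothing has been read yet
    for current in lines:
        # Write the line unless its prefix matches the previous line's prefix.
        if previous is None or current[:width] != previous[:width]:
            out.append(current)
        previous = current  # the current line becomes the previous line
    return out
```

To use it on a file, one could read the lines with open(path).read().splitlines(), pass them in, and write the returned list to the 'out' file.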
 
If I understand you correctly, you want to check for a line, let's say AAAAAAA, and if a duplicate of that line is found anywhere in the text file, then the original AAAAAAA line is deleted from the file. Do you need to check for duplicates of every line in the text file, or just one particular line?

 
Any particular reason why you want to use Aspect (Procomm) to process this text file? There are other tools that are better suited to this kind of thing; Perl, sed, and awk come to mind...
 
Hi knob, I need to check for duplicates of every line in the text file, not just AAAAAAA. Unfortunately I cannot determine which lines will have a duplicate, so I have to check every line in the text file. Thank you. And thank you all for responding.
 
knob, one more thing: any given line will appear at most twice in the text file. For example:

AAAAAAA ;this line to delete
dkjslkdjflsdfkj
AAAAAAA
ljkhkjdfkjdsfjkh

BBBBBBB ;this line to delete
dlkfjdkfjkljkjlkl
BBBBBBB
sdfklajdklfjdkfjl
 
knob, the length is 49 (I need to compare the first 49 characters).
 
knob, sorry, it varies; there's no telling how long the file will be. It's a report saved in a text file, and for some lines the same first 49 characters appear in up to 2 lines.

For example, this is what it looks like. In this example we are comparing the first 20 characters, ABC123 TOTAL TO ORDER: and DEF456 TOTAL TO ORDER:

ABC123 TOTAL TO ORDER: 0 ;delete the first instance
dfkjdslkfldkflkjlkjklkjlkklj
ABC123 TOTAL TO ORDER: 0
sdkfjksdflkdsjflkdjflkdjfkldj
DEF456 TOTAL TO ORDER: 0 ;delete the first instance
sdkfjldkfjldkjkjkjasd
DEF456 TOTAL TO ORDER: 50

So really what needs to be deleted is the earlier line that matches on (in the above example) the first 20 characters. (In the actual file it is the first 49 characters.)
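
One way to picture the requirement is the sketch below, in Python rather than Aspect. The name drop_earlier_matches and the width parameter are invented for illustration; the idea is to keep only the last line for each distinct prefix. One caveat under these assumptions: lines shorter than width are compared whole, so repeated blank lines would also collapse to the last one.

```python
def drop_earlier_matches(lines, width=49):
    """Keep only the last line among those sharing the same first `width` characters."""
    # Pass 1: remember the index of the last line seen for each prefix.
    last_index = {}
    for i, line in enumerate(lines):
        last_index[line[:width]] = i
    # Pass 2: keep a line only if it is the last one carrying its prefix.
    return [line for i, line in enumerate(lines) if last_index[line[:width]] == i]
```

Called on the example above with width=20, this would drop the first ABC123 and DEF456 lines and keep everything else in its original order.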
 
It looks like another Tek Tips reader sent me a script a while back that may get you most of the way with this problem. The script can be found here:


It reads a text file and creates a second text file listing the number of times each line was found in the first. You could have your script either call this script or incorporate it, and then read the output file it creates. You would read that file line by line, looking for each line that starts with 002 (indicating two matches found, and assuming that no line occurs more than twice in your report). You would use the strextract command or something similar to get the text after the colon in the output file.

Next, you would reopen the text file with the report you are going through, read it line by line, and copy non-duplicate lines to a new text file. If the line read from the report file does match the duplicated line, you do not write it to the new text file, and you would then continue reading the "unique" text file to find your next duplicate line, if one exists. If one does, you would continue reading from the report file until you find its first occurrence; otherwise you can just read from the report file and write to the new file until the end of the file is reached.
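
As a rough sketch of this count-then-filter idea, here it is in Python rather than Aspect, with the counting done in memory instead of through the intermediate file (whose exact format is not shown here). The function name drop_first_of_each_duplicate is made up, and like the counting script it compares whole lines rather than only the first 49 characters.

```python
from collections import Counter

def drop_first_of_each_duplicate(lines):
    """Remove the first occurrence of every line that appears exactly twice."""
    counts = Counter(lines)  # pass 1: how many times each full line occurs
    # Duplicated lines whose first copy has not been skipped yet.
    pending = {line for line, n in counts.items() if n == 2}
    out = []
    for line in lines:  # pass 2: copy lines to the new list
        if line in pending:
            pending.discard(line)  # skip only the first occurrence
            continue
        out.append(line)
    return out
```

To compare prefixes instead of whole lines, the same structure should work with line[:49] used as the key in both passes.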

 
knob, thank you, I think this will get me started. I'll give it a shot! Thank you.
 