
Need to find two lines, LINE 1 and the last LINE 2, and print them in pairs

Status
Not open for further replies.

Tester_V

Technical User
Nov 22, 2019
US
Trying to pull out paired lines, LINE 1 and LINE 2, from a file for further processing.
The file goes like this:

Some lines
LINE 1
Some lines
Some lines
LINE 2
LINE 2
LINE 2
Some lines
Some lines
Some lines
LINE 1
Some lines
LINE 2
And so on.

I’d like to print:
LINE 1
LINE 2 (preferably last of LINE 2)
Also, if no LINE 2 is found after the last LINE 1, I’d like to print the last line of the file.

The Python 3.x code I’ve come up with so far prints all LINE 1 and LINE 2 lines, which is not what I want.

Python:
filepath = 'mytext1.txt'
line1  = 'LINE 1'
line2 = 'LINE 2'



with open(filepath) as fp:  
    line = fp.readline() 

    while line:
        if line1 in line:
            print("string found in line ---"+line)

        if line2 in line:
            print("string found in line ---"+line)
        line = fp.readline()
 
Create a flag to indicate whether LINE2 has been found. Set it to False initially.
Make the contents of the LINE2 buffer empty.

When you find a LINE1: if the flag is set, print the LINE2 buffer, then set the flag to False.

Set the flag to True when you find LINE2. Just save LINE2 in the buffer, don't print it.

When you exit the loop, if the flag is set, print the LINE2 buffer.
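
The steps above can be sketched roughly as follows. This is only one possible reading of them: the function name `last_line2_pairs`, the extra buffer for LINE1 (the steps only mention a LINE2 buffer) and the sample data are assumptions made for illustration. It collects the pairs in a list instead of printing them directly, so the result is easy to inspect.

```python
def last_line2_pairs(lines, line1="LINE 1", line2="LINE 2"):
    """Return (LINE 1, last LINE 2) pairs using a flag and a buffer."""
    pairs = []
    found_line2 = False   # the flag: has a LINE 2 been seen since the last LINE 1?
    line1_buffer = None   # assumption: also remember the LINE 1 that opens the pair
    line2_buffer = ""     # holds the most recent LINE 2
    for raw in lines:
        line = raw.rstrip("\n")
        if line1 in line:
            # A new LINE 1 closes the previous pair, if there was one.
            if found_line2:
                pairs.append((line1_buffer, line2_buffer))
            found_line2 = False
            line1_buffer = line
        elif line2 in line and line1_buffer is not None:
            # Save it, do not print it: a later LINE 2 may overwrite it.
            found_line2 = True
            line2_buffer = line
    # After the loop, the last pair is still waiting in the buffers.
    if found_line2:
        pairs.append((line1_buffer, line2_buffer))
    return pairs


# Hypothetical sample data in the shape described in the question.
sample = [
    "Some lines\n", "LINE 1\n", "Some lines\n",
    "LINE 2 a\n", "LINE 2 b\n", "Some lines\n",
    "LINE 1\n", "LINE 2 c\n",
]
for first, second in last_line2_pairs(sample):
    print(first)
    print(second)
```

To run it on a real file, something like `with open('mytext1.txt') as fp: pairs = last_line2_pairs(fp)` should work, since a file object yields its lines one by one.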
 
I'm confused (I do not know what the "flag" is), but I'm already printing every LINE 1 and LINE 2.
I need to print only the last LINE 2 for each LINE 1.

Some lines
LINE 1----------> print this one (pair one)
Some lines
Some lines
LINE 2
LINE 2
LINE 2----------> print this one (pair one)
Some lines
Some lines
Some lines
LINE 1----------> print this one (pair two)
Some lines
LINE 2----------> print this one (pair two)
And so on.
 
Flag = boolean variable. Basically save the LINE2 and only print it when you get a LINE1.
 
Then you will need to loop through the file multiple times.

Probably helpful to keep track of what lines you found things on.

But are you sure that you understand your assignment?
 
Try something like this; the comments in the code explain how it works:

Code:
filepath = 'mytext1.txt'
line1 = 'LINE 1'
line2 = 'LINE 2'


line_pair1 = ""
line_pair2 = ""
with open(filepath) as fp:
    line = fp.readline()

    while line:
        # remove the trailing \n from line (safe even if the last line has none)
        line = line.rstrip("\n")
        #print("processing line: " + line)
        if line1 in line:
            if line_pair2 != "":
                print("* line pair found:")
                print(line_pair1)
                print(line_pair2)
                # initialize second line from pair
                line_pair2 = ""
            # set first line of pair
            line_pair1 = line

        if line2 in line:
            line_pair2 = line

        # next line
        line = fp.readline()

    # finally at end of file, print the last pair
    if line_pair1 != "" and line_pair2 != "":
        print("* line pair found:")
        print(line_pair1)
        print(line_pair2)

Output on the data you posted:
Code:
$ python3 tester_v2.py
* line pair found:
LINE 1----------> print this one (pair one)
LINE 2----------> print this one (pair one)
* line pair found:
LINE 1----------> print this one (pair two)
LINE 2----------> print this one (pair two)
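
The original question also asked for a fallback: if no LINE 2 follows the last LINE 1, print the last line of the file instead. The code above does not cover that case. A minimal sketch of one way to add it is shown below; the function name `pairs_with_fallback` and the sample data are hypothetical, and it returns the pairs as a list rather than printing them.

```python
def pairs_with_fallback(lines, line1="LINE 1", line2="LINE 2"):
    """Pair each LINE 1 with the last LINE 2 after it.

    Assumption: if the final LINE 1 has no LINE 2 after it, the last line
    of the input is used as the second half of that pair.
    """
    pairs = []
    first = None       # most recent LINE 1
    second = None      # last LINE 2 seen after it
    last_line = None   # last line read so far
    for raw in lines:
        line = raw.rstrip("\n")
        last_line = line
        if line1 in line:
            # Close the previous pair only if it is complete.
            if first is not None and second is not None:
                pairs.append((first, second))
            first, second = line, None
        elif line2 in line and first is not None:
            second = line
    if first is not None:
        # Fallback: no LINE 2 after the last LINE 1, so use the last line.
        pairs.append((first, second if second is not None else last_line))
    return pairs


# Hypothetical sample: the second LINE 1 is never followed by a LINE 2.
demo = ["LINE 1 a", "x", "LINE 2 a", "LINE 2 b", "LINE 1 b", "x", "end of file"]
for first, second in pairs_with_fallback(demo):
    print(first)
    print(second)
```

Note that a LINE 1 in the middle of the file with no LINE 2 before the next LINE 1 is still dropped silently, just as in the code above; only the final LINE 1 gets the fallback.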
 
Thanks to "xwb" and "mintjulep"! I really appreciate your help. I understand your logic
but I do not understand how to "translate it" to code.

I really do not understand how it all works for some reason.
And the whole code structure without some kind of braces is really confusing for me.
It is beyond me....
Here is the snippet with the "flag", but now it prints only LINE 1. It is clear the second, nested check is not working and I do not understand why.

Python:
filepath = 'mytext1.txt'
line1  = 'LINE 1'
line2 = 'LINE 2'

flag =False


with open(filepath) as fp:  
    line = fp.readline() 

    while line:
        if line1 in line:
            print("string found in line ---"+line)
            flag=True
            if line2 in line:
                print("string found in line ---"+line)
        line = fp.readline()
 
mikrom, this is amazing code man!
I do not understand how you got the second or third of the LINE 2 lines:

LINE 1----------> print this one (pair one)
Some lines
Some lines
LINE 2
LINE 2
LINE 2----------> print this one (pair one)

I understand it is all happening here, but I do not understand how:
Python:
        if line1 in line:
           if line_pair2 != "":
              print("* line pair found:")
              print(line_pair1) 
              print(line_pair2) 
              # initialize second line from pair
              line_pair2 = ""
           # set first line of pair
           line_pair1 = line

        if line2 in line:
           line_pair2 = line
 
Hi Tester_V,
Before the loop I create two variables which hold the lines of the pair and initialize them:
line_pair1 = ""
line_pair2 = ""

Then I read lines in the loop as you programmed before.
Every time line2 occurs, I store it in the variable line_pair2.
But when line1 occurs, before storing it in line_pair1, I first try to print the previous pair. So I ask: is the variable line_pair2 not empty? If it is not empty, that means I found a complete pair earlier in the loop, so I print that previous pair. Then, to search for the next pair, I reset line_pair2 to "" and store the beginning line of the next pair in line_pair1. At the end of the file I print the last pair found, if there is one.

To understand better how it works consider this data file:
Code:
LINE 1----------> print this one (pair one)
Some lines
Some lines
LINE 2
LINE 2
LINE 2----------> print this one (pair one)
Some lines
Some lines
Some lines
LINE 1----------> print this one (pair two)
Some lines
LINE 2----------> print this one (pair two)
And so on.

At the beginning, when this line comes
Code:
LINE 1----------> print this one (pair one)
we have line_pair2 = "", so the pair will not be printed.
In this case we only store this line into variable line_pair1, i.e.
we have:
line_pair1 = "LINE 1----------> print this one (pair one)"

When lines
Code:
Some lines
come, we simply read the next line.

When line
Code:
LINE 2
occurs the first and second time, we store it in line_pair2, so we have
line_pair2 = "LINE 2"
When line
Code:
LINE 2----------> print this one (pair one)
occurs the third time, we store it in line_pair2, so now we have
line_pair2 = "LINE 2----------> print this one (pair one)"

On lines
Code:
Some lines
we read the next line.

Now when this line comes (i.e. the beginning of the next pair)
Code:
LINE 1----------> print this one (pair two)
we have in the pair variables the previous pair:
line_pair1 = "LINE 1----------> print this one (pair one)"
line_pair2 = "LINE 2----------> print this one (pair one)"
So we print the previous pair, then initialize the variable line_pair2 and store
the current line into variable line_pair1.
Now we have in the pair variables
line_pair1 = "LINE 1----------> print this one (pair two)"
line_pair2 = ""
and by reading the next lines we are collecting the next pair.

On
Code:
Some lines
we read the next line.
On
Code:
LINE 2----------> print this one (pair two)
we store the current line in line_pair2.

At end of file we try to print the last pair if available.
In this case it exists, because the variable line_pair2 is not empty.
We have in our pair variables:
line_pair1 = "LINE 1----------> print this one (pair two)"
line_pair2 = "LINE 2----------> print this one (pair two)"
So we print the last pair and end.
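
To follow the walkthrough above on screen, the same logic can be wrapped in a small tracing helper that prints both pair variables after every line. This is only an illustrative sketch: the name `trace_pairs` is made up here, and it returns the pairs as a list in addition to printing the trace.

```python
def trace_pairs(lines, line1="LINE 1", line2="LINE 2"):
    """Same pairing logic as above, printing the state after each line."""
    line_pair1 = ""
    line_pair2 = ""
    found = []
    for number, raw in enumerate(lines, start=1):
        line = raw.rstrip("\n")
        if line1 in line:
            if line_pair2 != "":
                # The previous pair is complete: record it, then reset.
                found.append((line_pair1, line_pair2))
                line_pair2 = ""
            line_pair1 = line
        if line2 in line:
            line_pair2 = line
        print(f"{number:2}: line_pair1={line_pair1!r} line_pair2={line_pair2!r}")
    # End of input: record the last pair if both halves exist.
    if line_pair1 != "" and line_pair2 != "":
        found.append((line_pair1, line_pair2))
    return found


data = """LINE 1----------> print this one (pair one)
Some lines
Some lines
LINE 2
LINE 2
LINE 2----------> print this one (pair one)
Some lines
Some lines
Some lines
LINE 1----------> print this one (pair two)
Some lines
LINE 2----------> print this one (pair two)
And so on.
""".splitlines(keepends=True)

result = trace_pairs(data)
```

Each printed row shows the two variables after one input line, so it is easy to see line_pair2 being overwritten by every new LINE 2 and reset when the next LINE 1 arrives.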

 
In principle, the variable line_pair2 serves as a boolean flag pair_found.
line_pair2 = "" means the same as pair_found = False and
line_pair2 != "" means the same as pair_found = True
 
More questions.
How can the variable "line_pair2" hold anything if nothing is passed or assigned to it?

Python:
if line1 in line:
           if line_pair2 != "":  <<------------ nothing is assigned to it
 
If you look at the code I posted above on 1 May 20 09:30,
you can see that the variable's value is assigned in the while-loop, in this if-block:
Code:
...
    while line:
        ...
        ...
        if line2 in line:
           line_pair2 = line

        # next line
        line = fp.readline()
...
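
The key point is that a variable assigned in one pass of the loop keeps its value in later passes. A tiny sketch with made-up data (the list of lines and the name `seen` are assumptions for illustration only):

```python
line_pair2 = ""
seen = []  # what line_pair2 held each time a LINE 1 was reached

for line in ["LINE 2 first", "Some lines", "LINE 1"]:
    if "LINE 1" in line:
        # Reached on the third pass: line_pair2 still holds the value
        # that was assigned on the first pass.
        seen.append(line_pair2)
    if "LINE 2" in line:
        line_pair2 = line

print(seen)
```

So the `if line_pair2 != ""` test sits above the assignment in the source text, but by the time a LINE 1 is reached the assignment may already have run on an earlier line.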
 