Simple python problem - process strings in loops

Status
Not open for further replies.

Me4President

Technical User
Sep 30, 2009
7
NL
The problem:
I want to enter a text file, in the format below:
Code:
      3974.1399      1693.0822      1921.5103 1 3  3 event_outline
      3877.0623      1658.3441      1854.6477 2 3  3 event_outline
      3818.4111      1641.9969      1769.8671 2 3  3 event_outline
      2461.6321      1632.2740      1836.2170 2 3  3 event_outline
      2405.4290      1687.7454      1903.2582 2 3  3 event_outline
      2347.6643      1721.6591      1968.0997 2 3  3 event_outline
      2287.6213      1784.1433      1968.1174 2 3  3 event_outline
      2287.6213      2512.1030      1634.8781 2 3  3 event_outline

And I want to create a new file, with the format below:
Code:
POLGON  3974  1693  3877  1658  3818  1641  2461  1632  2405  1687  2347  1721
i.e. a string followed by 12 numbers on each line. The numbers are the first and third groups of 4 digits on each input line (the integer parts of the first two columns). For example, the first two numbers in the row are
Code:
3974 & 1693
from the first line
Code:
3974.1399      1693.0822      1921.5103 1 3  3 event_outline
This is my attempt so far. I don't know how to loop over 6 lines, write them to one output line, and carry on until I reach the end of the input file. The input file will vary in length, and the output file must have the specified string followed by 6 pairs of numbers on each line.

Code:
#!/usr/bin/python

filename = raw_input('input file? ') 
file = open(filename,"r")
fileout = raw_input('file out? ')
#file = open(fileout,"wb")

#fo = file.readline() # trying to use readline()
sipstr = raw_input('SIPMAP string? ')
fa = open(fileout,"a") # to append each new number to file... Not ideal..
fa.write (sipstr) # write the string to each new line
fa.write ("  ") # give the correct spaces after the string

lines = 0
for line in file: # looping over each line
  x = line[6:10] # get first number
  y = line[21:25] # get second number
  fa.write (x)
  fa.write ("  ")
  fa.write (y)
  fa.write ("  ")
  if lines == 6:  # when done 6 lines, make new line and start again
    fa.write ("\n")
    fa.write (sipstr)
    fa.write ("  ")
  lines += 1
print 'Input file %r has %r pairs\n' % (filename, lines)
print 'Output file %r will have %r lines\n' % (filename, 1+lines/6)
fa.close()
file.close()
raw_input('Press enter to quit... ')

Any help would be greatly appreciated!

 
Well, I solved my loop problem, and I can now get it to display exactly the format I want on screen:
Code:
uinp = raw_input('input file? ')
inp = open(uinp,"r")
uout = raw_input('output file? ')
outp = open(uout,"a")         
sipstr = raw_input('SIPMAP string? ')

# Set counter to 0
i = 0

# iterate over the file printing each item
# making new line if after every 6 lines
for line in inp:
    if i%6 == 0:
        print "\n", sipstr,"",
    x = line[6:10]
    y = line[21:25]
    aTuple = (x, y)
    i += 1
    print aTuple[0], "", aTuple[1], "",
# Now close it again
inp.close()

But now I want to be able to write to the output file specified.
I have tried using outp.write(), but it doesn't seem to work the same way as print.

Code:
#!/usr/bin/python
# First open the file to read(r) and file to write (w)
uinp = raw_input('input file? ')
inp = open(uinp,"r")
uout = raw_input('output file? ')
outp = open(uout,"a")         
sipstr = raw_input('SIPMAP string? ')

# Set counter to 0
i = 0

# iterate over the file printing each item
for line in inp:
    if i%6 == 0:
        print "\n", sipstr,"",
        outp.write("\n",)
        outp.write(sipstr,)
        outp.write("",)
    x = line[6:10]
    y = line[21:25]
    aTuple = (x, y)
    i += 1
    print aTuple[0], "", aTuple[1], "",
    outp.write("\n",)
    outp.write(aTuple[0],)
    outp.write(aTuple[1],)
    outp.write("",)
# Now close it again
inp.close()

I will keep trying, but if there is a better way than the one I am using, or if anyone could point me in the direction of some examples, I would be most grateful.
Thanks
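
On why outp.write() behaves differently from print: in Python 2, `print a, b,` puts a space between the items and the trailing comma suppresses the newline, whereas write() takes exactly one string and adds nothing to it. The trailing comma in `outp.write("\n",)` has no effect, `outp.write("",)` writes nothing, and the `outp.write("\n",)` inside the loop puts every pair on its own line. A minimal sketch of building each piece as a single string, written for Python 3 and using io.StringIO as a stand-in for the real output file (both are assumptions for illustration, as is the helper name format_pair):

```python
import io

def format_pair(x, y, sep="  "):
    """Return one coordinate pair followed by the separator, as a single string."""
    return x + sep + y + sep

# io.StringIO stands in for the real output file so the sketch runs anywhere.
outp = io.StringIO()
outp.write("POLGON" + "  ")               # write() takes exactly one string
outp.write(format_pair("3974", "1693"))   # separators are added by hand
outp.write(format_pair("3877", "1658"))
outp.write("\n")                          # the newline must be written explicitly
result = outp.getvalue()
print(repr(result))
```

With a real file, the same calls would be made on the object returned by open().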
 
Sorry for not realising I could actually do it... added
Code:
outp.write('\n' + sipstr + "  ")
and
Code:
outp.write(aTuple[0] + "  " + aTuple[1] + "  ")
after the two print commands and it works great.

Now making it not add a new line at the start...
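
One way to avoid the newline at the start is to end each output line with "\n" instead of beginning it with one. A sketch in Python 3, assuming the pairs have already been extracted as strings; build_lines and its parameters are hypothetical names, not part of the script above:

```python
def build_lines(pairs, prefix, per_line=6, sep="  "):
    """Group (x, y) string pairs into output lines of at most per_line pairs."""
    lines = []
    for start in range(0, len(pairs), per_line):
        chunk = pairs[start:start + per_line]
        # Flatten [(x1, y1), (x2, y2)] into [prefix, x1, y1, x2, y2].
        fields = [prefix] + [value for pair in chunk for value in pair]
        lines.append(sep.join(fields))
    return lines

pairs = [("3974", "1693"), ("3877", "1658"), ("3818", "1641")]
# Joining with "\n" and appending one final "\n" never produces a leading blank line.
text = "\n".join(build_lines(pairs, "POLGON", per_line=2)) + "\n"
print(text, end="")
```

Using join also means there is no trailing separator after the last number on each line.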
 
Hi Me4President,
I think this is a continuation of your Fortran thread.

Here is my example of how to solve it in Python:
poly.py
Code:
import string

# open the files
inp_file = open("poly_inp.txt","r")
out_file = open("poly_out.txt","w")

# initialize global vars
poly_sep = "  "  # separator of the polygon points
poly_list = []   # list of polygon points
nr_lines = 0     # number of lines

# process lines
for line in  inp_file:
   nr_lines += 1
   print "%03d: '%s'" % (nr_lines, line.strip())
   # first replace in the line '.' with ' ' and then
   # split the line into list using whitespace as separator
   line_list = line.replace(".", " ").split()
   # add 1. and 3. element to the poly_list
   poly_list += [line_list[0], line_list[2]]
   if nr_lines % 6 == 0:
      poly_line = "POLGON%s%s\n" % (poly_sep, string.join(poly_list,poly_sep))
      print "==>:"
      print poly_line
      out_file.write(poly_line)
      poly_list = [] # initialize list of polygon points

# at the end print the last line, when it was not printed before,
# i.e. only when it has less than 12 elements
if len(poly_list) < 12:
   poly_line = "POLGON%s%s\n" % (poly_sep, string.join(poly_list,poly_sep))
   print "==>:"
   print poly_line
   out_file.write(poly_line)

# close the files
inp_file.close()
out_file.close()

For the data you gave above, it prints on screen:
Code:
c:\Users\Roman\Work>python poly.py
001: '3974.1399      1693.0822      1921.5103 1 3  3 event_outline'
002: '3877.0623      1658.3441      1854.6477 2 3  3 event_outline'
003: '3818.4111      1641.9969      1769.8671 2 3  3 event_outline'
004: '2461.6321      1632.2740      1836.2170 2 3  3 event_outline'
005: '2405.4290      1687.7454      1903.2582 2 3  3 event_outline'
006: '2347.6643      1721.6591      1968.0997 2 3  3 event_outline'
==>:
POLGON  3974  1693  3877  1658  3818  1641  2461  1632  2405  1687  2347  1721

007: '2287.6213      1784.1433      1968.1174 2 3  3 event_outline'
008: '2287.6213      2512.1030      1634.8781 2 3  3 event_outline'
==>:
POLGON  2287  1784  2287  2512
and writes 2 lines to the file poly_out.txt:
Code:
POLGON  3974  1693  3877  1658  3818  1641  2461  1632  2405  1687  2347  1721
POLGON  2287  1784  2287  2512
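
A note on the field extraction: replacing '.' with ' ' and taking elements 0 and 2 relies on every number containing exactly one decimal point, and the fixed slices line[6:10] and line[21:25] rely on exact column positions. A possible alternative, sketched in Python 3, splits on whitespace first and then cuts each of the first two columns at its decimal point. The function name parse_pair is hypothetical:

```python
def parse_pair(line):
    """Return the integer parts of the first two columns as strings."""
    columns = line.split()            # split on any run of whitespace
    x = columns[0].split(".")[0]      # text before the decimal point
    y = columns[1].split(".")[0]
    return x, y

sample = "      3974.1399      1693.0822      1921.5103 1 3  3 event_outline"
pair = parse_pair(sample)
print(pair)
```

On the sample data all three approaches give the same pair; they would only differ if the column widths changed or a value had no decimal point.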
 
Thanks a million Mikrom - this is a much nicer way of doing it, and I could easily adapt this for lots of other files I export...

I've added a header with the date and time to the output file, and have asked user for input and output files, as well as what string to put at the start of each line.
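
The header code itself was not posted, so the following is only a guess at what such a header could look like. It is a Python 3 sketch; the function name make_header, the leading "#" and the timestamp format are all assumptions, and whether the program reading the file accepts a comment line is not known:

```python
from datetime import datetime

def make_header(prefix, now=None):
    """Return a one-line header naming the line prefix and the time of writing."""
    if now is None:
        now = datetime.now()          # default to the current local time
    stamp = now.strftime("%Y-%m-%d %H:%M:%S")
    return "# %s data written %s\n" % (prefix, stamp)

# A fixed datetime is passed here only to make the example reproducible.
header = make_header("POLGON", datetime(2009, 9, 30, 12, 0, 0))
print(header, end="")
```

The header would be written once, before the loop over the input lines.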

I will try to use python instead of shell scripting and awk now.. Thanks for showing me the use of lists!

 
Hi Me4President,

There is a small error in the source I posted:
If your data file has a number of lines that is exactly divisible by 6 (i.e. 6, 12, 18, ...), then the last line is written inside the loop and poly_list is reset to an empty list. But then len(poly_list) == 0, which satisfies the condition after the loop, so in this case one additional line without any point pairs is written:
Code:
POLGON
To avoid this, change the condition after the loop from
Code:
if len(poly_list) < 12:
to
Code:
if len(poly_list) > 0 and len(poly_list) < 12:
or alternatively to
Code:
if nr_lines % 6 != 0:
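
The corrected end-of-file check can be sketched as a small Python 3 function that returns the output lines instead of writing them. The name group_points and its parameters are hypothetical, and `if points:` is used as a shorter equivalent of `len(poly_list) > 0`:

```python
def group_points(lines, prefix="POLGON", per_line=6, sep="  "):
    """Collect the 1st and 3rd digit groups of each line, per_line pairs per row."""
    out = []
    points = []
    for line in lines:
        fields = line.replace(".", " ").split()
        points += [fields[0], fields[2]]
        if len(points) == 2 * per_line:      # a full row has been collected
            out.append(prefix + sep + sep.join(points))
            points = []
    if points:                               # flush only a non-empty remainder
        out.append(prefix + sep + sep.join(points))
    return out

row = "3974.1399      1693.0822      1921.5103 1 3  3 event_outline"
print(len(group_points([row] * 6)))   # input length divisible by 6
print(len(group_points([row] * 8)))   # input with a partial last row
```

With 6 input lines no empty "POLGON" row is produced, and with 8 lines the two remaining pairs still get their own row.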

So you see, with Python you don't need shell and awk scripts for preprocessing in addition to a Fortran program.
Now you can do everything you need in one script.
Python (like Perl, Ruby, Tcl, REXX, ...) is more powerful than Fortran for processing text files and lots of other things. It's easy to learn and the programming productivity is higher.
 