Hello, I would like to write a python script that eliminate all the duplicated occurrences in the second column, keeping the first match.
For an input like this:
Code:
101000249 101000249
101000250 5552931
101000251 101000251
101000254 5552931
101000255 101000255
101000256 101000256
101000257 5552605
101000258 5552605
101000259 101000259
101000260 101000260
I should get that:
Code:
101000249 101000249
101000250 5552931
101000251 101000251
101000255 101000255
101000256 101000256
101000257 5552605
101000259 101000259
101000260 101000260
The python code that I attempted is the following:
Code:
#/bin/python
file_object=open('file1.txt','r')
file_object2=open('file2.txt','w')
read_data=file_object.readlines()
nd=[]
for line in read_data:
s=line
if s[2] not in nd:
nd.append(s[2])
line = line.strip('\n')
file_object2.write(str(line)+"\n")
Thank you very much for your support!