Looking for help with sed/regex

rhoover · Jun 19, 2007

I am new to scripting and have an LDIF file that I am cleaning up in order to import it into Oracle OID, which is Oracle's LDAP. It has to be in a specific format in order to load. The problem I am stuck on is that some employees have more than one mail address listed. Does anyone know how I can use sed/regex to delete all except the first occurrence? There are about 100,000 employees.

example:
dn: cn=0123456,cn=users,dc=company,dc=com
telephonenumber: 123-246-7890
mail: employee@company.com
uid: 0123456
cn: New Employee
givenname: New
sn: Employee
orclisVisible: True
objectclass: person
objectclass: organizationalperson
objectclass: inetorgperson
objectclass: orcluserv2

dn: cn=0123abc,cn=users,dc=company,dc=com
telephonenumber: 123-246-7890
mail: robert_smith@company.com
mail: bs@company.com
mail: bob@company.com
mail: rsmith@company.com
uid: 0123abc
cn: Robert Smith
givenname: Robert
sn: Smith
orclisVisible: True
objectclass: person
objectclass: organizationalperson
objectclass: inetorgperson
objectclass: orcluserv2

In the example of Robert Smith, I want to end up with:

dn: cn=0123abc,cn=users,dc=company,dc=com
telephonenumber: 123-246-7890
mail: robert_smith@company.com
uid: 0123abc
cn: Robert Smith
givenname: Robert
sn: Smith
orclisVisible: True
objectclass: person
objectclass: organizationalperson
objectclass: inetorgperson
objectclass: orcluserv2

One thing I came across that looks close is:

http://sed.sourceforge.net/sed1line.txt

# delete duplicate, consecutive lines from a file (emulates "uniq").
# First line in a set of duplicate lines is kept, rest are deleted.
sed '$!N; /^\(.*\)\n\1$/!P; D'

Any help would be greatly appreciated.
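
The uniq one-liner above only drops a line when it is identical to the one before it, so it would not remove mail lines that carry different addresses. A minimal sketch of the same N/P/D idea, adapted to compare only the "mail:" prefix, might look like the following. The function name dedupe_mail_sed is hypothetical, and the sketch assumes the extra mail lines always directly follow the first one, as in the example.

```shell
# Hypothetical helper: keep the first of each run of consecutive "mail:" lines.
dedupe_mail_sed() {
    # :a      loop label
    # $!N     append the next input line unless we are on the last one
    # s///    if both lines start with "mail:", drop the second one
    # ta      after a successful drop, loop to pull in the following line
    # P; D    otherwise print the first line and carry the rest forward
    sed -e ':a' -e '$!N' \
        -e 's/^\(mail:.*\)\nmail:.*$/\1/' \
        -e 'ta' -e 'P' -e 'D' "$@"
}

# Abridged entry from the example above.
dedupe_mail_sed <<'EOF'
dn: cn=0123abc,cn=users,dc=company,dc=com
telephonenumber: 123-246-7890
mail: robert_smith@company.com
mail: bs@company.com
mail: bob@company.com
mail: rsmith@company.com
uid: 0123abc
EOF
# prints:
# dn: cn=0123abc,cn=users,dc=company,dc=com
# telephonenumber: 123-246-7890
# mail: robert_smith@company.com
# uid: 0123abc
```

If the mail lines in an entry can be separated by other attributes, this sketch keeps the later ones, so it is worth checking the real file for that case first.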

PHV · Jun 19, 2007

An awk way:
awk '$1=="mail:" && $1==p{next}{p=$1;print}' /path/to/input > output

Hope This Helps, PH.
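
To see what the awk filter does, here is a small sketch that runs it on an abridged copy of the Robert Smith entry. The variable p holds the first field of the previous printed or skipped-over line, so a "mail:" line is skipped only when the line before it was also a "mail:" line. The wrapper name dedupe_mail_awk is hypothetical and only there to make the example easy to rerun.

```shell
# Hypothetical wrapper around the awk filter from the reply above.
dedupe_mail_awk() {
    # Skip a line whose first field is "mail:" when the previous line's
    # first field was also "mail:"; otherwise remember the field and print.
    awk '$1=="mail:" && $1==p{next}{p=$1;print}' "$@"
}

# Abridged entry from the question.
dedupe_mail_awk <<'EOF'
dn: cn=0123abc,cn=users,dc=company,dc=com
telephonenumber: 123-246-7890
mail: robert_smith@company.com
mail: bs@company.com
mail: bob@company.com
mail: rsmith@company.com
uid: 0123abc
EOF
# prints:
# dn: cn=0123abc,cn=users,dc=company,dc=com
# telephonenumber: 123-246-7890
# mail: robert_smith@company.com
# uid: 0123abc
```

Because the comparison is against the immediately preceding line, this relies on the extra mail lines being consecutive, which matches the example but is an assumption about the full file.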
