Parse name field into components

Status
Not open for further replies.

JenJohnson

Programmer
Nov 3, 2003
US
Good Day, All:

I have a field called Manager_Name which contains data, some of which looks like this:

C. Lennis Koontz,II
Theresa Holt
Victor T. Murhman
Paul A. Graham, Jr.
Thomas Murphy Jr.
Jon B. Lovelace Jr.
David B. Rees III
Mary Jo Ochson
M. L. Conery
Michael A. Del Balso

I need to parse out the last name of each of these records. (The last name I'm expecting is in red).
Has anyone else had the need to do something like this?

Any help would be greatly appreciated.

Thanks,
Jennifer
 
This is mostly impossible; your record format is not consistent enough.

If you NEVER had a middle name, you could do it by ignoring the first field and taking everything else as the last name.

Since you sometimes have a middle name and sometimes don't, and sometimes have a suffix and sometimes don't, you cannot really determine any kind of field format.
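To make the point concrete, here is a minimal sketch of the "ignore the first field" idea, assuming names are separated by blanks. The helper name rest_after_first() is made up for illustration; it is not from any module.

```perl
#!perl
use strict;
use warnings;

# Drop the first blank-separated field and treat the rest as the last name.
# rest_after_first() is a hypothetical helper name.
sub rest_after_first {
    my ($name) = @_;
    my (undef, @rest) = split ' ', $name;   # discard the first field
    return join ' ', @rest;
}

print rest_after_first('Theresa Holt'), "\n";        # Holt
print rest_after_first('Thomas Murphy Jr.'), "\n";   # Murphy Jr.
print rest_after_first('Victor T. Murhman'), "\n";   # T. Murhman
```

The last line shows the problem: as soon as a middle initial appears, it leaks into the result.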

Maybe someone smarter than me can figure it out.
 
Well, it seemed to me that if there were 4 blank-separated fields you want the last 2, otherwise you want just the last one. This almost works, except for "Thomas Murphy Jr.". So I just added something to the condition that says if the line ends in "Jr." take the last 2 fields. This works with the data you posted:
Code:
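#!perl
use strict;
use warnings;

while (<>)
{
	chomp;
	my @arr = split;	# blank-separated fields of the current line
	next unless @arr;	# skip blank lines
	# Four fields, or a trailing "Jr.", means the last name spans two fields.
	my $lname = (@arr == 4 || /Jr\.$/)
		? join(' ', @arr[-2, -1])
		: $arr[-1];
	print qq($_ => $lname\n);
}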
Here's the output with the posted data:

C. Lennis Koontz,II => Koontz,II
Theresa Holt => Holt
Victor T. Murhman => Murhman
Paul A. Graham, Jr. => Graham, Jr.
Thomas Murphy Jr. => Murphy Jr.
Jon B. Lovelace Jr. => Lovelace Jr.
David B. Rees III => Rees III
Mary Jo Ochson => Ochson
M. L. Conery => Conery
Michael A. Del Balso => Del Balso
 
You may want to make that a hash with Sr, Jr, etc. in it so that you can catch all the variations.

This scheme would break if you have a name like

Michael Del Balso

At least I think it would; it would probably take Del as the middle name.

But I am guessing; it's late.
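For what it's worth, the suffix-hash suggestion above could look something like the sketch below. The entries in %SUFFIX and the sub name last_name() are assumptions for illustration, and so is the choice to split on commas as well as blanks so that "Koontz,II" comes apart. The thread does not settle whether the suffix belongs with the last name, so the sketch returns the two separately.

```perl
#!perl
use strict;
use warnings;

# Hypothetical suffix table; extend it as new variations turn up.
my %SUFFIX = map { $_ => 1 } qw(JR SR II III IV);

# Returns (last name, suffix); suffix is '' when none is found.
sub last_name {
    my ($name) = @_;
    # Split on blanks and commas, so "Koontz,II" becomes "Koontz", "II".
    my @parts = grep { length } split /[\s,]+/, $name;
    return ('', '') unless @parts;

    my $suffix = '';
    if (@parts > 1) {
        (my $key = uc $parts[-1]) =~ s/\.$//;   # "Jr." -> "JR"
        $suffix = pop @parts if $SUFFIX{$key};
    }

    # Same field-count rule as before, applied once the suffix is gone:
    # four remaining fields means a two-word last name.
    my $last = @parts == 4 ? "$parts[-2] $parts[-1]" : $parts[-1];
    return ($last, $suffix);
}

for my $n ('Paul A. Graham, Jr.', 'Michael A. Del Balso', 'Michael Del Balso') {
    my ($last, $suffix) = last_name($n);
    print "$n => $last", ($suffix ? " [$suffix]" : ''), "\n";
}
```

Note that "Michael Del Balso" still comes out as "Balso": with only three fields there is nothing to tell "Del" apart from a middle name, so the guess above holds.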

 
Hi

This seems to work:

@names = ("C. Lennis Koontz,II",
          "Theresa Holt",
          "Victor T. Murhman",
          "Paul A. Graham, Jr.",
          "Thomas Murphy Jr.",
          "Jon B. Lovelace Jr.",
          "David B. Rees III",
          "Mary Jo Ochson",
          "M. L. Conery",
          "Michael A. Del Balso");

foreach $name (@names) {

    if ($name =~ m/^[A-Z][a-z]+ ([A-Z][a-z]+ )?([A-Z][a-z]+)$/) { # Xxxxxx Xxxxxx
        print "#1# $2\n";
    }
    elsif ($name =~ m/^[A-Z]\. [A-Z][a-z]+ (.+)$/) { # X. Xxxxxx
        print "#2# $1\n";
    }
    elsif ($name =~ m/^[A-Z][a-z]+ [A-Z]\. (.+)$/) { # Xxxxxx X.
        print "#3# $1\n";
    }
    elsif ($name =~ m/^[A-Z]\. [A-Z]\. ([A-Z][a-z]+)$/) { # X. X. Xxxxxx
        print "#4# $1\n";
    }
    elsif ($name =~ m/[A-Z][a-z]+ ([A-Z][a-z]+ .+)/) { # Xxxxxx Xxxxxx ?????
        print "#5# $1\n";
    }
    else {
        print "*** cannot match pattern ***\n";
    }

}

print "\n\n";


regards
Duncan
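If new name shapes keep arriving, one possible variation is to keep the same five patterns in a list and return the result instead of printing it, so that unmatched names can be collected and reviewed. This is only a sketch: the names @RULES and classify() are made up here, and the first pattern uses a non-capturing group for the optional middle name so that every rule leaves the last name in $1.

```perl
#!perl
use strict;
use warnings;

# The five patterns from the post above, tried in order.
# Each one captures the last name in $1.
my @RULES = (
    qr/^[A-Z][a-z]+ (?:[A-Z][a-z]+ )?([A-Z][a-z]+)$/,   # Xxxxxx [Xxxxxx] Xxxxxx
    qr/^[A-Z]\. [A-Z][a-z]+ (.+)$/,                     # X. Xxxxxx ...
    qr/^[A-Z][a-z]+ [A-Z]\. (.+)$/,                     # Xxxxxx X. ...
    qr/^[A-Z]\. [A-Z]\. ([A-Z][a-z]+)$/,                # X. X. Xxxxxx
    qr/[A-Z][a-z]+ ([A-Z][a-z]+ .+)/,                   # Xxxxxx Xxxxxx ...
);

# Returns (rule number, last name), or an empty list when nothing matches.
sub classify {
    my ($name) = @_;
    for my $i (0 .. $#RULES) {
        return ($i + 1, $1) if $name =~ $RULES[$i];
    }
    return;
}

for my $n ('Thomas Murphy Jr.', 'M. L. Conery', 'madonna') {
    my ($rule, $last) = classify($n);
    print defined($rule)
        ? "#$rule# $last\n"
        : "*** cannot match pattern: $n ***\n";
}
```

Adding a new name shape then means adding one line to @RULES, with the order of the list deciding which pattern wins.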
 
Thanks, Duncan! This really did the trick for me, and it will catch any new name types that they throw at me!

-- Jennifer
 
Thank you, Jennifer! Very glad to be of help.


Kind Regards
Duncan
 