Search string, compare, and return based on percentage similar

Status
Not open for further replies.

beacon5

Programmer
Dec 29, 2008
74
US
Hi everybody,

Version: [Crystal Reports 8.5]

I'm still fairly new to Crystal Reports, so please be thorough in your explanation.

I'm working with a back-end database that has a list of people's names in it. I know I need to group the people by department and I think I'm probably going to need to group by person as well.

I need to compare each person's name with the names of the other people in the same department and return the people who have similar names. I only want to return names that are at least 50% similar.

For example:
"Smith, John" and "Smith, John A." would be approximately 95% similar,
while
"Smith, John" and "Smith, William J." would be about 25% similar.

My thinking is that the report will compare and return a percentage similar for all persons in the department and will suppress the people that aren't greater than or equal to 50%.

I've programmed string comparisons for exact matches in C++, but I've never done similar names based on a percentage. Can this be done? I've been racking my brain and don't know where to begin.

I really appreciate the help...
 
Please identify the fields you are working with and show samples of how these fields display in the detail section. We don't know whether you have separate fields for first, last, middle initial or whether the entire name (and possibly more than one) is in one field.

It might help to know the point of this exercise. Once you have the names that are similar, how will this information be used?

-LB
 
Hi,
In addition, how is the percentage similarity determined: common letters in the same order, same last name, etc.?

How about names that sound alike: 'Peterson, Petersen'?




To Paraphrase:"The Help you get is proportional to the Help you give.."
 
Sorry for the lack of info. I really pride myself on trying to provide as much detail as possible, but sometimes it's difficult after staring at the problem for hours on end.

The point of the report is that we need to generate a name alert. For every person in the department, we want to see whether their name is similar to that of anyone else in the department.

The field is a string of the entire name, not just first, last, or middle initial, but I think it would probably be best to work with a formula field that splits the name into three separate fields (or maybe just first and last name, although I'm obviously open to suggestion).

Example field: {table.Employee_Name}
Example data: "Smith, John A."

Since matching on pronunciation would be almost impossible, I thought that finding the length of the string, comparing the letters at each position, and then dividing the number of matches by the total number of characters would give a percentage similarity.

If three separate fields are used, the percentages would be averaged for a total similar. Then, anything less than 50% would be suppressed.
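
To make the idea above concrete, here is a minimal sketch in Python rather than Crystal Reports formula syntax. It assumes the names always look like "Last, First M." and that "total characters" means the length of the longer string; the function names (positional_similarity, split_name, name_similarity) are made up for illustration, and only last name and "everything after the comma" are split out, not three fields.

```python
def positional_similarity(a, b):
    """Fraction of character positions at which a and b agree.

    Assumption: the divisor is the length of the longer string, and the
    comparison ignores case and surrounding whitespace.
    """
    a, b = a.strip().lower(), b.strip().lower()
    longest = max(len(a), len(b))
    if longest == 0:
        return 1.0  # two empty strings are treated as identical
    matches = sum(1 for x, y in zip(a, b) if x == y)
    return matches / longest


def split_name(full_name):
    """Split 'Last, First M.' into (last, rest) at the first comma."""
    last, _, rest = full_name.partition(",")
    return last.strip(), rest.strip()


def name_similarity(name_a, name_b):
    """Average the positional similarity of the last-name and first-name parts."""
    parts_a = split_name(name_a)
    parts_b = split_name(name_b)
    scores = [positional_similarity(x, y) for x, y in zip(parts_a, parts_b)]
    return sum(scores) / len(scores)


if __name__ == "__main__":
    for other in ("Smith, John A.", "Smith, William J."):
        score = name_similarity("Smith, John", other)
        print(f"Smith, John vs {other}: {score:.0%}")
```

Two things to watch with this approach: comparing position by position is very sensitive to shifts (one extra letter near the start misaligns everything after it), and averaging the parts lets an identical last name carry a pair. Under these assumptions "Smith, John" vs "Smith, William J." scores exactly 50%, so it would not be suppressed.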

I hope this helps clarify things, but if it doesn't I'll be happy to provide more info. I think my biggest issue is how to loop through the data for each person.

Thanks again for the help...
 
This reply is to bump this thread to see if anyone has any additional thoughts on the question.

Thanks...
 
You didn't really answer my question regarding why you would want to do this. How will this information be used once you have created this alert? The reason I ask is that what you want to do would be very complex, and there might be a much simpler solution.

-LB
 
I replied that it was to create a name alert to find similar names, but since that wasn't enough info here's some extra. This is all I was really given to go off of.

We have an option in the database application we use that allows us to create alerts (basically message boxes warning the users) for various items. The committee I'm on wants to create a report to see which names are similar so that these people can have an alert created in the application.

The report is going to be used as a reference to work with in creating these name alerts.
 
You have again restated that you want an alert when names are similar--which does not explain how anyone will use this--so I guess if you don't know, you can't tell us.

If you want to compare letter by letter and take into consideration sequencing, I think this is virtually undoable--at least I wouldn't try.

-LB

 
Hi,
The only way I can think of to even make this a possibility is to avoid using CR entirely. Your DBA/database developers (assuming a real database and not some file-system-based toy) should be able to create a procedure that iterates through each name and does a string comparison with each other name (actually several string comparisons to get the result you need) to determine similarity.
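
As a rough illustration of that "iterate through each name and compare it with each other name" procedure, here is a minimal sketch in Python rather than a database procedure. It assumes the data can be exported as (department, name) rows, and it swaps in the standard library's difflib.SequenceMatcher ratio as the similarity measure, which is not the position-by-position method discussed earlier in the thread. The function name similar_name_alerts and the sample rows are hypothetical.

```python
from collections import defaultdict
from difflib import SequenceMatcher
from itertools import combinations


def similar_name_alerts(rows, threshold=0.5):
    """Return (department, name_a, name_b, score) for similar names.

    rows is an iterable of (department, name) pairs. Names are only
    compared with other names in the same department, and each pair
    is compared once.
    """
    by_department = defaultdict(list)
    for department, name in rows:
        by_department[department].append(name)

    alerts = []
    for department, names in by_department.items():
        for name_a, name_b in combinations(names, 2):
            # ratio() is 2 * matched characters / total characters,
            # so it ranges from 0.0 to 1.0.
            score = SequenceMatcher(None, name_a.lower(), name_b.lower()).ratio()
            if score >= threshold:
                alerts.append((department, name_a, name_b, score))
    return alerts


if __name__ == "__main__":
    # Made-up sample data for illustration only.
    sample = [
        ("Sales", "Smith, John"),
        ("Sales", "Smith, John A."),
        ("Sales", "Nguyen, Thi"),
        ("IT", "Smith, John"),
    ]
    for department, a, b, score in similar_name_alerts(sample):
        print(f"{department}: {a!r} ~ {b!r} ({score:.0%})")
```

Every pair within a department is compared, so the work grows with the square of the department size. That is fine for a few hundred names per department but worth keeping in mind for larger ones.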





 