Tek-Tips is the largest IT community on the Internet today!

Members share and learn making Tek-Tips Forums the best source of peer-reviewed technical information on the Internet!

  • Congratulations Mike Lewis on being selected by the Tek-Tips community for having the most helpful posts in the forums last week. Way to Go!

Need Help Randomizing Customer Data

Status
Not open for further replies.

ruckrock

Programmer
Jun 21, 2007
9
0
0
US
Hi,

I have a quick question.

I have a SQL Server database with customer data in 1 table. I want to use this 'live' data for a testing database, but need a way to randomize the customer info. I figured out ways to do it for SSN# and telephone #, as well as address, but I am stuck on the customer names themselves.

The customer's names are split between two columns, first_name and last_name within the same table. All I really want to do is update all the customer records with random combinations of the existing first and last names in the database so our 'testers' do not have access to real people names and their addresses/ssn/etc.

Anyone know of a way to do this? So far, I have all the data imported into a temp table and was able to change randomize everything but the names.

Thanks in advance
 
I don't have SQL Server at my disposal right now, but here's a not-even-pseudo code idea:

create table variables or temp tables for first and last name (with an ID column).

Insert all your first / last names to these ordered by newid()

Then, inner join the two tables on IDCol = IDCol

If I am correct, this should result in a random assortment of names.

HOpe this helps,

Alex

Ignorance of certain subjects is a great part of wisdom
 
Just be aware that mixing up first and last names isn't always completely sufficient. Some last names are quite unique in various regions and could be identified, especially those that are hyphenated.

One way that I solved the problem was just to think up about 200 generic first names, 200 generic last names, and combined with a middle initial, that's over a million fake names. A couple of CROSS JOINs and you're good!

John G. Smith
John H. Smith
John I. Smith
John K. Smith



[COLOR=#aa88aa black]Cum catapultae proscriptae erunt tum soli proscript catapultas habebunt.[/color]
 
I agree with ESquared. You want completely false data for the testing environment. Don't derive any values based on actual customer data. If you don't want to take the time to type or download names, then running a loop to produce values like Cust1, Cust2, Cust3, etc. will work as well.
 
Status
Not open for further replies.

Part and Inventory Search

Sponsor

Back
Top