
Distinct output to a table? SQL 7

Status
Not open for further replies.

hughed00

Technical User
Jan 15, 2001
I have a mess on my hands and very limited personal or organizational resources. I apologize in advance for what is likely a stupid question, but here goes:

I have a fixed-field text file that consists of the headers from MQ Series messages. Each record 'should' contain a unique 48-character message ID, and my database should only contain one record for any given message ID.

However, due to custom ancillary code which cannot be modified, the text file occasionally contains both duplicate records and records with duplicate message IDs (the duplicates are bogus and don't affect message delivery, which is why no one will fix the code for this "little" problem).

I realize I can do a DISTINCT on the message ID field to view the records I want. One problem is that the message ID is the only (supposedly) unique field and needs to be indexed. This database is quickly reaching millions of records, and without indexing it is virtually DOA.

What I have been told to do is remove the offending records and hold them in a separate table, then define the primary key on the "clean" table as the message ID, and use the "clean" table as the basis for our reporting/auditing system.

Is there a way to do this without modifying the text file source, accomplishing the whole task within SQL 7? I have considered some kind of script (Perl?) to clean up the text file prior to importing it with DTS, but have been discouraged from doing so.
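
For what it's worth, the layout I have been told to aim for looks roughly like this (table names, column names, and sizes are just placeholders for the real fixed fields):

-- "clean" table: one row per message ID, enforced by the primary key
CREATE TABLE MsgHeadersClean (
    MsgID      char(48)     NOT NULL PRIMARY KEY,  -- the 48-character MQ message ID
    HeaderData varchar(255) NOT NULL               -- stand-in for the rest of the header fields
)

-- holding table for the offending rows; no key, so duplicates are allowed
CREATE TABLE MsgHeadersDupes (
    MsgID      char(48)     NOT NULL,
    HeaderData varchar(255) NOT NULL
)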
 
Couldn't you copy the distinct records out to a work table (or view), empty the original table, and copy the records back? Then alter your table by adding a unique counter to distinguish each future row. The running time might be lengthy, though...?
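
Something along these lines, maybe (table and column names are guesses, and it's untested):

-- copy the distinct rows out to a work table
SELECT DISTINCT MsgID, HeaderData
INTO MsgHeadersWork
FROM MsgHeaders

-- empty the original table and copy the rows back
TRUNCATE TABLE MsgHeaders
INSERT INTO MsgHeaders (MsgID, HeaderData)
SELECT MsgID, HeaderData FROM MsgHeadersWork

-- add a counter column so future rows can always be told apart
ALTER TABLE MsgHeaders ADD RowCounter int IDENTITY(1,1)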

Just a thought,
Pat
 
That gets me halfway there by eliminating records that are identical, but it leaves those that are the same except for the timestamp (which is the error that clued us in to the problem).

I basically would be happy if I could just get one record for each supposedly unique message ID. I need to eliminate both the rows that are completely identical and those that merely have the message ID repeated.

Thanks
 
Dear;
You can also use Count(*) in a having clause to find the duplicate records.

Like

Select DuplicateName from Table
Group by DuplicateName
having Count(*) > 1

Maybe it gives you a hint.
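
Applied to your case, that would be something like this (I am guessing at the table and column names):

-- message IDs that appear more than once
SELECT MsgID, COUNT(*) AS HowMany
FROM MsgHeaders
GROUP BY MsgID
HAVING COUNT(*) > 1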

essa2000

 
I haven't been dealing with SQL Server very long, but using MS Access I think I could make this type of direct import by creating an empty table with both the key and the data as unique indexes, then doing the import and letting the duplicate records error out on duplication of either the key or the text.

I don't know what happens if you do the same sort of import using the Data Transformation Services (DTS) Import Wizard, but if you open up the transform, I think there is a setting for the maximum number of errors it will tolerate.

Success hinges on whether DTS appends one record at a time, rejecting records that have duplicate keys, or appends all the records at once and rolls back the whole import on failure.
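
If it turns out DTS rolls back the whole batch, I believe SQL Server also has a unique-index option, IGNORE_DUP_KEY, that skips duplicate-key rows with a warning instead of failing the insert. Something like the following, but check Books Online first (names are made up, and the unique index would be created this way instead of as a PRIMARY KEY constraint):

-- unique index that quietly discards duplicate-key rows on insert
CREATE UNIQUE INDEX IX_MsgID
ON MsgHeadersClean (MsgID)
WITH IGNORE_DUP_KEY

-- rows whose MsgID is already present are skipped; the rest go in
INSERT INTO MsgHeadersClean (MsgID, HeaderData)
SELECT MsgID, HeaderData
FROM MsgHeaders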

If this won't work, the direct unique-values query won't work either, since AB and AC (same key, different data) are considered different, as are DE and DF, and both cases are ones you want to treat as duplication.

You can get a list of duplicated keys by grouping on the key and selecting those keys having COUNT(*) > 1, and append these to a new table. Since you want to eliminate every record that carries a duplicated key, you can inner join on this result to first copy the duplicates to a wastebasket table and then delete them.
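
In rough T-SQL (names are invented, so adjust to your schema), that might look like:

-- keys that occur more than once
SELECT MsgID
INTO DupKeys
FROM MsgHeaders
GROUP BY MsgID
HAVING COUNT(*) > 1

-- copy every row carrying one of those keys to a wastebasket table
SELECT MsgHeaders.*
INTO Wastebasket
FROM MsgHeaders
INNER JOIN DupKeys ON MsgHeaders.MsgID = DupKeys.MsgID

-- then delete those rows from the main table
DELETE MsgHeaders
FROM MsgHeaders
INNER JOIN DupKeys ON MsgHeaders.MsgID = DupKeys.MsgID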

Once you have done this, you can get down to unique messages by grouping on the text, using the first key (apply the FIRST function) for each group, and appending this result to a corrected table.
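
SQL Server doesn't have Access's FIRST aggregate, but MIN of the key does much the same job here. A rough sketch, appending from the wastebasket into the corrected table (again, names are made up):

-- one representative row per distinct header text
INSERT INTO MsgHeadersClean (MsgID, HeaderData)
SELECT MIN(MsgID), HeaderData
FROM Wastebasket
GROUP BY HeaderData

If what you really want is one row per message ID, group on MsgID instead and take MIN of the other columns.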

Maybe one of the gurus knows a more direct way of doing this.

Best,

Harry Rich

 