I have a mess on my hands and very limited personal or organizational resources. I apologize in advance for what is likely a stupid question, but here goes;
I have a fixed field text file that consists of the headers from MQ Series messages. Each record 'should' contain a unique 48 character message ID. My database should only contain one record containing any single message ID.
However, due to custom ancillary code which cannot be modified, the text file occasionally contains both duplicate records and records with duplicate message IDs (which are bogus and don't effect message delivery, that is why no one will fix it for this "little" problem).
I realize I can do a DISTINCT on the message ID field to view the records I want. One problem is that the message ID is the only unique (sic) field and needs to be indexed. This database is reaching millions of records quickly and without indexing is virtually DOA.
What I have been told to do is remove the offending records and hold them in a seperate table. Then define the primary key in the "clean" table as the message ID. Then use the "clean" table as the basis for our reporting/auditing system.
Is there a way to do this without modifying the text file source and accomplish the task within SQL 7 ? I have considered some kind of script (Perl?) to clean up the text file prior to importing using DTS, but have been discouraged.
I have a fixed field text file that consists of the headers from MQ Series messages. Each record 'should' contain a unique 48 character message ID. My database should only contain one record containing any single message ID.
However, due to custom ancillary code which cannot be modified, the text file occasionally contains both duplicate records and records with duplicate message IDs (which are bogus and don't effect message delivery, that is why no one will fix it for this "little" problem).
I realize I can do a DISTINCT on the message ID field to view the records I want. One problem is that the message ID is the only unique (sic) field and needs to be indexed. This database is reaching millions of records quickly and without indexing is virtually DOA.
What I have been told to do is remove the offending records and hold them in a seperate table. Then define the primary key in the "clean" table as the message ID. Then use the "clean" table as the basis for our reporting/auditing system.
Is there a way to do this without modifying the text file source and accomplish the task within SQL 7 ? I have considered some kind of script (Perl?) to clean up the text file prior to importing using DTS, but have been discouraged.