Tek-Tips is the largest IT community on the Internet today!

Members share and learn making Tek-Tips Forums the best source of peer-reviewed technical information on the Internet!

  • Congratulations Mike Lewis on being selected by the Tek-Tips community for having the most helpful posts in the forums last week. Way to Go!

Delete query assistance 1

Status
Not open for further replies.

micang

Technical User
Aug 9, 2006
626
US
SQL 2008 R2

Hi All,

I have this data below, I have a unique ID. There are mutiple contract numbers with a transaction date.

My end goal result is to have 1 record, the criteria to have one record is that I need to have the max transaction date. So what I would like to achieve is have a delete query that will delete all invalid records. In the sample data below, you will see that there is a duplicate max transactiondate, in this case I would want to keep the max id from these 2 records.

[pre]
UniqueID ContractNo Transactiondate
253417 555874 2/21/2002
253418 555874 7/24/2003
253419 555874 10/23/2003
253420 555874 8/27/2015
253421 555874 10/27/2015
253422 555874 10/27/2015
[/pre]

Would it be possible to achieve this in one delete query?

Many thanks
Michael
 
Well, my idea would base on my solution of your previous question (thread183-1761789):

Code:
Declare @tableTest as Table (uniqueid int, contractno int, transactiondate date);

insert into @tableTest Values
 (253417,	        555874,          '20020221')
,(253418,	        555874,          '20030724')
,(253419,	        555874,          '20151027')
,(253420,	        555874,          '20151027')
,(253421,	        555874,          '20031023')
,(253422,	        555874,          '20150827');

with LatestTransfer as
(
  select contractno, max(transactiondate) as LatestTransactionDate from @tableTest 
  group by contractno
), KeepIDs as
(
  select tt.contractno, max(uniqueid) as keepid from  @tableTest tt
  inner join LatestTransfer lt on  lt.contractno = tt.contractno and lt.LatestTransactionDate = tt.transactiondate 
  group by tt.contractno
)
Delete @tableTest
From @tableTest tt
left outer join KeepIDs k on tt.uniqueid = k.keepid 
Where keepid is null;

Select * From @tableTest

I put the latest transfer in the middle of the data just to demonstrate the picking of the max(uniqueid) is picking the correct record, not simply the max(uniqueid) per contractno.

On the other hand, if your uniqueid is a sequence number (like integer identity) you can always expect the max(uniqueid) per contractno to have the latest transactiondate, as the sequence number rises in chronological order, anyway. In that circumstance the max(uniqueid) as keepid from @tableTest group by contractno would be sufficient without first determining the latest transaction date.

Bye, Olaf.
 
You may better see how this works, if you do this first, instead of the DELETE statement:

Code:
Select *
From @tableTest tt
left join KeepIDs k on tt.uniqueid = k.keepid;

This shows, how the left join works. It's an outer join, so it also creates result rows, when there is no match in KeepIDs. And all those non matches are the ones you don't want to keep and thus delete.

Bye, Olaf.
 
Olaf,

Thank you so much, this query is brilliant, it does exactly what I require.

The unique id is a sequence id, but because I do not know how the data is populated, I thought if I do it this way, I am pretty safe (as can be) to keep the max record irrespective. I really have no idea how this data is populated so I am erring on the side of caution.

You have been extremely helpful with both my threads (yes, they are related), I appreciate your time.

Michael
 
You're welcome,

Perhaps someone else has an even simpler idea, to me it's just going straight forward, step by step. Even though others avoid CTEs, they just make writing queries easier, that are based on previous queries. In this case even in two levels.

Of course I know even less than you may be able to find out about the process, in which this data was collected. I would assume perhaps flat files are colleced from several affiliates and simply imported fully here. I would have expected in a next step they would be aggrageted in a way you address right now. Anyway, if those flat files are not imported in correct order you could have your case of the latest record not having the max uniqueid. And that would just be one such scenario.

Bye, Olaf.
 
I agree with Olaf - nice solution. DO NOT do the delete directly against your table. Make a new table with the records you want to keep, then archive the old table, then rename the new table as the production version. This helps keep the table and index space clean and will help your performance. When you DELETE, most databases do not recover the space very efficiently. And the more deletes you do, the more inefficiency you create.

==================================
adaptive uber info galaxies (bigger, better, faster, and more adept than cognitive innovative agile big data clouds)


 
Yes, John has good advice. I already guessed the table you see here could be a staging table for a real contracts table into which all the imports are merged, this would only have the latest state in it at any time.

Anyway, it's easy to create populate the new table from the KeepIDs as

Code:
Insert Into newcontractstable
Select tt.*
From @tableTest tt
inner join KeepIDs k on tt.uniqueid = k.keepid;

In the long run though, you'd only process new records of the staging table into the contracts table and then empty the staging table, you don't need to keep such staging data once it is processed.
To not pollute the transaction log of the database with staging data and it's deletion this could also go into temp tables or a separate staging database on the same SQL Server instance. Such a database could much easier be cleaned, not only via TRUNCATE of staging data tables, but also via DBCC shrink, and its recovery model could also be set to simple. If you even keep it at all and not just CREATE DATABASE staging for importing staging data and DROP DATABASE after having processed some import data.

As you are not sure about the meaning and history of this data, it's all just guesswork, but a transfer date points out to some data transfer/import mechanism. You would rather keep the flat files of the import process, if law provides you keep a history of any contract and financial and customer related data in an auditable manner.

Bye, Olaf.
 
Status
Not open for further replies.

Part and Inventory Search

Sponsor

Back
Top