Tek-Tips is the largest IT community on the Internet today!

Members share and learn making Tek-Tips Forums the best source of peer-reviewed technical information on the Internet!

  • Congratulations Mike Lewis on being selected by the Tek-Tips community for having the most helpful posts in the forums last week. Way to Go!

speed up delete where value not in multiple tables 1

Status
Not open for further replies.

jhaganjr

IS-IT--Management
Dec 11, 2002
62
0
0
US
Hello Amazingly Smart People,

I did a search and found the answer to my question. However, the answer is excessively slow. I'm wondering if you know a better way.

Need: To delete records from TblA when the key value of the records does not appear in several related tables.

Solution:
DELETE * FROM tblA
WHERE value Not In (SELECT value FROM tblBSub)
AND value Not In (SELECT value FROM tblCSub)
AND value Not In (SELECT value FROM tblDSub)
AND value Not In (SELECT value FROM tblESub)
AND value Not In (SELECT value FROM tblFSub);

I'm dealing with only thousands and tens of thousands of records, but this takes many minutes to run. I haven't had the patience to wait beyond 5 or 6 minutes for it to complete. But I know it works by testing it on small test data sets first.

Another solution is to use a union query to create a single list of "value" from all the sub-tables. The union query runs very fast on its own. But when I place it into the Not In statement above the Delete query runs just as slow.

Another solution is to update a True/False field in tblA first, then delete records from tblA based on that field.

This runs MUCH faster - in a matter of 20 seconds or so - but it is a two step process ... first the update then the delete. I'm guessing the first two solutions are slower because using the IN approach may eliminate the benefits of indexing??

The update, then delete solution is not a problem, per se, but I'm wondering if anyone knows how to speed up the delete query itself. Is there better syntax that executes faster?

Thanks,
Joe
 
How are ya jhaganjr . . .

The best I can think of is to cut down the amount of records returned by each subQuery so your not parsing thru entire tables! Example:
Code:
[blue]DELETE tblA.Value FROM tblA 
WHERE (tblA.Value Not In (SELECT tblBSub.value FROM tblBSub 
			  WHERE (tblA.Value=tblBSub.Value))) AND 
      (tblA.Value Not In (SELECT tblCSub.value FROM tblCSub 
			  WHERE (tblA.Value=tblCSub.Value))) AND 
      (tblA.Value Not In (SELECT tblDSub.value FROM tblDSub 
			  WHERE (tblA.Value=tblDSub.Value))) AND 
      (tblA.Value Not In (SELECT tblESub.value FROM tblESub 
			  WHERE (tblA.Value=tblESub.Value))) AND 
      (tblA.Value Not In (SELECT tblFSub.value FROM tblFSub 
			  WHERE (tblA.Value=tblFSub.Value)));[/blue]
[blue]Your Thoughts? . . .[/blue]

Calvin.gif
See Ya! . . . . . .

Be sure to see thread181-473997
Also faq181-2886
 
My thoughts WERE (when I first read it) ...

I understand the goal, but how does what you wrote achieve it?

Still, I plugged it into my live data, and it works. Thanks a million!

I'm still trying to get my amateur brain around why it works. It seems like a "double-negative" ... looking for something that's NOT in a table, but using a WHERE clause that specifies the values from each table are the same.

I keep reading it and trying to imagine the data sub-sets, but I just don't get it.

No matter. It works. Plus it's lightning fast.

Thanks, again!

Joe
 
This is multi-step but you should be able to do it in far less than 20 minutes
Code:
Public Sub DeleteFromTblA()
    ' Create an indexed temporary table.
    CurrentDb.Execute "CREATE TABLE Temp (Value text(255),  " & _
                      "CONSTRAINT [Primary] PRIMARY KEY (Value));"

    ' Insert all the values into it
    CurrentDb.Execute "INSERT INTO Temp (Value) " & _
                      "Select Value From " & _
                      "(" & _
                      "      SELECT value FROM tblBSub " & _
                      "UNION SELECT value FROM tblCSub " & _
                      "UNION SELECT value FROM tblDSub " & _
                      "UNION SELECT value FROM tblESub " & _
                      "UNION SELECT value FROM tblFSub " & _
                      ");"

    ' Create another indexed temporary table.
    CurrentDb.Execute "CREATE TABLE Temp1 (Value text(255),  " & _
                      "CONSTRAINT [Primary] PRIMARY KEY (Value));"

    ' Insert everything that's only in tblA into Temp1
    CurrentDb.Execute "INSERT INTO Temp1 (Value) " & _
                      "Select Value " & _
                      "From tblA LEFT JOIN Temp " & _
                      "    ON tblA.Value = Temp.Value " & _
                      "Where Temp.Value IS NULL;"

    ' Now Delete everything in that table from tblA.
    CurrentDb.Execute "DELETE * FROM tblA  " & _
                      "Where tblA.Value IN (Select Value From Temp1);"

    ' Clean up your mess.
    CurrentDb.Execute "DROP TABLE Temp;"
    CurrentDb.Execute "DROP TABLE Temp1;"

End Sub

This will (I hope) be able to use indexes and, dispite the multiple steps, should be faster than the full-table scans that you are having to endure now.
 
jhaganjr . . .

I'm at work and will explain when I get home. However, you may want to try a union query with the new SQL (like before). Should get even faster! . . .

Calvin.gif
See Ya! . . . . . .

Be sure to see thread181-473997
Also faq181-2886
 
Golom,

Thanks for the code. I like it, and will be able to use it in other settings.

Joe
 
jhaganjr . . .

Sorry to get back so late! To explain:
jhaganjr said:
[blue] . . . It seems like a "double-negative" ... looking for something that's NOT in a table, but using a WHERE clause that specifies the values from each table are the same[/blue]
[purple]The key is in the amount of records returned by each subQuery[/purple], available for searching with the [blue]Not In[/blue] clause. Consider . . .
Code:
[blue]SELECT value
FROM tblBSub[/blue]
. . . this subQuery [red]returns all records in the table[/red] for the field [blue]'Value'[/blue], which according to you could be thousands to tens of thousands of records. [blue]This is your time killer for the search[/blue], espcially considering you have to search thousands to tens of thousands of records for each subQuery!

Now consider . . .
Code:
[blue]SELECT tblBSub.Value
FROM tblBSub 
WHERE ([purple][b]tblA[/b][/purple].Value=[purple][b]tblBSub[/b][/purple].Value[/blue]
. . . we have the same subQuery, only now we've set criteria to match that were looking for. The subQuery now only returns those records where 'Value' exists! This gives us a much smaller recordcount for searching! We now have a few records to search thru, or none! if 'Value' is not found. Remember, [purple]were talking records returned by the subQuery for search by the [blue]NOT IN[/blue] clause![/purple] When no records are returned [blue]NOT IN[/blue] is True because no records are available for search. Hence your fastest execution occurs when 'Value' doesn't exist for all subQueries.

I hope the above is a little better insight for you.

If you'd like to crank out more speed!, optimize a little by performing the following:
[ol][li]In succession open each table in design view then
[ol a][li]Select the Value Field.[/li]
[li]In the [blue]General Tab[/blue] on the bottom left, check the [blue]Indexed[/blue] property. If is saids [blue]No[/blue] then change it to [blue]Yes (Duplicates OK)[/blue][/li].
[li]Save & close the table.[/li][/ol][/li]
[li]For each subQuery:
Code:
[blue]Change: SELECT [i]TableName[/i]
To    : SELECT [purple][b]Top 1[/b][/purple] [i]TableName[/i][/blue]
For those situations where more than one record is found in a subQuery, the above [blue]makes sure only one record is returned! After all, for each subQuery, we only need to know if 'Value' occurs once.[/blue][/li][/ol]

Special Note: The first time you run any query (after editing), its optimized by the jet, making the first run the slowest. There after the query runs in its optimized/fastest state until you edit again.

[blue]Your Thoughts? . . .[/blue]

Calvin.gif
See Ya! . . . . . .

Be sure to see thread181-473997
Also faq181-2886
 
Your explanations make perfect sense. And I know from putting your solution into practice it works fantastically.

My indexes were already in place, but I did add the "Top 1" syntax to speed things up even further.

Thanks, again. I learned a lot from your posts!

Joe
 
Status
Not open for further replies.

Part and Inventory Search

Sponsor

Back
Top