Tek-Tips is the largest IT community on the Internet today!

Members share and learn making Tek-Tips Forums the best source of peer-reviewed technical information on the Internet!

  • Congratulations SkipVought on being selected by the Tek-Tips community for having the most helpful posts in the forums last week. Way to Go!

Rewritting a slow query that uses "in" ?

Status
Not open for further replies.

clone45

Programmer
Mar 28, 2006
22
0
0
US
Hello! I keep running into situations where I use the "in" operator which results in really _slow_ queries. I need to find a graceful way to work around it.

Here's my example.. We have a table called "campaigns" (for marketing campaigns) and another table called "tokens" (representing uuids that are embedded in URLs for online marketing.)

campaigns
---------
campaign_id
name

tokens
------
token_id
uuid
campaign_id

As you can probably see, each token is associated with a campaign. Selecting which campaign a specific token is associated with can be done using:

SELECT campaigns.name
FROM campaigns, tokens
WHERE tokens.campaign_id = campaigns.campaign_id
AND tokens.uuid = 'DC84A2FA-3048-2FCD-167A16251DEFC0CA'

That's easy. However, now I have another table (visitors) that represents visitors coming from our campaigns. It looks like:

visitors
--------
visitor_id
uuid

I have to write a query that shows me all visitors coming from a particular campaign. I've done it this way:

SELECT *
FROM visitors
WHERE visitor.uuid IN (
SELECT campaigns.uuid
FROM campaigns, tokens
WHERE tokens.campaign_id = campaigns.campaign_id
AND campaigns.name = 'yahoo' )

This works, but it started running very slowly once my data set got large. Can anyone suggest an alternative query that's faster?

Thanks!
- Bret
 
Perhaps something like this ?
SELECT V.*
FROM visitors V
INNER JOIN tokens T ON V.uuid = T.uuid
INNER JOIN campaigns C ON T.campaign_id = C.campaign_id
WHERE C.name = 'yahoo'

Hope This Helps, PH.
Want to get great answers to your Tek-Tips questions? Have a look at FAQ219-2884 or FAQ181-2886
 
You need to look at your indexes as well as your query.

How are these tables indexed?

Denny
MCSA (2003) / MCDBA (SQL 2000)
MCTS (SQL 2005 / Microsoft Windows SharePoint Services 3.0: Configuration / Microsoft Office SharePoint Server 2007: Configuration)
MCITP Database Administrator (SQL 2005) / Database Developer (SQL 2005)

--Anything is possible. All it takes is a little research. (Me)
[noevil]
 
I'll have to follow up tomorrow to be certain, but I think that's it! I'll have to read up on my inner joins to understand exactly what's going on.

Thanks!
- Bret
 
Hello!

I have a follow up question. The query above seemed to work fine and I'm trying to extend on it. I'm trying to list out the number of visitors coming in to the site with a specific uuid. The list should also contain the campaign name if it exists, or NULL otherwise.

In my previous example, I called the unique identifier "uuid", which is inconsistent with my *real* code. Just for completeness, I'll restate my real table structures:

visitors
========
visitor_id [int/pk]
client_id [int/fk]
ctoken [varchar (uuid)]
dt [date/time]

ctokens
=======
ctoken_id [int/pk]
ctoken [varchar (uuid)]
campaign_id [int/fk]

campaigns
=========
campaign_id [int/pk]
name [varchar]
client_id [int/fk]

clients
=======
client_id
name

My attempt at this query was to do a three-way outer join. (I mimicked an example from the book 'Learning SQL',p.182) Everything seemed to go great except for my Count():

Here's the query:

SELECT
visitors.ctoken as ctoken,
count( visitors.ctoken ) AS ctoken_count,
visitors.dt,
campaigns.name as campaign_name

FROM visitors
LEFT OUTER JOIN ctokens ON visitors.ctoken = ctokens.ctoken
LEFT OUTER JOIN campaigns ON ctokens.campaign_id = campaigns.campaign_id

WHERE visitors.client_id = #clients.client_id#
AND visitors.dt > '#session.sm_start_date#' and visitors.dt < '#session.sm_end_date#'

GROUP BY visitors.ctoken

My ctoken_count were way to high. Otherwise, everything looked great. If someone could explain what I'm doing wrong, that'd be great.

Thanks!
- Bret
 
It's because of the outer joins. Try using inner joins.

When you use outer joins if there is a record in the visitors table but no match in the ctokens table the visitors record will still show.

You can also change the count from count(visitors.ctoken) to count(ctokens.ctoken). That should give you a different number.

Denny
MCSA (2003) / MCDBA (SQL 2000)
MCTS (SQL 2005 / Microsoft Windows SharePoint Services 3.0: Configuration / Microsoft Office SharePoint Server 2007: Configuration)
MCITP Database Administrator (SQL 2005) / Database Developer (SQL 2005)

--Anything is possible. All it takes is a little research. (Me)
[noevil]
 
Thanks mrdenny,

I was using outer joins because I wanted visitor rows to show up regardless if they were associated with a campaign. I'll see what happens if I count(ctokens.ctoken) instead. I must admit, although I can handle simple queries, these big ones are a mystery. Are there any good books or sites that can teach me how to think about queries in a way that can reveal the logic behind them?

Thanks!
- Bret
 
Status
Not open for further replies.

Part and Inventory Search

Sponsor

Back
Top