Tek-Tips is the largest IT community on the Internet today!

Members share and learn making Tek-Tips Forums the best source of peer-reviewed technical information on the Internet!

  • Congratulations Chris Miller on being selected by the Tek-Tips community for having the most helpful posts in the forums last week. Way to Go!

Compare 2 DB with a NOT EXISTS Clause

Status
Not open for further replies.

glthornton

Programmer
Oct 19, 2005
108
US
Hi, I'm trying to compare 2 separate DB (SQL 2000 Server) tables to determine which Medical Record numbers are missing. I've developed a query to do that and it executes successfully, but I am not getting the end result that I am anticipating. Here is my query:

Code:
SELECT DISTINCT MR_NUMBER
FROM OPENDATASOURCE('SQLOLEDB','Data Source=MAH-SQL1;User ID=sa;Password=').EHRSQL.dbo.od_ar_demographics
WHERE NOT EXISTS (SELECT DISTINCT id.value
FROM OPENDATASOURCE('SQLOLEDB','Data Source=MAH-ORION;User ID=sa;Password=').empi.dbo.patient pt INNER JOIN OPENDATASOURCE('SQLOLEDB','Data Source=MAH-ORION;User ID=sa;Password=').empi.dbo.name nm ON pt.uniqueid=nm.patientid INNER JOIN OPENDATASOURCE('SQLOLEDB','Data Source=MAH-ORION;User ID=sa;Password=').empi.dbo.identifier id ON nm.patientid=id.patientid
WHERE id.assigningauthnamespace='MRN' AND ISNUMERIC(id.value)=1 AND id.value <>'.')

If I run each query separately, I get the expected results from each table/DB. I was curious to know if my 'NOT EXISTS' has problems comparing one 'VarChar' field to another? The MR_NUMBER field is defined as VarChar 11 and the id.value field is defined as VarChar 255. Can someone please let me know what I could be doing wrong here? For this query to work properly, I am expecting to get a result of about 924 MR_NUMBERS.

Any and all assistance would greatly be appreciated. Cheers, Glenn
 
The problem is the syntax. The query you're executing looks like this in natural language:

[tt]When no rows at all exist in MAH-ORION.empi.dbo.identifier, select all rows from od_ar_demographics;
when at least one row exists in MAH-ORION.empi.dbo.identifier, select no rows from od_ar_demographics.[/tt]

Here's a modified query that should give what you want, parts changed in bold:

Code:
SELECT DISTINCT MR_NUMBER
FROM
   OPENDATASOURCE('SQLOLEDB','Data Source=MAH-SQL1;User ID=sa;Password=').EHRSQL.dbo.od_ar_demographics [b]d[/b]
WHERE NOT EXISTS (
   SELECT [b]*[/b]
   FROM
      OPENDATASOURCE('SQLOLEDB','Data Source=MAH-ORION;User ID=sa;Password=').empi.dbo.patient pt
      INNER JOIN OPENDATASOURCE('SQLOLEDB','Data Source=MAH-ORION;User ID=sa;Password=').empi.dbo.name nm ON pt.uniqueid=nm.patientid
      INNER JOIN OPENDATASOURCE('SQLOLEDB','Data Source=MAH-ORION;User ID=sa;Password=').empi.dbo.identifier id ON nm.patientid=id.patientid
   WHERE
      id.assigningauthnamespace='MRN'
      AND ISNUMERIC(id.value)=1
      AND id.value <>'.'
      [b]AND id.value = d.value[/b]
)
The query you presented is a simple subquery. With the correlation clause I added in bold, it becomes a correlated subquery, however because it is inside an EXISTS clause it becomes an outer semi-join.

Look at the execution plan for your original query and the new one, and you'll see what I'm talking about.

Last, if this query is slow, try removing tables from the EXISTS query. Can't you just use the identifier table without joining to the other tables?

You also might try

WHERE id.value NOT LIKE '%[^0-9]%'

in place of the isnumeric clause above.

[COLOR=black #e0e0e0]For SQL and technical ideas, visit my blog, Squared Thoughts.

[sub]The best part about anything that has cheese is the cheese.[/sub][/color]
 
Status
Not open for further replies.

Part and Inventory Search

Sponsor

Back
Top