Tek-Tips is the largest IT community on the Internet today!

Members share and learn making Tek-Tips Forums the best source of peer-reviewed technical information on the Internet!

  • Congratulations IamaSherpa on being selected by the Tek-Tips community for having the most helpful posts in the forums last week. Way to Go!

How to find the duplicate records in the database

Status
Not open for further replies.

chinnashara

Technical User
Dec 12, 2007
27
US
Can some one help me in figuring out the duplicate entries based on the below requirements.

I have a table with have a field for machine name
I have a table for the serial number for these machine names
I have a table for the date registered field for these machines.

As we can have many machines with the duplicate serial numbers, we need to figure out those machines with the duplicate serial numbers and having the oldest date registered.

EX: Machine Name Serial Number Date registered

Machine1 Ma1SNO1 31-11-07
Machine2 Ma1SNO1 20-11-07
Machine3 MA3SNO3 13-12-07
Machine5 MA4SNO4 14-9-07

Now, even Machine1 and Machine2 have the same serial number I need only the machine2 as it contains the oldest Date Registered


Please help me

Thanks in Advance
 
SELECT [Machine Name], [Serial Number], Min([Date Registered)
FROM TableName
GROUP BY [Machine Name], [Serial Number]

Leslie

In an open world there's no need for windows and gates
 
Perhaps:

Code:
SELECT tblMC.MachineName, tblMC.SerialNumber, tblMC.DateRegistered
FROM tblMC INNER JOIN [SELECT  SerialNumber, Max(DateRegistered) As MaxDate
FROM tblMC Group By SerialNumber]. AS McMax ON (tblMC.DateRegistered = McMax.MaxDate) AND (tblMC.SerialNumber = McMax.SerialNumber);

You may wish to post future SQL questions in forum701.
 
Thanks alot Lespaul,Remou. I am trying to map the queries you given with our DB. I will let you if I get any Issues..
Thanks alot again.

Regards,
Sharath
 
I'd stick to lespaul's answer if I were you [blush].
 
Hi lespaul/Remou,

Thanks for your quick help...Am good at netwoking and need this stuff do some investigation where am not much good :)

Let we take an other step to get this done.

We have 2 tables names Agent and Discovered and some field in these tables are....

Agent -- Object_uuid, Date
Discovered -- dis_hw_uuid, Host_name, Serial Number
(Here Object_uuid and dis_hw_uuid both these are same)


It seems we can get the duplicate serial Number from the table Discovered,but as we need Host name which has duplicate Serial number and with the oldest date from the Agent table.

Please help me on this...

Regards,
Sharath
 
ok maybe (typed untested, my best guess):

Code:
SELECT dis_hw_uuid, Host_name, [Serial Number], OldestDate FROM 
Discovered D
INNER JOIN (SELECT Object_uuid, Max([Date]) As OldestDate FROM Agent GROUP BY Object_uuid) A ON A.Object_uuid = D.Object_uuid
GROUP BY dis_hw_uuid, Host_name, [Serial Number], OldestDate
HAVING COUNT([Serial Number]) > 1
 
Here is the error message...

Invalid column name 'object_uuid'.


But we have the filed Object_uuid inthe Agent table...

And thanks for such a prompt response

 
copy and paste error:
Code:
SELECT dis_hw_uuid, Host_name, [Serial Number], OldestDate FROM
Discovered D
INNER JOIN (SELECT Object_uuid, Max([Date]) As OldestDate FROM Agent GROUP BY Object_uuid) A ON A.Object_uuid = [b]D.dis_hw_uuid[/b]
GROUP BY dis_hw_uuid, Host_name, [Serial Number], OldestDate
HAVING COUNT([Serial Number]) > 1

Leslie

In an open world there's no need for windows and gates
 
We just getting the fileds but no records are being generated...It doesnt mean we dont have any Machines with duplicate Serial Numbers :)


Below are the fields we got
dis_hw_uuid,Host_name, Serial Number and OldestDate

Thanks!!!
 
THose are the fields that I listed in the SELECT clause, if you need different/more fields feel free to add them or change them.

maybe it needs to be like this (again, typed, not tested, another guess):

Code:
SELECT * FROM (
SELECT dis_hw_uuid, Host_name, [Serial Number]
FROM Discovered D
GROUP BY dis_hw_uuid, Host_name
HAVING COUNT([Serial Number] > 1) D
INNER JOIN (SELECT Object_uuid, Max([Date]) As OldestDate FROM Agent GROUP BY Object_uuid) A ON A.Object_uuid = D.dis_hw_uuid

Depending on the number of records this may take a while to run. You may want to add an index to the tables on the relevant fields and that may speed it up.

If this doesn't work, can you provide some sample records and expected results....I can't but think there's an easier query for this, but I can't put my head around it!
 
Hi,

below are the error which iam getting when executing the query.

Server: Msg 170, Level 15, State 1, Line 5
Line 5: Incorrect syntax near '>'.
Server: Msg 170, Level 15, State 1, Line 6
Line 6: Incorrect syntax near 'A'.


Let me discuss what exactly we have and the required result..

As I told you We have 2 tables names Agent and Discovered and some field in these tables are....

Agent -- Object_uuid, Date
Discovered -- dis_hw_uuid, Host_name, Serial Number

Here is an example of what i really required...

AGENT-TABLE

Object_uuid date
000123456 22-10-2007
000123457 29-10-2007
000123458 24-5-2007
000123459 24-5-2007
000123460 20-10-2006
000123461 30-12-2004


DISCOVERED--TABLE


dis_hw_uuid Host_name Serial_number
000123456 LESPAUL1 SERIAL001
000123457 LESPAUL2 SERIAL002
000123458 LESPAUL3 SERIAL003
000123459 SHARATH1 SERIAL003
000123460 SHARATH2 SERILA002
000123461 SHARATH3 SERIAL003

Here from the discovered table we could see LESPAUL2,SHARATH2 are the Host_name which have the duplicate serial numbers…and when we map the UUID with the one in the AGENT Table we could see that these machines have the date as 29-10-2007 and 20-10-2007 respectively.

Now as the machine SHARATH2 has the oldest date..i want this machine to be the output…


And much thanksful to you for your time on this.


Have a great day..

regards,
Sharath.M
 
As it alomost very late nihgt for us..please i may not respond...will get you tomorow...

thanks..
 
ok, that helps, I think this may be my last guess! Hopefully this will do what you need:

Code:
SELECT dis_hw_uuid, Host_name, D.[SERIAL NUMBER], OldestDate 
FROM Discovered D
INNER JOIN (SELECT [SERIAL NUMBER] FROM DISCOVERED
GROUP BY [SERIAL NUMBER]
HAVING COUNT(*) > 1) D2 ON D.[SERIAL NUMBER] = D2.[SERIAL NUMBER]
INNER JOIN (SELECT Object_uuid, MIN(date) As OldestDate FROM Agents GROUP BY Object_uuid) A ON A.Object_uuid = D.dis_hw_uuid

Leslie

In an open world there's no need for windows and gates
 
select host_name

from agent ag,discovered_hardware ca,(select min(last_run_date) as lst_run_dt,serial_number

from agent,( select dis_hw_uuid,serial_number

from discovered_hardware

where serial_number in (SELECT serial_number

FROM discovered_hardware

group by serial_number HAVING COUNT(*) > 1)) i

where object_uuid = i.dis_hw_uuid

group by serial_number) i

where ag.last_run_date = i.lst_run_dt

and ca.serial_number=i.serial_number

and object_uuid=dis_hw_uuid

 
The above query worked fine for me...but infact it even had some problem like...if we have 3 machines with the same serial number, as i should exculde the one which have the latest date and include the rest two machines

Host_Name Serial Number Date
A1 a 21-dec-07
B1 a 30-sept-07
C1 a 4-jan-07

As i should get both the machines B1 and C1 my query displaying only C1 as it contains the oldest date...Any way i wll not have this sort of issue.

Thanks a lot for all your support...

I wll be on leave for the whole next week....

So I wish you a very Happy Xmas...


Bye
 
Status
Not open for further replies.

Part and Inventory Search

Sponsor

Back
Top