Tek-Tips is the largest IT community on the Internet today!

Members share and learn making Tek-Tips Forums the best source of peer-reviewed technical information on the Internet!

  • Congratulations gkittelson on being selected by the Tek-Tips community for having the most helpful posts in the forums last week. Way to Go!

DISTINCT still showing duplicate records 1

Status
Not open for further replies.

snowboardr

Programmer
Feb 22, 2002
1,401
PH

Why won't distinct work? The sql IS selecting correctly but its showling records with the same f.preplyid

Code:
SELECT DISTINCT (f.preplyid), f.pid, f.ptid, f.ptid2, f.puid, f.psubject, f.ptype, f.ptime, t1.tid, t1.tname tname1, t2.tid, t2.tname tname2 FROM forumpost as f, thread as t1, thread as t2 WHERE t1.tid = f.ptid AND t2.tid = f.ptid2 AND f.puid =2 AND f.ptype=1 AND f.puid NOT IN (SELECT fp.puid FROM forumpost as fp WHERE fp.ptype=0 AND fp.pid=f.preplyid) ORDER by f.ptime desc LIMIT 7
 

would you happen to know where you learned that sql?

because DISTINCT is not a function

i wish i knew where this mis-information is coming from

anyhow, if you want only one row per f.preplyid, which one would that be? how can you identify the row you want?

also, try not to use NOT EXISTS -- there's usually a LEFT OUTER JOIN that could work instead

(hmmm, i think i've seen your tables before somewhere...)

r937.com | rudy.ca
 
The term Distinct is used in MS Access SQL. Makes the recordset not include duplicates.
 
actually, DISTINCT is used in every variation of sql, including standard sql

and yes, it ensures that there are no duplicate rows in the result set -- complete rows

but i have seen DISTINCT (mis)used as a function many, many times -- see query above for a prime example

i'm really curious where this (mis)use comes from...

r937.com | rudy.ca
 
I looked it up here



So you think it would work on this one instead:


Code:
'ignore the single quotes etc 

		'sql = "SELECT "_
		'& "post.ptid," _
		'& "post.ptid2," _
		'& "post.puid," _
		'& "post.psubject," _
		'& "post.ptype," _
		'& "post.ptime," _
		'& "thread1.tid," _
		'& "thread1.tname tname1," _
		'& "thread2.tid," _
		'& "thread2.tname tname2" _
		'& " FROM forumpost as post INNER JOIN thread as thread1 ON thread1.tid = post.ptid INNER JOIN thread as thread2 ON thread2.tid = post.ptid2 WHERE post.puid = " & uid & " AND  post.ptype=0 ORDER by post.ptime desc LIMIT 7"
 
DISTINCT in that query wouldn't have any noticeable effect -- each row is already distinct because of the primary keys being selected

r937.com | rudy.ca
 
Well its returning records with the same preplyid...
 
Well its returning records with the same preplyid...
that's because DISTINCT is not a function


DISTINCT ensures distinct rows

examine the rows in their entirety -- you will find that each row is different

as i asked before, if you want only one row per f.preplyid, which one would that be? how can you identify the row you want?


r937.com | rudy.ca
 
It doesn't matter which one it is, because I am selecting "replys" so thus they all goto the same post. So basicly I am selecting every reply the user has made, but only 1 reply out of each "main post".
 
okay, making progress...

only one reply out of each "main post" -- but which reply?

(and what's a main post? and what's a reply? i see tables called forumpost and thread but not reply)

r937.com | rudy.ca
 
Okay

A main post has a post.ptype=0 and a reply post.ptype=1 also a reply will have a post.preplyid of the "main" post id

does that make sense
 
i'm sorry, you are going to have to offer a bit more information about your tables and columns and what each of them is for, and in particular how the tables elate to each other

r937.com | rudy.ca
 
okay..

forumpost

pid | ptid | ptid2 | puid | psubject | pbody | ptype | preplyid

1 10 20 3 subject mainpost.txt 0
2 10 20 2 subject reply.txt 1 1
3 10 20 2 subject reply.txt.2 1 1


thread

tid | tname
10 category1
20 category2


Okay the reason I am doing the subquery is because I need to select replies that the user didn't start. And the INNER JOIN part is so i can get the categorys the post is in... ie category1 & category2

Code:
INNER JOIN thread as thread1 ON thread1.tid = post.ptid INNER JOIN thread as thread2 ON thread2.tid = post.ptid2

As you can see in the above example, there is 2 posts by the same user puid 2 but I would like to select only 1 reply from each post that user puid=2 made, because all i am doing is outputing the "main post" link to click on for each post ... because if I dont then we would have pid1 subject twice in the list of "replies". This list of records is on the user's profile and th list is called "Posts I have participated in"

Jason
 
well, that helps, i think

so you are retrieving replies that puid 2 posted, and wish to make sure that these are replies to threads that puid 2 didn't himself start

and thus there are two rows in your sample data that meet this criterion

which of them do you want to return and why? this is the problem

r937.com | rudy.ca
 
I just want to return 1 reply by that user because the link is going to the main post anyways, i dont want multiple replys because they goto the same post


Both replys goto this post :

pid | ptid | ptid2 | puid | psubject | pbody | ptype | preplyid

1 10 20 3 subject mainpost.txt 0


I want to do a DISTINCT "of some sort" on preplyid so there is no duplicates of that particular ID.
 
okay, i think i get it now

try this --
Code:
select distinct
       main.pid
     , main.puid
     , main.psubject
     , main.ptime
     , main.ptid
     , t1.tname tname1
     , main.ptid2
     , t2.tname tname2 
  from forumpost as reply
inner
  join forumpost as main
    on main.pid = reply.preplyid 
   and main.ptype = 0
   and main.puid <> reply.puid
inner
  join thread as t1
    on t1.tid = main.ptid
inner
  join thread as t2 
    on t2.tid = main.ptid2 
 where reply.puid = 2 
   and reply.ptype = 1
order 
    by main.ptime desc limit 7

r937.com | rudy.ca
 
Hey r937, I am trying to figure out how to modify this to count the records that the below would return, the one i came up with return 19, and there is only 7 so I know im doing something wrong... Counts arn't my strong point and doing a INNER join... ha...

Code:
select distinct 
main.pid 
, count(main.puid)
, main.ptid 
, t1.tname tname1
, main.ptid2
, t2.tname tname2 
FROM
forumpost as reply 
INNER JOIN 
forumpost as main 
on main.pid = reply.preplyid  
AND main.ptype = 0 
AND main.puid <> reply.puid 
INNER JOIN 
thread as t1 on t1.tid = main.ptid 
INNER JOIN 
thread as t2 on t2.tid = main.ptid2 
WHERE reply.puid = 2 and reply.ptype = 1 order by main.ptime desc limit 7


Code:
select distinct 
count(main.pid) as testtest, 
main.puid,
main.ptid
FROM
forumpost as reply 
INNER JOIN 
forumpost as main 
on main.pid = reply.preplyid  
AND 
main.ptype = 0 
and 
main.puid <> reply.puid 
WHERE 
reply.puid = 2 and reply.ptype = 1 
GROUP by main.ptype

Army : Combat Engineer : 21B

 
what exactly are you trying to count? whatever it is, you need to make suer that every column in the SELECT (except the count) is in the GROUP BY

r937.com | rudy.ca
 
I am trying to count the records from that the SQL statement you helped me on before returns...

Code:
select distinct
       main.pid
     , main.puid
     , main.psubject
     , main.ptime
     , main.ptid
     , t1.tname tname1
     , main.ptid2
     , t2.tname tname2
  from forumpost as reply
inner
  join forumpost as main
    on main.pid = reply.preplyid
   and main.ptype = 0
   and main.puid <> reply.puid
inner
  join thread as t1
    on t1.tid = main.ptid
inner
  join thread as t2
    on t2.tid = main.ptid2
 where reply.puid = 2
   and reply.ptype = 1
order
    by main.ptime desc limit 7

Army : Combat Engineer : 21B

 
Status
Not open for further replies.

Part and Inventory Search

Sponsor

Back
Top