Tek-Tips is the largest IT community on the Internet today!

Members share and learn making Tek-Tips Forums the best source of peer-reviewed technical information on the Internet!

  • Congratulations strongm on being selected by the Tek-Tips community for having the most helpful posts in the forums last week. Way to Go!

Odd MAX() Results 1

Status
Not open for further replies.

PCHomepage

Programmer
Feb 24, 2009
609
US
I'm running a very simple query:

Code:
SELECT MAX(MailingDate), ID
FROM newsletter

but it seems to "think" that 1009411200 is larger than 1106870400 so keeps returning the former. What I am missing here?
 
what is the MailingDate field type? and what is the collation of the underlying table and database.

If you want the latest mailing date for an ID, you need to reverse the fields, too of course.

Code:
SELECT
  ID,
  MAX(CAST(MailingDate AS datetime)) as LastMailing
FROM 
  newsletter
GROUP BY
  ID asc
 
I finally figured it out. It turns out that it was pulling up the proper MailingDate entry but it was getting the ID from a different entry and it was the ID I was using to fetch the entry (in PHP) for viewing. It seems overly complicated and unnecessary so was there a simpler way of getting the required result?

Code:
SELECT ID FROM newsletters 
WHERE MailingDate 
IN (SELECT MAX(MailingDate) FROM newsletters)
 
i see. you were trying to find the last ID to be mailed.

Code:
SELECT      ID
FROM        newsletters
ORDER BY    CAST(MailingDate AS DATETIME) DESC
LIMIT       1
 
MailingDate is just simply an integer and the required entry is not necessarily the highest ID in the table because the entries were added manually after-the-fact in no particular order.

Your first code pulls up all entries without MailingDate values and your second also gives the wrong ID (same one as mine) but, as you possibly presumed that MailingDate was something special, it also gives other errors. Oddly, the ID it's fetching is the lowest one but I had tried reversing the order and it made no difference. My follow-up query above does work however.
 
the code ignores ID. it just returns the ID for the highest value of mailingdate. if there is more than one value of ID for which mailing date is the same and max, then the lower of the n IDs will be returned. you can change this behaviour by adding a second order by clause on ID desc

the first code i posted returns the last mailing date for each distinct ID. so if an ID were mailed ten times, it would return only one row for that ID (being the ID and the last date that it was mailed)
 
and if you want the last mailing date for a given id

Code:
select cast(mailingdate as datetime) 
from newsletter
where ID = ?
order by cast(mailingdate as datetime) desc
limit 1

your second post will return all the IDs which were mailshotted last. so if there were multiple sent at the same time, they will all be returned.
 
The ID is distinct and for only a single mailing so no dates are even close - most are several years apart. Your last code above, of course, requires the ID as an input but it is the ID I'm trying to fetch.

Although it's working now with my last posted query, what I can't figure out is how it can pull values from two separate lines in the table. This, for example, pulls the proper MailingDate value but the ID is from a different row:

Code:
SELECT ID, MAX(MailingDate) AS MailingDate
FROM newsletters
LIMIT 1

Remember, the MailingDate is simply an integer so using datetime codes has no significance.
 
Because you are using grouping functions without a group by clause.

I am still guessing at what you want to achieve by way of record set as you have not expressly said. From your last post you want to obtain a set of all ids and for each the Max value of mailing date. The assumption is that id is not the primary key here.

If so
Code:
Select id, max(maileddate)
From newsletters 
Group by id

I am assuming as well that the mailing date column is expressly set as an integer data type. Posting the results of a show columns query will help. If it is not then I suspect that casting the column to an integer will help with sorting.
Code:
Max(cast(mailingdate as unsigned integer))
 
No, I want simply the ID of the single record with the highest MailingDate value. Yes, MailingDate is an integer and it is an INT(10) field. No, I didn't use any grouping functions unless that's what MAX() is.

However, my follow-up posting to the original question is giving me the needed results but it seems overly complicated for something that should be simple and I still don't see why my original query was giving the mismatch between ID and MailingDate values. Anyway, I think we can give this a rest now and thank you for your help!
 
No, I want simply the ID of the single record with the highest MailingDate value.
ok. good to know.

Yes, MailingDate is an integer and it is an INT(10) field
good

No, I didn't use any grouping functions unless that's what MAX() is
yes. that is what max() is. it is known as a grouping, group by, aggregation or aggregate function. it and its co-functions work across sets of records. this is the root of your issues.

However, my follow-up posting to the original question is giving me the needed results
as I posted above - it may not always give you the desired results unless the newsletter mailing date is guaranteed always to be unique.

it seems overly complicated for something that should be simple
yes, although not terribly so, particularly if the mailingdate column is indexed. see the end of the post for why this type of solution may be the right solution in certain circumstances.

I still don't see why my original query was giving the mismatch between ID and MailingDate values

for the reasons i have posted several times. you have no group by clause.

from the manual
manual said:
If you use a group function in a statement containing no GROUP BY clause, it is equivalent to grouping on all rows.
Read further and in more detail: here

I think we can give this a rest now

you may feel that the topic is closed for you (in which case no need to reply) but this site is for all readers and so it may be useful to continue the thread and provide examples.

so here goes

consider this simple table

Code:
CREATE TABLE `newsletters` (
 `rowID` int(10) NOT NULL AUTO_INCREMENT,
 `ID` int(10) NOT NULL,
 `mailingdate` int(12) NOT NULL,
 PRIMARY KEY (`rowID`)
) ENGINE=InnoDB AUTO_INCREMENT=1001 DEFAULT CHARSET=latin1

then populate it with random data. I used 1000 rows populated with the following php script
Code:
<?php
date_default_timezone_set('UTC');
try{
	$pdo = new PDO(	'mysql:host=127.0.0.1;dbname=don;port=8889',
					'root',
					'root');
} catch (PDOException $e) {
    echo 'Connection failed: ' . $e->getMessage();
}
$s = $pdo->prepare("Insert into newsletters (rowID, ID, mailingdate) VALUES (?,?,?)");
if(!$s):
	die(print_r($pdo->errorinfo(), true));
endif;
for($i=0; $i<1000; $i++):
	$params = array(	NULL,
						floor(rand(1,10)),
						strtotime("+$i hours"));
	$s->execute($params);
endfor;

now let's consider some result sets
Code:
SQL query: SELECT MAX(MailingDate), ID FROM newsletters ; 
MAX(MailingDate)    ID
1388798277	    1
so, the mailing date provided is indeed the highest in the table, but the value for ID is meaningless as it is simply the first in the table. there is no group by clause (or equivalent where/having).

going back to your actual requirement - to find the ID with the highest mailingdate integer.

this can be achieved with an aggregate function. taking it one step at a time

Code:
SELECT ID, mailingdate
FROM newsletters 
group by ID
having max(mailingdate)
this returns a recordset of each ID and the latest mailing date for that ID.

this could equally be achieved without a having clause
Code:
SELECT ID, max(mailingdate)
FROM newsletters 
group by ID

the important thing here is the group by clause. without that a single row would get returned, typically of the first value for ID and the maximum mailingdate.

now, to retrieve JUST the ID with the highest mailing date, continuing with the group by functions one approach might be to order the recordset
Code:
SELECT    ID, mailingdate
FROM      newsletters 
group by  ID
order by  mailingdate desc
now we know that the ID is in the first row. we can limit the recordset to just the first row too, of course
Code:
SELECT    ID,mailingdate
FROM      newsletters 
group by  mailingdate desc, ID
LIMIT 1

None of these aggregations are necessary, however, in your case. My post of 22 Nov 13 @ 15:53 refers. Just select the whole recordset, order by mailingdate and limit.

Code:
Select    ID
from      newsletters
order by  mailingdate desc
limit     1

the casting I posted earlier was incorrect. I should, instead, have used from_unixtime

Code:
select    id, from_unixtime(mailingdate)
from      newsletters
order by  mailingdate desc
limit 1

but none of the above provide for the scenario where there is more than one row for a given mailing date - for example a mail-shot. so let's alter the recordset to provide for this circumstance. I used this php script to do so (it selects 5 rows at random and boots mailingdate long into the future)

Code:
date_default_timezone_set('UTC');
$pdo = new PDO(	'mysql:host=127.0.0.1;dbname=don;port=8889',
					'root',
					'root');
$range = range(1,1000);
shuffle($range);
$query = "update newsletters set mailingdate=? where rowID=?";
$s = $pdo->prepare($query);
$now = strtotime("+1 year");
for($i=0; $i<5; $i++):
	$r = $s->execute(array($now, $range[$i]));
endfor;

Code:
SELECT    group_concat(id), from_unixtime(mailingdate)
from      newsletters
group by  mailingdate desc
limit 1

this provides a comma separated list in the first returned column.

if you want the recordset as a normal unconcatenated set then you need to revert to your solution and use joined tables

Code:
SELECT  id
FROM    newsletters n1
JOIN    (
        SELECT MAX( mailingdate ) AS md
        FROM   newsletters
        )n2 
ON      n1.mailingdate = n2.md

this is relatively efficient. on my system, over a 1000 records, the query time on an unindexed column was 0.012 seconds and, once indexed, 0.0005 seconds. showing the difference indexing makes.

i see no particular difference between the explicit join and the implicit join-in-the-where that you employed. the former is preferable to me because it shows what is going on. but that's a personal style choice. a modern version of mysql will almost certainly optimise both the same way.

i trust this clarifies for future readers.
 
Status
Not open for further replies.

Part and Inventory Search

Sponsor

Back
Top