After reading one more post on SitePoint forums about selecting not aggregated values of columns, which were not specified in the GROUP BY clause, I decided it is worth to highlight in my blog what is wrong with queries that make use of this approach.
At the start, I’d like to mention such queries are allowed only in MySQL, which is known for its unstrict behaviour. Other database servers with produce errors in such case.
One of the most common reasoning for such queries, for example:
select user_id , post_time from posts group by user_id
is the wrong assumption that query will return the most recent entry for ungrouped columns (latest post time for each user in this case).
As MySQL manual says, returned value for ungrouped column can be any value from the group it belongs to. So the query above, is free to return time of user’s any post.
A solution for such cases is to use an aggregate function:
select user_id , max(post_time) from posts group by user_id
This will guarant that returned post time for each group is the latest within the group.
Although, MySQL manual states that in case you are sure that values within a group are constant, using columns in SELECT (or HAVING), but not in GROUP BY can give performance increase by avoiding unnecessary grouping, I highly suggest to keep away from this technique as it is not standard (supported by MySQL only), and it makes behaviour of queries unclear, increasing possibility of queries working not as expected.

















