Postgresql extract last row for each id

 From sql - Postgresql extract last row for each id - Stack Overflow


(...)

For most scenarios, The most efficient way is to use GROUP BY

I saw the accepted answer which determine that using distinct on (id) id is The most efficient way to solve the problem which was described in the question but I believe it's extremely not accurate. Sadly I couldn't find any helpfull insights from POSTGRES doc' but I did find this article which refference few others and provide examples whereas

GROUP BY approach definitely leads to better performance


We had discussion over this subject at work and did a little experience over a table that holds some data about tags' blinks with 4,114,692 rows, and has indexes over tag_id and over timestamp (seperated indexes)

Here are the queries:

1.using ditinct:

select distinct on (tag_id) tag_id, timestamp, some_data 
from blinks 
order by id, timestamp desc;

2.using CTE + group by + join:

`with blink_last_timestamp as (
     select tag_id, max(timestamp) as max_timestamp
     from blinks 
     group by tag_id )
 select bl.tag_id, max_timestamp, some_data
 from blink_last_timestamp bl 
 join blinks b on 
     b.tag_id = bl.tag_id and 
     bd.timestamp = bl.max_timestamp` 

The results where unambiguous and favored the second solution for this scenario (Which is pretty generic in my opinion),

showing that it is being 10X times (!) faster 1655.991 ms (00:01.656) vs 16723.346 ms (00:16.723) and of course delivered the same data.

Comments

Popular posts from this blog

Add GitHub Repository to DBeaver CE Secured with SSH Protocol

Keyboard Shortcut to "Toggle Line Comments" in DBeaver

DBeaver Shortcuts