Postgresql extract last row for each id
- Get link
- X
- Other Apps
From sql - Postgresql extract last row for each id - Stack Overflow
(...)
For most scenarios, The most efficient way is to use GROUP BY
I saw the accepted answer which determine that using distinct on (id) id
is The most efficient way to solve the problem which was described in the question but I believe it's extremely not accurate. Sadly I couldn't find any helpfull insights from POSTGRES doc' but I did find this article which refference few others and provide examples whereas
GROUP BY approach definitely leads to better performance
We had discussion over this subject at work and did a little experience over a table that holds some data about tags' blinks with 4,114,692 rows, and has indexes over tag_id and over timestamp (seperated indexes)
Here are the queries:
1.using ditinct:
select distinct on (tag_id) tag_id, timestamp, some_data
from blinks
order by id, timestamp desc;
2.using CTE + group by + join:
`with blink_last_timestamp as (
select tag_id, max(timestamp) as max_timestamp
from blinks
group by tag_id )
select bl.tag_id, max_timestamp, some_data
from blink_last_timestamp bl
join blinks b on
b.tag_id = bl.tag_id and
bd.timestamp = bl.max_timestamp`
The results where unambiguous and favored the second solution for this scenario (Which is pretty generic in my opinion),
showing that it is being 10X times (!) faster 1655.991 ms (00:01.656) vs 16723.346 ms (00:16.723) and of course delivered the same data.
- Get link
- X
- Other Apps
Comments
Post a Comment