Dobrý den,
dobrá připomínka, na SELF JOIN jsem si nevzpomněl - a máte pravdu, že vůči korelovanému dotazu je to stále vynikající řešení (a to bez toho, že bych se uchýlil k uloženým procedurám).
--bez indexu postgres=# explain analyze select t1.id, t1.sale_date, t1.product, t1.sale_price, sum(t2.sale_price) from history t1 inner join history t2 on t1.id >= t2.id and t1.product = t2.product group by t1.id, t1.sale_date, t1.product, t1.sale_price order by t1.id ; QUERY PLAN -------------------------------------------------------------------------------------------------------------------------------------- Sort (cost=8241.68..8254.18 rows=5001 width=19) (actual time=2073.968..2081.068 rows=5001 loops=1) Sort Key: t1.id Sort Method: quicksort Memory: 480kB -> HashAggregate (cost=7871.91..7934.42 rows=5001 width=19) (actual time=2049.964..2060.161 rows=5001 loops=1) -> Merge Join (cost=824.54..6521.46 rows=108036 width=19) (actual time=83.099..1632.142 rows=171001 loops=1) Merge Cond: ((t1.product)::text = (t2.product)::text) Join Filter: (t1.id >= t2.id) -> Sort (cost=412.27..424.77 rows=5001 width=15) (actual time=43.820..51.850 rows=5001 loops=1) Sort Key: t1.product Sort Method: quicksort Memory: 324kB -> Seq Scan on history t1 (cost=0.00..105.01 rows=5001 width=15) (actual time=0.124..10.570 rows=5001 loops=1) -> Sort (cost=412.27..424.77 rows=5001 width=11) (actual time=39.241..605.987 rows=337001 loops=1) Sort Key: t2.product Sort Method: quicksort Memory: 324kB -> Seq Scan on history t2 (cost=0.00..105.01 rows=5001 width=11) (actual time=0.064..8.448 rows=5001 loops=1) Total runtime: 2087.921 ms (16 rows) postgres=# explain analyze SELECT sale_date, product, sale_price, COALESCE((SELECT SUM(sale_price) FROM history WHERE product = o.product AND id <= o.id), 0) as total from history o; query plan ----------------------------------------------------------------------------------------------------------------------------------------- seq scan on o (cost=0.00..438222.42 rows=5001 width=15) (actual time=0.193..8727.369 loops=1) subplan -> Aggregate (cost=87.60..87.61 rows=1 width=4) (actual time=1.735..1.736 rows=1 loops=5001) -> Index Scan using history_pkey on history (cost=0.00..87.59 rows=1 width=4) (actual time=0.804..1.670 rows=34 loops=5001) Index Cond: (id <= $1) filter: ((product)::text total runtime: 8746.833 ms (7 rows) postgres=# create index fxxx on history (product,id); CREATE INDEX Time: 59,664 ms postgres=# explain analyze select t1.id, t1.sale_date, t1.product, t1.sale_price, sum(t2.sale_price) from history t1 inner join history t2 on t1.id >= t2.id and t1.product = t2.product group by t1.id, t1.sale_date, t1.product, t1.sale_price order by t1.id ; QUERY PLAN ---------------------------------------------------------------------------------------------------------------------------------------- Sort (cost=3541.92..3554.42 rows=5001 width=19) (actual time=1524.078..1531.054 rows=5001 loops=1) Sort Key: t1.id Sort Method: quicksort Memory: 480kB -> HashAggregate (cost=3172.15..3234.66 rows=5001 width=19) (actual time=1500.061..1511.894 rows=5001 loops=1) -> Nested Loop (cost=0.00..1821.70 rows=108036 width=19) (actual time=0.345..1072.113 rows=171001 loops=1) -> Seq Scan on history t1 (cost=0.00..105.01 rows=5001 width=15) (actual time=0.118..7.966 rows=5001 loops=1) -> Index Scan using fxxx on history t2 (cost=0.00..0.33 rows=1 width=11) (actual time=0.022..0.087 rows=34 loops=5001) Index Cond: (((t2.product)::text = (t1.product)::text) AND (t1.id >= t2.id)) Total runtime: 1537.842 ms (9 rows) Time: 1541,110 ms postgres=# explain analyze select * from report1(); QUERY PLAN ------------------------------------------------------------------------------------------------------------------- Function Scan on report1 (cost=0.00..260.00 rows=1000 width=44) (actual time=227.831..234.816 rows=5001 loops=1) Total runtime: 241.232 ms (2 rows) Time: 241,836 ms
U mne je SELF JOIN cca 4x rychlejší než varianta používající korelovaný poddotaz. Nicméně uložená procedura běžela 0.27 ms, tj. minimálně 8x rychleji (JOIN), cca 24x rychleji než korelovaný dotaz - což je dost na to, abych preferoval uloženou proceduru nebo na Oraclu analytický dotaz.
díky za opravu a doplnění