|
[
Permalink
| « Hide
]
Dmitry Yemanov added a comment - 25/Aug/14 11:36 AM
Please test with a next snapshot build.
Test again on LI-T3.0.0.31301.
Results: ======= SQL> recreate table tn(x int primary key using index tn_x); commit; SQL> set plan on; SQL> select * from (select x a from tn) r join (select x b from tn) s on r.a = s.b; PLAN JOIN (R TN NATURAL, S TN INDEX (TN_X)) -- OK, because of index on `X` field SQL> select * from (select x a from tn) r join (select x b from tn) s on r.a+0 = s.b+0; PLAN HASH (S TN NATURAL, R TN NATURAL) -- OK: suppress index usage leads to HJ -------------- Now the same but with RDB$DB_KEY : SQL> select * from (select rdb$db_key a from tn) r join (select rdb$db_key b from tn) s on r.a = s.b; PLAN JOIN (R TN NATURAL, S TN INDEX ()) -- OK, because we did not add ||'' (analog for "+0" for numeric fields in join cond.) -- Now add this ||'': SQL> select * from (select rdb$db_key a from tn) r join (select rdb$db_key b from tn) s on r.a||'' = s.b||''; PLAN JOIN (S TN NATURAL, R TN INDEX ()) -- but can`t get HJ. PS. As workaround: hash join will be in use only if we do like this: SQL> select * from (select rdb$db_key||'' a from tn) r join (select rdb$db_key||'' b from tn) s on r.a = s.b; PLAN HASH (S TN NATURAL, R TN NATURAL) (i.e. move ||'' inside EVERY datasource from JOIN condition). PPS. I'm not sure is this related to source problem of ticket or no. It's not hard to 'move' all such ||'' inside datasources from join expression but it will be look as something 'not-common'. Dmitry,
This case seems to be related to core-4530 which you just opened, are they? Pavel, concatenation is somewhat specific operation for DBKEYs, there may be a different issue. I will investigate and open a separate ticket, if required.
HASH join often wins vs NL (maybe you remember: I did some tests few months ago and sent results to you).
IMO, it will be worth-while if HJ will be avaliable when joining using rdb$db_key - for instance, to avoid huge memory consuming when doing SORT on wide record. Of course, this will have sense only when OUTER hash joins will be implemented (to give us ability to specify order of join). Currently NL using RDB$DB_KEY *wins* 'plain old' SORT if TempCacheLimit set to default (small) 64 M (I'm speaking about SS). DDL: ==== recreate table t(id int primary key, x int, y int, s1 varchar(10000), s2 varchar(10000), s3 varchar(10000)); commit; insert into t select row_number()over() ,rand()*100 ,rand()*100 ,rpad('',10000, uuid_to_char(gen_uuid())) ,rpad('',10000, uuid_to_char(gen_uuid())) ,rpad('',10000, uuid_to_char(gen_uuid())) from rdb$types,rdb$types,(select 1 i from rdb$types) rows 100000; commit; Consider that we need to get such result: select x,s1,s2,s3 from t order by y; Test-1: sort only 'narrow' source + Nested Loops join with rest fields using RDB$DB_KEY: ====================================================================== commit; select mon$stat_id si,mon$stat_group sg,mon$max_memory_used mmu,mon$max_memory_allocated mma from mon$memory_usage order by sg,si; out /dev/null; set stat on; select y.x, y.s1, y.s2, y.s3 from( select rdb$db_key k from t order by y ) x left join t y on x.k=y.rdb$db_key; set stat off; out; commit; select mon$stat_id si,mon$stat_group sg,mon$max_memory_used mmu,mon$max_memory_allocated mma from mon$memory_usage order by sg,si; Test-2: 'plain old' way to SORT rows ============================ commit; select mon$stat_id si,mon$stat_group sg,mon$max_memory_used mmu,mon$max_memory_allocated mma from mon$memory_usage order by sg,si; out /dev/null; set stat on; select x,s1,s2,s3 from t order by y; -- no joins, but HUGE memory consumption (see below) set stat off; out; commit; select mon$stat_id si,mon$stat_group sg,mon$max_memory_used mmu,mon$max_memory_allocated mma from mon$memory_usage order by sg,si; RESULTS: ######## Trace for TEST-1: ============= select y.x, y.s1, y.s2, y.s3 from( select rdb$db_key k from t order by y ) x left join t y on x.k=y.rdb$db_key ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ Select Expression -> Nested Loop Join (outer) -> Sort (record length: 24, key length: 8) -> Table "T" as "X T" Full Scan -> Filter -> Table "T" as "Y" Access By ID -> DBKEY 100000 records fetched 4844 ms, 1001585 read(s), 1900056 fetch(es) Table Natural Index **************************************************** T 100000 100000 TempSpace usage for TEST-1: NO detected fb_sort_* files which exceeded 64M (command for watch: lsof -a +L1 /dev/shm) Trace for TEST-2: ============== select x,s1,s2,s3 from t order by y ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ Select Expression -> Sort (record length: 30040, key length: 8) -> Table "T" Full Scan 100000 records fetched 10328 ms, 497974 read(s), 1000056 fetch(es) Table Natural Index **************************************************** T 100000 TempSpace usage for TEST-2: COMMAND PID USER FD TYPE DEVICE SIZE/OFF NLINK NODE NAME firebird 29585 firebird 6u REG 0,17 869212160 0 2199948 /dev/shm/fb_sort_VBA1RC (deleted) Tue Aug 26 14:05:13 MSK 2014 firebird 29585 firebird 6u REG 0,17 1663041536 0 2199948 /dev/shm/fb_sort_VBA1RC (deleted) Tue Aug 26 14:05:15 MSK 2014 firebird 29585 firebird 6u REG 0,17 2389704704 0 2199948 /dev/shm/fb_sort_VBA1RC (deleted) Tue Aug 26 14:05:17 MSK 2014 firebird 29585 firebird 6u REG 0,17 3094347776 0 2199948 /dev/shm/fb_sort_VBA1RC (deleted) As we can see, ~3 Gb of memory was consumed for this sort. PS. Test with Hash Join show results worse than NL, but I had to run INCORRECT query: INNER join only can be used now. select y.x, y.s1, y.s2, y.s3 from( select rdb$db_key||'' k from t order by y ) x join t y on x.k||'' = y.rdb$db_key||'' -- ::: NB ::: unfortunately, only INNER join can be used now for hash ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ Select Expression -> Filter -> Hash Join (inner) -> Table "T" as "Y" Full Scan -> Record Buffer (record length: 25) -> Sort (record length: 24, key length: 8) -> Table "T" as "X T" Full Scan 100000 records fetched 6015 ms, 1500073 read(s), 2000112 fetch(es) Table Natural Index Updat ************************************************************ T 200000 But here also NO detected fb_sort_* files which exceeded 64M. |