-
-
Notifications
You must be signed in to change notification settings - Fork 233
Allow hash/merge joins for non-field (dbkey or derived expression) equalities [CORE4528] #4846
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Comments
Modified by: @dyemanovassignee: Dmitry Yemanov [ dimitr ] |
Modified by: @dyemanovVersion: 2.5.3 [ 10461 ] Version: 2.1.6 [ 10460 ] Version: 3.0 Alpha 1 [ 10331 ] Version: 2.5.2 Update 1 [ 10521 ] Version: 2.1.5 Update 1 [ 10522 ] Version: 2.5.2 [ 10450 ] Version: 2.1.5 [ 10420 ] Component: Engine [ 10000 ] summary: Improve plan when self-join table using rdb$db_key => Allow hash/merge joins for non-field (dbkey or derived expression) equalities |
Modified by: @dyemanovstatus: Open [ 1 ] => Resolved [ 5 ] resolution: Fixed [ 1 ] Fix Version: 3.0 Beta 1 [ 10332 ] |
Commented by: @dyemanov Please test with a next snapshot build. |
Modified by: @dyemanovdescription: Test #1recreate table tn(x int); var #1 PLAN HASH (S TN NATURAL, R TN NATURAL) -- OK var #2 PLAN JOIN (R TN NATURAL, S TN NATURAL) -- hash join could be here Test #2create sequence g; commit; var #1 (emulating left join with select only one field from driven table) -- works very fast: Select Expression Table Natural Index var #2 (common left join using rdb$db_key): -- works very slow: # LI-T3.0.0.31288: Table Natural Index == vs 2.5: == # LI-V2.5.3.26790: Table Natural Index Update => Test case: recreate table tn(x int); var #1 PLAN HASH (S TN NATURAL, R TN NATURAL) -- OK var #2 PLAN JOIN (R TN NATURAL, S TN NATURAL) -- hash join could be here |
Commented by: @pavel-zotov Test again on LI-T3.0.0.31301. Results:SQL> recreate table tn(x int primary key using index tn_x); commit; PLAN JOIN (R TN NATURAL, S TN INDEX (TN_X)) -- OK, because of index on `X` field SQL> select * from (select x a from tn) r join (select x b from tn) s on r.a+0 = s.b+0; PLAN HASH (S TN NATURAL, R TN NATURAL) -- OK: suppress index usage leads to HJ -------------- Now the same but with RDB$DB_KEY : SQL> select * from (select rdb$db_key a from tn) r join (select rdb$db_key b from tn) s on r.a = s.b; PLAN JOIN (R TN NATURAL, S TN INDEX ()) -- OK, because we did not add ||'' (analog for "+0" for numeric fields in join cond.) -- Now add this ||'': PLAN JOIN (S TN NATURAL, R TN INDEX ()) -- but can`t get HJ. PS. As workaround: hash join will be in use only if we do like this: SQL> select * from (select rdb$db_key||'' a from tn) r join (select rdb$db_key||'' b from tn) s on r.a = s.b; PLAN HASH (S TN NATURAL, R TN NATURAL) (i.e. move ||'' inside EVERY datasource from JOIN condition). PPS. I'm not sure is this related to source problem of ticket or no. It's not hard to 'move' all such ||'' inside datasources from join expression but it will be look as something 'not-common'. |
Commented by: Sean Leyne (seanleyne) Dmitry, This case seems to be related to core-4530 which you just opened, are they? |
Commented by: @dyemanov No, they're unrelated. |
Commented by: @dyemanov Pavel, concatenation is somewhat specific operation for DBKEYs, there may be a different issue. I will investigate and open a separate ticket, if required. |
Commented by: @pavel-zotov HASH join often wins vs NL (maybe you remember: I did some tests few months ago and sent results to you). IMO, it will be worth-while if HJ will be avaliable when joining using rdb$db_key - for instance, to avoid huge memory consuming when doing SORT on wide record. Currently NL using RDB$DB_KEY *wins* 'plain old' SORT if TempCacheLimit set to default (small) 64 M (I'm speaking about SS). DDL:recreate table t(id int primary key, x int, y int, s1 varchar(10000), s2 varchar(10000), s3 varchar(10000)); commit; Consider that we need to get such result: select x,s1,s2,s3 from t order by y; Test-1: sort only 'narrow' source + Nested Loops join with rest fields using RDB$DB_KEY:commit; select mon$stat_id si,mon$stat_group sg,mon$max_memory_used mmu,mon$max_memory_allocated mma from mon$memory_usage order by sg,si; select y.x, y.s1, y.s2, y.s3 set stat off; Test-2: 'plain old' way to SORT rowscommit; select mon$stat_id si,mon$stat_group sg,mon$max_memory_used mmu,mon$max_memory_allocated mma from mon$memory_usage order by sg,si; RESULTS: Trace for TEST-1:select y.x, y.s1, y.s2, y.s3 Table Natural Index TempSpace usage for TEST-1: NO detected fb_sort_* files which exceeded 64M (command for watch: lsof -a +L1 /dev/shm) Trace for TEST-2:select x,s1,s2,s3 from t order by y Table Natural Index TempSpace usage for TEST-2: COMMAND PID USER FD TYPE DEVICE SIZE/OFF NLINK NODE NAME Tue Aug 26 14:05:13 MSK 2014 Tue Aug 26 14:05:15 MSK 2014 Tue Aug 26 14:05:17 MSK 2014 As we can see, ~3 Gb of memory was consumed for this sort. PS. Test with Hash Join show results worse than NL, but I had to run INCORRECT query: INNER join only can be used now. select y.x, y.s1, y.s2, y.s3 Table Natural Index Updat But here also NO detected fb_sort_* files which exceeded 64M. |
Modified by: @pavel-zotovstatus: Resolved [ 5 ] => Resolved [ 5 ] QA Status: Done successfully Test Details: As of build #31840 (~20-may-2015), join of three and more sources which is specified by `USING` or `NATURAL` clauses will use nested loops. |
Submitted by: @pavel-zotov
Test case:
recreate table tn(x int);
commit;
set planonly;
var #1
--------
SQL> select * from (select x a from tn) r join (select x b from tn) s on r.a = s.b;
PLAN HASH (S TN NATURAL, R TN NATURAL) -- OK
var #2
--------
SQL> select * from (select rdb$db_key||'' a from tn) r join (select rdb$db_key||'' b from tn) s on r.a = s.b;
PLAN JOIN (R TN NATURAL, S TN NATURAL) -- hash join could be here
Commits: ee30e8e FirebirdSQL/fbt-repository@ee218fc
====== Test Details ======
As of build #31840 (~20-may-2015), join of three and more sources which is specified by `USING` or `NATURAL` clauses will use nested loops.
See also: CORE4809
The text was updated successfully, but these errors were encountered: