New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
HASH/MERGE JOIN is not used for more than two streams if they are joined via USING/NATURAL clauses and join is based on DBKEY concatenations [CORE4809] #5107
Comments
Commented by: Sean Leyne (seanleyne) It appears that USING is not SQL standard conformant -- it seems to be an MySQL'ism. |
Commented by: @pavel-zotov > USING is not SQL standard conforman Quote 5WD-02-Foundation-2003-09.pdf (SQL-2003), page 312: 7.7 <joined table> Format What about this ? PS. I was sure that JOIN USING & NATURAL JOIN both are just 'syntax sugar' in FB... :-) |
Commented by: Sean Leyne (seanleyne) Sorry, I had done some googling and could not find any usage aside from MySQL, hence the comment. |
Modified by: @dyemanovassignee: Dmitry Yemanov [ dimitr ] |
Modified by: @dyemanovVersion: 2.5.4 [ 10585 ] Version: 2.5.3 Update 1 [ 10650 ] Version: 2.1.7 [ 10651 ] Version: 3.0 Beta 1 [ 10332 ] Version: 2.5.3 [ 10461 ] Version: 2.1.6 [ 10460 ] Version: 3.0 Alpha 2 [ 10560 ] Version: 3.0 Alpha 1 [ 10331 ] Version: 2.5.2 Update 1 [ 10521 ] Version: 2.1.5 Update 1 [ 10522 ] Version: 2.5.2 [ 10450 ] Version: 2.5.1 [ 10333 ] Version: 2.5.0 [ 10221 ] Component: Engine [ 10000 ] summary: Force optimizer to consider usage of HASH JOIN when number of data sources is three and more and they are joined on USING(<col>) or by NATURAL clauses => Force optimizer to consider usage of HASH/MERGE JOIN when number of data sources is three and more and they are joined on USING(<col>) or by NATURAL clauses Version: 3.0 Beta 2 [ 10586 ] => |
Commented by: @pavel-zotov If key for join is changed from RDB$DB_KEY to common field than all become OK in 3.0: optimizer DOES make plans with HASH join in all cases (checked on WI-T3.0.0.31846): recreate table tn(id_key int primary key using index tn_id_key); set planonly; ----------- test `traditional` join form ----------------- select count(*) PLAN HASH (S TN NATURAL, R TN NATURAL) select count(*) PLAN HASH (HASH (T TN NATURAL, S TN NATURAL), R TN NATURAL) select count(*) PLAN HASH (HASH (HASH (U TN NATURAL, T TN NATURAL), S TN NATURAL), R TN NATURAL) ----------- test join on named columns form ----------------- select count(*) PLAN HASH (S TN NATURAL, R TN NATURAL) select count(*) PLAN HASH (T TN NATURAL, HASH (S TN NATURAL, R TN NATURAL)) select count(*) PLAN HASH (U TN NATURAL, HASH (T TN NATURAL, HASH (S TN NATURAL, R TN NATURAL))) ----------- test natural join form ----------------- select count(*) PLAN HASH (S TN NATURAL, R TN NATURAL) select count(*) PLAN HASH (T TN NATURAL, HASH (S TN NATURAL, R TN NATURAL)) select count(*) PLAN HASH (U TN NATURAL, HASH (T TN NATURAL, HASH (S TN NATURAL, R TN NATURAL))) |
Modified by: @dyemanovissuetype: Improvement [ 4 ] => Bug [ 1 ] summary: Force optimizer to consider usage of HASH/MERGE JOIN when number of data sources is three and more and they are joined on USING(<col>) or by NATURAL clauses => HASH/MERGE JOIN is not used for more than two streams if they are joined via USING/NATURAL clauses and join is based on DBKEY concatenations |
Modified by: @dyemanovstatus: Open [ 1 ] => Resolved [ 5 ] resolution: Fixed [ 1 ] Fix Version: 3.0 Beta 2 [ 10586 ] Fix Version: 2.5.5 [ 10670 ] |
Modified by: @pavel-zotovstatus: Resolved [ 5 ] => Resolved [ 5 ] QA Status: Done successfully Test Details: Test verifies only 3.0. For 2.5.x see test for CORE4822. |
Modified by: @pavel-zotovstatus: Resolved [ 5 ] => Closed [ 6 ] |
Submitted by: @pavel-zotov
Votes: 1
recreate table tn(x int primary key using index tn_x);
commit;
insert into tn select row_number()over() from rdb$types;
commit;
set planonly;
----------- test `traditional` join form -----------------
select count(*)
from (select rdb$db_key||'' a from tn) r
join (select rdb$db_key||'' a from tn) s on r.a = s.a;
PLAN HASH (S TN NATURAL, R TN NATURAL)
select count(*)
from (select rdb$db_key||'' a from tn) r
join (select rdb$db_key||'' a from tn) s on r.a = s.a
join (select rdb$db_key||'' a from tn) t on s.a = t.a;
PLAN HASH (HASH (T TN NATURAL, S TN NATURAL), R TN NATURAL)
select count(*)
from (select rdb$db_key||'' a from tn) r
join (select rdb$db_key||'' a from tn) s on r.a = s.a
join (select rdb$db_key||'' a from tn) t on s.a = t.a
join (select rdb$db_key||'' a from tn) u on t.a = u.a;
PLAN HASH (HASH (HASH (U TN NATURAL, T TN NATURAL), S TN NATURAL), R TN NATURAL)
Conclusion: ALL OK for this form of joins, optimizer DOES take in account ability to apply HJ regardless of data sources number.
----------- test join on named columns form -----------------
select count(*)
from (select rdb$db_key||'' a from tn) r
join (select rdb$db_key||'' a from tn) s using(a);
PLAN HASH (S TN NATURAL, R TN NATURAL)
select count(*)
from (select rdb$db_key||'' a from tn) r
join (select rdb$db_key||'' a from tn) s using(a)
join (select rdb$db_key||'' a from tn) t using(a);
PLAN JOIN (HASH (S TN NATURAL, R TN NATURAL), T TN NATURAL) ----------- NESTED LOOPS on outer phase
select count(*)
from (select rdb$db_key||'' a from tn) r
join (select rdb$db_key||'' a from tn) s using(a)
join (select rdb$db_key||'' a from tn) t using(a)
join (select rdb$db_key||'' a from tn) u using(a);
PLAN JOIN (HASH (S TN NATURAL, R TN NATURAL), T TN NATURAL, U TN NATURAL) ----------- NESTED LOOPS on outer phase
----------- test natural join form -----------------
select count(*)
from (select rdb$db_key||'' a from tn) r
natural join (select rdb$db_key||'' a from tn) s;
PLAN HASH (S TN NATURAL, R TN NATURAL)
select count(*)
from (select rdb$db_key||'' a from tn) r
natural join (select rdb$db_key||'' a from tn) s
natural join (select rdb$db_key||'' a from tn) t;
PLAN JOIN (HASH (S TN NATURAL, R TN NATURAL), T TN NATURAL) ----------- NESTED LOOPS on outer phase
select count(*)
from (select rdb$db_key||'' a from tn) r
natural join (select rdb$db_key||'' a from tn) s
natural join (select rdb$db_key||'' a from tn) t
natural join (select rdb$db_key||'' a from tn) u;
PLAN JOIN (HASH (S TN NATURAL, R TN NATURAL), T TN NATURAL, U TN NATURAL) ----------- NESTED LOOPS on outer phase
Commits: cb6ddcd 07263d6 FirebirdSQL/fbt-repository@f697ea0 FirebirdSQL/fbt-repository@f93b285
====== Test Details ======
Test verifies only 3.0. For 2.5.x see test for CORE4822.
The text was updated successfully, but these errors were encountered: