Issue Details (XML | Word | Printable)

Key: CORE-4528
Type: Improvement Improvement
Status: Resolved Resolved
Resolution: Fixed
Priority: Minor Minor
Assignee: Dmitry Yemanov
Reporter: Pavel Zotov
Votes: 0
Watchers: 1
Operations

If you were logged in you would be able to see more operations.
Firebird Core

Allow hash/merge joins for non-field (dbkey or derived expression) equalities

Created: 25/Aug/14 09:08 AM   Updated: 27/May/15 04:52 PM
Component/s: Engine
Affects Version/s: 2.1.5, 2.5.2, 2.1.5 Update 1, 2.5.2 Update 1, 3.0 Alpha 1, 3.0 Alpha 2, 2.1.6, 2.5.3
Fix Version/s: 3.0 Beta 1

QA Status: Done successfully
Test Details:
As of build #31840 (~20-may-2015), join of three and more sources which is specified by `USING` or `NATURAL` clauses will use nested loops.
See also: CORE-4809


 Description  « Hide
Test case:

recreate table tn(x int);
commit;
set planonly;

var #1
--------
SQL> select * from (select x a from tn) r join (select x b from tn) s on r.a = s.b;

PLAN HASH (S TN NATURAL, R TN NATURAL) -- OK

var #2
--------
SQL> select * from (select rdb$db_key||'' a from tn) r join (select rdb$db_key||'' b from tn) s on r.a = s.b;

PLAN JOIN (R TN NATURAL, S TN NATURAL) -- hash join could be here


 All   Comments   Change History   Subversion Commits      Sort Order: Ascending order - Click to sort in descending order
Dmitry Yemanov made changes - 25/Aug/14 10:51 AM
Field Original Value New Value
Assignee Dmitry Yemanov [ dimitr ]
Dmitry Yemanov made changes - 25/Aug/14 11:32 AM
Summary Improve plan when self-join table using rdb$db_key Allow hash/merge joins for non-field (dbkey or derived expression) equalities
Affects Version/s 2.5.3 [ 10461 ]
Affects Version/s 2.1.6 [ 10460 ]
Affects Version/s 3.0 Alpha 1 [ 10331 ]
Affects Version/s 2.5.2 Update 1 [ 10521 ]
Affects Version/s 2.1.5 Update 1 [ 10522 ]
Affects Version/s 2.5.2 [ 10450 ]
Affects Version/s 2.1.5 [ 10420 ]
Component/s Engine [ 10000 ]
Dmitry Yemanov made changes - 25/Aug/14 11:35 AM
Status Open [ 1 ] Resolved [ 5 ]
Fix Version/s 3.0 Beta 1 [ 10332 ]
Resolution Fixed [ 1 ]
Dmitry Yemanov made changes - 25/Aug/14 01:26 PM
Description Test #1
======
recreate table tn(x int);
commit;
set planonly;

var #1
--------
SQL> select * from (select x a from tn) r join (select x b from tn) s on r.a = s.b;

PLAN HASH (S TN NATURAL, R TN NATURAL) -- OK

var #2
--------
SQL> select * from (select rdb$db_key||'' a from tn) r join (select rdb$db_key||'' b from tn) s on r.a = s.b;

PLAN JOIN (R TN NATURAL, S TN NATURAL) -- hash join could be here

Test #2
======
create sequence g; commit;
recreate table t( id int primary key, s varchar(36) ); commit;
insert into t
select i,s from
(
  select gen_id(g,1) i, uuid_to_char(gen_uuid()) s
  from rdb$types a,rdb$types b,(select 1 i from rdb$types rows 20) rows 10000
)
order by rand();
commit;

var #1 (emulating left join with select only one field from driven table)
--------
select count(*) from (
  select x.id, (select z.s from t z where z.rdb$db_key = x.k) s
  from (select id, rdb$db_key k from t a order by id) x
)

-- works very fast:

Select Expression
    -> Singularity Check
        -> Filter
            -> Table "T" as "Z" Access By ID
                -> DBKEY
Select Expression
    -> Aggregate
        -> Materialize
            -> Table "T" as "X A" Access By ID
                -> Index "RDB$PRIMARY2" Full Scan
1 records fetched
     31 ms, 50015 fetch(es)

Table Natural Index
****************************************************
T 20000

var #2 (common left join using rdb$db_key):
--------
select count(*) from (
  select x.id,z.s
  from (select id, rdb$db_key k from t a order by id) x
  left join t z on x.k = z.rdb$db_key
);

-- works very slow:

# LI-T3.0.0.31288:
Select Expression
    -> Aggregate
        -> Nested Loop Join (outer)
            -> Table "T" as "X A" Access By ID
                -> Index "RDB$PRIMARY2" Full Scan
            -> Filter
                -> Table "T" as "Z" Full Scan
1 records fetched
  67248 ms, 227 read(s), 204300015 fetch(es)

Table Natural Index
****************************************************
T 100000000 10000

== vs 2.5: ==

# LI-V2.5.3.26790:
  49146 ms, 204300015 fetch(es)

Table Natural Index Update
***************************************************************
T 100000000 10000
Test case:

recreate table tn(x int);
commit;
set planonly;

var #1
--------
SQL> select * from (select x a from tn) r join (select x b from tn) s on r.a = s.b;

PLAN HASH (S TN NATURAL, R TN NATURAL) -- OK

var #2
--------
SQL> select * from (select rdb$db_key||'' a from tn) r join (select rdb$db_key||'' b from tn) s on r.a = s.b;

PLAN JOIN (R TN NATURAL, S TN NATURAL) -- hash join could be here
Pavel Zotov made changes - 27/May/15 04:52 PM
Status Resolved [ 5 ] Resolved [ 5 ]
Test Details As of build #31840 (~20-may-2015), join of three and more sources which is specified by `USING` or `NATURAL` clauses will use nested loops.
See also: CORE-4809
QA Status Done successfully