Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Predicate IS [NOT] DISTINCT FROM is not pushed into unions/aggregates thus causing sub-optimal plans [CORE4921] #1956

Closed
firebird-automations opened this issue Sep 1, 2015 · 5 comments

Comments

@firebird-automations
Copy link
Collaborator

Submitted by: @pavel-zotov

Consider following script:

============================================
create or alter view v_test as select 1 id from rdb$database;
commit;

recreate table t1(x int, y int);
create index t1_x on t1(x);
create index t1_yx on t1(y, x);

recreate table t2(x int, y int);
create index t2_x on t2(x);
create index t2_yx on t2(y, x);

recreate table t3(x int, y int);
create index t3_x on t3(x);
create index t3_yx on t3(y, x);
commit;

create or alter view v_test as
select * from t1
union all
select * from t2
union all
select * from t3;
commit;

set explain on;
set planonly;

set echo on;

select * from v_test where x = 1; --- case #⁠1

select * from v_test where x is not distinct from 1; -- case #⁠2

select * from v_test where y = 1 and x = 1; -- case #⁠3

select * from v_test where y = 1 and x is not distinct from 1; -- case #⁠4

-- Unnest view, check for usage of single-field index:

select * from t1 where x is not distinct from 1
union all
select * from t2 where x is not distinct from 1
union all
select * from t3 where x is not distinct from 1
;

-- Unnest view, check for usage of compound index:

select * from t1 where y = 1 and x is not distinct from 1
union all
select * from t2 where y = 1 and x is not distinct from 1
union all
select * from t3 where y = 1 and x is not distinct from 1
;

Output of this script on WI-V3.0.0.32008:
select * from v_test where x = 1; --- case #⁠1

Select Expression
-> Filter
-> Union
-> Filter
-> Table "T1" as "V_TEST T1" Access By ID
-> Bitmap
-> Index "T1_X" Range Scan (full match)
-> Filter
-> Table "T2" as "V_TEST T2" Access By ID
-> Bitmap
-> Index "T2_X" Range Scan (full match)
-> Filter
-> Table "T3" as "V_TEST T3" Access By ID
-> Bitmap
-> Index "T3_X" Range Scan (full match)

// So far, so good. Optimizer DOES push predicate "X = 1" inside view and does apply it then for each of view parts.

select * from v_test where x is not distinct from 1; -- case #⁠2

Select Expression
-> Filter
-> Union
-> Table "T1" as "V_TEST T1" Full Scan
-> Table "T2" as "V_TEST T2" Full Scan
-> Table "T3" as "V_TEST T3" Full Scan

// BAD. Why predicate 'x IS DISTINCT FROM 1' is worse than 'X = 1' ? :(

select * from v_test where y = 1 and x = 1; -- case #⁠3

Select Expression
-> Filter
-> Union
-> Filter
-> Table "T1" as "V_TEST T1" Access By ID
-> Bitmap
-> Index "T1_YX" Range Scan (full match)
-> Filter
-> Table "T2" as "V_TEST T2" Access By ID
-> Bitmap
-> Index "T2_YX" Range Scan (full match)
-> Filter
-> Table "T3" as "V_TEST T3" Access By ID
-> Bitmap
-> Index "T3_YX" Range Scan (full match)

// OK, as in case #⁠1: optimizer DOES push predicate inside view and uses scanning on compound index T_YX.

select * from v_test where y = 1 and x is not distinct from 1; -- case #⁠4

Select Expression
-> Filter
-> Union
-> Filter
-> Table "T1" as "V_TEST T1" Access By ID
-> Bitmap
-> Index "T1_YX" Range Scan (partial match: 1/2)
-> Filter
-> Table "T2" as "V_TEST T2" Access By ID
-> Bitmap
-> Index "T2_YX" Range Scan (partial match: 1/2)
-> Filter
-> Table "T3" as "V_TEST T3" Access By ID
-> Bitmap
-> Index "T3_YX" Range Scan (partial match: 1/2)

// Not so bad as in case #⁠2 but why only PARTIAL matching ? If this index will contain lot of keys with the same starting part we'll get poor performance becase of excessive fetches. Probably, such query will run worse than natural scan because of huge random IO.

-- Unnest view, check for usage of single-field index:

select * from t1 where x is not distinct from 1
union all
select * from t2 where x is not distinct from 1
union all
select * from t3 where x is not distinct from 1
;

Select Expression
-> Union
-> Filter
-> Table "T1" Access By ID
-> Bitmap
-> Index "T1_X" Range Scan (full match)
-> Filter
-> Table "T2" Access By ID
-> Bitmap
-> Index "T2_X" Range Scan (full match)
-> Filter
-> Table "T3" Access By ID
-> Bitmap
-> Index "T3_X" Range Scan (full match)

// OK (compare with case #⁠2!)

-- Unnest view, check for usage of compound index:

select * from t1 where y = 1 and x is not distinct from 1
union all
select * from t2 where y = 1 and x is not distinct from 1
union all
select * from t3 where y = 1 and x is not distinct from 1
;

Select Expression
-> Union
-> Filter
-> Table "T1" Access By ID
-> Bitmap
-> Index "T1_YX" Range Scan (full match)
-> Filter
-> Table "T2" Access By ID
-> Bitmap
-> Index "T2_YX" Range Scan (full match)
-> Filter
-> Table "T3" Access By ID
-> Bitmap
-> Index "T3_YX" Range Scan (full match)

// Also OK, FULL match on index scan.

IMHO, optimizer can consider usage of index scan with FULL matching after predicate pushing in case #⁠2 (most needed!) and case #⁠4.

Commits: 10a0d6f b0630cb FirebirdSQL/fbt-repository@af5c3b2 FirebirdSQL/fbt-repository@656eed0

@firebird-automations
Copy link
Collaborator Author

Modified by: @dyemanov

assignee: Dmitry Yemanov [ dimitr ]

@firebird-automations
Copy link
Collaborator Author

Modified by: @dyemanov

Version: 3.0 Beta 2 [ 10586 ]

Version: 2.5.4 [ 10585 ]

Version: 2.5.3 Update 1 [ 10650 ]

Version: 2.1.7 [ 10651 ]

Version: 3.0 Beta 1 [ 10332 ]

Version: 2.5.3 [ 10461 ]

Version: 2.1.6 [ 10460 ]

Version: 3.0 Alpha 2 [ 10560 ]

Version: 3.0 Alpha 1 [ 10331 ]

Version: 2.5.2 Update 1 [ 10521 ]

Version: 2.1.5 Update 1 [ 10522 ]

Version: 2.5.2 [ 10450 ]

Version: 2.5.1 [ 10333 ]

Version: 2.5.0 [ 10221 ]

summary: Predicate 'IS NOT DISTINCT FROM' can lead to PLAN FULL or partial index matching when applying to VIEW or DT that is defined as UNION of several data sources => Predicate IS [NOT] DISTINCT FROM is not pushed into unions/aggregates thus causing sub-optimal plans

@firebird-automations
Copy link
Collaborator Author

Modified by: @dyemanov

status: Open [ 1 ] => Resolved [ 5 ]

resolution: Fixed [ 1 ]

Fix Version: 3.0 RC 1 [ 10584 ]

Fix Version: 2.5.5 [ 10670 ]

@firebird-automations
Copy link
Collaborator Author

Modified by: @pavel-zotov

status: Resolved [ 5 ] => Resolved [ 5 ]

QA Status: No test => Done successfully

@firebird-automations
Copy link
Collaborator Author

Modified by: @pavel-zotov

status: Resolved [ 5 ] => Closed [ 6 ]

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment