Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

OR'ed optimization in 3.0 is ineffective if WHERE-clause contains two expressions [CORE4537] #4855

Open
firebird-automations opened this issue Sep 2, 2014 · 8 comments

Comments

@firebird-automations
Copy link
Collaborator

firebird-automations commented Sep 2, 2014

Submitted by: @pavel-zotov

DDL:

create or alter procedure null_arg_force_idx_usage as begin end;
create or alter procedure null_arg_optimization_test_01 as begin end;
create or alter procedure null_arg_optimization_test_02 as begin end;
recreate table tmp(id int, doc_id int, qty int);
commit;

set term ^;
execute block as
    declare i int = 0;
    declare n int = 1000000;
begin
    while (i<n) do insert into tmp(id, doc_id, qty) values(:i, rand()*100, 1) returning :i+1 into i;
end
^ set term ;^
commit;

commit;
create unique index tmp_id_unq on tmp(id);
create index tmp_doc_id on tmp(doc_id);
commit;

set term ^;

create or alter procedure null_arg_force_idx_usage (
    a_doc_list_id bigint default null,
    a_doc_data_id bigint default null
)
returns (
    dd_first_id int,
    dd_first_qty numeric(12,2)
)
as
begin
        select d.id, d.qty
        from tmp d
        where
            d.id between coalesce( :a_doc_data_id, -2147483648) and coalesce( :a_doc_data_id, 2147483647)
            and
            d.doc_id between coalesce( :a_doc_list_id, -2147483648) and coalesce( :a_doc_list_id, 2147483647)
        rows 1
        into dd_first_id, dd_first_qty;
        suspend;
end
^

create or alter procedure null_arg_optimization_test_01 (
    a_doc_list_id bigint default null,
    a_doc_data_id bigint default null
)
returns (
    dd_first_id int,
    dd_first_qty numeric(12,2)
)
as
begin
        select d.id, d.qty
        from tmp d
        where -- OR'ed optimization, 3.0 only
            (d.doc_id  = :a_doc_list_id or :a_doc_list_id is null)
            and
            (d.id = :a_doc_data_id or :a_doc_data_id is null\)
        rows 1
        into dd_first_id, dd_first_qty;
        suspend;
end
^

create or alter procedure null_arg_optimization_test_02 (
    a_doc_list_id bigint default null,
    a_doc_data_id bigint default null
)
returns (
    dd_first_id int,
    dd_first_qty numeric(12,2)
)
as
begin
        select d.id, d.qty
        from tmp d
        where -- OR'ed optimization, 3.0 only
            (d.id = :a_doc_data_id or :a_doc_data_id is null)
            and
            (d.doc_id  = :a_doc_list_id or :a_doc_list_id is null)
        rows 1
        into dd_first_id, dd_first_qty;
        suspend;
end
^
set term ;^
commit;

So, the table tmp(id int, doc_id int, qty int) contains 10^6 rows and we have three procedures which all take two arguments (both nullable) and returns first record from TMP which match to one and/or two values of input args (or returns first record if no arguments passed).

Procedure null_arg_force_idx_usage does not use so called "OR'ed optimization" -- it only forces FB to use indexes via BETWEEN + pair of literals which covers all possible values of input args. It is provided here only for comparison.

Procedures null_arg_optimization_test_01 and null_arg_optimization_test_02 differs only in the order of expressions in WHERE-clause.

The following samples shows statistics of these SPs.

1.1 select * from null_arg_force_idx_usage(null, null);
-- IR = 1, NIR = 0;  174 ms, 1331 fetch(es)
1.2 select * from null_arg_optimization_test_01(null, null);
-- IR = 0, NIR = 1; 0 ms, 6 fetch(es) -- ok, but why NATURAL scan here ?
1.3 select * from null_arg_optimization_test_02(null, null\);
-- IR = 0, NIR = 1; 0 ms, 6 fetch(es) -- the same as 1.2

2.1 select * from null_arg_force_idx_usage(1);
-- IR = 1, NIR = 0; 23 ms, 738 fetch(es)
2.2 select * from null_arg_optimization_test_01(1);
-- IR = 1, NIR = 0; 0 ms, 11 fetch(es)
2.3 select * from null_arg_optimization_test_02(1);
-- IR = 0, NIR = 145; 0 ms, 296 fetch(es) -- why mutliple NATURAL scans here ?

3.1 select * from null_arg_force_idx_usage(-1);
-- no statistics, 21 ms, 730 fetch(es)
3.2 select * from null_arg_optimization_test_01(-1);
-- no statistics, 0 ms, 3 fetch(es)
3.3 select * from null_arg_optimization_test_02(-1);
-- IR = 0, NIR = 1'000'000 (!), 938 ms, 2014194 fetch(es) -- quite ineffective

4.1 select * from null_arg_force_idx_usage(null,1);
-- IR=1; 14 ms, 607 fetch(es)
4.2 select * from null_arg_optimization_test_01(null,1);
-- IR = 0, NIR=2; 0 ms, 8 fetch(es) -- why TWO naturals here ?
4.3 select * from null_arg_optimization_test_02(null,1);
-- IR=0, NIR=1

5.1 select * from null_arg_force_idx_usage(null,-1);
-- no statistics; 0 ms, 3 fetch(es)
5.2 select * from null_arg_optimization_test_01(null,-1);
-- IR=0, NIR=1'000'000 (!); 938 ms, 2014194 fetch(es) -- quite ineffective
5.3 select * from null_arg_optimization_test_02(null,-1);
-- no statistics; 0 ms, 3 fetch(es)
@firebird-automations
Copy link
Collaborator Author

Modified by: @pavel-zotov

description: DDL:
#⁠#⁠#⁠#⁠

create or alter procedure null_arg_force_idx_usage as begin end;
create or alter procedure null_arg_optimization_test_01 as begin end;
create or alter procedure null_arg_optimization_test_02 as begin end;
recreate table tmp(id int, doc_id int, qty int);
commit;

set term ^;
execute block as
declare i int = 0;
declare n int = 1000000;
begin
while(i<n) do insert into tmp(id, doc_id, qty) values(:i, rand()*100, 1) returning :i+1 into i;
end
^ set term ;^
commit;

commit;
create unique index tmp_id_unq on tmp(id);
create index tmp_doc_id on tmp(doc_id);
commit;

set term ^;

create or alter procedure null_arg_force_idx_usage(
a_doc_list_id bigint default null
,a_doc_data_id bigint default null
)
returns(
dd_first_id int,
dd_first_qty numeric(12,2)
)
as
begin
select http://d.id, d.qty
from tmp d
where
http://d.id between coalesce( :a_doc_data_id, -2147483648) and coalesce( :a_doc_data_id, 2147483647)
and
d.doc_id between coalesce( :a_doc_list_id, -2147483648) and coalesce( :a_doc_list_id, 2147483647)
rows 1
into dd_first_id, dd_first_qty;
suspend;
end
^

create or alter procedure null_arg_optimization_test_01(
a_doc_list_id bigint default null
,a_doc_data_id bigint default null
)
returns(
dd_first_id int,
dd_first_qty numeric(12,2)
)
as
begin
select http://d.id, d.qty
from tmp d
where -- OR'ed optimization, 3.0 only
(d.doc_id = :a_doc_list_id or :a_doc_list_id is null)
and
(http://d.id = :a_doc_data_id or :a_doc_data_id is null)
rows 1
into dd_first_id, dd_first_qty;
suspend;
end
^

create or alter procedure null_arg_optimization_test_02(
a_doc_list_id bigint default null
,a_doc_data_id bigint default null
)
returns(
dd_first_id int,
dd_first_qty numeric(12,2)
)
as
begin
select http://d.id, d.qty
from tmp d
where -- OR'ed optimization, 3.0 only
(http://d.id = :a_doc_data_id or :a_doc_data_id is null)
and
(d.doc_id = :a_doc_list_id or :a_doc_list_id is null)
rows 1
into dd_first_id, dd_first_qty;
suspend;
end
^
set term ;^
commit;

So, the table tmp(id int, doc_id int, qty int) contains 10^6 rows and we have three procedures which all take two arguments (both nullable) and returns first record from TMP which match to one and/or two values of input args (or returns first record if no arguments passed).

Procedure `null_arg_force_idx_usage` does not use so called "OR'ed optimization" - it only forces FB to use indexes via BETWEEN + pair of literals which covers all possible values of input args. It is provided here only for comparison.

Procedures `null_arg_optimization_test_01` and `null_arg_optimization_test_02` differs only in the order of expressions in WHERE-clause.

The following samples shows statistics of these SPs.

1.1 select * from null_arg_force_idx_usage(null, null) ==> IR = 1, NIR = 0; 174 ms, 1331 fetch(es)
1.2 select * from null_arg_optimization_test_01(null, null) ==> IR = 0, NIR = 1; 0 ms, 6 fetch(es) -- ok, but why NATURAL scan here ?
1.3 select * from null_arg_optimization_test_02(null, null) ==> IR = 0, NIR = 1; 0 ms, 6 fetch(es) -- the same as 1.2

2.1 select * from null_arg_force_idx_usage(1) ==> IR = 1, NIR = 0; 23 ms, 738 fetch(es)
2.2 select * from null_arg_optimization_test_01(1) ==> IR = 1, NIR = 0; 0 ms, 11 fetch(es) -- again: why NATURAL scan here ?
2.3 select * from null_arg_optimization_test_02(1) ==> IR = 0, NIR = 145; 0 ms, 296 fetch(es) -- the same as 2.2

3.1 select * from null_arg_force_idx_usage(-1); ==> no statistics, 21 ms, 730 fetch(es)
3.2 select * from null_arg_optimization_test_01(-1); ==> no statistics, 0 ms, 3 fetch(es)
3.3 select * from null_arg_optimization_test_02(-1); ==> IR = 0, NIR = 1'000'000 (!), 938 ms, 2014194 fetch(es) -- quite ineffective

4.1 select * from null_arg_force_idx_usage(null,1); ==> IR=1; 14 ms, 607 fetch(es)
4.2 select * from null_arg_optimization_test_01(null,1); ==> IR = 0, NIR=2; 0 ms, 8 fetch(es) -- why TWO naturals here ?
4.3 select * from null_arg_optimization_test_02(null,1); ==> IR=0, NIR=1

5.1 select * from null_arg_force_idx_usage(null,-1); ==> no statistics; 0 ms, 3 fetch(es)
5.2 select * from null_arg_optimization_test_01(null,-1); ==> IR=0, NIR=1'000'000(!); 938 ms, 2014194 fetch(es) -- quite ineffective
5.3 select * from null_arg_optimization_test_02(null,-1); ==> no statistics; 0 ms, 3 fetch(es)

=>

DDL:
#⁠#⁠#⁠#⁠

create or alter procedure null_arg_force_idx_usage as begin end;
create or alter procedure null_arg_optimization_test_01 as begin end;
create or alter procedure null_arg_optimization_test_02 as begin end;
recreate table tmp(id int, doc_id int, qty int);
commit;

set term ^;
execute block as
declare i int = 0;
declare n int = 1000000;
begin
while(i<n) do insert into tmp(id, doc_id, qty) values(:i, rand()*100, 1) returning :i+1 into i;
end
^ set term ;^
commit;

commit;
create unique index tmp_id_unq on tmp(id);
create index tmp_doc_id on tmp(doc_id);
commit;

set term ^;

create or alter procedure null_arg_force_idx_usage(
a_doc_list_id bigint default null
,a_doc_data_id bigint default null
)
returns(
dd_first_id int,
dd_first_qty numeric(12,2)
)
as
begin
select http://d.id, d.qty
from tmp d
where
http://d.id between coalesce( :a_doc_data_id, -2147483648) and coalesce( :a_doc_data_id, 2147483647)
and
d.doc_id between coalesce( :a_doc_list_id, -2147483648) and coalesce( :a_doc_list_id, 2147483647)
rows 1
into dd_first_id, dd_first_qty;
suspend;
end
^

create or alter procedure null_arg_optimization_test_01(
a_doc_list_id bigint default null
,a_doc_data_id bigint default null
)
returns(
dd_first_id int,
dd_first_qty numeric(12,2)
)
as
begin
select http://d.id, d.qty
from tmp d
where -- OR'ed optimization, 3.0 only
(d.doc_id = :a_doc_list_id or :a_doc_list_id is null)
and
(http://d.id = :a_doc_data_id or :a_doc_data_id is null)
rows 1
into dd_first_id, dd_first_qty;
suspend;
end
^

create or alter procedure null_arg_optimization_test_02(
a_doc_list_id bigint default null
,a_doc_data_id bigint default null
)
returns(
dd_first_id int,
dd_first_qty numeric(12,2)
)
as
begin
select http://d.id, d.qty
from tmp d
where -- OR'ed optimization, 3.0 only
(http://d.id = :a_doc_data_id or :a_doc_data_id is null)
and
(d.doc_id = :a_doc_list_id or :a_doc_list_id is null)
rows 1
into dd_first_id, dd_first_qty;
suspend;
end
^
set term ;^
commit;

So, the table tmp(id int, doc_id int, qty int) contains 10^6 rows and we have three procedures which all take two arguments (both nullable) and returns first record from TMP which match to one and/or two values of input args (or returns first record if no arguments passed).

Procedure `null_arg_force_idx_usage` does not use so called "OR'ed optimization" - it only forces FB to use indexes via BETWEEN + pair of literals which covers all possible values of input args. It is provided here only for comparison.

Procedures `null_arg_optimization_test_01` and `null_arg_optimization_test_02` differs only in the order of expressions in WHERE-clause.

The following samples shows statistics of these SPs.

1.1 select * from null_arg_force_idx_usage(null, null) ==> IR = 1, NIR = 0; 174 ms, 1331 fetch(es)
1.2 select * from null_arg_optimization_test_01(null, null) ==> IR = 0, NIR = 1; 0 ms, 6 fetch(es) -- ok, but why NATURAL scan here ?
1.3 select * from null_arg_optimization_test_02(null, null) ==> IR = 0, NIR = 1; 0 ms, 6 fetch(es) -- the same as 1.2

2.1 select * from null_arg_force_idx_usage(1) ==> IR = 1, NIR = 0; 23 ms, 738 fetch(es)
2.2 select * from null_arg_optimization_test_01(1) ==> IR = 1, NIR = 0; 0 ms, 11 fetch(es)
2.3 select * from null_arg_optimization_test_02(1) ==> IR = 0, NIR = 145; 0 ms, 296 fetch(es) -- why mutliple NATURAL scans here ?

3.1 select * from null_arg_force_idx_usage(-1); ==> no statistics, 21 ms, 730 fetch(es)
3.2 select * from null_arg_optimization_test_01(-1); ==> no statistics, 0 ms, 3 fetch(es)
3.3 select * from null_arg_optimization_test_02(-1); ==> IR = 0, NIR = 1'000'000 (!), 938 ms, 2014194 fetch(es) -- quite ineffective

4.1 select * from null_arg_force_idx_usage(null,1); ==> IR=1; 14 ms, 607 fetch(es)
4.2 select * from null_arg_optimization_test_01(null,1); ==> IR = 0, NIR=2; 0 ms, 8 fetch(es) -- why TWO naturals here ?
4.3 select * from null_arg_optimization_test_02(null,1); ==> IR=0, NIR=1

5.1 select * from null_arg_force_idx_usage(null,-1); ==> no statistics; 0 ms, 3 fetch(es)
5.2 select * from null_arg_optimization_test_01(null,-1); ==> IR=0, NIR=1'000'000(!); 938 ms, 2014194 fetch(es) -- quite ineffective
5.3 select * from null_arg_optimization_test_02(null,-1); ==> no statistics; 0 ms, 3 fetch(es)

@firebird-automations
Copy link
Collaborator Author

Modified by: @pavel-zotov

summary: OR'ed optimization in 3.0 is ineffective when `WHERE` clause contains two expression and they do NOT match order of SP input arguments => OR'ed optimization in 3.0 is ineffective if WHERE-clause contains two expression and they do NOT match to order of SP input arguments

@firebird-automations
Copy link
Collaborator Author

Modified by: @dyemanov

assignee: Dmitry Yemanov [ dimitr ]

@firebird-automations
Copy link
Collaborator Author

Modified by: @dyemanov

Regression: 3.0 Alpha 2 [ 10560 ]

@firebird-automations
Copy link
Collaborator Author

Commented by: @dyemanov

I've corrected the ticket title as it has nothing to do with order of SP parameters, only order of predicates inside WHERE matters here. The issue is that if two ORed predicates are ANDed, then only the first one can possibly use indices. As simple as that.

So if SP test_01 is called with a_doc_list_id = null, then a fullscan happens, otherwise only index on DOC_ID is used. If SP test_02 is called with a_doc_data_id = null, then a fullscan happens, otherwise only index on ID is used.

@firebird-automations
Copy link
Collaborator Author

Modified by: @dyemanov

Component: Engine [ 10000 ]

summary: OR'ed optimization in 3.0 is ineffective if WHERE-clause contains two expression and they do NOT match to order of SP input arguments => OR'ed optimization in 3.0 is ineffective if WHERE-clause contains two expressions

@firebird-automations
Copy link
Collaborator Author

Modified by: @dyemanov

Fix Version: 3.0.0 [ 10048 ]

@firebird-automations
Copy link
Collaborator Author

Modified by: @dyemanov

Fix Version: 3.0 RC2 [ 10048 ] =>

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

2 participants