Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Wrong or missing IS NULL optimization (regression) [CORE3449] #3810

Closed
firebird-automations opened this issue Apr 22, 2011 · 13 comments
Closed

Comments

@firebird-automations
Copy link
Collaborator

Submitted by: @pcisar

Relate to CORE2709

Test case:
/* prepare */
create table tmp_tbl1 (fld1 integer, fld2 integer, fld3 integer);
create index tmp_tbl1_fld123 on tmp_tbl1(fld1, fld2, fld3);
create index tmp_tbl1_fld2 on tmp_tbl1(fld2);
commit;
set term ^;
create or alter procedure tmp_sp1
as
declare variable I integer;
begin
i=0;
while (i<10000) do begin
i=i+1;
insert into tmp_tbl1 values (1, :i, 2);
end
end
^
set term ;^
commit;
execute procedure tmp_sp1;
commit;
SET STATISTICS INDEX TMP_TBL1_FLD123;
SET STATISTICS INDEX TMP_TBL1_FLD2;
commit;

/* test */
select count(*) from tmp_tbl1 where fld1=1 and fld2 is null;

/*
1.5:
PLAN (TMP_TBL1 INDEX (TMP_TBL1_FLD2,TMP_TBL1_FLD123))
COUNT

       0

Current memory = 53311040
Delta memory = 16164
Max memory = 53467732
Elapsed time= 0.01 sec
Buffers = 6000
Reads = 0
Writes 0
Fetches = 55

2.5:
PLAN (TMP_TBL1 INDEX (TMP_TBL1_FLD123)) <<<<< different plan
COUNT

       0

Current memory = 53597056
Delta memory = 21576
Max memory = 53936000
Elapsed time= 0.01 sec
Buffers = 6000
Reads = 0
Writes 0
Fetches = 20022 <<<<<<< causes significant performance degradation when executed many times under load
*/

Under v2.5 FLD2 IS NULL predicate is either not optimized using index or is handled wrongly. Note that there are no rows with NULL value in FLD2, so there shouldn't be so much fetches. The number of fetches indicate that only fld1=1 condition is handled via index (there is only 1 value in all 10k rows).

Commits: 3f93bb8

@firebird-automations
Copy link
Collaborator Author

Commented by: @dyemanov

In ascending indices the NULL key has zero length, so searching for {A, NULL} actually means searching for {A}, because there's nothing (physically) to look at in the second segment. This is not a bug per se, just a known limitation. A workaround could be to have the compound index created as descending, in this case NULLs are stored differently and are searchable.

However, I agree that in this particular case the optimizer should take the aforementioned limitation into account and consider also index TMP_TBL1_FLD2.

The issue affects all versions supporting ODS11.

@firebird-automations
Copy link
Collaborator Author

Modified by: @dyemanov

assignee: Dmitry Yemanov [ dimitr ]

@firebird-automations
Copy link
Collaborator Author

Commented by: @dyemanov

Looks like duplicating CORE2709.

@firebird-automations
Copy link
Collaborator Author

Commented by: @pcisar

Not exactly duplicating (the problem here is that optimizer doesn't take into account the second index where 1.5 does), but related.

@firebird-automations
Copy link
Collaborator Author

Modified by: @pcisar

Link: This issue relate to CORE2709 [ CORE2709 ]

@firebird-automations
Copy link
Collaborator Author

Commented by: @dyemanov

The improved optimizer is likely to choose a different plan than v1.5 in this case:

PLAN (TMP_TBL1 INDEX (TMP_TBL1_FLD2))

which is obviously better in this artificial test. But I cannot guarantee that it's going to solve the issue in the real database.

@firebird-automations
Copy link
Collaborator Author

Modified by: @dyemanov

status: Open [ 1 ] => In Progress [ 3 ]

@firebird-automations
Copy link
Collaborator Author

Modified by: @dyemanov

status: In Progress [ 3 ] => Open [ 1 ]

@firebird-automations
Copy link
Collaborator Author

Commented by: @dyemanov

I'm closing this ticket as "won't fix" because the original source of the problem (CORE2709) has been fixed. Now the compound index (tmp_tbl1_fld123) will use the first two segments for lookups (as expected) and thus the other index (tmp_tbl1_fld2) should not be considered. The overall performance should be better than in v1.5 where two index lookups are applied.

@firebird-automations
Copy link
Collaborator Author

Modified by: @dyemanov

status: Open [ 1 ] => Resolved [ 5 ]

resolution: Won't Fix [ 2 ]

@firebird-automations
Copy link
Collaborator Author

Modified by: @pcisar

status: Resolved [ 5 ] => Closed [ 6 ]

hvlad added a commit that referenced this issue May 4, 2021
 GH-3106 : Many indexed reads in a compound index with NULLs [CORE2709]
 GH-3810 : Wrong or missing IS NULL optimization (regression) [CORE3449]
@hvlad hvlad reopened this May 4, 2021
@hvlad
Copy link
Member

hvlad commented May 4, 2021

Re-opened to mark as fixed.

@hvlad
Copy link
Member

hvlad commented May 4, 2021

Fix will be backported into v3 after verification

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

4 participants