Issue Details

Key: CORE-4302
Type: Bug
Status: Closed
Resolution: Fixed
Priority: Major
Assignee: Vlad Khorsun
Reporter: Vlad Khorsun
Votes: 0
Watchers: 5
Firebird Core

Lookup (or scan) in descending index could be very inefficient for some keys

Created: 21/Dec/13 06:29 PM   Updated: 25/Sep/15 12:21 PM
Component/s: Engine
Affects Version/s: 2.1.5 Update 1, 2.5.2 Update 1, 3.0 Alpha 1
Fix Version/s: 3.0 Alpha 2, 2.1.6, 2.5.3


QA Status: Done successfully
Test Details:
Test uses MON$ tables to gather a statistics snapshot before and after each query.
The database that is restored ('mon-stat-gathering-N_M.fbk') contains procedures for gathering statistics and a view for displaying the results in a convenient form. In particular, this view has columns that show the DIFFERENCE of fetches, which lets us estimate performance.
Numeric literals in the code (5, 20, 100, ... ) were adjusted after several measurements on:
1) WI-V2.5.4.26847 SuperServer:
1.1) with default value of DefaultDBCachePages and
1.2) when it was 8192
2) WI-T3.0.0.31708 (CS/SC/SS):
2.1) with default value of DefaultDBCachePages and
2.2) when it was: 256 on CS, 512 on SC and 8192 on SS.

These numeric literals are used as 'upper bounds' for comparison with the actual fetch counts. If the actual value is equal to or greater than the 'upper bound', the output will be '!high!' and the test will FAIL, because the expected result is 'normal'.
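The threshold check described above can be sketched as follows. This is a minimal illustration with hypothetical names, not the actual test code; the real test computes the fetch difference from MON$ snapshots taken before and after each query.

```python
# Sketch of the QA test's threshold logic (hypothetical function name):
# the difference in page fetches between two MON$ snapshots is compared
# with a hand-tuned upper bound.

def classify_fetches(fetches_before: int, fetches_after: int, upper_bound: int) -> str:
    """Return 'normal' if the fetch difference stays under the bound, '!high!' otherwise."""
    diff = fetches_after - fetches_before
    return '!high!' if diff >= upper_bound else 'normal'

print(classify_fetches(100, 105, 20))   # 'normal': only 5 fetches
print(classify_fetches(100, 486, 20))   # '!high!': 386 fetches, so the test FAILs
```

The upper bounds themselves (5, 20, 100, ...) were calibrated empirically on the server builds listed above.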


Description
Big thanks to Pavel Zotov, who found the issue and helped with testing.



Using FB3, default config (SS, page cache 2048, database page size is 4096).

Prepare data

set term ^;

create table t1 (id int, val int)
^

execute block as
declare i int = 0;
begin
  while (i < 1000000) do
  begin
    insert into t1 values (:i, mod(:i, 10));
    i = i + 1;
  end
end
^
commit
^

create descending index t1_desc_idx on t1 (val)
^

set stat on
^


Issue 1: insert three records and look at stats

insert into t1 values (100, 1)^
Buffers = 2048
Reads = 0
Writes 0
Fetches = 7

SQL> insert into t1 values (100, 2)^
Buffers = 2048
Reads = 0
Writes 0
Fetches = 386


SQL> insert into t1 values (100, 3)^
Buffers = 2048
Reads = 0
Writes 0
Fetches = 7

There were 7 fetches for keys "1" and "3", but 386 fetches for key "2"!


Another example (on the same data):

select * from t1 where val <= 1 order by val desc rows 1^

          ID VAL
============ ============
           1 1

Elapsed time= 0.00 sec
Buffers = 2048
Reads = 2
Writes 0
Fetches = 7


select * from t1 where val <= 2 order by val desc rows 1^

          ID VAL
============ ============
           2 2

Elapsed time= 0.64 sec
Buffers = 2048
Reads = 28576
Writes 4
Fetches = 645359

select * from t1 where val <= 3 order by val desc rows 1^

          ID VAL
============ ============
           3 3

Elapsed time= 0.00 sec
Buffers = 2048
Reads = 4
Writes 0
Fetches = 7

Again, key "2" produced very poor performance compared with the other keys.

Vlad Khorsun added a comment - 21/Dec/13 06:36 PM
The same test against v2.5.2:

insert into t1 values (100, 1)^
Buffers = 2048
Reads = 0
Writes 0
Fetches = 7

insert into t1 values (100, 2)^
Buffers = 2048
Reads = 0
Writes 0
Fetches = 386

insert into t1 values (100, 3)^
Buffers = 2048
Reads = 0
Writes 0
Fetches = 7


select * from t1 where val <= 1 order by val desc rows 1^

          ID VAL
============ ============
           1 1

Elapsed time= 0.04 sec
Buffers = 2048
Reads = 13
Writes 0
Fetches = 268


select * from t1 where val <= 2 order by val desc rows 1^

          ID VAL
============ ============
           2 2

Elapsed time= 0.07 sec
Buffers = 2048
Reads = 0
Writes 0
Fetches = 892


select * from t1 where val <= 3 order by val desc rows 1^

          ID VAL
============ ============
           3 3

Elapsed time= 0.05 sec
Buffers = 2048
Reads = 0
Writes 0
Fetches = 494


As you can see, v2.5.2 has the same results for INSERT and slightly different (but still bad) results for SELECT.

Vlad Khorsun added a comment - 21/Dec/13 07:12 PM
The problem is present in a descending index when there are keys that are longer than the inserted/searched key but share the same prefix.
In descending indices shorter keys are greater than longer keys, and this fact was not accounted for in some (but not all) parts of the code.

For example, we are searching for key 'a' while keys 'aa', 'ab', etc. are present.
With the numbers in the test case above, key 2 is encoded as the single byte 0x3F, while keys 3, 4, etc. are encoded as two bytes with 0x3F as the first byte.
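The ordering subtlety Vlad describes can be demonstrated with a small sketch. This is NOT Firebird's actual key format; it only illustrates the general idea that descending indexes often store byte-complemented keys, and that a plain lexicographic comparison then ranks a short complemented key before a longer one sharing its prefix, even though in descending order the shorter original key must come after it.

```python
# Illustration (not Firebird's real encoding): descending keys stored as
# complemented bytes, so an ordinary ascending byte comparison yields
# descending order -- except for prefix keys, where extra care is needed.

def complement(key: bytes) -> bytes:
    """Invert every byte, a common trick for descending key encoding."""
    return bytes(0xFF - b for b in key)

def naive_desc_sort(keys):
    # Wrong: the lexicographic prefix rule makes complement(b'a') sort
    # before complement(b'aa'), putting the shorter key first.
    return sorted(keys, key=complement)

def correct_desc_sort(keys, width=8):
    # One possible fix: pad the complemented key with 0xFF so a shorter
    # key compares greater than any longer key sharing its prefix.
    return sorted(keys, key=lambda k: complement(k).ljust(width, b'\xff'))

keys = [b'a', b'ab', b'aa']
print(naive_desc_sort(keys))    # [b'a', b'ab', b'aa'] -- b'a' wrongly first
print(correct_desc_sort(keys))  # [b'ab', b'aa', b'a'] -- true descending order
```

A scan or lookup routine that applies the naive comparison walks past (or re-reads) all the longer prefixed keys before finding its position, which matches the excess fetches seen in the test case.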


Vlad Khorsun added a comment - 21/Dec/13 07:15 PM
Stats after the fix for fb3:

insert into t1 values (100, 1)^
Fetches = 7

insert into t1 values (100, 2)^
Fetches = 7

insert into t1 values (100, 3)^
Fetches = 7


select * from t1 where val <= 1 order by val desc rows 1^

          ID VAL
============ ============
           1 1

Reads = 2
Fetches = 7


select * from t1 where val <= 2 order by val desc rows 1^

          ID VAL
============ ============
           2 2

Reads = 0
Fetches = 7


select * from t1 where val <= 3 order by val desc rows 1^

          ID VAL
============ ============
           3 3

Reads = 0
Fetches = 7

Vlad Khorsun added a comment - 21/Dec/13 07:17 PM
Stats after the fix for v2.5.2

insert into t1 values (100, 1)^
Fetches = 7

insert into t1 values (100, 2)^
Fetches = 7

insert into t1 values (100, 3)^
Fetches = 7



select * from t1 where val <= 1 order by val desc rows 1^

          ID VAL
============ ============
           1 1

Reads = 13
Fetches = 268


select * from t1 where val <= 2 order by val desc rows 1^

          ID VAL
============ ============
           2 2

Reads = 0
Fetches = 374


select * from t1 where val <= 3 order by val desc rows 1^

          ID VAL
============ ============
           3 3

Reads = 0
Fetches = 494

Vlad Khorsun added a comment - 21/Dec/13 07:33 PM
Stats for v2.1 are the same as for v2.5.

Pavel Zotov added a comment - 23/Dec/13 04:02 PM
> In descending indices shorter keys are greater than longer keys, and this fact was not accounted for in some (but not all) parts of the code.

It seems that in the following case we have the same high cost and poor statistics, but for bitmap building and a single-key search rather than index navigation:

DDL:
=====
set echo off;
commit;
recreate table td(id int, s varchar(50)); commit;
set term ^;
execute block as
begin
  begin
    execute statement 'create sequence g';
    when any do begin end
  end
end^
set term ;^
commit;
alter sequence g restart with 0;
commit;

set term ^;
execute block as
declare n int = 1000000;
declare m int;
begin
m=n; while(m>0) do insert into td(id, s) values( gen_id(g,1), 'q' ) returning :m-1 into m;
m=n; while(m>0) do insert into td(id, s) values( gen_id(g,1), 'qw' ) returning :m-1 into m;
m=n; while(m>0) do insert into td(id, s) values( gen_id(g,1), 'qwe' ) returning :m-1 into m;
m=n; while(m>0) do insert into td(id, s) values( gen_id(g,1), 'qwer' ) returning :m-1 into m;
m=n; while(m>0) do insert into td(id, s) values( gen_id(g,1), 'qwert' ) returning :m-1 into m;
m=n; while(m>0) do insert into td(id, s) values( gen_id(g,1), 'qwerty' ) returning :m-1 into m;
m=n; while(m>0) do insert into td(id, s) values( gen_id(g,1), 'qwertyu' ) returning :m-1 into m;
m=n; while(m>0) do insert into td(id, s) values( gen_id(g,1), 'qwertyui' ) returning :m-1 into m;
m=n; while(m>0) do insert into td(id, s) values( gen_id(g,1), 'qwertyuio' ) returning :m-1 into m;
m=n; while(m>0) do insert into td(id, s) values( gen_id(g,1), 'qwertyuiop' ) returning :m-1 into m;
end^ set term ;^ commit;
--create index td_s_asc on td(s); commit;
create descending index td_s_des on td(s); commit;

Query:
=====
set stat on;
set plan on;
set echo on;
select count(*) from rdb$database where exists(select * from td where s='qwertyuioo'); -- note: `o` duplicated at the end of key

Output:
======
PLAN (TD INDEX (TD_S_DES))
PLAN (RDB$DATABASE NATURAL)
. . .
Elapsed time= 6.57 sec
Buffers = 524288
Reads = 140644
Writes = 2
Fetches = 18012008

So, we have over 18 million fetches when searching for a single key that is absent from the index :(

Sean Leyne added a comment - 23/Dec/13 05:05 PM
I think the Summary wording, to better reflect the problem, should be more along the lines of:

Descending index with large number of duplicates can cause inefficient index operations

Thoughts?

Vlad Khorsun added a comment - 23/Dec/13 06:02 PM - edited
Pavel,
most likely this is a completely different issue.


Sean,
I object to changing the summary in the way you offer.

Sean Leyne added a comment - 23/Dec/13 06:09 PM - edited
@Vlad, please suggest some alternative. The current description seems vague/ambiguous.

Vlad Khorsun added a comment - 23/Dec/13 07:36 PM
Sean,

so far I see no need to do it.

Please explain what kind of ambiguity you see here.

Sean Leyne added a comment - 23/Dec/13 07:48 PM
The word "inefficient" without any other context is the problem. What is inefficient? The index navigation logic? The index data storage?

Vlad Khorsun added a comment - 23/Dec/13 08:39 PM
Well, we can change subject to something like:

Searching for some keys in a descending index requires many more fetches/reads than necessary

PS So, why is "inefficient index operations" less ambiguous than just "inefficient"? :)

Sean Leyne added a comment - 23/Dec/13 09:29 PM
Ok, how about:

Descending index can cause a large number of data reads/fetches when a large number of duplicate index values is present.

Vlad Khorsun added a comment - 23/Dec/13 09:57 PM
Why do you put the accent on duplicates? They make the issue more visible, yes, but they are not an essential part of the problem.
The real problem was that *some keys* in a *descending* index were handled in a wrong/inefficient way.
I made the summary text reflect the most significant sides of the issue.
I don't want to write a poem in the summary text; the test case contains all the necessary details.
I see no sense in spending so much time discussing this.

Dmitry Yemanov added a comment - 24/Dec/13 08:16 AM
I'd only suggest to add "Lookup in" in front of the title, as it's not the index itself that's inefficient but the index scan.

Vlad Khorsun added a comment - 24/Dec/13 02:52 PM
Re-opened to fix the new issue found by Pavel.
While it is different from the code POV, it could also be formulated as "inefficient descending index scan for some values".

Pavel Cisar added a comment - 25/Sep/15 12:21 PM
Test created.