New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Lookup (or scan) in descending index could be very inefficient for some keys [CORE4302] #4625
Comments
Commented by: @hvlad The same test agains v2.5.2 insert into t1 values (100, 1)^ insert into t1 values (100, 2)^ insert into t1 values (100, 3)^ select * from t1 where val <= 1 order by val desc rows 1^
============ ============ Elapsed time= 0.04 sec select * from t1 where val <= 2 order by val desc rows 1^
============ ============ Elapsed time= 0.07 sec select * from t1 where val <= 3 order by val desc rows 1^
============ ============ Elapsed time= 0.05 sec You see - v2.5.2 have same results for INSERT and a bit different (but steel bad) results for SELECT. |
Modified by: @hvladassignee: Vlad Khorsun [ hvlad ] |
Commented by: @hvlad The problem is present in descending index when there are keys which are longer then inserted\searched key but have the same prefix. For example, we are searching for key 'a' and keys 'aa', 'ab' etc are present. |
Modified by: @hvladstatus: Open [ 1 ] => Resolved [ 5 ] resolution: Fixed [ 1 ] Fix Version: 2.5.3 [ 10461 ] Fix Version: 3.0 Alpha 2 [ 10560 ] Fix Version: 2.1.6 [ 10460 ] |
Commented by: @hvlad Stats after the fix for fb3: insert into t1 values (100, 1)^ insert into t1 values (100, 2)^ insert into t1 values (100, 3)^ select * from t1 where val <= 1 order by val desc rows 1^
============ ============ Reads = 2 select * from t1 where val <= 2 order by val desc rows 1^
============ ============ Reads = 0 select * from t1 where val <= 3 order by val desc rows 1^
============ ============ Reads = 0 |
Commented by: @hvlad Stats after the fix for v2.5.2 insert into t1 values (100, 1)^ insert into t1 values (100, 2)^ insert into t1 values (100, 3)^ select * from t1 where val <= 1 order by val desc rows 1^
============ ============ Reads = 13 select * from t1 where val <= 2 order by val desc rows 1^
============ ============ Reads = 0 select * from t1 where val <= 3 order by val desc rows 1^
============ ============ Reads = 0 |
Commented by: @hvlad Stats for the v2.1 is the same as for the v2.5 |
Modified by: @hvladVersion: 2.1.5 Update 1 [ 10522 ] description: Using FB3, default config (SS, page cache 2048, database page size is 4096). Prepare data set term ^; create table t1 (id int, val int) execute block as create descending index t1_desc_idx on t1 (val) set stat on Issue 1: insert three records and look at stats insert into t1 values (100, 1)^ SQL> insert into t1 values (100, 2)^ SQL> insert into t1 values (100, 3)^ There was 7 fetches for keys "1" and "3" while 386 fetches for key "2" ! Another example (on the same data): select * from t1 where val <= 1 order by val desc rows 1^
============ ============ Elapsed time= 0.00 sec select * from t1 where val <= 2 order by val desc rows 1^
============ ============ Elapsed time= 0.64 sec select * from t1 where val <= 3 order by val desc rows 1^
============ ============ Elapsed time= 0.00 sec Again, key "2" produced very poor performance comparing with another keys => Big thanks to the Pavel Zotov who found the issue and helped with testing. Using FB3, default config (SS, page cache 2048, database page size is 4096). Prepare data set term ^; create table t1 (id int, val int) execute block as create descending index t1_desc_idx on t1 (val) set stat on Issue 1: insert three records and look at stats insert into t1 values (100, 1)^ SQL> insert into t1 values (100, 2)^ SQL> insert into t1 values (100, 3)^ There was 7 fetches for keys "1" and "3" while 386 fetches for key "2" ! Another example (on the same data): select * from t1 where val <= 1 order by val desc rows 1^
============ ============ Elapsed time= 0.00 sec select * from t1 where val <= 2 order by val desc rows 1^
============ ============ Elapsed time= 0.64 sec select * from t1 where val <= 3 order by val desc rows 1^
============ ============ Elapsed time= 0.00 sec Again, key "2" produced very poor performance comparing with another keys |
Commented by: @pavel-zotov > In descending indices shorter keys are greater than longer keys and this fact was not accounted in some (not all) code parts. Seems that in the following case we have the same high cost and poor statistics but for building bitmap and single search rahter than index navigating: DDL:set echo off; set term ^; Query:set stat on; Output:PLAN (TD INDEX (TD_S_DES)) So, we have over 18 mil fetches when search single key that is absent in index :( |
Commented by: Sean Leyne (seanleyne) I am thinking that the Summary wording, to better reflect the problem, should be more along the lines of: Descending index with large number of duplicates can cause inefficient index operations Thoughts? |
Commented by: @hvlad Pavel, Sean, |
Commented by: Sean Leyne (seanleyne) @vlad, Please suggest some alternate. The current description seems 'vague'/ambiguous. |
Commented by: @hvlad Sean, so far i see no need to do it. Explain what kind of ambiguity do you see here ? |
Commented by: Sean Leyne (seanleyne) The word "inefficient" without any other context is the problem. What is inefficient? The index navigation logic (how)? The index data storage? |
Commented by: @hvlad Well, we can change subject to something like: Search for some keys in descending index requires much more fetches\reads than it is necessary PS So, why "inefficient index operations" is less ambigious then just "inefficient" ? :) |
Commented by: Sean Leyne (seanleyne) Ok, how about: Descending index can cause large number of data reads/fetches when large number of duplicate index values present. |
Commented by: @hvlad Why do you make accent on duplicates ? They makes issue more visible, yes, but are not an essential part of the problem. |
Commented by: @dyemanov I'd only suggest to add "Lookup in" in front of the title, as it's not the index itself that's inefficient but the index scan. |
Commented by: @hvlad Re-opened to fix the new issue found by Pavel. |
Modified by: @hvladsummary: Descending index could be very inefficient for some keys => Lookup (or scan) in descending index could be very inefficient for some keys |
Modified by: @pavel-zotovstatus: Resolved [ 5 ] => Resolved [ 5 ] QA Status: Done successfully Test Details: Test uses MON$ tables to gather statistics snapshot before and after each of query. |
Modified by: @pavel-zotovstatus: Resolved [ 5 ] => Resolved [ 5 ] Test Details: Test uses MON$ tables to gather statistics snapshot before and after each of query. => Test uses MON$ tables to gather statistics snapshot before and after each of query. These numeric literals are used as 'upper-bounds' for comparison with actual fetches. If actual value became equal or more than 'upper-bound' than output will be '!high!' and thus will lead test to FAIL, because expected result is 'normal'. |
Commented by: @pcisar Test created. |
Modified by: @pcisarstatus: Resolved [ 5 ] => Closed [ 6 ] |
Submitted by: @hvlad
Is related to QA518
Big thanks to the Pavel Zotov who found the issue and helped with testing.
Using FB3, default config (SS, page cache 2048, database page size is 4096).
Prepare data
set term ^;
create table t1 (id int, val int)
^
execute block as
declare i int = 0;
begin
while (i < 1000000) do
begin
insert into t1 values (:i, mod(:i, 10));
i = i + 1;
end
end
^
commit
^
create descending index t1_desc_idx on t1 (val)
^
set stat on
^
Issue 1: insert three records and look at stats
insert into t1 values (100, 1)^
Buffers = 2048
Reads = 0
Writes 0
Fetches = 7
SQL> insert into t1 values (100, 2)^
Buffers = 2048
Reads = 0
Writes 0
Fetches = 386
SQL> insert into t1 values (100, 3)^
Buffers = 2048
Reads = 0
Writes 0
Fetches = 7
There was 7 fetches for keys "1" and "3" while 386 fetches for key "2" !
Another example (on the same data):
select * from t1 where val <= 1 order by val desc rows 1^
============ ============
1 1
Elapsed time= 0.00 sec
Buffers = 2048
Reads = 2
Writes 0
Fetches = 7
select * from t1 where val <= 2 order by val desc rows 1^
============ ============
2 2
Elapsed time= 0.64 sec
Buffers = 2048
Reads = 28576
Writes 4
Fetches = 645359
select * from t1 where val <= 3 order by val desc rows 1^
============ ============
3 3
Elapsed time= 0.00 sec
Buffers = 2048
Reads = 4
Writes 0
Fetches = 7
Again, key "2" produced very poor performance comparing with another keys
Commits: a1c9527 11bad44 7ef7eb3 69f9bb5 fb35e77 b6e00c5 FirebirdSQL/fbt-repository@28ad515 FirebirdSQL/fbt-repository@26174fb FirebirdSQL/fbt-repository@4d9a681 FirebirdSQL/fbt-repository@263f347 FirebirdSQL/fbt-repository@c594ed7 FirebirdSQL/fbt-repository@2e8cf4c
====== Test Details ======
Test uses MON$ tables to gather statistics snapshot before and after each of query.
Database that is restored (''mon-stat-gathering-N_M.fbk') contains procedures for gathering statistics and view for displaying results in convenient form. In particular, this view has columns that show DIFFERENCE of fetches and thus we can estimate performance.
Numeric literals in the code (5, 20, 100, ... ) were adjusted after several measurements on:
1) WI-V2.5.4.26847 SuperServer:
1.1) with default value of DefaultDBCachePages and
1.2) when it was 8192
2) WI-T3.0.0.31708 (CS/SC/SS):
2.1) with default value of DefaultDBCachePages and
2.2) when it was: 256 on CS, 512 on SC and 8192 on SS.
These numeric literals are used as 'upper-bounds' for comparison with actual fetches. If actual value became equal or more than 'upper-bound' than output will be '!high!' and thus will lead test to FAIL, because expected result is 'normal'.
The text was updated successfully, but these errors were encountered: