Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

indexes with low selectivity kill the select execution [CORE47] #332

Open
firebird-automations opened this issue Feb 15, 2006 · 11 comments
Open

Comments

@firebird-automations
Copy link
Collaborator

Submitted by: centros (centros)

SFID: 1432475#⁠
Submitted By: centros

Having some active indexes with low selectivity is
killing condition for some SELECT statements. It is a
problem of a planner, that such bad indexes are used in
a plan. SELECT execution is about 10 to 50 times
slower, compared to a situation, when such indexes are
set inactive. I am testing on FB 2.0 Beta 2, the same
problem was on FB 1.5. The same SELECT on IB 6.0 or IB
7.5 executes nice fast, even with such bad indexes. IB
planner is clever enough not to use indexes with low
selectivity in a plan.

Example of a SELECT statement:
select
sum(SKLPOHDETAILMNOZSTVI),sum(SKLPOHDETAILCELKEMCENANAKUP),sum(SKLPOHDETAILCELKEMCENAPRODEJ)
from SKLPOHDETAIL
left outer join SKLPOH on
SKLPOHDETAIL.IDSKLPOH=SKLPOH.IDSKLPOH
where SKLPOHDATZADAL >= :DATOD AND SKLPOHDATZADAL <=
:DATDO AND SKLPOHTYP=10 AND
EIDSKLADKARTA=:EIDSKLADKARTA and
SKLPOHDETAIL.recordactive='1' and SKLPOH.recordactive =
'1' and SKLPOHSTORNOVANO='0'

The indexes with low selectivity are set on
SKLPOH.recordactive and SKLPOHDETAIL.recordactive.
These items have only three possible values, 0, 1 or 2.
Most of them have value 1.

Both SKLPOH and SKLPOHDETAIL tables have about 200.000
rows each. Overal database size is about 300 MB.

If someone is interested, I can post the database file
for testing.

Martin

@firebird-automations
Copy link
Collaborator Author

Commented by: Alice F. Bird (firebirds)

Date: 2006-02-28 11:21
Sender: centros
Logged In: YES
user_id=1453236

I made further investigation of this problem and I made the
problem more specific. Dmitry is testing my database since
last week. I hope this new information will help.

Problem:
Activating indexes with bad selectivity dramatically
increases the select from procedure
execution time. For that test case I have simplified the
procedure
SALDO_SKLADU, which we have tested last week (it is in the
database I sent to Dmitry). Now the problem is more specific.

Testing:
The select was tested with some indexes set Inactive in
first try and also with the same indexes
set Active in second try. Results are under selects marked I
-> for indexes Inactive, A -> for Active.
These indexes are: SKLPOH.Recordactive, SKLPOH.Stornovano
and SKLPOHDETAIL.Recordactive
This was made to find out, whether that indexes have some
influence to the problem tested.

Result:
Active indexes with bad selectivity are included in a plan,
but it has no real bad influence
on select execution of a select itself. Problem with indexes
shows up when executing the SALDO
procedure, which executes the same select in a for loop.

1.1 - Select from a procedure SALDO
select * from SALDO('1.3.2005')
I ->
PLAN JOIN (SKLPOHDETAIL INDEX
(IDX_SKLPOHDETAIL_EIDSKLADKARTA), SKLPOH INDEX
(IDX_SKLPOH_ID))JOIN (SKLPOHDETAIL INDEX
(IDX_SKLPOHDETAIL_EIDSKLADKARTA), SKLPOH INDEX
(IDX_SKLPOH_ID))(SKLADKARTA INDEX (IDX_SKLADKARTA_RA,
IDX_SKLADKARTA_RA))
3.344 ms, 1.557.855 reads
A ->
PLAN JOIN (SKLPOHDETAIL INDEX
(IDX_SKLPOHDETAIL_EIDSKLADKARTA, IDX_SKLPOHDETAIL_RA),
SKLPOH INDEX (IDX_SKLPOH_ID))JOIN (SKLPOHDETAIL INDEX
(IDX_SKLPOHDETAIL_EIDSKLADKARTA, IDX_SKLPOHDETAIL_RA),
SKLPOH INDEX (IDX_SKLPOH_ID))(SKLADKARTA INDEX
(IDX_SKLADKARTA_RA, IDX_SKLADKARTA_RA))
50.563 ms, 1.932.149 buffer reads

1.2 - Select from procedure SALDO executed without procedure
(only for EIDSKLADKARTA = 3, which has most records)
select
sum(SKLPOHDETAILMNOZSTVI),sum(SKLPOHDETAILCELKEMCENANAKUP),sum(SKLPOHDETAILCELKEMCENAPRODEJ)
from SKLPOHDETAIL
left outer join SKLPOH on SKLPOHDETAIL.IDSKLPOH=SKLPOH.IDSKLPOH
where SKLPOHDATZADAL <= '1.3.2005' AND SKLPOHTYP = 11 AND
EIDSKLADKARTA = 3 and SKLPOHDETAIL.RECORDACTIVE = '1' and
SKLPOH.RECORDACTIVE = '1' and SKLPOHSTORNOVANO = '0'
I ->
PLAN JOIN (SKLPOHDETAIL INDEX
(IDX_SKLPOHDETAIL_EIDSKLADKARTA), SKLPOH INDEX (IDX_SKLPOH_ID))
2.016 ms, 646.631 buffer reads
A ->
PLAN JOIN (SKLPOHDETAIL INDEX
(IDX_SKLPOHDETAIL_EIDSKLADKARTA, IDX_SKLPOHDETAIL_RA),
SKLPOH INDEX (IDX_SKLPOH_ID))
1.969 ms, 646.729 buffer reads

1.3 - Select from procedure SALDO executed without
procedure, grouped by EIDSKLADKARTA
select
EIDSKLADKARTA,sum(SKLPOHDETAILMNOZSTVI),sum(SKLPOHDETAILCELKEMCENANAKUP),sum(SKLPOHDETAILCELKEMCENAPRODEJ)
from SKLPOHDETAIL
left outer join SKLPOH on SKLPOHDETAIL.IDSKLPOH=SKLPOH.IDSKLPOH
where SKLPOHDATZADAL <= '1.3.2005' AND SKLPOHTYP = 11 AND
SKLPOHDETAIL.RECORDACTIVE = '1' and SKLPOH.RECORDACTIVE =
'1' and SKLPOHSTORNOVANO = '0'
group by EIDSKLADKARTA
I ->
PLAN JOIN (SKLPOHDETAIL ORDER
IDX_SKLPOHDETAIL_EIDSKLADKARTA, SKLPOH INDEX (IDX_SKLPOH_ID))
6.218 ms, 2.240.240 buffer reads
A ->
PLAN JOIN (SKLPOHDETAIL ORDER IDX_SKLPOHDETAIL_EIDSKLADKARTA
INDEX (IDX_SKLPOHDETAIL_RA), SKLPOH INDEX (IDX_SKLPOH_ID))
6.391 ms, 2.233.624 buffer reads

CREATE PROCEDURE SALDO (DATDO DATE)
returns (EIDFIRMA INTEGER,
EIDPOBOCKA INTEGER,
EIDSKLAD INTEGER,
EIDPRODUKT INTEGER,
NAKOUPENE_MNOZSTVI NUMERIC(15,4),
PRODANE_MNOZSTVI NUMERIC(15,4),
OBRAT_NAKUPU_NC NUMERIC(15,4),
OBRAT_PRODEJE_SC NUMERIC(15,4),
OBRAT_PRODEJE_PC NUMERIC(15,4),
OBRAT_NAKUPU_PC NUMERIC(15,4))
AS
declare variable EIDSKLADKARTA Integer;
begin
for select EIDFIRMA,EIDPOBOCKA,EIDSKLAD,EIDSKLADKARTA,EIDPRODUKT
from SKLADKARTA
where (RECORDACTIVE = '1' or RECORDACTIVE = '2')
into :EIDFIRMA,:EIDPOBOCKA,:EIDSKLAD,:EIDSKLADKARTA,:EIDPRODUKT
do
begin

select
sum(SKLPOHDETAILMNOZSTVI),sum(SKLPOHDETAILCELKEMCENANAKUP),sum(SKLPOHDETAILCELKEMCENAPRODEJ)
from SKLPOHDETAIL
left outer join SKLPOH on
SKLPOHDETAIL.IDSKLPOH=SKLPOH.IDSKLPOH
where SKLPOHDATZADAL <= :DATDO AND SKLPOHTYP = 10 AND
EIDSKLADKARTA=:EIDSKLADKARTA and SKLPOHDETAIL.RECORDACTIVE =
'1' and SKLPOH.RECORDACTIVE = '1' and SKLPOHSTORNOVANO = '0'
into :NAKOUPENE_MNOZSTVI,:OBRAT_NAKUPU_NC,:OBRAT_NAKUPU_PC;

select
sum(SKLPOHDETAILMNOZSTVI),sum(SKLPOHDETAILCELKEMCENANAKUP),sum(SKLPOHDETAILCELKEMCENAPRODEJ)
from SKLPOHDETAIL
left outer join SKLPOH on
SKLPOHDETAIL.IDSKLPOH=SKLPOH.IDSKLPOH
where SKLPOHDATZADAL <= :DATDO AND SKLPOHTYP = 11 AND
EIDSKLADKARTA=:EIDSKLADKARTA and SKLPOHDETAIL.RECORDACTIVE =
'1' and SKLPOH.RECORDACTIVE = '1' and SKLPOHSTORNOVANO = '0'
into :PRODANE_MNOZSTVI,:OBRAT_PRODEJE_SC,:OBRAT_PRODEJE_PC;

suspend;

end
end

@firebird-automations
Copy link
Collaborator Author

Commented by: Alice F. Bird (firebirds)

Date: 2006-02-17 15:04
Sender: dimitr
Logged In: YES
user_id=61270

Please post your database here or send it to me by email or
make available for download somewhere. I'll take a look.

@firebird-automations
Copy link
Collaborator Author

Commented by: Alice F. Bird (firebirds)

Date: 2006-02-17 10:39
Sender: centros
Logged In: YES
user_id=1453236

OK, if you want to close this case now, I agree. Then the
result for me is that I must not use indexes with very bad
selectivity or I will keep them inactive at least. I will
try to prepare a reproducable test later. The thing that I
know for sure now is, that I have a database where the same
select with activated indexes takes 100 times more time than
with the indexes set to inactive. It is reproducable. I can
post this database if you want. The thing that I do not know
is, how I brought the database in such a bad condition and
how to repair it. Unfortunately even backup-restore does not
help.

@firebird-automations
Copy link
Collaborator Author

Commented by: Alice F. Bird (firebirds)

Date: 2006-02-16 17:23
Sender: dimitr
Logged In: YES
user_id=61270

I'm afraid we cannot help in such a volatile scenario. A
reproducable test case is required. I may only ensure you
that FB 1.5 and especially 2.0 do much better work on
throwing away the bad indices than both 1.0 or any IB version.

@firebird-automations
Copy link
Collaborator Author

Commented by: Alice F. Bird (firebirds)

Date: 2006-02-16 13:52
Sender: centros
Logged In: YES
user_id=1453236

After further testing I still insist on the error. There is
something happening in the database, because sometimes my
select runs very fast (20 seconds) and another time the same
select runs very slowly (20 minutes). I started working in a
situation when it was fast. Then I made some massive
recalculations on data and tried the same select once more.
That time it was very slow again. So I repeated my test and
deactivated the two indexes. After that the same select was
fast again. Unfortunately after activating the indexes it
has again slowed down.

Finaly I tested which index has this bad influence. First I
deactivated them all. Later I activated them one by one.
Every step I have tested with my select. Guess what
happened. It is again working with all indexes activated!
That I have not expected. It seems like it behaves every
time different.

@firebird-automations
Copy link
Collaborator Author

Commented by: Alice F. Bird (firebirds)

Date: 2006-02-16 12:16
Sender: centros
Logged In: YES
user_id=1453236

Yes, I did backup and restore before testing yesterday. I
had similar problem many times before already with FB 1.5
alnd alo FB 2.0 Beta 1. However today it is working fast
also with indexes ACTIVE. I do not understand it why. If
that happens again, I will examine it deeper. The only
change since yesterday is, that I have restarted my
computer. Anyway I post the plans.

1) Indexes INACTIVE
PLAN JOIN (SKLPOHDETAIL INDEX
(IDX_SKLPOHDETAIL_EIDSKLADKARTA), SKLPOH INDEX (IDX_SKLPOH_ID))
time 1812 ms, 648179 reads

2) Indexes ACTIVE
PLAN JOIN (SKLPOHDETAIL INDEX
(IDX_SKLPOHDETAIL_EIDSKLADKARTA, IDX_SKLPOHDETAIL_RA),
SKLPOH INDEX (IDX_SKLPOH_ID))
time 1828 ms, 648283 reads

@firebird-automations
Copy link
Collaborator Author

Commented by: Alice F. Bird (firebirds)

Date: 2006-02-15 22:36
Sender: seanleyne
Logged In: YES
user_id=71163

A couple of things:

First, did you backup/restore the database so that the new
indexig structures are available?

If not, please do so and re-test.

Second, please post the PLAN generated for the statement.

@firebird-automations
Copy link
Collaborator Author

Modified by: Alice F. Bird (firebirds)

description: SFID: 1432475#⁠
Submitted By: centros

Having some active indexes with low selectivity is
killing condition for some SELECT statements. It is a
problem of a planner, that such bad indexes are used in
a plan. SELECT execution is about 10 to 50 times
slower, compared to a situation, when such indexes are
set inactive. I am testing on FB 2.0 Beta 2, the same
problem was on FB 1.5. The same SELECT on IB 6.0 or IB
7.5 executes nice fast, even with such bad indexes. IB
planner is clever enough not to use indexes with low
selectivity in a plan.

Example of a SELECT statement:
select
sum(SKLPOHDETAILMNOZSTVI),sum(SKLPOHDETAILCELKEMCENANAKUP),sum(SKLPOHDETAILCELKEMCENAPRODEJ)
from SKLPOHDETAIL
left outer join SKLPOH on
SKLPOHDETAIL.IDSKLPOH=SKLPOH.IDSKLPOH
where SKLPOHDATZADAL >= :DATOD AND SKLPOHDATZADAL <=
:DATDO AND SKLPOHTYP=10 AND
EIDSKLADKARTA=:EIDSKLADKARTA and
SKLPOHDETAIL.recordactive='1' and SKLPOH.recordactive =
'1' and SKLPOHSTORNOVANO='0'

The indexes with low selectivity are set on
SKLPOH.recordactive and SKLPOHDETAIL.recordactive.
These items have only three possible values, 0, 1 or 2.
Most of them have value 1.

Both SKLPOH and SKLPOHDETAIL tables have about 200.000
rows each. Overal database size is about 300 MB.

If someone is interested, I can post the database file
for testing.

Martin

=>

SFID: 1432475#⁠
Submitted By: centros

Having some active indexes with low selectivity is
killing condition for some SELECT statements. It is a
problem of a planner, that such bad indexes are used in
a plan. SELECT execution is about 10 to 50 times
slower, compared to a situation, when such indexes are
set inactive. I am testing on FB 2.0 Beta 2, the same
problem was on FB 1.5. The same SELECT on IB 6.0 or IB
7.5 executes nice fast, even with such bad indexes. IB
planner is clever enough not to use indexes with low
selectivity in a plan.

Example of a SELECT statement:
select
sum(SKLPOHDETAILMNOZSTVI),sum(SKLPOHDETAILCELKEMCENANAKUP),sum(SKLPOHDETAILCELKEMCENAPRODEJ)
from SKLPOHDETAIL
left outer join SKLPOH on
SKLPOHDETAIL.IDSKLPOH=SKLPOH.IDSKLPOH
where SKLPOHDATZADAL >= :DATOD AND SKLPOHDATZADAL <=
:DATDO AND SKLPOHTYP=10 AND
EIDSKLADKARTA=:EIDSKLADKARTA and
SKLPOHDETAIL.recordactive='1' and SKLPOH.recordactive =
'1' and SKLPOHSTORNOVANO='0'

The indexes with low selectivity are set on
SKLPOH.recordactive and SKLPOHDETAIL.recordactive.
These items have only three possible values, 0, 1 or 2.
Most of them have value 1.

Both SKLPOH and SKLPOHDETAIL tables have about 200.000
rows each. Overal database size is about 300 MB.

If someone is interested, I can post the database file
for testing.

Martin

@firebird-automations
Copy link
Collaborator Author

Modified by: @dyemanov

assignee: Dmitry Yemanov [ dimitr ]

@firebird-automations
Copy link
Collaborator Author

Modified by: @pcisar

Workflow: jira [ 10071 ] => Firebird [ 14240 ]

@firebird-automations
Copy link
Collaborator Author

Commented by: Ján Kolár (kolar)

I noticed the same thing on following select:

select * from ATTESTVALUES where ATTEST_ID = :ID order by ORD

To see what the problem is i measured times needed for similar select commands:
"select * from ATTESTVALUES where ATTEST_ID = :ID order by ORD" needs 1.3 seconds
"select * from ATTESTVALUES where ATTEST_ID = :ID" needs 0.05 seconds

Lets see definition of table ATTESTVALUES:
CREATE TABLE ATTESTVALUES
(
ID Integer NOT NULL,
ATTEST_ID Integer NOT NULL,
TEST_ID Integer NOT NULL,
ORD Smallint NOT NULL,
STATUS Smallint NOT NULL,
RESULT Varchar(100),
RESULT2 Varchar(52),
RESULT3 Varchar(52),
UNCERT Double precision,
TOLER Double precision,
RESULT4 Varchar(52),
PRIMARY KEY (ID),
CONSTRAINT UNATTESTVALUES UNIQUE (ATTEST_ID,TEST_ID,ORD)
);
CREATE INDEX ORDX ON ATTESTVALUES (ORD);

There is also automatically created index RDB$FOREIGN9 on column ATTEST_ID which is a foreign key.
Whats specific for my table is low selectivity (0.008772) of index ORDX because the values in column ORD are in most cases in the range cca 1 - 100. The number of records in detail table ATTESTVALUES in most cases is in the range of tens. In contrast to this index RDB$FOREIGN9 has high selectivity 0.000001. This is because the values in column ATTEST_ID are unique and also because parent table ATTEST has around 700k records (i.e. 700k unique values of column ATTEST_ID). If I modify the plan as follows:

"select * from ATTESTVALUES where ATTEST_ID = :ID PLAN(ATTESTVALUES INDEX(RDB$FOREIGN9)) order by ORD"

i.e. I purposefully disable index ORDX on column ORD now the query takes around 0.05 seconds. If the Firebird would apply indexes in opposite order i.e. first apply index with high selectivity RDB$FOREIGN9 and then index with low selectivity ORDX I think I would not see significant differences in times. I understand that index ORDX is maybe in my example uneccessary (because to sort few tens or few hundreds of records we dont need index) but good database should handle this.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

2 participants