New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Slow inner join vs. fast left join [CORE4666] #4977
Comments
Commented by: Valdir Stiebe Junior (ogecrom) Attached the test database used for queries 1 to 4. |
Modified by: Valdir Stiebe Junior (ogecrom)Attachment: teste.7z [ 12663 ] |
Commented by: Valdir Stiebe Junior (ogecrom) Sorry, the SECOND_TABLE FK index have selectivity 1, greather than the PK index. But the problem ocurrs for any value greater than the PK index (all FK). |
Commented by: @dyemanov You cannot expect really good plans from the uneven distribution. SECOND_TABLE have all the records matching a single record inside TESTE. |
Commented by: Valdir Stiebe Junior (ogecrom) Ok, please mark this issue as duplicated of CORE3707. |
Commented by: @dyemanov They're not duplicates, although related. CORE3707 is about zero (probably outdated) selectivities, so that the engine needs to guess its real value. In your case, selectivities are correct, but value distribution is uneven so the engine's assumption that every inner retrieval gets (cardinality * selectivity) rows is wrong. I don't think it can be fixed without value distribution histograms. |
Commented by: Valdir Stiebe Junior (ogecrom) Got it. I will watch for CORE1686, then. Just a hint for anyone who read this comments. select * As seen on http://www.firebirdfaq.org/faq158/ |
Modified by: @pcisarstatus: Resolved [ 5 ] => Closed [ 6 ] |
Modified by: @pavel-zotovstatus: Closed [ 6 ] => Closed [ 6 ] QA Status: Deferred Test Details: Data in test DB are skewed: SQL> select id_teste,count(*) from second_table group by 1; --- child table SQL> select count(*) from teste; -- parent table
============ No sense in test until histograms will be implemented. |
Submitted by: Valdir Stiebe Junior (ogecrom)
Is related to CORE3707
Is related to CORE1686
Attachments:
teste.7z
Shouldn't the execution plan on the following 4 queries be the same?
On queries 1 and 3, there are way more non indexed reads.
On real data some queries like 1 and 3 take 5 minutes or more to execute while queries like 2 and 4 execute on less than few seconds.
Queries 2 and 3 looks like an ugly hack to circumvent a bug. But I found a lot of queries in our software that were rewritten on this fashion for optimization purposes.
1)
select *
from
V_TESTE_ED T inner join
V_SECOND_TABLE_ED S on S.ID_TESTE = T.ID_TESTE
PLAN JOIN (S SECOND_TABLE NATURAL, T TESTE INDEX (PK_TESTE))
2)
select *
from
V_TESTE_ED T left join
V_SECOND_TABLE_ED S on S.ID_TESTE = T.ID_TESTE
where
S.ID_SECOND_TABLE is not null
PLAN JOIN (T TESTE NATURAL, S SECOND_TABLE INDEX (FK_TESTE))
3)
select *
from
TESTE T inner join
SECOND_TABLE S on S.ID_TESTE = T.ID_TESTE
PLAN JOIN (S NATURAL, T INDEX (PK_TESTE))
4)
select *
from
TESTE T left join
SECOND_TABLE S on S.ID_TESTE = T.ID_TESTE
where
S.ID_SECOND_TABLE is not null
PLAN JOIN (T NATURAL, S INDEX (FK_TESTE))
====== Test Details ======
Data in test DB are skewed:
SQL> select id_teste,count(*) from second_table group by 1; --- child table
ID_TESTE COUNT
============ ============
0 12000
SQL> select count(*) from teste; -- parent table
============
5000
No sense in test until histograms will be implemented.
The text was updated successfully, but these errors were encountered: