Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

IN/ANY/ALL predicates may cause sub-optimal (late filtering) execution of joins [CORE5236] #5515

Closed
firebird-automations opened this issue May 10, 2016 · 14 comments

Comments

@firebird-automations
Copy link
Collaborator

Submitted by: Denis Pereira Raymundo (denistek)

Attachments:
problema.rar

Votes: 2

Too slowly to retrieve records in the 2.5.5 version and 3.00.
Even if I force the same execution plan that used in version.

Firebird 2.5.1
PLAN (DP_REGISTRO_OEST INDEX (UNQ1_DP_REGISTRO_OEST))
PLAN (DP_FERIAS INDEX (DP_FERIAS_IDX1, FK_DP_FERIAS_3))
PLAN JOIN (JOIN (JOIN (JOIN (DP_REGISTRO NATURAL, DP_RECIBO INDEX (FK_DP_RECIBO_1)), DP_RECIBO_EVE INDEX (FK_DP_RECIBO_EVE_1, DP_RECIBO_EVE_IDX2)), DP_EVENTO INDEX (RDB$PRIMARY_DP_EVENTO)), DP_RECIBO_PER_ANTERIOR INDEX (FK_DP_RECIBO_PER_ANTERIOR_1))

------ Performance info ------
Prepare time = 0ms
Execute time = 16ms
Current memory = 1.382.144
Max memory = 2.246.624
Memory buffers = 75
Reads from disk to cache = 141
Writes from cache to disk = 0
Fetches from cache = 4.731

Firebird 3.0
PLAN (DP_REGISTRO_OEST INDEX (UNQ1_DP_REGISTRO_OEST))
PLAN (DP_FERIAS INDEX (DP_FERIAS_IDX1))
PLAN JOIN (JOIN (JOIN (JOIN (DP_REGISTRO NATURAL, DP_RECIBO INDEX (FK_DP_RECIBO_1)), DP_RECIBO_EVE INDEX (FK_DP_RECIBO_EVE_1, DP_RECIBO_EVE_IDX2)), DP_EVENTO INDEX (RDB$PRIMARY_DP_EVENTO)), DP_RECIBO_PER_ANTERIOR INDEX (FK_DP_RECIBO_PER_ANTERIOR_1))

------ Performance info ------
Prepare time = 16ms
Execute time = 4m 32s 844ms
Current memory = 2.960.928
Max memory = 3.030.104
Memory buffers = 90
Reads from disk to cache = 5.693.279
Writes from cache to disk = 0
Fetches from cache = 9.345.611

Commits: 118a78e ceacc98 1cdded6

====== Test Details ======

Test checks EXPLAINED PLAN in expected_stdout, otherwise one may not see early filtering. For this reason min_version is 3.0.1, not 2.5.7.

@firebird-automations
Copy link
Collaborator Author

Commented by: Denis Pereira Raymundo (denistek)

This file contain all necessary tables envolved in SQL:

select DP_RECIBO.CODIGO_REC, DP_RECIBO_EVE.EVENTO_REV
from DP_RECIBO
inner join DP_REGISTRO on (DP_REGISTRO.AUTOINC_REGISTRO = DP_RECIBO.REGISTRO_REC)
left join DP_RECIBO_EVE on (DP_RECIBO_EVE.RECIBO_REV = DP_RECIBO.CODIGO_REC)
left join DP_EVENTO on (DP_EVENTO.CODIGO_EVENTO = DP_RECIBO_EVE.EVENTO_REV)
left join DP_RECIBO_PER_ANTERIOR on (DP_RECIBO_PER_ANTERIOR.AUTOINC_RECIBO_PERANT = DP_RECIBO.CODIGO_REC)
where ( ( (DP_REGISTRO.ESTABELECIMENTO_REGISTRO = 1) and
(DP_REGISTRO.FUNCIONARIO_REGISTRO = 3431) )
or
(DP_REGISTRO.AUTOINC_REGISTRO in (
select REGISTRO_SEC_ROEST
from DP_REGISTRO_OEST
where REGISTRO_PRI_ROEST = 181) ) )
and DP_RECIBO.DATA_PROCESSAMENTO_REC is not null
and DP_RECIBO.TIPO_REC not in ('05', '06', '07')
and DP_RECIBO.CODIGO_REC <> 0
and DP_RECIBO_EVE.ORIGEM_REV = '00'
and DP_RECIBO_EVE.RECORIGEM_REV = 0
and ( (DP_RECIBO.DATA_BASE_REC between '04/01/2016 00:00:00' and '04/30/2016 23:59:59 999')
or ('04/01/2016' between DP_RECIBO.DATA_INICIAL_REC and DP_RECIBO.DATA_FINAL_REC)
or ('04/30/2016' between DP_RECIBO.DATA_INICIAL_REC and DP_RECIBO.DATA_FINAL_REC)
or (DP_RECIBO.CODIGO_REC in (
select DP_FERIAS.RECIBO_FERIAS
from DP_FERIAS
where DP_FERIAS.REGISTRO_FERIAS = DP_RECIBO.REGISTRO_REC
and DP_FERIAS.RATEIO_ABONO_FERIAS = 'S'
and (DP_FERIAS.FINAL_FERIAS + DP_FERIAS.DIAS_ABONO_FERIAS) between '04/01/2016 00:00:00' and '04/30/2016 23:59:59 999')))

@firebird-automations
Copy link
Collaborator Author

Modified by: Denis Pereira Raymundo (denistek)

Attachment: problema.rar [ 12960 ]

@firebird-automations
Copy link
Collaborator Author

Commented by: @hvlad

Just notes:

a) the PLAN's you show are not the same

Firebird 2.5.1

PLAN (DP_FERIAS INDEX (DP_FERIAS_IDX1, FK_DP_FERIAS_3))

while

Firebird 3.0

PLAN (DP_FERIAS INDEX (DP_FERIAS_IDX1))

b) change last IN by EXISTS make query works fast in fb3

   or \(exists \(
            select \*
            from DP\_FERIAS
            where DP\_FERIAS\.REGISTRO\_FERIAS = DP\_RECIBO\.REGISTRO\_REC
              and DP\_RECIBO\.CODIGO\_REC = DP\_FERIAS\.RECIBO\_FERIAS
              and DP\_FERIAS\.RATEIO\_ABONO\_FERIAS = 'S'
              and \(DP\_FERIAS\.FINAL\_FERIAS \+ DP\_FERIAS\.DIAS\_ABONO\_FERIAS\) between '04/01/2016 00:00:00' and '04/30/2016 23:59:59 999'\)\)

@firebird-automations
Copy link
Collaborator Author

Commented by: @hvlad

Also, i don't confirm your PLAN for 2.5.1

I see

2.5.1

PLAN (DP_REGISTRO_OEST INDEX (UNQ1_DP_REGISTRO_OEST))
PLAN (DP_FERIAS INDEX (DP_FERIAS_IDX1, FK_DP_FERIAS_3))
PLAN JOIN (JOIN (JOIN (JOIN (DP_REGISTRO NATURAL, DP_RECIBO INDEX (FK_DP_RECIBO_1)), DP_RECIBO_EVE INDEX (FK_DP_RECIBO_EVE_1, DP_RECIBO_EVE_IDX2)), DP_EVENTO INDEX (RDB$PRIMARY_DP_EVENTO)), DP_RECIBO_PER_ANTERIOR INDEX (FK_DP_RECIBO_PER_ANTERIOR_1))

2.5.6
PLAN (DP_REGISTRO_OEST INDEX (UNQ1_DP_REGISTRO_OEST))
PLAN (DP_FERIAS INDEX (DP_FERIAS_IDX1, FK_DP_FERIAS_3))
PLAN JOIN (JOIN (JOIN (JOIN (DP_RECIBO NATURAL, DP_REGISTRO INDEX (PK_DP_REGISTRO)), DP_RECIBO_EVE INDEX (FK_DP_RECIBO_EVE_1, DP_RECIBO_EVE_IDX2)), DP_EVENTO INDEX (RDB$PRIMARY_DP_EVENTO)), DP_RECIBO_PER_ANTERIOR INDEX (FK_DP_RECIBO_PER_ANTERIOR_1))

i.e. possible regression hapens in 2.5.x series

PS 2.5.1 show fast result for both queries (with IN and with EXISTS) due to different join order in main query

@firebird-automations
Copy link
Collaborator Author

Modified by: @dyemanov

security: Developers [ 10012 ] =>

@firebird-automations
Copy link
Collaborator Author

Modified by: @dyemanov

assignee: Dmitry Yemanov [ dimitr ]

@firebird-automations
Copy link
Collaborator Author

Commented by: Denis Pereira Raymundo (denistek)

Hi Vlad, thank you by your comments.

Answering your questions:

a) the PLAN's you show are not the same:
I told "Even if I force the same execution plan used by version 2.5.1". I forced by this command, but the result were bad too:

select DP_RECIBO.CODIGO_REC, DP_RECIBO_EVE.EVENTO_REV
from DP_RECIBO
inner join DP_REGISTRO on (DP_REGISTRO.AUTOINC_REGISTRO = DP_RECIBO.REGISTRO_REC)
left join DP_RECIBO_EVE on (DP_RECIBO_EVE.RECIBO_REV = DP_RECIBO.CODIGO_REC)
left join DP_EVENTO on (DP_EVENTO.CODIGO_EVENTO = DP_RECIBO_EVE.EVENTO_REV)
left join DP_RECIBO_PER_ANTERIOR on (DP_RECIBO_PER_ANTERIOR.AUTOINC_RECIBO_PERANT = DP_RECIBO.CODIGO_REC)
where ( ( (DP_REGISTRO.ESTABELECIMENTO_REGISTRO = 1) and
(DP_REGISTRO.FUNCIONARIO_REGISTRO = 3431) )
or
(DP_REGISTRO.AUTOINC_REGISTRO in (
select REGISTRO_SEC_ROEST
from DP_REGISTRO_OEST
where REGISTRO_PRI_ROEST = 181
PLAN (DP_REGISTRO_OEST INDEX (UNQ1_DP_REGISTRO_OEST)) ) ) )
and DP_RECIBO.DATA_PROCESSAMENTO_REC is not null
and DP_RECIBO.TIPO_REC not in ('05', '06', '07')
and DP_RECIBO.CODIGO_REC <> 0
and DP_RECIBO_EVE.ORIGEM_REV = '00'
and DP_RECIBO_EVE.RECORIGEM_REV = 0
and ( (DP_RECIBO.DATA_BASE_REC between '04/01/2016 00:00:00' and '04/30/2016 23:59:59 999')
or ('04/01/2016' between DP_RECIBO.DATA_INICIAL_REC and DP_RECIBO.DATA_FINAL_REC)
or ('04/30/2016' between DP_RECIBO.DATA_INICIAL_REC and DP_RECIBO.DATA_FINAL_REC)
or (DP_RECIBO.CODIGO_REC in (
select DP_FERIAS.RECIBO_FERIAS
from DP_FERIAS
where DP_FERIAS.REGISTRO_FERIAS = DP_RECIBO.REGISTRO_REC
and DP_FERIAS.RATEIO_ABONO_FERIAS = 'S'
and (DP_FERIAS.FINAL_FERIAS + DP_FERIAS.DIAS_ABONO_FERIAS) between '04/01/2016 00:00:00' and '04/30/2016 23:59:59 999'
PLAN (DP_FERIAS INDEX (DP_FERIAS_IDX1, FK_DP_FERIAS_3)) )))
PLAN JOIN (JOIN (JOIN (JOIN (DP_REGISTRO NATURAL, DP_RECIBO INDEX (FK_DP_RECIBO_1)), DP_RECIBO_EVE INDEX (FK_DP_RECIBO_EVE_1, DP_RECIBO_EVE_IDX2)), DP_EVENTO INDEX (RDB$PRIMARY_DP_EVENTO)), DP_RECIBO_PER_ANTERIOR INDEX (FK_DP_RECIBO_PER_ANTERIOR_1))

I think this indicate that the problem is not on the choose of the plan.

b) change last IN by EXISTS make query works fast in fb3
When I change, the result is better than privous, but not so good when the version 2.5.1, see:

Plan
PLAN (DP_REGISTRO_OEST INDEX (UNQ1_DP_REGISTRO_OEST))
PLAN (DP_FERIAS INDEX (DP_FERIAS_IDX1))
PLAN JOIN (JOIN (JOIN (JOIN (DP_REGISTRO NATURAL, DP_RECIBO INDEX (FK_DP_RECIBO_1)), DP_RECIBO_EVE INDEX (FK_DP_RECIBO_EVE_1, DP_RECIBO_EVE_IDX2)), DP_EVENTO INDEX (RDB$PRIMARY_DP_EVENTO)), DP_RECIBO_PER_ANTERIOR INDEX (FK_DP_RECIBO_PER_ANTERIOR_1))

------ Performance info ------
Prepare time = 15ms
Execute time = 438ms
Current memory = 2.868.136
Max memory = 2.912.584
Memory buffers = 75
Reads from disk to cache = 24.950
Writes from cache to disk = 0
Fetches from cache = 130.603

The readings from the disk and the time of execution were reduced but not completly.

Iteresting is that if I remove the line below, the result were improved too.

and DP_RECIBO_EVE.RECORIGEM_REV = 0

And if I run only the statement that cross the tables, with all records, the time is good. So, why to bring empty is slow.

select DP_RECIBO.CODIGO_REC, DP_RECIBO_EVE.EVENTO_REV
from DP_RECIBO
inner join DP_REGISTRO on (DP_REGISTRO.AUTOINC_REGISTRO = DP_RECIBO.REGISTRO_REC)
left join DP_RECIBO_EVE on (DP_RECIBO_EVE.RECIBO_REV = DP_RECIBO.CODIGO_REC)
left join DP_EVENTO on (DP_EVENTO.CODIGO_EVENTO = DP_RECIBO_EVE.EVENTO_REV)
left join DP_RECIBO_PER_ANTERIOR on (DP_RECIBO_PER_ANTERIOR.AUTOINC_RECIBO_PERANT = DP_RECIBO.CODIGO_REC)

Plan
PLAN JOIN (JOIN (JOIN (JOIN (DP_REGISTRO NATURAL, DP_RECIBO INDEX (FK_DP_RECIBO_1)), DP_RECIBO_EVE INDEX (FK_DP_RECIBO_EVE_1)), DP_EVENTO INDEX (RDB$PRIMARY_DP_EVENTO)), DP_RECIBO_PER_ANTERIOR INDEX (FK_DP_RECIBO_PER_ANTERIOR_1))

------ Performance info ------
Prepare time = 0ms
Execute time = 5s 891ms
Avg fetch time = 0,01 ms
Current memory = 2.671.992
Max memory = 3.166.712
Memory buffers = 75
Reads from disk to cache = 91.374
Writes from cache to disk = 0
Fetches from cache = 3.992.057

@firebird-automations
Copy link
Collaborator Author

Commented by: @dyemanov

Regression was introduced while fixing CORE4690.

@firebird-automations
Copy link
Collaborator Author

Modified by: @dyemanov

priority: Critical [ 2 ] => Major [ 3 ]

Version: 2.5.6 [ 10721 ]

Version: 4.0 Initial [ 10621 ]

Version: 2.5.4 [ 10585 ]

Component: Engine [ 10000 ]

summary: Regression in Firebird => IN/ANY/ALL predicates may cause sub-optimal (late filtering) execution of joins

Component: API / Client Library [ 10040 ] =>

@firebird-automations
Copy link
Collaborator Author

Modified by: @dyemanov

Fix Version: 3.0.1 [ 10730 ]

Fix Version: 2.5.7 [ 10770 ]

@firebird-automations
Copy link
Collaborator Author

Modified by: @dyemanov

status: Open [ 1 ] => Resolved [ 5 ]

resolution: Fixed [ 1 ]

Fix Version: 4.0 Alpha 1 [ 10731 ]

@firebird-automations
Copy link
Collaborator Author

Modified by: @pavel-zotov

status: Resolved [ 5 ] => Resolved [ 5 ]

QA Status: No test => Done successfully

Test Details: Test checks EXPLAINED PLAN in expected_stdout, otherwise one may not see early filtering.

@firebird-automations
Copy link
Collaborator Author

Modified by: @pavel-zotov

status: Resolved [ 5 ] => Closed [ 6 ]

@firebird-automations
Copy link
Collaborator Author

Modified by: @pavel-zotov

status: Closed [ 6 ] => Closed [ 6 ]

Test Details: Test checks EXPLAINED PLAN in expected_stdout, otherwise one may not see early filtering. => Test checks EXPLAINED PLAN in expected_stdout, otherwise one may not see early filtering. For this reason min_version is 3.0.1, not 2.5.7.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment