Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Optimiser chooses poor plan where outer joined tables are listed before inner joined tables, won't accept better plan [CORE2937] #3320

Closed
firebird-automations opened this issue Mar 23, 2010 · 7 comments

Comments

@firebird-automations
Copy link
Collaborator

Submitted by: Karl Reynolds (karlreynolds)

Duplicates CORE1239

Create the following tables for testing:

CREATE TABLE MASTER (
MASTER_ID INTEGER NOT NULL PRIMARY KEY,
MASTER_COL INTEGER NOT NULL
);
CREATE INDEX I_M_MASTER_COL ON MASTER (MASTER_COL);
CREATE TABLE DETAIL (
DETAIL_ID INTEGER NOT NULL PRIMARY KEY,
MASTER_ID INTEGER NOT NULL,
DETAIL_COL INTEGER NOT NULL
);
ALTER TABLE DETAIL ADD FOREIGN KEY (MASTER_ID) REFERENCES MASTER (MASTER_ID);
CREATE TABLE OUTER_JOIN (
OJ_ID INTEGER NOT NULL PRIMARY KEY,
OJ_COL INTEGER NOT NULL
);

Now add some data into those tables:

insert into master values (0,0);
insert into master values (1,1);
insert into detail values (0,0,0);
insert into detail values (1,1,1);
insert into outer_join values (0,0);
insert into outer_join values (1,1);

(without the data the optimiser will produce a plan optimised for no data)

Now run the following query:

select *
from DETAIL d
inner join MASTER m
on m.master_id = d.master_id
and m.master_col = 0
left outer join OUTER_JOIN oj
on oj.oj_col = d.detail_col

This will produce a plan such as:

PLAN JOIN (JOIN (M INDEX (I_M_MASTER_COL), D INDEX (RDB$FOREIGN540)), OJ NATURAL)

(which exact plan is produced depends on the rdb$foreign index name)

This plan is as you would expect. Now run the following query:

select *
from DETAIL d
left outer join OUTER_JOIN oj
on oj.oj_col = d.detail_col
inner join MASTER m
on m.master_id = d.master_id
and m.master_col = 0

All that has been done is to rearrange the order of the join clauses, to precede the join to MASTER by the join to OUTER_JOIN. This will produce a plan such as:

PLAN JOIN (JOIN (D NATURAL, OJ NATURAL), M INDEX (RDB$PRIMARY538))

(which exact plan is produced depends on the rdb$primary index name)

The first part of the bug is that the plans are different. If one plan is better than another for a particular arrangement of tables and indices (including statistics), shouldn't the optimiser choose the best plan in both instances?

The second part of the bug is that, while the second plan can be used with the first query in a plan statement, if first plan is attached to the second query, you get the error "index cannot be used in the specified plan". Why not?

Workaround: List inner joins before outer joins in queries.

====== Test Details ======

Deferred until changes in optimizer will be done. Current FB (4.0.0.453) still produce inefficient plan for 2nd query.

@firebird-automations
Copy link
Collaborator Author

Modified by: Karl Reynolds (karlreynolds)

description: Create the following tables for testing:

CREATE TABLE MASTER (
MASTER_ID INTEGER NOT NULL PRIMARY KEY,
MASTER_COL INTEGER NOT NULL
);
CREATE INDEX I_M_MASTER_COL ON MASTER (MASTER_COL);
CREATE TABLE DETAIL (
DETAIL_ID INTEGER NOT NULL PRIMARY KEY,
MASTER_ID INTEGER NOT NULL,
DETAIL_COL INTEGER NOT NULL
);
ALTER TABLE DETAIL ADD FOREIGN KEY (MASTER_ID) REFERENCES MASTER (MASTER_ID);
CREATE TABLE OUTER_JOIN (
OJ_ID INTEGER NOT NULL PRIMARY KEY,
OJ_COL INTEGER NOT NULL
);

Now add some data into those tables:

insert into master values (0,0);
insert into master values (1,1);
insert into detail values (0,0,0);
insert into detail values (1,1,1);
insert into outer_join values (0,0);
insert into outer_join values (1,1);

(without the data the optimiser will produce a plan optimised for no data)

Now run the following query:

select *
from DETAIL d
inner join MASTER m
on m.master_id = d.master_id
and m.master_col = 0
left outer join OUTER_JOIN oj
on oj.oj_col = d.detail_col

This will produce a plan such as:

PLAN JOIN (JOIN (M INDEX (I_M_MASTER_COL), D INDEX (RDB$FOREIGN540)), OJ NATURAL)

(which exact plan is produced depends on the rdb$foreign index name)

This plan is as you would expect. Now run the following query:

select *
from DETAIL d
left outer join OUTER_JOIN oj
on oj.oj_col = d.detail_col
inner join MASTER m
on m.master_id = d.master_id
and m.master_col = 0

All that has been done is to rearrange the order of the join clauses, to precede the join to MASTER by the join to OUTER_JOIN. This will produce a plan such as:

PLAN JOIN (JOIN (D NATURAL, OJ NATURAL), M INDEX (RDB$PRIMARY538))

(which exact plan is produced depends on the rdb$primary index name)

The first part of the bug is that the plans are different. If one plan is better than another for a particular arrangement of tables and indices (including statistics), shouldn't the optimiser choose the best plan in both instances?

The second part of the bug is that, while the second plan can be used with the first query in a plan statement, if first plan is attached to the second query, you get the error "index cannot be used in the specified plan". Why not?

Workaround: List outer joins before inner joins in queries.

=>

Create the following tables for testing:

CREATE TABLE MASTER (
MASTER_ID INTEGER NOT NULL PRIMARY KEY,
MASTER_COL INTEGER NOT NULL
);
CREATE INDEX I_M_MASTER_COL ON MASTER (MASTER_COL);
CREATE TABLE DETAIL (
DETAIL_ID INTEGER NOT NULL PRIMARY KEY,
MASTER_ID INTEGER NOT NULL,
DETAIL_COL INTEGER NOT NULL
);
ALTER TABLE DETAIL ADD FOREIGN KEY (MASTER_ID) REFERENCES MASTER (MASTER_ID);
CREATE TABLE OUTER_JOIN (
OJ_ID INTEGER NOT NULL PRIMARY KEY,
OJ_COL INTEGER NOT NULL
);

Now add some data into those tables:

insert into master values (0,0);
insert into master values (1,1);
insert into detail values (0,0,0);
insert into detail values (1,1,1);
insert into outer_join values (0,0);
insert into outer_join values (1,1);

(without the data the optimiser will produce a plan optimised for no data)

Now run the following query:

select *
from DETAIL d
inner join MASTER m
on m.master_id = d.master_id
and m.master_col = 0
left outer join OUTER_JOIN oj
on oj.oj_col = d.detail_col

This will produce a plan such as:

PLAN JOIN (JOIN (M INDEX (I_M_MASTER_COL), D INDEX (RDB$FOREIGN540)), OJ NATURAL)

(which exact plan is produced depends on the rdb$foreign index name)

This plan is as you would expect. Now run the following query:

select *
from DETAIL d
left outer join OUTER_JOIN oj
on oj.oj_col = d.detail_col
inner join MASTER m
on m.master_id = d.master_id
and m.master_col = 0

All that has been done is to rearrange the order of the join clauses, to precede the join to MASTER by the join to OUTER_JOIN. This will produce a plan such as:

PLAN JOIN (JOIN (D NATURAL, OJ NATURAL), M INDEX (RDB$PRIMARY538))

(which exact plan is produced depends on the rdb$primary index name)

The first part of the bug is that the plans are different. If one plan is better than another for a particular arrangement of tables and indices (including statistics), shouldn't the optimiser choose the best plan in both instances?

The second part of the bug is that, while the second plan can be used with the first query in a plan statement, if first plan is attached to the second query, you get the error "index cannot be used in the specified plan". Why not?

Workaround: List inner joins before outer joins in queries.

@firebird-automations
Copy link
Collaborator Author

Modified by: @dyemanov

Link: This issue duplicates CORE1239 [ CORE1239 ]

@firebird-automations
Copy link
Collaborator Author

Commented by: @dyemanov

Duplicates CORE1239. The problem is that outer joins are optimized separately from inner ones and in this case we have D OUTER JOIN (OJ INNER JOIN M), so it's impossible to join D with M directly. This is old and well-known limitation still not addressed.

@firebird-automations
Copy link
Collaborator Author

Modified by: @dyemanov

status: Open [ 1 ] => Resolved [ 5 ]

resolution: Duplicate [ 3 ]

@firebird-automations
Copy link
Collaborator Author

Modified by: @pcisar

status: Resolved [ 5 ] => Closed [ 6 ]

@firebird-automations
Copy link
Collaborator Author

Modified by: @pavel-zotov

QA Status: No test

@firebird-automations
Copy link
Collaborator Author

Modified by: @pavel-zotov

status: Closed [ 6 ] => Closed [ 6 ]

QA Status: No test => Deferred

Test Details: Deferred until changes in optimizer will be done. Current FB (4.0.0.453) still produce inefficient plan for 2nd query.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

1 participant