Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Inner join (nested loop) does not take in account predicate in WHERE clause when driven source is VIEW based on GROUP BY [CORE4399] #4721

Open
firebird-automations opened this issue Apr 22, 2014 · 2 comments

Comments

@firebird-automations
Copy link
Collaborator

Submitted by: @pavel-zotov

Votes: 1

DDL:

recreate view v_dbg_log as select 1 ware_id, 2 sum_qty from rdb$database; -- drop dependency from prev. run
recreate table dbg_data(id int primary key using index dbg_data_pk, doc_id int, ware_id int);
recreate table dbg_list(id int primary key using index dbg_list_pk);
alter table dbg_data add constraint dbg_data_fk foreign key (doc_id) references dbg_list(id) using index dbg_data_fk;

recreate table dbg_log(id int primary key using index dbg_log_pk, ware_id int, qty int);
commit;

insert into dbg_list values(547);
insert into dbg_data(id, doc_id, ware_id) values(1, 547, 1001);
commit;

insert into dbg_log(id, ware_id, qty)
select 0, 1001 w, rand()*100 from rdb$database
union all
select row_number()over(), rand()*1000, rand()*100
from rdb$types a,rdb$types b,(select 1 i from rdb$types rows 20);
commit;

create index dbg_log_ware on dbg_log(ware_id);
commit;
set statistics index dbg_log_pk;
set statistics index dbg_list_pk;
set statistics index dbg_data_pk;
set statistics index dbg_data_fk;
commit;

recreate view v_dbg_log as
select ware_id, sum(qty) sum_qty
from dbg_log
group by 1;
commit;

So, we have
1) two tables: dbg_list and dbg_data - both with ONE row.
2) big table dbg_log with about 1.2 mil. rows
3) view v_dbg_log that returns ~ 1000 rows and is result of aggregating of table dbg_log.

The following query:

SQL> select http://x.id, n.sum_qty
CON> from (select http://h.id, d.ware_id from dbg_list h join dbg_data d on d.doc_id = http://h.id) x
CON> inner join v_dbg_log n on x.ware_id >= n.ware_id and x.ware_id <= n.ware_id;

gives the plan in which 'complex' source (VIEW v_dbg_log) is LEADING and does NOT pushes predicate inside that view.
AFAIK, this behaviour was stated from the early days of IB / FB.

If we try to muddle optimizer by "hint" that first source is very huge, like this:

SQL> select http://x.id, n.sum_qty
CON> from (select first 999999999 http://h.id, d.ware_id from dbg_list h join dbg_data d on d.doc_id = http://h.id) x
CON> inner join v_dbg_log n on x.ware_id >= n.ware_id and x.ware_id <= n.ware_id;

-- than plan changes and FB decides take the result of dbg_list h join dbg_data d on d.doc_id = http://h.id as LEADING part:

Select Expression
-> Filter
-> Nested Loop Join (inner)
-> First N Records
-> Nested Loop Join (inner)
-> Table "X H" Full Scan
-> Filter
-> Table "X D" Access By ID
-> Bitmap
-> Index "DBG_DATA_FK" Range Scan (full match)
-> Aggregate
-> Table "N DBG_LOG" Access By ID
-> Index "DBG_LOG_WARE" Full Scan

But even in this case predicate does not pushes inside view.

PS. I created this ticked by suggestion of Dmitry; discussion on russian national forum can be seen here:

http://www.sql.ru/forum/1090697/proval-hj-i-inner-nl-v-sluchae-soed-ya-malyh-tablic-i-vuhi-ot-group-by

@firebird-automations
Copy link
Collaborator Author

Modified by: @dyemanov

assignee: Dmitry Yemanov [ dimitr ]

@firebird-automations
Copy link
Collaborator Author

Modified by: @dyemanov

Version: 3.0 Beta 2 [ 10586 ]

Version: 2.5.4 [ 10585 ]

Version: 2.5.3 Update 1 [ 10650 ]

Version: 2.1.7 [ 10651 ]

Version: 3.0 Beta 1 [ 10332 ]

Version: 2.5.3 [ 10461 ]

Version: 2.1.6 [ 10460 ]

Version: 3.0 Alpha 1 [ 10331 ]

Version: 2.5.2 Update 1 [ 10521 ]

Version: 2.1.5 Update 1 [ 10522 ]

Version: 2.5.2 [ 10450 ]

Version: 2.5.1 [ 10333 ]

Version: 2.5.0 [ 10221 ]

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment