Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Force optimizer to take in account 'order by T.id ROWS n' that is specified for big table 'T' which is cross joined with small datasource [CORE5048] #5335

Open
firebird-automations opened this issue Dec 14, 2015 · 2 comments

Comments

@firebird-automations
Copy link
Collaborator

Submitted by: @pavel-zotov

DDL:

recreate table thuge(id int, x int);
insert into thuge
select row_number()over(), rand()*10000
from rdb$types,rdb$types,rdb$types;
commit;
create index thuge_x on thuge(x);

recreate table ttiny(id int, v int, u int, w int);
insert into ttiny values(1, 100, 200, 300);
create index ttiny_id on ttiny(id);

commit;

Query-1:

select <http://h.id>, t\.u, t\.v, t\.w
from \(
    select <http://h.id>
    from thuge h
    order by x
    rows 1
\) h
cross join ttiny t;

Its plan:

Select Expression
-> Nested Loop Join (inner)
-> First N Records
-> Table "THUGE" as "H H" Access By ID
-> Index "THUGE_X" Full Scan
-> Table "RDB$DATABASE" Full Scan

This query returns one row instantly (<15ms, Fetches from cache = 18).

Query-2:

select <http://h.id>, t\.u, t\.v, t\.w
from thuge h
cross join ttiny t
order by h\.x
rows 1;

It has totally inefficient plan:

Select Expression
-> First N Records
-> Sort (record length: 76, key length: 8)
-> Nested Loop Join (inner)
-> Filter
-> Table "TTINY" as "T" Access By ID
-> Bitmap
-> Index "TTINY_ID" Range Scan (lower bound: 1/1, upper bound: 1/1)
-> Table "THUGE" as "H" Full Scan

Statistics:
Execute : 21 859,00 ms
Fetches: 32 991 255
NR for THuge = 16387064.

But both queries always will return the same data because ORDER BY clause does not contain any fields from TTINY table.

@firebird-automations
Copy link
Collaborator Author

Commented by: Sean Leyne (seanleyne)

Although the queries will return the same results, they are not equivalent. Accordingly, it is not appropriate to try to make a comparison between them.

@firebird-automations
Copy link
Collaborator Author

Commented by: @pavel-zotov

If resultset is the same on all possible data (including NULLs) then two queries DO equal.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

1 participant