Issue Details (XML | Word | Printable)

Key: CORE-2200
Type: Improvement Improvement
Status: Closed Closed
Resolution: Fixed
Priority: Minor Minor
Assignee: Dmitry Yemanov
Reporter: Chau Chee Yang
Votes: 0
Watchers: 2
Operations

If you were logged in you would be able to see more operations.
Firebird Core

Extremely slow executing a cross join of 3 tables in Firebird 2.X

Created: 19/Nov/08 04:14 AM   Updated: 13/May/15 06:37 AM
Component/s: Engine
Affects Version/s: 2.0.0, 2.0.1, 2.0.2, 2.0.3, 2.1.0, 2.0.4, 2.1.1
Fix Version/s: 2.1.2, 2.5 Beta 1, 2.0.6

File Attachments: 1. Zip Archive CORE.2200.zip (323 kB)

Environment:
Server: Firebird 2.1.1 on either Linux or Windows platform
Client : IBExpert running on Windows OS
Issue Links:
Relate
 


 Description  « Hide
I have 3 tables:

 1. Agent (0 row)
 2. AR_PM (4,300+ rows)
 3. AR_KnockOff (18,000+ rows)

I execute the following statement in Firebird 2.1.1:

SELECT A.Code FROM AGENT A, AR_PM B, AR_KnockOff C

It takes long times to get the first response back to my client software (IB Expert). However, this situation doesn't happen on Firebird 1.5 running the same query and same amount of data.

Case 1:
======
Firebird 2.1.1: 173 seconds
Firebird 1.5.X: 0 seconds

Case 2:
======
If I append one row into AGENT table, I get the response immediately after running the query:

Firebird 2.1.1: 0.093 seconds
Firebird 1.5.X: 0.031 seconds

I suspect there is some problem with Firebird 2.X. I also attach my database backup for both Firebird 1.5 and 2.1 for you to try out. Please pay attention on the time spent to run the query on Case 1.



 All   Comments   Change History   Subversion Commits      Sort Order: Ascending order - Click to sort in descending order
Chau Chee Yang added a comment - 19/Nov/08 04:18 AM
Firebird 2.X
=========
1. Extract all files from the zip archive.
2. Restore db_2_0.fbk into firebird 2.X server
3. Execute statement: SELECT A.Code FROM AGENT A, AR_PM B, AR_KnockOff C
4. It takes more than 170 seconds to response

Firebird 1.5.X
===========
1. Restore db_1_5.fbk into firebird 1.5 server
2. Execute statement: SELECT A.Code FROM AGENT A, AR_PM B, AR_KnockOff C
3. you get immediate response

Dmitry Yemanov added a comment - 19/Nov/08 07:19 AM
The problem is slightly unusual. With a cross join without any predicates, it actually makes no difference which table to start from. FB 1.5 ignores this fact and always tries to start from a smaller table, as for a regular predicate-driven join. FB 2.0 pretends to be smarter and avoids the [sometimes expensive] cost calculation, starting from the any independent table (in the current implementation, this is the last one in the list). In fact, both solutions should deliver nearly the same performance in all cases but one: when any of the tables is empty. In this case, it would be ideal to put it into the first place in order to skip reading other tables at all. But I wouldn't say that processing empty tables in cross joins is a common real-world practice, so perhaps you're the only person affected :-)

So, I see three possibilities here:

  1) Remove the clever "fast-route" code and rely on costs, as in FB 1.5. This is dumb but it handles the empty table case automagically.
  2) Add a special check for empty tables into the "fast-route" code.
  3) Do nothing and treat this as a "no issue".

Sean Leyne added a comment - 19/Nov/08 12:46 PM
My answer:

3) Do nothing and treat this as a "no issue"

Natural Join are the dumbest thing that a user/developer could do.

Chau Chee Yang added a comment - 19/Nov/08 10:00 PM
The reason I wrote this natural join query is to retrieve a structure (metadata).

My statement may have 2, 3, 4, or 5 statements. It is costly if I perform checking on the row count in each table and re-arrange the join tables sequence.

I personally feel this is an issue as MSSQL or MYSQL doesn't has this problem at all. It should be a problem in Firebird 2.X engine that didn't handle this case well.

Please do consider to fix it. Thank you.

Dmitry Yemanov added a comment - 20/Nov/08 08:27 AM
I'm going to implement a fourth option, which should work fine regardless of the optimizer decision :-)

Sean Leyne added a comment - 20/Nov/08 01:00 PM
Dmirty, so what is the fourth option?

Dmitry Yemanov added a comment - 20/Nov/08 04:22 PM
Improve the cross join algorithm to stop as soon as any of the involved streams is detected as empty.

Chau Chee Yang added a comment - 20/Nov/08 09:43 PM
Thank you very much

Pavel Zotov added a comment - 13/May/15 06:37 AM
IMO, engine should not begin ANY kind of join (not only cross) when absence of rows in at least one of it's data sources forces final result to be empty.

Sample:
=======
recreate table t0(id int);
recreate table t1(id int);
recreate table t2(id int);
recreate table t3(id int);
commit;
insert into t0 select rand()*100 from rdb$types,rdb$types rows 30000;
commit;

insert into t1 select * from t0;
insert into t2 select * from t0;
--------- ::: NB ::: we do NOT add any row to the table `t3`, it remains empty -----------
commit;

set stat on;
set echo on;

select 1
from t1
natural join t2
natural join t3;

select 1
from t2
natural join t3
natural join t1;

select 1
from t3
natural join t2
natural join t1;

Result:
======

select 1
from t1
natural join t2
natural join t3;
Current memory = 3191104
Delta memory = 38040
Max memory = 4863032
Elapsed time= 14.562 sec
Buffers = 256
Reads = 769
Writes = 8
Fetches = 121552

select 1
from t2
natural join t3
natural join t1;
Current memory = 3191280
Delta memory = 176
Max memory = 4863032
Elapsed time= 0.195 sec
Buffers = 256
Reads = 747
Writes = 0
Fetches = 121491

select 1
from t3
natural join t2
natural join t1;
Current memory = 3191280
Delta memory = 0
Max memory = 4863032
Elapsed time= 0.196 sec
Buffers = 256
Reads = 748
Writes = 0
Fetches = 121491


Trace:
======

Query-1:
-----------
select 1
from t1
natural join t2
natural join t3
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
PLAN HASH (HASH (T2 NATURAL, T1 NATURAL), T3 NATURAL)
0 records fetched
  14558 ms, 744 read(s), 8 write(s), 121487 fetch(es)

Table Natural Index
******************************************************
T1 30000
T2 30000

Query-2:
-----------
select 1
from t2
natural join t3
natural join t1
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
PLAN HASH (T1 NATURAL, HASH (T2 NATURAL, T3 NATURAL)
0 records fetched
    193 ms, 743 read(s), 121487 fetch(es)

Table Natural Index
****************************************************
T1 30000
T2 30000

Query-3:
------------
select 1
from t3
natural join t2
natural join t1
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
PLAN HASH (T1 NATURAL, HASH (T2 NATURAL, T3 NATURAL))
0 records fetched
    192 ms, 744 read(s), 121487 fetch(es)

Table Natural Index
******************************************************
T1 30000
T2 30000

All scans of tables `t1` & `t2` looks like wasted labour...