Issue Details (XML | Word | Printable)

Key: CORE-2267
Type: Improvement Improvement
Status: Open Open
Priority: Major Major
Assignee: Unassigned
Reporter: Roy Damman
Votes: 0
Watchers: 1
Operations

If you were logged in you would be able to see more operations.
Firebird Core

Simple indexed join query slows terrible down with useage of 'like'

Created: 06/Jan/09 03:26 PM   Updated: 08/Jan/09 06:44 AM
Component/s: None
Affects Version/s: None
Fix Version/s: None

Environment: P4 Single Core 2.4 GHz with 512 Mb memory running XP Professional SP3. Testcase created with IBOConsole.


 Description  « Hide
Problem discovered when trying to migrate a project from Firebird 1.5.x to 2.0 and after that to (the latest) version 2.1.1.
Note: All execution times are obtained by executing the involved queries twice and take the second time, so there would not be any caching issues.

2 Tables:
Table NUMMER:
select count(*) from nummer: > 14000000

Table NAAM:
select count(*) from naam: > 3700000

Query A:
select count(*) from nummer A JOIN NAAM B ON (B.NAAMID=A.NAAMID) where A.POSTCODE LIKE '81%' and B.naam like 'SEARCHEDNAME%'
Result: 11
Execution time: 2.0188
Plan: PLAN JOIN (B INDEX (NAAM_NAAM), A INDEX (NUMMER_PCHUISNR, NUMMER_NAAMNUMMER))

Query B:
select count(*) from nummer A JOIN NAAM B ON (B.NAAMID=A.NAAMID) where A.POSTCODE LIKE '81%' and B.naam = 'SEARCHEDNAME'
or even:
select count(*) from nummer A JOIN NAAM B ON (B.NAAMID=A.NAAMID) where A.POSTCODE LIKE '81%' and B.naam like 'SEARCHEDNAME'
Result: 6
Execution time: 0.0031
Plan: PLAN JOIN (B INDEX (NAAM_NAAM), A INDEX (NUMMER_PCHUISNR, NUMMER_NAAMNUMMER))


In both cases the same plan is used. And I think both queries can use the same lookup method of the matching B.naam values because the start of the string ('SEARCHEDNAME') is the same. But the first query is about 650 times slower then the second query and has only a double count of records involved. This problem did not occur with Firebird 1.5.X but this behaviour makes Firebird 2.x not useable for this project. Also I tried * instead of count(*).

Results:

Query A:
Execution time: 4.0437
Plan: PLAN JOIN (B INDEX (NAAM_NAAM), A INDEX (NUMMER_PCHUISNR, NUMMER_NAAMNUMMER))

Query B:
Execution time: 0.0079
Plan: PLAN JOIN (B INDEX (NAAM_NAAM), A INDEX (NUMMER_PCHUISNR, NUMMER_NAAMNUMMER))

Again a huge difference: Query B is 512 times faster.
I also tried the search with different names. All had the same result in execution time difference.


The Tables:

CREATE TABLE "NUMMER"
(
  "ID" INTEGER NOT NULL,
  "NUMMER" VARCHAR(12),
  "NAAMID" INTEGER,
  "POSTCODE" CHAR(8),
  "HUISNUMMER_N" INTEGER,
 PRIMARY KEY ("ID")
);

CREATE INDEX "NUMMER_NAAMNUMMER" ON "NUMMER"("NAAMID", "NUMMER");
CREATE INDEX "NUMMER_NUMMER" ON "NUMMER"("NUMMER", "NAAMID");
CREATE INDEX "NUMMER_PCHUISNR" ON "NUMMER"("POSTCODE", "HUISNUMMER_N");

CREATE TABLE "NAAM"
(
  "NAAMID" INTEGER NOT NULL,
  "NAAM" VARCHAR(50),
 PRIMARY KEY ("NAAMID")
);

CREATE INDEX "NAAM_NAAM" ON "NAAM"("NAAM");


 All   Comments   Change History   Subversion Commits      Sort Order: Ascending order - Click to sort in descending order
Vlad Khorsun added a comment - 06/Jan/09 04:13 PM
Show us full execution statistics, please
If possible, for both FB 1.5 and FB 2.1

Mercea Paul added a comment - 06/Jan/09 05:54 PM
If you always have LIKE '81%' you could use STARTING WITH '81'

select count(*) from nummer A JOIN NAAM B ON (B.NAAMID=A.NAAMID) where A.POSTCODE STARTING WITH '81' and B.naam = 'SEARCHEDNAME'

Roy Damman added a comment - 07/Jan/09 02:40 PM
@ Vlad

Which execution statistics do you want and how do I obtain them?

Extra statistics for Firebird 2.1.1 with usage of * instead of count(*)

Query A:
select * from nummer A JOIN NAAM B ON (B.NAAMID=A.NAAMID) where A.POSTCODE LIKE '81%' and B.naam like 'SEARCHEDNAME%'

Execution time: 00:00:04.0343
Prepare time: 00:00:00.0032
Starting Memory: 86508148
Current Memory: 86695300
Delta Memory: 187152
Number of Buffers: 20000
Reads: 1480
Writes: 31
Plan: PLAN JOIN (B INDEX (NAAM_NAAM), A INDEX (NUMMER_PCHUISNR, NUMMER_NAAMNUMMER))
Rows Affected: 11


Query B:
select * from nummer A JOIN NAAM B ON (B.NAAMID=A.NAAMID) where A.POSTCODE LIKE '81%' and B.naam = 'SEARCHEDNAME'

Execution time: 00:00:00.0062
Prepare time: 00:00:00.0031
Starting Memory: 86508080
Current Memory: 86695192
Delta Memory: 187112
Number of Buffers: 20000
Reads: 1480
Writes: 39
Plan: PLAN JOIN (B INDEX (NAAM_NAAM), A INDEX (NUMMER_PCHUISNR, NUMMER_NAAMNUMMER))
Rows Affected: 6


@Mercea
This is not the case. And it shouldn't make any difference if the optimizer was OK.

select * from nummer A JOIN NAAM B ON (B.NAAMID=A.NAAMID) where A.POSTCODE STARTING WITH '81' and B.naam = 'SEARCHEDNAME'
Execution time: 00:00:00.0062
Prepare time: 00:00:00.0016
Rows Affected: 6
Plan: PLAN JOIN (B INDEX (NAAM_NAAM), A INDEX (NUMMER_PCHUISNR, NUMMER_NAAMNUMMER))

select * from nummer A JOIN NAAM B ON (B.NAAMID=A.NAAMID) where A.POSTCODE STARTING WITH '81' and B.naam LIKE 'SEARCHEDNAME%'
Execution time: 00:00:04.0359
Prepare time: 00:00:00.0031
Rows Affected: 11
Plan: PLAN JOIN (B INDEX (NAAM_NAAM), A INDEX (NUMMER_PCHUISNR, NUMMER_NAAMNUMMER))

select * from nummer A JOIN NAAM B ON (B.NAAMID=A.NAAMID) where A.POSTCODE STARTING WITH '81' and B.naam STARTING WITH 'SEARCHEDNAME'
Execution time: 00:00:04.0359 (this is right, exact the same)
Prepare time: 00:00:00.0032
Rows Affected: 11
Plan: PLAN JOIN (B INDEX (NAAM_NAAM), A INDEX (NUMMER_PCHUISNR, NUMMER_NAAMNUMMER))

select * from nummer A JOIN NAAM B ON (B.NAAMID=A.NAAMID) where A.POSTCODE LIKE '81%' and B.naam STARTING WITH 'SEARCHEDNAME'
Execution time: 00:00:04.0375
Prepare time: 00:00:00.0031
Rows Affected: 11
Plan: PLAN JOIN (B INDEX (NAAM_NAAM), A INDEX (NUMMER_PCHUISNR, NUMMER_NAAMNUMMER))






Vlad Khorsun added a comment - 07/Jan/09 05:05 PM
Please, show also results and statistics for the following queries :

SELECT COUNT(*) FROM NAAM WHERE naam LIKE 'SEARCHEDNAME%'
SELECT COUNT(*) FROM NAAM WHERE naam = 'SEARCHEDNAME'
SELECT COUNT(*) FROM nummer WHERE POSTCODE LIKE '81%' and NAAMID = <any id from the query above>

Dmitry Yemanov added a comment - 07/Jan/09 05:16 PM
Number of fetches would also be useful, I suppose.

Roy Damman added a comment - 08/Jan/09 06:44 AM
SELECT COUNT(*) FROM NAAM WHERE naam LIKE 'SEARCHEDNAME%'
COUNT: 91
Execution time: 00:00:00.0000
Prepare time: 00:00:00.0000
Starting Memory: 86328284
Current Memory: 86339364
Delta Memory: 11080
Number of Buffers: 20000
Reads: 313
Writes: 17
Plan: PLAN (NAAM INDEX (NAAM_NAAM))
Rows Affected: 1

SELECT COUNT(*) FROM NAAM WHERE naam = 'SEARCHEDNAME'
COUNT: 1
Execution time: 00:00:00.0000
Prepare time: 00:00:00.0000
Starting Memory: 86329632
Current Memory: 86333200
Delta Memory: 3568
Number of Buffers: 20000
Reads: 313
Writes: 29
Plan: PLAN (NAAM INDEX (NAAM_NAAM))
Rows Affected: 1

SELECT COUNT(*) FROM nummer WHERE POSTCODE LIKE '81%' and NAAMID = 11667 (NAAMID FROM QUERY 2)
COUNT: 6
Execution time: 00:00:00.0031
Prepare time: 00:00:00.0000
Starting Memory: 86344516
Current Memory: 86511124
Delta Memory: 166608
Number of Buffers: 20000
Reads: 992
Writes: 47
Plan: PLAN (NAAM INDEX (NAAM_NAAM))
Rows Affected: 1


I understand in which direction you are searching the explanation. So I changed SEARCHNAME in naam in UNIQUENAME so that the count of =UNIQUENAME and LIKE UNIQUENAME% would be the same. After that the statistics of QUERY A and QUERY B where in the same range. I will install FB 1.5.x once again and repeat the tests above.