New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Simple indexed join query slows terrible down with useage of 'like' [CORE2267] #2693
Comments
Commented by: @hvlad Show us full execution statistics, please |
Commented by: Mercea Paul (m24paul) If you always have LIKE '81%' you could use STARTING WITH '81' select count(*) from nummer A JOIN NAAM B ON (B.NAAMID=A.NAAMID) where A.POSTCODE STARTING WITH '81' and B.naam = 'SEARCHEDNAME' |
Commented by: Roy Damman (royd) @ Vlad Which execution statistics do you want and how do I obtain them? Extra statistics for Firebird 2.1.1 with usage of * instead of count(*) Query A: Execution time: 00:00:04.0343 Query B: Execution time: 00:00:00.0062 @mercea select * from nummer A JOIN NAAM B ON (B.NAAMID=A.NAAMID) where A.POSTCODE STARTING WITH '81' and B.naam = 'SEARCHEDNAME' select * from nummer A JOIN NAAM B ON (B.NAAMID=A.NAAMID) where A.POSTCODE STARTING WITH '81' and B.naam LIKE 'SEARCHEDNAME%' select * from nummer A JOIN NAAM B ON (B.NAAMID=A.NAAMID) where A.POSTCODE STARTING WITH '81' and B.naam STARTING WITH 'SEARCHEDNAME' select * from nummer A JOIN NAAM B ON (B.NAAMID=A.NAAMID) where A.POSTCODE LIKE '81%' and B.naam STARTING WITH 'SEARCHEDNAME' |
Commented by: @hvlad Please, show also results and statistics for the following queries : SELECT COUNT(*) FROM NAAM WHERE naam LIKE 'SEARCHEDNAME%' |
Commented by: @dyemanov Number of fetches would also be useful, I suppose. |
Commented by: Roy Damman (royd) SELECT COUNT(*) FROM NAAM WHERE naam LIKE 'SEARCHEDNAME%' SELECT COUNT(*) FROM NAAM WHERE naam = 'SEARCHEDNAME' SELECT COUNT(*) FROM nummer WHERE POSTCODE LIKE '81%' and NAAMID = 11667 (NAAMID FROM QUERY 2) I understand in which direction you are searching the explanation. So I changed SEARCHNAME in naam in UNIQUENAME so that the count of =UNIQUENAME and LIKE UNIQUENAME% would be the same. After that the statistics of QUERY A and QUERY B where in the same range. I will install FB 1.5.x once again and repeat the tests above. |
Submitted by: Roy Damman (royd)
Problem discovered when trying to migrate a project from Firebird 1.5.x to 2.0 and after that to (the latest) version 2.1.1.
Note: All execution times are obtained by executing the involved queries twice and take the second time, so there would not be any caching issues.
2 Tables:
Table NUMMER:
select count(*) from nummer: > 14000000
Table NAAM:
select count(*) from naam: > 3700000
Query A:
select count(*) from nummer A JOIN NAAM B ON (B.NAAMID=A.NAAMID) where A.POSTCODE LIKE '81%' and B.naam like 'SEARCHEDNAME%'
Result: 11
Execution time: 2.0188
Plan: PLAN JOIN (B INDEX (NAAM_NAAM), A INDEX (NUMMER_PCHUISNR, NUMMER_NAAMNUMMER))
Query B:
select count(*) from nummer A JOIN NAAM B ON (B.NAAMID=A.NAAMID) where A.POSTCODE LIKE '81%' and B.naam = 'SEARCHEDNAME'
or even:
select count(*) from nummer A JOIN NAAM B ON (B.NAAMID=A.NAAMID) where A.POSTCODE LIKE '81%' and B.naam like 'SEARCHEDNAME'
Result: 6
Execution time: 0.0031
Plan: PLAN JOIN (B INDEX (NAAM_NAAM), A INDEX (NUMMER_PCHUISNR, NUMMER_NAAMNUMMER))
In both cases the same plan is used. And I think both queries can use the same lookup method of the matching B.naam values because the start of the string ('SEARCHEDNAME') is the same. But the first query is about 650 times slower then the second query and has only a double count of records involved. This problem did not occur with Firebird 1.5.X but this behaviour makes Firebird 2.x not useable for this project. Also I tried * instead of count(*).
Results:
Query A:
Execution time: 4.0437
Plan: PLAN JOIN (B INDEX (NAAM_NAAM), A INDEX (NUMMER_PCHUISNR, NUMMER_NAAMNUMMER))
Query B:
Execution time: 0.0079
Plan: PLAN JOIN (B INDEX (NAAM_NAAM), A INDEX (NUMMER_PCHUISNR, NUMMER_NAAMNUMMER))
Again a huge difference: Query B is 512 times faster.
I also tried the search with different names. All had the same result in execution time difference.
The Tables:
CREATE TABLE "NUMMER"
(
"ID" INTEGER NOT NULL,
"NUMMER" VARCHAR(12),
"NAAMID" INTEGER,
"POSTCODE" CHAR(8),
"HUISNUMMER_N" INTEGER,
PRIMARY KEY ("ID")
);
CREATE INDEX "NUMMER_NAAMNUMMER" ON "NUMMER"("NAAMID", "NUMMER");
CREATE INDEX "NUMMER_NUMMER" ON "NUMMER"("NUMMER", "NAAMID");
CREATE INDEX "NUMMER_PCHUISNR" ON "NUMMER"("POSTCODE", "HUISNUMMER_N");
CREATE TABLE "NAAM"
(
"NAAMID" INTEGER NOT NULL,
"NAAM" VARCHAR(50),
PRIMARY KEY ("NAAMID")
);
CREATE INDEX "NAAM_NAAM" ON "NAAM"("NAAM");
The text was updated successfully, but these errors were encountered: