Join condition fails for UTF-8 databases. [CORE2826] #3213

firebird-automations · 2010-01-26T16:23:23Z

Submitted by: Håkan Johansson (hakjo)

Is related to CORE3056
Is related to QA274

Join conditions fail in some specific conditions when run on a database created for UTF-8 charset.

Steps to reproduce (I am not sure all is needed, but that is what our framework does):
CREATE DATABASE test_db.fdb USER 'sysdba' PASSWORD 'password' PAGE_SIZE=16384 DEFAULT CHARSET SET UTF8;
CREATE COLLATION UNICODE_NOPAD FOR UTF8 FROM UNICODE NO PAD;
UPDATE RDB$CHARACTER_SETS SET RDB$DEFAULT_COLLATE_NAME = 'UNICODE_NOPAD' WHERE RDB$CHARACTER_SET_NAME = 'UTF8';

CREATE TABLE tst1 (
k1 VARCHAR(3),
k2 INT,
k3 CHAR(1),
PRIMARY KEY (k1, k2, k3)
)
INSERT INTO tst1 VALUES ('AP', 123, ' ');
INSERT INTO tst1 VALUES ('HEL', 666, 'V');

CREATE TABLE tst2 (
k1 VARCHAR(3),
k2 INT,
k3 CHAR(1),
PRIMARY KEY (k1, k2, k3)
)
INSERT INTO tst2 VALUES ('AP', 123, ' ');
INSERT INTO tst2 VALUES ('HEL', 666, 'V');

-- NOTE: tst1 and tst2 are identical, both in structure and data.
-- NOTE: If the 'PRIMARY KEY' is removed, then everything works ok.
-- NOTE: If 'PRIMARY KEY' is replaced with 'UNIQUE', then the error still occurs.
-- NOTE: If the type of 'k3' is changed to 'VARCHAR(1)', then everything works ok.

-- Problem statement:
SELECT t1.*
FROM tst1 t1, tst2 t2
WHERE t1.k1 = t2.k1
AND t1.k2 = t2.k2
AND t1.k3 = t2.k3

-- The result is that no matches are found.
-- NOTE: If the 'k1' test is removed, then the expected result is returned.
-- NOTE: If the 'k2' test is removed, then the expected result is returned.
-- NOTE: If the 'k3' test is removed, then the expected result is returned.
-- NOTE: This means that all combinations of checking two key colums return the expected result, but checking all three at the same time will return nothing.

Commits: 12bb869 cdd366e 632e72c bac1cfa

The text was updated successfully, but these errors were encountered:

firebird-automations · 2010-01-26T16:35:37Z

Commented by: @dyemanov

It could be related to CORE1997 and/or CORE2232.

firebird-automations · 2010-05-07T14:56:33Z

Commented by: Håkan Johansson (hakjo)

Raised priority to critical since this bug makes it impossible to use Firebird with UTF-8 data.

firebird-automations · 2010-05-07T14:56:33Z

Modified by: Håkan Johansson (hakjo)

priority: Major [ 3 ] => Critical [ 2 ]

firebird-automations · 2010-05-07T15:10:35Z

Commented by: @asfernandes

I see no direct relation of this issue with CORE1997 (about cascade) and CORE2232 (commented saying lookup of empty keys are broken).

BTW, I can't reproduce this one in HEAD and in 2.1.1.

firebird-automations · 2010-05-10T11:10:19Z

Commented by: Håkan Johansson (hakjo)

Start with no database.

$ isql -z
ISQL Version: LI-V2.1.1.17910 Firebird 2.1

$ isql
CREATE DATABASE 'localhost:/tmp/test_db.fdb' USER 'sysdba' PASSWORD 'password' PAGE_SIZE=16384 DEFAULT CHARACTER SET UTF8;
CREATE COLLATION UNICODE_NOPAD FOR UTF8 FROM UNICODE NO PAD;
UPDATE RDB$CHARACTER_SETS SET RDB$DEFAULT_COLLATE_NAME = 'UNICODE_NOPAD' WHERE RDB$CHARACTER_SET_NAME = 'UTF8';
COMMIT;
exit;

$isql
CONNECT 'localhost:/tmp/test_db.fdb' USER 'sysdba' PASSWORD 'password';

CREATE TABLE tst1 (
k1 VARCHAR(3),
k2 INT,
k3 CHAR(1)
, PRIMARY KEY (k1, k2, k3)
);

CREATE TABLE tst2 (
k1 VARCHAR(3),
k2 INT,
k3 CHAR(1)
, PRIMARY KEY (k1, k2, k3)
);
COMMIT;

INSERT INTO tst1 VALUES ('AP', 123, ' ');
INSERT INTO tst2 VALUES ('AP', 123, ' ');
INSERT INTO tst1 VALUES ('HEL', 666, 'V');
INSERT INTO tst2 VALUES ('HEL', 666, 'V');
COMMIT;

SELECT t1.*
FROM tst1 t1, tst2 t2
WHERE 1 = 1
AND t1.k1 = t2.k1
AND t1.k2 = t2.k2
AND t1.k3 = t2.k3;

-----------------------------------------------------------------------------------------------------------------------------------------
I get no result from the SELECT statement.

If I don't exit isql between creating the schema and running the commands I get the expected result.

firebird-automations · 2010-05-10T11:38:02Z

Commented by: @hvlad

> If I don't exit isql between creating the schema and running the commands I get the expected result.

What if you issue SET NAMES UTF8 *before* CONNECT ?

firebird-automations · 2010-05-10T11:55:06Z

Commented by: @hvlad

Well, SET NAMES doesn't matters there.
When tables created without re-connect to the database, its columns have collate UTF8.
When tables created after re-connect to the database, its columns have collate UNICODE_NOPAD.

So, UPDATE RDB$CHARACTER_SETS have no effect on database until metadata cache is reloaded.

firebird-automations · 2010-05-10T13:45:28Z

Commented by: @asfernandes

Update of system tables has never been a supported feature, and this specific one is know to cause problem after backup/restore. Anyway, it appears to work in HEAD (and probably in 2.5 too).

Also, 2.5 has ALTER CHARACTER SET to set its default collation. So feel free to test it and report if it has any bug.

firebird-automations · 2010-05-10T13:45:28Z

Modified by: @asfernandes

status: Open [ 1 ] => Resolved [ 5 ]

resolution: Won't Fix [ 2 ]

assignee: Adriano dos Santos Fernandes [ asfernandes ]

firebird-automations · 2010-05-10T14:17:33Z

Commented by: @hvlad

I don't think ticket should be closed.
Here more simple example:

CREATE COLLATION UNICODE_NOPAD FOR UTF8 FROM UNICODE NO PAD;
COMMIT; -- (1)

RECREATE TABLE tst1_nopad (
k1 VARCHAR(3) COLLATE UNICODE_NOPAD,
k2 INT,
k3 CHAR(1) COLLATE UNICODE_NOPAD,
PRIMARY KEY (k1, k2, k3)
);
COMMIT;

INSERT INTO tst1_nopad VALUES ('AP', 123, ' ');
INSERT INTO tst1_nopad VALUES ('HEL', 666, 'V');
COMMIT;

SELECT t1.*
FROM tst1_nopad t1
WHERE t1.k1 = 'AP'
AND t1.k2 = 123
AND t1.k3 = ' ';

no rows

SELECT t1.*
FROM tst1_nopad t1
WHERE t1.k1 = 'AP'
AND t1.k2 = 123
AND t1.k3 = ' '
PLAN (T1 NATURAL)

one row as result.

The reason is that BTR_key (used to insert index key) and BTR_make_key (used to search a row value) creates different keys for the same (?) values.

Also, note, without COMMIT at point (1), fields of tst1_nopad will have collate UTF8 !

firebird-automations · 2010-05-10T14:32:45Z

Modified by: @asfernandes

status: Resolved [ 5 ] => Reopened [ 4 ]

resolution: Won't Fix [ 2 ] =>

firebird-automations · 2010-05-11T18:51:21Z

Commented by: @asfernandes

Reason of collation not being used when doing things in the same transaction is because collation id is generated in DFW but used before it, so it's used as 0 (no collation, just the charset). I'm not going to fix this in 2.5 and older versions.

Second problem of incorrect lookup of the string is not because BTR_key/BTR_make_key usage. It's because a text descriptor is used without shrinking it with INTL_adjust_text_descriptor. As the collation is NO-PAD, that matters.

firebird-automations · 2010-05-11T20:55:18Z

Commented by: @hvlad

> Reason of collation not being used when doing things in the same transaction is because collation id is generated in DFW but used before it, so it's used as 0 (no collation, just the charset). I'm not going to fix this in 2.5 and older versions.

It should be properly documented, IMHO

> Second problem of incorrect lookup of the string is not because BTR_key/BTR_make_key usage. It's because a text descriptor is used without shrinking it with INTL_adjust_text_descriptor. As the collation is NO-PAD, that matters.

Is it possible to backport INTL_adjust_text_descriptor into 2.5 and 2.1 ?
Try to create index on CHAR(1) COLLATE UNICODE_NOPAD field and you'll see why i ask for backporting ;)

firebird-automations · 2010-05-11T21:03:31Z

Commented by: @asfernandes

> It should be properly documented, IMHO

I'll create another ticket for it. IMO, 2.5.0 could live without it. First HEAD, then 2.5.X later.

> Is it possible to backport INTL_adjust_text_descriptor into 2.5 and 2.1 ? Try to create index on CHAR(1) COLLATE UNICODE_NOPAD field and you'll see why i ask for backporting ;)

The function is there, probably with another name and static in evl.cpp. I didn't verified yet. I'll do later.

BTW, for documenting purposes, it's not the lookup that's incorrect, but the stored keys.

firebird-automations · 2010-05-11T21:14:47Z

Commented by: @hvlad

And what about single-segment index keys ?

firebird-automations · 2010-05-11T21:21:06Z

Commented by: @asfernandes

> And what about single-segment index keys ?

Oh, I forgot again to do the usual text find to look for duplicate blocks of code! Will do late too, when I have more time and commits are allowed. Thanks for review.

firebird-automations · 2010-05-11T21:27:33Z

Commented by: @hvlad

Thanks !

PS In both 2.0 and 2.1 there is EVL\adjust_text_descriptor() ;)

firebird-automations · 2010-05-17T19:19:48Z

Modified by: @asfernandes

status: Reopened [ 4 ] => Resolved [ 5 ]

resolution: Fixed [ 1 ]

Fix Version: 2.5 RC3 [ 10381 ]

Fix Version: 2.1.4 [ 10361 ]

Fix Version: 3.0 Alpha 1 [ 10331 ]

firebird-automations · 2010-06-18T15:35:13Z

Modified by: @asfernandes

Link: This issue is related to CORE3056 [ CORE3056 ]

firebird-automations · 2011-02-23T14:10:03Z

Modified by: @pcisar

Link: This issue is related to QA274 [ QA274 ]

firebird-automations · 2011-05-03T14:45:26Z

Commented by: @pmakowski

Q/A test made

firebird-automations · 2011-05-03T14:45:26Z

Modified by: @pmakowski

status: Resolved [ 5 ] => Closed [ 6 ]

firebird-automations · 2015-05-27T22:41:40Z

Modified by: @pavel-zotov

status: Closed [ 6 ] => Closed [ 6 ]

QA Status: Done successfully

firebird-automations closed this as completed May 17, 2010

firebird-automations added affect-version: 2.1.1 fix-version: 2.5 RC3 fix-version: 2.1.4 fix-version: 3.0 Alpha 1 resolution: fixed priority: critical labels Apr 25, 2021

firebird-automations added type: bug qa: done successfully labels Apr 25, 2021

firebird-automations assigned asfernandes Apr 25, 2021

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Join condition fails for UTF-8 databases. [CORE2826] #3213

Join condition fails for UTF-8 databases. [CORE2826] #3213

firebird-automations commented Jan 26, 2010

firebird-automations commented Jan 26, 2010

firebird-automations commented May 7, 2010

firebird-automations commented May 7, 2010

firebird-automations commented May 7, 2010

firebird-automations commented May 10, 2010

firebird-automations commented May 10, 2010

firebird-automations commented May 10, 2010

firebird-automations commented May 10, 2010

firebird-automations commented May 10, 2010

firebird-automations commented May 10, 2010

firebird-automations commented May 10, 2010

firebird-automations commented May 11, 2010

firebird-automations commented May 11, 2010

firebird-automations commented May 11, 2010

firebird-automations commented May 11, 2010

firebird-automations commented May 11, 2010

firebird-automations commented May 11, 2010

firebird-automations commented May 17, 2010

firebird-automations commented Jun 18, 2010

firebird-automations commented Feb 23, 2011

firebird-automations commented May 3, 2011

firebird-automations commented May 3, 2011

firebird-automations commented May 27, 2015

Join condition fails for UTF-8 databases. [CORE2826] #3213

Join condition fails for UTF-8 databases. [CORE2826] #3213

Comments

firebird-automations commented Jan 26, 2010

firebird-automations commented Jan 26, 2010

firebird-automations commented May 7, 2010

firebird-automations commented May 7, 2010

firebird-automations commented May 7, 2010

firebird-automations commented May 10, 2010

firebird-automations commented May 10, 2010

firebird-automations commented May 10, 2010

firebird-automations commented May 10, 2010

firebird-automations commented May 10, 2010

firebird-automations commented May 10, 2010

firebird-automations commented May 10, 2010

firebird-automations commented May 11, 2010

firebird-automations commented May 11, 2010

firebird-automations commented May 11, 2010

firebird-automations commented May 11, 2010

firebird-automations commented May 11, 2010

firebird-automations commented May 11, 2010

firebird-automations commented May 17, 2010

firebird-automations commented Jun 18, 2010

firebird-automations commented Feb 23, 2011

firebird-automations commented May 3, 2011

firebird-automations commented May 3, 2011

firebird-automations commented May 27, 2015