Key: CORE-1502
Type: Bug
Status: Open
Priority: Major
Assignee: Unassigned
Reporter: Richard Wesley
Votes: 0
Watchers: 2
Firebird Core

Index of many unicode characters results in "internal gds software consistency check"

Created: 11/Sep/07 09:42 AM   Updated: 12/Nov/08 09:03 PM
Component/s: Charsets/Collation, Engine
Affects Version/s: 2.0.3
Fix Version/s: None

File Attachments: 1. Zip Archive CREATETABLETEST.FDB.zip (256 kB)

Environment: Windows XP SP2


Description
This looks a lot like CORE-1049 et alia, but it is showing up in 2.0.3:

CREATE TABLE "Sheet1$" (
"Appearance" VARCHAR(6) CHARACTER SET UTF8 COLLATE UNICODE,
"Decimal" FLOAT(53),
"Name" VARCHAR(83) CHARACTER SET UTF8 COLLATE UNICODE,
"Position" VARCHAR(6) CHARACTER SET UTF8 COLLATE UNICODE
);

INSERT INTO "Sheet1$"
("Appearance", "Decimal", "Name", "Position")
VALUES(?, ?, ?, ?);

// At this point, I inserted a table of about 10600 rows with
// distinct single character ucs2 code points in the "Appearance" column
// using the API. Many thanks to my sadistic test crüe...
// We use UTF8 for the communication character set and convert the
// ucs2 to utf8 using the standard Windows character conversion
// routines.

// If I now say:

CREATE INDEX "_tidx_128_1a" ON "Sheet1$" ("Appearance");

// we log the following error in our application:

--- FB Error ---------------------------------------------------------------
File: db\firebirdprotocol.cpp, Line: 1921
Status: 335544333
internal gds software consistency check (index key too big (174), file: idx.cpp line: 448)
----------------------------------------------------------------------------


Comments
Richard Wesley added a comment - 11/Sep/07 09:46 AM
This is a zip of a Firebird database. I think it has a .tde extension but you can just change that to fdb.

Richard Wesley added a comment - 13/Oct/07 03:56 PM
The issue does not appear in 2.1b1.

Adriano dos Santos Fernandes added a comment - 12/Nov/08 09:03 PM
There is a record with the character U+FDFA. You can see it here: http://www.fileformat.info/info/unicode/char/fdfa/index.htm.

Note its decomposition into many other characters. When getting its sort key, ICU returns 55 bytes. How can we deal with it? With our current fixed-size buffers for keys, there is no way...
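
To make the expansion concrete, here is a minimal sketch using Python's standard `unicodedata` module (not ICU and not Firebird code). It only shows how far the single code point U+FDFA grows under compatibility decomposition (NFKD); it does not reproduce the 55-byte ICU sort key mentioned above. The helper name `expansion_stats` is hypothetical and used purely for illustration.

```python
import unicodedata


def expansion_stats(ch: str) -> tuple[int, int, int]:
    """Return (UTF-8 bytes of ch, code points after NFKD, UTF-8 bytes after NFKD).

    Hypothetical helper for illustration; it measures text expansion only,
    not the length of an ICU collation sort key.
    """
    decomposed = unicodedata.normalize("NFKD", ch)
    return (
        len(ch.encode("utf-8")),
        len(decomposed),
        len(decomposed.encode("utf-8")),
    )


if __name__ == "__main__":
    ch = "\uFDFA"
    utf8_len, nfkd_points, nfkd_utf8_len = expansion_stats(ch)
    print(unicodedata.name(ch))
    print(f"UTF-8 bytes as stored:    {utf8_len}")
    print(f"code points after NFKD:   {nfkd_points}")
    print(f"UTF-8 bytes after NFKD:   {nfkd_utf8_len}")
```

The stored value is a single character that fits easily in VARCHAR(6), yet it decomposes into 18 code points, which is why a key buffer sized from the declared column length can be too small for this row.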

The attached test case uses synthetic data, so this problem does not seem to be affecting our users; I believe it's low priority.