UNICODE collations does not work with ICU 49 [CORE3946] #4279
Comments
Modified by: @asfernandes assignee: Adriano dos Santos Fernandes [ asfernandes ] |
Commented by: @asfernandes Does the built-in UNICODE collation work? What is the Linux distro? |
Commented by: @mkubecek It doesn't seem to work: SQL> create database 'localhost:test' default character set UTF8; Distribution is OpenSuSE 12.2. Tested with distribution package (2.5) and 3.0 package from http://download.opensuse.org/repositories/home:/mkubecek:/firebird30/openSUSE_12.2/ Successful tests were on OpenSuSE 11.1 and 11.4 with 2.5 packages from http://download.opensuse.org/repositories/home:/mkubecek:/firebird25/ |
Commented by: @asfernandes Is there anything in firebird.log? |
Commented by: @mkubecek Nothing at all, neither for "create collation" nor for "create table". |
Commented by: @asfernandes Are you using the 32-bit or 64-bit version? Please paste the result of: |
Commented by: @mkubecek It is 64-bit version. unicorn:~ # find /usr/ /lib* -name 'libicu*' |
Commented by: @asfernandes What's the result of command below? objdump -T /usr/lib64/libicuuc.so.49 |grep 'u_init\|u_versionToString\|uloc_countAvailable\|uloc_getAvailable\|uset_close\|uset_getItem\|uset_getItemCount\|uset_open' |
Commented by: @mkubecek mike@unicorn:~> objdump -T /usr/lib64/libicuuc.so.49 |grep 'u_init\|u_versionToString\|uloc_countAvailable\|uloc_getAvailable\|uset_close\|uset_getItem\|uset_getItemCount\|uset_open' mike@unicorn:~> objdump -T /usr/lib64/libicuuc.so.49 |grep 'ucnv_open\|ucnv_close\|ucnv_fromUChars\|u_tolower\|u_toupper\|u_strCompare\|u_countChar32\|utf8_nextCharSafeBody\|UCNV_FROM_U_CALLBACK_STOP\|UCNV_TO_U_CALLBACK_STOP\|ucnv_fromUnicode' mike@unicorn:~> objdump -T /usr/lib64/libicuuc.so.49 |grep 'ucnv_toUnicode\|ucnv_getInvalidChars\|ucnv_getMaxCharSize\|ucnv_getMinCharSize\|ucnv_setFromUCallBack\|ucnv_setToUCallBack' mike@unicorn:~> objdump -T /usr/lib64/libicui18n.so.49 |grep 'ucol_close\|ucol_getContractions\|ucol_getSortKey\|ucol_open\|ucol_setAttribute\|ucol_strcoll\|ucol_getVersion\|utrans_open\|utrans_close\|utrans_transUChars' |
Commented by: @asfernandes I do not see anything problematic. I need you to send a backtrace of the problem, specifying the exact version/build number you're using (I'd prefer it be done with 3.0). gdb -args isql -ch utf8 |
Commented by: @mkubecek This output was created with 3.0.0.30084 (svn revision 57178). The exception is caught in src/jrd/intl.cpp, line 398, CharSetContainer::lookupCollation():
Value of info can be found in the attachment. |
Modified by: @mkubecek Attachment: collation-gdb.txt [ 12238 ] |
Commented by: @asfernandes Please run in ISQL: show collation unicode; |
Commented by: @mkubecek SQL> show collation unicode; I also played a bit more with gdb and got to this stack: #0 Jrd::UnicodeUtil::Utf16Collation::loadICU src/common/unicode_util.cpp:1463 where loadICU("41.128.4.4", "", "icu_versions=default") fails |
Commented by: @asfernandes Don't know why, but the collation on your database is initialized incorrectly. Please locate fbintl.conf and set icu_versions to 4.9: Then retry. Create a new database, show the collation and test to see what happens. |
Commented by: @mkubecek Now it works (with newly created database) and output of 'show collation' is different: SQL> create database '/srv/firebird/test3.fdb'; SQL> select 1 from rdb$database where 'a' = 'a' collate unicode;
============ |
Commented by: @asfernandes Was the test with "icu_versions default" done with a fresh new database too? |
Commented by: @mkubecek Yes (I checked again now to be sure). Could the problem be caused by some part of ICU (or something else) missing during the build? |
Commented by: @mkubecek I did the same test on OpenSuSE 12.1 with ICU 4.6 and 4.8.1 (used both for build and test) and the same version of Firebird. In both cases the collation works even with 'icu_versions = default'. So it looks like some incompatibility was introduced between ICU 4.8 and 4.9. |
Commented by: @asfernandes "default" means the version present at build time. Looks like you have an installed dev package without the actual runtime package. Or there is some problem in the build include path. Locate these lines (here 887) in src/common/unicode_util.cpp:
Put a breakpoint on the last (for) line in the gdb prompt: Once the breakpoint is reached, print version: Or experiment at compile time and check where U_ICU_VERSION_MAJOR_NUM and U_ICU_VERSION_MINOR_NUM are coming from and what their values are. |
Commented by: @mkubecek I get majorVersion = 49, minorVersion = 1, which after filename.printf(ucTemplate, majorVersion, minorVersion); gives filename = "libicuuc.so.491". This fails to load as the name should probably be "libicuuc.so.49". I checked libicu header files and indeed, version 49 (4.9) defines U_ICU_VERSION_MAJOR_NUM=49, U_ICU_VERSION_MINOR_NUM=1 while version 48 (4.8) defined U_ICU_VERSION_MAJOR_NUM=4, U_ICU_VERSION_MINOR_NUM=8. Looking at the version macros defined by 49 (4.9) and 48 (4.8), it seems U_ICU_VERSION_SHORT might be the right one but I'm not sure it will work correctly with older versions as well. Or maybe we could just distinguish cases U_ICU_VERSION_MAJOR_NUM>4 and U_ICU_VERSION_MAJOR_NUM<=4 (and hope they won't change the scheme again). |
Commented by: @asfernandes The problem is that this is not ICU 4.9, it's really ICU 49, but they changed how this is encoded in the filename. Looks like these people have nothing else to do! |
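The naming difference described above can be sketched in C++ as follows. This is only an illustration of the "distinguish U_ICU_VERSION_MAJOR_NUM>4 from <=4" idea suggested in the thread, not the committed Firebird fix; the helper names icuVersionSuffix and icuLibraryName are hypothetical and do not appear in the Firebird sources.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper (not from the Firebird sources): builds the version
// suffix that appears at the end of an ICU shared-library file name.
std::string icuVersionSuffix(int major, int minor)
{
    assert(major > 0 && minor >= 0);

    // New scheme (ICU 49 reports major = 49, minor = 1): only the major
    // number goes into the file name, e.g. "49". The "major > 4" test is
    // the assumption proposed in the thread.
    if (major > 4)
        return std::to_string(major);

    // Old scheme (ICU 4.8 reports major = 4, minor = 8): major and minor
    // are concatenated, e.g. "48".
    return std::to_string(major) + std::to_string(minor);
}

// Hypothetical helper: appends the suffix to a base name such as
// "libicuuc.so.".
std::string icuLibraryName(const std::string& base, int major, int minor)
{
    return base + icuVersionSuffix(major, minor);
}
```

With these helpers, icuLibraryName("libicuuc.so.", 49, 1) yields "libicuuc.so.49" rather than the failing "libicuuc.so.491", while icuLibraryName("libicuuc.so.", 4, 8) still yields "libicuuc.so.48".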
Modified by: @asfernandes summary: create collation for UTF8 from UNICODE fails with ICU 4.9 => UNICODE collations does not work with ICU 49 |
Commented by: @asfernandes I committed a fix for FB 3.0, without testing with ICU 49. Please test it. |
Commented by: @pmakowski hope you will backport it to 2.5 |
Commented by: @mkubecek I confirm that both standard UNICODE collation and custom collation created from it work as expected now. Thank you. |
Commented by: @asfernandes Committed to 2.5 branch. Please test it. |
Modified by: @asfernandes status: Open [ 1 ] => Resolved [ 5 ] resolution: Fixed [ 1 ] Fix Version: 2.5.2 [ 10450 ] Fix Version: 3.0 Alpha 1 [ 10331 ] |
Commented by: @mkubecek Current 2.5 from subversion works for me. Thank you. |
Commented by: @pmakowski this is marked as fixed in 2.5.2, but it is not the case |
Modified by: @pcisar status: Resolved [ 5 ] => Closed [ 6 ] |
Submitted by: @mkubecek
Attachments:
collation-gdb.txt
On a system with ICU 4.9, the command
create collation test_1 for UTF8 from UNICODE;
fails with
Statement failed, SQLSTATE = 42000
unsuccessful metadata update
-Invalid collation attributes
With ICU 4.4, the same command succeeds. With ICU 4.9, it fails with UTF8 and any collation but succeeds with ISO8859_1 or ISO8859_2 charset (tested all ISO8859_1 and about half of ISO8859_2 collations).
Commits: 36dcd8e 8ce4b58