I'll have access to a Fedora 14 VM only on Monday. Then I can debug it.
I've lost some hours installing natty libraries in maverick and debugging the problem.
Then I just realized I was testing ICU 4.4 using a database created with ICU 4.2, so the error is correct. If I set "icu_versions = default 4.2" in fbintl.conf (where default was the 4.4 I compiled with), it worked. Creating a database with ICU 4.4 and using it also worked. So I see no issue to fix. It's the trouble created when people started using ICU from the distro, with ICU not maintaining compatibility between its collation tables.

OK, I will take one more look at it tomorrow, because in my case it's reproduced on a freshly created database.
# ./isql
Use CONNECT or CREATE DATABASE to specify a database
SQL> create database 'test.fdb';
SQL> create table test2 (C1 varchar(32) character set UTF8 collate UNICODE);
Statement failed, SQLSTATE = 22021
unsuccessful metadata update
-TEST2
-COLLATION UNICODE for CHARACTER SET UTF8 is not installed
SQL>

fbs2 bin # ldd isql
.......................................
libicuuc.so.44 => /usr/lib/libicuuc.so.44 (0x00007fc954070000)
libicudata.so.44 => /usr/lib/libicudata.so.44 (0x00007fc953033000)
libicui18n.so.44 => /usr/lib/libicui18n.so.44 (0x00007fc952c60000)

What's the result of:
select rdb$specific_attributes from rdb$collations where rdb$collation_name = 'UNICODE';
And does it have something in the log?

Adriano,
obviously there is something to fix: the Fedora user, me, Alex and Damian can all reproduce the problem with a freshly created database, under Fedora, Gentoo and Debian. Is that enough for you? (Note that the test here is with 2.1.3, but the same is true with 2.1.4 and 2.5.0.)

$ isql-fb -user SYSDBA -password masterkey -ch utf8
Use CONNECT or CREATE DATABASE to specify a database
SQL> create database '/var/lib/firebird/data/test.fdb';
SQL> create table test2 (C1 varchar(32) character set UTF8 collate UNICODE);
Statement failed, SQLCODE = -607
unsuccessful metadata update
-TEST2
-COLLATION UNICODE for CHARACTER SET UTF8 is not installed
SQL> select rdb$specific_attributes from rdb$collations where rdb$collation_name = 'UNICODE';

RDB$SPECIFIC_ATTRIBUTES
=======================
<null>

SQL> show version;
ISQL Version: LI-V2.1.3.18185 Firebird 2.1
Server version:
Firebird/linux AMD64 (access method), version "LI-V2.1.3.18185 Firebird 2.1"
Firebird/linux AMD64 (remote server), version "LI-V2.1.3.18185 Firebird 2.1/tcp (fedora64)/P11"
Firebird/linux AMD64 (remote interface), version "LI-V2.1.3.18185 Firebird 2.1/tcp (fedora64)/P11"
on disk structure version 11.1
SQL> show system collation;
ASCII, CHARACTER SET ASCII, PAD SPACE, SYSTEM
BIG_5, CHARACTER SET BIG_5, PAD SPACE, SYSTEM
BS_BA, CHARACTER SET WIN1250, PAD SPACE, SYSTEM
CP943C, CHARACTER SET CP943C, PAD SPACE, SYSTEM
CP943C_UNICODE, CHARACTER SET CP943C, PAD SPACE, SYSTEM
CS_CZ, CHARACTER SET ISO8859_2, PAD SPACE, SYSTEM
CYRL, CHARACTER SET CYRL, PAD SPACE, SYSTEM
DA_DA, CHARACTER SET ISO8859_1, PAD SPACE, SYSTEM
DB_CSY, CHARACTER SET DOS852, PAD SPACE, SYSTEM
DB_DAN865, CHARACTER SET DOS865, PAD SPACE, SYSTEM
DB_DEU437, CHARACTER SET DOS437, PAD SPACE, SYSTEM
DB_DEU850, CHARACTER SET DOS850, PAD SPACE, SYSTEM
DB_ESP437, CHARACTER SET DOS437, PAD SPACE, SYSTEM
DB_ESP850, CHARACTER SET DOS850, PAD SPACE, SYSTEM
DB_FIN437, CHARACTER SET DOS437, PAD SPACE, SYSTEM
DB_FRA437, CHARACTER SET DOS437, PAD SPACE, SYSTEM
DB_FRA850, CHARACTER SET DOS850, PAD SPACE, SYSTEM
DB_FRC850, CHARACTER SET DOS850, PAD SPACE, SYSTEM
DB_FRC863, CHARACTER SET DOS863, PAD SPACE, SYSTEM
DB_ITA437, CHARACTER SET DOS437, PAD SPACE, SYSTEM
DB_ITA850, CHARACTER SET DOS850, PAD SPACE, SYSTEM
DB_NLD437, CHARACTER SET DOS437, PAD SPACE, SYSTEM
DB_NLD850, CHARACTER SET DOS850, PAD SPACE, SYSTEM
DB_NOR865, CHARACTER SET DOS865, PAD SPACE, SYSTEM
DB_PLK, CHARACTER SET DOS852, PAD SPACE, SYSTEM
DB_PTB850, CHARACTER SET DOS850, PAD SPACE, SYSTEM
DB_PTG860, CHARACTER SET DOS860, PAD SPACE, SYSTEM
DB_RUS, CHARACTER SET CYRL, PAD SPACE, SYSTEM
DB_SLO, CHARACTER SET DOS852, PAD SPACE, SYSTEM
DB_SVE437, CHARACTER SET DOS437, PAD SPACE, SYSTEM
DB_SVE850, CHARACTER SET DOS850, PAD SPACE, SYSTEM
DB_TRK, CHARACTER SET DOS857, PAD SPACE, SYSTEM
DB_UK437, CHARACTER SET DOS437, PAD SPACE, SYSTEM
DB_UK850, CHARACTER SET DOS850, PAD SPACE, SYSTEM
DB_US437, CHARACTER SET DOS437, PAD SPACE, SYSTEM
DB_US850, CHARACTER SET DOS850, PAD SPACE, SYSTEM
DE_DE, CHARACTER SET ISO8859_1, PAD SPACE, SYSTEM
DOS437, CHARACTER SET DOS437, PAD SPACE, SYSTEM
DOS737, CHARACTER SET DOS737, PAD SPACE, SYSTEM
DOS775, CHARACTER SET DOS775, PAD SPACE, SYSTEM
DOS850, CHARACTER SET DOS850, PAD SPACE, SYSTEM
DOS852, CHARACTER SET DOS852, PAD SPACE, SYSTEM
DOS857, CHARACTER SET DOS857, PAD SPACE, SYSTEM
DOS858, CHARACTER SET DOS858, PAD SPACE, SYSTEM
DOS860, CHARACTER SET DOS860, PAD SPACE, SYSTEM
DOS861, CHARACTER SET DOS861, PAD SPACE, SYSTEM
DOS862, CHARACTER SET DOS862, PAD SPACE, SYSTEM
DOS863, CHARACTER SET DOS863, PAD SPACE, SYSTEM
DOS864, CHARACTER SET DOS864, PAD SPACE, SYSTEM
DOS865, CHARACTER SET DOS865, PAD SPACE, SYSTEM
DOS866, CHARACTER SET DOS866, PAD SPACE, SYSTEM
DOS869, CHARACTER SET DOS869, PAD SPACE, SYSTEM
DU_NL, CHARACTER SET ISO8859_1, PAD SPACE, SYSTEM
EN_UK, CHARACTER SET ISO8859_1, PAD SPACE, SYSTEM
EN_US, CHARACTER SET ISO8859_1, PAD SPACE, SYSTEM
ES_ES, CHARACTER SET ISO8859_1, PAD SPACE, 'DISABLE-COMPRESSIONS=1;SPECIALS-FIRST=1', SYSTEM
ES_ES_CI_AI, CHARACTER SET ISO8859_1, PAD SPACE, CASE INSENSITIVE, ACCENT INSENSITIVE, 'DISABLE-COMPRESSIONS=1;SPECIALS-FIRST=1', SYSTEM
EUCJ_0208, CHARACTER SET EUCJ_0208, PAD SPACE, SYSTEM
FI_FI, CHARACTER SET ISO8859_1, PAD SPACE, SYSTEM
FR_CA, CHARACTER SET ISO8859_1, PAD SPACE, SYSTEM
FR_FR, CHARACTER SET ISO8859_1, PAD SPACE, SYSTEM
FR_FR_CI_AI, CHARACTER SET ISO8859_1, FROM EXTERNAL ('FR_FR'), PAD SPACE, CASE INSENSITIVE, ACCENT INSENSITIVE, 'SPECIALS-FIRST=1', SYSTEM
GBK, CHARACTER SET GBK, PAD SPACE, SYSTEM
GBK_UNICODE, CHARACTER SET GBK, PAD SPACE, SYSTEM
GB_2312, CHARACTER SET GB_2312, PAD SPACE, SYSTEM
ISO8859_1, CHARACTER SET ISO8859_1, PAD SPACE, SYSTEM
ISO8859_13, CHARACTER SET ISO8859_13, PAD SPACE, SYSTEM
ISO8859_2, CHARACTER SET ISO8859_2, PAD SPACE, SYSTEM
ISO8859_3, CHARACTER SET ISO8859_3, PAD SPACE, SYSTEM
ISO8859_4, CHARACTER SET ISO8859_4, PAD SPACE, SYSTEM
ISO8859_5, CHARACTER SET ISO8859_5, PAD SPACE, SYSTEM
ISO8859_6, CHARACTER SET ISO8859_6, PAD SPACE, SYSTEM
ISO8859_7, CHARACTER SET ISO8859_7, PAD SPACE, SYSTEM
ISO8859_8, CHARACTER SET ISO8859_8, PAD SPACE, SYSTEM
ISO8859_9, CHARACTER SET ISO8859_9, PAD SPACE, SYSTEM
ISO_HUN, CHARACTER SET ISO8859_2, PAD SPACE, SYSTEM
ISO_PLK, CHARACTER SET ISO8859_2, PAD SPACE, SYSTEM
IS_IS, CHARACTER SET ISO8859_1, PAD SPACE, SYSTEM
IT_IT, CHARACTER SET ISO8859_1, PAD SPACE, SYSTEM
KOI8R, CHARACTER SET KOI8R, PAD SPACE, SYSTEM
KOI8R_RU, CHARACTER SET KOI8R, PAD SPACE, SYSTEM
KOI8U, CHARACTER SET KOI8U, PAD SPACE, SYSTEM
KOI8U_UA, CHARACTER SET KOI8U, PAD SPACE, SYSTEM
KSC_5601, CHARACTER SET KSC_5601, PAD SPACE, SYSTEM
KSC_DICTIONARY, CHARACTER SET KSC_5601, PAD SPACE, SYSTEM
LT_LT, CHARACTER SET ISO8859_13, PAD SPACE, SYSTEM
NEXT, CHARACTER SET NEXT, PAD SPACE, SYSTEM
NONE, CHARACTER SET NONE, PAD SPACE, SYSTEM
NO_NO, CHARACTER SET ISO8859_1, PAD SPACE, SYSTEM
NXT_DEU, CHARACTER SET NEXT, PAD SPACE, SYSTEM
NXT_ESP, CHARACTER SET NEXT, PAD SPACE, SYSTEM
NXT_FRA, CHARACTER SET NEXT, PAD SPACE, SYSTEM
NXT_ITA, CHARACTER SET NEXT, PAD SPACE, SYSTEM
NXT_US, CHARACTER SET NEXT, PAD SPACE, SYSTEM
OCTETS, CHARACTER SET OCTETS, PAD SPACE, SYSTEM
PDOX_ASCII, CHARACTER SET DOS437, PAD SPACE, SYSTEM
PDOX_CSY, CHARACTER SET DOS852, PAD SPACE, SYSTEM
PDOX_CYRL, CHARACTER SET CYRL, PAD SPACE, SYSTEM
PDOX_HUN, CHARACTER SET DOS852, PAD SPACE, SYSTEM
PDOX_INTL, CHARACTER SET DOS437, PAD SPACE, SYSTEM
PDOX_ISL, CHARACTER SET DOS861, PAD SPACE, SYSTEM
PDOX_NORDAN4, CHARACTER SET DOS865, PAD SPACE, SYSTEM
PDOX_PLK, CHARACTER SET DOS852, PAD SPACE, SYSTEM
PDOX_SLO, CHARACTER SET DOS852, PAD SPACE, SYSTEM
PDOX_SWEDFIN, CHARACTER SET DOS437, PAD SPACE, SYSTEM
PT_BR, CHARACTER SET ISO8859_1, PAD SPACE, CASE INSENSITIVE, ACCENT INSENSITIVE, SYSTEM
PT_PT, CHARACTER SET ISO8859_1, PAD SPACE, SYSTEM
PXW_CSY, CHARACTER SET WIN1250, PAD SPACE, SYSTEM
PXW_CYRL, CHARACTER SET WIN1251, PAD SPACE, SYSTEM
PXW_GREEK, CHARACTER SET WIN1253, PAD SPACE, SYSTEM
PXW_HUN, CHARACTER SET WIN1250, PAD SPACE, SYSTEM
PXW_HUNDC, CHARACTER SET WIN1250, PAD SPACE, SYSTEM
PXW_INTL, CHARACTER SET WIN1252, PAD SPACE, SYSTEM
PXW_INTL850, CHARACTER SET WIN1252, PAD SPACE, SYSTEM
PXW_NORDAN4, CHARACTER SET WIN1252, PAD SPACE, SYSTEM
PXW_PLK, CHARACTER SET WIN1250, PAD SPACE, SYSTEM
PXW_SLOV, CHARACTER SET WIN1250, PAD SPACE, SYSTEM
PXW_SPAN, CHARACTER SET WIN1252, PAD SPACE, SYSTEM
PXW_SWEDFIN, CHARACTER SET WIN1252, PAD SPACE, SYSTEM
PXW_TURK, CHARACTER SET WIN1254, PAD SPACE, SYSTEM
SJIS_0208, CHARACTER SET SJIS_0208, PAD SPACE, SYSTEM
SV_SV, CHARACTER SET ISO8859_1, PAD SPACE, SYSTEM
TIS620, CHARACTER SET TIS620, PAD SPACE, SYSTEM
TIS620_UNICODE, CHARACTER SET TIS620, PAD SPACE, SYSTEM
UCS_BASIC, CHARACTER SET UTF8, PAD SPACE, SYSTEM
UNICODE, CHARACTER SET UTF8, PAD SPACE, SYSTEM
UNICODE_CI, CHARACTER SET UTF8, FROM EXTERNAL ('UNICODE'), PAD SPACE, CASE INSENSITIVE, SYSTEM
UNICODE_FSS, CHARACTER SET UNICODE_FSS, PAD SPACE, SYSTEM
UTF8, CHARACTER SET UTF8, PAD SPACE, SYSTEM
WIN1250, CHARACTER SET WIN1250, PAD SPACE, SYSTEM
WIN1251, CHARACTER SET WIN1251, PAD SPACE, SYSTEM
WIN1251_UA, CHARACTER SET WIN1251, PAD SPACE, SYSTEM
WIN1252, CHARACTER SET WIN1252, PAD SPACE, SYSTEM
WIN1253, CHARACTER SET WIN1253, PAD SPACE, SYSTEM
WIN1254, CHARACTER SET WIN1254, PAD SPACE, SYSTEM
WIN1255, CHARACTER SET WIN1255, PAD SPACE, SYSTEM
WIN1256, CHARACTER SET WIN1256, PAD SPACE, SYSTEM
WIN1257, CHARACTER SET WIN1257, PAD SPACE, SYSTEM
WIN1257_EE, CHARACTER SET WIN1257, PAD SPACE, SYSTEM
WIN1257_LT, CHARACTER SET WIN1257, PAD SPACE, SYSTEM
WIN1257_LV, CHARACTER SET WIN1257, PAD SPACE, SYSTEM
WIN1258, CHARACTER SET WIN1258, PAD SPACE, SYSTEM
WIN_CZ, CHARACTER SET WIN1250, PAD SPACE, CASE INSENSITIVE, SYSTEM
WIN_CZ_CI_AI, CHARACTER SET WIN1250, PAD SPACE, CASE INSENSITIVE, ACCENT INSENSITIVE, SYSTEM
WIN_PTBR, CHARACTER SET WIN1252, PAD SPACE, CASE INSENSITIVE, ACCENT INSENSITIVE, SYSTEM
SQL> exit;

[philippe@fedora64 ~]$ ldd /usr/lib64/firebird/intl/fbintl
linux-vdso.so.1 => (0x00007fff2c1ff000)
libicuuc.so.44 => /usr/lib64/libicuuc.so.44 (0x00007f1715ea4000)
libicudata.so.44 => /usr/lib64/libicudata.so.44 (0x00007f1714e66000)
libicui18n.so.44 => /usr/lib64/libicui18n.so.44 (0x00007f1714aa9000)
libdl.so.2 => /lib64/libdl.so.2 (0x00007f17148a5000)
libncurses.so.5 => /lib64/libncurses.so.5 (0x00007f1714681000)
libtinfo.so.5 => /lib64/libtinfo.so.5 (0x00007f171445a000)
libstdc++.so.6 => /usr/lib64/libstdc++.so.6 (0x00007f1714153000)
libm.so.6 => /lib64/libm.so.6 (0x00007f1713ecd000)
libgcc_s.so.1 => /lib64/libgcc_s.so.1 (0x00007f1713cb8000)
libpthread.so.0 => /lib64/libpthread.so.0 (0x00007f1713a9c000)
libc.so.6 => /lib64/libc.so.6 (0x00007f17136ff000)
/lib64/ld-linux-x86-64.so.2 (0x00000038d0800000)
[philippe@fedora64 ~]$

There is nothing in the log here (firebird 2.5.1 from SVN revision 52328)
Ok, now I know what's happening. ICU was using function name encoding pattern name_4_2 and now uses name_44.
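A minimal sketch of the two naming patterns just described, for anyone following along. The helper name icu_symbol_candidates is hypothetical and is not taken from the Firebird sources; ucol_open is used only as an example of an ICU function, and the sketch assumes nothing about which ICU releases use which pattern beyond what is said above.

```python
def icu_symbol_candidates(name, major, minor):
    """Return the versioned export names a loader could try for an ICU function.

    Hypothetical helper for illustration only.
    """
    return [
        f"{name}_{major}_{minor}",  # pattern with separator, e.g. ucol_open_4_2
        f"{name}_{major}{minor}",   # pattern without separator, e.g. ucol_open_44
    ]


if __name__ == "__main__":
    for version in [(4, 2), (4, 4)]:
        print(version, icu_symbol_candidates("ucol_open", *version))
```

A loader that only builds the first form would not find a symbol exported under the second form, which matches the "not installed" symptom reported in this thread.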
I didn't see the bug because I was testing 3.0, and it already has a fix for this.

please send me the patch
They're already committed to 2.1 and 2.5 svn.
ok, saw them in the tracker now
thanks
will it be ok with icu 4.6 too?

> will it be ok with icu 4.6 too?
I do not know. But it will be a shame if they start inventing a new way to encode their function names at each release.

> I didn't see the bug because I was testing 3.0 and it already has a fix for this.
Appears it's my fault. I changed that in trunk in a very massive set of changes, but later forgot to backport it.

ICU developer here. I came here because of https://bugs.launchpad.net/ubuntu/+source/icu/+bug/778386 which pointed to https://bugzilla.redhat.com/show_bug.cgi?id=697313

This sounds like it was resolved, but if there are questions about the renaming (such as your question, Adriano):
- Please see http://userguide.icu-project.org/design#TOC-ICU-Binary-Compatibility:-Using-ICU and the discussion at https://bugs.launchpad.net/ubuntu/+source/icu/+bug/675946
- and also make sure you are availing yourself of mailing lists, bug reports, user guides etc at http://icu-project.org

Specifically, function renaming allows multiple ICU versions to coexist in the same address space. This is particularly important for collation, so that multiple versions of collators are available at the same time without having to rebuild the sort keys. (ICU version X and version Y have different UCA and CLDR versions, and so sort differently and produce different collation keys.) A future feature ('provider') will make it possible to request multiple ICU versions using a keyword without having to link against multiple ICU codebases, but that's not implemented yet.

Have you considered requesting Firebird to be added to http://icu-project.org as a project using ICU?

Regards,
Steven

OK, I just read the SVN commits... why are you loading ICU symbols dynamically???
The docs you mentioned just confirm we're doing the right thing: ICU doesn't support compatibility of collation data from one (major) version to the next, and we can load the required version depending on the database.
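As a rough illustration of loading the required version at run time, here is a sketch of a lookup that tries both suffix patterns discussed in this thread. It is not the code committed to Firebird; resolve_icu_symbol and FakeIcu44 are hypothetical names, and the fake class only stands in for a loaded library so the sketch runs without ICU installed.

```python
def resolve_icu_symbol(lib, name, major, minor):
    """Look up an ICU function in lib, trying both versioned suffix patterns.

    lib is any object that exposes exported symbols as attributes and raises
    AttributeError for missing ones (a ctypes.CDLL instance behaves this way).
    Returns a (symbol_name, symbol) pair.
    """
    candidates = (f"{name}_{major}_{minor}", f"{name}_{major}{minor}")
    for candidate in candidates:
        try:
            return candidate, getattr(lib, candidate)
        except AttributeError:
            continue
    raise LookupError(f"{name}: none of {candidates} found in library")


class FakeIcu44:
    """Stand-in for a library that exports only the name_44 form."""

    def ucol_open_44(self):
        return "collator"


if __name__ == "__main__":
    found_name, _ = resolve_icu_symbol(FakeIcu44(), "ucol_open", 4, 4)
    print(found_name)
```

With a real installation, lib could be something like ctypes.CDLL("libicui18n.so.44"), using the library name shown in the ldd output earlier in the thread; the version to request would come from whatever the database records for its collation.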
> Have you considered requesting Firebird to be added to http://icu-project.org as a project using ICU?
It would be ok in my opinion.

Steven, it's certainly OK to rename symbols like name_X_Y when the library version changes. What's strange is switching at some point to name_XY instead.
Firebird loads ICU symbols dynamically so that it can work correctly with collation keys (stored in database indexes) when it needs to open a database created using a non-default ICU version. Of course, the appropriate version of ICU must be installed for this to work.

And one more question, not directly related to this issue. I want to load the ICU library dynamically using a soft link like /usr/lib/libicuuc.so pointing to a particular library, something like libicuuc.so.44.1. How can I determine the actual version of the loaded library afterwards? (If it's better for you, feel free to reply to peshkoff at mail dot ru.)

Adriano, Alexander. Good to write to you. We're very busy getting our 4.8 out (you could test a milestone or the trunk version if you want).
We changed it from name_x_y to name_xy to save a byte, and because we only track the major+minor version number. We didn't expect that to be a hard dependency; in fact, projects can customize their own suffixes as well. As of, I think, 4.4, the macro U_ICU_ENTRY_POINT_RENAME(x) is the bottleneck for symbol renaming.

You are both exactly right: you need multiple ICU versions to deal with keys. In fact, the 'provider' feature
http://bugs.icu-project.org/trac/ticket/8157
http://bugs.icu-project.org/trac/ticket/6631
which unfortunately won't make 4.8 (slipped again), deals with this by allowing collators to be loaded using a locale ID such as "de_CH@sp=icu44", "de_CH@sp=icu46", etc. This is actually already implemented for the C++ collator for getSortKey, in sort of an unfinished state. It couldn't work with C because of bug 8157. This doesn't work automatically in the stock ICU; you have to do a special build. But it's a 'plugin' to ICU, so you don't have to recompile the application or ICU to add or remove these providers for 44, 46, etc.: they are separate shared libs. The advantage is that the application only has to link against one version of ICU.

I took the approach of building (by macro) an implementation which directly calls into the other ICU versions, rather than calling dlsym:
http://bugs.icu-project.org/trac/browser/tools/trunk/multi/proj/provider/glue/coll_fe.cpp
Then all the issues are caught at link time (in my use, I statically linked the libraries to make everything smaller). This special build has scripts which cause ICU itself to emit the right symbol renaming paths, rather than having to hard-code them.

For your use, what you could do is keep your interfaces which use ICU, but redefine U_ICU_ENTRY_POINT_RENAME, and you could build thin .so's which call into different ICUs. This has been done in similar ways before.

myinterface.c
"
#include <unicode/ucol.h>
myfunction() { ucol_open( ... ) }
...
"
then
"cc -o myinterface_44.o -DU_ICU_ENTRY_POINT_RENAME(x)=x ## _44"
"cc -o myinterface_46.o -DU_ICU_ENTRY_POINT_RENAME(x)=x ## _46"
etc. Then you can verify that it's linkable against a real ICU.

I added firebird to icu-project.org

About your question: the best way to ask is to use the icu-support mailing list and/or our bug forms. I hope you at least look at those sometimes. But, to answer your question, you can call u_getVersion() and it will fill in a UVersionInfo struct (4 bytes) with the actual version number.

Also, if "--enable-auto-cleanup" is configured, ICU will automatically call cleanup when the library is unloaded. I saw some discussion of that. It's ticket http://bugs.icu-project.org/trac/ticket/3126 but it isn't enabled by default, mostly in the absence of user feedback. Usage like yours would be a good example of why this could be important.

I'm glad I was subscribed to launchpad and that an Ubuntu user hit this (not that I'm glad the user had trouble...); otherwise, we would not have known about this interaction. Please, please, ask questions and file bugs upstream!

Hey, I've finally committed the provider interface into ICU trunk. Read the comment on http://bugs.icu-project.org/trac/ticket/8157
When this interface is installed you can request different ICU collations like the following and get different results.
ja@sp=icu38 = --> AN_CX_DX_EX_FX_HO_LJA_NX_S3_PICU38
    95 3A 03 77 9E 03 4D 4F 51 33 33 01 0B 01 A1 85 8F 08 00
ja@sp=icu42 = --> AN_CX_DX_EX_FX_HO_LJA_NX_S3_PICU42
    *A3 3A*7A*B2*03*50*52*54*36*36*01*0B*01*A1*85*8F*08*00
ja@sp=icu44 = --> AN_CX_DX_EX_FX_HO_LJA_NX_S3_PICU44
    *AC 3A*7C*A6 03*51*53*55*37*37 01 0B 01 A1 85 8F 08 00
ja@sp=icu46 = --> AN_CX_DX_EX_FX_HO_LJA_NX_S3_PICU46
    *79*26*03*70*94*03*4B*4D*4F*31*31*01*0B*01*A1*85*8F*08 00
ja@sp=icu48 = --> AN_CX_DX_EX_FX_HO_LJA_NX_S3_PICU48
    79 26 03 70 94 03 4B 4D 4F 31 31 01 0B 01 A1 85 8F 08 00
ja = --> AN_CX_DX_EX_FX_HO_LJA_NX_S3
    79 26 03 70 94 03 4B 4D 4F 31 31 01 0B 01 A1 85 8F 08 00

Today I installed firebird using
apt-get firebird2.5-super
on a fresh and clean debian 6, then I tried to restore my database from a gbk file created from windows with icu 3.0. I get "COLLATION UNICODE_CI_AI for CHARACTER SET UTF8 is not installed". The debian has ICU 4.4 in default locations (ldd shows that fbserver finds all the libs). Is this the same problem?
it's become a major problem since Linux distros now have, or will soon have, icu > 4.2
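To make the compatibility point concrete, here is a small sketch that groups the sort keys posted earlier in this thread for the ja locale by ICU version. The key strings are copied from that listing; treating the "*" characters as markers for changed bytes (and therefore as plain separators) is an assumption on my part, and the helper names parse_key and group_by_key are hypothetical.

```python
# Sort keys as posted in the thread; "*" is assumed to mark changed bytes only.
POSTED_KEYS = {
    "icu38": "95 3A 03 77 9E 03 4D 4F 51 33 33 01 0B 01 A1 85 8F 08 00",
    "icu42": "*A3 3A*7A*B2*03*50*52*54*36*36*01*0B*01*A1*85*8F*08*00",
    "icu44": "*AC 3A*7C*A6 03*51*53*55*37*37 01 0B 01 A1 85 8F 08 00",
    "icu46": "*79*26*03*70*94*03*4B*4D*4F*31*31*01*0B*01*A1*85*8F*08 00",
    "icu48": "79 26 03 70 94 03 4B 4D 4F 31 31 01 0B 01 A1 85 8F 08 00",
}


def parse_key(text):
    """Convert a posted hex listing into bytes, ignoring the '*' markers."""
    return bytes.fromhex(text.replace("*", " "))


def group_by_key(keys):
    """Group version labels whose sort keys are byte-for-byte identical."""
    groups = {}
    for version, text in keys.items():
        groups.setdefault(parse_key(text), []).append(version)
    return list(groups.values())


if __name__ == "__main__":
    for versions in group_by_key(POSTED_KEYS):
        print(versions)
```

Versions that land in different groups produce different keys for the same input, so index keys written under one of them cannot be compared safely under another. That is why a database needs the ICU version it was created with.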