Collation is not installed with icu > 4.2 [CORE3447] #3808

Closed
firebird-automations opened this issue Apr 19, 2011 · 24 comments

Comments

@firebird-automations
Collaborator

Submitted by: @pmakowski

$ ldd /usr/lib64/firebird/bin/fbserver | grep icu
libicuuc.so.44 => /usr/lib64/libicuuc.so.44 (0x00000038e9400000)
libicudata.so.44 => /usr/lib64/libicudata.so.44 (0x00000038e9c00000)
libicui18n.so.44 => /usr/lib64/libicui18n.so.44 (0x00000038eae00000)

show system collations;
gives:
UNICODE, CHARACTER SET UTF8, PAD SPACE, SYSTEM
UNICODE_CI, CHARACTER SET UTF8, FROM EXTERNAL ('UNICODE'), PAD SPACE,
CASE INSENSITIVE, SYSTEM
UNICODE_FSS, CHARACTER SET UNICODE_FSS, PAD SPACE, SYSTEM
UTF8, CHARACTER SET UTF8, PAD SPACE, SYSTEM

but:
SQL> create table test (C1 varchar(32) character set UTF8 collate UTF8,
C2 varchar(10) character set ISO8859_1 collate FR_FR);
SQL> show table test;
C1 VARCHAR(32) CHARACTER SET UTF8 Nullable
C2 VARCHAR(10) CHARACTER SET ISO8859_1 Nullable COLLATE FR_FR

SQL> create table test2 (C1 varchar(32) character set UTF8 collate UNICODE);
Statement failed, SQLCODE = -607
unsuccessful metadata update
-TEST2
-COLLATION UNICODE for CHARACTER SET UTF8 is not installed

The same happens with unicode_ci and unicode_ci_ai, and certainly others.

First reported in the Fedora bug tracker: https://bugzilla.redhat.com/show_bug.cgi?id=697313

Commits: 7e88a1f 5e4be8e

====== Test Details ======

Passed on: WI-V2.5.5.26871, WI-T3.0.0.31844; LI-V2.5.3.26788, LI-T3.0.0.31842

@firebird-automations
Collaborator Author

Modified by: @pmakowski

assignee: Adriano dos Santos Fernandes [ asfernandes ]

@firebird-automations
Collaborator Author

Commented by: @pmakowski

Adriano (or someone else), is there anything I can do to help you solve this?
It has become a major problem, since Linux distros now have, or will soon have, icu > 4.2.

@firebird-automations
Collaborator Author

Commented by: @asfernandes

I'll have access to Fedora 14 VM only on Monday. Then I can debug it.

@firebird-automations
Collaborator Author

Commented by: @asfernandes

I lost some hours installing natty libraries in maverick and debugging the problem.

Then I realized I was testing icu 4.4 using a database created with icu 4.2, in which case the error is correct. When I set "icu_versions = default 4.2" in fbintl.conf (where default was the 4.4 I compiled against), it worked.

Creating a database with icu 4.4 and then using it also worked.

So I see no issue to fix. This is the trouble created when people started using icu from the distro, with icu not maintaining compatibility between their collation tables.

@firebird-automations
Collaborator Author

Commented by: @AlexPeshkoff

OK, I will take one more look at it tomorrow, because in my case it is reproduced on a freshly created database.

# ./isql
Use CONNECT or CREATE DATABASE to specify a database
SQL> create database 'test.fdb';
SQL> create table test2 (C1 varchar(32) character set UTF8 collate UNICODE);
Statement failed, SQLSTATE = 22021
unsuccessful metadata update
-TEST2
-COLLATION UNICODE for CHARACTER SET UTF8 is not installed
SQL> fbs2 bin # ldd isql
.......................................
libicuuc.so.44 => /usr/lib/libicuuc.so.44 (0x00007fc954070000)
libicudata.so.44 => /usr/lib/libicudata.so.44 (0x00007fc953033000)
libicui18n.so.44 => /usr/lib/libicui18n.so.44 (0x00007fc952c60000)

@firebird-automations
Collaborator Author

Commented by: @asfernandes

What's the result of:
select rdb$specific_attributes from rdb$collations where rdb$collation_name = 'UNICODE';

And does it have something in the log?

@firebird-automations
Collaborator Author

Commented by: @pmakowski

Adriano,
obviously there is something to fix: the Fedora user, me, Alex and Damyan can all reproduce the problem with a freshly created database, under Fedora, Gentoo and Debian.

Is this enough for you? (Note that this test is with 2.1.3, but the same is true with 2.1.4 and 2.5.0.)

$ isql-fb -user SYSDBA -password masterkey -ch utf8
Use CONNECT or CREATE DATABASE to specify a database
SQL> create database '/var/lib/firebird/data/test.fdb';
SQL> create table test2 (C1 varchar(32) character set UTF8 collate UNICODE);
Statement failed, SQLCODE = -607
unsuccessful metadata update
-TEST2
-COLLATION UNICODE for CHARACTER SET UTF8 is not installed
SQL> select rdb$specific_attributes from rdb$collations where rdb$collation_name = 'UNICODE';

RDB$SPECIFIC_ATTRIBUTES

             <null>

SQL> show version;
ISQL Version: LI-V2.1.3.18185 Firebird 2.1
Server version:
Firebird/linux AMD64 (access method), version "LI-V2.1.3.18185 Firebird 2.1"
Firebird/linux AMD64 (remote server), version "LI-V2.1.3.18185 Firebird 2.1/tcp (fedora64)/P11"
Firebird/linux AMD64 (remote interface), version "LI-V2.1.3.18185 Firebird 2.1/tcp (fedora64)/P11"
on disk structure version 11.1
SQL> show system collation;
ASCII, CHARACTER SET ASCII, PAD SPACE, SYSTEM
BIG_5, CHARACTER SET BIG_5, PAD SPACE, SYSTEM
BS_BA, CHARACTER SET WIN1250, PAD SPACE, SYSTEM
CP943C, CHARACTER SET CP943C, PAD SPACE, SYSTEM
CP943C_UNICODE, CHARACTER SET CP943C, PAD SPACE, SYSTEM
CS_CZ, CHARACTER SET ISO8859_2, PAD SPACE, SYSTEM
CYRL, CHARACTER SET CYRL, PAD SPACE, SYSTEM
DA_DA, CHARACTER SET ISO8859_1, PAD SPACE, SYSTEM
DB_CSY, CHARACTER SET DOS852, PAD SPACE, SYSTEM
DB_DAN865, CHARACTER SET DOS865, PAD SPACE, SYSTEM
DB_DEU437, CHARACTER SET DOS437, PAD SPACE, SYSTEM
DB_DEU850, CHARACTER SET DOS850, PAD SPACE, SYSTEM
DB_ESP437, CHARACTER SET DOS437, PAD SPACE, SYSTEM
DB_ESP850, CHARACTER SET DOS850, PAD SPACE, SYSTEM
DB_FIN437, CHARACTER SET DOS437, PAD SPACE, SYSTEM
DB_FRA437, CHARACTER SET DOS437, PAD SPACE, SYSTEM
DB_FRA850, CHARACTER SET DOS850, PAD SPACE, SYSTEM
DB_FRC850, CHARACTER SET DOS850, PAD SPACE, SYSTEM
DB_FRC863, CHARACTER SET DOS863, PAD SPACE, SYSTEM
DB_ITA437, CHARACTER SET DOS437, PAD SPACE, SYSTEM
DB_ITA850, CHARACTER SET DOS850, PAD SPACE, SYSTEM
DB_NLD437, CHARACTER SET DOS437, PAD SPACE, SYSTEM
DB_NLD850, CHARACTER SET DOS850, PAD SPACE, SYSTEM
DB_NOR865, CHARACTER SET DOS865, PAD SPACE, SYSTEM
DB_PLK, CHARACTER SET DOS852, PAD SPACE, SYSTEM
DB_PTB850, CHARACTER SET DOS850, PAD SPACE, SYSTEM
DB_PTG860, CHARACTER SET DOS860, PAD SPACE, SYSTEM
DB_RUS, CHARACTER SET CYRL, PAD SPACE, SYSTEM
DB_SLO, CHARACTER SET DOS852, PAD SPACE, SYSTEM
DB_SVE437, CHARACTER SET DOS437, PAD SPACE, SYSTEM
DB_SVE850, CHARACTER SET DOS850, PAD SPACE, SYSTEM
DB_TRK, CHARACTER SET DOS857, PAD SPACE, SYSTEM
DB_UK437, CHARACTER SET DOS437, PAD SPACE, SYSTEM
DB_UK850, CHARACTER SET DOS850, PAD SPACE, SYSTEM
DB_US437, CHARACTER SET DOS437, PAD SPACE, SYSTEM
DB_US850, CHARACTER SET DOS850, PAD SPACE, SYSTEM
DE_DE, CHARACTER SET ISO8859_1, PAD SPACE, SYSTEM
DOS437, CHARACTER SET DOS437, PAD SPACE, SYSTEM
DOS737, CHARACTER SET DOS737, PAD SPACE, SYSTEM
DOS775, CHARACTER SET DOS775, PAD SPACE, SYSTEM
DOS850, CHARACTER SET DOS850, PAD SPACE, SYSTEM
DOS852, CHARACTER SET DOS852, PAD SPACE, SYSTEM
DOS857, CHARACTER SET DOS857, PAD SPACE, SYSTEM
DOS858, CHARACTER SET DOS858, PAD SPACE, SYSTEM
DOS860, CHARACTER SET DOS860, PAD SPACE, SYSTEM
DOS861, CHARACTER SET DOS861, PAD SPACE, SYSTEM
DOS862, CHARACTER SET DOS862, PAD SPACE, SYSTEM
DOS863, CHARACTER SET DOS863, PAD SPACE, SYSTEM
DOS864, CHARACTER SET DOS864, PAD SPACE, SYSTEM
DOS865, CHARACTER SET DOS865, PAD SPACE, SYSTEM
DOS866, CHARACTER SET DOS866, PAD SPACE, SYSTEM
DOS869, CHARACTER SET DOS869, PAD SPACE, SYSTEM
DU_NL, CHARACTER SET ISO8859_1, PAD SPACE, SYSTEM
EN_UK, CHARACTER SET ISO8859_1, PAD SPACE, SYSTEM
EN_US, CHARACTER SET ISO8859_1, PAD SPACE, SYSTEM
ES_ES, CHARACTER SET ISO8859_1, PAD SPACE, 'DISABLE-COMPRESSIONS=1;SPECIALS-FIRST=1', SYSTEM
ES_ES_CI_AI, CHARACTER SET ISO8859_1, PAD SPACE, CASE INSENSITIVE, ACCENT INSENSITIVE, 'DISABLE-COMPRESSIONS=1;SPECIALS-FIRST=1', SYSTEM
EUCJ_0208, CHARACTER SET EUCJ_0208, PAD SPACE, SYSTEM
FI_FI, CHARACTER SET ISO8859_1, PAD SPACE, SYSTEM
FR_CA, CHARACTER SET ISO8859_1, PAD SPACE, SYSTEM
FR_FR, CHARACTER SET ISO8859_1, PAD SPACE, SYSTEM
FR_FR_CI_AI, CHARACTER SET ISO8859_1, FROM EXTERNAL ('FR_FR'), PAD SPACE, CASE INSENSITIVE, ACCENT INSENSITIVE, 'SPECIALS-FIRST=1', SYSTEM
GBK, CHARACTER SET GBK, PAD SPACE, SYSTEM
GBK_UNICODE, CHARACTER SET GBK, PAD SPACE, SYSTEM
GB_2312, CHARACTER SET GB_2312, PAD SPACE, SYSTEM
ISO8859_1, CHARACTER SET ISO8859_1, PAD SPACE, SYSTEM
ISO8859_13, CHARACTER SET ISO8859_13, PAD SPACE, SYSTEM
ISO8859_2, CHARACTER SET ISO8859_2, PAD SPACE, SYSTEM
ISO8859_3, CHARACTER SET ISO8859_3, PAD SPACE, SYSTEM
ISO8859_4, CHARACTER SET ISO8859_4, PAD SPACE, SYSTEM
ISO8859_5, CHARACTER SET ISO8859_5, PAD SPACE, SYSTEM
ISO8859_6, CHARACTER SET ISO8859_6, PAD SPACE, SYSTEM
ISO8859_7, CHARACTER SET ISO8859_7, PAD SPACE, SYSTEM
ISO8859_8, CHARACTER SET ISO8859_8, PAD SPACE, SYSTEM
ISO8859_9, CHARACTER SET ISO8859_9, PAD SPACE, SYSTEM
ISO_HUN, CHARACTER SET ISO8859_2, PAD SPACE, SYSTEM
ISO_PLK, CHARACTER SET ISO8859_2, PAD SPACE, SYSTEM
IS_IS, CHARACTER SET ISO8859_1, PAD SPACE, SYSTEM
IT_IT, CHARACTER SET ISO8859_1, PAD SPACE, SYSTEM
KOI8R, CHARACTER SET KOI8R, PAD SPACE, SYSTEM
KOI8R_RU, CHARACTER SET KOI8R, PAD SPACE, SYSTEM
KOI8U, CHARACTER SET KOI8U, PAD SPACE, SYSTEM
KOI8U_UA, CHARACTER SET KOI8U, PAD SPACE, SYSTEM
KSC_5601, CHARACTER SET KSC_5601, PAD SPACE, SYSTEM
KSC_DICTIONARY, CHARACTER SET KSC_5601, PAD SPACE, SYSTEM
LT_LT, CHARACTER SET ISO8859_13, PAD SPACE, SYSTEM
NEXT, CHARACTER SET NEXT, PAD SPACE, SYSTEM
NONE, CHARACTER SET NONE, PAD SPACE, SYSTEM
NO_NO, CHARACTER SET ISO8859_1, PAD SPACE, SYSTEM
NXT_DEU, CHARACTER SET NEXT, PAD SPACE, SYSTEM
NXT_ESP, CHARACTER SET NEXT, PAD SPACE, SYSTEM
NXT_FRA, CHARACTER SET NEXT, PAD SPACE, SYSTEM
NXT_ITA, CHARACTER SET NEXT, PAD SPACE, SYSTEM
NXT_US, CHARACTER SET NEXT, PAD SPACE, SYSTEM
OCTETS, CHARACTER SET OCTETS, PAD SPACE, SYSTEM
PDOX_ASCII, CHARACTER SET DOS437, PAD SPACE, SYSTEM
PDOX_CSY, CHARACTER SET DOS852, PAD SPACE, SYSTEM
PDOX_CYRL, CHARACTER SET CYRL, PAD SPACE, SYSTEM
PDOX_HUN, CHARACTER SET DOS852, PAD SPACE, SYSTEM
PDOX_INTL, CHARACTER SET DOS437, PAD SPACE, SYSTEM
PDOX_ISL, CHARACTER SET DOS861, PAD SPACE, SYSTEM
PDOX_NORDAN4, CHARACTER SET DOS865, PAD SPACE, SYSTEM
PDOX_PLK, CHARACTER SET DOS852, PAD SPACE, SYSTEM
PDOX_SLO, CHARACTER SET DOS852, PAD SPACE, SYSTEM
PDOX_SWEDFIN, CHARACTER SET DOS437, PAD SPACE, SYSTEM
PT_BR, CHARACTER SET ISO8859_1, PAD SPACE, CASE INSENSITIVE, ACCENT INSENSITIVE, SYSTEM
PT_PT, CHARACTER SET ISO8859_1, PAD SPACE, SYSTEM
PXW_CSY, CHARACTER SET WIN1250, PAD SPACE, SYSTEM
PXW_CYRL, CHARACTER SET WIN1251, PAD SPACE, SYSTEM
PXW_GREEK, CHARACTER SET WIN1253, PAD SPACE, SYSTEM
PXW_HUN, CHARACTER SET WIN1250, PAD SPACE, SYSTEM
PXW_HUNDC, CHARACTER SET WIN1250, PAD SPACE, SYSTEM
PXW_INTL, CHARACTER SET WIN1252, PAD SPACE, SYSTEM
PXW_INTL850, CHARACTER SET WIN1252, PAD SPACE, SYSTEM
PXW_NORDAN4, CHARACTER SET WIN1252, PAD SPACE, SYSTEM
PXW_PLK, CHARACTER SET WIN1250, PAD SPACE, SYSTEM
PXW_SLOV, CHARACTER SET WIN1250, PAD SPACE, SYSTEM
PXW_SPAN, CHARACTER SET WIN1252, PAD SPACE, SYSTEM
PXW_SWEDFIN, CHARACTER SET WIN1252, PAD SPACE, SYSTEM
PXW_TURK, CHARACTER SET WIN1254, PAD SPACE, SYSTEM
SJIS_0208, CHARACTER SET SJIS_0208, PAD SPACE, SYSTEM
SV_SV, CHARACTER SET ISO8859_1, PAD SPACE, SYSTEM
TIS620, CHARACTER SET TIS620, PAD SPACE, SYSTEM
TIS620_UNICODE, CHARACTER SET TIS620, PAD SPACE, SYSTEM
UCS_BASIC, CHARACTER SET UTF8, PAD SPACE, SYSTEM
UNICODE, CHARACTER SET UTF8, PAD SPACE, SYSTEM
UNICODE_CI, CHARACTER SET UTF8, FROM EXTERNAL ('UNICODE'), PAD SPACE, CASE INSENSITIVE, SYSTEM
UNICODE_FSS, CHARACTER SET UNICODE_FSS, PAD SPACE, SYSTEM
UTF8, CHARACTER SET UTF8, PAD SPACE, SYSTEM
WIN1250, CHARACTER SET WIN1250, PAD SPACE, SYSTEM
WIN1251, CHARACTER SET WIN1251, PAD SPACE, SYSTEM
WIN1251_UA, CHARACTER SET WIN1251, PAD SPACE, SYSTEM
WIN1252, CHARACTER SET WIN1252, PAD SPACE, SYSTEM
WIN1253, CHARACTER SET WIN1253, PAD SPACE, SYSTEM
WIN1254, CHARACTER SET WIN1254, PAD SPACE, SYSTEM
WIN1255, CHARACTER SET WIN1255, PAD SPACE, SYSTEM
WIN1256, CHARACTER SET WIN1256, PAD SPACE, SYSTEM
WIN1257, CHARACTER SET WIN1257, PAD SPACE, SYSTEM
WIN1257_EE, CHARACTER SET WIN1257, PAD SPACE, SYSTEM
WIN1257_LT, CHARACTER SET WIN1257, PAD SPACE, SYSTEM
WIN1257_LV, CHARACTER SET WIN1257, PAD SPACE, SYSTEM
WIN1258, CHARACTER SET WIN1258, PAD SPACE, SYSTEM
WIN_CZ, CHARACTER SET WIN1250, PAD SPACE, CASE INSENSITIVE, SYSTEM
WIN_CZ_CI_AI, CHARACTER SET WIN1250, PAD SPACE, CASE INSENSITIVE, ACCENT INSENSITIVE, SYSTEM
WIN_PTBR, CHARACTER SET WIN1252, PAD SPACE, CASE INSENSITIVE, ACCENT INSENSITIVE, SYSTEM

SQL> exit;
[philippe@fedora64 ~]$ ldd /usr/lib64/firebird/intl/fbintl
linux-vdso.so.1 => (0x00007fff2c1ff000)
libicuuc.so.44 => /usr/lib64/libicuuc.so.44 (0x00007f1715ea4000)
libicudata.so.44 => /usr/lib64/libicudata.so.44 (0x00007f1714e66000)
libicui18n.so.44 => /usr/lib64/libicui18n.so.44 (0x00007f1714aa9000)
libdl.so.2 => /lib64/libdl.so.2 (0x00007f17148a5000)
libncurses.so.5 => /lib64/libncurses.so.5 (0x00007f1714681000)
libtinfo.so.5 => /lib64/libtinfo.so.5 (0x00007f171445a000)
libstdc++.so.6 => /usr/lib64/libstdc++.so.6 (0x00007f1714153000)
libm.so.6 => /lib64/libm.so.6 (0x00007f1713ecd000)
libgcc_s.so.1 => /lib64/libgcc_s.so.1 (0x00007f1713cb8000)
libpthread.so.0 => /lib64/libpthread.so.0 (0x00007f1713a9c000)
libc.so.6 => /lib64/libc.so.6 (0x00007f17136ff000)
/lib64/ld-linux-x86-64.so.2 (0x00000038d0800000)
[philippe@fedora64 ~]$

@firebird-automations
Collaborator Author

Commented by: Damyan Ivanov (dam)

There is nothing in the log here (firebird 2.5.1 from SVN revision 52328)

@firebird-automations
Collaborator Author

Commented by: @asfernandes

OK, now I know what's happening. ICU was using the function name encoding pattern name_4_2 and now uses name_44.

I didn't see the bug because I was testing 3.0, which already has a fix for this.
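
To make the two naming patterns concrete, here is a minimal Python sketch of how a loader might build the candidate names to look up for a given ICU version. The helper `candidate_symbols` is hypothetical and is not Firebird's actual code; it only illustrates the name_4_2 versus name_44 difference described above, plus the unsuffixed name as a last resort.

```python
def candidate_symbols(base, major, minor):
    """Return versioned names to try for an ICU entry point (illustrative only)."""
    return [
        f"{base}_{major}{minor}",   # newer pattern, e.g. ucol_open_44
        f"{base}_{major}_{minor}",  # older pattern, e.g. ucol_open_4_2
        base,                       # unsuffixed name, assumed fallback
    ]


if __name__ == "__main__":
    # Show what a loader would probe for ICU 4.2 and ICU 4.4.
    for version in ((4, 2), (4, 4)):
        print(version, candidate_symbols("ucol_open", *version))
```

A loader that only tries the older pattern finds nothing in an ICU 4.4 library, which would match the "not installed" symptom reported here.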

@firebird-automations
Collaborator Author

Modified by: @asfernandes

status: Open [ 1 ] => Resolved [ 5 ]

resolution: Fixed [ 1 ]

Fix Version: 2.5.1 [ 10333 ]

Fix Version: 3.0 Alpha 1 [ 10331 ]

Fix Version: 2.1.5 [ 10420 ]

@firebird-automations
Collaborator Author

Commented by: @pmakowski

please send me the patch

@firebird-automations
Collaborator Author

Commented by: @asfernandes

They're already committed to 2.1 and 2.5 svn.

@firebird-automations
Collaborator Author

Commented by: @pmakowski

ok, saw them in the tracker now
thanks

Will it be OK with icu 4.6 too?

@firebird-automations
Collaborator Author

Commented by: @asfernandes

> Will it be OK with icu 4.6 too?

I do not know. But it will be a shame if they start to invent a new way to encode their function names at each release.

@firebird-automations
Collaborator Author

Commented by: @AlexPeshkoff

> I didn't see the bug because I was testing 3.0, which already has a fix for this.

It appears that's my fault. I changed that in trunk as part of a very large set of changes, but later forgot to backport it.

@firebird-automations
Collaborator Author

Commented by: Steven R. Loomis (srl_icu-project.org)

ICU developer here. I came here because of https://bugs.launchpad.net/ubuntu/+source/icu/+bug/778386 which pointed to https://bugzilla.redhat.com/show_bug.cgi?id=697313

This sounds like it was resolved, but if there are questions about the renaming (such as your question, Adriano), please see http://userguide.icu-project.org/design#TOC-ICU-Binary-Compatibility:-Using-ICU and the discussion at https://bugs.launchpad.net/ubuntu/+source/icu/+bug/675946. Also make sure you are availing yourself of the mailing lists, bug reports, user guides etc. at http://icu-project.org

Specifically, function renaming allows multiple ICU versions to coexist in the same address space. This is particularly important for collation, so that multiple versions of collators are available at the same time without having to rebuild the sort keys. (ICU version X and version Y have different UCA and CLDR versions, and so sort differently and produce different collation keys.) A future feature ('provider') will make it possible to request multiple ICU versions using a keyword without having to link against multiple ICU codebases, but that's not implemented yet.

Have you considered requesting Firebird to be added to http://icu-project.org as a project using ICU?

Regards,
Steven

@firebird-automations
Collaborator Author

Commented by: Steven R. Loomis (srl_icu-project.org)

OK, I just read the SVN commits... why are you loading ICU symbols dynamically???

@firebird-automations
Collaborator Author

Commented by: @asfernandes

The docs you mentioned just confirm we're doing the right thing: ICU doesn't support compatibility of collation data from one (major) version to the next, and we can load the required version depending on the database.

> Have you considered requesting Firebird to be added to http://icu-project.org as a project using ICU?

It would be ok in my opinion.

@firebird-automations
Collaborator Author

Commented by: @AlexPeshkoff

Steven, it's certainly OK to rename symbols like name_X_Y when the library version changes. What's strange is switching at some point to name_XY instead.

Firebird loads ICU symbols dynamically in order to work correctly with collation keys (stored in database indexes) when it needs to open a database created using a non-default ICU version. Of course, the appropriate version of ICU must be installed to make it work.

And one more question, not directly related to this issue. I want to load the ICU library dynamically using a soft link like /usr/lib/libicuuc.so pointing to a particular library, something like libicuuc.so.44.1. How can I determine the actual version of the loaded library afterwards? (If it's better for you, feel free to reply to peshkoff at mail dot ru.)
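
As a rough illustration of loading a specific ICU version at run time, the following Python sketch maps a version such as 4.4 to a library name like the libicuuc.so.44 shown in the ldd output earlier in this thread, and reports whether it can be loaded. The helpers `soname_for` and `try_load` are hypothetical names for this sketch only; it is not how Firebird itself is implemented, and the naming rule is assumed from the Linux examples in this thread.

```python
import ctypes


def soname_for(version):
    """Map an ICU version like '4.4' to a library name like libicuuc.so.44."""
    major, minor = version.split(".")[:2]
    return f"libicuuc.so.{major}{minor}"


def try_load(soname):
    """Return a ctypes handle for the library, or None if it cannot be loaded."""
    try:
        return ctypes.CDLL(soname)
    except OSError:
        return None


if __name__ == "__main__":
    # Probe the two versions discussed in this issue.
    for version in ("4.2", "4.4"):
        name = soname_for(version)
        status = "loaded" if try_load(name) else "not installed"
        print(f"{name}: {status}")
```

Returning None instead of raising keeps the "required version is not installed" case explicit for the caller.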

@firebird-automations
Collaborator Author

Commented by: Steven R. Loomis (srl_icu-project.org)

Adriano, Alexander. Good to write you. We're very busy getting our 4.8 out ( you could test a milestone or the trunk version if you want.. ).

We changed it from name_x_y to name_xy to save a byte, and because we only track the major+minor version number. We didn't expect that to be a hard dependency; in fact, projects can customize their own suffixes as well. As of 4.4, I think, the macro U_ICU_ENTRY_POINT_RENAME(x) is the bottleneck for symbol renaming.

You are both exactly right: you need multiple ICU versions to deal with keys. In fact, the 'provider' feature (http://bugs.icu-project.org/trac/ticket/8157, http://bugs.icu-project.org/trac/ticket/6631), which unfortunately won't make 4.8 (slipped again), deals with this by allowing collators to be loaded using a locale ID such as "de_CH@sp=icu44", "de_CH@sp=icu46", etc. This is actually already implemented for the C++ collator for getSortKey, in sort of an unfinished state. It couldn't work with C because of bug 8157. This doesn't work automatically in the stock ICU; you have to do a special build. But it's a 'plugin' to ICU, so you don't have to recompile the application or ICU to add or remove the providers for 44, 46, etc., which are separate shared libs. The advantage is that the application only has to link against one version of ICU.

I took the approach of building (by macro) an implementation which directly called into the other ICU versions, rather than calling dlsym: http://bugs.icu-project.org/trac/browser/tools/trunk/multi/proj/provider/glue/coll_fe.cpp then all the issues are caught at link time ( in my use, I statically linked the libraries to make everything smaller ) - this special build has scripts which cause ICU itself to emit the right symbol renaming paths, rather than having to hard code them.

For your use, what you could do is keep your interfaces which use ICU, but redefine U_ICU_ENTRY_POINT_RENAME, and build thin .so's which call into different ICUs. This has been done in similar ways before. In myinterface.c:

#include <unicode/ucol.h>
myfunction() { ucol_open( ... ) } ...

then:

cc -o myinterface_44.o -DU_ICU_ENTRY_POINT_RENAME(x)=x ## _44
cc -o myinterface_46.o -DU_ICU_ENTRY_POINT_RENAME(x)=x ## _46

etc. Then you can verify that it's linkable against a real ICU.

I added firebird to http://icu-project.org

About your question: the best way to ask is to use the icu-support mailing list and/or our bug forms. I hope you at least look at those sometimes. But, to answer your question, you can call u_getVersion() and it will fill in a UVersionInfo struct (4 bytes) with the actual version number.

Also, if "--enable-auto-cleanup" is configured, ICU will automatically call cleanup when the library is unloaded. I saw some discussion of that. It's ticket http://bugs.icu-project.org/trac/ticket/3126, but it isn't enabled by default, mostly in the absence of user feedback. Usage like yours would be a good example of why this could be important.

I'm glad I was subscribed to launchpad and that a Ubuntu user hit this (not that I'm glad the user had trouble...) .. otherwise, we would not have known about this interaction. Please, please, ask questions and file bugs upstream!
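
For readers who want to try the u_getVersion() suggestion without writing C, here is a hedged Python sketch using ctypes. It assumes the entry point may be exported either unsuffixed or with a two-digit suffix of the name_44 kind; the probed suffix range is an arbitrary choice for this sketch, and `icu_version` and `format_version` are hypothetical helper names.

```python
import ctypes
import ctypes.util


def format_version(raw):
    """Render the four bytes of a UVersionInfo as a dotted version string."""
    return ".".join(str(b) for b in raw)


def icu_version(lib):
    """Call u_getVersion on a loaded library; return None if no entry point is found."""
    names = ["u_getVersion"] + [f"u_getVersion_{n}" for n in range(30, 100)]
    for name in names:
        try:
            func = getattr(lib, name)
        except AttributeError:
            continue  # this spelling is not exported, try the next one
        buf = (ctypes.c_uint8 * 4)()  # UVersionInfo is four bytes
        func.restype = None
        func(buf)  # ctypes passes the array as a pointer
        return format_version(bytes(buf))
    return None


if __name__ == "__main__":
    path = ctypes.util.find_library("icuuc")
    if path is None:
        print("ICU common library not found")
    else:
        print(icu_version(ctypes.CDLL(path)))
```

Probing several spellings is only needed because the suffix of the exported name depends on the ICU version and build options.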

@firebird-automations
Collaborator Author

Commented by: Steven R. Loomis (srl_icu-project.org)

Hey, I've finally committed the provider interface into ICU trunk. Read the comment on http://bugs.icu-project.org/trac/ticket/8157. When this interface is installed, you can request different ICU collations like the following and get different results.

ja@sp=icu38 = --> AN_CX_DX_EX_FX_HO_LJA_NX_S3_PICU38
95 3A 03 77 9E 03 4D 4F 51 33 33 01 0B 01 A1 85 8F 08 00
ja@sp=icu42 = --> AN_CX_DX_EX_FX_HO_LJA_NX_S3_PICU42
*A3 3A*7A*B2*03*50*52*54*36*36*01*0B*01*A1*85*8F*08*00
ja@sp=icu44 = --> AN_CX_DX_EX_FX_HO_LJA_NX_S3_PICU44
*AC 3A*7C*A6 03*51*53*55*37*37 01 0B 01 A1 85 8F 08 00
ja@sp=icu46 = --> AN_CX_DX_EX_FX_HO_LJA_NX_S3_PICU46
*79*26*03*70*94*03*4B*4D*4F*31*31*01*0B*01*A1*85*8F*08 00
ja@sp=icu48 = --> AN_CX_DX_EX_FX_HO_LJA_NX_S3_PICU48
79 26 03 70 94 03 4B 4D 4F 31 31 01 0B 01 A1 85 8F 08 00
ja = --> AN_CX_DX_EX_FX_HO_LJA_NX_S3
79 26 03 70 94 03 4B 4D 4F 31 31 01 0B 01 A1 85 8F 08 00
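
A small Python sketch makes the point of the listing above explicit. It copies three of the sort keys from that listing (icu38, icu48 and the default) and checks which ones are byte-identical. The variable and function names are made up for this illustration.

```python
# Sort keys copied from the listing above (locale ja, three providers).
KEY_ICU38 = bytes.fromhex("95 3A 03 77 9E 03 4D 4F 51 33 33 01 0B 01 A1 85 8F 08 00")
KEY_ICU48 = bytes.fromhex("79 26 03 70 94 03 4B 4D 4F 31 31 01 0B 01 A1 85 8F 08 00")
KEY_DEFAULT = bytes.fromhex("79 26 03 70 94 03 4B 4D 4F 31 31 01 0B 01 A1 85 8F 08 00")


def same_key(a, b):
    """True when two providers produced byte-identical sort keys for the same input."""
    return a == b


if __name__ == "__main__":
    print("icu38 vs icu48:", same_key(KEY_ICU38, KEY_ICU48))
    print("icu48 vs default:", same_key(KEY_ICU48, KEY_DEFAULT))
```

Keys stored in an index by one ICU version cannot be compared byte-for-byte with keys from a version that produces different bytes, which is why the matching ICU version has to be loaded for an existing database.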

@firebird-automations
Collaborator Author

Modified by: @pmakowski

status: Resolved [ 5 ] => Closed [ 6 ]

@firebird-automations
Collaborator Author

Commented by: Cosmin Apreutesei (cosmin_ap2)

Today I installed Firebird using

apt-get install firebird2.5-super

on a fresh, clean Debian 6 install.

Then I tried to restore my database from a gbk file created on Windows with ICU 3.0.

I get "COLLATION UNICODE_CI_AI for CHARACTER SET UTF8 is not installed".

Debian has ICU 4.4 in the default locations (ldd shows that fbserver finds all the libs).

Is this the same problem?

@firebird-automations
Collaborator Author

Modified by: @pavel-zotov

status: Closed [ 6 ] => Closed [ 6 ]

QA Status: Done successfully

Test Details: Passed on: WI-V2.5.5.26871, WI-T3.0.0.31844; LI-V2.5.3.26788, LI-T3.0.0.31842
