Japanese character set CP943C [CORE1324] #1743

firebird-automations · 2007-06-15T22:42:48Z

Submitted by: KIMURA, Meiji (meijik)

In Firebird 2.0 and later, the character set conversion method has changed, so the "Windows-31J" extensions
can no longer be used in an FB 2.0 or later environment (details are in the quoted mail below).

This is a severe problem for Japanese users. A typical developer uses Delphi with FB 1.0 or 1.5 on a Windows server and
relies on the "Windows-31J" extensions, but the same code doesn't work on FB 2.0 or later. As a result, many Japanese users cannot migrate
from FB 1.x to 2.x.

Please add character set 'cp932' to FB 2.1.
#⁠ I will help to test it.

Fortunately, ICU has a 'Windows-31j' converter, so it can be used to support 'cp932'.

Regards,
KIMURA, Meiji(FAMILY, Given)

//--> Quoted mail below
[Firebird-devel] Firebird 2.x cannot handle some Japanese characters in SJIS_0208 environment.

KIMURA, Meiji wrote:
> In Firebird 1.x and InterBase 6.x or later, 'SJIS_0208' *IS* Shift_JIS in IANA.
> But when the client and server both use the same character set 'SJIS_0208',
> no character set conversion takes place. As a result, the 'Windows-31J' extensions can be used
> with no error.
>
In previous versions there was a direct (special) converter from SJIS to
something else. This converter was removed, and the conversion is now done
through Unicode.

> But in a Firebird 2.0 environment, even if the same character set 'SJIS_0208' is used by
> both client and server, Unicode is used as a pivot character set. As a result,
> we cannot use the "Windows-31J" extensions.
>
I've already heard this, maybe from Daiju.

> It seems that the same problem occurs in MySQL 4.1.
> In the case of MySQL, there is no conversion in version 4.0 or earlier, but
> in version 4.1 or later Unicode is used as a pivot character set, so
> the same problem occurs.
>
> MySQL supports the character set 'cp932' as a measure against this problem.
> cp932 means 'Windows Codepage 932'. cp932 *IS* Windows-31J in IANA.
>
> I suppose that if Firebird 2.0 supported the character set 'cp932', we could avoid this problem.
> #⁠ When use iCU routine, use 'windows-31j' instead of 'shift-jis'.
>
This seems to be the way to go.

Adriano
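
The pivot problem described in the quoted mail can be illustrated with a minimal Python sketch. It assumes that Python's built-in `shift_jis` and `cp932` codecs are acceptable stand-ins for SJIS_0208 and a Windows-31J converter. Firebird uses its own converters and ICU, whose tables may differ in detail, and the helper name `pivot_roundtrip` is hypothetical.

```python
def pivot_roundtrip(raw: bytes, charset: str):
    """Convert bytes -> Unicode -> bytes within one charset.

    Returns the re-encoded bytes, or None when either step fails
    because the charset has no mapping for the character.
    """
    try:
        return raw.decode(charset).encode(charset)
    except (UnicodeDecodeError, UnicodeEncodeError):
        return None


# 0x8740 is CIRCLED DIGIT ONE, part of the Windows-31J (NEC row 13) extension.
extension_char = b"\x87\x40"
# 0x889F is the first kanji of JIS X 0208, present in both charsets.
plain_kanji = b"\x88\x9f"

for label, raw in [("extension", extension_char), ("JIS X 0208", plain_kanji)]:
    for charset in ("shift_jis", "cp932"):
        # None means the character is lost when Unicode is used as the pivot.
        print(label, charset, pivot_roundtrip(raw, charset))
```

With no conversion at all, as in FB 1.x, the extension bytes would simply pass through unchanged. Once every transfer goes through Unicode, a strict Shift_JIS table has nothing to map them to.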

Commits: 5d06ef3 f044f67

firebird-automations · 2007-06-16T15:24:41Z

Modified by: @asfernandes

assignee: Adriano dos Santos Fernandes [ asfernandes ]

firebird-automations · 2007-06-17T01:48:09Z

Commented by: @asfernandes

ICU has CP932 too.
Is it different from Windows-31j or an alias?

firebird-automations · 2007-06-18T13:28:43Z

Commented by: KIMURA, Meiji (meijik)

I think there are three candidates for handling the Shift_JIS extensions.

Converter Explorer
http://demo.icu-project.org/icu-bin/convexp
(1) ibm-942_P12A-1999
(2) ibm-943_P15A-2003
(3) ibm-943_P130-1999

For now, I think (2) is the best candidate. However, some kanji characters are mapped to multiple codes, so it may not be good for Firebird 2.0, which uses Unicode as the pivot.

Please wait a week or so; I will test them and clarify the differences.
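
A rough way to find the "multiple code" characters is to push every double-byte code through Unicode and back and list the ones that change. The sketch below does this with Python's built-in `cp932` codec, which is an assumption for illustration: it follows Microsoft's table rather than any of the three ICU converters listed above, so the exact results may differ. The function name `non_roundtrip_codes` is hypothetical.

```python
def non_roundtrip_codes(charset: str = "cp932"):
    """List double-byte codes that decode but re-encode to different bytes."""
    lead_bytes = list(range(0x81, 0xA0)) + list(range(0xE0, 0xFD))
    mismatches = []
    for lead in lead_bytes:
        for trail in range(0x40, 0xFD):
            if trail == 0x7F:
                continue  # 0x7F is never a valid trail byte
            raw = bytes([lead, trail])
            try:
                text = raw.decode(charset)
            except UnicodeDecodeError:
                continue  # unassigned code, nothing to compare
            try:
                back = text.encode(charset)
            except UnicodeEncodeError:
                back = None
            if back != raw:
                mismatches.append((raw, text, back))
    return mismatches


dups = non_roundtrip_codes()
print(len(dups), "codes do not survive a round trip through Unicode")
for raw, text, back in dups[:5]:
    code_points = " ".join(f"U+{ord(ch):04X}" for ch in text)
    print(raw.hex(), "->", code_points, "->", back.hex() if back else None)
```

Each reported code shares its Unicode code point with another code, so a pivot through Unicode silently rewrites the stored bytes.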

firebird-automations · 2007-06-21T11:18:55Z

Commented by: KIMURA, Meiji (meijik)

I want to write a program to check the conversion specification.

Which 'Internal Converter Name' of Unicode is used as the pivot encoding in Firebird 2.0?

firebird-automations · 2007-06-22T01:04:46Z

Commented by: @asfernandes

Please try a snapshot build >= 16169.
It has CP932, but I need you to edit intl/fbintl.conf and try the other ICU charsets to see which works best.
Leave exactly one "collation CP932" line uncommented for each try, and restart the server after editing the file:
<charset CP932>
intl_module fbintl
collation CP932 ibm-942_P12A-1999
#⁠ collation CP932 ibm-943_P130-1999
#⁠ collation CP932 ibm-943_P15A-2003
collation CP932_UNICODE
</charset>

Please report here.
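
To keep the three trials consistent, the stanza could be generated rather than edited by hand. This is only a sketch: the layout is copied from the fragment above, the function name `render_stanza` is hypothetical, and writing the result into intl/fbintl.conf and restarting the server are still manual steps.

```python
# Candidate ICU converter names taken from the comments in this thread.
CANDIDATES = ["ibm-942_P12A-1999", "ibm-943_P130-1999", "ibm-943_P15A-2003"]


def render_stanza(active: str, charset: str = "CP932") -> str:
    """Render the charset stanza with exactly one candidate uncommented."""
    if active not in CANDIDATES:
        raise ValueError(f"unknown converter: {active}")
    lines = [f"<charset {charset}>", "intl_module fbintl"]
    for converter in CANDIDATES:
        prefix = "" if converter == active else "# "
        lines.append(f"{prefix}collation {charset} {converter}")
    lines.append(f"collation {charset}_UNICODE")
    lines.append("</charset>")
    return "\n".join(lines)


# Print one stanza per trial so each can be pasted into the file in turn.
for converter in CANDIDATES:
    print(render_stanza(converter))
    print()
```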

firebird-automations · 2007-06-22T09:04:31Z

Commented by: Dimitrios Chr. Ioannidis (dchri)

Adriano,

Actually, (1.)16169 is the revision of the writeBuildNum.sh file. As a result of your commits, the HEAD branch build number increased to 16012, so he must try a snapshot build >= 16012.

regards,

firebird-automations · 2007-07-12T00:45:14Z

Commented by: @asfernandes

Meiji, did you test it?

firebird-automations · 2007-07-12T23:18:21Z

Commented by: KIMURA, Meiji (meijik)

Sorry, not yet.

I tried this on Firebird 2.1 Beta, but it failed. These are the steps I tried:

(1) Add CP932 definition to intl/fbintl.conf
<charset CP932>
intl_module fbintl
collation CP932 ibm-942_P12A-1999
collation CP932_UNICODE
</charset>

(2) Restart the Firebird server
(3) Run the SQL script intl.sql in misc/
(4) SQL> execute procedure sp_register_character_set('CP932',4);

But the error message said that CHARACTER SET CP932 is not installed.

Do I have to use a build newer than FB 2.1 Beta, or is there something else I need to do?

firebird-automations · 2007-07-12T23:43:08Z

Commented by: KIMURA, Meiji (meijik)

Today I will try overwriting the FB 2.1 Beta installation with the latest FB 2.1 snapshot.

firebird-automations · 2007-07-12T23:58:00Z

Commented by: KIMURA, Meiji (meijik)

It works. It seems that 'collation CP932 ibm-943_P130-1999' is good for this purpose. I will test it in detail over the next 2 or 3 days.

firebird-automations · 2007-07-17T21:24:33Z

Commented by: KIMURA, Meiji (meijik)

I tested it. Please implement this feature as follows.

(i) charset name 'CP943C'
(ii) use ibm-943_P15A-2003

(i)
Strictly speaking, ICU doesn't have CP932. CP943C is a superset of CP932.
If we use the name 'CP932', it will confuse Japanese FB users.
So we have to use the name 'CP943C'.

(ii)
It seems that the specification of 'ibm-943_P15A-2003' is good for ordinary Japanese FB users.

In conclusion, these are good choices for this issue. This feature will help a lot of Japanese users.

<charset CP943C>
intl_module fbintl
collation CP943C ibm-943_P15A-2003
collation CP943C_UNICODE
</charset>

firebird-automations · 2007-07-19T00:15:56Z

Modified by: @asfernandes

summary: Please Support japanese characters cp932 => Japanese character set CP943C

firebird-automations · 2007-07-19T00:25:43Z

Modified by: @asfernandes

status: Open [ 1 ] => Resolved [ 5 ]

resolution: Fixed [ 1 ]

Fix Version: 2.1 Beta 2 [ 10190 ]

firebird-automations · 2007-10-15T07:14:49Z

Commented by: Minoru Yoshida (timeful2)

Hi,

Thanks for adding the new spec.
I tested the CP943C character set and am preparing reports.
http://timeful.co.jp/fbmap/

Connections that use the same character set on both sides work fine.
There are some problems with connections between different character sets.
(They are marked in red in the report.)

1. CP943C to UTF8(and UNICODE_FSS)

The following characters are wrong.

- 0x8790 - 879C (9 chars)
- 0xED40 - EEFC (374 chars)

2. CP943C to SJIS_0208

The following characters are wrong.

- 0x7E
- 0x815F - 81CA(7 chars)
- 0x8740 - 879C(83 chars)
- 0xED40 - EEFC(374 chars)

Note:
ibm-943_P15A-2003
http://demo.icu-project.org/icu-bin/convexp?conv=ibm-943_P15A-2003&s=ALL

the bytes 81
http://demo.icu-project.org/icu-bin/convexp?conv=ibm-943_P15A-2003&b=81&s=ALL#layout

the bytes 87
http://demo.icu-project.org/icu-bin/convexp?conv=ibm-943_P15A-2003&b=87&s=ALL#layout

the bytes ED
http://demo.icu-project.org/icu-bin/convexp?conv=ibm-943_P15A-2003&b=ED&s=ALL#layout

the bytes EE
http://demo.icu-project.org/icu-bin/convexp?conv=ibm-943_P15A-2003&b=EE&s=ALL#layout
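
The reported ranges can be cross-checked outside Firebird with a short Python sketch. It assumes Python's built-in `cp932` and `shift_jis` codecs as approximations of CP943C and SJIS_0208. They are not the same tables as ibm-943_P15A-2003 or Firebird's SJIS_0208, so counts may not match the report exactly. The names `decode_or_none` and `scan` are hypothetical.

```python
def decode_or_none(raw: bytes, charset: str):
    """Decode raw bytes, returning None when the charset has no mapping."""
    try:
        return raw.decode(charset)
    except UnicodeDecodeError:
        return None


def scan(first: int, last: int):
    """Return codes in [first, last] that decode as cp932 but not as shift_jis."""
    only_cp932 = []
    for code in range(first, last + 1):
        trail = code & 0xFF
        if trail < 0x40 or trail == 0x7F or trail > 0xFC:
            continue  # not a valid trail byte
        raw = code.to_bytes(2, "big")
        if (decode_or_none(raw, "cp932") is not None
                and decode_or_none(raw, "shift_jis") is None):
            only_cp932.append(raw)
    return only_cp932


for first, last in [(0x8740, 0x879C), (0xED40, 0xEEFC)]:
    found = scan(first, last)
    print(f"0x{first:04X}-0x{last:04X}: {len(found)} codes decode only as cp932")
```

Codes found this way have no counterpart in a strict Shift_JIS table, which is consistent with the CP943C to SJIS_0208 conversion failing for them.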

Regards,
Minoru

firebird-automations · 2007-10-16T19:47:24Z

Commented by: Minoru Yoshida (timeful2)

I made a mistake. This issue was already fixed...
Also, the handling of the 0x7E character is good (maybe) in isql.
I will retest and open a new thread.

Regards,
Minoru

firebird-automations · 2007-10-26T13:46:49Z

Modified by: @pcisar

status: Resolved [ 5 ] => Closed [ 6 ]

firebird-automations · 2008-01-28T15:23:05Z

Modified by: @pcisar

Workflow: jira [ 12332 ] => Firebird [ 15468 ]

firebird-automations closed this as completed Jul 19, 2007

firebird-automations added affect-version: 2.1 Alpha 1 affect-version: 2.1 Initial affect-version: 2.0.1 affect-version: 2.0.0 fix-version: 2.1 Beta 2 resolution: fixed priority: major component: charsets/collation type: new feature labels Apr 25, 2021

firebird-automations assigned asfernandes Apr 25, 2021
