Translation of large text BLOB between UNICODE_FSS (UTF8) and other charsets [CORE2122] #976
Comments
Modified by: @dyemanov assignee: Adriano dos Santos Fernandes [ asfernandes ]
Commented by: @ibprovider If needed, I can send the private tests (for Windows 32/64) that demonstrate these problems.
Commented by: @asfernandes
> Insert [connection ctype: UNICODE_FSS] large string with 1048576 UTF8 chars from CP943C charset
What do you mean? If your blob is being created as UNICODE_FSS but you put CP943 bytes into it, you will obviously have problems. If that is not the case, please send the test case.
Modified by: @asfernandes status: Open [ 1 ] => Resolved [ 5 ]; resolution: Fixed [ 1 ]; Fix Version: 2.5 Beta 1 [ 10251 ]
Commented by: @ibprovider Hi. The problems still occur for single-byte ICU charsets, for example TIS620. Sample test: And, after the correction, they occur for all lengths with the multi-byte ICU charset CP943C. Sample tests: Of course, these may be different problems, and they can be resolved in separate changes. See also our old BUG-1596 :-) Thanks
Commented by: @asfernandes Does your test run in a loop, or does it have too many blob.002* tests? I tried to run blob.002* and it never ends...
Commented by: @ibprovider I have a great workstation + patience :-)
Commented by: @dyemanov Re-opened upon request of the bug reporter. He insists the problem still exists.
Commented by: @asfernandes Then I expect from Mr. Kovalenko the sources for his test, as well as a way to compile and debug it. I can do nothing by looking in the debugger at junk bytes that the engine says are bad input!
Commented by: @asfernandes The real problem is the following: the test case generates bytes and converts them to UTF-8 using ICU (please correct me if I'm wrong, Dmitry K.). But the generated UTF-8 bytes are not valid UNICODE_FSS. Currently, the well-formedness check for UNICODE_FSS is done the same way as for UTF-8, so the string passes a stage that it shouldn't. Later, when converting from (wrong) UNICODE_FSS to TIS620, a transliteration error is raised. So what really needs to be fixed is the UNICODE_FSS well-formedness check, and then Dmitry should be asked to correct his tests. :-) This applies at least to the blob.002.unicode.TBL_CS__TIS620.COL_BLOB.ins_UNICODE_FSS.sel_TIS620.len_32767.chars_TIS620.bind__wstr case. I haven't verified the others yet.
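The thread does not say which byte sequences pass the UTF-8 check but are invalid as UNICODE_FSS. As an illustration only, here is a minimal Python sketch of one way a stricter check can differ from a plain UTF-8 check. It assumes UNICODE_FSS allows at most three bytes per character, while UTF-8 allows four. The function names are hypothetical and are not taken from the Firebird sources.

```python
def is_well_formed_utf8(data: bytes) -> bool:
    """Plain UTF-8 well-formedness check (accepts 1 to 4 byte sequences)."""
    try:
        data.decode("utf-8")
    except UnicodeDecodeError:
        return False
    return True


def fits_three_byte_limit(data: bytes) -> bool:
    """Hypothetical stricter check: well-formed UTF-8 in which every
    character encodes in at most 3 bytes (code points up to U+FFFF).
    The 3-byte limit is an assumption about UNICODE_FSS, not the engine's code.
    """
    try:
        text = data.decode("utf-8")
    except UnicodeDecodeError:
        return False
    return all(ord(ch) <= 0xFFFF for ch in text)


thai = "\u0e01".encode("utf-8")          # a Thai letter, 3 bytes in UTF-8
astral = "\U0001F600".encode("utf-8")    # a 4-byte UTF-8 sequence
truncated = thai[:2]                     # a sequence cut in the middle
```

Under that assumption, `astral` is accepted by the first check and rejected by the second, which is the kind of gap described above: input that passes validation and only fails later, during transliteration.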
Commented by: @ibprovider See the attached file. But I continue to get the old (and new) errors with select from TBL_CS__CP943C as UNICODE_FSS. I think this problem is linked to CORE2123.
Modified by: @ibprovider Attachment: filters_1_63_dirty_patch.txt [ 11110 ]
Commented by: @ibprovider
> and then ask for Dmitry correct its tests. :-)
I have improved my tests, but I now get new, similar errors for all FB charsets :-( (except ASCII, of course). [ FB 2.1.1 without filters__dirty_patch ]
Modified by: @dyemanov Fix Version: 2.1.4 [ 10361 ]
Modified by: @pcisar status: Resolved [ 5 ] => Closed [ 6 ]
Modified by: @pavel-zotov QA Status: No test
Submitted by: @ibprovider
Attachments:
filters_1_63_dirty_patch.txt
I wrote some tests to check the translation of BLOBs between UTF8 and other charsets.
With small BLOBs these tests work fine.
With large BLOBs I get the error "Cannot transliterate character between character sets".
For example:
- Meta: BLOB UNICODE_FSS
- Insert [connection ctype: UNICODE_FSS] large string with 1048576 UTF8 chars from CP943C charset
- Select [connection ctype: CP943C]: "Cannot transliterate character between character sets"
With 1024 chars there is no problem at select.
----------
- Meta: BLOB UNICODE_FSS
- Insert [connection ctype: CP943C] large string with 32767 CP943C chars: "Cannot transliterate character between character sets"
With 1024 chars the insert is OK.
----------
I also ran tests for BIG_5, TIS620 and WIN1251, and got similar problems.
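A test of this kind depends on the input being valid before it reaches the server. The Python sketch below is not the reporter's test, and all of its names are hypothetical. It shows one way to build a payload of N characters from a single-byte charset and to confirm, outside the database, that it round-trips through UTF-8 and back. It uses the Python codecs `tis_620` and `cp1251` as stand-ins for the TIS620 and WIN1251 charsets.

```python
import itertools


def charset_repertoire(charset: str) -> str:
    """Collect every printable character a single-byte charset can decode."""
    chars = []
    for byte in range(0x20, 0x100):
        try:
            ch = bytes([byte]).decode(charset)
        except UnicodeDecodeError:
            continue  # this byte is unassigned in the charset
        if ch.isprintable():
            chars.append(ch)
    return "".join(chars)


def make_payload(charset: str, n_chars: int) -> str:
    """Build a string of exactly n_chars by cycling through the repertoire."""
    repertoire = charset_repertoire(charset)
    return "".join(itertools.islice(itertools.cycle(repertoire), n_chars))


def round_trips(text: str, charset: str) -> bool:
    """Check text -> UTF-8 -> text -> charset -> text without loss."""
    utf8 = text.encode("utf-8")
    return utf8.decode("utf-8").encode(charset).decode(charset) == text


payload = make_payload("tis_620", 32767)
```

If a payload passes `round_trips` locally and the server still reports a transliteration error, the input can be ruled out as the cause. Multi-byte charsets such as CP943C or BIG_5 would need a different way of enumerating valid characters, which this sketch does not cover.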
Banzay
Commits: e1cb23f acb1151 99246d8