Issue Details (XML | Word | Printable)

Key: CORE-4786
Type: Bug Bug
Status: Closed Closed
Resolution: Fixed
Priority: Major Major
Assignee: Dmitry Yemanov
Reporter: Pavel Zotov
Votes: 0
Watchers: 3
Operations

If you were logged in you would be able to see more operations.
Firebird Core

Problematic key value (when attempt to insert duplicate in PK/UK) is not shown where length of key >= 127 characters

Created: 08/May/15 04:15 PM   Updated: 01/Jun/15 05:52 PM
Component/s: Engine
Affects Version/s: 3.0 Beta 1, 2.1.7, 2.5.4
Fix Version/s: 3.0 Beta 2, 2.5.5

QA Status: Done successfully
Test Details:
WI-T3.0.0.31844: output if key that violates PK/UK ignores fact that key can be in multi-byte encoding. That leads to non-readable sequence of characters in STDERR.
LI-T3.0.0.31846: all starts work fine, including for unicode text with two- and three- bytes per character.


 Description  « Hide
SQL> insert into test(s) values( rpad('', 126, 'qwertyuioplkjhgfdsazxcvbnm') );
SQL> insert into test(s) values( rpad('', 126, 'qwertyuioplkjhgfdsazxcvbnm') );
Statement failed, SQLSTATE = 23000
violation of PRIMARY or UNIQUE KEY constraint "TEST_PK" on table "TEST"
-Problematic key value is ("S" = 'qwertyuioplkjhgfdsazxcvbnmqwertyuioplkjhgfdsazxcvbnmqwertyuioplkjhgfdsazxcvbnmqwertyuioplkjhgfdsazxcv
bnmqwertyuioplkjhgfdsazxc')
SQL>

SQL> recreate table test(s varchar(127), constraint test_pk primary key(s) ); commit;
SQL> insert into test(s) values( rpad('', 127, 'qwertyuioplkjhgfdsazxcvbnm') );
SQL> insert into test(s) values( rpad('', 127, 'qwertyuioplkjhgfdsazxcvbnm') );
Statement failed, SQLSTATE = 23000
violation of PRIMARY or UNIQUE KEY constraint "TEST_PK" on table "TEST"
-arithmetic exception, numeric overflow, or string truncation
-string right truncation
-expected length 126, actual 127

 All   Comments   Change History   Subversion Commits      Sort Order: Ascending order - Click to sort in descending order
Dmitry Yemanov added a comment - 08/May/15 04:48 PM
Fixed for v3, please test. Note that the textual key representation has a limit of 250 characters, so longer values will be truncated.

Pavel Zotov added a comment - 08/May/15 05:28 PM
SQL> recreate table test(s varchar(300) unique using index test_unq); commit;
SQL> insert into test values( rpad('', 245, 'qwertyuioplkjhgfdsazxcvbnm') || '12345' );
SQL> insert into test values( rpad('', 245, 'qwertyuioplkjhgfdsazxcvbnm') || '12345' );
Statement failed, SQLSTATE = 23000
violation of PRIMARY or UNIQUE KEY constraint "INTEG_1" on table "TEST"
-Problematic key value is ("S" = 'qwertyuioplkjhgfdsazxcvbnmqwertyuioplkjhgfdsazxcvbnmqwertyuioplkjhgfdsazxcvbnmqwertyuioplkjhgfdsazxcv
bnmqwertyuioplkjhgfdsazxcvbnmqwertyuioplkjhgfdsazxcvbnmqwertyuioplkjhgfdsazxcvbnmqwertyuioplkjhgfdsazxcvbnmqwertyuioplkjhgfdsazxcvbnmqw
ertyuiopl1234...)

Yes, it works better now, but I can see 249 (not 250) characters :-) // revision 61512.

IMO, this limit should be documented ?

Pavel Zotov added a comment - 27/May/15 06:40 PM
Consider following script encoded in UTF8:

set names utf8;
create database 'localhost/3333:C:\MIX\firebird\QA\fbt-repo\tmp\c4786.fdb';

recreate table test_utf8(s varchar(169) character set utf8 constraint test_cs_utf8_unq unique using index test_cs_utf8_unq);
commit;

insert into test_utf8 values( rpad('', 164, 'A') || '12345' );
insert into test_utf8 values( rpad('', 164, 'A') || '12345' );
insert into test_utf8 values( rpad('', 164, 'Á') || '12345' ); -- multi-byte character "a-acute", U+00C1
insert into test_utf8 values( rpad('', 164, 'Á') || '12345' );

It will produce STDERR:
===
Statement failed, SQLSTATE = 23000
violation of PRIMARY or UNIQUE KEY constraint "TEST_CS_UTF8_UNQ" on table "TEST_UTF8"
-Problematic key value is ("S" = 'AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA12345')

Statement failed, SQLSTATE = 23000
violation of PRIMARY or UNIQUE KEY constraint "TEST_CS_UTF8_UNQ" on table "TEST_UTF8"
-Problematic key value is ("S" = 'ÁÁÁÁÁÁÁÁÁÁÁÁÁÁÁÁÁÁÁÁÁÁÁÁÁÁÁÁÁÁÁÁÁÁÁÁÁÁÁÁÁÁÁÁÁÁÁÁÁÁÁÁÁÁÁÁÁÁÁÁÁÁÁÁÁÁÁÁÁÁÁÁÁÁÁÁÁÁÁÁÁÁÁÁÁÁÁÁÁÁÁÁÁÁÁÁÁÁÁÁÁÁÁÁÁÁÁÁÁÁÁÁÁÁÁÁÁÁÁÁÁÁÁÁî..)
===

Note: after last 'Á' in the second message we see NOT 'dot' character but "î".
This is because output ignores multi-byte nature of key that violates PK/UK.

If we open this STDERR in some editor that doesn`t support UTF8, we'll see there:

"S" = 'Г_Г_Г_Г_ <skipped> Г_Г...)

-- and no "_" after final "Г" that should be there.

Dmitry Yemanov added a comment - 31/May/15 01:15 PM
Fix is committed, please test. Beware that the UTF8 value will be printed shorter, as the limit is actually 250 bytes, not characters.

Pavel Zotov added a comment - 31/May/15 06:03 PM
Tested on multi-byte text keys with two- and three- bytes for character representation. Result: OK, no 'first-only-byte' effect as before.
Unicode symbols with four bytes per character (e.g. georgian alphabet or musical notes) can`t be stored correctly on my machine: they are displayed as 'small squares' instead.