Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Ùse the collate of the domain when altering a domain instead of the default domain of the default character set [CORE5895] #6153

Closed
firebird-automations opened this issue Aug 20, 2018 · 14 comments

Comments

@firebird-automations
Copy link
Collaborator

Submitted by: Van den Wouwer Danny (dannyvdw)

Attachments:
CORE-5895.sql
core-5895.PNG

* Within the 2.xx versions it can be done through direct manipulation of the system tables
* Within the 3.xx versions, it can be done with multiple DML statements, but this is ridiculous and still error prune

Example, both databases have UTF8 as default character set with UTF8 as default collate

DML source:
create domain d1 as char(2) collate unicode_ci_ai;
create domain d2 as varchar(2) collate unicode_ci_ai;

create table t (a d1, b d2);

DML target:
create domain d1 as char(1) collate unicode_ci_ai;
create domain d2 as varchar(1) collate unicode_ci_ai;

create table t (a d1, b d2);

Change script, to make source a target:
ALTER CHARACTER SET UTF8 SET DEFAULT COLLATION UNICODE_CI_AI;
ALTER DOMAIN D1
TYPE CHAR(2);
ALTER CHARACTER SET UTF8 SET DEFAULT COLLATION UTF8;

ALTER CHARACTER SET UTF8 SET DEFAULT COLLATION UNICODE_CI_AI;
ALTER DOMAIN D2
TYPE VARCHAR(2);
ALTER CHARACTER SET UTF8 SET DEFAULT COLLATION UTF8;

* Without these changes we end up with domains with the default collate UTF8
* I think if these domains have indirect dependencies like indices, constraints we could end up with corrupted data (unique constraint)

@firebird-automations
Copy link
Collaborator Author

Commented by: @mrotteveel

Please add a description in words of the problem. Having to execute your reproduction script and then interpret the results to understand what is going on makes it harder to understand your ticket at a glance. Especially as I may not have the same interpretation as you on the results.

@firebird-automations
Copy link
Collaborator Author

Commented by: Van den Wouwer Danny (dannyvdw)

Mark,

A varchar/char domains get the default character set with his default collate if you don't specify them.
If you create a char/varchar domain with a different collate then the default collate of the character set, then you will have this issue when you alter this domain.
It seems that the collate is being reset to the default collate of the character set

I will try to explain in steps:

1. Create a DB with a character set UTF8
2. Create a domain with a different collate then the default collate, for example unicode_ci_ai
3. Just try to increase the number of characters in this domain,what is allowed
4. watch what will happen with the collate of the domain, it is reset to utf8!

The problem is even bigger if you have some index on it, the collate is suddenly changed, what will happen with the existing index. Probably this leads to corruption of the index.

I discovered this while working/testing on the following project: https://github.com/cincuranet/FirebirdDbComparer
I'm the other author besides Jiri Cincura.

So, what I did was temporary set the default collate to the collate of the domain to make the adjustment working, this is a work around. And this you saw in the scripts above.

PS
* I added all these scripts, because you can see what will happen, maybe it was too cryptic, i apologize for this.

@firebird-automations
Copy link
Collaborator Author

Modified by: @asfernandes

assignee: Adriano dos Santos Fernandes [ asfernandes ]

@firebird-automations
Copy link
Collaborator Author

Commented by: @asfernandes

Are you verifying all these things with ISQL's SHOW DOMAIN or in other way?

Please, show the evidence of the problem, without your specific things (source / target database comparator).

Trying to understand your scripts, I see no problem.

@firebird-automations
Copy link
Collaborator Author

Commented by: @asfernandes

> 4. watch what will happen with the collate of the domain, it is reset to utf8!

Why you say that?

Because ISQL's SHOW DOMAIN?

ISQL has some logic to not show charset/collate when they're equal to the default.

@firebird-automations
Copy link
Collaborator Author

Modified by: Van den Wouwer Danny (dannyvdw)

Attachment: CORE5895.sql [ 13285 ]

@firebird-automations
Copy link
Collaborator Author

Modified by: Van den Wouwer Danny (dannyvdw)

Attachment: core-5895.PNG [ 13286 ]

@firebird-automations
Copy link
Collaborator Author

Commented by: Van den Wouwer Danny (dannyvdw)

I attached a script and a screenshot, i hope you can see what I mean.

@firebird-automations
Copy link
Collaborator Author

Commented by: @asfernandes

When you alter the domain's type with [VAR]CHAR(n), you tell it to alter to the default charset and default collate.

You can also manually change the charset with [VAR]CHAR(n) CHARACTER SET <name>, then using the character set's default collate.

When the type (including charset/collate) is changed, fields depending on it are schedulled to have their indexes rebuilt on commit.

All that seems ok, except the fact that with or without a CHARACTER SET type's subclause, the collate cannot be changed explicitly.

@firebird-automations
Copy link
Collaborator Author

Modified by: @asfernandes

status: Open [ 1 ] => Resolved [ 5 ]

resolution: Won't Fix [ 2 ]

@firebird-automations
Copy link
Collaborator Author

Commented by: Van den Wouwer Danny (dannyvdw)

Quote:
'When you alter the domain's type with [VAR]CHAR(n), you tell it to alter to the default charset and default collate.'

==> That is true, because there isn't the possibility to alter the domain while preserving the original collate when we created it, or do i miss something?.

see the definition:
ALTER DOMAIN domain_name
[TO <new_name>]
[TYPE <datatype>]
[SET DEFAULT {literal | NULL | <context_var>} | DROP DEFAULT]
[ADD [CONSTRAINT] CHECK (<dom_condition>) | DROP CONSTRAINT]

<datatype> ::=
{SMALLINT | INTEGER | BIGINT}
| {FLOAT | DOUBLE PRECISION}
| {DATE | TIME | TIMESTAMP}
| {DECIMAL | NUMERIC} [(precision [, scale])]
| {CHAR | CHARACTER | CHARACTER VARYING | VARCHAR} [(size)]
[CHARACTER SET charset_name]
| {NCHAR | NATIONAL CHARACTER | NATIONAL CHAR} [VARYING] [(size)]
| BLOB [SUB_TYPE {subtype_num | subtype_name}]
[SEGMENT SIZE seglen] [CHARACTER SET charset_name]
| BLOB [(seglen [, subtype_num])]

<dom_condition> ::=
<val> <operator> <val>
| <val> [NOT] BETWEEN <val> AND <val>
| <val> [NOT] IN (<val> [, <val> ...] | <select_list>)
| <val> IS [NOT] NULL
| <val> IS [NOT] DISTINCT FROM <val>
| <val> [NOT] CONTAINING <val>
| <val> [NOT] STARTING [WITH] <val>
| <val> [NOT] LIKE <val> [ESCAPE <val>]
| <val> [NOT] SIMILAR TO <val> [ESCAPE <val>]
| <val> <operator> {ALL | SOME | ANY} (<select_list>)
| [NOT] EXISTS (<select_expr>)
| [NOT] SINGULAR (<select_expr>)
| (<dom_condition>)
| NOT <dom_condition>
| <dom_condition> OR <dom_condition>
| <dom_condition> AND <dom_condition>

<operator> ::=
<> | != | ^= | ~= | = | < | > | <= | >= | !< | ^< | ~< | !> | ^> | ~>

<val> ::=
VALUE
| literal
| <context_var>
| <expression>
| NULL
| NEXT VALUE FOR genname
| GEN_ID(genname, <val>)
| CAST(<val> AS <datatype>)
| (<select_one>)
| func([<val> [, <val> ...]])

If I request to alter a domain / field with the possibility to alter/preserve the collation, are you considering to implement this?

I see that postgre / mysql give you this possibility.

@firebird-automations
Copy link
Collaborator Author

Modified by: @pcisar

status: Resolved [ 5 ] => Closed [ 6 ]

@firebird-automations
Copy link
Collaborator Author

Commented by: @asfernandes

What is their syntax for that?

@firebird-automations
Copy link
Collaborator Author

Commented by: Van den Wouwer Danny (dannyvdw)

Adriano,

For example in postgres:

create table test (
id bigint not null,
s_default_collate varchar(1) collate pg_catalog."default" not null,
s_other_collate varchar(1) collate pg_catalog."fr-BE-x-icu" not null,
primary key (id))

alter table test
alter column s_other_collate type varchar(2) collate pg_catalog."fr-BE-x-icu"

PS:
for domains, they do not give you the possibility for altering types

For MySql, i only went to their documentation:

ALTER TABLE tbl_name
[alter_specification [, alter_specification] ...]
[partition_options]

alter_specification:
table_options
...
| CHANGE [COLUMN] old_col_name new_col_name column_definition
[FIRST|AFTER col_name]
| [DEFAULT] CHARACTER SET [=] charset_name [COLLATE [=] collation_name]
| CONVERT TO CHARACTER SET charset_name [COLLATE collation_name]

https://dev.mysql.com/doc/refman/8.0/en/alter-table.html#alter-table-character-set

But for mysql I didn't testing it, I never use this database platform.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment