Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

EXECUTE BLOCK parameter character set comes from attachment not declaration [CORE5316] #5592

Open
firebird-automations opened this issue Jul 23, 2016 · 4 comments

Comments

@firebird-automations
Copy link
Collaborator

Submitted by: Geoff Worboys (gworboys)

EXECUTE BLOCK \(
    "Param1" VARCHAR\(80\) CHARACTER SET WIN1252 = ?
\) RETURNS \(
    "Result" VARCHAR\(255\) CHARACTER SET UTF8
\) AS
    DECLARE TmpUtf8 VARCHAR\(20\) CHARACTER SET UTF8;
BEGIN
    TmpUtf8 = \_utf8 x'C690';
    "Result" = TmpUtf8;
    SUSPEND;
    "Param1" = TmpUtf8;   \-\- <<< Expect Error Here
    "Result" = "Param1";
    SUSPEND;
END

If I use WIN1252 for the connection character set then the block fails at the marked line, as expected (although it gives an overflow/truncation error rather than a transliteration failure, not sure why).

However, if I use UTF8 as the connection character set then the block completes successfully.

Ergo: the character set of the parameters to an EXECUTE BLOCK come from the attachment character set and NOT from the declared character set.

This is both far from obvious and contrary to the documentation. From the v2.5 Language Reference Update:

" Firebird 2.1 and up allow the use of domains instead of SQL datatypes when declaring input/output parameters and local variables. With the "TYPE OF" modifier only the domain's type is used, not its NOT NULL setting, CHECK constraint and/or default value. If the domain is of a text type, its character set and collation are always included. "

and, under the heading of "TYPE OF COLUMN in parameter and variable declarations"

" Only the type itself is used; in the case of string types, this includes the character set and the collation."

The reason why I say it is far from obvious is that execute block statements look so much like stored procedures, but in this respect their behaviour varies.

With a stored procedure the transliteration occurs with the input of the the client data into the parameter - the parameter itself remains with the type (charset and collation) that it was declared with.

But with the execute block the parameter type has been silently changed to the attachment character set (haven't studied the collation). As a result the code inside the block may behave differently to the same code inside a stored procedure. This is potentially very confusing and not predictable (the same block may behave differently depending on how the client connects).

I would prefer that the engine was fixed to respect the declared character set (whether declared directly or via TYPE OF references). If that is not possible then the documentation needs to be changed to highlight that EXECUTE BLOCK will change the character set (and collation?), and so the code may not behave the same as the equivalent stored procedure.

@firebird-automations
Copy link
Collaborator Author

Modified by: Geoff Worboys (gworboys)

description: EXECUTE BLOCK (
"Param1" VARCHAR(80) CHARACTER SET WIN1252 = :"Param1"
) RETURNS (
"Result" VARCHAR(255) CHARACTER SET UTF8
) AS
DECLARE TmpUtf8 VARCHAR(20) CHARACTER SET UTF8;
BEGIN
TmpUtf8 = _utf8 x'C690';
"Result" = TmpUtf8;
SUSPEND;
"Param1" = TmpUtf8; -- <<< Expect Error Here
"Result" = "Param1";
SUSPEND;
END

If I use WIN1252 for the connection character set then the block fails at the marked line, as expected (although it gives an overflow/truncation error rather than a transliteration failure, not sure why).

However, if I use UTF8 as the connection character set then the block completes successfully.

Ergo: the character set of the parameters to an EXECUTE BLOCK come from the attachment character set and NOT from the declared character set.

This is both far from obvious and contrary to the documentation. From the v2.5 Language Reference Update:

" Firebird 2.1 and up allow the use of domains instead of SQL datatypes when declaring input/output parameters and local variables. With the "TYPE OF" modifier only the domain's type is used, not its NOT NULL setting, CHECK constraint and/or default value. If the domain is of a text type, its character set and collation are always included. "

and, under the heading of "TYPE OF COLUMN in parameter and variable declarations"

" Only the type itself is used; in the case of string types, this includes the character set and the collation."

The reason why I say it is far from obvious is that execute block statements look so much like stored procedures, but in this respect their behaviour varies.

With a stored procedure the transliteration occurs with the input of the the client data into the parameter - the parameter itself remains with the type (charset and collation) that it was declared with.

But with the execute block the parameter type has been silently changed to the attachment character set (haven't studied the collation). As a result the code inside the block may behave differently to the same code inside a stored procedure. This is potentially very confusing and not predictable (the same block may behave differently depending on how the client connects).

I would prefer that the engine was fixed to respect the declared character set (whether declared directly or via TYPE OF references). If that is not possible then the documentation needs to be changed to highlight that EXECUTE BLOCK will change the character set (and collation?), and so the code may not behave the same as the equivalent stored procedure.

=>

EXECUTE BLOCK \(
    "Param1" VARCHAR\(80\) CHARACTER SET WIN1252 = ?
\) RETURNS \(
    "Result" VARCHAR\(255\) CHARACTER SET UTF8
\) AS
    DECLARE TmpUtf8 VARCHAR\(20\) CHARACTER SET UTF8;
BEGIN
    TmpUtf8 = \_utf8 x'C690';
    "Result" = TmpUtf8;
    SUSPEND;
    "Param1" = TmpUtf8;   \-\- <<< Expect Error Here
    "Result" = "Param1";
    SUSPEND;
END

If I use WIN1252 for the connection character set then the block fails at the marked line, as expected (although it gives an overflow/truncation error rather than a transliteration failure, not sure why).

However, if I use UTF8 as the connection character set then the block completes successfully.

Ergo: the character set of the parameters to an EXECUTE BLOCK come from the attachment character set and NOT from the declared character set.

This is both far from obvious and contrary to the documentation. From the v2.5 Language Reference Update:

" Firebird 2.1 and up allow the use of domains instead of SQL datatypes when declaring input/output parameters and local variables. With the "TYPE OF" modifier only the domain's type is used, not its NOT NULL setting, CHECK constraint and/or default value. If the domain is of a text type, its character set and collation are always included. "

and, under the heading of "TYPE OF COLUMN in parameter and variable declarations"

" Only the type itself is used; in the case of string types, this includes the character set and the collation."

The reason why I say it is far from obvious is that execute block statements look so much like stored procedures, but in this respect their behaviour varies.

With a stored procedure the transliteration occurs with the input of the the client data into the parameter - the parameter itself remains with the type (charset and collation) that it was declared with.

But with the execute block the parameter type has been silently changed to the attachment character set (haven't studied the collation). As a result the code inside the block may behave differently to the same code inside a stored procedure. This is potentially very confusing and not predictable (the same block may behave differently depending on how the client connects).

I would prefer that the engine was fixed to respect the declared character set (whether declared directly or via TYPE OF references). If that is not possible then the documentation needs to be changed to highlight that EXECUTE BLOCK will change the character set (and collation?), and so the code may not behave the same as the equivalent stored procedure.

@firebird-automations
Copy link
Collaborator Author

Commented by: @dyemanov

EXECUTE BLOCK is a DSQL statement, so its parameters are described using the attachment charset (the same as for any SELECT/INSERT/etc statement). But the issue is actually wider, especially in Firebird v3. If you omit the character set specification for local variables, Firebird v3 will describe them using the attachment charset, not the default database charset as for stored PSQL objects (e.g. procedures). This behaviour differs between v2.5 and v3.0. So the issue with "the code inside the block may behave differently to the same code inside a stored procedure" still persists.

@firebird-automations
Copy link
Collaborator Author

Commented by: Geoff Worboys (gworboys)

The DSQL side will of course be described according to the attachment charset, but the internal PSQL parameter should remain of the declared type.

Note that I have been concentrating on the character set because that is what showed up first, but it is the collation that is perhaps more likely to hit users. The character set of the declared PSQL parameter is ignored even when you use TYPE OF. So if I declare a parameter as MyParam TYPE OF NAME_D (where NAME_D is a domain associated with a case-insensitive collation) then I expect to be able to write my PSQL code with value comparisons that can ignore case. But if Firebird has silently replaced my character set and collation with the attachment character set then my code takes on a different meaning.

Your comments about Firebird v3 are a possible concern, but as long as v3 continues to respect the true character set and collation of any TYPE OF declarations then it won't worry me over much. Essentially: where I have declared my intentions I expect Firebird to respect that, where I have not declared my intentions I am open to changes in default behaviour.

@firebird-automations
Copy link
Collaborator Author

Commented by: Geoff Worboys (gworboys)

Just confirming that this problem appears to be fixed in Firebird v3 (released version).

The problem doesn't happen on FB3 even when connecting to it with an FB2 client.

If you want to close this "Won't Fix" that's okay with me. I can work around it until I get to migrating to Firebird v3.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

1 participant