New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
_UTF8 prefix for string values does not work with WIN1252 charset [DNET933] #856
Comments
Commented by: @cincuranet That's weird, the `Mülltüte` does not contain any character that can't be represented in WIN1252, AFAIK. I can't immediately understand why Firebird does not handle it properly. > Otherwise: Is it safe to connect via UTF8 to a database with WIN1252 as charset for columns or are there any side effects? Yes, it's safe. Obviously you shouldn't pass into database values that can't be represented in WIN1252. |
Modified by: @cincuranetstatus: Open [ 1 ] => In Progress [ 3 ] |
Commented by: Daniel Richter (drichter) Works: Does not work: Is it possible to omit the _UTF8-prefix in cases where the connection string charset is not UTF8? |
Commented by: @cincuranet > Is it possible to omit the _UTF8-prefix in cases where the connection string charset is not UTF8? Thinking about it more, I think the problem is because the command text is sent as WIN1252 (the connection charset) and then Firebird tries to consider the piece as UTF8 and the bytes do not align. But I don't think removing the prefix is a good idea. I'm thinking in a way of: column charset X, connection charset Y, C# obviously Unicode. And then the X and Y codepages might provide different characters. IMO the correct solution would be to use (and maybe force) usage of UTF8. At least for EF Core that's what I'm now really thinking to do. Or maybe sending it as HEX literal. Which would solve it, but also the query would not be nice. |
Commented by: Daniel Richter (drichter) > Or maybe sending it as HEX literal. Which would solve it, but also the query would not be nice. > Thinking about it more, I think the problem is because the command text is sent as WIN1252 (the connection charset) and then Firebird tries to consider the piece as UTF8 and the bytes do not align. But I don't think removing the prefix is a good idea. I'm thinking in a way of: column charset X, connection charset Y, C# obviously Unicode. And then the X and Y codepages might provide different characters. Hm, from a database clients view: > IMO the correct solution would be to use (and maybe force) usage of UTF8. At least for EF Core that's what I'm now really thinking to do. Than please throw an exception in case UTF8 is _not_ used. |
EF6 provider here is in maintenance/legacy mode here (see https://groups.google.com/g/firebird-net-provider/c/_0Hwoh5XG_I). If somebody feels like tackling this with good PR, it will be reopened. |
Submitted by: Daniel Richter (drichter)
For string values in queries, the generated SQL results in inline values with the prefix "_UTF8". E.g,
dataContext.Articles.Where(a => a.Name=="Mülltüte")
generates SQL such as
SELECT * FROM "ARTICLE" WHERE "NAME"=_UTF8'Mülltüte'
That works as long as the connection charset is set to "UTF8". In case the connection string defines a charset such as "WIN1252" the execution throws an error "Dynamic SQL Error, SQL error code = -104, Malformed string" (when e.g. the value contains an umlaut).
After removing the "_UTF8" prefix everything works fine (both WIN1252 and UTF8 charset):
SELECT * FROM "ARTICLE" WHERE "NAME"='Mülltüte'
Because of the following code in SqlGenerator.cs, Visit(DbConstantExpression)
// constant is always considered Unicode
isUnicode = true;
The "_UTF8" prefix will always be generated - no matter which charset was used to connect.
In my opinion it is not possible to use string values in queries with charsets other than UTF8. The "_UTF8" prefix should only be added in case the database connection uses UTF8, too.
Otherwise: Is it safe to connect via UTF8 to a database with WIN1252 as charset for columns or are there any side effects?
The text was updated successfully, but these errors were encountered: