
Key: JDBC-257
Type: Task
Status: Closed
Resolution: Duplicate
Priority: Major
Assignee: Mark Rotteveel
Reporter: Mark Rotteveel
Votes: 0
Watchers: 1

Jaybird JCA/JDBC Driver

Investigate options for more intelligent decision of encoding for connection without explicit characterset

Created: 28/Jun/12 04:57 PM   Updated: 07/May/17 12:32 PM
Component/s: JDBC driver
Affects Version/s: None
Fix Version/s: Jaybird 3.0.0

Description
Currently, when no connection characterset is specified or NONE is used for connecting to Firebird, Jaybird always uses the default system encoding of Java to convert between strings and bytes.

It may be a good idea to change this to first try the characterset of the database itself. If that is NONE as well, then fall back to the default system encoding. The connection characterset would remain NONE, but the encoding used in FBStringField would change to match the database, if possible.

Although this sounds good in theory, it needs to be tested to ensure it works and does not cause compatibility problems, and to determine whether the database characterset can be retrieved easily and efficiently.
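
The fallback order proposed above can be sketched as follows. This is only an illustration under stated assumptions: the class and method names are hypothetical and not part of Jaybird, and the mapping from Firebird character set names to Java charsets is assumed to have happened already (an empty Optional stands for "not specified", NONE, or "could not be retrieved").

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.Optional;

// Hypothetical helper, not Jaybird code: sketches the proposed fallback order.
final class EncodingFallback {

    /**
     * @param connectionCharset Java charset of the explicit connection characterset,
     *                          or empty when none was given or it is NONE
     * @param databaseCharset   Java charset of the database default characterset,
     *                          or empty when it is NONE or could not be retrieved
     */
    static Charset chooseStringEncoding(Optional<Charset> connectionCharset,
                                        Optional<Charset> databaseCharset) {
        // 1. An explicit connection characterset always wins.
        if (connectionCharset.isPresent()) {
            return connectionCharset.get();
        }
        // 2. Otherwise try the database characterset.
        // 3. If that is NONE as well, fall back to the JVM default encoding.
        return databaseCharset.orElseGet(Charset::defaultCharset);
    }

    public static void main(String[] args) {
        // No connection characterset, database uses ISO 8859-1.
        Charset chosen = chooseStringEncoding(
                Optional.empty(), Optional.of(StandardCharsets.ISO_8859_1));
        System.out.println(chosen); // prints ISO-8859-1
    }
}
```

In this sketch only the encoding used for string conversion changes; the connection characterset reported to the server would still be NONE, as described above.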

Comments
Mark Rotteveel added a comment - 29/Jun/12 11:59 AM
Roman suggests that if the connection characterset is NONE and the database characterset is NONE as well, that we terminate the connection.

Roman Rokytskyy added a comment - 29/Jun/12 12:06 PM
No, I suggest dropping the connection if the database charset is NONE and the connection charset is not specified. This is the "FB beginners case". If the encoding property is set explicitly to NONE, we have to assume that people know what they are doing, especially when charSet=XXX is given, which means "I know how to interpret this data".
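
Read literally, the rule in this comment distinguishes "not specified" from an explicit NONE. A minimal sketch of that reading, with a hypothetical class and method name and with null standing for "not specified", might look like this:

```java
// Hypothetical sketch of the suggested rule; not Jaybird code.
final class ConnectionPolicy {

    /**
     * @param connectionCharset null when no connection charset was specified,
     *                          otherwise the explicit value (possibly "NONE")
     * @param databaseCharset   the default charset of the database
     * @return true when the connection should be dropped under the suggested rule
     */
    static boolean shouldReject(String connectionCharset, String databaseCharset) {
        boolean notSpecified = connectionCharset == null;
        boolean databaseIsNone = "NONE".equalsIgnoreCase(databaseCharset);
        // Only the "beginners case" is rejected: nothing specified and a NONE database.
        // An explicit NONE is accepted, on the assumption that the user knows the data.
        return notSpecified && databaseIsNone;
    }
}
```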

Mark Rotteveel added a comment - 29/Jun/12 12:11 PM
To be honest, I don't see a real use case for specifying NONE explicitly, which is why I treated having no explicit connection characterset as identical to NONE.

Roman Rokytskyy added a comment - 29/Jun/12 12:17 PM
Backward compatibility only - you don't really want to know how people can misuse the NONE charset. That is why we have the encoding and charSet properties in Jaybird - to let people somehow solve the issues with the legacy databases they have.

Mark Rotteveel added a comment - 29/Jun/12 02:05 PM - edited
Changing the existing behavior could actually lead to logical data corruption, see my post

If we want to change anything, we might have no other option than to refuse connections that do not explicitly set charSet or encoding. Otherwise we need to put really big warning signs in the release notes, add an SQLWarning to the connection, and write warnings to System.err and the log ;)
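
As a general illustration of why a changed decoding can corrupt data logically (this shows the underlying mechanism only, it is not a reproduction of the linked post), the sketch below stores a string with one charset and reads it back with another. The class name and the sample text are made up for the example.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Illustrative only: the same stored bytes decode differently under another charset.
final class MismatchDemo {

    /** Encodes text with one charset and decodes the resulting bytes with another. */
    static String roundTrip(String text, Charset writeWith, Charset readWith) {
        byte[] stored = text.getBytes(writeWith); // what ends up in a NONE column
        return new String(stored, readWith);      // what a later client reads back
    }

    public static void main(String[] args) {
        String original = "caf\u00e9";

        String unchanged = roundTrip(original,
                StandardCharsets.ISO_8859_1, StandardCharsets.ISO_8859_1);
        String changed = roundTrip(original,
                StandardCharsets.ISO_8859_1, StandardCharsets.UTF_8);

        System.out.println(original.equals(unchanged)); // prints true
        System.out.println(original.equals(changed));   // prints false
    }
}
```

The bytes in the database never change in this scenario; only their interpretation does, which is why such corruption can go unnoticed until the data is written back.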

Mark Rotteveel added a comment - 07/Jul/12 06:42 PM
As a first step I added code (for Jaybird 2.2) to log a warning and add a warning on the connection.
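
For applications that want to notice such a warning, the standard JDBC way is to walk the chain returned by Connection.getWarnings(). The helper below is a hedged sketch with a hypothetical class name; it uses only java.sql.SQLWarning and makes no assumption about the actual message text Jaybird produces.

```java
import java.sql.SQLWarning;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: collects the messages of a chain of SQLWarning objects.
final class WarningReader {

    /** @param first the head of the chain, e.g. connection.getWarnings(); may be null */
    static List<String> messages(SQLWarning first) {
        List<String> result = new ArrayList<>();
        // SQLWarning objects form a linked chain via getNextWarning().
        for (SQLWarning w = first; w != null; w = w.getNextWarning()) {
            result.add(w.getMessage());
        }
        return result;
    }
}
```

Typical use would be `WarningReader.messages(connection.getWarnings())` directly after opening the connection.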

Mark Rotteveel added a comment - 03/Jul/16 11:02 AM
I decided not to change anything other than keeping the warning introduced in Jaybird 2.2. Defaulting to UTF-8 has an obvious downside with regard to logical corruption, and refusing the connection if no character set is specified would be very inconvenient. We are simply between a rock and a hard place on this.

Mark Rotteveel added a comment - 30/Jul/16 02:03 PM
Reversed my previous decision: we will now reject the connection if no explicit character set was specified. See also JDBC-446
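
With that decision, client code needs to name a character set when connecting. A minimal sketch of doing so through the encoding property mentioned earlier in this thread is shown below; the class name, user, password, and the UTF8 choice are placeholders for illustration.

```java
import java.util.Properties;

// Hypothetical helper: builds connection properties with an explicit character set.
final class ExplicitEncodingExample {

    static Properties connectionProperties(String user, String password,
                                           String firebirdCharset) {
        Properties props = new Properties();
        props.setProperty("user", user);
        props.setProperty("password", password);
        // "encoding" takes a Firebird character set name such as UTF8.
        props.setProperty("encoding", firebirdCharset);
        return props;
    }
}
```

The resulting properties would then be passed to `DriverManager.getConnection(url, props)`, for example with a URL of the form `jdbc:firebirdsql://localhost:3050/employee`, where the host, port, and database name are assumptions for the example.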