Key: CORE-6202
Type: Bug
Status: Open
Priority: Minor
Assignee: Unassigned
Reporter: Kjell Rilbe
Votes: 0
Watchers: 3

Firebird Core

External table file names not transliterated to OS character set

Created: 05/Dec/19 07:46 AM   Updated: 05/Dec/19 07:48 AM
Component/s: Engine
Affects Version/s: 3.0.4
Fix Version/s: None

Environment: Windows 64 bit with OS character set Windows-1252

QA Status: No test


Description
It seems that the file name specified for an external table is sent to the operating system's file operations without transliteration. This makes it impossible to use file names with non-ASCII characters.

For example, specifying the file name 'Teståäö.txt' results in a file named 'TestÃ¥Ã¤Ã¶.txt', which is the Win-1252 interpretation of the byte sequence that encodes the UTF-8 string 'Teståäö.txt'.
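The effect can be reproduced outside Firebird. The following is only an illustrative Python sketch of the suspected mechanism, not engine code; the function name `misread_as_cp1252` is made up for this example.

```python
# Illustration only: reproduce the reported mojibake outside Firebird.
# misread_as_cp1252 is a hypothetical name, not a Firebird function.

def misread_as_cp1252(name: str) -> str:
    """Encode the name as UTF-8, then read those bytes as Windows-1252."""
    utf8_bytes = name.encode("utf-8")    # what the engine appears to pass on
    return utf8_bytes.decode("cp1252")   # how a Win-1252 file API reads it


intended = "Teståäö.txt"
print(intended.encode("utf-8"))          # each of å, ä, ö becomes two bytes
print(misread_as_cp1252(intended))       # the name that shows up on disk
```

Each non-ASCII character turns into two Win-1252 characters, which matches the file name seen in Windows.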

In other words, it would appear that the file name, stored in UTF-8 (UNICODE_FSS?) format, is sent as-is to a Windows system call that expects the file name to be encoded in the operating system's code page, in this case Win-1252.

I've tried this in both isql and FlameRobin and got consistent results. The file name appears correct in RDB$RELATIONS.RDB$EXTERNAL_FILE but ends up wrong in the operating system, as described above.

I expect this to be rather easily fixed, considering the file name is always stored in the same character set (UTF8, or is it UNICODE_FSS?) and the operating system's character set is known. All that should be needed is to add transliteration of the stored file name before sending it to any operating system call.
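To make the proposed fix concrete, here is a minimal Python sketch of the transliteration step. It assumes the stored name is UTF-8 and the OS character set is Win-1252; the helper `to_os_charset` and its parameters are hypothetical and do not correspond to anything in the Firebird source.

```python
# Sketch of the proposed transliteration step. to_os_charset is a
# hypothetical helper, not part of Firebird.

def to_os_charset(stored_name: bytes, os_charset: str = "cp1252") -> bytes:
    """Transliterate a UTF-8 encoded file name to the OS character set.

    Raises UnicodeEncodeError if a character has no equivalent in the
    target character set, instead of silently producing a wrong name.
    """
    text = stored_name.decode("utf-8")   # bytes as stored in the metadata
    return text.encode(os_charset)       # bytes the OS file call expects


stored = "Teståäö.txt".encode("utf-8")
os_bytes = to_os_charset(stored)
print(os_bytes)                          # one byte per character
print(os_bytes.decode("cp1252"))         # reads back as the intended name
```

Names containing characters that do not exist in the OS character set cannot be transliterated at all, so a real fix would probably need to report an error in that case rather than create a differently named file.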

By the way, I think I've had similar issues with database file names, but have not tried it recently. Maybe it would be a good idea to go through all operating system file operations and make sure the file names passed are properly transliterated.

Comments
Kjell Rilbe added a comment - 05/Dec/19 07:48 AM
There are no viable workarounds, except just not using non-ASCII names (are we back in the 1980s?), because there's no valid way to encode a string in UTF-8 that will appear as "åäö" when interpreted as Win-1252. The character codes needed are not valid UTF-8.
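A quick way to check this claim, sketched in Python: take the Win-1252 bytes that would display as "åäö" and test whether they form valid UTF-8. The helper name `is_valid_utf8` is made up for this example.

```python
# Check whether the Win-1252 bytes for "åäö" could ever be stored as a
# valid UTF-8 string. is_valid_utf8 is a hypothetical helper name.

def is_valid_utf8(data: bytes) -> bool:
    """Return True if the byte sequence decodes cleanly as UTF-8."""
    try:
        data.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


needed = "åäö".encode("cp1252")   # bytes the file API would have to receive
print(needed)
print(is_valid_utf8(needed))
```

The three bytes are each a UTF-8 lead byte with no continuation byte after it, so the sequence is rejected and cannot be entered as a UTF-8 file name.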