
Key: CORE-5963
Type: Sub-task
Status: Open
Priority: Major
Assignee: Unassigned
Reporter: Tomasz Kujalow
Votes: 0
Watchers: 0
Firebird Core
CORE-4661

Restore parameter to convert one byte character set to UTF-8

Created: 13/Nov/18 04:40 PM   Updated: 14/Nov/18 04:49 PM
Component/s: GBAK
Affects Version/s: 3.0.5
Fix Version/s: None

QA Status: No test


Description
I think the most common scenario is converting a one-byte character set to UTF-8.
So maybe it would be possible to add a parameter that forces the one-byte character set named by that parameter to be replaced with UTF-8 (which is always possible).

For example, we have multiple databases using WIN1250. If we set a parameter for gbak such as -force_convert=WIN1250, it would convert all metadata fields (in tables, procedures, etc.) and the data in those fields to UTF8.
This could be a very useful feature.
Is it possible?

Adriano dos Santos Fernandes added a comment - 13/Nov/18 06:11 PM - edited
It's not "always possible".

There are stored routines, which may use character sets and collations in their bodies.

There may be code doing "where my_field collate xxx = 'y'", and then collation xxx is not compatible with utf8.

This seems to be a task for recreating the metadata, editing the routines, and then pumping the data.
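
The "edit the metadata" step can be illustrated with a minimal sketch. It assumes the metadata has already been extracted to a text script (for example with isql -x) and uses plain regular expressions rather than a real SQL parser, so it can miss or mis-rewrite unusual formatting. The helper names rewrite_charset and find_collations are hypothetical, and PXW_PLK appears only as a sample WIN1250 collation.

```python
import re
from typing import List, Tuple

# Matches "CHARACTER SET WIN1250" regardless of case or spacing.
CHARSET_RE = re.compile(r"\bCHARACTER\s+SET\s+WIN1250\b", re.IGNORECASE)
# Matches any "COLLATE <name>" clause so it can be reviewed by hand.
COLLATE_RE = re.compile(r"\bCOLLATE\s+(\w+)", re.IGNORECASE)


def rewrite_charset(ddl: str) -> str:
    """Replace every WIN1250 character set clause with UTF8."""
    return CHARSET_RE.sub("CHARACTER SET UTF8", ddl)


def find_collations(ddl: str) -> List[Tuple[int, str]]:
    """Return (line number, collation name) for each COLLATE clause found."""
    hits = []
    for lineno, line in enumerate(ddl.splitlines(), start=1):
        for match in COLLATE_RE.finditer(line):
            hits.append((lineno, match.group(1).upper()))
    return hits


if __name__ == "__main__":
    sample = (
        "CREATE TABLE T (NAME VARCHAR(40) CHARACTER SET WIN1250 COLLATE PXW_PLK);\n"
        "CREATE INDEX I ON T COMPUTED BY (NAME COLLATE PXW_PLK);"
    )
    print(rewrite_charset(sample))
    for lineno, name in find_collations(sample):
        print(f"line {lineno}: review COLLATE {name}")
```

The rewrite deliberately leaves COLLATE clauses untouched and only reports them, because choosing a compatible UTF8 collation for each one is the part that needs manual review.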

Tomasz Kujalow added a comment - 14/Nov/18 04:43 PM
OK. But changing collations in routines (procedures, triggers, packages, etc.) is simple.
The most difficult part is converting the character set of fields from single-byte to UTF-8, especially for big databases (1k tables, metadata size = 35 MB).
More importantly, any errors would occur only after the restore (when the database is in use) and could be repaired easily (by changing stored procedures, views, triggers).

We generally have a big problem with migrating from WIN1250 (95% of all string fields) to UTF-8 for more than 300 databases installed on our customers' computers (some of them are not connected to the internet, so we have no access to them). So we have to prepare a program that automatically converts a database from WIN1250 to UTF-8. Calling gbak in such a scenario (backup/restore) is simple and reliable. But gbak does not have an option to replace the character set of fields, which is the most difficult part of the whole migration.
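
For the data itself, the conversion is a plain re-encoding. The following is a minimal sketch, assuming the raw column bytes are available to a migration program and that Python's cp1250 codec corresponds to WIN1250; the function name win1250_to_utf8 is hypothetical.

```python
def win1250_to_utf8(raw: bytes) -> bytes:
    """Decode bytes stored as WIN1250 (Windows-1250) and re-encode as UTF-8."""
    return raw.decode("cp1250").encode("utf-8")


if __name__ == "__main__":
    text = "Zażółć gęślą jaźń"
    stored = text.encode("cp1250")       # one byte per character
    converted = win1250_to_utf8(stored)  # accented letters take two bytes
    print(len(stored), len(converted))
```

The character count stays the same, but the byte length grows for every non-ASCII character.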

Maybe a parameter such as:
-CONV_SC_FROM_TO_UTF8=WIN1250,UCS_BASIC
First value (WIN1250): convert from this character set to UTF-8.
Second value (UCS_BASIC): use this collation for the destination UTF-8 fields.

Even if it worked only for table fields, it would be a big convenience.

Adriano dos Santos Fernandes added a comment - 14/Nov/18 04:49 PM
The problematic expressions may be embedded anywhere (expression indexes, constraints, etc.).

So it's not "reliable" to put functionality into a built-in tool when there are many situations in which it would not work correctly.