WIN1257_LV (Latvian) collation is wrong for 4 letters: A E I U. [CORE3131] #3508
Comments
Commented by: Aleksey Timohin (tdelphi) updated |
Modified by: Aleksey Timohin (tdelphi) description: updated |
Modified by: @dyemanov assignee: Dmitry Yemanov [ dimitr ] |
Modified by: @dyemanov priority: Critical [ 2 ] => Major [ 3 ] |
Commented by: Aleksey Timohin (tdelphi) added script and DB backup |
Modified by: Aleksey Timohin (tdelphi) Attachment: test_lv_script_utf8.sql [ 11773 ] Attachment: test_lv2.gbk [ 11774 ] description: updated with the reproduction script and backup details |
Modified by: @dyemanov status: Open [ 1 ] => In Progress [ 3 ] |
Modified by: @dyemanov status: In Progress [ 3 ] => Open [ 1 ] |
Modified by: @dyemanov status: Open [ 1 ] => Resolved [ 5 ] resolution: Fixed [ 1 ] Fix Version: 2.1.4 [ 10361 ] Fix Version: 3.0 Alpha 1 [ 10331 ] Fix Version: 2.5.1 [ 10333 ] |
Commented by: @dyemanov Please test the next (tomorrow's) snapshot build. Note that you'll have to recreate all indices existing for WIN1257_LV columns, or backup/restore the database. |
Commented by: Aleksey Timohin (tdelphi) Tested (on 2.5.1). Works as described. Thank you. But there is another issue: CREATE COLLATION my_lv2 select * The records in the query result will be ordered using the rules for accented characters. This bug is not urgent or vital for us, but it exists. |
Modified by: @pcisar status: Resolved [ 5 ] => Closed [ 6 ] |
Modified by: @pavel-zotov status: Closed [ 6 ] => Closed [ 6 ] QA Status: Done successfully Test Details: Test data were taken from the script provided in this ticket. |
Submitted by: Aleksey Timohin (tdelphi)
Attachments:
test_lv_script_utf8.sql
test_lv2.gbk
The Latvian alphabet contains the accented letters Ā, Ē, Ī, Ū (and others). According to the rules of the alphabet, each accented letter should sort after its plain counterpart, but in Firebird it does not. Currently Firebird does not distinguish them when sorting, and our clients are unhappy with that.
Currently it works this way:
A and Ā, a and ā - no difference in sorting
E and Ē, e and ē - no difference in sorting
I and Ī, i and ī - no difference in sorting
U and Ū, u and ū - no difference in sorting
Currently it works as described here: http://www.collation-charts.org/firebird20/fb203.WIN1257.WIN1257_LV.html
Should be:
AĀ, aā
EĒ, eē
IĪ, iī
UŪ, uū
Link to the Latvian alphabet on Wikipedia: http://lv.wikipedia.org/wiki/Latvie%C5%A1u_alfab%C4%93ts
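The expected order listed under "Should be" can be sketched outside the database. The snippet below is only an illustration in Python, not Firebird code: the names `PAIRS` and `lv_sort_key` and the sample words are made up for this example. It covers only the four vowels from this ticket and ignores case, while a real collation also has to handle upper/lower case and the other Latvian letters.

```python
# Hypothetical illustration (not Firebird code): rank each letter so that an
# accented vowel sorts immediately after its plain counterpart.
PAIRS = {"ā": "a", "ē": "e", "ī": "i", "ū": "u"}


def lv_sort_key(text):
    """Build a sort key that places ā/ē/ī/ū right after a/e/i/u.

    Case is folded away here for simplicity; this is an assumption of the
    sketch, not a statement about how WIN1257_LV treats case.
    """
    key = []
    for ch in text.lower():
        base = PAIRS.get(ch)
        if base is not None:
            key.append((base, 1))  # accented: same base letter, later rank
        else:
            key.append((ch, 0))    # plain letter (or any other character)
    return key


# Sample words chosen only to exercise the four letter pairs.
words = ["ūdens", "uguns", "āda", "ada", "ēna", "ezers", "īss", "istaba"]
for word in sorted(words, key=lv_sort_key):
    print(word)
```

With this key, a word starting with an accented vowel lands after every word starting with the plain vowel and before the next letter of the alphabet, which is the behaviour requested for WIN1257_LV.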
I can provide additional information and/or a test DB if you need it.
Can this be fixed in Firebird 2.5 Final? Or is there maybe a way to fix it for older FB versions as well?
P.S. I tried to create a custom collation with ACCENT, but it doesn't work as expected.
Thank you in advance.
A script to reproduce the problem is attached. The script creates a table with 2 fields: "TEXT" - Latvian text, "SORTIROVKA" - a text field with the correct sort order indexes. The script is saved in UTF-8 encoding.
To reproduce the problem, use the query:
select *
from TEST_LV_SORT tls
order by tls.text COLLATE WIN1257;
or
select *
from TEST_LV_SORT tls
order by tls.text COLLATE test_lv;
I have also attached a backup file of the DB with test data (the same as in the script). It is a backup image (gbak) for Firebird 2.5.
Commits: b70b571 945b928 57ecbe4
====== Test Details ======
Test data were taken from the script provided in this ticket.