WIN1257_LV (Latvian) collation is wrong for 4 letters: A E I U. [CORE3131] #3508

Closed
firebird-automations opened this issue Sep 9, 2010 · 13 comments

Comments

@firebird-automations

Submitted by: Aleksey Timohin (tdelphi)

Attachments:
test_lv_script_utf8.sql
test_lv2.gbk

In the Latvian alphabet there are accented letters based on A, E, I, U (and others). According to the rules of the alphabet, accented letters should sort after their unaccented counterparts, but they don't. Currently Firebird does not distinguish them when sorting, and our clients are unhappy with that.

Current behavior:
A and Ā, a and ā - no difference in sorting
E and Ē, e and ē - no difference in sorting
I and Ī, i and ī - no difference in sorting
U and Ū, u and ū - no difference in sorting

Currently it works as described here: http://www.collation-charts.org/firebird20/fb203.WIN1257.WIN1257_LV.html

Should be:
AĀ, aā
EĒ, eē
IĪ, iī
UŪ, uū

Latvian alphabet on Wikipedia: http://lv.wikipedia.org/wiki/Latvie%C5%A1u_alfab%C4%93ts

I can provide additional information and/or a test database if needed.

Can this be fixed in Firebird 2.5 Final? Or is there a way to fix it for older Firebird versions as well?

P.S. I tried to create a custom collation with the ACCENT option, but it doesn't work as expected.

Thank you in advance.

A script to reproduce the problem is attached. It creates a table with two fields: "TEXT", containing Latvian text, and "SORTIROVKA", a text field holding the correct sort indexes. The script is saved in UTF-8 encoding.

To reproduce the problem, use the query:
select *
from TEST_LV_SORT tls
order by tls.text COLLATE WIN1257;

or

select *
from TEST_LV_SORT tls
order by tls.text COLLATE test_lv;

I have also attached a backup file of the database with the test data (the same as in the script). The backup (gbak) was made with Firebird 2.5.
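
For readers without the attachments, the following is a minimal sketch of a comparable test case. The table name LV_DEMO and its single column are hypothetical and are not taken from the attached script; it also assumes the connection character set matches the encoding of the script (for example, SET NAMES UTF8 issued before CONNECT in isql).

```sql
-- Hypothetical table, not the one from the attached script.
CREATE TABLE LV_DEMO (
    TXT VARCHAR(10) CHARACTER SET WIN1257 COLLATE WIN1257_LV
);
COMMIT;

-- Insert the four affected letter pairs in a deliberately mixed order.
INSERT INTO LV_DEMO (TXT) VALUES ('Ū');
INSERT INTO LV_DEMO (TXT) VALUES ('A');
INSERT INTO LV_DEMO (TXT) VALUES ('Ī');
INSERT INTO LV_DEMO (TXT) VALUES ('E');
INSERT INTO LV_DEMO (TXT) VALUES ('Ā');
INSERT INTO LV_DEMO (TXT) VALUES ('U');
INSERT INTO LV_DEMO (TXT) VALUES ('Ē');
INSERT INTO LV_DEMO (TXT) VALUES ('I');
COMMIT;

-- The column's own collation (WIN1257_LV) is used for ordering here.
SELECT TXT FROM LV_DEMO ORDER BY TXT;
```

Per the expected order listed above, each accented letter should come directly after its base letter (A before Ā, E before Ē, and so on). On an affected server the two letters of each pair are not distinguished, so their relative order is not guaranteed.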

Commits: b70b571 945b928 57ecbe4

====== Test Details ======

Test data were taken from the script provided in this ticket.

@firebird-automations

Commented by: Aleksey Timohin (tdelphi)

updated

@firebird-automations

Modified by: @dyemanov

assignee: Dmitry Yemanov [ dimitr ]

@firebird-automations

Modified by: @dyemanov

priority: Critical [ 2 ] => Major [ 3 ]

@firebird-automations

Commented by: Aleksey Timohin (tdelphi)

added script and DB backup

@firebird-automations

Modified by: Aleksey Timohin (tdelphi)

Attachment: test_lv_script_utf8.sql [ 11773 ]

Attachment: test_lv2.gbk [ 11774 ]

@firebird-automations

Modified by: @dyemanov

status: Open [ 1 ] => In Progress [ 3 ]

@firebird-automations

Modified by: @dyemanov

status: In Progress [ 3 ] => Open [ 1 ]

@firebird-automations

Modified by: @dyemanov

status: Open [ 1 ] => Resolved [ 5 ]

resolution: Fixed [ 1 ]

Fix Version: 2.1.4 [ 10361 ]

Fix Version: 3.0 Alpha 1 [ 10331 ]

Fix Version: 2.5.1 [ 10333 ]

@firebird-automations

Commented by: @dyemanov

Please test the next (tomorrow's) snapshot build. Note that you'll have to recreate all indices existing for WIN1257_LV columns, or backup/restore the database.
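
As a rough sketch of the index step mentioned above: the first query lists indices that contain a column whose collation is WIN1257_LV, and the statements after it rebuild one of them by deactivating and reactivating it. The index name IDX_TEST_LV_TEXT is hypothetical. The query is an assumption about what is sufficient: it does not cover expression indices or custom collations derived from WIN1257_LV, and indices that back constraints generally cannot be deactivated this way, in which case backup/restore remains the option.

```sql
-- List indices that include a column collated with WIN1257_LV.
SELECT DISTINCT
       TRIM(i.RDB$INDEX_NAME)    AS INDEX_NAME,
       TRIM(i.RDB$RELATION_NAME) AS TABLE_NAME
FROM RDB$INDICES i
JOIN RDB$INDEX_SEGMENTS s
  ON s.RDB$INDEX_NAME = i.RDB$INDEX_NAME
JOIN RDB$RELATION_FIELDS rf
  ON rf.RDB$RELATION_NAME = i.RDB$RELATION_NAME
 AND rf.RDB$FIELD_NAME = s.RDB$FIELD_NAME
JOIN RDB$FIELDS f
  ON f.RDB$FIELD_NAME = rf.RDB$FIELD_SOURCE
JOIN RDB$COLLATIONS c
  ON c.RDB$CHARACTER_SET_ID = f.RDB$CHARACTER_SET_ID
 -- A column-level collation overrides the one defined on the domain.
 AND c.RDB$COLLATION_ID = COALESCE(rf.RDB$COLLATION_ID, f.RDB$COLLATION_ID)
WHERE c.RDB$COLLATION_NAME = 'WIN1257_LV';

-- Rebuild one index (hypothetical name) by toggling its state.
ALTER INDEX IDX_TEST_LV_TEXT INACTIVE;
ALTER INDEX IDX_TEST_LV_TEXT ACTIVE;
COMMIT;
```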

@firebird-automations

Commented by: Aleksey Timohin (tdelphi)

Tested (on 2.5.1). Works as described. Thank you.

But there is another issue:
A custom collation created with ACCENT INSENSITIVE still behaves the same way as an ACCENT SENSITIVE collation.
For example:

CREATE COLLATION my_lv2
FOR WIN1257
from win1257_lv
no pad
CASE INSENSITIVE
ACCENT INSENSITIVE;

select *
from TEST_LV_SORT tls
order by tls.text COLLATE my_lv2;

The query results are still ordered using the accent-sensitive rules.

This bug is not urgent or critical for us, but it exists.

@firebird-automations

Modified by: @pcisar

status: Resolved [ 5 ] => Closed [ 6 ]

@firebird-automations

Modified by: @pavel-zotov

status: Closed [ 6 ] => Closed [ 6 ]

QA Status: Done successfully

Test Details: Test data were taken from the script provided in this ticket.
