I have a varchar column defined as charset UTF8 with collate UNICODE.
When I use ORDER BY the result set is:
a, ą, ąb, ac
and should be:
a, ac, ą, ąb
The problem is with the Polish 'ą' character. It should sort immediately after the ASCII 'a' character.
So why does 'ac' sort after 'ą' and 'ąb' in the ORDER BY? If you sort single-character strings
there is no problem (as seen in the example above: 'ą' follows 'a').
The same problem occurs with the other Polish characters: ć, ś, ę, ń, ó, ż, ź.
They should simply follow c, s, e, n, o, z respectively.
For example:
ORDER BY gives:
s, ś, śb, sc
and should be:
s, sc, ś, śb
What is also strange: there is no problem with the Polish characters 'Ł' and 'ł'.
They always correctly follow 'L' and 'l' respectively.
In summary:
I think that charset UTF8 with collate UNICODE should handle
Polish characters the same way as charset WIN1250 with collate pxw_plk.
Windows also sorts this way.
This would be something like the following (NOTE: I am not 100% sure this will achieve the effect you desire).
CREATE COLLATION UNICODE_PL
FOR UTF8
FROM UNICODE
NO PAD -- not sure about this one
CASE SENSITIVE
ACCENT SENSITIVE
'LOCALE=pl_PL';
The problem is in your choice of the UNICODE collation, not in the collation itself or a specific engine deficiency. The collation was not designed/intended for the purpose you are using it for.
As Mark has replied, by creating and using an appropriate collation, you will get the result you expect.
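For anyone who wants to try the suggested collation, here is a minimal sketch of how it could be applied. It assumes the UNICODE_PL collation from the comment above has already been created and that the connection character set is UTF8. The table and column names (words, words_default, word) are hypothetical and exist only for this illustration. As noted above, it is not certain that the locale attribute produces the desired order, so treat this as a way to check rather than a confirmed fix.

```sql
-- Hypothetical test table: the column uses the locale-specific collation.
CREATE TABLE words (
  word VARCHAR(20) CHARACTER SET UTF8 COLLATE UNICODE_PL
);
COMMIT;

INSERT INTO words (word) VALUES ('ac');
INSERT INTO words (word) VALUES ('ąb');
INSERT INTO words (word) VALUES ('a');
INSERT INTO words (word) VALUES ('ą');
COMMIT;

-- Sorts with the column's own collation (UNICODE_PL here).
SELECT word FROM words ORDER BY word;

-- Hypothetical second table whose column keeps the plain UNICODE collation.
CREATE TABLE words_default (
  word VARCHAR(20) CHARACTER SET UTF8 COLLATE UNICODE
);
COMMIT;

INSERT INTO words_default (word) SELECT word FROM words;
COMMIT;

-- The collation can also be requested per query, without changing the column.
SELECT word FROM words_default ORDER BY word COLLATE UNICODE_PL;
```

Comparing the output of both queries against the order requested in the report (a, ac, ą, ąb) shows whether the collation behaves as hoped. The per-query COLLATE form is convenient for testing because it leaves existing column definitions untouched.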
Submitted by: Mariusz Nogala (mnogala)