
Key: CORE-824
Type: New Feature
Status: Closed
Resolution: Fixed
Priority: Major
Assignee: Adriano dos Santos Fernandes
Reporter: seesink
Votes: 0
Watchers: 0
Firebird Core

accent ignoring collation for unicode

Created: 03/Dec/03 12:00 AM   Updated: 22/Jun/11 01:51 PM
Component/s: Charsets/Collation
Affects Version/s: None
Fix Version/s: 2.5 Alpha 1

Time Tracking:
Not Specified

Issue Links:
Relate
 

SF_ID: 853354
Development: Finished


Description
SFID: 853354#
Submitted By: seesink

Hello,

I am trying to do a query like

SELECT name FROM artist WHERE name LIKE '%BJORK%'

And trying to get BJÖRK as a result. Note the accent.

(And results like BJÖRK, BJORK etc. would also be valid)

If I am not mistaken I would need a collation for this,
but the closest thing I found is:

http://www.brookstonesystems.com

Which is nice, but it is not available for Linux and has no Unicode support.

Am I right that this would need a Unicode NOACCENT
collation? If so, this is my feature request.

Workarounds for the problem are highly appreciated. Are
there other collations / charsets in Firebird which do this?

Cheers,
Remco Seesink

P.S. NOCASE would be nice too, but the workaround with
UPPER works fine.

Alice F. Bird added a comment - 14/Jun/06 09:42 AM
Date: 2004-06-25 01:42
Sender: raseesink
Logged In: YES
user_id=669582

We solved this by filling a separate word list without accents. This was
done mainly to speed up searching, and it solves the accent problem at the
same time.

A word list can be searched by index using STARTING WITH 'BJORK'
instead of LIKE '%BJORK%'. It is not the same, but for our problem set
it gives even better results, as you get fewer false (semantic) positives.
The disadvantage is that the data is duplicated and risks getting out of sync.
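
For anyone wanting to try the word-list workaround, here is a rough sketch in Python. It is only an illustration under assumptions, not the code we actually used: the table names artist and artist_word, the helper names, and the use of an in-memory SQLite database (as a stand-in so the example runs without a Firebird server) are all hypothetical. On Firebird the prefix test would be written with STARTING WITH on the indexed word column.

```python
import sqlite3
import unicodedata


def strip_accents(text):
    """Drop combining marks after canonical decomposition (NFD).

    Letters without a decomposition (for example 'Ø') are left unchanged.
    """
    decomposed = unicodedata.normalize("NFD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def build_wordlist(conn, artists):
    """Create the hypothetical tables and fill them from (id, name) pairs."""
    conn.execute("CREATE TABLE artist (id INTEGER PRIMARY KEY, name TEXT)")
    conn.execute("CREATE TABLE artist_word (word TEXT, artist_id INTEGER)")
    conn.execute("CREATE INDEX idx_artist_word ON artist_word (word)")
    for artist_id, name in artists:
        conn.execute("INSERT INTO artist VALUES (?, ?)", (artist_id, name))
        # One row per word, stored without accents and in upper case.
        for word in name.split():
            conn.execute(
                "INSERT INTO artist_word VALUES (?, ?)",
                (strip_accents(word).upper(), artist_id),
            )


def search(conn, prefix):
    """Return names of artists that have a word starting with the prefix."""
    key = strip_accents(prefix).upper()
    rows = conn.execute(
        "SELECT DISTINCT a.name FROM artist a "
        "JOIN artist_word w ON w.artist_id = a.id "
        "WHERE substr(w.word, 1, length(?)) = ?",
        (key, key),
    )
    return [name for (name,) in rows]
```

The word list has to be refreshed whenever an artist name changes, which is the out-of-sync risk mentioned above.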

Adriano dos Santos Fernandes added a comment - 11/Feb/08 07:06 PM
We now have UNICODE_CI and UNICODE_CI_AI.
UNICODE_AI (case-sensitive / accent-insensitive) is still not present.
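
To make the difference between the three collation names concrete, the following Python sketch builds simple comparison keys. It is only an approximation for illustration and does not reflect how Firebird implements its collations; the function names are hypothetical, and accent removal is assumed to mean dropping combining marks after NFD decomposition.

```python
import unicodedata


def _strip_accents(text):
    """Drop combining marks after canonical decomposition (NFD)."""
    decomposed = unicodedata.normalize("NFD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def key_ci(text):
    """Case-insensitive, accent-sensitive (the idea behind UNICODE_CI)."""
    return unicodedata.normalize("NFC", text).casefold()


def key_ci_ai(text):
    """Case-insensitive, accent-insensitive (the idea behind UNICODE_CI_AI)."""
    return _strip_accents(text).casefold()


def key_ai(text):
    """Case-sensitive, accent-insensitive (the idea behind UNICODE_AI)."""
    return _strip_accents(text)


def contains(haystack, needle, key):
    """Emulate LIKE '%needle%' under the given comparison key."""
    return key(needle) in key(haystack)
```

Under these assumptions, contains("Björk", "BJORK", key_ci_ai) is true, which is the behaviour asked for in the original request, while the same call with key_ci is false because the accent still counts.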