New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
SUBSTRING with regular expression (SIMILAR TO) capability [CORE2006] #2443
Comments
Commented by: @samofatov BroadView uses the following UDFs for RegExp functionality, and they are proven very convenient. -- Takes String, Pattern, Options -- Takes String, Pattern, Match_Number, Occurence, Options These UDFs mimic Oracle's REGEXP_LIKE, REGEXP_SUBSTR and REGEXP_REPLACE functions. In general it is probably better to use somebody else' syntax rather than invent new one. |
Commented by: Dany Marmur (dany) I was told that SUBSTRING( <value> FROM <similar pattern> FOR <escape character>) is "SQL standard functionality". When considering UDFs - CSTRING is limited in size and with multi character sets it can be very limiting. LIKE and CONTAINING can even be used with BLOBS. This is one reason why build-in functions are preferrable. UDF needs to be passed the "locale" to. The built-in functions and predicates should handle multi-byte character sets properly from the start. A wrapper for regexp library could behave differently on different platforms. UDFs must be considered from an cross-platform perspective too. In PostgreSQL the built-in function behaves as SIMILAR TO while the wrapper for regexp library functions behaves differently (this is documented but it can be considered un-optimal from a standard compliance pow). Any other SQL standard like SUBSTRING_REGEX is of course fine too. |
Modified by: @dyemanovassignee: Adriano dos Santos Fernandes [ asfernandes ] |
Commented by: @asfernandes Implemented SUBSTRING(<value> SIMILAR <pattern> ESCAPE <escape>) as per the SQL standard. |
Modified by: @asfernandesstatus: Open [ 1 ] => Resolved [ 5 ] resolution: Fixed [ 1 ] Fix Version: 3.0 Alpha 1 [ 10331 ] |
Commented by: @hvlad Unfortunately, current implementation doesn't conforms to SQL standard. =============== SQL standard =============== 6.29 <string value function> General Rules 6) If <regular expression substring function> is specified, then:
=============== SQL standard =============== SELECT SUBSTRING('asds 12.34 asd' SIMILAR '% \"[\+\-]?[0-9]*([0-9].|.[0-9])?[0-9]*\" %' ESCAPE '\') returns '.34'. It is abviously not correct as in this case we have R1 = '% ' E = '\' and S1 = 'asds 12.' and S1 || S2 || S3 similar to R1 || R2 || R3 escape E == true Correct answer should be S1 = 'asds ' I think this ticket should be reopened (no sence to create new ticket with bug description as current implementation was never released). |
Modified by: @asfernandes |
Commented by: @pavel-zotov Release notes specify ability to apply 'NOT' in this statement: === page 72 ===
|
Commented by: @asfernandes Pavel, release notes is wrong. That's not present in README.substring_similar.txt; |
Modified by: @pavel-zotovstatus: Resolved [ 5 ] => Resolved [ 5 ] Test Details: Some expressions from test relates to: CORE2389, CORE2756, CORE2780, CORE3523, CORE3754, CORE769 => Some expressions from test relates to: CORE2389, CORE2756, CORE2780, CORE3523, CORE3754, CORE769 |
Submitted by: Dany Marmur (dany)
Jira_subtask_outward CORE3147
Is duplicated by CORE3584
Is related to QA632
Votes: 2
The build-in function SUBSTRING should be extended to handle matching patterns.
The current implementation of the SIMILAR TO predicate only allows for selection and validation based on pattern matching (regular expressions).
String manipulation based on pattern matching is impossible without extending the SUBSTRING function.
writing
SUBSTRING ( <value> FROM <similar pattern> FOR <escape character>)
the function should return NULL for no match or the matching substring for a match.
Further control is achieved with an additional character combination specified in <similar pattern> to bracket the portion of the match that should be returned.
Commits: 5aac863 4f9b916 FirebirdSQL/fbt-repository@57ef2ba
====== Test Details ======
Some expressions from test relates to: CORE2389, CORE2756, CORE2780, CORE3523, CORE3754, CORE769
See also: CORE4740
The text was updated successfully, but these errors were encountered: