New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
SIMILAR TO does not return result when invalid pattern is used (with two adjacent special characters that should be escaped but aren't) [CORE5931] #6188
Comments
Modified by: @pavel-zotovAttachment: difference-on-patterns-that-leads-to-POOR-SIMILAR-TO.png [ 13304 ] Attachment: firebird_1820.20181001_012003.stack_trc.txt.7z [ 13305 ] Attachment: firebird_1820.20181001_012034.stack_trc.txt.7z [ 13306 ] |
Commented by: @pavel-zotov PPS. Links for two full memory dumps (relates to attached stack traces for WI-V3.0.4.33054): https://drive.google.com/open?id=1zGWHp0Eakl1pL4SlULqOQk6UvZyhsSV6 And this is link to FB build WI-V3.0.4.33054 with corresponding PDB files: https://drive.google.com/open?id=1vI-hOn9-OzB33xkfJuKoc49G6zIFvrmm |
Modified by: @pavel-zotovdescription: (yes, i know about tickets CORE3858, CORE4893 & CORE5664 that also relate to SIMILAR TO performance; perhaps this new sample can help to solve them so i decided to create separate ticket). Consider following command: set count on; It will issue result instantly:CONSTANT 1 Records affected: 1Now let change pattern by adding characters "|" and "_" (PIPE and UNDERSCORE) after 1st ALNUM, but with missed escaping (and this means that pattern will be INVALID): select 1 from rdb$database where This statement will HANGS on 2.5, 3.0 and 4.0. NB: This trouble with SIMILAR TO seems to be CRITICAL: no new attachment can be established while engine "does something inside itself" while working with parsing such statement! => (yes, i know about tickets CORE3858, CORE4893 & CORE5664 that also relate to SIMILAR TO performance; perhaps this new sample can help to solve them so i decided to create separate ticket). Consider following command: set count on; It will issue result instantly:CONSTANT 1 Records affected: 1Now let's change pattern by adding characters "|" and "_" (PIPE and UNDERSCORE) after 1st ALNUM, but with missed escaping (and this means that pattern will be INVALID): select 1 from rdb$database where This statement will HANGS on 2.5, 3.0 and 4.0. NB: This trouble with SIMILAR TO seems to be CRITICAL: no new attachment can be established while engine "does something inside itself" while working with parsing such statement! |
Commented by: @mrotteveel That is not an invalid pattern (or at least: it is a syntactically correct pattern). That this syntactically correct pattern then leads to a undecidable pattern (where the regex engine is probably stuck backtracking) is in itself an undecidable problem (in this case you probably can, but for more complex patterns, you can't). |
Commented by: @mrotteveel Specifically ([[:ALNUM:]]|_)+ Breaks down to: <regular expression> = <regular term> = <regular factor> = ([[:ALNUM:]]|_)+ (ignoring quantifier in next steps) |
Modified by: @asfernandesassignee: Adriano dos Santos Fernandes [ asfernandes ] |
Modified by: @asfernandesstatus: Open [ 1 ] => Resolved [ 5 ] resolution: Fixed [ 1 ] Fix Version: 4.0 Beta 2 [ 10888 ] |
Modified by: @pavel-zotovstatus: Resolved [ 5 ] => Resolved [ 5 ] QA Status: No test => Done successfully |
Modified by: @pavel-zotovstatus: Resolved [ 5 ] => Closed [ 6 ] |
Submitted by: @pavel-zotov
Attachments:
difference-on-patterns-that-leads-to-POOR-SIMILAR-TO.png
firebird_1820.20181001_012003.stack_trc.txt.7z
firebird_1820.20181001_012034.stack_trc.txt.7z
(yes, i know about tickets CORE3858, CORE4893 & CORE5664 that also relate to SIMILAR TO performance; perhaps this new sample can help to solve them so i decided to create separate ticket).
Consider following command:
set count on;
set list on;
select 1 from rdb$database where
'
group by
f01,
f02,
f03,
f04,
f05,
f06,
f07,
f08,
f09,
f10,
f11,
f12,
f13,
'
similar to
'%group[[:WHITESPACE:]]+by[[:WHITESPACE:]]+([[:ALNUM:]])+([[:WHITESPACE:]]*,[[:WHITESPACE:]]*[[:ALNUM:]]+){12,}[[:WHITESPACE:]]*%'
;
It will issue result instantly:
CONSTANT 1
Records affected: 1
Now let's change pattern by adding characters "|" and "_" (PIPE and UNDERSCORE) after 1st ALNUM, but with missed escaping (and this means that pattern will be INVALID):
select 1 from rdb$database where
'
group by
f01,
f02,
f03,
f04,
f05,
f06,
f07,
f08,
f09,
f10,
f11,
f12,
f13
'
similar to
'%group[[:WHITESPACE:]]+by[[:WHITESPACE:]]+([[:ALNUM:]]|_)+([[:WHITESPACE:]]*,[[:WHITESPACE:]]*[[:ALNUM:]]+){12,}[[:WHITESPACE:]]*%'
;
This statement will HANGS on 2.5, 3.0 and 4.0.
Difference between patterns - see in attached .pgn
Two stack traces (for WI-V3.0.4.33054 ) are also attached.
NB: This trouble with SIMILAR TO seems to be CRITICAL: no new attachment can be established while engine "does something inside itself" while working with parsing such statement!
Commits: 28e1874
The text was updated successfully, but these errors were encountered: