Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Regression: line/column numbering may be twisted if alias.name syntax is used [CORE5183] #5464

Closed
firebird-automations opened this issue Apr 4, 2016 · 17 comments

Comments

@firebird-automations
Copy link
Collaborator

Submitted by: Boltik Evgeny (bolt)

Look at v2.5:

execute block
returns (id int)
as
begin
select y
from rdb$database x where z = 0
into id;
suspend;
end^

Dynamic SQL Error
-SQL error code = -206
-Column unknown
-Z
-At line 6, column 31

execute block
returns (id int)
as
begin
select x.y
from rdb$database x where z = 0
into id;
suspend;
end^

Dynamic SQL Error
-SQL error code = -206
-Column unknown
-Z
-At line 6, column 31

As expected, both error messages are the same and point to the error place. Now look at v3.0:

execute block
returns (id int)
as
begin
select y
from rdb$database x where z = 0
into id;
suspend;
end^

Dynamic SQL Error
-SQL error code = -206
-Column unknown
-Z
-At line 6, column 31

execute block
returns (id int)
as
begin
select x.y
from rdb$database x where z = 0
into id;
suspend;
end^

Dynamic SQL Error
-SQL error code = -206
-Column unknown
-Z
-At line 9, column 1

The first message is OK but the second one has incorrect line/column position of the faulty node. The only difference is Y vs X.Y in the select list, completely unrelated to the error place.

Looks like the parser somehow goes crazy after detecting the dot character. Quick debugging shows that FieldNode is created with correct lex.* pointers the first time but the second time lex.* points to the end of the whole statement.

Commits: 984f84a d3e5237 8e81bd6

@firebird-automations
Copy link
Collaborator Author

Commented by: @dyemanov

The problem is that the existing position calculation works fine only for linear parsing. As soon as BTYACC founds a conflict, it tries the following nodes thus invalidating the current position of the lexer, so it cannot be used after backtracing has returned to the original node. And syntax A.B in the select list now seem to introduce one more conflicting state (due to packages support, I suspect), hence the regression.

@firebird-automations
Copy link
Collaborator Author

Modified by: @dyemanov

assignee: Dmitry Yemanov [ dimitr ]

@firebird-automations
Copy link
Collaborator Author

Commented by: @asfernandes

That caused regressions in TCS:

FB_SQL_SUBPROC_1 global *FAIL*
FB_SQL_SUBFUNC_1 global *FAIL*
FB_SQL_CUR_FIELD global *FAIL*

FB_SQL_SUBPROC_1.diff:

146,147 c 146,147
< -At sub procedure 'súb1' line: 6, col: 3
< At procedure 'P2' line: 9, col: 2
---------------------------------------------
> -At sub procedure 'súb1' line: 5, col: 2
> At procedure 'P2' line: 8, col: 1
++++++++++++++++++++++++++++++++++++++++++++++++++
326,327 c 326,327
< -At sub procedure 'súb1' line: 6, col: 3
< At procedure 'P2' line: 9, col: 2
---------------------------------------------
> -At sub procedure 'súb1' line: 5, col: 2
> At procedure 'P2' line: 8, col: 1
++++++++++++++++++++++++++++++++++++++++++++++++++

Simple test case:

------------
execute block returns (o1 integer)
as
declare procedure sub (i1 integer) returns (o1 integer not null)
as
begin
o1 = i1;
end
begin
execute procedure sub (null) returning_values o1;
end!
------------

The problem should be in yyReducePosn. However, it's a not intuitive function to change.

@firebird-automations
Copy link
Collaborator Author

Commented by: @dyemanov

I've already found regressions myself, working on this ticket again now.

@firebird-automations
Copy link
Collaborator Author

Commented by: @dyemanov

The major problem was in the *_bk variables, I used them to properly position at X in "X.Y", but they are not updated for line breaks and in many other cases. Now I have the code that does not regress and has CORE5183 resolved too, but it has one caveat - error points to the dot in "X.Y". I think this is acceptable but I'm afraid of other possible issues that may be worse. Do you want me to commit or provide you with the patch (or pull request) for testing?

@firebird-automations
Copy link
Collaborator Author

Commented by: @asfernandes

The bug can be fixed easily with this:

-------------

diff --git a/src/dsql/parse.y b/src/dsql/parse.y
index f6766fe..7110af0 100644
--- a/src/dsql/parse.y
+++ b/src/dsql/parse.y
@@ -5956,6 +5956,8 @@ column_name
FieldNode* fieldNode = newNode<FieldNode>();
fieldNode->dsqlQualifier = *$1;
fieldNode->dsqlName = *$3;
+ fieldNode->line = YYPOSNARG(1).firstLine;
+ fieldNode->column = YYPOSNARG(1).firstColumn;
$$ = fieldNode;
}
| ':' symbol_table_alias_name '.' symbol_column_name
@@ -5974,6 +5976,8 @@ simple_column_name
{
FieldNode* fieldNode = newNode<FieldNode>();
fieldNode->dsqlName = *$1;
+ fieldNode->line = YYPOSNARG(1).firstLine;
+ fieldNode->column = YYPOSNARG(1).firstColumn;
$$ = fieldNode;
}
;

-------------

But it does not fix the general problem.

It seems btyacc has a conceptual problem or we use it incorrectly.

In parse.cpp it calls YYCALLREDUCEPOSN which calls yyReducePosn but that happens after the returning node was created.

setupNode gets the positions from the lexer, but it's called from {} actions which runs after backtracking, so the lexer state is wrong.

@firebird-automations
Copy link
Collaborator Author

Commented by: @asfernandes

Please, let me see your patch. But at first I don't like the error in the dot.

@firebird-automations
Copy link
Collaborator Author

Commented by: @dyemanov

This simple fix was obvious, but this is not the only reduce/reduce conflict we have. I don't want to pollute the grammar with such tricks.

We used btyacc incorrectly, I believe. Lexer state has a linear history and it should not be used after backtracing (but we did). In turn, yyposn is a correct way to get the current location, as it gets stacked and restored during backtracing. This was the core idea of my fix - setupNode() should use yyposn. But I've found that the current calculation of firstColumn is also incorrect, it may point *after* the current node.

@firebird-automations
Copy link
Collaborator Author

Commented by: @dyemanov

This is what I have now:

diff --git a/src/dsql/Parser.cpp b/src/dsql/Parser.cpp
index 84bebd2..efe452c 100644
--- a/src/dsql/Parser.cpp
+++ b/src/dsql/Parser.cpp
@@ -107,6 +107,7 @@
lex.att_charset = characterSet;
lex.line_start_bk = lex.line_start;
lex.lines_bk = lex.lines;
+ lex.last_token = NULL;
lex.param_number = 1;
lex.prev_keyword = -1;
#⁠ifdef DSQL_DEBUG
@@ -280,8 +281,9 @@
if (!yylexSkipSpaces())
return -1;

- yyposn.firstLine = lex.lines_bk;
- yyposn.firstColumn = lex.last_token_bk - lex.line_start_bk + 1;
+ yyposn.firstLine = lex.lines;
+ yyposn.firstColumn = (lex.last_token && lex.last_token >= lex.line_start) ?
+ lex.last_token - lex.line_start + 1 : lex.ptr - lex.line_start;
yyposn.firstPos = lex.ptr - 1;

The trickery with "last token" allows to get proper position for simple cases and semi-proper position (pointing at dot) for X.Y cases. But I'm still searching for any better solution.

@firebird-automations
Copy link
Collaborator Author

Commented by: @asfernandes

I believe I found the missing thing you are searching for. This should resolve:

diff --git a/src/dsql/btyacc_fb.ske b/src/dsql/btyacc_fb.ske
index 5047cac..0426f05 100644
--- a/src/dsql/btyacc_fb.ske
+++ b/src/dsql/btyacc_fb.ske
@@ -552,6 +552,8 @@ yyreduce:

reduce_posn = TRUE;

+ yyposn = YYPOSNARG(1);
+
switch (yyn) {

%% trailer

@firebird-automations
Copy link
Collaborator Author

Commented by: @asfernandes

Well, this my change caused problems which things that has nothing to do with line numbers. Cannot see why now.

@firebird-automations
Copy link
Collaborator Author

Commented by: @asfernandes

That one does not cause the errors.

It's not elegant, but the way I see, where we have duplicate variables (the ones in btyacc positions) and our ones node->line/column, I see something like this as necessary.

Instead of mess with yyposn, we can also create additional variables to be used in setupNode.

diff --git a/src/dsql/btyacc_fb.ske b/src/dsql/btyacc_fb.ske
index 5047cac..7748691 100644
--- a/src/dsql/btyacc_fb.ske
+++ b/src/dsql/btyacc_fb.ske
@@ -552,6 +552,9 @@ yyreduce:

reduce_posn = TRUE;

+ yyposn.firstLine = YYPOSNARG(1).firstLine;
+ yyposn.firstColumn = YYPOSNARG(1).firstColumn;
+
switch (yyn) {

%% trailer

@firebird-automations
Copy link
Collaborator Author

Modified by: @dyemanov

reporter: Dmitry Yemanov [ dimitr ] => Boltik Evgeny [ bolt ]

@firebird-automations
Copy link
Collaborator Author

Modified by: @dyemanov

status: Open [ 1 ] => Resolved [ 5 ]

resolution: Fixed [ 1 ]

Fix Version: 4.0 Alpha 1 [ 10731 ]

@firebird-automations
Copy link
Collaborator Author

Modified by: @dyemanov

Fix Version: 3.0.1 [ 10730 ]

@firebird-automations
Copy link
Collaborator Author

Modified by: @pavel-zotov

status: Resolved [ 5 ] => Resolved [ 5 ]

QA Status: No test => Done successfully

@firebird-automations
Copy link
Collaborator Author

Modified by: @pavel-zotov

status: Resolved [ 5 ] => Closed [ 6 ]

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment