Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Regression in FB 3.x and 4.x when PSQL code has recursive SP calls: crash instead of SQLSTATE = 54001 ("Too many concurrent executions...") [CORE6445] #6678

Open
firebird-automations opened this issue Nov 14, 2020 · 15 comments

Comments

@firebird-automations
Copy link
Collaborator

Submitted by: @pavel-zotov

Attachments:
firebird.20201114_233518.WI-V4_0_0_2265.stack_trace.7z
firebird.20201114_233443.WI-V4_0_0_2265.stack_trace.7z
firebird.20201114_233425.WI-V4_0_0_2265.stack_trace.7z
firebird.20201114_233343.WI-V4_0_0_2265.stack_trace.7z
firebird.20201115_001609.WI-V3_0_8_33390.stack_trace.7z

Run following code on 4.x

SET BAIL ON;
shell del c:\temp\tmp4test.fdb 2>nul;
create database 'localhost:c:\temp\tmp4test.fdb' user 'SYSDBA' password 'masterkey';
commit;

set term ^;
create or alter procedure sp_level_0 as
begin
end
^

create or alter procedure sp_level_2 as
begin
execute procedure sp_level_0;
end
^

create or alter procedure sp_level_1 as
begin
execute procedure sp_level_2;
end
^

alter procedure sp_level_0 as
begin
execute procedure sp_level_1;
end
^

set term ;^
commit;

connect 'localhost:c:\temp\tmp4test.fdb' user 'SYSDBA' password 'masterkey';
set echo on;
execute procedure sp_level_1;
rollback;

On 4.0.0.2265 out will be:

execute procedure sp_level_1;
Statement failed, SQLSTATE = 08006
Error reading data from the connection.

Expected (and got on FB 2.5.9.27152 and 3.0.8.33390):

execute procedure sp_level_1;
Statement failed, SQLSTATE = 54001
Too many concurrent executions of the same request
-At procedure 'SP_LEVEL_0' line: 3, col: 5
At procedure 'SP_LEVEL_2' line: 3, col: 5
At procedure 'SP_LEVEL_1' line: 3, col: 5
At procedure 'SP_LEVEL_0' line: 3, col: 5
At procedure 'SP_LEVEL_2' line: 3, col: 5
At procedure 'SP_LEVEL_1' line: 3, col: 5
At procedure 'SP_LEVEL_0' line: 3, col: 5
At procedure 'SP_LEVEL_2' line: 3, col: 5
At procedure 'SP_LEVEL_1' line: 3, col: 5
At procedure 'SP_LEVEL_0' line: 3, col: 5
At procedure 'SP_LEVEL_2' line: 3, col: 5
At procedure 'SP_LEVEL_1' line: 3, col: 5
At procedure 'SP_LEVEL_0' line: 3, col: 5
At procedure 'SP_LEVEL_2' line: 3, col: 5
At procedure 'SP_LEVEL_1' line: 3, col: 5
At procedure 'SP_LEVEL_0' line: 3, col: 5
At procedure 'SP_LEVEL_2' line: 3, col: 5
At procedure 'SP_LEVEL_1' line: 3, col: 5
At procedure 'SP_LEVEL_0' line: 3, col: 5
At procedure 'SP_LEVEL_2' line: 3, col: 5
At procedure 'SP_LEVEL_1' line: 3, col: 5
At procedure 'SP_LEVEL_0' line: 3, col: 5
At procedure 'SP_LEVEL_2' line: 3, col: 5
At procedure 'SP_LEVEL_1' line: 3, col: 5
At procedur...

Stack traces are in attached .7z files.

PS.

And one more issue.
Every odd (or even ?) run of this code leads to crash NOT instantly but after ~30 seconds. This occurs regardless whether i restart FB service before it or no.
This is log of console when output was provided with timestamps by mtee.exe utility:

run-1:

23:33:43.324 execute procedure sp_level_1;
23:33:43.937 Statement failed, SQLSTATE = 08006
23:33:43.938 Error reading data from the connection.

run-2:

23:33:58.673 execute procedure sp_level_1;
23:34:25.133 Statement failed, SQLSTATE = 08006
23:34:25.133 Error reading data from the connection.

run-3:

23:34:42.804 execute procedure sp_level_1;
23:34:43.629 Statement failed, SQLSTATE = 08006
23:34:43.629 Error reading data from the connection.

run-4:

23:34:52.307 execute procedure sp_level_1;
23:35:18.839 Statement failed, SQLSTATE = 08006
23:35:18.839 Error reading data from the connection.

PPS.

firebird.conf:

AuthClient = Srp, Srp256, ExtAuth, Win_Sspi, Legacy_Auth
AuthServer = Srp, Srp256, ExtAuth, Win_Sspi, Legacy_Auth
BugCheckAbort = 1
DefaultDbCachePages = 20000
ExtConnPoolLifeTime = 10
ExtConnPoolSize = 10
ExternalFileAccess = Full
FileSystemCacheThreshold = 65536K
IpcName = fb40_SS
KeyHolderPlugin = KeyHolder
LockHashSlots = 22111
LockMemSize = 5M
MaxUnflushedWrites = -1
MaxUnflushedWriteTime = -1
ReadConsistency = 0
RemoteServicePort = 3400
ServerMode = Super
TempBlockSize = 2M
TempCacheLimit = 2147483647
TempDirectories = d:\temp
UDFAccess = Restrict UDF
UserManager = Srp, Legacy_UserManager
WireCrypt = Enabled
WireCryptPlugin = ChaCha, Arc4

@firebird-automations
Copy link
Collaborator Author

Modified by: @pavel-zotov

Attachment: firebird.20201114_233518.WI-V4_0_0_2265.stack_trace.7z [ 13510 ]

Attachment: firebird.20201114_233443.WI-V4_0_0_2265.stack_trace.7z [ 13511 ]

Attachment: firebird.20201114_233425.WI-V4_0_0_2265.stack_trace.7z [ 13512 ]

@firebird-automations
Copy link
Collaborator Author

Modified by: @pavel-zotov

Attachment: firebird.20201114_233343.WI-V4_0_0_2265.stack_trace.7z [ 13513 ]

@firebird-automations
Copy link
Collaborator Author

Modified by: @pavel-zotov

Attachment: firebird.20201114_233443.WI-V4_0_0_2265.stack_trace.7z [ 13511 ] =>

@firebird-automations
Copy link
Collaborator Author

Modified by: @pavel-zotov

Attachment: firebird.20201114_233443.WI-V4_0_0_2265.stack_trace.7z [ 13514 ]

@firebird-automations
Copy link
Collaborator Author

Modified by: @pavel-zotov

Attachment: firebird.20201114_233443.WI-V4_0_0_2265.stack_trace.7z [ 13514 ] =>

@firebird-automations
Copy link
Collaborator Author

Modified by: @pavel-zotov

Attachment: firebird.20201114_233443.WI-V4_0_0_2265.stack_trace.7z [ 13515 ]

@firebird-automations
Copy link
Collaborator Author

Modified by: @pavel-zotov

Version: 3.0.7 [ 10940 ]

Version: 4.0 Beta 2 [ 10888 ]

description: Run following code:

SET BAIL ON;
shell del c:\temp\tmp4test.fdb 2>nul;
create database 'localhost:c:\temp\tmp4test.fdb' user 'SYSDBA' password 'masterkey';
commit;

set term ^;
create or alter procedure sp_level_0 as
begin
end
^

create or alter procedure sp_level_2 as
begin
execute procedure sp_level_0;
end
^

create or alter procedure sp_level_1 as
begin
execute procedure sp_level_2;
end
^

alter procedure sp_level_0 as
begin
execute procedure sp_level_1;
end
^

set term ;^
commit;

connect 'localhost:c:\temp\tmp4test.fdb' user 'SYSDBA' password 'masterkey';
set echo on;
execute procedure sp_level_1;
rollback;

On 4.0.0.2265 out will be:

execute procedure sp_level_1;
Statement failed, SQLSTATE = 08006
Error reading data from the connection.

Expected (and got on FB 2.5.9.27152 and 3.0.8.33390):

execute procedure sp_level_1;
Statement failed, SQLSTATE = 54001
Too many concurrent executions of the same request
-At procedure 'SP_LEVEL_0' line: 3, col: 5
At procedure 'SP_LEVEL_2' line: 3, col: 5
At procedure 'SP_LEVEL_1' line: 3, col: 5
At procedure 'SP_LEVEL_0' line: 3, col: 5
At procedure 'SP_LEVEL_2' line: 3, col: 5
At procedure 'SP_LEVEL_1' line: 3, col: 5
At procedure 'SP_LEVEL_0' line: 3, col: 5
At procedure 'SP_LEVEL_2' line: 3, col: 5
At procedure 'SP_LEVEL_1' line: 3, col: 5
At procedure 'SP_LEVEL_0' line: 3, col: 5
At procedure 'SP_LEVEL_2' line: 3, col: 5
At procedure 'SP_LEVEL_1' line: 3, col: 5
At procedure 'SP_LEVEL_0' line: 3, col: 5
At procedure 'SP_LEVEL_2' line: 3, col: 5
At procedure 'SP_LEVEL_1' line: 3, col: 5
At procedure 'SP_LEVEL_0' line: 3, col: 5
At procedure 'SP_LEVEL_2' line: 3, col: 5
At procedure 'SP_LEVEL_1' line: 3, col: 5
At procedure 'SP_LEVEL_0' line: 3, col: 5
At procedure 'SP_LEVEL_2' line: 3, col: 5
At procedure 'SP_LEVEL_1' line: 3, col: 5
At procedure 'SP_LEVEL_0' line: 3, col: 5
At procedure 'SP_LEVEL_2' line: 3, col: 5
At procedure 'SP_LEVEL_1' line: 3, col: 5
At procedur...

Stack traces are in attached .7z files.

PS.

And one more issue.
Every odd (or even ?) run of this code leads to crash NOT instantly but after ~30 seconds. This occurs regardless whether i restart FB service before it or no.
This is log of console when output was provided with timestamps by mtee.exe utility:

run-1:

23:33:43.324 execute procedure sp_level_1;
23:33:43.937 Statement failed, SQLSTATE = 08006
23:33:43.938 Error reading data from the connection.

run-2:

23:33:58.673 execute procedure sp_level_1;
23:34:25.133 Statement failed, SQLSTATE = 08006
23:34:25.133 Error reading data from the connection.

run-3:

23:34:42.804 execute procedure sp_level_1;
23:34:43.629 Statement failed, SQLSTATE = 08006
23:34:43.629 Error reading data from the connection.

run-4:

23:34:52.307 execute procedure sp_level_1;
23:35:18.839 Statement failed, SQLSTATE = 08006
23:35:18.839 Error reading data from the connection.

PPS.

firebird.conf:

AuthClient = Srp, Srp256, ExtAuth, Win_Sspi, Legacy_Auth
AuthServer = Srp, Srp256, ExtAuth, Win_Sspi, Legacy_Auth
BugCheckAbort = 1
DefaultDbCachePages = 20000
ExtConnPoolLifeTime = 10
ExtConnPoolSize = 10
ExternalFileAccess = Full
FileSystemCacheThreshold = 65536K
IpcName = fb40_SS
KeyHolderPlugin = KeyHolder
LockHashSlots = 22111
LockMemSize = 5M
MaxUnflushedWrites = -1
MaxUnflushedWriteTime = -1
ReadConsistency = 0
RemoteServicePort = 3400
ServerMode = Super
TempBlockSize = 2M
TempCacheLimit = 2147483647
TempDirectories = d:\temp
UDFAccess = Restrict UDF
UserManager = Srp, Legacy_UserManager
WireCrypt = Enabled
WireCryptPlugin = ChaCha, Arc4

=>

Run following code on 4.x

SET BAIL ON;
shell del c:\temp\tmp4test.fdb 2>nul;
create database 'localhost:c:\temp\tmp4test.fdb' user 'SYSDBA' password 'masterkey';
commit;

set term ^;
create or alter procedure sp_level_0 as
begin
end
^

create or alter procedure sp_level_2 as
begin
execute procedure sp_level_0;
end
^

create or alter procedure sp_level_1 as
begin
execute procedure sp_level_2;
end
^

alter procedure sp_level_0 as
begin
execute procedure sp_level_1;
end
^

set term ;^
commit;

connect 'localhost:c:\temp\tmp4test.fdb' user 'SYSDBA' password 'masterkey';
set echo on;
execute procedure sp_level_1;
rollback;

On 4.0.0.2265 out will be:

execute procedure sp_level_1;
Statement failed, SQLSTATE = 08006
Error reading data from the connection.

Expected (and got on FB 2.5.9.27152 and 3.0.8.33390):

execute procedure sp_level_1;
Statement failed, SQLSTATE = 54001
Too many concurrent executions of the same request
-At procedure 'SP_LEVEL_0' line: 3, col: 5
At procedure 'SP_LEVEL_2' line: 3, col: 5
At procedure 'SP_LEVEL_1' line: 3, col: 5
At procedure 'SP_LEVEL_0' line: 3, col: 5
At procedure 'SP_LEVEL_2' line: 3, col: 5
At procedure 'SP_LEVEL_1' line: 3, col: 5
At procedure 'SP_LEVEL_0' line: 3, col: 5
At procedure 'SP_LEVEL_2' line: 3, col: 5
At procedure 'SP_LEVEL_1' line: 3, col: 5
At procedure 'SP_LEVEL_0' line: 3, col: 5
At procedure 'SP_LEVEL_2' line: 3, col: 5
At procedure 'SP_LEVEL_1' line: 3, col: 5
At procedure 'SP_LEVEL_0' line: 3, col: 5
At procedure 'SP_LEVEL_2' line: 3, col: 5
At procedure 'SP_LEVEL_1' line: 3, col: 5
At procedure 'SP_LEVEL_0' line: 3, col: 5
At procedure 'SP_LEVEL_2' line: 3, col: 5
At procedure 'SP_LEVEL_1' line: 3, col: 5
At procedure 'SP_LEVEL_0' line: 3, col: 5
At procedure 'SP_LEVEL_2' line: 3, col: 5
At procedure 'SP_LEVEL_1' line: 3, col: 5
At procedure 'SP_LEVEL_0' line: 3, col: 5
At procedure 'SP_LEVEL_2' line: 3, col: 5
At procedure 'SP_LEVEL_1' line: 3, col: 5
At procedur...

Stack traces are in attached .7z files.

PS.

And one more issue.
Every odd (or even ?) run of this code leads to crash NOT instantly but after ~30 seconds. This occurs regardless whether i restart FB service before it or no.
This is log of console when output was provided with timestamps by mtee.exe utility:

run-1:

23:33:43.324 execute procedure sp_level_1;
23:33:43.937 Statement failed, SQLSTATE = 08006
23:33:43.938 Error reading data from the connection.

run-2:

23:33:58.673 execute procedure sp_level_1;
23:34:25.133 Statement failed, SQLSTATE = 08006
23:34:25.133 Error reading data from the connection.

run-3:

23:34:42.804 execute procedure sp_level_1;
23:34:43.629 Statement failed, SQLSTATE = 08006
23:34:43.629 Error reading data from the connection.

run-4:

23:34:52.307 execute procedure sp_level_1;
23:35:18.839 Statement failed, SQLSTATE = 08006
23:35:18.839 Error reading data from the connection.

PPS.

firebird.conf:

AuthClient = Srp, Srp256, ExtAuth, Win_Sspi, Legacy_Auth
AuthServer = Srp, Srp256, ExtAuth, Win_Sspi, Legacy_Auth
BugCheckAbort = 1
DefaultDbCachePages = 20000
ExtConnPoolLifeTime = 10
ExtConnPoolSize = 10
ExternalFileAccess = Full
FileSystemCacheThreshold = 65536K
IpcName = fb40_SS
KeyHolderPlugin = KeyHolder
LockHashSlots = 22111
LockMemSize = 5M
MaxUnflushedWrites = -1
MaxUnflushedWriteTime = -1
ReadConsistency = 0
RemoteServicePort = 3400
ServerMode = Super
TempBlockSize = 2M
TempCacheLimit = 2147483647
TempDirectories = d:\temp
UDFAccess = Restrict UDF
UserManager = Srp, Legacy_UserManager
WireCrypt = Enabled
WireCryptPlugin = ChaCha, Arc4

summary: Regression in FB 4.x when PSQL code has recursive SP calls: crash instead of SQLSTATE = 54001 ("Too many concurrent executions...") => Regression in FB 3.x and 4.x when PSQL code has recursive SP calls: crash instead of SQLSTATE = 54001 ("Too many concurrent executions...")

@firebird-automations
Copy link
Collaborator Author

Commented by: @pavel-zotov

3.x also affected, but code must contain more procedures for "loop": six instead of three.
This code leads 3.0.8.33390 to crash:

SET BAIL ON;
shell del c:\temp\tmp4test.fdb 2>nul;
create database 'localhost:c:\temp\tmp4test.fdb' user 'SYSDBA' password 'masterkey';
commit;

set term ^;

create or alter procedure sp_level_0 as
begin
end
^

create or alter procedure sp_level_5 as
begin
execute procedure sp_level_0;
end
^
create or alter procedure sp_level_4 as
begin
execute procedure sp_level_5;
end
^
create or alter procedure sp_level_3 as
begin
execute procedure sp_level_4;
end
^

create or alter procedure sp_level_2 as
begin
execute procedure sp_level_3;
end
^
create or alter procedure sp_level_1 as
begin
execute procedure sp_level_2;
end
^

alter procedure sp_level_0 as
begin
execute procedure sp_level_1;
end
^

set term ;^
commit;

connect 'localhost:c:\temp\tmp4test.fdb' user 'SYSDBA' password 'masterkey';

set echo on;

execute procedure sp_level_0;

@firebird-automations
Copy link
Collaborator Author

Modified by: @pavel-zotov

Attachment: firebird.20201115_001609.WI-V3_0_8_33390.stack_trace.7z [ 13516 ]

@firebird-automations
Copy link
Collaborator Author

Commented by: @AlexPeshkoff

Works for me:

#⁠ ./isql employee
Database: employee, User: SYSDBA
SQL> set term ^;
SQL> create or alter procedure recurs(x int) as begin x = x - 1; if (x > 0) then execute procedure recurs(x); end^
SQL> commit^
SQL> execute procedure recurs(10)^
SQL> execute procedure recurs(100)^
SQL> execute procedure recurs(1000)^
SQL> execute procedure recurs(2000)^
Statement failed, SQLSTATE = 54001
Too many concurrent executions of the same request
-At procedure 'RECURS' line: 1, col: 77
At procedure 'RECURS' line: 1, col: 77
At procedure 'RECURS' line: 1, col: 77
At procedure 'RECURS' line: 1, col: 77
At procedure 'RECURS' line: 1, col: 77
At procedure 'RECURS' line: 1, col: 77
At procedure 'RECURS' line: 1, col: 77
At procedure 'RECURS' line: 1, col: 77
At procedure 'RECURS' line: 1, col: 77
At procedure 'RECURS' line: 1, col: 77
At procedure 'RECURS' line: 1, col: 77
At procedure 'RECURS' line: 1, col: 77
At procedure 'RECURS' line: 1, col: 77
At procedure 'RECURS' line: 1, col: 77
At procedure 'RECURS' line: 1, col: 77
At procedure 'RECURS' line: 1, col: 77
At procedure 'RECURS' line: 1, col: 77
At procedure 'RECURS' line: 1, col: 77
At procedure 'RECURS' line: 1, col: 77
At procedure 'RECURS' line: 1, col: 77
At procedure 'RECURS' line: 1, col: 77
At procedure 'RECURS' line: 1, col: 77
At procedure 'RECURS' line: 1, col: 77
At procedure 'RECURS' line: 1, col: 77
At procedure 'RECURS' line: 1, col: 77
At procedure 'RECURS' line: 1, col: 77
At pr...
SQL>

@firebird-automations
Copy link
Collaborator Author

Commented by: @dyemanov

I believe we have a stack overflow and Windows builds just have a smaller stack size per thread. And I don't think this is a bug per se, the concurrent recursion limit is enforced for a trivial recursion (SP calls itself). With more additional layers added we increase the stack requirements and thus the stack overflow may occur before we reach the recursion limit. I don't see what we could do about that.

@firebird-automations
Copy link
Collaborator Author

Commented by: @AlexPeshkoff

Explicitly check remaining stack size when new looper is started?

@firebird-automations
Copy link
Collaborator Author

Commented by: @asfernandes

That is half of the problem. Expressions could also cquse the problem.

Some expressions qre called with EVL_expr but many is called with execute directly.

It seems for me that for this case a concrete execute method with internalExecute virtual methods would be necessary.

But not only execute can cause the problem. Compulation too and we currently check only IN with hardcoded limit.

@firebird-automations
Copy link
Collaborator Author

Commented by: @dyemanov

> Explicitly check remaining stack size when new looper is started?
Pointless without knowing the stack requirements of the looper node tree.

@firebird-automations
Copy link
Collaborator Author

Commented by: @AlexPeshkoff

At least better than current frames counting.
We may estimate when to stop recursion using different ways - for example when some percent of stack is already occupied.

@adriano - yes, certainly stack check to be performed not only in looper but in other critical points too.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

1 participant