Database shutdown can cause server crash if multiple attachments run EXECUTE STATEMENT [CORE5087] #5372
Comments
Modified by: @pavel-zotov Attachment: shutdown-active-db-batch.zip [ 12887 ] Attachment: shutdown-active-db-crash-stacktraces.7z [ 12888 ]
Commented by: @pavel-zotov One more attached file, "shutdown-active-db-crash-stacktrace-previous.7z", is the stack trace that I received originally, when this test used Python attachments instead of ISQL.
Modified by: @pavel-zotov Attachment: shutdown-active-db-crash-stacktrace-previous.7z [ 12889 ]
Modified by: @dyemanov summary: Database shutdown can cause server crash if multiple active attachments with DML exist => Database shutdown can cause server crash if multiple attachments run EXECUTE STATEMENT
Modified by: @hvlad assignee: Vlad Khorsun [ hvlad ]
Commented by: @hvlad Fix is committed, please confirm
Commented by: @pavel-zotov > Fix is committed, please confirm
It's OK now (fingers crossed). Ran the test again; no crashes during ~2 hours.
Modified by: @hvlad status: Open [ 1 ] => Resolved [ 5 ] resolution: Fixed [ 1 ] Fix Version: 3.0 RC2 [ 10048 ] Fix Version: 2.5.6 [ 10721 ]
Commented by: @pavel-zotov The fix for 2.5 seems to be incomplete or has no effect: I still get a crash. 1) A crash window appears on the screen (and one needs to press its OK button twice to close it).
Modified by: @pavel-zotov status: Resolved [ 5 ] => Resolved [ 5 ] QA Status: No test => Done with caveats Test Details: Done only for 3.0.
Commented by: @hvlad Additional fix for v2.5 is committed.
Commented by: @pavel-zotov > Additional fix for v2.5 is committed.
Unfortunately, I have more issues. If I launch, say, 20 sessions and after a small delay (~10-20 seconds) try to move the database to shutdown, then _some_ ISQL sessions are closed but NOT ALL!
Today I repeated this with a build of FB 2.5.6 on Linux (it runs as SuperClassic, both on Win and Nix), connecting to it from Windows.
First I launched 60 sessions with delay = 10 seconds. After this delay the shutdown command was issued and ~45 ISQL sessions were closed instantly, but ~15 ISQL sessions remained open and did not put any messages in their logs (i.e. they seem to be "active"). I was able to launch a shell script that takes stack traces of fb_smp_server at 10 s intervals when 4 ISQL windows remained: see the attached file, subfolder fb25-shutdown_04-isqls-hangs-of-total-60-launched
Then I repeated this with 10 ISQL sessions and a delay of 10 seconds, but started taking stack traces just before the shutdown command was issued.
Also, one may use the updated version of the batch for 2.5; see files: shut-active-run_25.bat, shut-active-run_25.sql and shut-active-ddl_25.sql.
PS. No such trouble on 3.0 (in any arch.).
Modified by: @pavel-zotov Attachment: fb25shutdown-extremely-slow-control-return-from-some-of-launched-isqls.zip [ 12897 ]
Commented by: @hvlad Pavel, is the original issue fixed or not?
Commented by: @pavel-zotov Yes. No crash.
Commented by: @hvlad > So, should I create a new ticket (for 2.5)?
Commented by: @pavel-zotov
Submitted by: @pavel-zotov
Attachments:
shutdown-active-db-batch.zip
shutdown-active-db-crash-stacktraces.7z
shutdown-active-db-crash-stacktrace-previous.7z
fb25shutdown-extremely-slow-control-return-from-some-of-launched-isqls.zip
Scenario (after creating a new database with default parameters):
1) recreate the following DB objects:
1.1) table 'test' with an indexed field of type varchar(N), N = 500;
1.2) table 'log4attach' for accumulating info about every attachment that occurs;
1.3) DB-level trigger on the CONNECT event that adds a record into 'log4attach'.
2) launch multiple ISQL sessions, giving each a .sql script that adds rows into the 'test' table, doing so in an autonomous RC transaction via ES:
(where 'n_limit' is some big value, enough for this job to last more than a few days without interruption :-))
3) let the ISQL sessions do their job for about 30-60 seconds; ENSURE that every ISQL window writes its STDOUT and STDERR to separate files.
4) issue a command that moves the database to SHUTDOWN state (either "FBSVCMGR action_properties dbname ... prp_shutdown_mode prp_sm_full prp_force_shutdown 0" or "GFIX -shut full -force 0"). All recent FB versions ensure that this command runs in synchronous mode, i.e. it will NOT return control until all database activity has really terminated.
5) return the database to ONLINE.
6) CHECK that none of the files created by the ISQL sessions for storing STDERR messages contain the text "SQLSTATE = 08004" (connection rejected by remote interface). Optionally: if at least one of the files contains this string, the test can be stopped.
7) repeat steps 1 ... 6.
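For anyone who wants to automate the check in step 6 outside of the batch, here is a minimal Python sketch. It is not taken from the attached .zip: the function name, the "*.err" file pattern and the idea of one STDERR file per ISQL session in a single folder are assumptions made for illustration only.

```python
from pathlib import Path

# Text that step 6 looks for in the STDERR logs of the ISQL sessions.
MARKER = "SQLSTATE = 08004"


def logs_with_marker(log_dir, pattern="*.err", marker=MARKER):
    """Return the sorted names of log files in log_dir that contain marker.

    The "*.err" pattern is a hypothetical naming scheme; adjust it to
    whatever names the batch actually gives to the STDERR files.
    """
    hits = []
    for path in sorted(Path(log_dir).glob(pattern)):
        # errors="replace" keeps the scan going if a log has odd bytes in it.
        text = path.read_text(errors="replace")
        if marker in text:
            hits.append(path.name)
    return hits
```

A non-empty result means at least one session logged the marker, so the loop over steps 1 ... 6 can be stopped at that point.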
Test (batch + .sql) is in attached .zip.
The batch accepts two input arguments:
1) arg_1 = number of ISQL sessions to launch; each runs a loop with ES (INSERT statements into the table with an indexed field of type varchar(N), N = 500)
and
2) arg_2 = time, in seconds, that we allow them to work.
The default values of these arguments (40 and 10) may not be enough for some environments.
On a Linux host with 12 CPUs, 32 GB RAM and a powerful I/O subsystem, I could reproduce the crash with arg_1 = 90 and arg_2 = 35.
After this batch had run for ~3 hours I had 58 crashes (they are attached in another .7z file).
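A rough Python sketch of how the two arguments and the shutdown/online commands could be handled if the batch were ported. This is not the attached batch: the function names are hypothetical, the defaults 40 and 10 are the ones mentioned above, the shutdown switches are the "GFIX -shut full -force 0" form from step 4, and credentials are assumed to come from the environment (ISC_USER / ISC_PASSWORD). The sketch only builds argument lists and does not run anything.

```python
def parse_args(argv, default_sessions=40, default_seconds=10):
    """Return (sessions, seconds) from the batch-style positional arguments.

    argv is the list of arguments after the script name; missing values
    fall back to the defaults mentioned in the ticket (40 and 10).
    """
    sessions = int(argv[0]) if len(argv) > 0 else default_sessions
    seconds = int(argv[1]) if len(argv) > 1 else default_seconds
    if sessions < 1 or seconds < 0:
        raise ValueError("need at least 1 session and a non-negative delay")
    return sessions, seconds


def gfix_shutdown(db):
    """Argument list for moving db to full shutdown (step 4)."""
    return ["gfix", "-shut", "full", "-force", "0", db]


def gfix_online(db):
    """Argument list for returning db to ONLINE (step 5)."""
    return ["gfix", "-online", db]
```

The returned lists can be passed to subprocess.run(cmd, check=True); because the shutdown command is synchronous, the call should return only after all database activity has terminated.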
Tested on: LI-V3.0.0.32294
Config:
Servermode = Super
RemoteServicePort = 3333
DefaultDbCachePages = 2048K
BugCheckAbort=1
AuthClient = Legacy_Auth,Srp,Win_Sspi
AuthServer = Legacy_Auth,Srp
UserManager = Legacy_UserManager
WireCrypt = Disabled
ExternalFileAccess = Restrict /var/db/fb30
FileSystemCacheThreshold = 65536K
LockHashSlots = 22111
MaxUserTraceLogSize = 99999
TempCacheLimit = 2147483647
TempDirectories = /tmp/firebird
Commits: 85e5b9b 416b61b 87d0271 3462ec3 9b6969b FirebirdSQL/fbt-repository@68e8092 FirebirdSQL/fbt-repository@da2261b FirebirdSQL/fbt-repository@e0faff4
====== Test Details ======
Done only for 3.0.
Specifying 2.5.6 in 'min_versions' is deferred: a crash was found on WI-V2.5.6.26969 (04-feb-2016).