Superserver dies when client is disconnected abnormally during the index navigational scan [CORE4558] #4875

Closed
firebird-automations opened this issue Sep 25, 2014 · 7 comments

Comments

@firebird-automations

Submitted by: Sergey Maslikov (masleus)

Firebird 2.5.3 Superserver (2.5.2) dies when a client that uses an embedded SQL cursor is disconnected abnormally.

Superserver calls the function fb_utils::logAndDie when you terminate a client while it is fetching data from a cursor.
The server receives SIGHUP and dies.

I've tried to reproduce the same situation using the IBPP library, but I didn't succeed.

GDB BackTrace ====================

Thread 2 (Thread 0xaf185b70 (LWP 9378)):
#⁠0 0xb779e424 in __kernel_vsyscall ()
#⁠1 0xb73db781 in *__GI_raise (sig=6) at ../nptl/sysdeps/unix/sysv/linux/raise.c:64
#⁠2 0xb73debb2 in *__GI_abort () at abort.c:92
#⁠3 0x0838e817 in fb_utils::logAndDie (text=0xaf1825bc "Fatal lock manager error: invalid lock id (34336), errno: 22\n--Invalid argument") at ../src/common/utils.cpp:1027
#⁠4 0x0831095e in Jrd::LockManager::bug (this=0xb5c19400, status_vector=0x0, string=0xaf18461c "invalid lock id (34336)") at ../src/lock/lock.cpp:1704
#⁠5 0x0831113f in Jrd::LockManager::get_request (this=0xb5c19400, offset=34336) at ../src/lock/lock.cpp:2228
#⁠6 0x08313968 in Jrd::LockManager::dequeue (this=0xb5c19400, request_offset=34336) at ../src/lock/lock.cpp:754
#⁠7 0x081d62c2 in DEQUEUE (tdbb=0xaf184a8c, lock=0xb4bfb444) at ../src/jrd/lck.cpp:163
#⁠8 LCK_release (tdbb=0xaf184a8c, lock=0xb4bfb444) at ../src/jrd/lck.cpp:756
#⁠9 0x08126a24 in Jrd::BtrPageGCLock::enablePageGC (this=0xb4bfb444, tdbb=0xaf184a8c) at ../src/jrd/btr.cpp:246
#⁠10 0x0820f7da in RSE_close (tdbb=0xaf184a8c, rsb=<value optimized out>) at ../src/jrd/rse.cpp:192
#⁠11 0x0817fde5 in EXE_unwind (tdbb=0xaf184a8c, request=0xb4da741c) at ../src/jrd/exe.cpp:1095
#⁠12 0x08145526 in CMP_release (tdbb=0xaf184a8c, request=0xb4da741c) at ../src/jrd/cmp.cpp:2483
#⁠13 0x081c82a4 in release_attachment (tdbb=0xaf184a8c, attachment=Unhandled dwarf expression opcode 0xf3
) at ../src/jrd/jrd.cpp:5533
#⁠14 0x081c863c in purge_attachment (tdbb=0xaf184a8c, attachment=0xb4e21d50, force_flag=false) at ../src/jrd/jrd.cpp:6398
#⁠15 0x081c981f in jrd8_detach_database (user_status=0xaf184bf0, handle=0xb65612c8) at ../src/jrd/jrd.cpp:2457
#⁠16 0x080781ad in detach_or_drop_database (user_status=Unhandled dwarf expression opcode 0xf3
) at ../src/jrd/why.cpp:2261
#⁠17 0x0804f13a in rem_port::disconnect (this=0xb4f32218, sendL=0xb4bc02bc, receiveL=0xb4bc054c) at ../src/remote/server.cpp:1769
#⁠18 0x08058897 in process_packet (port=<value optimized out>, sendL=0xb4bc02bc, receive=0xb4bc054c, result=0xaf1852cc) at ../src/remote/server.cpp:3581
#⁠19 0x0805abe8 in loopThread () at ../src/remote/server.cpp:5261
#⁠20 0x0806c404 in run (arg=0xb655f5e0) at ../src/jrd/ThreadStart.cpp:128
#⁠21 threadStart (arg=0xb655f5e0) at ../src/jrd/ThreadStart.cpp:139
#⁠22 0xb74fe955 in start_thread (arg=0xaf185b70) at pthread_create.c:300
#⁠23 0xb747d1de in clone () at ../sysdeps/unix/sysv/linux/i386/clone.S:130

LOG ===========================================
HP (Server) Thu Sep 25 15:25:04 2014
INET/inet_error: read errno = 104

HP (Server) Thu Sep 25 15:25:13 2014
Fatal lock manager error: invalid lock id (41000), errno: 32

HP (Server) Thu Sep 25 15:25:13 2014
Shutting down the server with 1 active connection(s) to 1 database(s), 0 active service(s)

HP (Server) Thu Sep 25 15:25:18 2014
Firebird shutdown is still in progress after the specified timeout

HP (Client) Thu Sep 25 15:25:18 2014
/usr/sbin/fbguard: /usr/sbin/fbserver terminated abnormally (-1)

HP (Client) Thu Sep 25 15:25:18 2014
/usr/sbin/fbguard: guardian starting /usr/sbin/fbserver
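
For anyone triaging similar reports, the log excerpt above can be scanned mechanically. The following is only a minimal sketch: it assumes the entry layout shown in this report (a `host (role) timestamp` header line followed by message lines), and the names `parse_log`, `find_fatal_lock_errors` and `SAMPLE_LOG` are hypothetical helpers, not part of Firebird.

```python
import re
from datetime import datetime

# First three entries copied from the log excerpt in this report.
SAMPLE_LOG = """\
HP (Server) Thu Sep 25 15:25:04 2014
INET/inet_error: read errno = 104

HP (Server) Thu Sep 25 15:25:13 2014
Fatal lock manager error: invalid lock id (41000), errno: 32

HP (Server) Thu Sep 25 15:25:13 2014
Shutting down the server with 1 active connection(s) to 1 database(s), 0 active service(s)
"""

# Assumed header layout: "<host> (<role>) <Day Mon DD HH:MM:SS YYYY>".
HEADER = re.compile(
    r"^(\S+)\s+\((\w+)\)\s+"
    r"(\w{3}\s+\w{3}\s+\d{1,2}\s+\d{2}:\d{2}:\d{2}\s+\d{4})\s*$"
)
FATAL = re.compile(r"Fatal lock manager error: invalid lock id \((\d+)\), errno: (\d+)")


def parse_log(text):
    """Split log text into entries with host, role, time and message."""
    entries = []
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        match = HEADER.match(line)
        if match:
            stamp = " ".join(match.group(3).split())  # collapse repeated spaces
            current = {
                "host": match.group(1),
                "role": match.group(2),
                "time": datetime.strptime(stamp, "%a %b %d %H:%M:%S %Y"),
                "message": "",
            }
            entries.append(current)
        elif line and current is not None:
            # Message lines belong to the most recent header.
            current["message"] = (current["message"] + "\n" + line).strip()
    return entries


def find_fatal_lock_errors(entries):
    """Return (lock_id, errno) pairs for fatal lock manager entries."""
    found = []
    for entry in entries:
        match = FATAL.search(entry["message"])
        if match:
            found.append((int(match.group(1)), int(match.group(2))))
    return found
```

Calling `find_fatal_lock_errors(parse_log(SAMPLE_LOG))` pulls out the lock id and errno of the fatal entry, and the parsed timestamps let you measure the gap between the connection read error and the lock manager failure.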

Commits: 5af1459 FirebirdSQL/fbt-repository@b731fdf

====== Test Details ======

Could not reproduce with the following DDL:

alter sequence g restart with 0;
recreate table test(id int, s varchar(210));
insert into test select gen_id(g,1), rpad('', 210, uuid_to_char(gen_uuid()))
from rdb$types, rdb$types, (select 1 i from rdb$types rows 10)
rows 400000;
commit;
create index test_s on test(s);
commit;

Two attempts were made. Each launched 20 ISQL sessions and forced each of them to run:
1) either 'select count(*) from (select s from test ORDER BY s)';
2) or 'update test set s = rpad('', 210, uuid_to_char(gen_uuid())) where mod(id, 20) = mod(current_connection, 20)'.

After waiting about 10-20 seconds, all of the ISQL sessions were terminated by:
1) either pskill isql;
2) or issuing: delete from mon$attachments where mon$attachment_id != current_connection;
3) or issuing: delete from mon$statements where mon$attachment_id != current_connection;

None of these actions led to an FB crash. The log was only filled with '10054' errors when pskill was used.
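
The attempt described above could be scripted roughly as follows. This is a sketch under assumptions, not the harness actually used by QA: the database connection string is a placeholder you must supply, the credentials are the stock SYSDBA defaults, `run_attempt` and `isql_command` are hypothetical helper names, and `Popen.kill()` stands in for `pskill`. Only the documented `isql` switches `-user` and `-password` are used.

```python
import subprocess
import time

# The two workloads from the test details above.
QUERY_SELECT = "select count(*) from (select s from test order by s);"
QUERY_UPDATE = (
    "update test set s = rpad('', 210, uuid_to_char(gen_uuid())) "
    "where mod(id, 20) = mod(current_connection, 20);"
)


def isql_command(database, user="SYSDBA", password="masterkey"):
    """Build the isql command line (credentials are assumed defaults)."""
    return ["isql", "-user", user, "-password", password, database]


def run_attempt(database, sql, sessions=20, wait_seconds=15,
                popen=subprocess.Popen):
    """Start `sessions` isql processes, feed each one `sql`, wait, then
    kill them abruptly. Returns the number of sessions that were killed.

    `popen` is injectable so the logic can be exercised without a server.
    """
    procs = []
    for _ in range(sessions):
        proc = popen(
            isql_command(database),
            stdin=subprocess.PIPE,
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
        )
        # Send the statement but keep stdin open so the session stays busy.
        proc.stdin.write(sql.encode("ascii") + b"\n")
        proc.stdin.flush()
        procs.append(proc)

    time.sleep(wait_seconds)  # let the statements get under way

    for proc in procs:
        proc.kill()  # abnormal termination, no clean detach
    for proc in procs:
        proc.wait()
    return len(procs)
```

For example, `run_attempt("<your connection string>", QUERY_SELECT)` would correspond to the first workload with the process-kill termination; the `delete from mon$attachments` and `delete from mon$statements` variants would instead be issued from a separate, surviving connection.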

@firebird-automations

Modified by: @dyemanov

assignee: Dmitry Yemanov [ dimitr ]

@firebird-automations

Modified by: @dyemanov

Version: 2.5.3 Update 1 [ 10650 ]

Version: 3.0 Beta 1 [ 10332 ]

Version: 3.0 Alpha 2 [ 10560 ]

Version: 3.0 Alpha 1 [ 10331 ]

Version: 2.5.2 Update 1 [ 10521 ]

Version: 2.5.2 [ 10450 ]

Version: 2.5.1 [ 10333 ]

Version: 2.5.0 [ 10221 ]

Fix Version: 2.5.4 [ 10585 ]

Fix Version: 3.0 Beta 2 [ 10586 ]

@firebird-automations

Modified by: @dyemanov

summary: FB 2.5.3 Superserver Dies when client disconnected abnormally => Superserver dies when client is disconnected abnormally during the index navigational scan

@firebird-automations

Modified by: @dyemanov

Version: 3.0 Beta 1 [ 10332 ] =>

Version: 3.0 Alpha 2 [ 10560 ] =>

Version: 3.0 Alpha 1 [ 10331 ] =>

Fix Version: 3.0 Beta 2 [ 10586 ] =>

@firebird-automations

Modified by: @dyemanov

status: Open [ 1 ] => Resolved [ 5 ]

resolution: Fixed [ 1 ]

@firebird-automations

Modified by: @pcisar

status: Resolved [ 5 ] => Closed [ 6 ]

@firebird-automations

Modified by: @pavel-zotov

status: Closed [ 6 ] => Closed [ 6 ]

QA Status: Not enough information
