Key: CORE-5757
Type: Bug
Status: Closed
Resolution: Fixed
Priority: Critical
Assignee: Alexander Peshkov
Reporter: Hamish Moffatt
Votes: 0
Watchers: 3
Firebird Core

deadlock with events

Created: 22/Feb/18 03:18 AM   Updated: 25/Mar/18 07:40 AM
Component/s: Engine
Affects Version/s: 4.0 Initial, 3.0.0, 2.5.6, 3.0.1, 2.5.7, 3.0.2, 4.0 Alpha 1, 2.5.8, 3.0.3
Fix Version/s: 3.0.4, 4.0 Beta 1, 2.5.9

File Attachments: 1. Text File after-patch.txt (7 kB)
2. Text File after-patch2.txt (21 kB)
3. File event_loop.py (0.8 kB)
4. File event_loop.py (0.5 kB)
5. Text File gdb.txt (5 kB)
6. Text File PORT_connecting.patch (8 kB)

Environment: Linux

QA Status: Done with caveats
Test Details:
Stored as a regular Python script, for use only in a separate POSIX environment.
Must NOT be launched together with other tests from fbt-repo!

See: fbt-repo/files/core_5757.py.txt


Description:
My Firebird server deadlocks often. I am using 2.5.8 on Linux in a mix of superserver, superclassic, 32-bit and 64-bit. All are affected.

When this happens I cannot make any new connections or run any queries on existing connections.

This looks just like CORE-4680, which was meant to be fixed in 2.5.5.

I created a Python program that connects to the server, registers an event listener, then disconnects. It runs 5 threads at once. The server deadlocked after about 300 connects (64 connections on each thread). When I killed the Python program, the server resumed. The test database is any empty database.

I have attached a back trace from the server while it's in this state.
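
For readers without access to the attachments, the reproduction described above can be sketched roughly as follows. This is not the attached event_loop.py; it is a minimal illustration that assumes the fdb driver's event_conduit()/begin()/close() API. The function names, the event name "test_event", the DSN, and the credentials are all hypothetical placeholders.

```python
import threading


def connect_listen_disconnect(connect, event_names, iterations):
    """Repeatedly connect, register an event listener, then disconnect."""
    for _ in range(iterations):
        con = connect()
        try:
            # Register interest in the events (assumed fdb-style API).
            conduit = con.event_conduit(event_names)
            conduit.begin()   # start listening
            conduit.close()   # cancel the listener again
        finally:
            con.close()


def run_stress(connect, event_names=("test_event",), threads=5, iterations=64):
    """Run the connect/listen/disconnect loop on several threads at once."""
    workers = [
        threading.Thread(
            target=connect_listen_disconnect,
            args=(connect, list(event_names), iterations),
        )
        for _ in range(threads)
    ]
    for worker in workers:
        worker.start()
    for worker in workers:
        worker.join()


def fdb_connect():
    """Hypothetical connection factory; DSN and credentials are placeholders."""
    import fdb  # imported lazily so the sketch loads without the driver
    return fdb.connect(
        dsn="localhost:/tmp/empty.fdb", user="SYSDBA", password="masterkey"
    )
```

Calling run_stress(fdb_connect) against a disposable test server would perform 5 x 64 = 320 connect/listen/disconnect cycles, in line with the roughly 300 connects reported above. The connection factory is passed in as a parameter so the loop can also be exercised with a stub in place of a real server.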

Change History
Hamish Moffatt made changes - 22/Feb/18 03:37 AM
Field Original Value New Value
Affects Version/s 2.5.8 [ 10809 ]
Environment Linux / Windows Linux
Description Approximately once a month I see a Firebird superserver stall.
Every connection attempt to the server fails, and I must restart the server with the init script.
Firebird writes the following message to the log file:

Koenig_DB_server (Server) Tue Oct 21 10:47:50 2014
        Shutting down the server with 26 active connection(s) to 1 database(s), 0 active service(s)

Koenig_DB_server (Server) Thu Nov 13 07:50:54 2014
        Shutting down the server with 16 active connection(s) to 1 database(s), 0 active service(s)

Koenig_DB_server (Server) Thu Jan 15 09:42:31 2015
        Shutting down the server with 30 active connection(s) to 1 database(s), 0 active service(s)

Koenig_DB_server (Server) Fri Jan 30 10:24:50 2015
        Shutting down the server with 27 active connection(s) to 1 database(s), 0 active service(s)

Since this has been going on for years, I have written a small program with which I was able to reproduce the problem.

1. A connection is established to the database server.
2. Nine events are registered.
3. Wait a short time (millisecond).
4. Unregister the events.
5. Close the connection.
6. Start all over again.

If it is possible to kill the process that caused the deadlock, the server continues to run normally.
If no events are registered, the problem does not occur.

I was able to reproduce the problem with Firebird superserver 2.5 and 3.0 Beta 2 on both Linux and Windows.


Here's a video that shows that problem.
https://datiscum.com/FB_Test_x264.mp4

The program can be downloaded here.
https://datiscum.com/FirebirdTest.7z

The downloads are available only for a few days.

I hope this was helpful and the problem can be eliminated.

Regards,
  Sascha Michel
My Firebird server deadlocks often. I am using 2.5.8 on Linux in a mix of superserver, superclassic, 32-bit and 64-bit. All are affected.

When this happens I cannot make any new connections or run any queries on existing connections.

This looks just like CORE-4680, which was meant to be fixed in 2.5.5.

I created a Python program that connects to the server, registers an event listener, then disconnects. It runs 5 threads at once. The server deadlocked after about 300 connects (64 connections on each thread). When I killed the Python program, the server resumed. The test database is any empty database.

I have attached a back trace from the server while it's in this state.
Attachment gdb.txt [ 13212 ]
Attachment event_loop.py [ 13213 ]
Alexander Peshkov made changes - 22/Feb/18 12:42 PM
Assignee Alexander Peshkov [ alexpeshkoff ]
Alexander Peshkov made changes - 22/Feb/18 03:57 PM
Attachment PORT_connecting.patch [ 13214 ]
Hamish Moffatt made changes - 22/Feb/18 09:59 PM
Attachment event_loop.py [ 13215 ]
Hamish Moffatt made changes - 22/Feb/18 10:38 PM
Attachment after-patch.txt [ 13216 ]
Attachment after-patch2.txt [ 13217 ]
Pavel Zotov made changes - 25/Feb/18 07:45 AM
Status Open [ 1 ] Open [ 1 ]
Test Details Decided skip implementation after letter from hvlad, 28.12.2017 12:33.
QA Status Cannot be tested
Pavel Zotov made changes - 25/Feb/18 07:45 AM
Status Open [ 1 ] Open [ 1 ]
QA Status No test
Alexander Peshkov made changes - 25/Feb/18 05:13 PM
Affects Version/s 3.0.3 [ 10810 ]
Affects Version/s 4.0 Alpha 1 [ 10731 ]
Affects Version/s 3.0.2 [ 10785 ]
Affects Version/s 2.5.7 [ 10770 ]
Affects Version/s 3.0.1 [ 10730 ]
Affects Version/s 2.5.6 [ 10721 ]
Affects Version/s 3.0.0 [ 10740 ]
Affects Version/s 4.0 Initial [ 10621 ]
Fix Version/s 2.5.5 [ 10670 ]
Alexander Peshkov made changes - 25/Feb/18 05:14 PM
Status Open [ 1 ] Resolved [ 5 ]
Fix Version/s 4.0 Beta 1 [ 10750 ]
Fix Version/s 3.0.4 [ 10863 ]
Fix Version/s 2.5.9 [ 10862 ]
Resolution Fixed [ 1 ]
Pavel Zotov made changes - 09/Mar/18 09:25 AM
Status Resolved [ 5 ] Resolved [ 5 ]
Test Details sent letter to dimitr & alex, 09.03.18 12:32. Waiting for reply.
QA Status No test Deferred
Pavel Zotov made changes - 25/Mar/18 07:17 AM
Status Resolved [ 5 ] Resolved [ 5 ]
Test Details sent letter to dimitr & alex, 09.03.18 12:32. Waiting for reply. Stored as usual Python script, for usage only in separate POSIX environment.
Must NOT be launched together with other tests from fbt-repo!

See: fbt-repo/files/core_5757.py.txt

QA Status Deferred Done with caveats
Pavel Zotov made changes - 25/Mar/18 07:40 AM
Status Resolved [ 5 ] Closed [ 6 ]