Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Race condition between event notification and event registration [CORE5521] #5790

Closed
firebird-automations opened this issue Apr 15, 2017 · 3 comments

Comments

@firebird-automations
Copy link
Collaborator

Submitted by: @mrotteveel

Relate to JDBC175

There is a race condition between event notification and event registration (queue) reusing the event buffer, which causes the notification to write the event-id of the just queued event instead of the event being notified. See also fb-devel thread "Concurrency bugs in posting events?"

Relevant excerpts:

2017-04-09 12:00:
I have built a sample application that reproduces it a little bit more
consistently (although it still occasionally succeeds without a
mismatch). Note that this doesn't include the logging I showed in this
conversation, let me know if you need that.

You can download it from
https://www.dropbox.com/s/6jxfcadxtojodf8/event-race-condition-1.0-SNAPSHOT.zip?dl=0

Start with ./bin/event-race-condition --help for instructions. It
requires Java 8.

Playing around with the --threadCount and --insertsPerThread can help to
improve predictability to reproduce it. I used the same number of
threads as I have (HT) cores in my machine. Using more inserts per
thread can also increase the chance of it eventually occurring.

defaults are:
private static String hostName = "localhost";
private static int portNumber = 3050;
private static String databasePath = "D:/data/db/fb3/eventrace.fdb";
private static String user = "sysdba";
private static String password = "masterkey";
private static int threadCount = 8;
private static int insertsPerThread = 200;

Full project: https://github.com/mrotteveel/event-race-condition

2017-04-02 13:59:
there seems to be a concurrency bug in
events posted by Firebird to the client. It looks like it overwrites
local event ids (shared buffer, race condition?).

This is triggered by running the entire Jaybird test suite. Running the
specific test, TestFBEventManager.testLargeMultiLoad, in isolation
significantly reduces the chance of it occurring.

For example a test run shows:

[V10AsynchronousChannel]Queue event: WireEventHandle:{
name:TEST_EVENT_A, localId:694, eventId:0, internalCount:897,
previousInternalCount:897 }
[AbstractWireOperations]readStatusVector arg:isc_arg_gds int: 0
[V10AsynchronousChannel]java.nio.HeapByteBuffer[pos=0 lim=88 cap=2048]:
000000340000000000000012010C544553545F4556454E545F42C201000000000000000000000000000002B5000000340000000000000012010C544553545F4556454E545F418503000000000000000000000000000002B6
[V10AsynchronousChannel]Received event id 693, eventCount 450
[V10AsynchronousChannel]Queue event: WireEventHandle:{
name:TEST_EVENT_B, localId:695, eventId:0, internalCount:450,
previousInternalCount:450 }
[AbstractWireOperations]readStatusVector arg:isc_arg_gds int: 0
[AbstractWireOperations]readStatusVector arg:isc_arg_gds int: 0
[V10AsynchronousChannel]Received event id 694, eventCount 901
[V10AsynchronousChannel]Queue event: WireEventHandle:{
name:TEST_EVENT_A, localId:696, eventId:0, internalCount:901,
previousInternalCount:901 }
[AbstractWireOperations]readStatusVector arg:isc_arg_gds int: 0
[V10AsynchronousChannel]java.nio.HeapByteBuffer[pos=0 lim=44 cap=2048]:
000000340000000000000012010C544553545F4556454E545F42C301000000000000000000000000000002B8
[AbstractWireOperations]readStatusVector arg:isc_arg_gds int: 0
[V10AsynchronousChannel]Received event id 696, eventCount 451
[V10AsynchronousChannel]Queue event: WireEventHandle:{
name:TEST_EVENT_A, localId:697, eventId:0, internalCount:451,
previousInternalCount:451 }

In other words, Firebird posts event data for TEST_EVENT_B (count 450 ->
451) with the local event id of TEST_EVENT_A. On occasion I also see
that it resends an earlier - already acknowledged - local event id.

As the event name is proper (although not 100% sure it always is), I
might be able to workaround this in the pure java implementation by
matching based on the event name instead, but that is hardly a good
workaround, because it is possible the same event name is registered
multiple times, and it won't solve the occurrence of the same bug with
the native client.

This seems to suggest a race condition of some kind when the events are
posted/written to the aux connection.

I can reproduce this with Jaybird master, Firebird 3.0.2.32703 on
Windows 10 64 bit, but I have also seen it with other Firebird versions,
and with Jaybird 2.2 (which has a significantly different implementation
of event handling), both with pure java and the native client use.

Any thoughts or ideas on this, or is it better if I just create a bug
report?

Other example: both A and B are acknowledged with id of event B:

[V10AsynchronousChannel]Received event id 640, eventCount 843
[V10AsynchronousChannel]Queue event: WireEventHandle:{
name:TEST_EVENT_A, localId:642, eventId:0, internalCount:843,
previousInternalCount:843 }
[AbstractWireOperations]readStatusVector arg:isc_arg_gds int: 0
[AbstractWireOperations]readStatusVector arg:isc_arg_gds int: 0
[FBManagedConnection]End called: Xid[773794790]
[V10AsynchronousChannel]Received event id 641, eventCount 422
[V10AsynchronousChannel]Queue event: WireEventHandle:{
name:TEST_EVENT_B, localId:643, eventId:0, internalCount:422,
previousInternalCount:422 }
[AbstractWireOperations]readStatusVector arg:isc_arg_gds int: 0
[AbstractWireOperations]readStatusVector arg:isc_arg_gds int: 0
[AbstractWireOperations]readStatusVector arg:isc_arg_gds int: 0
[V10AsynchronousChannel]java.nio.HeapByteBuffer[pos=0 lim=88 cap=2048]:
000000340000000000000012010C544553545F4556454E545F414D0300000000000000000000000000000283000000340000000000000012010C544553545F4556454E545F42A70100000000000000000000000000000283
[V10AsynchronousChannel]Received event id 643, eventCount 845
[V10AsynchronousChannel]Queue event: WireEventHandle:{
name:TEST_EVENT_B, localId:644, eventId:0, internalCount:845,
previousInternalCount:845 }
[AbstractWireOperations]readStatusVector arg:isc_arg_gds int: 0
[AbstractWireOperations]readStatusVector arg:isc_arg_gds int: 0
[V10AsynchronousChannel]Received event id 643, eventCount 423

Commits: d250a99

@firebird-automations
Copy link
Collaborator Author

Modified by: @mrotteveel

Link: This issue relate to JDBC175 [ JDBC175 ]

@firebird-automations
Copy link
Collaborator Author

Modified by: @hvlad

assignee: Vlad Khorsun [ hvlad ]

@firebird-automations
Copy link
Collaborator Author

Modified by: @hvlad

status: Open [ 1 ] => Resolved [ 5 ]

resolution: Fixed [ 1 ]

Fix Version: 4.0 Alpha 1 [ 10731 ]

Fix Version: 3.0.3 [ 10810 ]

Fix Version: 2.5.8 [ 10809 ]

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment