Issue Details (XML | Word | Printable)

Key: JDBC-341
Type: Bug Bug
Status: Closed Closed
Resolution: Fixed
Priority: Major Major
Assignee: Mark Rotteveel
Reporter: Steffen Heil
Votes: 0
Watchers: 0
Operations

If you were logged in you would be able to see more operations.
Jaybird JCA/JDBC Driver

PooledConnectionQueue not theadsafe?

Created: 22/Feb/14 01:40 PM   Updated: 27/Apr/14 01:41 PM
Component/s: None
Affects Version/s: Jaybird 2.2
Fix Version/s: Jaybird 2.2.5

Time Tracking:
Not Specified

Environment: Sun JDK 6.0.45, Firebird 2.0 LI-V6.3.4.13130


 Description  « Hide
(First off, I know the environment is quite outdated, but I am quite sure, this has nothing to do with my problem.)

I have a massively multithreaded application that uses jaybird (currently 2.2.0) with a connection pool and a maxPoolSize of 15.
For years it worked perfectly (I am not sure, at what point we updated to 2.2.0 though).

Yesterday for the very first time, we noticed a situation where a thread tried to obtain a connection but failed to do so.
Interestingly all connections opened by our application were closed. I am sure about that, because very connection we acquire gets wrapped and the wrappers are tracked. (A background task closes connections, when the wrappers get GCed and logs a stacktrace of the thread that acquired the connection.)

Out own code always acquires database connections using one method in our framework that is synchronized, so getConnection is never called by more than one thread. So the only thread in FB classes at the time of the problem was:

"trustedChannel-worker-1" prio=10 tid=0x00007f0a2d877800 nid=0x19ca in Object.wait() [0x0000000041aec000]
   java.lang.Thread.State: TIMED_WAITING (on object monitor)
at java.lang.Object.wait(Native Method)
at org.firebirdsql.pool.BlockingStack.pop(BlockingStack.java:147)
- locked <0x0000000571cdc700> (a java.lang.Object)
at org.firebirdsql.pool.PooledConnectionQueue.take(PooledConnectionQueue.java:413)
- locked <0x0000000571cdb980> (a java.lang.Object)
at org.firebirdsql.pool.AbstractConnectionPool.getPooledConnection(AbstractConnectionPool.java:231)
- locked <0x0000000571d1c450> (a org.firebirdsql.pool.FBConnectionPoolDataSource)
at org.firebirdsql.pool.AbstractFBConnectionPoolDataSource.getPooledConnection(AbstractFBConnectionPoolDataSource.java:352)
- locked <0x0000000571d1c450> (a org.firebirdsql.pool.FBConnectionPoolDataSource)
at org.firebirdsql.pool.AbstractFBConnectionPoolDataSource.getPooledConnection(AbstractFBConnectionPoolDataSource.java:372)
- locked <0x0000000571d1c450> (a org.firebirdsql.pool.FBConnectionPoolDataSource)
at org.firebirdsql.pool.FBWrappingDataSource.getConnection(FBWrappingDataSource.java:219)
at com.osiris4.core.persistence.servers.FirebirdDatabase.getSafeConnection(FirebirdDatabase.java:377)
- locked <0x0000000571ef4ec0> (a com.osiris4.core.persistence.servers.FirebirdDatabase)

Using eclipse MAT I inspected a memory dump (that I still have, if questions arise) and found the (only) instance of PooledConnectionQueue that used the lock 0x0000000571cdb980. Inspecting that PooledConnectionQueue instance showed the following:

Type |Name |Value
------------------------------------------------------------------------------------------------------------
ref |connectionIdleTime |java.util.HashMap @ 0x571ed19f8
ref |workingConnectionsToClose|java.util.HashSet @ 0x571cdb9b0
ref |workingConnections |java.util.HashSet @ 0x571cdb998
int |totalConnections |15
ref |idleRemover |org.firebirdsql.pool.PooledConnectionQueue$IdleRemover @ 0x571e33ca8
ref |takeMutex |java.lang.Object @ 0x571cdb980
boolean|blocked |true
ref |stack |org.firebirdsql.pool.BlockingStack @ 0x571cdb968
int |size |-37
int |blockingTimeout |600000
ref |queueName |//127.0.0.1//opt/database/osiris4.fdb
ref |key |org.firebirdsql.pool.AbstractConnectionPool$UserPasswordPair @ 0x571e33ea0
ref |configuration |org.firebirdsql.pool.FBConnectionPoolDataSource @ 0x571d1c450
ref |logger |null
ref |connectionManager |org.firebirdsql.pool.FBConnectionPoolDataSource @ 0x571d1c450
------------------------------------------------------------------------------------------------------------

I suspect that "size" should NOT have a value of *-37*.
Looking into the code seems to suggest that access to "size" is only synchronized in one of four locations.

Additionally there *are* 15 instances of "FBPooledConnection", all of which have "inPool" set to *true*....

I am willing to inspect this further but so far I have no idea about all the internal classes used in jaybird, so I would strongly appreciate if someone else could have a look as well.


 All   Comments   Work Log   Change History   Version Control   Subversion Commits      Sort Order: Ascending order - Click to sort in descending order
Mark Rotteveel added a comment - 24/Feb/14 05:49 PM
As also discussed on Firebird-Java (https://groups.yahoo.com/neo/groups/Firebird-Java/conversations/topics/10992), all classes in org.firebirdsql.pool should be considered deprecated and will be removed in Jaybird 3.0. I suggest to move to either use the connection pool of your application server (if applicable), or to switch to c3p0, BoneCP or another third party connection pool.

That said, I would have expected this to work correctly with FBWrappingDataSource, as such I will see if I can find the cause and I plan to fix it in Jaybird 2.2.5 or maybe 2.2.6.

Mark Rotteveel added a comment - 01/Mar/14 02:57 PM
Testing also revealed a deadlock issue.

Mark Rotteveel added a comment - 01/Mar/14 03:25 PM
Testing with a fixed version of PooledConnectionQueue reveals a deadlock between PooledConnectionQueue and AbstractFBConnectionPoolDataSource

Mark Rotteveel added a comment - 20/Mar/14 08:13 PM
Reworked the locking and synchronization of PooledConnectionQueue and related classes. As far as I have been able to reason and test this should fix the deadlocks I discovered earlier. However this needs additional testing.

I will release a snapshot version of Jaybird 2.2.5 before the end of the month (so: somewhere next week) for testing.

Mark Rotteveel added a comment - 30/Mar/14 02:31 PM
The SNAPSHOT of 2.2.5 is available at https://sourceforge.net/projects/firebird/files/firebird-jca-jdbc-driver/2.2.5-SNAPSHOT/ and also in the Sonatype maven snapshot repository https://oss.sonatype.org/content/repositories/snapshots/

If you could test this it would be really helpful.

Mark Rotteveel added a comment - 27/Apr/14 09:06 AM
Testing shows the issue to have been fixed, no other thread safety issues uncovered.