Issue Details (XML | Word | Printable)

Key: PYFB-39
Type: Bug Bug
Status: Closed Closed
Resolution: Fixed
Priority: Major Major
Assignee: Pavel Cisar
Reporter: mike bayer
Votes: 0
Watchers: 1
Operations

If you were logged in you would be able to see more operations.
Firebird driver for Python

hang with multiple connections open, despite calling rollback. behavior demonstrably different from kinterbasdb

Created: 17/Feb/14 07:39 PM   Updated: 13/Nov/14 01:17 PM
Component/s: None
Affects Version/s: None
Fix Version/s: 1.4.2

Environment: osx, fdb 1.4 from current pypi download


 Description  « Hide
The following script illustrates a difference in concurrency behavior between kinterbasdb and fdb. I've tried various FDB settings in order to work around this but I can't seem to get one. Basically, when a particular DBAPI connection has rollback() called, it should not be retaining any kind of transactional state that would interfere with a table being dropped. With kinterbasdb this works, with FDB it does not. The issue here is somewhat of a blocker for FDB support in SQLAlchemy, as we can't get our unit test suite to pass completely.

import fdb
import kinterbasdb

def run_test(dbapi, user, password, host, dbname):
    print("Running with dbapi: %s" % dbapi)

    raw_conn_one = dbapi.connect(user=user, host=host, password=password, database=dbname)
    raw_conn_two = dbapi.connect(user=user, host=host, password=password, database=dbname)

    cursor = raw_conn_one.cursor()
    cursor.execute("CREATE TABLE foo (id integer)")
    cursor.close()
    raw_conn_one.commit()

    cursor = raw_conn_two.cursor()
    cursor.execute("insert into foo (id) values (1)")
    cursor.close()
    # rollback connection two, but dont actually close it.
    # if we close it, then both DBAPIs proceed.
    raw_conn_two.rollback()

    cursor = raw_conn_one.cursor()
    cursor.execute("drop table foo")
    cursor.close()

    print("about to commit...")
    raw_conn_one.commit()

    # FDB will not get here until raw_conn_two is closed.
    # kinterbasdb will.
    print("done!")


run_test(kinterbasdb, "sysdba", "masterkey", "localhost", "//Users/classic/foo.fdb")
run_test(fdb, "sysdba", "masterkey", "localhost", "//Users/classic/foo.fdb")

the test above will complete when the "kinterbasdb" call proceeds. On the "fdb" call, it hangs:

classic$ python test2.py
Running with dbapi: <module 'kinterbasdb' from '/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/kinterbasdb/__init__.pyc'>
about to commit...
done!
Running with dbapi: <module 'fdb' from '/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/fdb/__init__.pyc'>
about to commit...


I'm not sure if fdb has some architectural issue that makes it work this way, but if not a bug, some guidance on correct practices would be helpful. thanks !

 All   Comments   Change History   Subversion Commits      Sort Order: Ascending order - Click to sort in descending order
Philippe Makowski added a comment - 03/Mar/14 03:50 PM
seems that the problem is around cursor.
by the way, it doesn't hang for me, it report an error "object FOO is in use"
it shouldn't

by the way, I would not use cursor in that case but something like :

    raw_conn_one = dbapi.connect(user=user, host=host, password=password, database=dbname)
    raw_conn_two = dbapi.connect(user=user, host=host, password=password, database=dbname)
    raw_conn_one.main_transaction.execute_immediate("CREATE TABLE foo (id integer)")
    raw_conn_one.commit()
    raw_conn_two.main_transaction.execute_immediate("insert into foo (id) values (1)")
    raw_conn_two.rollback()
    raw_conn_one.main_transaction.execute_immediate("drop table foo")
    print("about to commit...")
    raw_conn_one.commit()
    print("done!")


or even better, with only one connection

    raw_conn_one = dbapi.connect(user=user, host=host, password=password, database=dbname)
    raw_conn_one.main_transaction.execute_immediate("CREATE TABLE foo (id integer)")
    raw_conn_one.commit()
    ta_two = raw_conn_one.trans()
    ta_two.execute_immediate("insert into foo (id) values (1)")
    ta_two.rollback()
    raw_conn_one.main_transaction.execute_immediate("drop table foo")
    print("about to commit...")
    raw_conn_one.commit()
    print("done!")

$ python test.py
Running with dbapi: <module 'fdb' from '/Users/philippe/firebird/fdb/fdb/__init__.pyc'>
about to commit...
done!

mike bayer added a comment - 03/Mar/14 04:13 PM
this is for the use within the SQLAlchemy database API. SQLAlchemy is a wrapper over pep249 compliant database APIs, so in the vast majority of situations we should be able to use the standard connection.cursor() approach, ongoing within a transaction, and as I said our test suite features some tests that incur basic concurrency situations. Working around the behavior is not really the issue here. if we need to scale back our firebird tests to the level of that of a SQLite in-memory database, for which we use a special mode where the entire test suite has just one connection, that's an option, but that would be a very low level of test coverage for our firebird suite and a big step back from kinterbasdb.

Pavel Cisar added a comment - 05/Mar/14 07:12 AM
Well, this "Object in use" case is result of statement handle reuse (prepared statement). Cursors in FDB are implemented as thin wrappers around PreparedStatements (Firebird statement handles), and each Cursor instance maintains a cache of prepared statements. Ending (commit/rollback) a transaction doesn't drop all prepared statements, as they are only "closed" and could be reused within context of another transaction (handle). In this particular case it means that Firebird engine keeps some resources (closed statement handle) in it's cache that prevents to drop table in another connection. To truly drop all prepared statements maintained by cursor you need to call cursor.clear_cache(), or delete (call __dell__) the cursor object that holds them. This is as designed and not a bug in FDB.

Philippe Makowski added a comment - 05/Mar/14 12:37 PM
In fact Pavel, of course you are right, and it is clearly documented http://pythonhosted.org/fdb/reference.html#cursor
but one point, to be more close to pep249 and Python DB API 2.0, couldn't we have a default behavior of cursor.close() as similar to __del__ (and do a clear_cache), and have an option such as 'keep' that would not drop prepared statements ?

 

mike bayer added a comment - 05/Mar/14 02:45 PM
OK well that is useful already, cursor.clear_cache() as a workaround is something we could plug into our dialect for firebird FDB.

however, I think it's less important that cursor.close() remove transactional state and/or cached resources, as much as that connection.rollback() *should*. connection.rollback()/commit means the transaction is over. that connection should be equivalent to a brand new connection that was just opened, and anything cached data which still has a handle on locks should be gone. Especially when that state is strictly there as a hidden performance enhancement. If I showed this test case to any other DBAPI, psycopg2, pymssql, whatever, it would be treated as a major issue.

Philippe Makowski added a comment - 05/Mar/14 03:08 PM
no, I don't agree with "connection.rollback() *should*. "

that's a good feature of Firebird : the capability to keep prepared statement outside of a transaction

A good usage of this, is that you can for exemple prepare your more often used statement at the start of your application, and keep them available for futures transactions.

and again, one connection doesn't mean one transaction, you can have more than one transaction per connection.
so yes "connection.rollback()/commit means the transaction is over." but not the connection and even not others transactions linked to this connection.

mike bayer added a comment - 05/Mar/14 03:16 PM
OK just getting this straight, it's firebird's position that a database connection *should* feel free to retain table locks after the transaction has ended, thus locking that table and preventing it from being dropped until the connection is physically closed. by default, implicitly. is that right?

Philippe Makowski added a comment - 05/Mar/14 05:47 PM - edited
Mike, take it easy,
I appreciate the work you are doing with SQLA, and the effort made to have Firebird support in SQLA.
 I understand also the way Pavel implemented this, and the advantage it have, and I appreciate also the work Pavel did to have a new Python driver for Firebird
 may be we can find a solution so everyone can be happy
Pavel what about my proposal :
 to be more close to pep249 and Python DB API 2.0, couldn't we have a default behavior of cursor.close() as similar to __del__ (and do a clear_cache), and have an option such as 'keep' that would not drop prepared statements ?



mike bayer added a comment - 05/Mar/14 05:59 PM
heh was hoping my frustration didn't poke out on that one :)

is it really that much of a performance boost to not re-create prepared statements? Caching prepared statements over transaction boundaries is nice and all but do your users really expect that to be the default behavior? Think of the zillions of JDBC programs that all do: "x = new PreparedStatement()" every time. Is this like more of a big deal on Firebird? (or is there some other reason besides performance)? I would imagine most users of fdb are using it because, its the default Python DBAPI on the firebird site now. this particular performance enhancement has behavioral impacts and that to me is a tradeoff.

Pavel Cisar added a comment - 06/Mar/14 08:28 AM
Mike, the existence of prepared statement has no other impact on other statements, transactions or connections *except* that Firebird *engine* protects metadata referenced by this statement from permanent delete (i.e. drop). Your test case is special scenario that allows this to surface to end user, but you should understand that it's not normal use case scenario. Such situation (attempt to drop table/index/procedure while other connections use(d) it) should *never* happen in end user application/deployment. Such radical metadata changes should be always done in "single user mode", and if you must carry it under full multi-user load, then your applications should be written in a way to cooperate. So what your unit test is doing is *generally* not normal for any RDBMS, it may be allowed to various degree but it's definitely not recommended practice (it's special case). The monstrosity of such scenario could be hidden and smoothed by engine/connectivity layers to invisibility, but it's still a monstrosity that cause serious goosebumps on the neck of any serious RDBMS professional.

I agree that implicit caching of prepared statements by cursor instance makes it more likely to appear when you want to drop database objects while other connections are active, but FDB provides several ways how to "fix" that at application level. We could discuss whether implicit PS caching is good thing and may be change that for good reason, but I must insist on my view that this behavior is not a bug, is as designed and valid ("problems" surfacing in non-standard situations are not an issue).

Anyway, I never liked implicit PS caching (it's for lazy developers, sneaky method how to put some performance to badly written applications and in worst case may lead to significant resource consumption on server side when developers are not only lazy but also stupid), and it was implemented in FDB only because KInterbasDB provides it (although the implementation details differ. in fact it's difference in transaction implementation that makes this visible difference between KDB and FDB, not difference in PS cache implementation itself). So I'll initiate a discussion in firebird-python list about PS caching. I can't guarantee that it will lead to changes that would also "fix" your unit test, but it definitely may (if it will end my way).

mike bayer added a comment - 06/Mar/14 12:57 PM
well if its really just DROP and absolutely nothing else, it's not that big a deal. I wouldnt worry about it. we can work around it on our end.

Pavel Cisar added a comment - 13/Nov/14 01:17 PM
Support for implicit reuse of prepared statements (internal PS cache) was dropped.