"lock conversion denied" or "lock denied" error [CORE2848] #3234

Closed
firebird-automations opened this issue Feb 5, 2010 · 16 comments

Comments

@firebird-automations

Submitted by: @hvlad

Relates to CORE4372

Votes: 1

The error was found while developing the MT-safe page cache in v3.0.
It is reproducible under relatively high load: TPCC with 64 concurrent connections, not disk bound.
Whether Firebird versions prior to v2.5 are affected is currently under investigation.

Commits: a07e2ff ef2e6e7 FirebirdSQL/fbt-repository@9131896 FirebirdSQL/fbt-repository@41db3ac

@firebird-automations

Modified by: @hvlad

assignee: Vlad Khorsun [ hvlad ]

@firebird-automations

Commented by: @ibprovider

[03.12.2012 20:07:45] [info] Provider DLL :_IBProvider_v3_vc10_i.dll
[03.12.2012 20:07:45] [info] Provider Version:3.13.2.14113
[03.12.2012 20:07:45] [info] Server Name :Firebird (x64 SuperClassic)
[03.12.2012 20:07:45] [info] Server Version :2.5.3.26543
[03.12.2012 20:07:45] [info] Client Name :Firebird SQL Server
[03.12.2012 20:07:45] [info] Client Version :2.5.3.26543
[03.12.2012 20:07:45] [info] Database ODS :11.2
[03.12.2012 20:07:45] [info] Database Dialect:1
...
[03.12.2012 20:07:45] [info] ConnectionString:
provider=LCPI.IBProvider.3;location=localhost:d:\database\ibp_test_fb25_d1_2.gdb;user id=gamer;password=vermut;ctype=win1251;icu_library=icuuc30.dll
...
[03.12.2012 20:07:45] Creation 8 thread(s)...

--------- [server log]
HOME2 Mon Dec 03 21:48:48 2012
Database: D:\DATABASE\IBP_TEST_FB25_D1_2.GDB
page 487861, page type 4 lock denied (216)

HOME2 Mon Dec 03 21:48:49 2012
Database: D:\DATABASE\IBP_TEST_FB25_D1_2.GDB
page 636734, page type 5 lock denied (216)
--------- [/server log]

Source code of test: ibp_tso_octets_003_blobs.cpp

-------- [thread #⁠2]
[03.12.2012 21:48:37] * START TEST [octets|octets.003.blob.update_rs.TBL_CS__OCTETS.COL_BLOB.len_65537.bind__bytes]
[03.12.2012 21:48:37] *
[03.12.2012 21:48:37] [test] insert records...
[03.12.2012 21:48:37] [test] CREATE OPEN ROWSET ...OK
[03.12.2012 21:48:37] [test] "IRowsetChange"=TRUE [-1]
[03.12.2012 21:48:37] [test] "IRowsetUpdate"=TRUE [-1]
[03.12.2012 21:48:37] [test] "Append-Only Rowset"=TRUE [-1]
[03.12.2012 21:48:37] [test] OPEN TABLE [TBL_CS__OCTETS]...OK
[03.12.2012 21:48:37] [test] DESCRIBE COLUMNS ...
[03.12.2012 21:48:37] [test] insert 256 record(s)
[03.12.2012 21:48:37] [test] COMMIT CHANGE [ALL=TRUE] ...
[03.12.2012 21:48:50] [test] 1. select as [DBTYPE_BYTES]
[03.12.2012 21:48:50] ERROR: [test] RecN:30 TestID:1904019
[03.12.2012 21:48:50] ERROR: [test] Test record is not founded. RecNum: 30. TestID: 1904019
[03.12.2012 21:48:50] [test] 2. select as [DBTYPE_BYTES [DBTYPE_BYREF]]
[03.12.2012 21:48:51] ERROR: [test] RecN:30 TestID:1904019
[03.12.2012 21:48:51] ERROR: [test] Test record is not founded. RecNum: 30. TestID: 1904019
[03.12.2012 21:48:51] [test] 3. select as [DBTYPE_UI1 | DBTYPE_ARRAY]
[03.12.2012 21:48:51] ERROR: [test] RecN:30 TestID:1904019
[03.12.2012 21:48:51] ERROR: [test] Test record is not founded. RecNum: 30. TestID: 1904019
......
-------- [/thread #⁠2]

-------- [thread #⁠6]
[03.12.2012 21:45:47] * START TEST [octets|octets.003.blob.change_rs.TBL_CS__OCTETS.COL_BLOB.len_131073.bind__ss.block_7]
[03.12.2012 21:45:47] *
[03.12.2012 21:45:47] [test] insert records...
[03.12.2012 21:45:47] [test] CREATE OPEN ROWSET ...OK
[03.12.2012 21:45:47] [test] "IRowsetChange"=TRUE [-1]
[03.12.2012 21:45:47] [test] "IRowsetUpdate"=FALSE [0]
[03.12.2012 21:45:47] [test] "Append-Only Rowset"=TRUE [-1]
[03.12.2012 21:45:47] [test] OPEN TABLE [TBL_CS__OCTETS]...OK
[03.12.2012 21:45:47] [test] DESCRIBE COLUMNS ...
[03.12.2012 21:45:51] [test] insert 256 record(s)
[03.12.2012 21:45:51] [test] 1. select as [DBTYPE_BYTES]
[03.12.2012 21:45:55] [test] select 256 row(s)
.....
[03.12.2012 21:48:38] [test] delete records...
[03.12.2012 21:48:49] ERROR: [octets|octets.003.blob.change_rs.TBL_CS__OCTETS.COL_BLOB.len_131073.bind__ss.block_7] Execution of command
1. [LCPI.IBProvider.3]: Error executing SQL statement.

deadlock
page 636734, page type 5 lock denied
Unidentified error
.....
-------- [/thread #⁠6]

Database validation did not find any errors.

@firebird-automations

Commented by: @hvlad

Is it reproducible? If so, how long do you have to run the test suite to reproduce it? Is it possible to limit the whole huge test suite to the "relevant" tests only?
Thanks.

@firebird-automations

Commented by: @ibprovider

No, it is not reproducible, although I get similar "deadlock" errors more or less regularly.

But this is the first time a failure has led to this situation (record not found).

0. Threads #2 and #6 inserted test records and committed their transactions.

1. Thread #6 deletes its own records from TBL_CS__OCTETS and gets a "deadlock" error.
2. Thread #2 does not find one of its own records.

-----
After the test completed, I tried executing "select TEST_ID from tbl_cs__octets where TEST_ID+0=1904019": it returns an empty recordset...

I looked into the test code (the "select" verifications are executed in a new attachment and, of course, in a new transaction): it seems the record with ID=1904019 really was lost...

I will let you know about any other similar situations.

@firebird-automations

Modified by: @hvlad

Link: This issue relates to CORE4372 [ CORE4372 ]

@firebird-automations

Commented by: @hvlad

I have a few sets of memory and lock table dumps (consistent with each other, i.e. produced at the same time) which
show the following picture when a deadlock is detected:

memory dump:

1. Attachment att1 uses page A and wants to lock page B
2. Attachment att2 uses page B and wants to lock page C

Obviously, there is no clear reason for a deadlock, as there is no loop in the wait-for graph.

lock table:

3. Attachment att1 owns lock for page C (!) and waits for lock for page B
4. Attachment att2 owns lock for page B and waits for lock for page C

That is, the lock table shows a classic deadlock condition.
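
The difference between the two views can be illustrated with a small wait-for graph check. This is only a sketch of the reasoning, not Firebird code: the helper names `build_wait_for` and `has_cycle` are hypothetical, and the ownership sets simply restate the four numbered points from the dumps.

```python
def build_wait_for(owns, waits):
    """Return edges waiter -> owner for every awaited page that has an owner."""
    owner_of = {}
    for att, pages in owns.items():
        for page in pages:
            owner_of[page] = att
    return {att: owner_of[page] for att, page in waits.items() if page in owner_of}


def has_cycle(edges):
    """Each attachment waits for at most one other, so just follow the chain."""
    for start in edges:
        seen = set()
        node = start
        while node in edges:
            if node in seen:
                return True
            seen.add(node)
            node = edges[node]
    return False


# View from the memory dump: pages actually in use (points 1 and 2).
memory_view = build_wait_for(
    owns={"att1": {"A"}, "att2": {"B"}},
    waits={"att1": "B", "att2": "C"},
)

# View from the lock table: att1 still owns the lock for page C (points 3 and 4).
lock_table_view = build_wait_for(
    owns={"att1": {"C"}, "att2": {"B"}},
    waits={"att1": "B", "att2": "C"},
)

print(has_cycle(memory_view))      # False
print(has_cycle(lock_table_view))  # True
```

The only change between the two calls is who owns page C, which is enough to close the loop att1 -> att2 -> att1.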

Research shows that att1's page cache contains page C, which is:

a) dirty (BDB_dirty flag is set)
b) not used currently (bdb_use_count == 0)
c) a low precedence page relative to page A (!)

att1 can't downgrade the lock for page C: it must downgrade the high precedence page A first, but it can't (as page A is currently in use, see point (1) above).
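
A minimal model of this constraint, assuming a simplified rule that a lock can be given up only when the page is unused and every higher precedence page can be downgraded first. The `Page` class, its `higher` field and `can_downgrade` are hypothetical names for illustration and do not mirror the real page cache structures.

```python
from dataclasses import dataclass, field


@dataclass
class Page:
    name: str
    dirty: bool = False
    use_count: int = 0
    # Pages that must be downgraded before this one (hypothetical field).
    higher: list = field(default_factory=list)


def can_downgrade(page):
    """True if the page is unused and all higher precedence pages can go first."""
    if page.use_count > 0:
        return False
    return all(can_downgrade(h) for h in page.higher)


page_a = Page("A", use_count=1)                       # in use by att1
page_c = Page("C", dirty=True, higher=[page_a])       # dirty, unused, depends on A

print(can_downgrade(page_c))  # False: blocked by page A

page_a.use_count = 0          # att1 releases page A
print(can_downgrade(page_c))  # True
```

In this model page C stays locked for as long as att1 keeps using page A, which is exactly the window in which att2's request for page C cannot be satisfied.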

I see two different practical cases of this scenario:

- concurrent handoffs
- one attachment tries to remove an empty page from the b-tree while a second attachment inserts a key into the b-tree and handles a split at the leaf level

@firebird-automations

Commented by: @hvlad

The proposed patch is to change the AST handler (CCH\down_grade function) in the following way:

- detect the case when down_grade is called recursively for a high precedence page
- in such a case it is not required to downgrade the lock; it is enough to write the high precedence page to allow the original (low precedence) page to be downgraded
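
The idea can be sketched with a toy handler. This is not the actual patch: the `Page` class, the `nested` flag and the log strings are hypothetical, and the model assumes that an in-use high precedence page can be written safely, which the real engine has to guarantee by its own rules.

```python
class Page:
    def __init__(self, name, use_count=0, higher=()):
        self.name = name
        self.dirty = True
        self.locked = True
        self.use_count = use_count
        self.higher = list(higher)   # pages that must reach disk first


def down_grade(page, log, nested=False):
    """Toy AST handler; `nested` marks a recursive call for a high precedence page."""
    if not nested and page.use_count > 0:
        # The page whose lock was requested is busy: nothing can be done yet.
        log.append(f"defer {page.name}")
        return False
    # Higher precedence pages are only written, their locks are kept.
    for high in page.higher:
        down_grade(high, log, nested=True)
    if page.dirty:
        log.append(f"write {page.name}")
        page.dirty = False
    if not nested:
        page.locked = False
        log.append(f"release {page.name}")
    return True


page_a = Page("A", use_count=1)      # in use, high precedence
page_c = Page("C", higher=[page_a])  # unused, low precedence

log = []
down_grade(page_c, log)
print(log)                           # ['write A', 'write C', 'release C']
print(page_a.locked, page_c.locked)  # True False
```

Here page A is written but keeps its lock, so the lock for page C can be released even though page A is still in use.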

@firebird-automations

Modified by: @hvlad

Fix Version: 2.5.3 [ 10461 ]

Fix Version: 3.0 Beta 1 [ 10332 ]

@firebird-automations

Modified by: @dyemanov

Fix Version: 3.0 Beta 2 [ 10586 ]

Fix Version: 3.0 Beta 1 [ 10332 ] =>

@firebird-automations

Modified by: @dyemanov

Fix Version: 2.5.3 [ 10461 ] =>

@firebird-automations

Modified by: @dyemanov

Fix Version: 2.5.3 [ 10461 ]

@firebird-automations

Commented by: @hvlad

Until now there was no stable test case to validate the patch for fb3. Recently, with the help of Pavel Zotov, we confirmed it.

@firebird-automations

Modified by: @hvlad

status: Open [ 1 ] => Resolved [ 5 ]

resolution: Fixed [ 1 ]

@firebird-automations

Modified by: @pcisar

status: Resolved [ 5 ] => Closed [ 6 ]

@firebird-automations

Modified by: @pavel-zotov

QA Status: No test

@firebird-automations

Modified by: @pavel-zotov

status: Closed [ 6 ] => Closed [ 6 ]

QA Status: No test => Cannot be tested
