
Key: CORE-2848
Type: Bug
Status: Closed
Resolution: Fixed
Priority: Major
Assignee: Vlad Khorsun
Reporter: Vlad Khorsun
Votes: 1
Watchers: 5
Firebird Core

"lock conversion denied" or "lock denied" error

Created: 05/Feb/10 07:27 AM   Updated: 29/May/16 10:47 PM
Component/s: Engine
Affects Version/s: 2.5 Alpha 1, 2.5 Beta 1, 2.5 Beta 2, 2.5 RC1, 2.5 RC2
Fix Version/s: 2.5.3, 3.0 Beta 2

Environment: Classic Server or SuperClassic architecture. SuperServer is not affected.
Issue Links:
Relate
 

QA Status: Cannot be tested


Description
The error was found while developing the MT-safe page cache in v3.0.
It is reproducible under relatively high load: TPCC with 64 concurrent connections, not disk bound.
It is currently under investigation whether Firebird versions prior to v2.5 are affected.

Comments (sort order: descending)
Vlad Khorsun added a comment - 24/Jan/15 11:53 AM
So far there was no stable test case to validate the patch for fb3. Recently, with the help of Pavel Zotov, we confirmed it.

Vlad Khorsun added a comment - 25/Mar/14 01:28 PM
    The proposed patch changes the AST handler (the CCH\down_grade function) in the following way:
- detect the case when down_grade is called recursively for a high precedence page
- in such a case it is not required to downgrade the lock; it is enough to write the high precedence
  page to allow the original (low precedence) page to be downgraded
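
    The idea can be illustrated with a toy model. The following Python sketch is not Firebird code: the `Page` class, `write_page`, `down_grade_old` and `down_grade_new` are hypothetical names invented for illustration, and only `bdb_use_count`, the dirty flag and the precedence relation come from this ticket. It assumes that the high precedence page is dirty and that a page which is currently in use can still be written (only releasing its lock is refused); the real engine's rules may well differ.

```python
from dataclasses import dataclass, field


@dataclass
class Page:
    name: str
    dirty: bool = False
    use_count: int = 0        # models bdb_use_count
    lock_held: bool = True    # this attachment holds the page lock
    higher: list = field(default_factory=list)  # pages that must be written first


def write_page(page):
    """Toy write: flush higher precedence pages first, then this page."""
    for hp in page.higher:
        if hp.dirty:
            write_page(hp)
    page.dirty = False


def down_grade_old(page):
    """Assumed pre-patch behaviour: every page in the chain is downgraded."""
    if page.use_count:
        return False          # page is in use, its lock cannot be released
    for hp in page.higher:
        if hp.dirty and not down_grade_old(hp):
            return False      # blocked by a high precedence page
    page.dirty = False
    page.lock_held = False
    return True


def down_grade_new(page, recursive=False):
    """Patched idea: a recursive call only writes the page and keeps its lock."""
    if recursive:
        write_page(page)
        return True
    if page.use_count:
        return False
    for hp in page.higher:
        if hp.dirty:
            down_grade_new(hp, recursive=True)
    page.dirty = False
    page.lock_held = False
    return True


def scenario():
    # Page A is in use and has high precedence over the unused dirty page C.
    a = Page("A", dirty=True, use_count=1)
    c = Page("C", dirty=True, higher=[a])
    return a, c


a_old, c_old = scenario()
old_ok = down_grade_old(c_old)   # blocked: A is in use

a_new, c_new = scenario()
new_ok = down_grade_new(c_new)   # A is written, C's lock is released
```

    In this model the old variant leaves the lock on the low precedence page held, while the new variant releases it and leaves the lock on the high precedence page untouched.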

Vlad Khorsun added a comment - 25/Mar/14 01:27 PM
  I have a few sets of memory and lock table dumps (consistent with each other, i.e. produced at the same time) which
show the following picture when the deadlock is detected:

    memory dump:
1. Attachment att1 uses page A and wants to lock page B
2. Attachment att2 uses page B and wants to lock page C

    Obviously, there is no clear reason for a deadlock, as there is no loop in the wait-for graph.

    lock table:
3. Attachment att1 owns the lock for page C (!) and waits for the lock for page B
4. Attachment att2 owns the lock for page B and waits for the lock for page C

i.e. here we have a classical deadlock condition.
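
    The difference between the two views can be checked mechanically. The following is a minimal Python sketch, not Firebird code: it builds a wait-for graph from each view and looks for a cycle. The helper names (`build_wait_for`, `has_cycle`) and the dictionary layout are assumptions made for illustration; only the att1/att2 and page A/B/C relationships are taken from points 1-4 above.

```python
def build_wait_for(owns, waits):
    """Return edges waiter -> owner for every page that is waited on."""
    graph = {att: set() for att in owns}
    for waiter, page in waits.items():
        for owner, pages in owns.items():
            if owner != waiter and page in pages:
                graph[waiter].add(owner)
    return graph


def has_cycle(graph):
    """Depth-first search for a cycle in a directed graph."""
    WHITE, GREY, BLACK = 0, 1, 2
    color = {node: WHITE for node in graph}

    def visit(node):
        color[node] = GREY
        for nxt in graph[node]:
            if color[nxt] == GREY:
                return True          # back edge: cycle found
            if color[nxt] == WHITE and visit(nxt):
                return True
        color[node] = BLACK
        return False

    return any(color[n] == WHITE and visit(n) for n in graph)


# View 1: memory dump (pages in use by each attachment), points 1-2.
memory_owns = {"att1": {"A"}, "att2": {"B"}}
# View 2: lock table (page locks actually owned), points 3-4.
lock_owns = {"att1": {"C"}, "att2": {"B"}}
# Both views agree on what each attachment waits for.
waits = {"att1": "B", "att2": "C"}

memory_graph = build_wait_for(memory_owns, waits)
lock_graph = build_wait_for(lock_owns, waits)

print("memory dump view has cycle:", has_cycle(memory_graph))
print("lock table view has cycle:", has_cycle(lock_graph))
```

    Only the lock table view contains the edge att2 -> att1, because only there does att1 appear as the owner of page C.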

    Research shows that att1 holds in its own page cache page C, which is:
a) dirty (BDB_dirty flag is set)
b) not used currently (bdb_use_count == 0)
c) a low precedence page with regard to page A (!)

    att1 can't downgrade the lock for page C: it must downgrade high precedence
page A first, but it can't (as page A is currently in use, see point (1) above).


    I see two different practical cases of this scenario:
- concurrent handoffs
- one attachment tries to remove an empty page from the b-tree, while a second
  attachment inserts a key into the b-tree and handles a split at the leaf level

Kovalenko Dmitry added a comment - 04/Dec/12 10:51 AM - edited
No. Not reproducible. Although I get similar "deadlock" errors more or less regularly.

But this is the first time it failed in such a way (record not found).

0. threads #2 and #6 inserted test records and committed their transactions.

1. thread #6 deletes its own records from TBL_CS__OCTETS and gets a "deadlock" error
2. thread #2 does not find one of its own records

-----
After test completion I tried to execute "select TEST_ID from tbl_cs__octets where TEST_ID+0=1904019": it returns an empty recordset...

I looked into the test code (the "select" verifications are executed in a new attachment and, of course, in a new transaction): it seems the record with ID=1904019 really was lost...

I will let you know about any other similar situation.

Vlad Khorsun added a comment - 04/Dec/12 10:02 AM
Is it reproducible? If yes, how long do you run the test suite to reproduce it? Is it possible to limit the whole huge test suite to "relevant" tests only?
Thanks.

Kovalenko Dmitry added a comment - 04/Dec/12 08:41 AM
[03.12.2012 20:07:45] [info] Provider DLL :_IBProvider_v3_vc10_i.dll
[03.12.2012 20:07:45] [info] Provider Version:3.13.2.14113
[03.12.2012 20:07:45] [info] Server Name :Firebird (x64 SuperClassic)
[03.12.2012 20:07:45] [info] Server Version :2.5.3.26543
[03.12.2012 20:07:45] [info] Client Name :Firebird SQL Server
[03.12.2012 20:07:45] [info] Client Version :2.5.3.26543
[03.12.2012 20:07:45] [info] Database ODS :11.2
[03.12.2012 20:07:45] [info] Database Dialect:1
...
[03.12.2012 20:07:45] [info] ConnectionString:
provider=LCPI.IBProvider.3;location=localhost:d:\database\ibp_test_fb25_d1_2.gdb;user id=gamer;password=vermut;ctype=win1251;icu_library=icuuc30.dll
...
[03.12.2012 20:07:45] Creation 8 thread(s)...

--------- [server log]
HOME2 Mon Dec 03 21:48:48 2012
Database: D:\DATABASE\IBP_TEST_FB25_D1_2.GDB
page 487861, page type 4 lock denied (216)

HOME2 Mon Dec 03 21:48:49 2012
Database: D:\DATABASE\IBP_TEST_FB25_D1_2.GDB
page 636734, page type 5 lock denied (216)
--------- [/server log]

Source code of test: ibp_tso_octets_003_blobs.cpp

-------- [thread #2]
[03.12.2012 21:48:37] * START TEST [octets|octets.003.blob.update_rs.TBL_CS__OCTETS.COL_BLOB.len_65537.bind__bytes]
[03.12.2012 21:48:37] *
[03.12.2012 21:48:37] [test] insert records...
[03.12.2012 21:48:37] [test] CREATE OPEN ROWSET ...OK
[03.12.2012 21:48:37] [test] "IRowsetChange"=TRUE [-1]
[03.12.2012 21:48:37] [test] "IRowsetUpdate"=TRUE [-1]
[03.12.2012 21:48:37] [test] "Append-Only Rowset"=TRUE [-1]
[03.12.2012 21:48:37] [test] OPEN TABLE [TBL_CS__OCTETS]...OK
[03.12.2012 21:48:37] [test] DESCRIBE COLUMNS ...
[03.12.2012 21:48:37] [test] insert 256 record(s)
[03.12.2012 21:48:37] [test] COMMIT CHANGE [ALL=TRUE] ...
[03.12.2012 21:48:50] [test] 1. select as [DBTYPE_BYTES]
[03.12.2012 21:48:50] ERROR: [test] RecN:30 TestID:1904019
[03.12.2012 21:48:50] ERROR: [test] Test record is not founded. RecNum: 30. TestID: 1904019
[03.12.2012 21:48:50] [test] 2. select as [DBTYPE_BYTES [DBTYPE_BYREF]]
[03.12.2012 21:48:51] ERROR: [test] RecN:30 TestID:1904019
[03.12.2012 21:48:51] ERROR: [test] Test record is not founded. RecNum: 30. TestID: 1904019
[03.12.2012 21:48:51] [test] 3. select as [DBTYPE_UI1 | DBTYPE_ARRAY]
[03.12.2012 21:48:51] ERROR: [test] RecN:30 TestID:1904019
[03.12.2012 21:48:51] ERROR: [test] Test record is not founded. RecNum: 30. TestID: 1904019
......
-------- [/thread #2]

-------- [thread #6]
[03.12.2012 21:45:47] * START TEST [octets|octets.003.blob.change_rs.TBL_CS__OCTETS.COL_BLOB.len_131073.bind__ss.block_7]
[03.12.2012 21:45:47] *
[03.12.2012 21:45:47] [test] insert records...
[03.12.2012 21:45:47] [test] CREATE OPEN ROWSET ...OK
[03.12.2012 21:45:47] [test] "IRowsetChange"=TRUE [-1]
[03.12.2012 21:45:47] [test] "IRowsetUpdate"=FALSE [0]
[03.12.2012 21:45:47] [test] "Append-Only Rowset"=TRUE [-1]
[03.12.2012 21:45:47] [test] OPEN TABLE [TBL_CS__OCTETS]...OK
[03.12.2012 21:45:47] [test] DESCRIBE COLUMNS ...
[03.12.2012 21:45:51] [test] insert 256 record(s)
[03.12.2012 21:45:51] [test] 1. select as [DBTYPE_BYTES]
[03.12.2012 21:45:55] [test] select 256 row(s)
.....
[03.12.2012 21:48:38] [test] delete records...
[03.12.2012 21:48:49] ERROR: [octets|octets.003.blob.change_rs.TBL_CS__OCTETS.COL_BLOB.len_131073.bind__ss.block_7] Execution of command
1. [LCPI.IBProvider.3]: Error executing SQL statement.

deadlock
page 636734, page type 5 lock denied
Unrecognized error
.....
-------- [/thread #6]

Validation of the database did not find any errors.