
Key: CORE-1770
Type: Bug
Status: Closed
Resolution: Fixed
Priority: Critical
Assignee: Alexander Peshkov
Reporter: Alexander Peshkov
Votes: 0
Watchers: 0

Firebird Core

Bugcheck 291 in DDL

Created: 04/Mar/08 03:07 AM   Updated: 12/Nov/09 04:10 PM
Component/s: Engine
Affects Version/s: 2.5 Initial
Fix Version/s: 2.5 Alpha 1

Time Tracking:
Not Specified

Environment: Primary superserver on multicore/SMP system, though can show itself in any environment.
Issue Links:

Planning Status: Unspecified

Description
This is actually a very old bug, but it started to show itself actively only in 2.5, most likely because there are now more places where threads can run concurrently in the engine. I will describe it using a simple example that is not widely used today, but it can obviously happen in other cases too.

Imagine that some system relation (for example, rdb$database) is modified in a user transaction (this really happens in GPRE-preprocessed programs that set database properties). The transaction is committed, and an old version of the record remains on disk. Right after that, CREATE TABLE is issued, which modifies the same relation rdb$database in the system transaction. The system transaction always uses update_in_place() to update a record. But while it does so, another thread (on SS this is the GC thread, but it can just as easily be any regular attachment reading rdb$database) removes the old, unused version of the record. At that moment the system transaction fails with bugcheck(291): no record version found.
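
The interleaving described above can be illustrated with a toy model. This is only a sketch under stated assumptions, not engine code: `Record`, `RecordVersion`, `InPlaceUpdate`, `collectGarbage`, `beginUpdate`, `finishUpdate` and `runScenario` are hypothetical names, the version chain is a plain vector, and the two threads are replaced by a deterministic ordering of steps so that the failure is reproducible.

```cpp
#include <cassert>
#include <stdexcept>
#include <string>
#include <vector>

// Toy model only: a record is a chain of versions, newest first.
struct RecordVersion {
    unsigned long txn;   // number of the transaction that wrote this version
    std::string data;
};

struct Record {
    std::vector<RecordVersion> chain;  // chain[0] is the primary version
};

// Stand-in for the GC thread (or any reader that cleans up on the way):
// drops every version older than the primary one.
void collectGarbage(Record& rec) {
    if (rec.chain.size() > 1)
        rec.chain.erase(rec.chain.begin() + 1, rec.chain.end());
}

// State the updater carries between its two steps.
struct InPlaceUpdate {
    Record& rec;
    bool expectsBackVersion;  // what the updater saw in step 1
};

// Step 1: the updater inspects the record and notes that a back version exists.
InPlaceUpdate beginUpdate(Record& rec) {
    assert(!rec.chain.empty());
    return InPlaceUpdate{rec, rec.chain.size() > 1};
}

// Step 2: the updater relies on what it saw in step 1. If the back version
// has vanished in between, the toy "bugcheck" is raised.
void finishUpdate(InPlaceUpdate& upd, const std::string& newData) {
    if (upd.expectsBackVersion && upd.rec.chain.size() < 2)
        throw std::runtime_error("bugcheck 291: no record version found");
    upd.rec.chain[0].data = newData;  // overwrite in place, no new version
}

// Returns true if the in-place update succeeds, false if it hits the bugcheck.
bool runScenario(bool gcInterleaves) {
    Record rec;
    rec.chain.push_back({42, "new"});  // committed by a user transaction
    rec.chain.push_back({17, "old"});  // old version still on disk

    InPlaceUpdate upd = beginUpdate(rec);
    if (gcInterleaves)
        collectGarbage(rec);           // runs between the updater's two steps
    try {
        finishUpdate(upd, "newer");
        return true;
    } catch (const std::runtime_error&) {
        return false;
    }
}
```

In this model `runScenario(false)` succeeds and `runScenario(true)` fails: the failure depends only on whether the cleanup runs between the updater's two steps, which is why the problem shows up more readily once more threads can run concurrently in the engine.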

Alexander Peshkov added a comment - 04/Mar/08 03:12 AM
I have a short, working fix for this issue, but it is really a hack. The correct fix would be to separate the two "magics" of the system transaction: dirty read/write and always using update_in_place().

Alexander Peshkov added a comment - 17/Mar/08 12:55 PM
Historically we have two ways to detect whether the current transaction is the system one: the fixed number 0 and the flag TRA_system. Initially I was going to use only the TRA_system flag and separate it into two logical features: dirty access to data and all updates in place. But a review of the code showed that in some places (mainly in the ODS, where the transaction number is stored in the record version) there is no information about a transaction other than its number. Therefore I had to go another way, and one possibility was to modify system data in the user transaction at the deferred work phase. Note that in general this is not a crime: a lot of such modifications are performed in dyn*.epp.
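
To make the constraint concrete, here is a minimal sketch of the two detection methods. It is an illustration only: `Transaction`, `StoredVersionHeader`, `kSystemFlag` and the helper functions are hypothetical names, and the flag value is made up rather than taken from the engine sources. The point is that code holding only a stored record version can apply the number test but has no flags to inspect.

```cpp
#include <cstdint>

// Hypothetical bit standing in for the TRA_system flag; the real value
// is defined in the engine sources and is not reproduced here.
constexpr std::uint32_t kSystemFlag = 0x1;

// In-memory transaction object: both the number and the flags are available.
struct Transaction {
    std::uint64_t number;
    std::uint32_t flags;
};

// What a stored record version knows about its writer in this toy model:
// the transaction number and nothing else.
struct StoredVersionHeader {
    std::uint64_t txnNumber;
};

// Detection by flag: needs a live transaction object.
bool isSystemByFlag(const Transaction& t) {
    return (t.flags & kSystemFlag) != 0;
}

// Detection by number: works wherever only the number is known.
bool isSystemByNumber(std::uint64_t number) {
    return number == 0;
}

// On a stored version, the number test is the only one that can be applied.
bool writtenBySystem(const StoredVersionHeader& h) {
    return isSystemByNumber(h.txnNumber);
}
```

Splitting the flag into finer-grained features would change what `isSystemByFlag` can express, but `writtenBySystem` would still see only the number, which matches the reason given above for taking a different route.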

Most changes in deferred work are made to an object whose name matches dfw_name, i.e. an object directly related to the transaction for which we perform deferred work (or to closely related objects, such as index segments). The only place where a dirty write to system tables seemed important to me is adding new files to the database. Luckily, that has always been done exclusively. I found two places where DFW requires dirty reads: relation ID generation (implemented in the metadata cache, i.e. in transaction 0) and access to RDB$FIELDS when creating an index (needed to support a kind of hack in gbak that makes it possible to use PLAN in procedures). In the latter case I had to split the FOR loop in dfw.epp so that the read access stays in the system transaction while the modification comes from the user one.