Issue Details (XML | Word | Printable)

Key: CORE-2936
Type: Bug Bug
Status: Closed Closed
Resolution: Fixed
Priority: Major Major
Assignee: Vlad Khorsun
Reporter: Vlad Khorsun
Votes: 0
Watchers: 0

If you were logged in you would be able to see more operations.
Firebird Core

Wrong page type (expected 7 found N) error

Created: 22/Mar/10 10:17 PM   Updated: 28/Jul/15 07:08 PM
Component/s: Engine
Affects Version/s: 2.0.0, 1.5.4, 2.0.1, 2.0.2, 2.0.3, 1.5.5, 2.1.0, 2.0.4, 2.5 Alpha 1, 2.1.1, 2.0.5, 2.1.2, 2.5 Beta 1, 2.5 Beta 2, 2.1.3, 1.5.6, 2.5 RC1, 2.5 RC2
Fix Version/s: 2.0.6, 2.5 RC3, 2.1.4, 3.0 Alpha 1

QA Status: Cannot be tested

 Description  « Hide
  If two consecutive leaf index pages are removed from index (garbage collected) by two different connections at the same time, then linked list of sibling pages could became broken and sibling pointer at other index page could point to the freed index page. When freed page is allocated again index corruption will be reported. Below is more detailed explanation.

  BTR\garbage_collect() should remove some index page from b-tree. This is not simple process as we should be careful about deadlocks and lock pages in special order.

  The order is : parent page, left sibling, page to be removed (gc_page) and right sibling. Parent page number supplied as input parameter of garbage_collect() and left sibling page number we took from initially locked gc_page and store in local variable.

  As said above, we should take page locks in order, therefore at first we do release gc_page. Then we fetch parent page. Parent page could be changed or even re-allocated since we last visit it. Therefore care is taken to ensure that page still is index page and belongs to the same index at the same level :

// fetch the parent page, but we have to be careful, because it could have
// been garbage-collected when we released it--make checks so that we know it
// is the parent page; there is a minute possibility that it could have been
// released and reused already as another page on this level, but if so, it
// won't really matter because we won't find the node on it
WIN parent_window(pageSpaceID, parent_number);
btree_page* parent_page = (btree_page*) CCH_FETCH(tdbb, &parent_window, LCK_write, pag_undefined);
if ((parent_page->btr_header.pag_type != pag_index) ||
    (parent_page->btr_relation != relation_number) ||
    (parent_page->btr_id != (UCHAR)(index_id % 256)) ||
    (parent_page->btr_level != index_level + 1))
    CCH_RELEASE(tdbb, &parent_window);
    return contents_above_threshold;

    Seems good and correct, isn't is ?

    But if page was released and is not reused it still will look the same, i.e. as correct index page from the same index !

    The same about left sibling page. It fetched by its number, stored in variable, and then the same validation is performed. Again, if left sibling page was released but still not reallocated it will look correct and at the end of removing gc_page we will update sibling pointer at wrong left page. And correct left page will still have sibling pointer pointing to just released gc_page.

    To fix this bug i offer to introduce new flag for index pages - btr_released. This flag will be set at removed page just before call of PAG_release_page(). And check for this flag will be added to the already existing checks for validity of just fetched page.

    When removed index page will be reallocated, CCH_fake() will clear page image and this flag also, so no special action is needed to clear this flag.

 All   Comments   Change History   Subversion Commits      Sort Order: Ascending order - Click to sort in descending order
There are no comments yet on this issue.