If two consecutive leaf index pages are removed from index (garbage collected) by two different connections at the same time, then linked list of sibling pages could became broken and sibling pointer at other index page could point to the freed index page. When freed page is allocated again index corruption will be reported. Below is more detailed explanation.
BTR\garbage_collect() should remove some index page from b-tree. This is not simple process as we should be careful about deadlocks and lock pages in special order.
The order is : parent page, left sibling, page to be removed (gc_page) and right sibling. Parent page number supplied as input parameter of garbage_collect() and left sibling page number we took from initially locked gc_page and store in local variable.
As said above, we should take page locks in order, therefore at first we do release gc_page. Then we fetch parent page. Parent page could be changed or even re-allocated since we last visit it. Therefore care is taken to ensure that page still is index page and belongs to the same index at the same level :
// fetch the parent page, but we have to be careful, because it could have
// been garbage-collected when we released it--make checks so that we know it
// is the parent page; there is a minute possibility that it could have been
// released and reused already as another page on this level, but if so, it
// won't really matter because we won't find the node on it
WIN parent_window(pageSpaceID, parent_number);
btree_page* parent_page = (btree_page*) CCH_FETCH(tdbb, &parent_window, LCK_write, pag_undefined);
if ((parent_page->btr_header.pag_type != pag_index) ||
(parent_page->btr_relation != relation_number) ||
(parent_page->btr_id != (UCHAR)(index_id % 256)) ||
(parent_page->btr_level != index_level + 1))
Seems good and correct, isn't is ?
But if page was released and is not reused it still will look the same, i.e. as correct index page from the same index !
The same about left sibling page. It fetched by its number, stored in variable, and then the same validation is performed. Again, if left sibling page was released but still not reallocated it will look correct and at the end of removing gc_page we will update sibling pointer at wrong left page. And correct left page will still have sibling pointer pointing to just released gc_page.
To fix this bug i offer to introduce new flag for index pages - btr_released. This flag will be set at removed page just before call of PAG_release_page(). And check for this flag will be added to the already existing checks for validity of just fetched page.
When removed index page will be reallocated, CCH_fake() will clear page image and this flag also, so no special action is needed to clear this flag.