Lower level index pages is missed from parent page [CORE1300] #717

Closed
firebird-automations opened this issue Jun 3, 2007 · 13 comments

@firebird-automations
Collaborator

Submitted by: @hvlad

Jira_subtask_outward CORE1819
Relate to CORE1819
Is related to CORE1299

Sometimes a parent page is missing nodes that point to lower-level pages.
For example, the parent page has nodes A and B, while the lower-level pages are linked as A, C, B.

This error can be detected by gfix. For each such case it reports two errors in firebird.log for the same index page:

Index XXX is corrupt on page YYY level 1. File: \Fb2\fb2.0\src\jrd\validation.cpp, line: 1656
Index XXX is corrupt on page YYY level 1. File: \Fb2\fb2.0\src\jrd\validation.cpp, line: 1646

(line numbers as per the 2.0 sources)
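The inconsistency described above (parent nodes A, B while the lower-level pages are linked as A, C, B) can be illustrated with a minimal sketch. It is a toy model only, not the check that gfix or validation.cpp performs; the function name and the dictionary of sibling links are hypothetical.

```python
def missing_from_parent(parent_nodes, first_child, next_sibling):
    """Walk the sibling chain of the lower level and collect every page
    that the parent page has no node for."""
    missing = []
    seen = set()
    page = first_child
    while page is not None and page not in seen:  # guard against cycles
        seen.add(page)
        if page not in parent_nodes:
            missing.append(page)
        page = next_sibling.get(page)
    return missing


# The example from the report: parent knows A and B, siblings are A -> C -> B.
parent_nodes = ["A", "B"]
next_sibling = {"A": "C", "C": "B", "B": None}
print(missing_from_parent(parent_nodes, "A", next_sibling))  # ['C']
```

A consistent index would give an empty list here, since every page in the sibling chain would also have a node in the parent.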

This error itself can't lead to wrong query results, AFAIK, but I don't know whether it can lead to more serious corruption if it stays present in an actively modified index for a long time.

This bug can also produce 'wrong page type expected 7 found XXX' bugchecks.

Commits: 2b31537 6602820 259fafa b0874ce 04f367d

@firebird-automations
Collaborator Author

Modified by: @hvlad

Link: This issue is related to CORE1299 [ CORE1299 ]

@firebird-automations
Collaborator Author

Commented by: @hvlad

Both bugs are reported by gfix in a similar way and were found using the same test case.

@firebird-automations
Collaborator Author

Modified by: @hvlad

status: Open [ 1 ] => Resolved [ 5 ]

resolution: Fixed [ 1 ]

Fix Version: 2.0.2 [ 10130 ] =>

@firebird-automations
Collaborator Author

Commented by: @hvlad

Below is an explanation of the bug and the solution:

When we add an entry to the index, we search down through a b-tree branch to the leaf page and insert the node there. If the leaf page splits, we add the new page number one level up.

The upper page number is remembered in a variable ("index" in add_node) before we hand off from this page down to the leaf page.

After that we re-fetch the upper page by the remembered page number and add the split page number to it. Note that we don't retain a lock on the upper page while inserting the key into the leaf page.

But this upper page can be removed from the index by the time we finish the split at the lower level, so we insert the split page number into a removed page.

If this removed page has been re-allocated at this point (not the case for this bug report), we may get a 'wrong page type expected 7 found xxx' error if it was re-allocated as a non-index page. In my test case the page is not re-allocated, so the test completes without such errors.
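The sequence above can be replayed in a small single-threaded sketch. Everything here is a hypothetical toy model, not Firebird code: the `Page` class, the `meanwhile` hook that stands in for other activity while the parent is unlocked, page number 10 and the non-index type value 99 are all assumptions for illustration. Only the expected type 7 comes from the error message quoted in this report.

```python
INDEX_PAGE_TYPE = 7  # the type the quoted error message says was expected


class Page:
    def __init__(self, number):
        self.number = number
        self.page_type = INDEX_PAGE_TYPE
        self.linked_in_index = True  # still reachable from the b-tree
        self.entries = []            # lower-level page numbers


def new_pages():
    """One parent index page with the arbitrary number 10."""
    return {10: Page(10)}


def fetch(pages, number, expected_type):
    """Fetch a page by number and check its type, as a page cache might."""
    page = pages[number]
    if page.page_type != expected_type:
        raise RuntimeError(
            f"wrong page type expected {expected_type} found {page.page_type}"
        )
    return page


def remove_parent(pages):
    # The parent is dropped from the index but not reused yet.
    pages[10].linked_in_index = False


def remove_and_reallocate(pages):
    # The parent is dropped and reused as some non-index page.
    pages[10].linked_in_index = False
    pages[10].page_type = 99  # arbitrary non-index type, for illustration


def add_split_page(pages, remembered_parent, split_page, meanwhile=None):
    """Re-fetch the parent by its remembered number and add the split page.
    Returns False when the entry landed in a page no longer in the index."""
    if meanwhile is not None:
        meanwhile(pages)  # runs while no lock is held on the parent
    parent = fetch(pages, remembered_parent, INDEX_PAGE_TYPE)
    parent.entries.append(split_page)
    return parent.linked_in_index


print(add_split_page(new_pages(), 10, 42))                 # True
print(add_split_page(new_pages(), 10, 42, remove_parent))  # False
```

Passing `remove_and_reallocate` as the hook makes `fetch` raise instead, which corresponds to the 'wrong page type' case that did not occur in the test case for this report.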

I can offer 3 ways to solve the issue:

a) Mark the parent page with the btr_dont_gc flag before CCH_HANDOFF and clear the mark after returning from add_node.

Easy to implement, but it makes additional page fetches with an LCK_write lock, which are not necessary in most cases.
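As a rough illustration of option (a), the sketch below uses a boolean flag that a garbage collector must respect. It is a toy model under stated assumptions: `IndexPage`, `try_garbage_collect`, `add_node_protected` and the `meanwhile` hook are hypothetical names, and the real btr_dont_gc handling in the engine is not shown in this report.

```python
class IndexPage:
    def __init__(self, number):
        self.number = number
        self.dont_gc = False  # stands in for the btr_dont_gc flag
        self.removed = False
        self.entries = []


def try_garbage_collect(page):
    """Remove the page from the index unless it is flagged.
    Returns True if the page was removed."""
    if page.dont_gc:
        return False
    page.removed = True
    return True


def add_node_protected(parent, split_page, meanwhile=None):
    """Flag the parent before descending, clear the flag afterwards.
    Returns True when the parent is still part of the index."""
    parent.dont_gc = True  # in the engine this costs an extra write-locked fetch
    try:
        if meanwhile is not None:
            meanwhile(parent)  # concurrent activity while the parent is unlocked
        parent.entries.append(split_page)
    finally:
        parent.dont_gc = False
    return not parent.removed


parent = IndexPage(10)
print(add_node_protected(parent, 42, try_garbage_collect))  # True
print(try_garbage_collect(parent))                          # True
```

The second call shows that the page becomes collectable again once the flag is cleared.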

b) Retain the LCK_read lock on the parent page, replace CCH_HANDOFF with CCH_FETCH and remove the last CCH_FETCH.

Also easy to implement, but it keeps more than one page locked. I doubt it can lead to deadlocks, as page locks are acquired in strict order and we have a similar locking scheme in btr\garbage_collect.
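The ordering argument in option (b) can be sketched with plain mutexes: if every worker takes the parent lock before the leaf lock and never the other way round, the workers cannot deadlock on these two locks. This is only an analogy under stated assumptions. Python's `threading.Lock` is exclusive, whereas the report speaks of an LCK_read lock, and the function names are hypothetical.

```python
import threading


def insert_holding_parent(parent_lock, leaf_lock, log, name):
    # Always parent first, then leaf; the parent stays locked during the leaf work.
    with parent_lock:
        with leaf_lock:
            log.append(name)


def run_concurrent_inserts(workers=4):
    """Run several inserts in parallel.
    Returns the sorted list of finished workers and whether any worker is stuck."""
    parent_lock = threading.Lock()
    leaf_lock = threading.Lock()
    log = []
    threads = [
        threading.Thread(
            target=insert_holding_parent,
            args=(parent_lock, leaf_lock, log, i),
        )
        for i in range(workers)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join(timeout=5)
    return sorted(log), any(t.is_alive() for t in threads)


finished, stuck = run_concurrent_inserts()
print(finished, stuck)
```

The sketch also hints at the cost noted above: while one worker holds the parent, every other worker that needs the same parent has to wait.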

c) Detect a parent page change and search for the correct insertion point starting from the root page.

Harder to implement, and unnecessary in most cases, as the parent page can be changed many times and still stay in the index at its original place.
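A minimal sketch of option (c), assuming a per-page change counter that the report does not mention: the remembered parent is reused only if its counter is unchanged, otherwise the insertion point is searched again from the root. `Branch`, `version`, `find_parent` and `parent_for_insert` are hypothetical names for illustration.

```python
import bisect


class Branch:
    def __init__(self, keys, children):
        self.keys = keys          # sorted separator keys
        self.children = children  # len(keys) + 1 entries: Branch or leaf page number
        self.version = 0          # hypothetical change counter


def find_parent(root, key):
    """Descend from the root to the lowest branch page that covers key."""
    node = root
    while True:
        child = node.children[bisect.bisect_right(node.keys, key)]
        if not isinstance(child, Branch):
            return node
        node = child


def parent_for_insert(root, remembered, remembered_version, key):
    """Reuse the remembered parent if it is unchanged, else search from the root."""
    if remembered.version == remembered_version:
        return remembered
    return find_parent(root, key)


left = Branch([10], [1, 2])
right = Branch([70], [3, 4])
root = Branch([50], [left, right])

remembered, seen_version = find_parent(root, 5), left.version
assert parent_for_insert(root, remembered, seen_version, 5) is left

# Simulate removal of the remembered page: its leaves move to `right`
# and the root no longer points at it.
right.keys, right.children = [10, 50, 70], [1, 2, 3, 4]
root.keys, root.children = [], [right]
left.version += 1
assert parent_for_insert(root, remembered, seen_version, 5) is right
```

A plain counter re-searches on every change, including the many changes that leave the parent in place, which is the overhead the option's description points out.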

I've implemented both (a) and (b) and found that (b) is much worse from a performance point of view.
Ann also prefers (a), so solution (a) has been committed to CVS.

@firebird-automations
Collaborator Author

Modified by: @hvlad

summary: Lower level index pages in missed from parent page => Lower level index pages is missed from parent page

@firebird-automations
Collaborator Author

Modified by: @pcisar

status: Resolved [ 5 ] => Closed [ 6 ]

@firebird-automations
Collaborator Author

Modified by: @hvlad

Fix Version: 2.0.2 [ 10130 ]

@firebird-automations
Collaborator Author

Commented by: maziar (maziar)

I'm not sure this is fixed; I still see this error in Firebird 2.1 beta 2!

@firebird-automations
Collaborator Author

Commented by: @hvlad

Without a reproducible example, I'm afraid I can't do much more than has already been done.

@firebird-automations
Collaborator Author

Modified by: @pcisar

Workflow: jira [ 12273 ] => Firebird [ 15601 ]

@firebird-automations
Collaborator Author

Modified by: Sean Leyne (seanleyne)

Link: This issue relate to CORE1819 [ CORE1819 ]

@firebird-automations
Collaborator Author

Modified by: @pavel-zotov

QA Status: No test

@firebird-automations
Collaborator Author

Modified by: @pavel-zotov

status: Closed [ 6 ] => Closed [ 6 ]

QA Status: No test => Cannot be tested
