Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Attempt to read after the end of file when nbackup state is stalled [CORE3468] #3829

Closed
firebird-automations opened this issue May 6, 2011 · 6 comments

Comments

@firebird-automations
Copy link
Collaborator

Submitted by: @hvlad

Relate to CORE3792

When relation is extended new page is allocated (and faked) and passed into DPM\extend_relation() function.
Then it is released, and free slot on some pointer page is searched.
When free slot is found just allocated page is fetched again and linked into pointer page.

When nbackup state is stalled database file pre-allocation feature is disabled to left database file in consistent
state while it is copied for backup purpose. Therefore new pages could be allocated after end of database file.
As all new pages will be written into delta and later merged back into database file this is OK.

The issue is that new data page buffer could be evicted from page cache after it was released at the start of
extend_relation(). When page is fetched again it should be read from disk. In stalled state page first reads
from delta. Delta have preallocated space and read succeeds. As page was never written its image contains
zero's only. In this case attempt to read page from database file is performed.

Here we have IO read error : attempt to read after the end of file.

Commits: bd96e2c e67ae1a c208eaf b1c6cfb 4bbd667 c508121 542d70a

@firebird-automations
Copy link
Collaborator Author

Modified by: @hvlad

assignee: Vlad Khorsun [ hvlad ]

@firebird-automations
Copy link
Collaborator Author

Commented by: @hvlad

The simplest solution is to use CCH_fake instead of CCH_fetch when new allocated page is fetched second time.
CCH_fake doesn't reads page image from disk and there can't be any risk that this just allocated page is used by
someone else.

Another solution is to format page header before release. This will force page write in case of eviction and later
half-correct image will be read from delta. This is enough to not make attemp to read this page from database
and avoid potential IO read error.

I used first solution as it is simple and have no IO overhead.

@firebird-automations
Copy link
Collaborator Author

Modified by: @hvlad

status: Open [ 1 ] => Resolved [ 5 ]

resolution: Fixed [ 1 ]

Fix Version: 2.5.1 [ 10333 ]

Fix Version: 3.0 Alpha 1 [ 10331 ]

Fix Version: 2.1.5 [ 10420 ]

@firebird-automations
Copy link
Collaborator Author

Modified by: @dyemanov

Link: This issue relate to CORE3792 [ CORE3792 ]

@firebird-automations
Copy link
Collaborator Author

Commented by: @hvlad

> The issue is that new data page buffer could be evicted from page cache after it was released at the start of
> extend_relation(). When page is fetched again it should be read from disk. In stalled state page first reads
> from delta. Delta have preallocated space and read succeeds. As page was never written its image contains
> zero's only. In this case attempt to read page from database file is performed.

Actually, the code is diffrent in v3 and before.

Before v3 CCH_fetch_page() decides to re-read page from database if pag_checksum is zero. This is correct assumption as page checksum can't be zero.

In v3 we removed checksum's and check was changed : now it checks for pag_type and it really could be zero in the case above.

Therefore the patch is not necessary for v2.x and should be reworked for v3.

I'll undo patch for 2.5.2 and 2.1.5 (after additional testing) as it caused unexpected side effect - second call of CCH_fake writes the just allocated page to disk (see CORE3792).

For v3 the patch will be undone and reworked.

@firebird-automations
Copy link
Collaborator Author

Modified by: @pcisar

status: Resolved [ 5 ] => Closed [ 6 ]

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment