DB corruption when killing posix CS [CORE1439] #1857

Closed
firebird-automations opened this issue Sep 3, 2007 · 23 comments

Comments

@firebird-automations
Collaborator

Submitted by: @AlexPeshkoff

Attachments:
fb2insi.patch.gz

When a POSIX classic (or embedded) server is killed instead of being shut down gracefully, database corruption is possible.

Commits: 63e2610

@firebird-automations
Collaborator Author

Commented by: @AlexPeshkoff

The fix for 2.1 should be backported to the 2_0_Release branch.

@firebird-automations
Collaborator Author

Modified by: @dyemanov

status: Open [ 1 ] => Resolved [ 5 ]

resolution: Fixed [ 1 ]

@firebird-automations
Collaborator Author

Modified by: @pcisar

Fix Version: 2.0.4 [ 10211 ]

@firebird-automations
Collaborator Author

Commented by: @pcisar

Reopened so that the fix can be backported from 2.1 into 2.0.4.

@firebird-automations
Collaborator Author

Modified by: @pcisar

status: Resolved [ 5 ] => Reopened [ 4 ]

resolution: Fixed [ 1 ] =>

@firebird-automations
Collaborator Author

Commented by: Saulius Vabalas (svabalas)

Pavel,

Can you confirm whether this fix prevents database corruption when termination of the DB connection is initiated on the client side (process kill, PC reboot, etc.)? Or does it apply only when the connection is killed on the server side? Could you provide more details on what exactly is being fixed here?

Thanks,
Saulius Vabalas

@firebird-automations
Collaborator Author

Commented by: @pcisar

Saulius,

This applies only to forcefully killed classic server processes and has nothing to do with clients connected via the remote interface. Unfortunately, the CVS commit was not tagged with the tracker ID, so I can't comment on the changes that were made to fix this. Alex, can you fill in some details?

@firebird-automations
Collaborator Author

Commented by: @AlexPeshkoff

I have _not_ promised that it will be backported into 2.0.4. The whole y-valve was seriously rewritten to support the fix. Unfortunately, the fix has nothing to do with power loss or hardware malfunction problems; only termination of a process with the OS kill command (signals 2 and 15) is involved. But in this case the shutdown is really smart.

Please take into account that even in previous versions the possibility of breaking the DB is very, very small. But some people kill fb_inet_server processes regularly (!), and when that is done often, bad things can happen. This is especially so because after a power loss or hardware problem people tend to check and possibly repair the database, whereas when a single process is killed, the others continue to work with the database actively. In this mode problems can grow and grow.

I think backporting the fix (this means copying almost the whole of why.cpp from HEAD to 2.0) may happen soon after the 2.1 release, provided we have no problems with the new y-valve.
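
Editor's illustration: the comment above says the fix covers processes terminated with signals 2 and 15 (SIGINT and SIGTERM on typical POSIX systems). The sketch below shows the general pattern of catching those signals and finishing cleanly instead of dying mid-write. It is not Firebird's implementation (the actual change is in the C++ y-valve, why.cpp); the class `GracefulWorker` and the log entries `commit`, `rollback`, and `detach` are hypothetical names used only for illustration. Signal 9 (SIGKILL) cannot be caught by any handler, so a pattern like this cannot help there.

```python
import signal


class GracefulWorker:
    """Hypothetical worker that stops cleanly when asked to terminate."""

    def __init__(self):
        self.stop_requested = False
        self.log = []

    def request_stop(self, signum, frame):
        # Keep the handler trivial: only record the request.
        # The real cleanup happens at a safe point inside run().
        self.stop_requested = True

    def install(self):
        # Signals 2 (SIGINT) and 15 (SIGTERM) can be caught;
        # SIGKILL (9) cannot, so it is not listed here.
        for sig in (signal.SIGINT, signal.SIGTERM):
            signal.signal(sig, self.request_stop)

    def run(self, items):
        for item in items:
            if self.stop_requested:
                # Undo unfinished work instead of leaving it half-written.
                self.log.append("rollback")
                break
            self.log.append(f"commit:{item}")
        # Always release shared resources before exiting.
        self.log.append("detach")
        return self.log
```

A process would call `install()` once at startup from its main thread and then `run(...)`; a `kill -15 <pid>` from the shell would then lead to a rollback and detach rather than an abrupt exit.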

@firebird-automations
Collaborator Author

Commented by: Saulius Vabalas (svabalas)

Alexander,

From what you just said, it looks like this fix is only in 2.1 and could potentially be backported into 2.0.4 (Q2 of 2008?), but the backport is still questionable. Any recommendations on what to do for FB 1.5 and 2.0 customers? It's going to be a while until a stable 2.1 is available. Even if the possibility of DB corruption is "very-very small", it has already happened twice in our case. It shows up when a process doing long-running batch updates/inserts is killed that way. Classic has no other way of terminating runaway queries/processes, so corruption in this case would be rated as critical in my opinion. Comments?

@firebird-automations
Collaborator Author

Commented by: @AlexPeshkoff

Saulius, if you have real problems, that's certainly another case. If you want, I can send you a patch, which is almost for 2.0.1 (a slightly earlier CVS tree was used, but it should apply OK to 2.0.1 and almost OK to 2.0.3). I'll attach it as a file here; try it if you need it.

@firebird-automations
Collaborator Author

Commented by: @AlexPeshkoff

This patch should fix the problem in 2.0. Sorry, it also contains a kind of on-disconnect trigger (with very limited functionality) for 2.0. Please don't use it, and it will not damage anything for you.

@firebird-automations
Collaborator Author

Modified by: @AlexPeshkoff

Attachment: fb2insi.patch.gz [ 10638 ]

@firebird-automations
Collaborator Author

Modified by: @hvlad

assignee: Alexander Peshkov [ alexpeshkoff ] => Vlad Khorsun [ hvlad ]

@firebird-automations
Collaborator Author

Modified by: @hvlad

status: Reopened [ 4 ] => In Progress [ 3 ]

@firebird-automations
Collaborator Author

Modified by: @hvlad

status: In Progress [ 3 ] => Open [ 1 ]

@firebird-automations
Collaborator Author

Commented by: @hvlad

Changed by accident, sorry.

@firebird-automations
Collaborator Author

Modified by: @hvlad

assignee: Vlad Khorsun [ hvlad ] => Alexander Peshkov [ alexpeshkoff ]

@firebird-automations
Collaborator Author

Modified by: @pcisar

Workflow: jira [ 12950 ] => Firebird [ 14294 ]

@firebird-automations
Collaborator Author

Modified by: @dyemanov

Fix Version: 2.0.5 [ 10222 ]

Fix Version: 2.1 Alpha 1 [ 10150 ] =>

Fix Version: 2.0.4 [ 10211 ] =>

@firebird-automations
Collaborator Author

Modified by: @dyemanov

Fix Version: 2.1 Alpha 1 [ 10150 ]

@firebird-automations
Collaborator Author

Modified by: @AlexPeshkoff

status: Open [ 1 ] => Resolved [ 5 ]

resolution: Fixed [ 1 ]

@firebird-automations
Collaborator Author

Modified by: @pcisar

status: Resolved [ 5 ] => Closed [ 6 ]

@firebird-automations
Collaborator Author

Modified by: @pavel-zotov

QA Status: No test
