gbak deletes backup file even if error happens when it's already successfully closed [CORE3649] #4000

Closed
firebird-automations opened this issue Nov 1, 2011 · 33 comments

Comments

@firebird-automations

Submitted by: Javier Sanchez (thejavo)

Relate to CORE3651

Attachments:
problem.tar.gz

Votes: 1

When I try to backup a database with gbak I get the following lines:

gbak:writing constraint PK_UBI_MOVI_TMP
gbak:writing referential constraints
gbak:writing check constraints
gbak:writing SQL roles
gbak:writing names mapping
gbak:closing file, committing, and finishing. 2897449984 bytes written
gbak: ERROR:Error reading data from the connection.
gbak:Exiting before completion due to errors
gbak: ERROR:Error reading data from the connection.

This occurs when I run gbak -b -g -user sysdba -pas masterkey localhost:/data/base.gdb /data/base.gbk

The only workaround I've found is to back up the gdb through the filesystem, so I ran:

gbak -b -g -user sysdba -pas masterkey /data/base.gdb /data/base.gbk

That worked fine. I've tried Firebird CS 2.5.0, 2.5.1 and 2.5.2 (64-bit) on several servers, with the same result in all cases.

The gdb was broken, but after running gfix and then backing it up and restoring it (via the filesystem), I've assumed that all errors are gone. Am I wrong? Thanks in advance.

Commits: cf0128b 0afd11b 950f499 aa64bc7

====== Test Details ======

Could not reproduce on Windows host, checked on FB 2.5.0 and 2.5.1; backup file was created both using services and without them.

GBAK report ends always with:

gbak:writing constraint INTEG_13
gbak:writing constraint INTEG_14
gbak:writing constraint CLI_ARCH_PK
gbak:writing referential constraints
gbak:writing check constraints
gbak:writing SQL roles
gbak:writing names mapping
gbak:closing file, committing, and finishing. 9728 bytes written

No text with 'Error reading data from the connection'.
Also, no text 'writing constraint PK_UBI_MOVI_TMP' ==> the attached file differs from the one where the problem arose.

@firebird-automations

Commented by: @AlexPeshkoff

Javier, on the one hand, the tracker is not a good place for asking support questions about released versions. On the other hand, you've got a crash, which is definitely a bug.

If you still have the damaged copy of the database, please follow the recommendations at
http://www.ibphoenix.com/resources/documents/search/doc_36
and attach a stack backtrace here. Without it, it's hard to fix anything.

@firebird-automations

Modified by: @AlexPeshkoff

assignee: Alexander Peshkov [ alexpeshkoff ]

@firebird-automations

Commented by: Javier Sanchez (thejavo)

Sorry for reporting this as a support question. It wasn't my idea.

After several hours of trying to generate a gbk, I think I've found the bug.

This was the situation:

-In one table I had a Computed Field like this:
FDIRECC COMPUTED BY (CAST(SUBSTR(FCALLE||' '||K_FORMATNUM(FALTURA,'0;'''';''''')||' '||SUBSTR(FUBICAC,1,20),1,40) AS VARCHAR(40)))

K_FORMATNUM comes from an external UDF library that may have a problem (I can't see its code). I changed the declaration to this:

FDIRECC COMPUTED BY (SUBSTRING(FCALLE||' '||COALESCE(FALTURA,'')||' '||FUBICAC FROM 1 FOR 40))

and everything started to work.

The problem here is that gbak completed the entire backup process up to "closing file, committing, and finishing", then got a connection error and dropped the backup file with no clue as to what could have happened.

Once again, thank you for your time, and excuse me for posting this incorrectly.

@firebird-automations

Modified by: Javier Sanchez (thejavo)

summary: gbak: ERROR:Error reading data from the connection. => gbak fail to backup database with buggy UDF

@firebird-automations

Modified by: Javier Sanchez (thejavo)

summary: gbak fail to backup database with buggy UDF => gbak fail to backup database with buggy UDF in computed field

@firebird-automations

Commented by: Javier Sanchez (thejavo)

I was able to create an example gdb that cannot be backed up. I've also attached the UDF .so file (it's a 64-bit one). I hope you'll find this useful.

@firebird-automations

Modified by: Javier Sanchez (thejavo)

Attachment: problem.tar.gz [ 12040 ]

@firebird-automations

Commented by: @kattunga

Confirmed here, I can reproduce the error.

I think that this should be considered a gbak bug.
gbak should do the backup regardless of a buggy UDF, but if it can't, at least the right error should be raised.

@firebird-automations

Commented by: @AlexPeshkoff

This is not a gbak bug. Something goes wrong inside the engine when the database object is deleted. BTW, this is the place where UDF libs are unloaded from memory. But I agree that the error report is bad and certainly should be fixed.

@firebird-automations

Commented by: @kattunga

Sorry Alex, you are right, it is not a gbak bug.
But I think gbak should not treat "ERROR:Error reading data from the connection." as a database error: note that it's raised after "closing file, committing, and finishing. 2897449984 bytes written", so the database is theoretically backed up correctly. Instead, gbak drops the backup because of this connection error.

@firebird-automations

Commented by: @AlexPeshkoff

The bug happens when isc_detach_database() is executed. At that moment the database is already completely backed up and the backup file is closed, i.e. the issue is really minor. But when isc_detach_database() returns an error code, gbak just reports it and exits. What else can it do?

@firebird-automations

Commented by: Javier Sanchez (thejavo)

Hello again. The problem that still remains is that after gbak raises the error, it immediately erases the gbk file from disk.

@firebird-automations

Commented by: @AlexPeshkoff

Ho-ops, this really makes it not minor.

@firebird-automations

Commented by: Sean Leyne (seanleyne)

Personally, I was going to suggest that any error which occurs during or after the detach could be ignored.

But then my brain kicked in... the error/problem being raised could be significant and relate to an issue which does in fact mean that the restore truly failed. So, gbak is doing exactly what is expected.

My suggestion would be to use clearer error text:

- Errors which occur before the detach read as:
gbak: ERROR: Error reading data from the connection.
gbak: ERROR: Abending due to errors, restore failed

- Errors which occur during/after the detach read as:
gbak: Warning: Error reading data from the connection.
gbak: Warning: Exiting early due to errors, restore is suspect
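
The two message styles proposed above can be sketched as a small formatter. This is only an illustration of the proposal, not gbak's code: the function name `format_gbak_messages` and the `detach_started` flag are hypothetical, and the message texts are copied from the suggestion as written.

```python
def format_gbak_messages(error_text: str, detach_started: bool) -> list[str]:
    """Hypothetical helper: choose the message severity by the phase of the error.

    detach_started is an assumed flag meaning the error was reported
    during or after the detach, i.e. once the backup file was written.
    """
    if detach_started:
        # Error during/after detach: reported as a warning.
        return [
            f"gbak: Warning: {error_text}",
            "gbak: Warning: Exiting early due to errors, restore is suspect",
        ]
    # Error before detach: reported as a hard failure.
    return [
        f"gbak: ERROR: {error_text}",
        "gbak: ERROR: Abending due to errors, restore failed",
    ]
```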

@firebird-automations

Commented by: Sean Leyne (seanleyne)

Javier,

Are you sure that it is *gbak* which is deleting the gbk and not your own restore script?

I am not aware of any gbak options which would delete the backup file.

@firebird-automations

Commented by: @asfernandes

"Error reading data from the connection" on the client has always meant a server crash, no?

This is not something to hide.

@firebird-automations

Commented by: Javier Sanchez (thejavo)

"Exiting early due to errors, restore is suspect" would be OK. But, sorry to bother you, would it be possible to give a little more information?

@firebird-automations

Commented by: Javier Sanchez (thejavo)

No script; I'm running gbak directly, and I've tried it from different servers and with different Firebird versions (2.5.0, 2.5.1, 2.5.2), always with the same result. I've also attached an example of the gdb and the UDF file that caused the problem.

@firebird-automations

Commented by: @kattunga

By the way, why is gbak deleting the backup after the error? Even if the crash happens before finishing, it should not delete the partial backup; maybe I can use it to restore some data.

@firebird-automations

Commented by: @AlexPeshkoff

Javier, it looks like something is wrong with your UDF. The process exits with code 0 (i.e. normal exit) when Firebird calls dlclose() on your UDF. The strangest thing is that I can't reproduce it on all machines. On my relatively old (1.5 years) Gentoo box the bug occurs; on another box (Ubuntu 11.04) everything works just fine. I.e. this could even be a glibc problem.

We can't fix every problematic UDF that turns up. I suggest making gbak rename the output file from file.fbk to something like file.fbk.BROKEN instead of deleting it, and doing nothing more regarding this issue.

@firebird-automations

Commented by: @kattunga

Hi Alex, I think that there are several situations in this issue. In my opinion:

a) gbak should not delete the residual backup file if the backup process fails; renaming it to something like file.fbk.BROKEN is fine.
b) gbak should not take any action on the backup file after "closing file, committing, and finishing. 2897449984 bytes written". After this point the backup should be considered fine regardless of any exception raised.
c) Some message about the problematic UDF should be logged, at least in firebird.log (if it can be detected).
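
Points a) and b) can be read as one small rule for what happens to the output file. The sketch below is an assumption-laden illustration, not gbak's implementation: `settle_backup_file`, `file_closed` and `error_occurred` are hypothetical names, and the `.BROKEN` suffix is the one suggested earlier in this thread.

```python
import os


def settle_backup_file(path: str, file_closed: bool, error_occurred: bool) -> str:
    """Hypothetical policy: never delete the backup file.

    Returns the path under which the file is left on disk.
    """
    if not error_occurred or file_closed:
        # b) once the file has been closed, later errors do not touch it.
        return path
    # a) an error during the backup keeps the partial file under another name.
    broken_path = path + ".BROKEN"
    os.replace(path, broken_path)
    return broken_path
```

Renaming rather than deleting keeps the partial data available while making sure a check for `file.fbk` alone does not mistake it for a good backup.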

@firebird-automations

Commented by: @AlexPeshkoff

Christian, agreed about a & b; c is impossible: the UDF just does not return control to the Firebird engine, it simply exits.

@firebird-automations

Commented by: @asfernandes

I think a renamed file should not be created.

Just change the code to print a message saying "error when closing the database" instead of the "closing and going home" one, and leave the file.

@firebird-automations

Commented by: @AlexPeshkoff

If the backup is complete (the file is already closed), then yes, we should leave it as is and just give another message.
But if the error happened _during_ the backup process, we still keep the file, but under another name. Keeping it 'as is' in this case can cause a lot of problems when someone checks not the exit code of gbak but only the existence of the backup file.
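
For scripts that wrap gbak, the concern above suggests checking the exit code as well as the file. A minimal sketch follows, assuming gbak signals failure through a non-zero exit status; `backup_succeeded` and its `run` parameter are hypothetical names introduced here so the check can be exercised without a Firebird installation.

```python
import os
import subprocess


def backup_succeeded(cmd, backup_path, run=subprocess.run) -> bool:
    """Hypothetical wrapper check: require BOTH a zero exit code and the file.

    cmd is the full gbak command line as a list of arguments;
    run defaults to subprocess.run and can be replaced in tests.
    """
    result = run(cmd)
    return result.returncode == 0 and os.path.isfile(backup_path)


# Example call, using the command line from the original report:
# backup_succeeded(
#     ["gbak", "-b", "-g", "-user", "sysdba", "-pas", "masterkey",
#      "localhost:/data/base.gdb", "/data/base.gbk"],
#     "/data/base.gbk",
# )
```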

@firebird-automations

Commented by: @kattunga

Alexander,

About the UDF: I don't think there is a problem in the UDF itself. I think the UDF is written in Free Pascal and the problem is related to this bug: CORE3651

@firebird-automations

Modified by: @AlexPeshkoff

Link: This issue relate to CORE3651 [ CORE3651 ]

@firebird-automations

Commented by: @AlexPeshkoff

Definitely yes.

@firebird-automations

Modified by: @AlexPeshkoff

summary: gbak fail to backup database with buggy UDF in computed field => gbak deletes backup file even if error happens when it's already successfully closed

@firebird-automations

Commented by: @AlexPeshkoff

Certainly, I can't fix fpc issues - but at least your backup file will stay with you

@firebird-automations

Modified by: @AlexPeshkoff

status: Open [ 1 ] => Resolved [ 5 ]

resolution: Fixed [ 1 ]

Fix Version: 3.0 Alpha 1 [ 10331 ]

Fix Version: 2.5.2 [ 10450 ]

@firebird-automations

Modified by: @pcisar

status: Resolved [ 5 ] => Closed [ 6 ]

@firebird-automations

Modified by: @pavel-zotov

QA Status: No test

@firebird-automations

Modified by: @pavel-zotov

status: Closed [ 6 ] => Closed [ 6 ]

QA Status: No test => Cannot be tested

Test Details: Could not reproduce on Windows host, checked on FB 2.5.0 and 2.5.1; backup file was created both using services and without them.

GBAK report ends always with:

gbak:writing constraint INTEG_13
gbak:writing constraint INTEG_14
gbak:writing constraint CLI_ARCH_PK
gbak:writing referential constraints
gbak:writing check constraints
gbak:writing SQL roles
gbak:writing names mapping
gbak:closing file, committing, and finishing. 9728 bytes written

No text with 'Error reading data from the connection'.
Also, no text 'writing constraint PK_UBI_MOVI_TMP' ==> the attached file differs from the one where the problem arose.
