
Key: CORE-3649
Type: Bug
Status: Closed
Resolution: Fixed
Priority: Major
Assignee: Alexander Peshkov
Reporter: Javier Sanchez
Votes: 1
Watchers: 4
Firebird Core

gbak deletes backup file even if error happens when it's already successfully closed

Created: 31/Oct/11 10:15 PM   Updated: 23/Apr/13 01:23 PM
Component/s: GBAK
Affects Version/s: 2.5.0, 2.5.1, 2.5.2
Fix Version/s: 2.5.2, 3.0 Alpha 1

Time Tracking:
Not Specified

File Attachments: 1. GZip Archive problem.tar.gz (468 kB)

Environment: Ubuntu Server 10.04 64 bits kernel 2.6.32-34

Planning Status: Unspecified


Description
When I try to back up a database with gbak I get the following lines:

gbak:writing constraint PK_UBI_MOVI_TMP
gbak:writing referential constraints
gbak:writing check constraints
gbak:writing SQL roles
gbak:writing names mapping
gbak:closing file, committing, and finishing. 2897449984 bytes written
gbak: ERROR:Error reading data from the connection.
gbak:Exiting before completion due to errors
gbak: ERROR:Error reading data from the connection.

This occurs when I run gbak -b -g -user sysdba -pas masterkey localhost:/data/base.gdb /data/base.gbk

The only workaround I've found is to back up the gdb via the filesystem, so I ran:

gbak -b -g -user sysdba -pas masterkey /data/base.gdb /data/base.gbk

That worked fine. I've tried Firebird CS 2.5.0, 2.5.1 and 2.5.2 (64-bit) on several servers, with the same result in all cases.

The gdb was broken, but after running gfix and then backing up and restoring (via the filesystem), I assumed that all errors were gone. Am I wrong? Thanks in advance.


Comments
Alexander Peshkov added a comment - 01/Nov/11 06:36 AM
Javier, on the one hand, the tracker is not a good place for asking support questions about released versions. On the other hand, you've got a crash, which is definitely a bug.

If you still have a damaged copy of the database, please follow the recommendations at
http://www.ibphoenix.com/resources/documents/search/doc_36
and attach a stack backtrace here. Without it, it's hard to fix anything.

Javier Sanchez added a comment - 01/Nov/11 08:59 PM
Sorry for reporting this as a support question. That wasn't my intention.

After several hours of trying to generate a gbk, I think I've found the bug.

This was the situation:

 -In one table I had a Computed Field like this:
    FDIRECC COMPUTED BY (CAST(SUBSTR(FCALLE||' '||K_FORMATNUM(FALTURA,'0;'''';''''')||' '||SUBSTR(FUBICAC,1,20),1,40) AS VARCHAR(40)))

K_FORMATNUM comes from an external UDF file that may have a problem (I can't see the code). I changed the declaration to this:

    FDIRECC COMPUTED BY (SUBSTRING(FCALLE||' '||COALESCE(FALTURA,'')||' '||FUBICAC FROM 1 FOR 40))

and everything started to work.

The problem here is that gbak ran the entire backup process up to the "closing file, committing, and finishing" message, then got a connection error and dropped the backup file with no clue as to what could have happened.

Once again, thank you for your time, and excuse me for posting this incorrectly.

Javier Sanchez added a comment - 01/Nov/11 09:26 PM - edited
I was able to create an example gdb that cannot be backed up. I've also attached the .so UDF file (it's a 64-bit one). I hope you'll find this useful.

Christian Pradelli added a comment - 03/Nov/11 04:35 PM
Confirmed here, I can reproduce the error.

I think that this should be considered a gbak bug.
gbak should do the backup regardless of a buggy UDF, but if it can't, at least the right error should be raised.

Alexander Peshkov added a comment - 03/Nov/11 04:58 PM
This is not a gbak bug. Something goes wrong inside the engine when deleting the database object. BTW, this is the place where UDF libs are unloaded from memory. But I agree that the error report is bad and certainly should be fixed.

Christian Pradelli added a comment - 03/Nov/11 05:17 PM
Sorry Alex, you are right, it is not a gbak bug.
But I think that "ERROR:Error reading data from the connection." should not be treated by gbak as a database error: note that it's raised after "closing file, committing, and finishing. 2897449984 bytes written", so the database is theoretically backed up correctly. Instead, gbak drops the backup due to this connection error.

Alexander Peshkov added a comment - 03/Nov/11 05:29 PM
The bug happens when isc_detach_database() is executed. At this moment the database is already completely backed up and the backup file is closed, i.e. the issue is really minor. But when isc_detach_database() returns an error code, gbak just reports it and exits. What else can it do?

Javier Sanchez added a comment - 03/Nov/11 05:48 PM
Hello again, the problem that still remains is that after gbak raises the error, it immediately erases the gbk file from disk.

Alexander Peshkov added a comment - 03/Nov/11 06:04 PM
Ho-ops, this really makes it not minor.

Sean Leyne added a comment - 03/Nov/11 06:05 PM
Personally, I was going to suggest that any error which occurs during or after the detach could be ignored.

But then my brain kicked in... the error/problem being raised could be significant and relate to an issue which does in fact mean that the restore truly failed. So, gbak is doing exactly what is expected.

My suggestion would be to use clearer error text:

- Errors which occur before the detach read as:
  gbak: ERROR: Error reading data from the connection.
  gbak: ERROR: Abending due to errors, restore failed

- Errors which occur during/after the detach read as:
  gbak: Warning: Error reading data from the connection.
  gbak: Warning: Exiting early due to errors, restore is suspect

Sean Leyne added a comment - 03/Nov/11 06:07 PM
Javier,

Are you sure that it is *gbak* which is deleting the gbk and not your own restore script?

I am not aware of any gbak options which would delete the backup file.

Adriano dos Santos Fernandes added a comment - 03/Nov/11 06:10 PM
"Error reading data from the connection" from client was always server crash, no?

This is not something to hide.

Javier Sanchez added a comment - 03/Nov/11 06:11 PM
"Exiting early due to errors, restore is suspect" would be OK. But, sorry to bother you, would it be possible to give a little more information?

Javier Sanchez added a comment - 03/Nov/11 06:13 PM
No script, I'm running gbak directly and have tried it from different servers and with different Firebird versions (2.5.0, 2.5.1, 2.5.2), always with the same result. I've also attached an example of the gdb and the UDF file that created the problem.

Christian Pradelli added a comment - 03/Nov/11 06:18 PM
By the way, why is gbak deleting the backup after the error? Even if the crash happens before finishing, it should not delete the partial backup; maybe I can use it to restore some data.

Alexander Peshkov added a comment - 07/Nov/11 08:16 AM
Javier, looks like something is wrong with your UDF. The process exits with code 0 (i.e. normal exit) when Firebird does dlclose() for your UDF. The strangest thing is that I cannot reproduce it on all machines. On my relatively old (1.5 years) Gentoo the bug takes place; on the other box (Ubuntu 11.04) everything works just fine. I.e. this can even be a glibc problem.

We can't fix all the problematic UDFs around. I suggest making gbak rename the output file from file.fbk to something like file.fbk.BROKEN instead of deleting it, and doing nothing more re this issue.

Christian Pradelli added a comment - 07/Nov/11 04:19 PM
Hi Alex, I think that there are several situations in this issue. In my opinion:

a) gbak should not delete the residual backup file if the backup process fails; renaming it to something like file.fbk.BROKEN is fine.
b) gbak should not take any action on the backup file after "closing file, committing, and finishing. 2897449984 bytes written". After this point the backup should be considered fine regardless of any exception raised.
c) Some message about the problematic UDF should be logged, at least in firebird.log (if it can be detected).

Alexander Peshkov added a comment - 07/Nov/11 04:27 PM
Christian, agreed about a & b; c is impossible: the UDF just does not return control to the Firebird engine, it simply exits.
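Points a and b can be sketched roughly as below. This is only an illustration of the policy under discussion, not gbak's actual code (gbak is not written in Python); the names Phase and dispose_backup_file are hypothetical and invented for this sketch.

```python
import os
from enum import Enum


class Phase(Enum):
    """Hypothetical marker for how far the backup had progressed."""
    WRITING = "writing"  # backup file still open, data being written
    CLOSED = "closed"    # backup file fully written and closed


def dispose_backup_file(path: str, phase: Phase, error: bool) -> str:
    """Apply the proposed policy and return the final path of the file.

    - no error: leave the file alone
    - error after the file was closed: leave the file alone (point b)
    - error while still writing: keep the partial file, renamed (point a)
    """
    if not error or phase is Phase.CLOSED:
        return path
    broken = path + ".BROKEN"
    # os.replace renames the file and overwrites any older .BROKEN copy.
    os.replace(path, broken)
    return broken
```

In neither case is the file deleted; the only question is which name it ends up under.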

Adriano dos Santos Fernandes added a comment - 07/Nov/11 04:30 PM
I think a renamed file should not be created.

Just change the code to put a message saying "error when closing the database" instead of the "closing and going home" one, and leave the file.

Alexander Peshkov added a comment - 07/Nov/11 04:38 PM
If the backup is complete (the file is already closed) - yes, we should leave it as is and just give another message.
But if the error happened _during_ the backup process, we still keep the file, but under another name. Keeping it 'as is' in this case can cause a lot of problems when someone checks not the exit code of gbak, but only the existence of the backup file.
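For scripts that call gbak, the safer check mentioned above (exit code first, file existence second) might look like the following minimal sketch. The helper name run_backup is hypothetical; the command is passed in as a list, so nothing here assumes any gbak option beyond those already quoted in the description.

```python
import os
import subprocess
from typing import Sequence


def run_backup(command: Sequence[str], backup_path: str) -> bool:
    """Run a backup command and report success only if it exited with
    code 0 AND the backup file exists afterwards."""
    result = subprocess.run(command)
    if result.returncode != 0:
        # A non-zero exit code means failure, even if a file is present.
        return False
    return os.path.isfile(backup_path)


# Example call, using the command line from the description:
# ok = run_backup(
#     ["gbak", "-b", "-g", "-user", "sysdba", "-pas", "masterkey",
#      "localhost:/data/base.gdb", "/data/base.gbk"],
#     "/data/base.gbk",
# )
```

A script that only tests for the existence of the file would accept a partial backup as good, which is the situation the renaming is meant to prevent.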

Christian Pradelli added a comment - 16/Nov/11 07:26 PM
Alexander,

About the UDF: I don't think there is a problem in the UDF itself. I think the UDF is written in Free Pascal and the problem is related to this bug: CORE-3651

Alexander Peshkov added a comment - 17/Nov/11 06:54 AM
Definitely yes.

Alexander Peshkov added a comment - 03/Mar/12 03:20 PM
Certainly, I can't fix fpc issues - but at least your backup file will stay with you.