
Key: CORE-5115
Type: New Feature
Status: Open
Priority: Major
Assignee: Unassigned
Reporter: Karol Bieniaszewski
Votes: 0
Watchers: 3
Firebird Core

Add possibility to backup and restore database including index data (pages) not only definition

Created: 25/Feb/16 12:07 PM   Updated: 25/Feb/16 09:38 PM
Component/s: GBAK
Affects Version/s: None
Fix Version/s: None

QA Status: No test


Description
Hi,

Now when we do a backup, only the index definitions are stored, without the index data. That is fine from the backup-time and backup-size points of view, but restore time is increased by the need to recreate the index data.

In particular, recreating indexes on big databases needs a very large amount of disk space for sorting.

It would be good to see an option to back up a database including its index pages; restore would then be faster and consume a smaller amount of resources.

Nbackup is not good here because we need the table data reorganized and defragmented. We can accept index fragmentation.

It would also be good to have a switch to ignore the index data in the backup file, if the backup was created in this new way.

This is slightly correlated with CORE-2992.
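As a side note on the restore-time pain described above: gbak already offers a partial workaround via its restore-time index switch, which leaves indexes inactive so their sort cost can be paid later, one index at a time. The database names, paths, and credentials below are placeholders.

```shell
# Back up as usual; only index definitions are stored, not index pages.
gbak -b -user SYSDBA -password masterkey employee.fdb employee.fbk

# Restore with all indexes left inactive, skipping the sort phase entirely.
gbak -c -i -user SYSDBA -password masterkey employee.fbk employee_new.fdb

# Activate indexes afterwards, one at a time, via isql, e.g.:
#   ALTER INDEX RDB$PRIMARY1 ACTIVE;
```

This does not avoid the rebuild cost, but it lets the database come online sooner and spreads the disk-space demand of index sorting over time.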

Comments
Dmitry Yemanov added a comment - 25/Feb/16 12:29 PM
Out of curiosity, why do you need table data "reorganized" and "defragmented" as a maintenance procedure? What problem do you see and how does it improve by that means?

Karol Bieniaszewski added a comment - 25/Feb/16 12:57 PM
This applies only to Classic Server, where the cache is small and per-connection, or to a server with limited RAM (especially shared hosting).
Access to "randomly" stored data on an HDD is slower, especially when many clients are querying.
Maybe this is not such a big overhead, but...
The bigger problem here, which I did not mention previously, is that nbackup does not validate the data but gbak does.

Sean Leyne added a comment - 25/Feb/16 03:55 PM
Karol,

Your initial request is not feasible. Index data contains data row pointers (think RDB$Key), which contain database page number references. So "reorganizing" and "defragmenting" the data pages will invalidate all of the index pointers, thus requiring an index rebuild and eliminating any benefit from including the index data in the backup/restore.
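The point about invalidated pointers can be shown with a toy model. This is purely illustrative Python, not Firebird's actual on-disk format: index entries hold a physical record locator, so repacking rows during "defragmentation" strands any entry that still points at the old location.

```python
# Toy model: index entries store physical record numbers, so moving rows
# during "defragmentation" leaves the index entries dangling.

# Table as {record_number: row}; records 0 and 2 are live, 1 was deleted.
table = {0: "Ann", 2: "Bob"}

# Index maps key -> physical record number (think RDB$DB_KEY).
index = {"Ann": 0, "Bob": 2}

# "Defragment": repack live rows into contiguous record numbers.
defragged = {new_rec: row for new_rec, (_, row) in enumerate(sorted(table.items()))}

# Find index entries whose locator no longer points at the right row.
stale = [key for key, rec in index.items()
         if rec not in defragged or defragged[rec] != key]
# "Bob" moved from record 2 to record 1, so its index entry is now stale.
```

After the repack, every stale entry would have to be rewritten, which amounts to rebuilding the index anyway.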

As for the issue of data validation, I would suggest that using gbak as a data validator is not "a good thing"; there are far better approaches that could be implemented to address that need.

Dmitry Yemanov added a comment - 25/Feb/16 04:10 PM
Theoretically, indices could be stored in the backup in the logical representation - as a set of already ordered key values. Restore would surely require such an index to be built from scratch, but using the fastest possible way -- just fast_load() without any table reads and external sorting.
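The fast_load() idea above can be sketched abstractly: when the keys arrive already sorted, a B-tree can be built bottom-up, filling leaf pages left to right and propagating each page's first key upward, with no table reads and no external sort. This is a hypothetical illustration, not Firebird's btr code; the page size is arbitrary.

```python
# Bottom-up B-tree bulk load from pre-sorted keys (illustrative only).
def bulk_load(sorted_keys, per_page=4):
    """Build B-tree levels bottom-up; returns [leaf_level, ..., root_level]."""
    # Fill leaf pages left to right from the sorted key stream.
    level = [sorted_keys[i:i + per_page] for i in range(0, len(sorted_keys), per_page)]
    tree = [level]
    while len(level) > 1:
        # Each upper-level entry is the separator (first key) of a child page.
        seps = [page[0] for page in level]
        level = [seps[i:i + per_page] for i in range(0, len(seps), per_page)]
        tree.append(level)
    return tree

# 17 keys with 4 keys per page -> 5 leaves, 2 intermediate pages, 1 root.
levels = bulk_load(list(range(1, 18)), per_page=4)
```

Each key is touched once per level, so the cost is linear in the key count times the (small) tree height, versus sort-then-insert for a normal rebuild.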

That said, I still don't see much sense in this RFE.

Karol Bieniaszewski added a comment - 25/Feb/16 06:59 PM
I see that the implementation is really difficult.
Ann Harrison described this in quite some detail on the forum, but I suppose not all details are included.
I suppose the index itself has references to the next node, in the same way it references a table record (a kind of dbkey), and these references would also need to be recreated.

I first thought about a dictionary mapping each previous dbkey to its new dbkey.
But this takes memory, and I do not know whether it would be more efficient than creating the index from scratch.
Maybe someone else has a better concept.
I think about how this works in MSSQL (I know it is a totally different implementation), but backup and restore there are really fast.
I would say unreasonably fast.

Sean Leyne added a comment - 25/Feb/16 07:36 PM
If we want to talk about improving gbak performance there are several approaches that can be taken (I have given this a fair bit of thought).

AFAIK, MS SQL Server does not store index data in its backups; it would significantly increase the size of the backup files. It just has a really efficient index rebuild process.

Vlad Khorsun added a comment - 25/Feb/16 09:24 PM
Dmitry,

> Theoretically, indices could be stored in the backup in the logical representation - as a set of already ordered key values. Restore would surely require such an index to be built from scratch, but using the fastest possible way -- just fast_load() without any table reads and external sorting.

How will it know the record numbers?

Vlad Khorsun added a comment - 25/Feb/16 09:38 PM
Karol,

> I think about how this work in MSSQL (i know this is totally different implementation) but backup and restore is there really fast.

IIRC, an MSSQL backup is a physical backup, i.e. it contains a copy of the database pages (extents) and some transaction log records.
Therefore it:
a) is almost the same as our nbackup (level 0 for a full backup, level 1 for a differential backup)
b) doesn't validate the data in the database (like our nbackup)
c) doesn't reorganize data/indices on restore
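For reference, the nbackup levels mentioned in (a) work roughly as follows; the file names below are placeholders, and exact argument forms may vary by Firebird version.

```shell
# Level 0: full physical backup, a page-level copy of the database.
nbackup -B 0 employee.fdb employee_L0.nbk

# Level 1: differential backup, only pages changed since the level-0 backup.
nbackup -B 1 employee.fdb employee_L1.nbk

# Restore by merging the backup chain, oldest first.
nbackup -R employee_restored.fdb employee_L0.nbk employee_L1.nbk
```

Because this copies pages verbatim, indexes come back exactly as they were, fragmentation included, and nothing is validated or reorganized on restore.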

So, please, compare apples with apples, not with birds ;)