Issue Details

Key: CORE-2992
Type: Improvement
Status: Open
Priority: Major
Assignee: Vlad Khorsun
Reporter: Saulius Vabalas
Votes: 10
Watchers: 11
Project: Firebird Core

Shorten backup/restore duration

Created: 05/May/10 11:21 PM   Updated: 22/Jan/19 09:55 PM
Component/s: GBAK
Affects Version/s: 2.5 RC1
Fix Version/s: None



Description
make backups & restores work faster, e.g. optimize internal processes(it takes 8+ hours to do 130GB DB backup & restore, what creates huge data backlog for 24/7 call centers). Is there any way during restore to create all indexes just by doing single pass on the table?

Comments
Dmitry Yemanov added a comment - 06/May/10 03:26 AM
Out of curiosity, why would you need GBAK for that purpose? I'd expect NBACKUP to be used instead. It had some problems in the past, but it was significantly reworked in v2.5, and so far nobody has complained about it.
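For reference, a typical NBACKUP cycle on 2.5 looks like this (the file names are illustrative):

    nbackup -B 0 employee.fdb employee.nbk0
    nbackup -B 1 employee.fdb employee.nbk1
    nbackup -R restored.fdb employee.nbk0 employee.nbk1

The -B switch takes the backup level (0 = full; higher levels are incremental against the previous one), and -R rebuilds a database from the level-0 file plus any increments.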

Saulius Vabalas added a comment - 06/May/10 10:16 PM
There are multiple cases where NBackup cannot help. Go ahead and correct me if I'm wrong:
- Systems on FB versions prior to 2.5, which have no NBackup, are always forced to go through the restore process if for some reason all changes have to be rolled back to the latest good backup (safe point). In an emergency restore, duration becomes really critical because all system operations are down.
- A full backup/restore cycle eliminates database fragmentation and usually improves overall performance.
- Database migration from 32-bit Windows to 64-bit Linux, or between different ODS versions.
- Resetting the database header page's transaction counter when its value approaches the signed 32-bit integer maximum. I have had multiple cases where, due to an application bug, a continuous SELECT execution loop was running in some thread, generating over 60M transactions per day, which could corrupt the database in roughly a month (with multiple threads/apps the period becomes much shorter). It's really easy for such a runaway task to kill the database without the admin ever suspecting anything is wrong, because transactions keep moving just fine and by default there is no automated monitoring of this value (a simple check is sketched after this list).
- In FB 1.5, backup/restore used to be the only way to reset the internal RDB$PROCEDURE_ID generator and stop its value from overflowing when an application extensively creates and drops temporary procedures and EXECUTE BLOCK can't be used (due to dependencies between the temporary SPs).
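As an aside on the transaction-counter point: the value is visible on the database header page, so a periodic check is easy to script. A minimal sketch using the standard gstat tool (the database path is illustrative):

    gstat -h /data/employee.fdb | grep -i "next transaction"

Alerting when the reported value approaches 2^31 gives the admin time to schedule a backup/restore before the counter runs out.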

A couple of years ago I did some internal performance testing, trying to figure out what FB does and when during the various backup and restore stages, by watching the interactive log and the disk and CPU utilization. I had the backup file, the restored file, the temp directory and the swap sitting on different physical disks, so it was really interesting to see when the process is CPU-bound and when it is disk-bound. If you are interested, I can dig out that data for you.

Vlad Khorsun added a comment - 07/May/10 05:57 AM
I have some enhancements to the restore process in my plans, but no promises so far.

Pavel Cisar added a comment - 07/May/10 10:16 AM
I guess that with the new threading architecture it could be possible to distribute the single-table load process into several parallel pipelines that would produce the data pages, which would then be flushed to disk in bulk by another worker thread? It would require some extensions to the protocol, though. I guess it could also be possible to create sorted streams in advance from the incoming data, for later index creation?
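If it helps the discussion, here is a minimal sketch of that pipeline shape, assuming a deliberately simplified model of restore: one reader pulls record batches from the backup stream, several workers do the CPU-bound page building in parallel, and a single writer flushes finished pages in bulk. All names are illustrative, not Firebird internals:

    import queue
    import threading

    def build_data_page(batch):
        # Stand-in for the CPU-bound work of packing records into a data page.
        return b"|".join(rec.encode() for rec in batch)

    def run_pipeline(batches, workers=4):
        raw = queue.Queue(maxsize=64)    # reader -> page builders
        pages = queue.Queue(maxsize=64)  # page builders -> writer
        done = object()                  # end-of-stream sentinel

        def reader():
            for batch in batches:        # stands in for reading the backup stream
                raw.put(batch)
            for _ in range(workers):
                raw.put(done)            # one sentinel per builder

        def builder():
            while True:
                batch = raw.get()
                if batch is done:
                    pages.put(done)
                    break
                pages.put(build_data_page(batch))

        def writer(out):
            finished = 0
            while finished < workers:
                page = pages.get()
                if page is done:
                    finished += 1
                else:
                    out.append(page)     # stands in for a bulk flush to disk

        out = []
        threads = [threading.Thread(target=reader)]
        threads += [threading.Thread(target=builder) for _ in range(workers)]
        threads.append(threading.Thread(target=writer, args=(out,)))
        for t in threads:
            t.start()
        for t in threads:
            t.join()
        return out

    if __name__ == "__main__":
        demo = [["rec%d" % i for i in range(j, j + 3)] for j in range(0, 12, 3)]
        print(run_pipeline(demo))

The bounded queues provide back-pressure, so a slow disk naturally throttles the CPU-bound builders instead of letting finished pages pile up in memory.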

Saulius Vabalas added a comment - 07/May/10 08:48 PM
The longest process is the restore, so parallelizing parts of it where applicable makes perfect sense in the current multi-core CPU era, where performance is limited by the CPU. Modern servers dealing with 100 GB databases have 8 to 24 cores and 32-64 GB of RAM, of which at least 20 GB is reserved for file cache. Why not use that CPU power when needed? As for eliminating the re-reads of the same table for each index creation: a single data-read pass would eliminate the disk-bound part, but that most likely requires bigger algorithm changes, where the same data has to be streamed to dedicated index creators (sketched below). Right now it is just sad to watch server activity when index creation starts on a multi-million-row table with over 10 indexes and a full table scan is performed for each one. Lots of wasted time and money when counting downtime.
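To make the single-pass idea concrete, a minimal in-memory sketch: the table is scanned exactly once, each record's key is appended to one sort stream per index, and every index is then built from its own sorted stream. A real engine would spill these streams to temp space; all names here are illustrative:

    def build_indexes_single_pass(rows, index_defs):
        # index_defs maps an index name to a key-extraction function.
        streams = {name: [] for name in index_defs}
        for rowid, row in enumerate(rows):        # the one and only table scan
            for name, key_of in index_defs.items():
                streams[name].append((key_of(row), rowid))
        # Each index is built from its own pre-sorted stream; no re-scan needed.
        return {name: sorted(keys) for name, keys in streams.items()}

    if __name__ == "__main__":
        rows = [{"id": 3, "name": "b"}, {"id": 1, "name": "c"}, {"id": 2, "name": "a"}]
        defs = {"pk_id": lambda r: r["id"], "idx_name": lambda r: r["name"]}
        print(build_indexes_single_pass(rows, defs))

With 10+ indexes on a table, this turns 10+ full scans into one scan plus 10+ sorts, and the sorts themselves can run concurrently.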

I also like Pavel's idea of doing the restore of each table in parallel. Maybe gbak could get an extra switch to specify the maximum number of threads it may use (or a priority level), depending on whether the process is running on a loaded server or an idle one.
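Something like the following, where -par is a purely hypothetical switch used only to illustrate the proposal (no such option exists in gbak 2.5):

    gbak -c -par 4 employee.fbk employee.fdb

The restore would then be allowed to use at most 4 worker threads.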

The same techniques apply to backup as well. As long as the disk is able to feed all the data, having multiple database readers will speed up the whole process too.

Best part: it does not look like these improvements require any ODS changes, so they could be ported to 2.5 pretty easily, making a lot of people happy.

David Culbertson added a comment - 20/Nov/18 03:36 PM
Has anyone ever considered having an option to do the backup and restore in one pass, where the output of the backup is a new database instead of a backup file? A few years ago, at the meeting in Prague, I discussed this with Ann H. and Jim S., and they thought it would be possible and not too difficult.
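For what it's worth, gbak can already be chained through a pipe by using the special file names stdout and stdin, which avoids the intermediate backup file even though both passes are still performed (paths and credentials are illustrative):

    gbak -b -user SYSDBA -password masterkey /data/employee.fdb stdout |
      gbak -c -user SYSDBA -password masterkey stdin /data/employee.new.fdb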

Todd Manchester added a comment - 22/Jan/19 09:55 PM
Any chance this will work with older versions of Firebird? In particular 2.5.x.