Progress output in GBAK [CORE3146] #3523
Comments
Commented by: Ann Harrison (awharrison) Unfortunately, gbak doesn't know in advance that there are 1673456 rows in the […]
Commented by: Gabor Boros (gaborboros) The suggestion talked about pages, not rows. ;-)
Commented by: Savin Gorup (saving) I know counting rows is expensive. The gbak output does not have to be record-count-precise or time-precise; it should just give an indication that something is going on. The page count seems a good approximation (but maybe I am wrong). Even if gbak simply printed the number of the page it is currently processing (I know data is sparse), it would be vastly better than complete silence.
Commented by: Savin Gorup (saving) Aha. I checked the actual code doing the backup (backup.epp). If I understand correctly, it calls put_data(), which relies on isc_receive() to get the actual data from the server. The latter gives no progress indication, since it is just waiting for the server to deliver the next record. While the server is seeking the next record, nothing happens and isc_receive() blocks. I am not familiar with the internals of FB - is there any mechanism in FB to force the server to report progress to clients? This would probably benefit not only gbak but any SQL query that takes a long time...
Commented by: Damyan Ivanov (dam) GBAK does report progress somewhat: gbak: writing index FK_EVENTS_EVENT_TYPE Of course, 20 thousand records may take a while to transmit, depending on data types and network speed.
Commented by: Savin Gorup (saving) Yes, I am aware of that. There are two problems with this approach.
Commented by: @dyemanov The engine cannot provide you with any progress indication, because it does not know how many records are in the cursor or when it will finish reading them. The cursor is not materialized when you execute a select statement; rows are read from pages as you fetch from the cursor.
Commented by: Savin Gorup (saving) I am not intimately familiar with the internal workings of the server, and I believe that it does not know in advance how many rows there are to be processed. The original proposal was on purpose avoiding records (rows). However, the engine certainly knows what it is doing at the moment (which page it is processing). In sequential queries like backup (or any with PLAN NATURAL) this is a fairly good progress indicator, even if data ordering on pages is not sequential. It is quite enough for the user to get some feedback on the ongoing operation, even if it is just "jumping numbers" of some sort.
Commented by: @dyemanov Believe it or not, the engine doesn't know in advance how many pages it will process either. The situation is completely similar to the records one.
Commented by: Savin Gorup (saving) No, I was probably not clear enough. It does not know how many pages it will process; it does know which page it is processing at the moment, doesn't it?
Commented by: @dyemanov Sure. It also knows which record it's processing. But I fail to see how it can provide you with any progress indicator. Yes, it would tell you the engine is doing something, but you know the same even if nothing is displayed. The situation is especially complex when you have 10GB of deleted data and gbak performs the garbage collection pass. By design, the engine is not going to return from isc_receive() with any information, because the task (read and return one row) is not completed.
Commented by: Savin Gorup (saving) It certainly can provide an indicator of progress. We've had far worse in history (accumulating dots, spinning cursors -\|/- ...). The user now stares at a blank screen, and only "top" tells him that the engine is actually doing something (if he has access to the server at all). At what rate, he does not know. Imagine if he saw a page number every 10 seconds: aha, 1000 pages per second, one million pages to do - about 17 minutes per pass, two passes at most. Maybe less. That seems much better than nothing. The same would apply to long-running SQL queries.
Commented by: @asfernandes You're dreaming too much... I would also like to live in that world. What DBMS informs you about the progress of a query execution?
Commented by: Savin Gorup (saving) Hmmm, Oracle? (Using the V$SESSION_LONGOPS view; works for RMAN, from experience!) Also, patches for MySQL are available, and work on PostgreSQL is in progress. The FB server knows something. A connection between server and client exists. The server periodically sends something to the client: a "record" or "the page I am working on". The client reports to the user: "hey, the server is doing something. Look, it is working on page #...". What I would like to hear are the technical reasons why this could not be done (or would be very hard to do) in FB. Dreams sometimes do come true.
Commented by: @asfernandes > Hmmm, Oracle? (Using the V$SESSION_LONGOPS view; works for RMAN, from experience!) If you're comparing things this way, then you should look at the MON$ tables.
Commented by: Damyan Ivanov (dam) Hm, I've got an idea. What if the current behaviour of -V were changed from emitting a line every X rows to emitting a line every X seconds? A configurable X would be really nice. Not sure if this deserves another ticket.
Commented by: @asfernandes > What if the current behaviour of -V were changed from emitting a line every X rows to emitting a line every X seconds? A configurable X would be really nice. What's the reason for this? The user could look at a timer on the screen... It's no different.
Commented by: Damyan Ivanov (dam) A timer serves to tell you how long a process has been running. A timed -V would serve different purposes: 1) when the backup is slow (garbage collection, tables with many big columns or BLOBs), it tells you that something is happening; 2) when the backup is very fast (a table with few, small columns), it avoids flooding the user with "rolling" text that is impossible to follow. Anyway, it is just an idea that I think is easy to implement without heavy architectural changes.
Submitted by: Savin Gorup (saving)
During execution of GBAK on a large database (>50 GB), the process seems to hang for a very long time, even when using the -V switch.
It would be nice if there were a command-line option to print a progress bar while doing work, for example every 100 pages.
Like this:
----
...
gbak: writing index STAT_CAS_ID
gbak: writing data for table STAT
gbak: = ( 100 of 1673456 )
gbak: = ( 200 of 1673456 )
...
gbak: =========== ( 836700 of 1673456 )
...
gbak: ====================== ( 1673456 of 1673456 )