Firebird Server hangs [CORE3603] #3957
Comments
Commented by: @mrotteveel This sounds more like a support question than a bug report. Support questions should be directed to the firebird-support mailinglist. |
Commented by: Christian Masberg (cubism) Hi Mark, Kind regards |
Commented by: Christian Masberg (cubism) I set up the debug version on the server. Am waiting for the next crash to happen and will post the crash dump file then. Kind regards |
Commented by: Christian Masberg (cubism) Finally the server crashed again. I attached the Dr. Watson Logfile and the windows mdmp file. Hope someone can give me a clue!! |
Modified by: Christian Masberg (cubism) Attachment: drwtsn32.log [ 12022 ] Attachment: fb_inet_server.exe.mdmp [ 12023 ] |
Commented by: Arioch (arioch) Classic server should have a separate lock manager that does not belong to any particular per-connection SQL worker process. Maybe you can dig into how the lock manager is implemented and how to diagnose or reset its state. |
Commented by: Arioch (arioch) maybe some points in comments at Issue CORE3473 would apply here |
Commented by: Christian Masberg (cubism) hopefully full Dump File. |
Modified by: Christian Masberg (cubism) Attachment: fb_inet_server.exe.hdmp [ 12030 ] |
Commented by: Christian Masberg (cubism) Hi Dimitry, CITRIX-SERVER Mon Oct 24 10:39:44 2011 CITRIX-SERVER Mon Oct 24 10:48:10 2011 CITRIX-SERVER Mon Oct 24 10:48:10 2011 CITRIX-SERVER Mon Oct 24 10:49:00 2011 CITRIX-SERVER Mon Oct 24 10:49:00 2011 CITRIX-SERVER Mon Oct 24 10:58:37 2011 CITRIX-SERVER Mon Oct 24 10:58:37 2011 CITRIX-SERVER Mon Oct 24 10:59:27 2011 CITRIX-SERVER Mon Oct 24 10:59:27 2011 I'd be very happy if anyone could look at the problem. Is there any way that I could debug the process myself? |
Commented by: Christian Masberg (cubism) Dump Files of today. Full crash dump by operating system and Dr. Watson Logfile. |
Modified by: Christian Masberg (cubism) Attachment: drwtsn32_20111024.log [ 12031 ] Attachment: fb_inet_server.exe 20111024.hdmp [ 12032 ] |
Commented by: Christian Masberg (cubism) Since the last post there have been some deadlocks. We urgently need help in debugging this error! If nobody can take the time to do this themselves, we need advice on how to set up a debugging environment. Furthermore, if someone has further advice or can point to more relevant data, please do so. Thanks in advance and kind regards |
Commented by: @hvlad Please describe what files (dumps or anything else) you have, how many, their sizes, and how you produced them. |
Commented by: Christian Masberg (cubism) Hi! What else do you need, or what might be helpful? It seems that deadlocks are becoming more frequent: one now occurs every two weeks, sometimes every week. Please advise so that we can act differently the next time a crash happens. |
Commented by: @hvlad Put the dumps on the FTP, and compress each file separately. |
Commented by: Christian Masberg (cubism) Hi Vlad, I thought about asking for commercial support myself, too. |
Commented by: @pmakowski you have a list here : http://www.firebirdsql.org/en/support/ |
Commented by: Christian Masberg (cubism) Hi there, Unfortunately our efforts were not rewarded, because today the firebird server software hung another time. As usual the server software and all client processes came to a sudden standstill and new processes couldn't connect to the database. We now created a second db on the same physical server. |
Commented by: Sean Leyne (seanleyne) Christian, Have you looked at the Sweep Interval setting of the database? |
Commented by: Christian Masberg (cubism) Hi Sean, A DB statistic at the end of the day looks like this: Database header page information:
The oldest transaction shows that we have not made a sweep for some time. Is this a relevant factor, or are only OAT and OST of any relevance? Kind regards |
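The counters asked about here are printed by `gstat -h` on the header page. The following is a minimal Python sketch for comparing them, assuming the header uses the labels `Oldest transaction`, `Oldest active`, `Oldest snapshot` and `Next transaction`. The sample values are made up for illustration and are not taken from the database in this ticket.

```python
import re

# Hypothetical gstat -h excerpt; the numbers are illustrative only.
SAMPLE_HEADER = """
        Oldest transaction      1000
        Oldest active           250000
        Oldest snapshot         250000
        Next transaction        250010
"""

# Map the header labels to short keys (OIT, OAT, OST, next transaction).
LABELS = {
    "Oldest transaction": "oit",
    "Oldest active": "oat",
    "Oldest snapshot": "ost",
    "Next transaction": "next",
}


def parse_header(text):
    """Extract the four transaction counters from gstat -h style text."""
    counters = {}
    for line in text.splitlines():
        m = re.match(r"\s*(.+?)\s+(\d+)\s*$", line)
        if m and m.group(1) in LABELS:
            counters[LABELS[m.group(1)]] = int(m.group(2))
    return counters


def transaction_gaps(c):
    """Return the two gaps that are usually inspected first."""
    return {
        "ost_minus_oit": c["ost"] - c["oit"],    # garbage a sweep could clean up
        "next_minus_oat": c["next"] - c["oat"],  # hints at long-running transactions
    }


if __name__ == "__main__":
    print(transaction_gaps(parse_header(SAMPLE_HEADER)))
```

As a rough reading, a large gap between the oldest transaction and the oldest snapshot suggests that a sweep is overdue, while a large gap between the oldest active and the next transaction points to a transaction that has been open for a long time.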
Commented by: Sean Leyne (seanleyne) Christian, 1 - If you are running Classic server you should significantly reduce the size of the page cache/buffers. Depending on the number of simultaneous connections I would recommend a value no larger than 500. Unlike SuperServer or SuperClassic, Classic server performance drops as the cache size increases, due to the time required to synchronize the cache across all engine instances. 2 - I believe that the database has a lot of old record versions, which can be contributing to the "deadlocks" you are seeing. A database sweep is recommended; it does not appear that the database has had a sweep in some time (since Sept 13-14, based on the age of the database and the avg number of transactions per day). |
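Both recommendations can be applied with `gfix`. The sketch below only builds the command lines, so they can be reviewed before anything is run. It assumes `gfix` is on the `PATH`; the database path, user name and password are placeholders, and the helper names are hypothetical.

```python
import subprocess


def gfix_buffers_cmd(database, pages, user, password):
    """Command that sets the default page cache size stored in the database header."""
    return ["gfix", "-buffers", str(pages), database,
            "-user", user, "-password", password]


def gfix_sweep_cmd(database, user, password):
    """Command that runs an immediate manual sweep."""
    return ["gfix", "-sweep", database,
            "-user", user, "-password", password]


def run(cmd):
    """Run one of the commands above and raise if gfix reports an error."""
    subprocess.run(cmd, check=True)


if __name__ == "__main__":
    # Placeholder values; replace them before running against a real server.
    db = "/data/app.fdb"
    print(" ".join(gfix_buffers_cmd(db, 500, "SYSDBA", "secret")))
    print(" ".join(gfix_sweep_cmd(db, "SYSDBA", "secret")))
```

A nightly sweep could then be scheduled by calling `run(gfix_sweep_cmd(...))` from a scheduled task outside business hours.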
Commented by: Christian Masberg (cubism) Hi Sean, At the moment we have the following db settings: Page Size: 4096 Based on your recommendations I would suggest the following settings: Page Size: 4096 (seems to be sufficient, we don't have a table where the number of records exceeds the number of pages) At the same time a sweep will be executed daily at night. Kind regards |
Commented by: Sean Leyne (seanleyne) Christian, Your cache size and nightly sweep are reasonable. I would increase the page size to 8KB/8192, which is the typical disk block size used by many filing systems. By matching the page size to a full multiple (i.e. 1x) of the disk block size, you get the most "bang" for each disk IO. It could also improve index performance if you have very deep indexes (you need to run gstat and look at the "depth" values; a value > 3 is not good). One thing, what is the "KB" you are referring to? |
Commented by: Christian Masberg (cubism) Hi Sean, Thanks for the page size hint. Is 16KB not recommended? Kind regards |
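The depth check mentioned above can be automated. This is a minimal sketch, assuming the gstat index statistics print an `Index NAME (n)` line followed by a `Depth: d, ...` line for each index. The sample text, table and index names are invented for illustration.

```python
import re

# Hypothetical excerpt of gstat index statistics; names and numbers are made up.
SAMPLE_STATS = """
ORDERS (130)

    Index ORDERS_PK (0)
        Depth: 2, leaf buckets: 40, nodes: 52000

    Index ORDERS_CUSTOMER_IDX (1)
        Depth: 4, leaf buckets: 900, nodes: 52000
"""

INDEX_RE = re.compile(r"^\s*Index\s+(\S+)\s+\(\d+\)")
DEPTH_RE = re.compile(r"^\s*Depth:\s*(\d+)")


def deep_indexes(stats_text, max_depth=3):
    """Return (index_name, depth) pairs whose depth exceeds max_depth."""
    deep = []
    current = None
    for line in stats_text.splitlines():
        m = INDEX_RE.match(line)
        if m:
            current = m.group(1)  # remember which index the next Depth line belongs to
            continue
        m = DEPTH_RE.match(line)
        if m and current is not None:
            depth = int(m.group(1))
            if depth > max_depth:
                deep.append((current, depth))
            current = None
    return deep


if __name__ == "__main__":
    for name, depth in deep_indexes(SAMPLE_STATS):
        print(f"{name}: depth {depth}")
```

Any index reported by this check is a candidate to look at again after moving to a larger page size.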
Commented by: Sean Leyne (seanleyne) I have found that 8KB is a "sweet spot" for active databases (our databases range up to 50GB). A 16KB page would require two disk block IOs, which due to disk fragmentation could be at different disk locations, making the IO much more expensive. |
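The arithmetic behind this is simple enough to write down. The sketch below assumes the 8192-byte disk block size used in the comment above; the actual block size depends on the file system and should be checked on the server in question.

```python
import math


def blocks_per_page(page_size, block_size=8192):
    """Number of disk blocks that must be read to fetch one database page.

    block_size defaults to the 8 KB assumed in the discussion; it is an
    assumption, not a property of every file system.
    """
    return math.ceil(page_size / block_size)


if __name__ == "__main__":
    for page_size in (4096, 8192, 16384):
        print(page_size, blocks_per_page(page_size))
```

With these assumptions a 4096 or 8192 byte page fits in one block, while a 16384 byte page spans two, which is the extra IO described above.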
Commented by: Paul Read (nsolve) We have a very similar situation occurring - where all clients freeze and no new connections can be made until FB server is restarted. |
Commented by: Christian Masberg (cubism) Hi Paul, Kind regards |
Commented by: Sean Leyne (seanleyne) Based on the comments posted, it seems that the issue was related to server configuration, not a specific issue which requires engine developer action. As such, this case is closed. |
Modified by: Sean Leyne (seanleyne) status: Open [ 1 ] => Resolved [ 5 ] resolution: Cannot Reproduce [ 5 ] |
Modified by: @pcisar status: Resolved [ 5 ] => Closed [ 6 ] |
Submitted by: Christian Masberg (cubism)
Attachments:
drwtsn32.log
fb_inet_server.exe.mdmp
fb_inet_server.exe.hdmp
drwtsn32_20111024.log
fb_inet_server.exe 20111024.hdmp
Votes: 1
Hi there!
We have been using Firebird server software for 5 years and have experienced just a few problems. For approximately 3 years the server has hung from time to time (about once every half year). That was bearable.
In the last half year the deadlocks of the whole db have become more frequent. In the last few weeks the server hung once a week, making it a real problem.
The majority of the incidents happened during a backup process handled by the FIBS Service (Firebird/Interbase Backup Scheduler), which uses gbak.exe for remote and scheduled backups, but there were also deadlocks during normal business and while executing rather large queries.
In these cases neither a db shutdown followed by bringing it back online nor a restart of the Firebird service solves the problem. Only a restart of the physical server brings the db back to normal.
I have now read a lot of issues and done some web research, and I want to set up Dr. Watson on our server to create a hopefully significant dump file after the next crash. As a next step I would then update the server to 2.1.4, but beforehand I would like to invest some time to maybe discover the cause of the problem.
Concerning the setup I have some questions.
After some searching I found the debug version corresponding to our server, which is 2.1.1 build 17910. How does the debug version differ from the standard one? Is it safe to run in production, or is it just for testing environments? Does it slow down performance in any way? If we update to version 2.1.4, is it advisable to install the debug version by default in case the problem is not resolved? Are there any further tips or hints for setting up the debug environment?
I would be very grateful if any of you could share some experience with this issue! Thank you very much in advance.
Christian