Server is Crashing after Building up Memory when One or More Clients are Connected Concurrently [CORE3385] #3751
Comments
Commented by: @dyemanov I removed the overly long firebird.log contents; please attach the log as a separate file. |
Modified by: @dyemanov security: Developers [ 10012 ] => |
Commented by: @dyemanov Which exact v2.5.1 build do you use? Also, which platform (win32 / win64)? All the errors in the log are related to an out-of-memory condition: the server process has run out of virtual memory. I suppose this is the 32-bit build and you are experiencing some kind of memory leak. What FB version did you run before trying v2.5.1? |
Commented by: Andre van Zuydam (andrevanzuydam) Log file with the errors before a crash of Firebird |
Modified by: Andre van Zuydam (andrevanzuydam) Attachment: firebird.log [ 11918 ] |
Modified by: Andre van Zuydam (andrevanzuydam) description: We have a test system running on 2.5.1 snapshot, each week (plus minus 7 - 8 days) apart we have to manually restart / initialize the Firebird service which crashes out. Looking into the logs points to a possible network which we have had a look at, unfortunatelly I get INET/inet_error: read errno = 10054 on localhost development machine every time I work. One possible cause is perhaps the sweep that failed ? The five clients were connected but could not query the server after such an incident which also makes the problem strange. The logs of the past week since the last crash and today are below. => We have a test system running on 2.5.1 snapshot, each week (plus minus 7 - 8 days) apart we have to manually restart / initialize the Firebird service which crashes out. Looking into the logs points to a possible network problem which we have had a look at, unfortunately I get INET/inet_error: read errno = 10054 on local host development machine every time I work. One possible cause is perhaps the sweep that failed ? The five clients were connected but could not query the server after such an incident which also makes the problem strange. The logs of the past week since the last crash and today are below. environment: Windows 7 Professional => Windows 7 Professional, Firebird Snapshot 2.5.1.26208 |
Commented by: Andre van Zuydam (andrevanzuydam) We have now tested with Super Classic version of Firebird and the memory on the system is being released correctly! How do we debug memory leaks on Firebird Super Server ? |
Commented by: @dyemanov Have you looked into the MON$MEMORY_USAGE table? |
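As a rough illustration of what looking into MON$MEMORY_USAGE could involve, the sketch below groups memory counters by statistics group (database, attachment, transaction, statement, call). The query text uses the monitoring-table column names as documented for Firebird 2.5; the `summarize_memory` helper and the sample rows are hypothetical, and fetching the rows through a database driver is left out on purpose.

```python
from collections import defaultdict

# Query to run against the affected database with any Firebird client or driver.
MEMORY_USAGE_QUERY = """
SELECT MON$STAT_GROUP, MON$MEMORY_USED, MON$MEMORY_ALLOCATED
FROM MON$MEMORY_USAGE
"""

# Meaning of MON$STAT_GROUP values in the monitoring tables.
STAT_GROUPS = {0: "database", 1: "attachment", 2: "transaction",
               3: "statement", 4: "call"}


def summarize_memory(rows):
    """Sum (memory_used, memory_allocated) per statistics group.

    rows: iterable of (stat_group, memory_used, memory_allocated) tuples,
    in the column order of MEMORY_USAGE_QUERY.
    """
    totals = defaultdict(lambda: [0, 0])
    for group, used, allocated in rows:
        name = STAT_GROUPS.get(group, f"unknown({group})")
        totals[name][0] += used
        totals[name][1] += allocated
    return {name: tuple(values) for name, values in totals.items()}


# Hypothetical rows, only to show the shape of the result.
sample_rows = [
    (0, 4_700_000, 5_000_000),
    (1, 120_000, 131_072),
    (1, 95_000, 131_072),
    (2, 8_000, 16_384),
]
summary = summarize_memory(sample_rows)
for name, (used, allocated) in sorted(summary.items()):
    print(f"{name:12s} used={used:>10d} allocated={allocated:>10d}")
```

Comparing two such summaries taken some hours apart would show which group, if any, keeps growing while clients stay connected.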
Commented by: Andre van Zuydam (andrevanzuydam) Ok, we've done some extensive testing and can now reproduce the problem at will. The problem happens when two client applications connect to the Super Server concurrently. As each client connects, a small increase in memory occurs on fbserver.exe in Task Manager. (Over a week this produces a crash.) I'm sure this will not happen in standard circumstances, but we have a service which polls the database engine every 10 seconds, and it is this service that causes a build-up of memory once another connection happens. If the service is running by itself, it would be happy indefinitely. If a local client to the service or a remote client connects, Firebird starts building up memory. Assuming these clients stay connected for a long time or perform reconnections to the database, what we find is that the memory is freed up but about 100 - 200K of memory always stays occupied. Only once all the client applications have disconnected does the Firebird Server go back to its normal memory state (about 4600K on our system). As long as two of the clients remain connected, the memory builds up; all clients must then disconnect before Firebird memory goes back to normal. If only one client is connected, the server is stable and memory does not increase. Does this sound like a shared memory problem ? Super Classic does not have any of these drawbacks and behaves correctly. What can I do to help debug this ? |
Modified by: Andre van Zuydam (andrevanzuydam) summary: Server is Crashing after cannot start sweep thread (0) => Server is Crashing after Building up Memory when One or More Clients are Connected Concurrently |
Commented by: @dyemanov Thanks for the information, hopefully it will help us to find this memory leak. I will report back if more input is required from your side. |
Modified by: @dyemanov assignee: Dmitry Yemanov [ dimitr ] |
Modified by: @dyemanov status: Open [ 1 ] => In Progress [ 3 ] |
Commented by: @dyemanov While I'm searching for the possible memory leak, could you please retry SuperServer and monitor the OST (oldest snapshot transaction) counter with gstat -h, to see whether it gets stuck or not. |
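A minimal sketch of how the OST check could be automated, assuming the header counters appear in `gstat -h` output under the labels "Oldest transaction", "Oldest active", "Oldest snapshot" and "Next transaction". The helper names and the sample numbers are hypothetical; in practice the two texts would be captured from real `gstat -h` runs taken some time apart.

```python
import re

COUNTER_LINE = re.compile(
    r"\s*(Oldest transaction|Oldest active|Oldest snapshot|Next transaction)"
    r"\s+(\d+)\s*$"
)


def parse_gstat_header(text):
    """Extract the transaction counters from `gstat -h` output."""
    counters = {}
    for line in text.splitlines():
        match = COUNTER_LINE.match(line)
        if match:
            counters[match.group(1)] = int(match.group(2))
    return counters


def ost_is_stuck(earlier, later):
    """True when Next transaction advanced but Oldest snapshot did not move."""
    return (later["Next transaction"] > earlier["Next transaction"]
            and later["Oldest snapshot"] == earlier["Oldest snapshot"])


# Hypothetical samples taken some hours apart (numbers are made up).
sample_first = """
Database header page information:
        Oldest transaction      1201
        Oldest active           1202
        Oldest snapshot         1202
        Next transaction        1950
"""
sample_second = """
Database header page information:
        Oldest transaction      1201
        Oldest active           1202
        Oldest snapshot         1202
        Next transaction        5400
"""

first = parse_gstat_header(sample_first)
second = parse_gstat_header(sample_second)
print("OST stuck:", ost_is_stuck(first, second))
print("gap:", second["Next transaction"] - second["Oldest snapshot"])
```

A single pair of samples is only a hint. The counter would need to stay fixed across a long interval, while Next transaction keeps climbing, before calling it stuck.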
Commented by: Andre van Zuydam (andrevanzuydam) Hi Dmitry, sorry for the delay in posting. Here is a sample of the gstat -h output. What exactly should I be looking for here? Database header page information:
I'm getting a lot of INET/inet_error: read errno = 10054 entries in my logs, which I do not think are related to network hardware. After this happens the clients disconnect from the database and we have to restart the engine. Is this something that I can prevent, or is this a bug ? |
Commented by: Andre van Zuydam (andrevanzuydam) Another log, from the Super Server version at another site: memory is building up at a regular pace, about 100KB per transaction, and only occasionally seems to free some up. Two days later Firebird is now using 245MB of RAM, with 6 clients connected permanently, 24x7. Database "----.FDB"
|
Commented by: @hvlad Transaction management is far from perfect. |
Commented by: Andre van Zuydam (andrevanzuydam) Hi Vlad, I've set the cooperative policy on. I suppose this is how Classic server runs? We do get more performance out of Super Server though, and this is why we want to run it. We perform many inserts while operating, as our system is transactional in terms of how the data is stored; updates are very few, and delete operations are limited to archiving a single table to another database. We are using a stored proc to connect to the other database to send the data, could this be where the memory is leaking ? Another thing we have tested is that a normal RAM cleaner app will bring the memory use down on the Firebird server, but this is not ideal. I am also open to poor programming on my side: how can I test if my transactions are really getting closed ? I definitely call close transaction after I do a query and statement. Something bothered me on Firebird 2.5: some of the transactions I opened reported a 501 error of attempting to close an already closed cursor, which the same code / client did not report in 2.1. I changed my transaction closing method to use the DSQL_UNPREPARE parameter instead of DSQL_DROP or DSQL_CLOSE, which "seemed" to fix this problem. The transactions which returned cursor errors were update or execute statements for stored procs, which in most cases do not return results. I had similar problems with update / insert statements with returning values. Something definitely changed in the client after 2.1 which started this. Perhaps there is a simple explanation for these changes which will allow me to correct my code too ? Thank you for your help so far. |
Commented by: Andre van Zuydam (andrevanzuydam) The cooperative policy on Firebird Super Server does not seem to work as the memory is still building on Super Server. Classic server works perfectly I might add and is still stable. |
Commented by: @hvlad

> I've set the cooperative policy on,

Did you restart Firebird after editing firebird.conf ?

> I suppose this is how Classic server runs?

Not exactly. It disables background garbage collection and the corresponding in-memory structures. As you have a stuck OST number, these in-memory structures are not cleaned up. The suggestion to switch to the cooperative GC policy was given to confirm this idea.

> We are using a stored proc to connect to the other database to send the data, could this be where the memory is leaking ?

I doubt it.

> Some other things we have tested is that a normal RAM cleaner app will bring the memory use down on the Firebird server, this is not ideal.

A RAM cleaner on a database server machine ? Is this a joke ?

> I am also open to poor programming on my side, how can I test if my transactions are really getting closed ?

On the program side, you can check that the transaction handle becomes zero.

> I definitely call close transaction after I do a query and statement, there is something that bothered me on Firebird 2.5, some of the transactions I opened reported a 501 error of attempting to close an already closed cursor which the same code / client did not report in 2.1, I changed my transaction closing method to use the DSQL_DROP which "seemed" to fix this problem.

Sure. In v2.5 it is not allowed to close a cursor when you have no cursor :) This is exactly your case: neither UPDATE nor EXECUTE PROCEDURE returns a cursor. But this shouldn't affect the transaction state, unless your code flow does not call commit (or rollback) after such an error. BTW, DSQL_DROP is NOT a "transaction closing method". It is an option for closing a *statement*.

> The cooperative policy on Firebird Super Server does not seem to work as the memory is still building on Super Server.

Again, did you restart Firebird after editing firebird.conf ?

> Classic server works perfectly I might add and is still stable.

|
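Besides the restart, a setting that is still commented out in firebird.conf will not take effect either. The sketch below shows one way to check which GCPolicy value is actually active in a configuration text. The `effective_gc_policy` helper is hypothetical, the "combined" fallback assumes the documented SuperServer default, and treating the parameter name as case-insensitive is an assumption.

```python
def effective_gc_policy(conf_text, default="combined"):
    """Return the GCPolicy value that is active in a firebird.conf text.

    Everything after '#' on a line is treated as a comment, so a line such as
    '#GCPolicy = cooperative' does not count. The last active assignment wins.
    """
    policy = default
    for raw_line in conf_text.splitlines():
        line = raw_line.split("#", 1)[0].strip()
        if not line or "=" not in line:
            continue
        key, value = (part.strip() for part in line.split("=", 1))
        if key.lower() == "gcpolicy":  # assumption: name is case-insensitive
            policy = value.lower()
    return policy


# Hypothetical configuration fragments.
still_commented = "# Garbage collection policy\n#GCPolicy = cooperative\n"
really_enabled = "# Garbage collection policy\nGCPolicy = cooperative\n"

print(effective_gc_policy(still_commented))
print(effective_gc_policy(really_enabled))
```

In real use the text would come from reading the server's firebird.conf, and the server still has to be restarted for a change to apply.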
Modified by: @dyemanov status: In Progress [ 3 ] => Open [ 1 ] |
Modified by: @dyemanov assignee: Dmitry Yemanov [ dimitr ] => |
Commented by: Andre van Zuydam (andrevanzuydam) Hi Vlad, I am definitely restarting Firebird on each config change, and memory is still building up (the increase is very small, about 100K for each instance that gets run). After a week the server will crash or stop responding. The RAM cleaner on the database server was not a joke, only a test to see if the memory was still being accessed. Unfortunately the machines we deploy on are not dedicated servers; we may not have come across this problem on a dedicated server. I must add that I do not have this problem on a Linux server, so it must be a Windows thing ? Thank you for your replies with regard to the coding, and again, I am always restarting Firebird when making conf changes. I have also tried running Super Server without Guardian as an extra test, to no avail. My next resort is to build a standalone exe to replicate the problem. I think this will help with troubleshooting, so please wait for this, as I need to simulate what is happening and then we can work from there ? |
Submitted by: Andre van Zuydam (andrevanzuydam)
Attachments:
firebird.log
Votes: 1
We have a test system running on a 2.5.1 snapshot. Roughly once a week (every 7 - 8 days) we have to manually restart / initialize the Firebird service, which crashes.
The logs point to a possible network problem, which we have looked into. Unfortunately, I get INET/inet_error: read errno = 10054 on my localhost development machine every time I work.
One possible cause is perhaps the sweep that failed ? The five clients were connected but could not query the server after such an incident, which also makes the problem strange.
The logs of the past week since the last crash and today are below.