Key: CORE-3857
Type: Bug
Status: Open
Priority: Critical
Assignee: Unassigned
Reporter: anthony jang
Votes: 4
Watchers: 10
Firebird Core

Firebird hangs for a while blocking all DB operations periodically.

Created: 28/May/12 09:50 PM   Updated: 28/Nov/18 10:05 AM
Component/s: Engine
Affects Version/s: 2.5.1
Fix Version/s: None

File Attachments: 1. Zip Archive fb_inet_server_081012.zip (467 kB)
2. Zip Archive fb_inet_server_minidump.zip (242 kB)

Environment: Firebird 2.5.1 Super-Classic x64 on Windows 2008 R2 Server with 32 GB of RAM


Description
We have a Firebird server installation that periodically blocks all operations for a few minutes and then recovers on its own. This has happened about once a month for the last few months. During the blocking period, Firebird CPU usage is unusually low for such a busy server, which normally has 200-300 client attachments. The blocking time has varied from 2 minutes to over 10 minutes. During this time, no Firebird operations can be performed: new connections are blocked, as are existing ones. Upon recovery, Firebird continues processing without any other issues.

We have noticed that this has usually, but not always, occurred during a database sweep. Our sweep interval is set to 0 (automatic sweep disabled) and we run the sweep once a day as a scheduled task.


Sean Leyne added a comment - 28/May/12 10:34 PM
This issue should be directed to the Firebird Support mailing list; it is a support issue. This tracker is NOT a support tool; it is intended only for confirmed problems.

anthony jang added a comment - 14/Jun/12 07:17 PM
A Firebird mini-dump and lock print taken during the blocking period are attached. Vlad mentioned that he would look at them.

Jesus Angel Garcia Zarco added a comment - 06/Jul/12 08:16 AM
I'm having the same issue on a 64-bit Windows 2008 Server R2 machine with 16 GB of RAM.
I use Firebird 2.5.2.
I do not have sweep disabled, and this morning I received a call about the problem. By the time I connected, everything was running fine.
At the time of the problem, there were around 130 attachments.

Jesus Angel Garcia Zarco added a comment - 30/Jul/12 01:23 PM - edited
Hello Anthony, have you discovered anything about this issue?

anthony jang added a comment - 10/Aug/12 06:53 PM
This problem occurred again this morning. Attached is the Firebird dump file. Vlad has been emailed the dump files as well.

anthony jang added a comment - 10/Aug/12 06:54 PM
Jesus,

We do not have a solution to this issue yet. New dump files have been attached that may help to resolve the issue.

Sascha Michel added a comment - 24/Apr/15 10:59 AM
Do you use the Firebird event mechanism on the server?
Do one or more clients use Firebird events?

Sascha Michel added a comment - 01/May/15 11:13 AM - edited
I think that's the same as: http://tracker.firebirdsql.org/browse/CORE-4680

It behaves exactly as I have described there!
Whether it takes 2 minutes or more than 10 minutes depends on when the client process is killed.

Once the client process causing the problem is killed, the server runs as if there had been no problem.

Unfortunately, the problem occurs only very rarely, when events are used. The number of users is also crucial.
I have had this problem for over ten years but could never explain what the real cause is. Unfortunately, it still exists in Firebird 3.0.

I have a workaround for the problem, and since applying it I have had no more trouble with this bug on the same system.
Unfortunately, I had the workaround turned on when I gave a developer the test program, so he could not reproduce the error.

My workaround works like this:
  TKillTimerThread * KillOnStall = new TKillTimerThread( true );
  KillOnStall->Start();
// This thread waits 5 seconds; if the main program does not disable the thread in time, the thread kills the program. Since then I have had no problems with a stalled Firebird server. ( Max 5 seconds ;-) )
  SIBfibEventAlerter1->Registered = false;
  KillOnStall->Terminate();

I think that killing the program immediately closes its network sockets, so the server can resume its work.

I very much hope that the real problem in the server is found at some point, but it no longer concerns me so much.

Siva Ramanathan added a comment - 01/May/15 01:06 PM
We have not seen this problem in the latest Firebird release, 2.5.4, so we believe that it has been resolved. We were not using Firebird events.

Sean Leyne added a comment - 04/Jul/15 05:07 PM
@Sascha,

Have you tested the latest v2.5.x and/or v3.0 Beta 2 releases?

Sascha Michel added a comment - 16/Jul/15 09:28 AM
I have now tested it again with version "LI-V6.3.0.31936 Firebird 3.0 Release Candidate 1".

There are differences from the previous version.

1. The error occurs faster.
2. In the log file, there are errors that were not previously displayed.
3. The server shuts down automatically and doesn't hang.

Here are entries from the log file:

TEST1 !!

FB30 Thu Jul 16 11:10:53 2015
        INET/inet_error: invalid socket in packet_receive errno = 22
FB30 Thu Jul 16 11:10:53 2015
        SRVR_multi_thread: shutting down due to unhandled exception
FB30 Thu Jul 16 11:10:53 2015
        INET/inet_error: accept errno = 9
FB30 Thu Jul 16 11:10:53 2015
        Unable to complete network request to host "FB30".
        Failed to establish a secondary connection for event processing.
        Bad file descriptor
FB30 Thu Jul 16 11:10:53 2015
        Unable to complete network request to host "FB30".
        Error reading data from the connection.
        Invalid argument
FB30 Thu Jul 16 11:10:53 2015
        SRVR_multi_thread: forcefully disconnecting a port
FB30 Thu Jul 16 11:10:53 2015
        Shutting down the server with 7 active connection(s) to 1 database(s), 0 active service(s)
FB30 Thu Jul 16 11:10:53 2015
        /opt/firebird/bin/fbguard: /opt/firebird/bin/firebird normal shutdown.

----------------------------------------------------------------------------------------------------
TEST2 !!
FB30 Thu Jul 16 11:17:06 2015
        INET/inet_error: invalid socket in packet_receive errno = 22
FB30 Thu Jul 16 11:17:06 2015
        SRVR_multi_thread: shutting down due to unhandled exception
FB30 Thu Jul 16 11:17:06 2015
        Unable to complete network request to host "FB30".
        Error reading data from the connection.
        Invalid argument
FB30 Thu Jul 16 11:17:06 2015
        SRVR_multi_thread: forcefully disconnecting a port
FB30 Thu Jul 16 11:17:06 2015
        Shutting down the server with 3 active connection(s) to 1 database(s), 0 active service(s)
FB30 Thu Jul 16 11:17:06 2015
        INET/inet_error: accept errno = 9
FB30 Thu Jul 16 11:17:06 2015
        Unable to complete network request to host "FB30".
        Failed to establish a secondary connection for event processing.
        Bad file descriptor
FB30 Thu Jul 16 11:17:06 2015
        /opt/firebird/bin/fbguard: /opt/firebird/bin/firebird normal shutdown.


Veselin Pavlov added a comment - 03/Aug/16 10:12 AM - edited
I am also experiencing this problem. Most of the time it happens early in the morning, while the first users are connecting.
I don't know if it's related, but in the log file I have a lot of
INET/inet_error: send errno = 104
and some
invalid socket in packet_receive errno = 22

Server Version: LI-V2.5.6.27008 Firebird 2.5
Server Implementation: Firebird/linux AMD64
Service Version: 2

During active hours we have:
Number of connections: 198
Number of databases: 13

On application login, a lot of events are registered.

Han added a comment - 28/Nov/18 07:09 AM
I am experiencing the same problem.
How do I apply the workaround from Sascha Michel?

My workaround works like this:
  TKillTimerThread * KillOnStall = new TKillTimerThread( true );
  KillOnStall->Start();
// This thread waits 5 seconds; if the main program does not disable the thread in time, the thread kills the program. Since then I have had no problems with a stalled Firebird server. ( Max 5 seconds ;-) )
  SIBfibEventAlerter1->Registered = false;
  KillOnStall->Terminate();
 
Help is greatly appreciated.

Vlad Khorsun added a comment - 28/Nov/18 09:11 AM
a) Make sure you use the latest Firebird release.
b) We need something to reproduce the issue; so far there is not enough information.

Sascha Michel added a comment - 28/Nov/18 10:05 AM
On Windows, the thread that is started does nothing other than hard-terminate the application.
If everything runs normally when the database connection is closed, the thread is terminated early and the kill is never executed.
In the latest Firebird 3 version this should no longer be necessary!


void __fastcall TKillTimerThread::Execute()
{
  this->FreeOnTerminate = true;
  // Give the main thread 5 seconds to finish and call Terminate().
  this->Sleep(5000);
  if ( !this->Terminated)
  {
    // Not terminated in time: look up our own process ID and force-kill the process.
    DWORD processID;
    GetWindowThreadProcessId( Application->Handle , &processID);
    AnsiString CMD = AnsiString("taskkill /F /PID ") + AnsiString(processID).c_str();
    WinExec( CMD.c_str() ,0);
  }
}
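
For readers who are not on C++Builder, the same watchdog idea can be sketched in standard C++. This is only an illustration under assumptions: `StallWatchdog`, `disarm` and `fired` are hypothetical names that do not come from Firebird or any client library, and the action to run on timeout is supplied by the caller. Unlike the code above, it wakes up as soon as it is disarmed instead of always sleeping for the full interval.

```cpp
#include <atomic>
#include <cassert>
#include <chrono>
#include <condition_variable>
#include <functional>
#include <mutex>
#include <thread>
#include <utility>

// Hypothetical helper (not part of Firebird or any client library):
// runs on_timeout once unless disarm() is called before the deadline.
class StallWatchdog {
public:
    StallWatchdog(std::chrono::milliseconds timeout,
                  std::function<void()> on_timeout)
        : action_(std::move(on_timeout)) {
        assert(action_ && "on_timeout must be callable");
        // Start the worker only after all other members are initialized.
        worker_ = std::thread([this, timeout] { run(timeout); });
    }

    ~StallWatchdog() {
        disarm();
        if (worker_.joinable()) worker_.join();
    }

    StallWatchdog(const StallWatchdog&) = delete;
    StallWatchdog& operator=(const StallWatchdog&) = delete;

    // Call this after the cleanup step has returned normally.
    void disarm() {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            disarmed_ = true;
        }
        cv_.notify_all();
    }

    // True once the deadline passed while the watchdog was still armed.
    bool fired() const { return fired_.load(); }

private:
    void run(std::chrono::milliseconds timeout) {
        std::unique_lock<std::mutex> lock(mutex_);
        // wait_for returns the predicate's value: false means the
        // deadline passed and disarm() was never called.
        const bool disarmed =
            cv_.wait_for(lock, timeout, [this] { return disarmed_; });
        lock.unlock();
        if (!disarmed) {
            fired_.store(true);
            action_();
        }
    }

    std::function<void()> action_;
    std::mutex mutex_;
    std::condition_variable cv_;
    bool disarmed_ = false;
    std::atomic<bool> fired_{false};
    std::thread worker_;
};
```

In a real client the watchdog would be created just before the step that can stall (here, unregistering the event alerter) and disarmed right after it returns. The timeout action would then end the process immediately, for example by calling `std::_Exit` from `<cstdlib>`, rather than setting a flag.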