Auto sweeper kills its own connection [CORE4745] #5050
Comments
Commented by: @hvlad Sweep can't shut down its own attachment. It could stop because of an error, but we see no error message in firebird.log. |
Commented by: Omacht András (aomacht) Vlad, I am 100% sure the users are unable to kill the process through the monitoring tables, because they have no rights to do so. And the administrators (my colleagues and I) didn't do that either. Anyway, you are right, there are no other messages in the log file. |
Commented by: @hvlad Could it be the Linux kernel that kills a "too heavy" process? |
Commented by: Omacht András (aomacht) No, this is a barely used server with at least 16 cores and 32 GB of RAM free (used for disk cache) all the time. The database size is ~1 GB. |
Commented by: @AlexPeshkoff Was the server started as classic or superclassic? |
Commented by: Omacht András (aomacht) Alexander, as classic |
Commented by: @AlexPeshkoff In that case the reason is almost obvious. After a backup without garbage collection, Firebird detects that a sweep pass is needed for the database and starts it as a separate thread in one of the processes. But a classic process gets terminated when the user detaches from the database. At this moment we see a record in the log: xxx Thu Apr 9 15:53:16 2015 This connection is the sweep. And because the engine, on termination, brings all connections to shutdown mode, the next thing we see is: xxx Thu Apr 9 15:53:16 2015 This is not dangerous, but definitely annoying. I see the following possible ways to fix the issue: |
Commented by: @hvlad Alex, > I see the following possible ways to fix an issue: This is how it should work, and I believe I tested it before the 2.5.0 release, when I made auto-sweep run non-synchronously in Classic... > - or from the very beginning start sweep as a separate process on classic. It can't help, as the user attachment which triggered auto-sweep still acts as a client, and its termination will terminate the sweep process too. |
Commented by: Omacht András (aomacht) Alex, I can imagine this happened the first time: Starts: Thu Apr 9 15:51:31 2015 but the 2nd and 3rd time the interval between start and kill is 0. Thu Apr 9 15:52:55 2015 vs. Thu Apr 9 15:52:55 2015 We made the backup WITHOUT the -g option (I know this is slower), so the garbage collector is running, and the log shows everything is OK during backup time. |
Commented by: Attila Molnár (e_pluribus_unum) Hi! Same error in the log here. At 14 Apr 2015 07:34:38 +0200 I started a long-running process that maintains the database schema and data by running a lot of DDL, DQL and DML commands. Error on the client side: "Database is probably already opened by another engine instance in another Windows session." at 14 Apr 2015 07:37:53 +0200. There is no concurrency on the client side; calls are made from the app's main thread. Logs on the server side (2.5.4 Classic Server): L3S-4 Tue Apr 14 07:35:45 2015 L3S-4 Tue Apr 14 07:36:03 2015 L3S-4 Tue Apr 14 07:36:03 2015 L3S-4 Tue Apr 14 07:36:03 2015 L3S-4 Tue Apr 14 07:36:59 2015 L3S-4 Tue Apr 14 07:36:59 2015 L3S-4 Tue Apr 14 07:36:59 2015 L3S-4 Tue Apr 14 07:37:01 2015 L3S-4 Tue Apr 14 07:37:01 2015 L3S-4 Tue Apr 14 07:37:01 2015 L3S-4 Tue Apr 14 07:37:19 2015 L3S-4 Tue Apr 14 07:37:19 2015 L3S-4 Tue Apr 14 07:37:19 2015 L3S-4 Tue Apr 14 07:37:20 2015 L3S-4 Tue Apr 14 07:37:20 2015 L3S-4 Tue Apr 14 07:37:20 2015 L3S-4 Tue Apr 14 07:37:21 2015 L3S-4 Tue Apr 14 07:37:21 2015 L3S-4 Tue Apr 14 07:37:21 2015 L3S-4 Tue Apr 14 07:37:25 2015 L3S-4 Tue Apr 14 07:37:25 2015 L3S-4 Tue Apr 14 07:37:25 2015 L3S-4 Tue Apr 14 07:37:53 2015 L3S-4 Tue Apr 14 07:37:53 2015 |
Commented by: Attila Molnár (e_pluribus_unum) Sweep starts when I connect to the DB. It looks like the "Shutting down" messages come when the sweep has not finished and I disconnect from the DB. L3S-4 Tue Apr 14 08:09:33 2015 L3S-4 Tue Apr 14 08:12:14 2015 |
Modified by: @hvlad assignee: Vlad Khorsun [ hvlad ] |
Commented by: Omacht András (aomacht) Vlad, I can accept Alex's explanation. We have some maintenance processes that run for ~0 seconds (connect and disconnect in less than a second); these processes can cause the problem. From the log it seems that if a longer connection starts the GC, it can do its job well. (Attila's ticket, CORE4751, is more interesting, as there was an error on the client side.) |
Commented by: Sean Leyne (seanleyne) @vlad and Alex, It seems wrong that the engine would trigger anything on Classic as the result of a backup process, since once the backup is complete the connection will be closed. Is there any way to suppress the sweep in that case? (IMO the current error messages need to be trapped, as they suggest a significant operational issue.) |
Commented by: @hvlad > Vlad, I can accept Alex's explanation. > Anyway there is no error on the client side, and the database works well too. > So, please close this ticket. |
Commented by: @hvlad Sean, > It seems wrong that the engine would trigger anything on Classic as the result of a backup process, since once the backup is complete the connection will be closed. > Is there any way to suppress the sweep in that case? > (IMO the current error messages need to be trapped -- as they suggest a significant operational issue) |
We haven’t experienced this error in the last few years, so the issue can be closed. |
Submitted by: Omacht András (aomacht)
On different databases, at different times, the auto sweeper dies and restarts three or four times before it finally completes the job.
Log:
xxx Thu Apr 9 15:51:31 2015
Sweep is started by SWEEPER
Database "/u1/firebird/db/yyy.fdb"
OIT 229037, OAT 229038, OST 229038, Next 263970
xxx Thu Apr 9 15:51:53 2015
Shutting down the server with 1 active connection(s) to 1 database(s), 0 active service(s)
xxx Thu Apr 9 15:51:54 2015
Error during sweep:
connection shutdown
xxx Thu Apr 9 15:52:55 2015
Sweep is started by SWEEPER
Database "/u1/firebird/db/yyy.fdb"
OIT 232570, OAT 253894, OST 253894, Next 263979
xxx Thu Apr 9 15:52:55 2015
Shutting down the server with 1 active connection(s) to 1 database(s), 0 active service(s)
xxx Thu Apr 9 15:52:55 2015
Error during sweep:
connection shutdown
xxx Thu Apr 9 15:53:16 2015
Sweep is started by SWEEPER
Database "/u1/firebird/db/yyy.fdb"
OIT 232570, OAT 253894, OST 253894, Next 263985
xxx Thu Apr 9 15:53:16 2015
Shutting down the server with 1 active connection(s) to 1 database(s), 0 active service(s)
xxx Thu Apr 9 15:53:16 2015
Error during sweep:
connection shutdown
xxx Thu Apr 9 15:53:23 2015
Sweep is started by SWEEPER
Database "/u1/firebird/db/yyy.fdb"
OIT 232570, OAT 253894, OST 253894, Next 263989
xxx Thu Apr 9 15:57:44 2015
Sweep is finished
Database "/u1/firebird/db/yyy.fdb"
OIT 253893, OAT 253894, OST 253894, Next 264042
After backup/restore the same happens at random. Backup without -g works fine.
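
For anyone who wants to check their own firebird.log for the same pattern, here is a minimal Python sketch that pairs each "Sweep is started" entry with the "Error during sweep" or "Sweep is finished" entry that follows it and reports how long each attempt lasted. It assumes the entry layout shown in the excerpt above (a header line with host name and timestamp, followed by message lines) and an English locale for the day and month names; the names `parse_entries`, `sweep_runs` and `SAMPLE` are made up for this illustration and are not part of Firebird.

```python
import re
from datetime import datetime

# Header line of an entry, e.g. "xxx Thu Apr 9 15:51:31 2015" (host, then timestamp).
# Assumption: the host name contains no whitespace.
HEADER = re.compile(r"^(\S+)\s+(\w{3} \w{3}\s+\d{1,2} \d{2}:\d{2}:\d{2} \d{4})$")

def parse_entries(text):
    """Split log text into (timestamp, message) pairs."""
    entries = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        m = HEADER.match(line)
        if m:
            # Collapse repeated spaces before parsing the timestamp.
            stamp = " ".join(m.group(2).split())
            ts = datetime.strptime(stamp, "%a %b %d %H:%M:%S %Y")
            entries.append((ts, []))
        elif entries:
            entries[-1][1].append(line)
    return [(ts, " ".join(msg)) for ts, msg in entries]

def sweep_runs(entries):
    """Return (outcome, seconds) for every sweep attempt found in the entries."""
    runs, started = [], None
    for ts, msg in entries:
        if msg.startswith("Sweep is started"):
            started = ts
        elif started is not None and msg.startswith("Error during sweep"):
            runs.append(("interrupted", (ts - started).total_seconds()))
            started = None
        elif started is not None and msg.startswith("Sweep is finished"):
            runs.append(("finished", (ts - started).total_seconds()))
            started = None
    return runs

# First and last sweep attempts copied from the log in this report.
SAMPLE = """\
xxx Thu Apr 9 15:51:31 2015
Sweep is started by SWEEPER
Database "/u1/firebird/db/yyy.fdb"
OIT 229037, OAT 229038, OST 229038, Next 263970
xxx Thu Apr 9 15:51:53 2015
Shutting down the server with 1 active connection(s) to 1 database(s), 0 active service(s)
xxx Thu Apr 9 15:51:54 2015
Error during sweep:
connection shutdown
xxx Thu Apr 9 15:53:23 2015
Sweep is started by SWEEPER
Database "/u1/firebird/db/yyy.fdb"
OIT 232570, OAT 253894, OST 253894, Next 263989
xxx Thu Apr 9 15:57:44 2015
Sweep is finished
Database "/u1/firebird/db/yyy.fdb"
OIT 253893, OAT 253894, OST 253894, Next 264042
"""

if __name__ == "__main__":
    print(sweep_runs(parse_entries(SAMPLE)))
    # [('interrupted', 23.0), ('finished', 261.0)]
```

Attempts that are interrupted after 0 seconds, like the second and third ones in the full log, would point to very short-lived connections triggering the sweep, which matches the explanation given in the comments.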