Users cannot connect to database [Fatal lock manager error: invalid lock id (100376), errno: 2] [CORE1872] #2303
Comments
Commented by: Kevin Smith (kevinsmith) fb_lock_print command strace file
Modified by: Kevin Smith (kevinsmith) Attachment: fb_lock_print.log [ 10870 ]
Commented by: Kevin Smith (kevinsmith) GFIX strace output
Modified by: Kevin Smith (kevinsmith) Attachment: gfix.log [ 10871 ]
Modified by: Kevin Smith (kevinsmith) Attachment: isc_lock1.localhost.zip [ 10872 ]
Commented by: @AlexPeshkoff Kevin, Firebird 1.5 is really old. I agree that as long as it works for you, there may be no reason to upgrade. But when it does not... Please upgrade and check whether the problem is reproducible with a current Firebird. Only critical security fixes may go into 1.5; lock manager deadlocks are not fixed there. I'd be interested to hear whether the problem still occurs in Firebird 2.x.
Commented by: Kevin Smith (kevinsmith) We use Firebird 2 on two production servers running our ERP system (with 30 and 5 concurrent users respectively).
Commented by: @dyemanov Not reproducible in FB v2.x.
Modified by: @pcisar status: Resolved [ 5 ] => Closed [ 6 ]
Submitted by: Kevin Smith (kevinsmith)
Attachments:
fb_lock_print.log
gfix.log
isc_lock1.localhost.zip
Sometimes, when the first user tries to connect to the database:
* the fb_inet_server process that handles the connection takes 100% of CPU and never finishes
* as a result, the user cannot connect to the database
* the next user also cannot connect, but the next fb_inet_server does not take 100% of CPU (it just hangs)
When I kill the first fb_inet_server process that takes 100% of CPU, the next one starts taking 100% of CPU, and so on...
In /opt/firebird/firebird.log I can see:
-- cut here -------------
localhost.localdomain Mon Apr 28 07:53:21 2008
Fatal lock manager error: invalid lock id (100376), errno: 2
-- cut here -------------
Here's htop output
-- cut here -------------
1 [ 0.0%] Tasks: 103 total, 2 running
2 [| 0.7%] Load average: 1.01 1.53 2.65
3 [ 0.0%] Uptime: 4 days, 01:32:40
4 [|||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||100.0%]
Mem[|||||||||||||||||||||||||||||||||||||||| 300/3162MB]
Swp[| 0/4996MB]
PID USER PRI NI VIRT RES SHR S CPU% MEM% TIME+ Command
5373 firebird 25 0 45124 21512 11728 R 99.7 1.7 49:52.87 fb_inet_server
-- cut here -------------
Please note the PID of the fb_inet_server process: 5373. When I try to strace it, strace shows nothing:
-- cut here -------------
[root@localhost public]# strace -p 5373
Process 5373 attached - interrupt to quit
-- cut here -------------
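An empty strace on a process that sits in state R at close to 100% CPU usually means it is looping in user space without making system calls, so a stack trace tends to say more than a syscall trace. A minimal sketch, assuming procps `ps` and `gdb` are available on the server; the helper names `proc_state` and `stack_dump` are hypothetical and not part of Firebird:

```shell
#!/bin/sh
# Hypothetical helpers for inspecting a spinning process; not Firebird tools.

# Print scheduler state, CPU share and command name for one PID.
# State "R" together with a high %CPU points at a user-space busy loop.
proc_state() {
    ps -o stat=,pcpu=,comm= -p "$1"
}

# Attach gdb non-interactively and dump the stack of every thread.
# Assumes gdb is installed and the caller is allowed to attach to the PID.
stack_dump() {
    gdb -p "$1" -batch -ex "thread apply all bt"
}
```

For the process in the htop output above, that would be `proc_state 5373` followed by `stack_dump 5373`, run as root or as the firebird user.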
I also cannot run the gfix and fb_lock_print commands against the database:
1. gfix -housekeeping 0 database.gdb (I've attached strace log)
2. fb_lock_print -a (I've attached strace log)
Those commands also take 100% of CPU and never finish.
The only thing I can do is:
* kill all fb_inet_server processes (killall -TERM fb_inet_server)
* stop the Firebird server (/etc/init.d/xinetd stop)
* delete the isc_lock1.localhost.localdomain file from the /opt/firebird directory (I've attached this file to the issue...)
After that, when I start the Firebird server again, users can connect without any problems.
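The manual recovery above can be sketched as a small script. This is only an illustration of the reported steps, not an official procedure: the `DRY_RUN`, `FB_DIR` and `LOCK_FILE` variables and the `run` and `recover_lock_table` helpers are assumptions made for the example, and the final `xinetd start` assumes the init script accepts `start` in the same way it accepts `stop`. By default the script only prints what it would do.

```shell
#!/bin/sh
# Sketch of the manual workaround from this report (Firebird 1.5 Classic under xinetd).
# DRY_RUN, FB_DIR and LOCK_FILE are illustrative variables, not Firebird settings.
DRY_RUN="${DRY_RUN:-1}"
FB_DIR="${FB_DIR:-/opt/firebird}"
LOCK_FILE="${LOCK_FILE:-$FB_DIR/isc_lock1.localhost.localdomain}"

# Print the command in dry-run mode, execute it otherwise.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "would run: $*"
    else
        "$@"
    fi
}

recover_lock_table() {
    # Same order as in the report: kill servers, stop xinetd, remove the lock file.
    run killall -TERM fb_inet_server
    run /etc/init.d/xinetd stop
    run rm -f "$LOCK_FILE"
    # Assumed restart step; the report only says the server is started again.
    run /etc/init.d/xinetd start
}

recover_lock_table
```

Setting `DRY_RUN=0` executes the commands for real. Stopping xinetd before killing the server processes may be safer, since it keeps new connections from spawning fresh fb_inet_server processes in between.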
Tested on Firebird 1.5 (LI-V1.5.3.4870 Firebird 1.5)
I have hit this situation 3 times, each time on a different production server running a different Fedora Core version.
Best regards,
Kevin Smith