segfault on fb_inet_server processes [CORE3071] #3450

Closed
firebird-automations opened this issue Jul 17, 2010 · 28 comments

Comments

@firebird-automations
Collaborator

Submitted by: @kattunga

Attachments:
backtrace.zip

I have been testing Firebird 2.5 RC2 and, since yesterday, the daily snapshots (RC3) on a production server.

/var/log/kern.log is filling up with the following messages:
...
Jul 15 13:14:10 server-sig kernel: [147921.412548] fb_inet_server[11493]: segfault at b35192b4 ip b35192b4 sp b6d1f34c error 14
Jul 15 13:14:10 server-sig kernel: [147921.459548] fb_inet_server[11499]: segfault at b34b72b4 ip b34b72b4 sp b6ccd34c error 14
Jul 15 13:14:13 server-sig kernel: [147924.197462] fb_inet_server[11506]: segfault at b35432b4 ip b35432b4 sp b6d4934c error 14
Jul 15 13:14:15 server-sig kernel: [147926.527755] fb_inet_server[11512]: segfault at b35e42b4 ip b35e42b4 sp b6dfa34c error 14
Jul 15 13:14:15 server-sig kernel: [147926.708456] fb_inet_server[11518]: segfault at b35fe2b4 ip b35fe2b4 sp b6e0434c error 14
Jul 15 13:14:16 server-sig kernel: [147927.379181] fb_inet_server[11524]: segfault at b35ef2b4 ip b35ef2b4 sp b6e0534c error 14
...

The segfault is raised when the process is destroyed (on client disconnect). The same problem occurs with fb_smp_server when the service is stopped.
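
A minimal sketch for reading these kernel lines, assuming the standard x86 message format in which the trailing `error` value is the page-fault error code printed in hexadecimal. The function name `parse_segfault` and the flag labels are illustrative and not part of any Firebird or kernel tool.

```python
import re

# Matches the x86 kernel "segfault at" line format seen in the log above.
SEGFAULT_RE = re.compile(
    r"(?P<proc>\S+)\[(?P<pid>\d+)\]: segfault at (?P<addr>[0-9a-f]+) "
    r"ip (?P<ip>[0-9a-f]+) sp (?P<sp>[0-9a-f]+) error (?P<err>[0-9a-f]+)"
)

# x86 page-fault error-code bits: (mask, label if set, label if clear).
ERROR_BITS = [
    (0x01, "protection violation", "page not present"),
    (0x02, "write access", "read or fetch"),
    (0x04, "user mode", "kernel mode"),
    (0x08, "reserved bit set", None),
    (0x10, "instruction fetch", None),
]


def parse_segfault(line):
    """Return the fields of one kernel segfault line, or None if it does not match."""
    m = SEGFAULT_RE.search(line)
    if m is None:
        return None
    err = int(m.group("err"), 16)  # assumption: the code is printed in hex
    flags = []
    for mask, if_set, if_clear in ERROR_BITS:
        label = if_set if err & mask else if_clear
        if label:
            flags.append(label)
    return {
        "process": m.group("proc"),
        "pid": int(m.group("pid")),
        "address": int(m.group("addr"), 16),
        "ip": int(m.group("ip"), 16),
        "flags": flags,
    }
```

Applied to the first line above, this reports a user-mode instruction fetch from a page that is not present, with the faulting address equal to the instruction pointer. That pattern suggests the process jumped to code that was not mapped at the time.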

@firebird-automations
Collaborator Author

Commented by: @asfernandes

Are you using both versions at the same time?

@firebird-automations
Collaborator Author

Commented by: @kattunga

No. I first tested RC2 for a couple of days. When I saw the segfaults in the log, I uninstalled it and installed the latest nightly build to check whether the bug had been fixed after RC2, but the segfault is still there.
At the moment I have the FB 2.5 Classic nightly build of 07/16 installed.

@firebird-automations
Collaborator Author

Commented by: @asfernandes

Are there files in /tmp/firebird? What happens if you delete them (with Firebird stopped) and then start it again?

@firebird-automations
Collaborator Author

Commented by: @dyemanov

Please set up the debug information and provide us with a backtrace the next time the crash happens.

@firebird-automations
Collaborator Author

Commented by: @kattunga

I investigated a little more and found that the segfault happens when the client closes the connection. It does not happen every time: mostly when connecting from a PHP web application, but sometimes when connecting from simple clients.
Adriano, yes, I deleted all the files in /tmp/firebird and started it again. That directory contains several fb_trace_xxxxx files all the time (I don't have any trace enabled).
I need some help to get a backtrace (I have no idea how). Could somebody point me to the steps to get one?

@firebird-automations
Collaborator Author

@firebird-automations
Collaborator Author

Modified by: @AlexPeshkoff

assignee: Alexander Peshkov [ alexpeshkoff ]

@firebird-automations
Collaborator Author

Commented by: @AlexPeshkoff

A small addition: 2.5 does not have the problem mentioned at the end of that article. You install debuginfo and work with gdb. The gdb command to get the trace is 'thread apply all bt'.
You may also try RC3 from the prerelease area: http://www.firebirdsql.org/download/prerelease/
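
For anyone unsure how to run that gdb command against a core dump, here is a hedged sketch that wraps it in gdb's batch mode. The helper names, the output file name and the example paths are assumptions for illustration. It presumes that gdb and the Firebird debuginfo are installed and that a core file was actually written.

```python
import subprocess


def build_gdb_command(binary, core):
    """Return a gdb invocation that prints every thread's backtrace and exits."""
    return ["gdb", "-batch", "-ex", "thread apply all bt", binary, core]


def collect_backtrace(binary, core, out_path="backtrace.txt"):
    """Run gdb on a core dump and save its output to out_path."""
    result = subprocess.run(
        build_gdb_command(binary, core),
        capture_output=True,
        text=True,
        check=False,  # gdb may exit non-zero yet still print a useful trace
    )
    with open(out_path, "w") as fh:
        fh.write(result.stdout)
    return out_path


# Example call (hypothetical paths, adjust to the local installation):
# collect_backtrace("/opt/firebird/bin/fb_inet_server", "./core")
```

The resulting text file is what would be attached to the ticket alongside the core dump.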

@firebird-automations
Collaborator Author

Modified by: @kattunga

Attachment: backtrace.zip [ 11682 ]

@firebird-automations
Collaborator Author

Commented by: @kattunga

I attached the core dump and the result of the backtrace. Please let me know if I did it right.
I did it with RC3 from the prerelease area.

@firebird-automations
Collaborator Author

Commented by: @AlexPeshkoff

Christian, please take a look at this:
https://bugs.launchpad.net/ubuntu/+source/evolution/+bug/576346

Firebird crashes in the same function, __nptl_deallocate_tsd(). It looks like an Ubuntu 10.04 bug.

@firebird-automations
Collaborator Author

Commented by: @kattunga

Alexander, I also thought it could be an Ubuntu bug, but on the same machine FB 2.1.3 CS works fine.
Are there differences in the FB architecture that could affect FB 2.5?

@firebird-automations
Collaborator Author

Commented by: @AlexPeshkoff

Yes, certainly, and they are very significant for this bug. Before 2.5, the Linux classic server was a single-threaded program, which therefore never started new threads, so the __nptl_deallocate_tsd() function was never invoked. 2.5 CS is multi-threaded, even when it serves a single connection.

@firebird-automations
Collaborator Author

Commented by: @kattunga

Big problem...
Any idea whether I could upgrade or downgrade an Ubuntu package to confirm this?
Do you know which package contains the library that holds this function?

@firebird-automations
Collaborator Author

Commented by: @kattunga

Alexander, I investigated a little and did not find any confirmed bug related to this library and Ubuntu. There are some reported bugs in several Linux distributions that involve a segfault in this function, but none is confirmed.
I'll try to test FB 2.5 on other Linux distributions to isolate the problem, but it will take me some time because I need to install them (the only one I have installed now is Ubuntu 10.04 32-bit).
Maybe some more FB testers could check whether they see this problem on other recent Linux distributions.

@firebird-automations
Collaborator Author

Commented by: @kattunga

Note that this segfault does not produce any visible error on the client side; the only symptom is kern.log filling up with the error message.

@firebird-automations
Collaborator Author

Commented by: @asfernandes

I use FB 2.5 SuperClassic (fb_smp_server) for FB development on Ubuntu 10.04 64-bit without problems.

@firebird-automations
Collaborator Author

Commented by: @kattunga

I tested fb_smp_server and found that it produces the error when I shut down the service.
So the segfault happens at the moment the process is destroyed.

Adriano, could you check whether an entry with the segfault is written to kern.log when you shut down Firebird?

@firebird-automations
Collaborator Author

Modified by: @kattunga

description: the following sentence was appended to the original report (the rest of the text and the kern.log excerpt are unchanged):

The segfault is raised when the process is destroyed (on client disconnect). The same problem occurs with fb_smp_server when the service is stopped.

@firebird-automations
Collaborator Author

Commented by: @kattunga

I tested it on a fresh Ubuntu Server 9.10 32-bit install and the problem exists there too.

@firebird-automations
Collaborator Author

Modified by: @kattunga

environment: Ubuntu Server 10.04 32bits => Ubuntu Server 10.04 32bits / Ubuntu server 9.10 32bits

@firebird-automations
Collaborator Author

Commented by: @AlexPeshkoff

On Gentoo (I currently have glibc-2.10.1 installed) I could not reproduce the issue by running fbtcs, which makes hundreds of connections during a run, and each connection means a classic server start/stop.

So the traditional question comes up: what _exactly_ should be done to reproduce your problem? Some specific database operations, maybe UDFs, etc.?

@firebird-automations
Collaborator Author

Commented by: @kattunga

From a client application I open a connection, run a query, and then close the connection, and fb_inet_server logs a segfault in kern.log.
When using fb_smp_server, the segfault is logged by the kernel at the moment the Firebird service is shut down.

@firebird-automations
Collaborator Author

Commented by: @AlexPeshkoff

Sorry, if these are all the details you can provide, I can't reproduce it on Gentoo.

@firebird-automations
Collaborator Author

Commented by: @kattunga

I found the problem that causes the segfault.
The problem was an old UDF compiled with Kylix.
I recompiled it with FreePascal and the segfault disappeared.
So this bug should be closed as invalid.

@firebird-automations
Collaborator Author

Commented by: @kattunga

Sorry, I didn't see this before because the UDF is used in several computed fields, so I didn't notice it when I ran
SELECT * FROM ANYTABLE
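
To illustrate how a UDF can hide behind a plain SELECT *, here is a small sketch that scans computed-field expressions for calls to known UDF names. All names and the sample data are hypothetical. In a real database the expressions and UDF names would have to come from the metadata (for example RDB$FIELDS.RDB$COMPUTED_SOURCE and RDB$FUNCTIONS, as an assumption about where to look), which this sketch does not query.

```python
import re


def fields_using_udfs(computed_sources, udf_names):
    """Map each computed field to the sorted list of UDFs its expression calls."""
    hits = {}
    for field, source in computed_sources.items():
        used = sorted(
            name
            for name in udf_names
            # A call looks like NAME( ... ), compared case-insensitively.
            if re.search(r"\b" + re.escape(name) + r"\s*\(", source, re.IGNORECASE)
        )
        if used:
            hits[field] = used
    return hits


# Hypothetical sample data, not taken from the reporter's database.
sample_sources = {
    "ANYTABLE.FULL_NAME": "(FIRST_NAME || ' ' || LAST_NAME)",
    "ANYTABLE.CODE_PADDED": "(LPAD_UDF(CODE, 10))",
}
sample_udfs = {"LPAD_UDF"}
```

Any field reported by such a scan is evaluated whenever the table is selected in full, so the external library gets loaded even though the query text never mentions it.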

@firebird-automations
Collaborator Author

Modified by: @dyemanov

status: Open [ 1 ] => Resolved [ 5 ]

resolution: Won't Fix [ 2 ]

@firebird-automations
Collaborator Author

Modified by: @pcisar

status: Resolved [ 5 ] => Closed [ 6 ]
