
Key: CORE-1671
Type: Bug
Status: Closed
Resolution: Fixed
Priority: Major
Assignee: Alexander Peshkov
Reporter: Markus Hoenicka
Votes: 0
Watchers: 1
Firebird Core

atexit() calls in client libraries cause segfaults if the libraries are used in dlopen()ed modules

Created: 01/Jan/08 11:10 AM   Updated: 12/Nov/09 04:02 PM
Component/s: API / Client Library
Affects Version/s: None
Fix Version/s: 2.5 Alpha 1

Time Tracking:
Not Specified

File Attachments: 1. GZip Archive datest.tar.gz (0.8 kB)

Environment:
Tested in two environments:
(1) FreeBSD 6.1-RELEASE with Firebird 2.0.3 built from the ports collection
(2) Debian "Etch", kernel 2.6.18-5, firebird2.0-classic_2.0.3.12981.ds1-1_i386.deb, firebird2.0-common_2.0.3.12981.ds1-1_i386.deb, firebird2.0-dev_2.0.3.12981.ds1-1_all.deb
Issue Links:
Depend
 


Description
The firebird client libraries libfbembed.so and libfbclient.so install exit handlers via atexit(). If the libraries are used by a module which is dlopen()ed at runtime (e.g. by a database abstraction layer which loads database drivers as modules, see http://libdbi.sourceforge.net), the pointers to the installed handlers dangle as soon as the modules are unloaded via dlclose(). This causes the program to crash on exit on all platforms which do not use obscure workarounds to prevent this.

Some operating systems (most notably Solaris, recent Linux versions) do implement obscure workarounds, so you won't see a problem here. Other operating systems (FreeBSD) no longer support these obscure workarounds as there are apparently more appropriate ways to clean up libraries upon closing than installing exit handlers, see e.g. this thread: http://lists.freebsd.org/pipermail/freebsd-hackers/2007-December/022763.html which includes a simple testcase to reproduce the problem.

I kindly ask to avoid using atexit() in the client libraries in order to allow programmers to access the firebird client libraries from dlopen()ed modules in a portable and robust way.


Markus Hoenicka added a comment - 01/Jan/08 11:14 AM
Generic test case to reproduce the crash after installing exit handlers in a dlopen()ed module. Please see the included README file for build and run instructions. Remember that many OSes protect against this kind of error, so you may not see the crash on your pet platform. To see it reproducibly crash, use e.g. FreeBSD.

Alexander Peshkov added a comment - 01/Jan/08 12:52 PM
A few years ago there was the same problem with Linux, but Linux (i.e. glibc) already has it fixed. From 'man atexit':
Since glibc 2.2.3 (a rather old version, BTW - AP) atexit() (and on_exit()) can be used within a shared library to establish functions that are called when the shared library is unloaded.

Therefore I decided not to change a place in the code which works correctly now anyway. But if some C libraries are not fixed in the obvious way - OK, it's no problem to fix it in Firebird.

Markus Hoenicka added a comment - 01/Jan/08 03:59 PM
You are referring to the normal use of a shared library, i.e. when a regular application is linked against a library using atexit(). This also works on FreeBSD. The problem arises if a loadable module is linked against said library. In this case, the exit handlers may be called *after* the module has been unloaded, leaving dangling pointers. This is because the exit handlers are only called when the program exits, not when the module is unloaded via dlclose(). Some OSes protect against this problem, but to me this looks like creepy workarounds.

So it all boils down to the question: do you intend to support using the Firebird client libraries from *loadable* modules? If you do, you should remove the atexit() calls instead of waiting for the OSes to develop workarounds.

Alexander Peshkov added a comment - 04/Jan/08 05:04 AM
I've said I _will_ fix it, in the next release of 2.0.

But when you say:
"The problem arises if a loadable module is linked against said library. In this case, the exit handlers may be called *after* the module has been unloaded, leaving dangling pointers. This is because the exit handlers are only called when the program exits, not when the module is unloaded via dlclose()."
you are a bit wrong. This is not OS-dependent, because atexit() is not a system call; it is a standard C library function. And, as I've already said, the bug with atexit() is already fixed in glibc. But I see no problem with supporting buggy libraries in Firebird.

Markus Hoenicka added a comment - 05/Jan/08 05:54 PM
I greatly appreciate that you intend to fix this problem. However, your comments make me unsure whether you understand why the atexit() calls cause problems in our (=libdbi) usage scenario. At the risk of nitpicking, please consider the following use cases:

1) No shared libraries, program installs exit handlers via atexit(), then calls exit() or returns from main(): The exit handler addresses are available at compile time, so there is never a problem.

2) Program is linked against shared library, shared library installs exit handlers via atexit(). The main program calls exit() or returns from main(): The exit handler addresses are not available at compile time, so a bug in glibc *may* cause problems, e.g. by not properly registering the addresses of the exit handlers in the shared library which are only known at runtime. This is apparently the glibc bug that you say has been fixed.

3) Program dlopen()s module which is linked against a shared library which installs exit handlers via atexit(). The program unloads the module via dlclose(), *then* calls exit() or returns from main(). The exit handler addresses are valid only as long as the module is in memory. After calling dlclose(), the OS is free to reuse this memory (otherwise the dlclose() function would be useless), and therefore any attempt to jump to an address in this memory is necessarily a segmentation fault. No standard C library is ever going to take care of this. As discussed in the FreeBSD thread mentioned in a previous comment, several OSes have implemented workarounds that keep copies of the exit handlers in memory although the modules themselves were unloaded. However, this is a courtesy of the operating system, not a libc feature, and therefore should not be relied upon if you strive for portable code. This use case is exactly the one which causes problems with the libdbi Firebird driver on FreeBSD.

Alexander Peshkov added a comment - 06/Jan/08 06:48 AM
I clearly understand that the bug happens only for libraries loaded using dlopen() and dlclose()'d _before_ the calling program exits. And yes, if someone makes an attempt to call a function from an unloaded library, any attempt to jump to an address in this memory is necessarily a segmentation fault. But the approach used by both glibc and msvcrt is the same: call the functions that a library registered via atexit() when that library is unloaded, without waiting for process exit. This is an absolutely clear and logical approach - such functions are provided to perform cleanup when the library has finished its work. And when the library is unloaded, its work is really finished.
I agree that it can violate the standard, written before dynamic libraries became widely used. But that's a problem of a bad standard!

Alexander Peshkov added a comment - 06/Jan/08 09:08 AM
Replaced the function registered by atexit() with the destructor of a class that has a single static instance.
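The change described in this comment can be sketched roughly as follows. This is a minimal illustration and not the actual Firebird patch: `ExitGuard`, `shutdown_client`, `cleanup_runs` and `exit_guard` are hypothetical names. The idea is that the cleanup runs from the destructor of a static object, which implementations typically run when the shared object that owns it is unloaded or, at the latest, at process exit.

```cpp
#include <cassert>
#include <cstdio>

// Hypothetical counter so the effect of the cleanup can be observed.
static int cleanup_runs = 0;

// Hypothetical stand-in for the library's real shutdown routine.
static void shutdown_client() {
    ++cleanup_runs;
    std::puts("client library cleanup");
}

// Runs a cleanup function from its destructor instead of via atexit().
class ExitGuard {
public:
    explicit ExitGuard(void (*fn)()) : fn_(fn) { assert(fn_ != nullptr); }
    ~ExitGuard() { fn_(); }

    // The guard owns exactly one pending cleanup, so copying is disabled.
    ExitGuard(const ExitGuard&) = delete;
    ExitGuard& operator=(const ExitGuard&) = delete;

private:
    void (*fn_)();
};

// Single static instance: it is destroyed together with the other static
// objects of this library rather than from the process-wide atexit() list.
static ExitGuard exit_guard(&shutdown_client);
```

A side effect of this design is that the cleanup now competes with every other static destructor for its place in the shutdown order, which the plain atexit() registration did not.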

Alexander Peshkov added a comment - 11/Jan/08 08:02 AM
backported

Vlad Khorsun added a comment - 12/Jan/08 02:11 PM
Unfortunately this fix is not OK for at least Windows Classic.

Current build 2.1.0.17822 crashes during Classic shutdown: just run "fb_inet_server -a" and shut it down via the tray icon menu.
I get an AV (access violation) in xnet.cpp, release_all, in the call of XNET_LOCK: xnet_mutex is already destroyed at this moment.

This is because before this fix exit handlers were called before the destructors of static objects. Currently we have no
knowledge of the order in which destructors are called on program exit.

MSDN :
With the atexit function, you can specify an exit-processing function that executes prior to program termination.
No global static objects initialized prior to the call to atexit are destroyed prior to execution of the exit-processing function.
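The ordering guarantee quoted above can be illustrated with a small sketch. It is not Firebird code: `Resource`, `g_resource`, `exit_handler` and `install_exit_handler` are hypothetical names, with `Resource` standing in for something like the mutex mentioned in this comment. It relies on the C++ rule that a static object whose initialization completed before the call to std::atexit() is destroyed only after the registered handler has run.

```cpp
#include <cassert>
#include <cstdio>
#include <cstdlib>

// Hypothetical resource standing in for something like a mutex.
struct Resource {
    bool alive = true;
    ~Resource() {
        alive = false;
        std::puts("static destructor runs");
    }
};

// Set by install_exit_handler() so the handler can inspect the resource.
static Resource* g_resource = nullptr;

// Exit handler that still needs the resource while it runs.
static void exit_handler() {
    assert(g_resource != nullptr);
    std::printf("atexit handler runs, resource alive: %s\n",
                g_resource->alive ? "yes" : "no");
}

// Constructs the resource first and registers the handler afterwards.
// Returns true if the registration succeeded.
bool install_exit_handler() {
    static Resource resource;  // initialization completes here ...
    g_resource = &resource;
    return std::atexit(exit_handler) == 0;  // ... before the registration
}
```

Calling install_exit_handler() once from main() and then returning should print the handler's line first, with the resource still alive, followed by the destructor's line. Once the cleanup itself becomes a static destructor in another translation unit, that guarantee no longer applies, which is consistent with the crash reported here.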

Alexander Peshkov added a comment - 13/Jan/08 06:12 AM
The only quick solution that comes to my mind is rolling the changes back, marking the issue as related to

http://tracker.firebirdsql.org/browse/CORE-1079

(it also requires well controlled order of dtors execution), and fixing them both in 2.5.

If no one objects, I will do it a few days after the 2.1RC1 release.

Alexander Peshkov added a comment - 13/Jan/08 06:16 AM
The suggested solution caused problems in another environment.

Alexander Peshkov added a comment - 17/Jan/08 12:15 PM
Both issues require detailed control over constructors and destructors of global variables.
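One way to obtain that kind of control, shown here only as a hedged sketch and not as what Firebird actually implemented, is to register cleanup steps explicitly and run them in reverse registration order at a well-defined point. `ShutdownRegistry` and the step names in the usage note are hypothetical.

```cpp
#include <cassert>
#include <functional>
#include <string>
#include <utility>
#include <vector>

// Hypothetical registry: cleanup steps run in reverse registration order,
// so a step may rely on everything that was registered before it.
class ShutdownRegistry {
public:
    void add(std::string name, std::function<void()> step) {
        assert(step != nullptr);
        steps_.emplace_back(std::move(name), std::move(step));
    }

    // Runs every pending step, last registered first, and returns the
    // names in the order in which they were executed.
    std::vector<std::string> run() {
        std::vector<std::string> order;
        while (!steps_.empty()) {
            auto entry = std::move(steps_.back());
            steps_.pop_back();
            entry.second();
            order.push_back(entry.first);
        }
        return order;
    }

    // Steps that were never run explicitly still run on destruction.
    ~ShutdownRegistry() { run(); }

private:
    std::vector<std::pair<std::string, std::function<void()>>> steps_;
};
```

With a step named "mutex" registered first and a step named "release_all" registered second, run() executes "release_all" before "mutex", so the later step can still use what the earlier one will tear down.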

Alexander Peshkov added a comment - 23/Jan/08 11:00 AM
Avoid use of atexit() in production build.