Issue Details (XML | Word | Printable)

Key: CORE-3859
Type: Bug Bug
Status: Closed Closed
Resolution: Fixed
Priority: Major Major
Assignee: Alexander Peshkov
Reporter: Pete Cervasio
Votes: 0
Watchers: 6
Operations

If you were logged in you would be able to see more operations.
Firebird Core

Too many different dynamic events crash server

Created: 30/May/12 07:04 PM   Updated: 18/Jan/16 04:01 PM
Component/s: Engine
Affects Version/s: 2.1.3, 2.0.6, 2.1.4, 2.0.7, 2.1.5
Fix Version/s: 2.1.6

Environment: CentOS 5 (5.5 and 5.8)

QA Status: No test


 Description  « Hide
A recent change to how our software is using triggers to create dynamic events has been found to crash the firebird server.

Our software is watching for changes to a 'PLAYLIST' table which has entries containing station number and date and time values. Since the system doesn't care about changes that are not within the immediate time frame a dynamic set of events was created in the form of 'PLAYLIST_INSERTED_x_yyyy-mm-dd_hh' where 'x' is a station number, 'yyyy-mm-dd' is the contents of a date field, and 'hh' is the first two characters of the scheduled time field. At the start of each hour, the program registers interest in the current and next hours.

After approximately 616 event registrations and event occurrences, the Firebird server will crash. Changing the length of the event names to use a single character 'I' instead of 'PLAYLIST_INSERTED_' makes that number jump a bit higher, but the crash still happens.

Operating system : CentOS 5.8 (but also tested on 5.5)
Firebird server : Tested on versions 2.0.6, 2.0.7, 2.1.4 and 2.1.5 release candidate (from 22 May 2012)

Server log when the crash happens:

~>tail /opt/firebird/firebird.log

serv1-0000.sbcglobal.net (Client) Mon May 28 14:26:04 2012
        /opt/firebird/bin/fbguard: /opt/firebird/bin/fbserver terminated abnormally (-1)



serv1-0000.sbcglobal.net (Client) Mon May 28 14:26:04 2012
        /opt/firebird/bin/fbguard: guardian starting bin/fbserver


Sample database:

-- ------------------------------------------------------------
-- ISQL scriptlet:

create database '/tmp/event_fail.fdb';

-- A table with the bare minimum to show this problem:
create table playlist (
    station_number integer not null,
    scheduled_date date not null,
    scheduled_time varchar(12) not null
);

-- Two triggers to go with it:
set term ;!!

create trigger pe_playlist_deleted for playlist
       active after delete position 0 as
begin
    post_event 'PLAYLIST_DELETED_' || old.station_number || '_' ||
    old.scheduled_date || '_' || substring(old.scheduled_time from 1 for 2);
end !!

create trigger pe_playlist_inserted for playlist
       active after insert position 0 as
begin
    post_event 'PLAYLIST_INSERTED_' || new.station_number || '_' ||
    new.scheduled_date || '_' || substring(new.scheduled_time from 1 for 2);
end !!
-- Two stored procedure with same params as our regular ones
create procedure insert_log_entry (
    station_num integer,
    dt date,
    tm varchar(12),
    id varchar(250),
    cluster integer,
    mandatory varchar(1))
returns
    (curr integer)
as
begin
    insert into playlist values (:station_num, :dt, :tm);
    curr = 0;
end !!

-- Unused in perl script, my pascal calls on it, though
create procedure remove_log_entry (
    station_num integer,
    curr integer)
returns
    (next integer)
as
begin
    delete from playlist where station_number = :station_num;
    next = 0;
end !!

set term ;!!

commit;

You might want to add that to your aliases:

echo "event_fail = /tmp/event_fail.fdb" >> /opt/firebird/aliases.conf

And here is a perl script which will show the error. I'm sure real perl programmers will cringe at this.

#!/usr/bin/perl -w
#
# Note: This needs DBI::InterBase. It might work with DBI::Firebird
# but I haven't tried it.
#
# Usage: $0 dbserver:database

use strict;
use DBI;
use POSIX qw(strftime mktime);

my $dsn = shift; # $ARGV[0] - Specified as dbhost:dbname.

my $DB_USER = 'sysdba';
my $DB_PASS = 'masterkey';

my ($dbhost, $dbname) = split ':', $dsn;

my $db = "dbi:InterBase:db=$dbname;host=$dbhost;ib_dialect=3";

my $dbi = DBI->connect($db, $DB_USER, $DB_PASS, { PrintError => 0,
                                                  RaiseError => 0,
                                                  AutoCommit => 1 })
    or die "FATAL ERROR: DBI->connect to $dsn failed";

# Counter for events.
my $want_events = 0;

# -----------------------------------------------------
# Event callback function
# -----------------------------------------------------
sub handle_event {
    my $pe = shift;

    while (my ($k, $v) = each %$pe) {
        print "Got event $k\n";
    }
    $want_events--;
    1;
}

my $station = 1;
my $curr_date = mktime(0, 0, 0, 1, 0, 110); # 01 Jan 2010 @00:00
my $end_date = mktime(0, 0, 0, 1, 0, 111); # 01 Jan 2011 @00:00

my $sql;

my $ev1 = "";
my $ev2 = "";

while ($curr_date < $end_date) {

    my ($sec,$min,$hour,$mday,$mon,$year,$wday,$yday) = localtime($curr_date);

    my $dd = strftime ('%F', localtime($curr_date));
    my $hh = sprintf ('%.2d', $hour);
    my $tt = "${hh}:000:00.00";

    # Only using one here. Kylix test program (and real application)
    # use both of these, so this script registers for both.
    $ev1 = "PLAYLIST_INSERTED_${station}_${dd}_${hh}";
    $ev2 = "PLAYLIST_DELETED_${station}_${dd}_${hh}";

    my $evh = $dbi->func($ev1, $ev2, 'ib_init_event');

    # Set the callback
    $dbi->func($evh, \&handle_event, 'ib_register_callback');

    $want_events = 1;
    $sql = "execute procedure insert_log_entry (?, ?, ?, ?, ?, ?)";
    my $sth = $dbi->prepare($sql);

    my $res = $sth->execute($station, $dd, $tt, 'Event Test Entry', 0, 'N');

    print "Inserted $dd $hh\n";

    $sth->finish;

    while ($want_events > 0) {
        sleep 1;
    }

    $dbi->func($evh, 'ib_cancel_callback');

    # Bump to next hour and repeat
    $curr_date += 3600;

}

$dbi->disconnect;


When I run the above perl, I get this as output:


~ $perl event_test.pl "serv1-0000:event_fail"
Inserted 2010-01-01 00
Got event PLAYLIST_INSERTED_1_2010-01-01_00
Inserted 2010-01-01 01
Got event PLAYLIST_INSERTED_1_2010-01-01_01
............much elided.....
Inserted 2010-01-13 17
Got event PLAYLIST_INSERTED_1_2010-01-13_17
Inserted 2010-01-13 18
Got event PLAYLIST_INSERTED_1_2010-01-13_18
DBD::InterBase::db::ib_register_callback() -- ev is not a blessed SV reference at event_test.pl line 91.
Can't call method "execute" on an undefined value at event_test.pl line 97.
Broken pipe

and over on the server is another instance of terminating abnormally.

Things I've tried:

a). Making the event names shorter as in PI_{date}_{hour}

b). Table modification to store hour as an integer value and having the trigger use that instead of calling on substring.

Both of those changes were for naught.


 All   Comments   Change History   Subversion Commits      Sort Order: Ascending order - Click to sort in descending order
Dmitry Yemanov added a comment - 30/May/12 07:18 PM
Does the crash happen if you increase the initial event table size (EventMemSize in firebird.conf)?

Pete Cervasio added a comment - 30/May/12 09:08 PM
Yes, the crash still happens, it just happens a bit later.

After changing EventMemSize to 262144 the above perl script is able to get up to 21 Feb in the sequence.

Sean Leyne added a comment - 30/May/12 10:01 PM
What happens if you increase to beyond 256K? Say to 1M or 2M?

Pete Cervasio added a comment - 30/May/12 10:40 PM
The same thing, it just happens later.

I set the event memory size to 2 megs and expanded the perl scripts test range. It failed after event PLAYLIST_INSERTED_1_2011-02-17_02 fired.

Pete Cervasio added a comment - 30/May/12 10:45 PM
Just a data point: This is apparently working just fine in version 2.5.1, although that doesn't help me much on CentOS or any of our earlier systems that are still in use.

I have a personal Slackware system that has new enough kernel/libraries to run the latest and even with the default settings it's been able to run three separate instances of my test with a 2 year test range with no problems.

Alexander Peshkov made changes - 31/May/12 08:20 AM
Field Original Value New Value
Assignee Alexander Peshkov [ alexpeshkoff ]
Alexander Peshkov added a comment - 31/May/12 08:28 AM
Pete, please see http://www.ibphoenix.com/resources/documents/search/doc_36 for instructions and send stack trace here.

Pete Cervasio added a comment - 31/May/12 12:19 PM
Here you are, Alexander. This was done with version 2.1.4, after resetting the EventMemSize back to the default so that the problem would occur sooner. I was astonished that gdb said the executable might not match... perhaps that's because firebird is in /opt/firebird.214 with a 'firebird' symlink pointing to it?

In any event, here's what gdb reports:

/tmp>gdb /opt/firebird/bin/.debug/fbserver.debug core.1144
GNU gdb (GDB) CentOS (7.0.1-42.el5.centos)
Copyright (C) 2009 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law. Type "show copying"
and "show warranty" for details.
This GDB was configured as "i386-redhat-linux-gnu".
For bug reporting instructions, please see:
<http://www.gnu.org/software/gdb/bugs/>...
Reading symbols from /opt/firebird.214/bin/.debug/fbserver.debug...done.

warning: core file may not match specified executable file.
[New Thread 1291]
[New Thread 1290]
[New Thread 1286]
[New Thread 1285]
[New Thread 1284]
[New Thread 1283]
[New Thread 1148]
[New Thread 1147]
[New Thread 1144]

warning: .dynamic section for "/usr/lib/libstdc++.so.5" is not at the expected address

warning: difference appears to be caused by prelink, adjusting expectations

warning: .dynamic section for "/lib/libgcc_s.so.1" is not at the expected address

warning: difference appears to be caused by prelink, adjusting expectations
Reading symbols from /lib/libdl.so.2...(no debugging symbols found)...done.
Loaded symbols for /lib/libdl.so.2
Reading symbols from /usr/lib/libncurses.so.5...(no debugging symbols found)...done.
Loaded symbols for /usr/lib/libncurses.so.5
Reading symbols from /lib/libpthread.so.0...(no debugging symbols found)...done.
[Thread debugging using libthread_db enabled]
Loaded symbols for /lib/libpthread.so.0
Reading symbols from /opt/firebird/lib/libicuuc.so.30...Reading symbols from /opt/firebird.214/lib/.debug/libicuuc.so.30.0.debug...done.
done.
Loaded symbols for /opt/firebird/lib/libicuuc.so.30
Reading symbols from /opt/firebird/lib/libicudata.so.30...Reading symbols from /opt/firebird.214/lib/.debug/libicudata.so.30.0.debug...done.
done.
Loaded symbols for /opt/firebird/lib/libicudata.so.30
Reading symbols from /opt/firebird/lib/libicui18n.so.30...Reading symbols from /opt/firebird.214/lib/.debug/libicui18n.so.30.0.debug...done.
done.
Loaded symbols for /opt/firebird/lib/libicui18n.so.30
Reading symbols from /usr/lib/libstdc++.so.5...(no debugging symbols found)...done.
Loaded symbols for /usr/lib/libstdc++.so.5
Reading symbols from /lib/libm.so.6...(no debugging symbols found)...done.
Loaded symbols for /lib/libm.so.6
Reading symbols from /lib/libgcc_s.so.1...(no debugging symbols found)...done.
Loaded symbols for /lib/libgcc_s.so.1
Reading symbols from /lib/libc.so.6...(no debugging symbols found)...done.
Loaded symbols for /lib/libc.so.6
Reading symbols from /lib/ld-linux.so.2...(no debugging symbols found)...done.
Loaded symbols for /lib/ld-linux.so.2
Reading symbols from /lib/libnss_files.so.2...(no debugging symbols found)...done.
Loaded symbols for /lib/libnss_files.so.2
Reading symbols from /opt/firebird/intl/fbintl...Reading symbols from /opt/firebird.214/intl/.debug/fbintl.debug...done.
done.
Loaded symbols for /opt/firebird/intl/fbintl
Core was generated by `/opt/firebird/bin/fbserver'.
Program terminated with signal 11, Segmentation fault.
#0 0x0041477b in __lll_lock_wait () from /lib/libpthread.so.0
(gdb) thread apply all bt

Thread 9 (Thread 0xb7f1d6d0 (LWP 1144)):
#0 0x00436402 in __kernel_vsyscall ()
#1 0x0032fb71 in select () from /lib/libc.so.6
#2 0x08263fc1 in select_wait (main_port=0xb7f171cc, selct=0x83ea8c0)
    at ../src/remote/inet.cpp:2834
#3 0x08263bdf in select_multi (main_port=0xb7f171cc, buffer=0xbfd57ef0 "",
    bufsize=8192, length=0xbfd57eee, port=@0xbfd57ecc) at ../src/remote/inet.cpp:2565
#4 0x0827661c in rem_port::select_multi (this=0x83ea8cc,
    buffer=0xfffffdfe <Address 0xfffffdfe out of bounds>, bufsize=0, length=0xfffffdfe,
    port=@0xfffffdfe) at ../src/remote/remote.cpp:785
#5 0x08050e6c in SRVR_multi_thread (main_port=0xb7f171cc, flags=2)
    at ../src/remote/server.cpp:480
#6 0x080504ba in server_main (argc=-514, argv=0x1) at ../src/remote/inet_server.cpp:448
#7 0x08059a09 in main (argc=-514, argv=0xfffffdfe) at ../src/remote/server_stub.cpp:12

Thread 8 (Thread 1147):
#0 0x00436402 in __kernel_vsyscall ()
#1 0x00413e95 in sem_wait@@GLIBC_2.1 () from /lib/libpthread.so.0
#2 0x08050978 in shutdown_thread (arg=0x0) at ../src/common/classes/semaphore.h:169
#3 0x08071a40 in threadStart (arg=0x0) at ../src/jrd/ThreadData.cpp:274
#4 0x0040d852 in start_thread () from /lib/libpthread.so.0
#5 0x003370ae in clone () from /lib/libc.so.6

Thread 7 (Thread 1148):
#0 0x00436402 in __kernel_vsyscall ()
#1 0x00414038 in sem_timedwait () from /lib/libpthread.so.0
#2 0x0805991c in SignalSafeSemaphore::tryEnter (this=0x83e07b4, seconds=62804704)
    at ../src/common/classes/semaphore.h:220
---Type <return> to continue, or q <return> to quit---
#3 0x080582da in loopThread (flags=0x2) at ../src/remote/server.cpp:5545
#4 0x08071a40 in threadStart (arg=0x2) at ../src/jrd/ThreadData.cpp:274
#5 0x0040d852 in start_thread () from /lib/libpthread.so.0
#6 0x003370ae in clone () from /lib/libc.so.6

Thread 6 (Thread 1283):
#0 0x00436402 in __kernel_vsyscall ()
#1 0x00411f32 in pthread_cond_timedwait@@GLIBC_2.3.2 () from /lib/libpthread.so.0
#2 0x0806cd4e in ISC_event_wait (count=-4, events=0x2fb71d8, values=0x2fb71dc,
    micro_seconds=10000000, timeout_handler=0, handler_arg=0x0)
    at ../src/jrd/isc_sync.cpp:671
#3 0x081105bc in cache_writer (arg=0xb759881c) at ../src/jrd/cch.cpp:4360
#4 0x08071a40 in threadStart (arg=0xb759881c) at ../src/jrd/ThreadData.cpp:274
#5 0x0040d852 in start_thread () from /lib/libpthread.so.0
#6 0x003370ae in clone () from /lib/libc.so.6

Thread 5 (Thread 1284):
#0 0x00436402 in __kernel_vsyscall ()
#1 0x00411f32 in pthread_cond_timedwait@@GLIBC_2.3.2 () from /lib/libpthread.so.0
#2 0x0806cd4e in ISC_event_wait (count=-4, events=0x22c1168, values=0x22c116c,
    micro_seconds=10000000, timeout_handler=0, handler_arg=0x0)
    at ../src/jrd/isc_sync.cpp:671
#3 0x081eb652 in garbage_collector (arg=0xb759881c) at ../src/jrd/vio.cpp:4134
#4 0x08071a40 in threadStart (arg=0xb759881c) at ../src/jrd/ThreadData.cpp:274
#5 0x0040d852 in start_thread () from /lib/libpthread.so.0
#6 0x003370ae in clone () from /lib/libc.so.6

Thread 4 (Thread 1285):
#0 0x00436402 in __kernel_vsyscall ()
---Type <return> to continue, or q <return> to quit---
#1 0x00411f32 in pthread_cond_timedwait@@GLIBC_2.3.2 () from /lib/libpthread.so.0
#2 0x0806cd4e in ISC_event_wait (count=-4, events=0x5c021d8, values=0x5c021dc,
    micro_seconds=10000000, timeout_handler=0, handler_arg=0x0)
    at ../src/jrd/isc_sync.cpp:671
#3 0x081105bc in cache_writer (arg=0xb75a881c) at ../src/jrd/cch.cpp:4360
#4 0x08071a40 in threadStart (arg=0xb75a881c) at ../src/jrd/ThreadData.cpp:274
#5 0x0040d852 in start_thread () from /lib/libpthread.so.0
#6 0x003370ae in clone () from /lib/libc.so.6

Thread 3 (Thread 1286):
#0 0x00436402 in __kernel_vsyscall ()
#1 0x00411f32 in pthread_cond_timedwait@@GLIBC_2.3.2 () from /lib/libpthread.so.0
#2 0x0806cd4e in ISC_event_wait (count=-4, events=0x45e6168, values=0x45e616c,
    micro_seconds=10000000, timeout_handler=0, handler_arg=0x0)
    at ../src/jrd/isc_sync.cpp:671
#3 0x081eb652 in garbage_collector (arg=0xb75a881c) at ../src/jrd/vio.cpp:4134
#4 0x08071a40 in threadStart (arg=0xb75a881c) at ../src/jrd/ThreadData.cpp:274
#5 0x0040d852 in start_thread () from /lib/libpthread.so.0
#6 0x003370ae in clone () from /lib/libc.so.6

Thread 2 (Thread 1290):
#0 0x00436402 in __kernel_vsyscall ()
#1 0x00414038 in sem_timedwait () from /lib/libpthread.so.0
#2 0x0805991c in SignalSafeSemaphore::tryEnter (this=0x83e07b4, seconds=134476512)
    at ../src/common/classes/semaphore.h:220
#3 0x080582da in loopThread (flags=0x2) at ../src/remote/server.cpp:5545
#4 0x08071a40 in threadStart (arg=0x2) at ../src/jrd/ThreadData.cpp:274
#5 0x0040d852 in start_thread () from /lib/libpthread.so.0
#6 0x003370ae in clone () from /lib/libc.so.6
---Type <return> to continue, or q <return> to quit---

Thread 1 (Thread 0x4fe7b90 (LWP 1291)):
#0 0x0041477b in __lll_lock_wait () from /lib/libpthread.so.0
#1 0x0040fddf in _L_lock_885 () from /lib/libpthread.so.0
#2 0x0040fca6 in pthread_mutex_lock () from /lib/libpthread.so.0
#3 0x0806d6ff in ISC_mutex_lock (mutex=0x2) at ../src/jrd/isc_sync.cpp:3183
#4 0x0813632e in acquire () at ../src/jrd/event.cpp:518
#5 0x08137185 in watcher_thread () at ../src/jrd/event.cpp:1530
#6 0x08071a40 in threadStart (arg=0x0) at ../src/jrd/ThreadData.cpp:274
#7 0x0040d852 in start_thread () from /lib/libpthread.so.0
#8 0x003370ae in clone () from /lib/libc.so.6
(gdb)

Alexander Peshkov added a comment - 31/May/12 12:56 PM
Well, it's superserver. And it is MT thing, like 2.5.
In that case I first of all have to ask about glibc version, and if it is < 2.7, please try to:
1. Install classic instead super and try to reproduce a bug.
2. Try to reproduce it on FB2.1SS, but on the host with fresh linux.

Pete Cervasio added a comment - 31/May/12 02:24 PM
CentOS does indeed have a glibc that is less than 2.7. According to rpm, I have glibc-2.5-81.el5_8.1 installed.

1). FirebirdCS 2.1.4 was installed and it appears to work fine on CentOS with these dynamic events. Using three separate instances of the test, I let each run for a year's worth of event firing.

I can't recall the exact reason at the moment, but there was some thought to our choice of SS over CS.

2). I Installed Firebird SS 2.1.4 on my Slackware 13.37 system (with glibc 2.13 and kernel 2.6.37.6-smp) and it does indeed exhibit the crash. I'm not sure if it'll help, but here's a stack trace from that machine:

/tmp $ gdb /opt/firebird/bin/.debug/fbserver.debug core
GNU gdb (GDB) 7.2
Copyright (C) 2010 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law. Type "show copying"
and "show warranty" for details.
This GDB was configured as "i486-slackware-linux".
For bug reporting instructions, please see:
<http://www.gnu.org/software/gdb/bugs/>...
Reading symbols from /opt/firebird/bin/.debug/fbserver.debug...done.

warning: core file may not match specified executable file.
[New Thread 31528]
[New Thread 30819]
[New Thread 30805]
[New Thread 31485]
[New Thread 31484]
[New Thread 31487]
[New Thread 31486]
[New Thread 30800]

warning: Can't read pathname for load map: Input/output error.
Reading symbols from /lib/libdl.so.2...(no debugging symbols found)...done.
Loaded symbols for /lib/libdl.so.2
Reading symbols from /lib/libncurses.so.5...(no debugging symbols found)...done.
Loaded symbols for /lib/libncurses.so.5
Reading symbols from /lib/libpthread.so.0...(no debugging symbols found)...done.
Loaded symbols for /lib/libpthread.so.0
Reading symbols from /opt/firebird/lib/libicuuc.so.30...Reading symbols from /opt/firebird/lib/.debug/libicuuc.so.30.0.debug...done.
done.
Loaded symbols for /opt/firebird/lib/libicuuc.so.30
Reading symbols from /opt/firebird/lib/libicudata.so.30...Reading symbols from /opt/firebird/lib/.debug/libicudata.so.30.0.debug...done.
done.
Loaded symbols for /opt/firebird/lib/libicudata.so.30
Reading symbols from /opt/firebird/lib/libicui18n.so.30...Reading symbols from /opt/firebird/lib/.debug/libicui18n.so.30.0.debug...done.
done.
Loaded symbols for /opt/firebird/lib/libicui18n.so.30
Reading symbols from /usr/i486-slackware-linux/lib/libstdc++.so.5...(no debugging symbols found)...done.
Loaded symbols for /usr/i486-slackware-linux/lib/libstdc++.so.5
Reading symbols from /lib/libm.so.6...(no debugging symbols found)...done.
Loaded symbols for /lib/libm.so.6
Reading symbols from /usr/lib/libgcc_s.so.1...(no debugging symbols found)...done.
Loaded symbols for /usr/lib/libgcc_s.so.1
Reading symbols from /lib/libc.so.6...(no debugging symbols found)...done.
Loaded symbols for /lib/libc.so.6
Reading symbols from /lib/ld-linux.so.2...(no debugging symbols found)...done.
Loaded symbols for /lib/ld-linux.so.2
Reading symbols from /lib/libnss_compat.so.2...(no debugging symbols found)...done.
Loaded symbols for /lib/libnss_compat.so.2
Reading symbols from /lib/libnsl.so.1...(no debugging symbols found)...done.
Loaded symbols for /lib/libnsl.so.1
Reading symbols from /lib/libnss_nis.so.2...(no debugging symbols found)...done.
Loaded symbols for /lib/libnss_nis.so.2
Reading symbols from /lib/libnss_files.so.2...(no debugging symbols found)...done.
Loaded symbols for /lib/libnss_files.so.2
Reading symbols from /opt/firebird/intl/fbintl...Reading symbols from /opt/firebird/intl/.debug/fbintl.debug...done.
done.
Loaded symbols for /opt/firebird/intl/fbintl
Core was generated by `/opt/firebird/bin/fbserver'.
Program terminated with signal 11, Segmentation fault.
#0 0xb7785226 in __lll_lock_wait () from /lib/libpthread.so.0
(gdb) thread apply all bt

Thread 8 (Thread 30800):
#0 0xb73b5a67 in select () from /lib/libc.so.6
#1 0x08263fc1 in select_wait (main_port=0xb77fc1cc, selct=0x83ea8c0) at ../src/remote/inet.cpp:2834
#2 0x08263bdf in select_multi (main_port=0xb77fc1cc, buffer=0xbfc5b780 "", bufsize=8192, length=0xbfc5b77e,
    port=@0xbfc5b75c) at ../src/remote/inet.cpp:2565
#3 0x0827661c in rem_port::select_multi (this=0x83ea8cc,
    buffer=0xfffffdfe <Address 0xfffffdfe out of bounds>, bufsize=0, length=0xfffffdfe, port=@0xfffffdfe)
    at ../src/remote/remote.cpp:785
#4 0x08050e6c in SRVR_multi_thread (main_port=0xb77fc1cc, flags=2) at ../src/remote/server.cpp:480
#5 0x080504ba in server_main (argc=-514, argv=0x1) at ../src/remote/inet_server.cpp:448
#6 0x08059a09 in main (argc=-514, argv=0xfffffdfe) at ../src/remote/server_stub.cpp:12

Thread 7 (Thread 31486):
#0 0xb7782ac5 in pthread_cond_timedwait@@GLIBC_2.3.2 () from /lib/libpthread.so.0
#1 0x0806cd4e in ISC_event_wait (count=-516, events=0xb476d1a8, values=0xb476d1ac, micro_seconds=10000000,
    timeout_handler=0, handler_arg=0x0) at ../src/jrd/isc_sync.cpp:671
#2 0x081105bc in cache_writer (arg=0xb709181c) at ../src/jrd/cch.cpp:4360
#3 0x08071a40 in threadStart (arg=0xb709181c) at ../src/jrd/ThreadData.cpp:274
#4 0xb777dc04 in start_thread () from /lib/libpthread.so.0
#5 0xb73be06e in clone () from /lib/libc.so.6

Thread 6 (Thread 31487):
#0 0xb7782ac5 in pthread_cond_timedwait@@GLIBC_2.3.2 () from /lib/libpthread.so.0
#1 0x0806cd4e in ISC_event_wait (count=-516, events=0xb3f6d138, values=0xb3f6d13c, micro_seconds=10000000,
    timeout_handler=0, handler_arg=0x0) at ../src/jrd/isc_sync.cpp:671
#2 0x081eb652 in garbage_collector (arg=0xb709181c) at ../src/jrd/vio.cpp:4134
#3 0x08071a40 in threadStart (arg=0xb709181c) at ../src/jrd/ThreadData.cpp:274
#4 0xb777dc04 in start_thread () from /lib/libpthread.so.0
#5 0xb73be06e in clone () from /lib/libc.so.6

Thread 5 (Thread 31484):
#0 0xb7782ac5 in pthread_cond_timedwait@@GLIBC_2.3.2 () from /lib/libpthread.so.0
#1 0x0806cd4e in ISC_event_wait (count=-516, events=0xb578d1a8, values=0xb578d1ac, micro_seconds=10000000,
---Type <return> to continue, or q <return> to quit---
    timeout_handler=0, handler_arg=0x0) at ../src/jrd/isc_sync.cpp:671
#2 0x081105bc in cache_writer (arg=0xb683d81c) at ../src/jrd/cch.cpp:4360
#3 0x08071a40 in threadStart (arg=0xb683d81c) at ../src/jrd/ThreadData.cpp:274
#4 0xb777dc04 in start_thread () from /lib/libpthread.so.0
#5 0xb73be06e in clone () from /lib/libc.so.6

Thread 4 (Thread 31485):
#0 0xb7782ac5 in pthread_cond_timedwait@@GLIBC_2.3.2 () from /lib/libpthread.so.0
#1 0x0806cd4e in ISC_event_wait (count=-516, events=0xb4f6d138, values=0xb4f6d13c, micro_seconds=10000000,
    timeout_handler=0, handler_arg=0x0) at ../src/jrd/isc_sync.cpp:671
#2 0x081eb652 in garbage_collector (arg=0xb683d81c) at ../src/jrd/vio.cpp:4134
#3 0x08071a40 in threadStart (arg=0xb683d81c) at ../src/jrd/ThreadData.cpp:274
#4 0xb777dc04 in start_thread () from /lib/libpthread.so.0
#5 0xb73be06e in clone () from /lib/libc.so.6

Thread 3 (Thread 30805):
#0 0xb7784920 in sem_wait@@GLIBC_2.1 () from /lib/libpthread.so.0
#1 0x08050978 in shutdown_thread (arg=0x0) at ../src/common/classes/semaphore.h:169
#2 0x08071a40 in threadStart (arg=0x0) at ../src/jrd/ThreadData.cpp:274
#3 0xb777dc04 in start_thread () from /lib/libpthread.so.0
#4 0xb73be06e in clone () from /lib/libc.so.6

Thread 2 (Thread 30819):
#0 0xb7784abe in sem_timedwait () from /lib/libpthread.so.0
#1 0x0805991c in SignalSafeSemaphore::tryEnter (this=0x83e07b4, seconds=-1232932176)
    at ../src/common/classes/semaphore.h:220
#2 0x080582da in loopThread (flags=0x2) at ../src/remote/server.cpp:5545
#3 0x08071a40 in threadStart (arg=0x2) at ../src/jrd/ThreadData.cpp:274
#4 0xb777dc04 in start_thread () from /lib/libpthread.so.0
#5 0xb73be06e in clone () from /lib/libc.so.6

Thread 1 (Thread 31528):
#0 0xb7785226 in __lll_lock_wait () from /lib/libpthread.so.0
#1 0xb77801ad in _L_lock_810 () from /lib/libpthread.so.0
---Type <return> to continue, or q <return> to quit---
#2 0xb777ffd1 in pthread_mutex_lock () from /lib/libpthread.so.0
#3 0x0806d6ff in ISC_mutex_lock (mutex=0x2) at ../src/jrd/isc_sync.cpp:3183
#4 0x0813632e in acquire () at ../src/jrd/event.cpp:518
#5 0x08137185 in watcher_thread () at ../src/jrd/event.cpp:1530
#6 0x08071a40 in threadStart (arg=0x0) at ../src/jrd/ThreadData.cpp:274
#7 0xb777dc04 in start_thread () from /lib/libpthread.so.0
#8 0xb73be06e in clone () from /lib/libc.so.6
(gdb)

Alexander Peshkov added a comment - 31/May/12 04:17 PM
Well, in that case appears this is really firebird bug, present at least in linux SS. I will take closer look at it.

Alexander Peshkov added a comment - 01/Jun/12 03:00 PM - edited
Pete Cervasio, there are problems with your script.
I have perl-5.12.4, DBI-1.615 (both are emerged by gentoo) and DBD-InterBase-0.46 (fresh download from SF).
What I typically get trying to run your test:

Inserted 2010-01-01 00
        (in cleanup) panic: memory wrap at ./3859crash.perl line 79.
        (in cleanup) Not a CODE reference at ./3859crash.perl line 79.
*** glibc detected *** /usr/bin/perl: free(): invalid pointer: 0x0000000001efab8f ***
======= Backtrace: =========
/lib/libc.so.6(+0x76156)[0x7fe9bc25c156]
/lib/libc.so.6(cfree+0x6c)[0x7fe9bc260fdc]
/usr/lib/libperl.so.5.12(Perl_op_free+0x10c)[0x7fe9bc5a8eec]

I've tried different fbclient versions. I have no time to fix issues with perl. Therefore I kindly ask you to prepare test case using some other tool, best of all is our native API.

Pete Cervasio added a comment - 01/Jun/12 05:01 PM
Hi, Alexander. Someone else had problems with the asynchronous events when trying in Windows, and they wrote a simpler test which uses synchronous events. Before I write a whole new thing, perhaps this will work better.

#!/usr/bin/perl -w
#
# Slight modification of my perl script by Michael Ludwig
# to use synchronous events. Modified more by myself to
# take out the Time::HiRes module (which I didn't have)
# and to use DBD::InterBase
#
# Usage: $0 localhost:/tmp/event_test.fdb sysdba masterkey

use strict;
use warnings;
use DBI;
use POSIX qw(strftime mktime);

my $db = shift or die 'database!';
my $user = shift or die 'username!';
my $pass = shift or die 'password!';

my $dbi = DBI->connect('dbi:InterBase:' . $db . ';ib_dialect=3', $user, $pass,
    { PrintError => 0, RaiseError => 1, AutoCommit => 1 });

my $station = 1;
my $curr_date = mktime(0, 0, 0, 1, 0, 110); # 01 Jan 2010 @00:00
my $end_date = mktime(0, 0, 0, 1, 0, 111); # 01 Jan 2011 @00:00

my $insert_count = 0;

my $sql = "execute procedure insert_log_entry (?, ?, ?, ?, ?, ?)";
my $sth = $dbi->prepare($sql);

while ($curr_date < $end_date) {
    my @lt_curr_date = localtime $curr_date;
    my $dd = strftime '%Y-%m-%d', @lt_curr_date;
    my $hh = sprintf '%.2d', $lt_curr_date[2]; # hour
    my $tt = "${hh}:000:00.00";

    my $ev1 = "PLAYLIST_INSERTED_${station}_${dd}_${hh}";
    my $ev2 = "PLAYLIST_DELETED_${station}_${dd}_${hh}";
    my $evh = $dbi->func($ev1, $ev2, 'ib_init_event');

    my $res = $sth->execute($station, $dd, $tt, 'Event Test Entry', 0, 'N');
    $sth->finish; # call needed to trigger event!
    $insert_count++;
    printf "%5u - Inserted $dd $hh\n", $insert_count;

    my $events = $dbi->func($evh, 'ib_wait_event');
    if ( $events ) {
        while ( my($evnam, $evcount) = each %$events ) {
            printf "%50s %2u\n", $evnam, $evcount;
        }
    }

    $curr_date += 3600; # Bump to next hour and repeat
}

$dbi->disconnect;
print "inserted $insert_count records\n";
__END__


Alexander Peshkov added a comment - 04/Jun/12 09:34 AM
That's better - reproduced.

Alexander Peshkov added a comment - 04/Jun/12 12:10 PM
Segfault happened due to unprotected by mutex global variable access.

Alexander Peshkov made changes - 04/Jun/12 12:10 PM
Status Open [ 1 ] Resolved [ 5 ]
Fix Version/s 2.1.6 [ 10460 ]
Resolution Fixed [ 1 ]
Pete Cervasio added a comment - 04/Jun/12 02:18 PM
Thank you very much, Alexander.

One question... this was too late to go into 2.1.5, so the fix won't go out until 2.1.6 is released next year?

Alexander Peshkov added a comment - 05/Jun/12 07:16 AM
FB admins are discussing it now...

Anderson Freitas Alves added a comment - 05/Jun/12 02:50 PM
Please Firebird 2.1.5 rc2 with fixed CORE-3855 and CORE-3859 (more robust and secure).

Le Roy Arnaud added a comment - 12/Jul/12 08:04 PM
have you any idea about which version are affected, we can't reproduce this problem with SS on windows

Pete Cervasio added a comment - 12/Jul/12 09:15 PM
Le Roy, the versions and OS where I found this bug are listed up at the top:

  Affects Version/s: 2.1.3, 2.0.6, 2.1.4, 2.0.7, 2.1.5
  Environment: CentOS 5 (5.5 and 5.8)

I'm not sure if I explicitly state it, but I was using Super Server. I don't have Windows, so I have no idea if this breaks on that OS or not.

Further down in the comments is a revised perl script, if you have trouble with the one in the original report.

Le Roy Arnaud added a comment - 13/Jul/12 04:42 AM
Pete, yes i know which test you have done but now alexander has found where is the problem so maybe he knows which versions are affected ?

today i can't break a super server on windows with our test program. this same test program breaks fb supe server on linux ubuntu 10.10,11.10.12.04 with fb super server since 2.1.3 version i can't test with previous version of fb server ss

thanks pete for your report.

Alexander Peshkov added a comment - 16/Jul/12 10:44 AM
Guys, this bug does not happen on windows, it's posix specific.

Pavel Cisar made changes - 23/Apr/13 01:54 PM
Status Resolved [ 5 ] Closed [ 6 ]
Pavel Zotov made changes - 18/Jan/16 04:01 PM
Status Closed [ 6 ] Closed [ 6 ]
QA Status No test