Engine leaks memory and crashes when lot of autonomous transactions have been started and finished [CORE3908] #4244
Comments
Commented by: @pavel-zotov I repeated this test but changed the TIL of the starting transaction to RO RC (read-only, read committed): SQL> commit; After running for about 15 hours, the Linux server running isql became almost unresponsive. PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND I could stop isql only with kill -9: it did not react to kill -15 or Ctrl-D (inside isql). After this I reconnected to the database via isql, and that worked. Statistics of database header: I obtained a backtrace of the hung isql with gdb; see attachment #1. |
Modified by: @pavel-zotov Attachment: gdb_backtrace_isql_hangs_on_quit.zip [ 12201 ] Attachment: fdb_isql_hangs_on_quit.zip [ 12202 ] |
Commented by: Sean Leyne (seanleyne) Correct the subject, ISQL is not what is failing. |
Modified by: Sean Leyne (seanleyne) Component: Engine [ 10000 ] summary: isql crashes when lot of autonomous transactions have been started and finished in TIL = SNAPSHOT => engune crashes when lot of autonomous transactions have been started and finished in TIL = SNAPSHOT Component: ISQL [ 10003 ] => |
Modified by: Sean Leyne (seanleyne) summary: engune crashes when lot of autonomous transactions have been started and finished in TIL = SNAPSHOT => engine crashes when lot of autonomous transactions have been started and finished in TIL = SNAPSHOT |
Modified by: @AlexPeshkoff assignee: Alexander Peshkov [ alexpeshkoff ] |
Commented by: @AlexPeshkoff You were a bit impatient. That's not a hang: the engine was analyzing 100M transactions. |
Commented by: @pavel-zotov > What about OOM - that's more interesting, looks like we have a kind of memory leak. Not only memory. bash-3.2$ ls -la /tmp/core-isql-354 |
Commented by: @dyemanov I'm attaching the patch that improves the "hang" from several hours down to less than one second. Please review. |
Modified by: @dyemanov Attachment: sweep_limbo_search.patch [ 12203 ] |
Commented by: @AlexPeshkoff Dmitry, your patch appears correct. |
Commented by: @dyemanov I've split the "hang" issue off into a separate ticket (CORE3994). As for the memory leak, I can easily reproduce it. It's leaking from the outer transaction pool, because autonomous transactions are deleted explicitly while many transaction internals expect to be deleted "by pool". But maybe there are other reasons as well; I didn't dig deeper. |
Modified by: @dyemanov summary: engine crashes when lot of autonomous transactions have been started and finished in TIL = SNAPSHOT => Engine leaks memory and crashes when lot of autonomous transactions have been started and finished |
Modified by: @dyemanov Attachment: sweep_limbo_search.patch [ 12203 ] => |
Modified by: @AlexPeshkoff Version: 3.0 Initial [ 10301 ] |
Commented by: @AlexPeshkoff No more leaks were noticed when using a separate pool for autonomous transactions. |
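The leak diagnosed in this thread, and the separate-pool fix, can be illustrated with a toy model. This is only a sketch under stated assumptions, not Firebird's actual code: `Pool`, `Transaction`, `runSharedPool` and `runSeparatePools` are hypothetical names, and the 1024-byte allocation is an arbitrary stand-in for "transaction internals". The pool here never frees individual blocks; it releases everything only when the pool itself is destroyed, which is the "deleted by pool" behaviour the comments describe.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <vector>

// Hypothetical stand-in for an engine memory pool: blocks are handed out
// and released only when the pool itself is destroyed ("deleted by pool").
class Pool {
public:
    void* allocate(std::size_t bytes) {
        assert(bytes > 0);
        blocks_.push_back(std::make_unique<unsigned char[]>(bytes));
        bytesInUse_ += bytes;
        return blocks_.back().get();
    }
    std::size_t bytesInUse() const { return bytesInUse_; }

private:
    std::vector<std::unique_ptr<unsigned char[]>> blocks_;
    std::size_t bytesInUse_ = 0;
};

// Hypothetical transaction: its internals come from a pool and are never
// returned individually when the transaction object goes away.
struct Transaction {
    explicit Transaction(Pool& pool) : internals(pool.allocate(1024)) {}
    void* internals;
};

// Leaky variant: every autonomous transaction allocates from the outer
// transaction's pool, so the outer pool grows on each iteration.
std::size_t runSharedPool(Pool& outerPool, int iterations) {
    for (int i = 0; i < iterations; ++i) {
        Transaction autonomous(outerPool);  // destroyed here, memory stays in outerPool
    }
    return outerPool.bytesInUse();
}

// Separate-pool variant: each autonomous transaction gets its own pool,
// which is destroyed together with the transaction.
std::size_t runSeparatePools(Pool& outerPool, int iterations) {
    for (int i = 0; i < iterations; ++i) {
        Pool ownPool;
        Transaction autonomous(ownPool);    // ownPool frees everything at scope exit
    }
    return outerPool.bytesInUse();
}
```

In this model, calling `runSharedPool` with a fresh pool and 100 iterations leaves 100 * 1024 bytes held by the outer pool, while `runSeparatePools` leaves the outer pool untouched. An endless `while (1=1)` loop in the first variant grows without bound, which matches the symptom reported here.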
Modified by: @AlexPeshkoff status: Open [ 1 ] => Resolved [ 5 ] resolution: Fixed [ 1 ] Fix Version: 3.0 Alpha 1 [ 10331 ] Fix Version: 2.5.3 [ 10461 ] |
Modified by: @pcisar status: Resolved [ 5 ] => Closed [ 6 ] |
Submitted by: @pavel-zotov
Relates to CORE3994
Attachments:
gdb_backtrace_isql_hangs_on_quit.zip
fdb_isql_hangs_on_quit.zip
Votes: 1
SQL> create database 't0.fdb'; commit;
SQL> set term ;^
SQL> set term ^;
SQL> execute block as
CON> declare v int;
CON> begin
CON> while (1=1) do in autonomous transaction do select 1 from rdb$database into v;
CON> end^
After running for about 18 hours, isql output these messages:
Statement failed, SQLSTATE = HY000
operating system directive munmap failed
-Cannot allocate memory
An attempt to quit isql leads to:
--------------------------------------------
SQL> quit;
CON> Expected end of statement, encountered EOF -- this is because execution never reached set term ;^
Statement failed, SQLSTATE = HY000
operating system directive munmap failed
-Cannot allocate memory
terminate called after throwing an instance of 'Firebird::system_call_failed'
Aborted (core dumped)
Firebird's log:
-------------------
bash-3.2$ cat -n firebird.log
1
2 reservdb Sat Aug 25 09:51:57 2012
3 Operating system call munmap failed. Error code 12
4
5
6 reservdb Sat Aug 25 09:51:57 2012
7 Operating system call munmap failed. Error code 12
8
9
10 reservdb Sat Aug 25 12:49:47 2012
11 Operating system call munmap failed. Error code 12
12
13
14 reservdb Sat Aug 25 12:49:47 2012
15 Operating system call munmap failed. Error code 12
16
17
18 reservdb Sat Aug 25 12:49:47 2012
19 Operating system call munmap failed. Error code 12
20
21
22 reservdb Sat Aug 25 12:49:47 2012
23 Operating system call pthread_mutex_destroy failed. Error code 16
24
25
26 reservdb Sat Aug 25 12:49:47 2012
27 Error in isc_detach_database() API call when working with security database
28 operating system directive pthread_mutex_destroy failed
29 Device or resource busy
30
31
32 reservdb Sat Aug 25 12:49:47 2012
33 Operating system call pthread_mutex_destroy failed. Error code 16
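The numeric error codes in this log are plain errno values. On Linux, code 12 is ENOMEM and code 16 is EBUSY, which agrees with the "Cannot allocate memory" and "Device or resource busy" texts shown elsewhere in this report. A minimal sketch for decoding such codes follows; `describeErrno` is a hypothetical helper name, and the exact message wording depends on the C library in use.

```cpp
#include <cassert>
#include <cerrno>
#include <cstring>
#include <string>

// Hypothetical helper: returns the C library's description of an errno
// value such as the "Error code 12" entries in firebird.log.
std::string describeErrno(int code) {
    assert(code >= 0);
    return std::strerror(code);
}
```

For example, `describeErrno(ENOMEM)` and `describeErrno(EBUSY)` give the descriptions for the munmap and pthread_mutex_destroy failures logged above.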
Getting a backtrace from the core file produces error messages:
[root@reservdb .debug]# gdb -q -x ./gdb_backtrace_batch.txt /opt/firebird/bin/.debug/isql.debug /tmp/core-isql-354 1>isql-354.txt
warning: core file may not match specified executable file.
Failed to read a valid object file image from memory.
Cannot access memory at address 0x40b8dc68
Cannot access memory at address 0x429a7ff8
Cannot access memory at address 0x4158ef80
Cannot access memory at address 0x41f8ff80
Cannot access memory at address 0x7fff8c0017f8
But the log "isql-354.txt" contained some useful(?) info:
[root@reservdb .debug]# cat isql-354.txt
Reading symbols from /opt/firebird/bin/.debug/isql.debug...done.
[New Thread 354]
[New Thread 358]
[New Thread 357]
[New Thread 356]
[New Thread 355]
Reading symbols from /lib64/ld-linux-x86-64.so.2...(no debugging symbols found)...done.
Loaded symbols for /lib64/ld-linux-x86-64.so.2
Core was generated by `isql t0.fdb'.
Program terminated with signal 6, Aborted.
#0 0x000000302f630265 in ?? ()
Thread 5 (Thread 355):
Thread 4 (Thread 356):
Thread 3 (Thread 357):
Thread 2 (Thread 358):
Thread 1 (Thread 354):
PS.
ISQL Version: LI-V2.5.2.26448 Firebird 2.5
Server version:
Firebird/linux AMD64 (access method), version "LI-V2.5.2.26448 Firebird 2.5"
on disk structure version 11.2
Commits: ef9448f 97b4b8c 2a29d5f 3f2477e