Alexander Peshkov added a comment - 03/Feb/16 10:22 AM
Michal, do we have an exact reference to the standard explaining the reasons for such an incompatible change to operator new?
I tried using -std=gnu++98 with Firebird 2.5.5 (technically plus some Debian patches). There are no compilation errors, but running create_db either fails with

invalid request BLR at offset 24
-Too many Contexts of Relation/Procedure/Views. Maximum allowed is 255

or aborts with a segmentation fault. Here's the stack trace:

(gdb) thread apply all bt

Thread 4 (LWP 28868):
#0 0x00007fb5035d2475 in ?? ()
#1 0x00007fb504825578 in ?? ()
#2 0x00000000004e7e70 in ?? () at ../src/include/../common/classes/RefCounted.h:77
#3 0x00007fb5048255b8 in ?? ()
#4 0x00007fb504825588 in ?? ()
#5 0xfffffffeffffffff in ?? ()
#6 0x00007fb500f65e60 in ?? ()
#7 0x00007fb500f65df0 in ?? ()
#8 0x00007fb5035d253f in ?? ()
#9 0x0000000000000000 in ?? ()

Thread 3 (LWP 28869):
#0 0x00007fb5035d004f in ?? ()
#1 0x0000000000000000 in ?? ()

Thread 2 (LWP 28867):
#0 0x00007fb5035d22a7 in ?? ()
#1 0x0000000000000000 in ?? ()

Thread 1 (LWP 28866):
#0 par_relation (tdbb=tdbb@entry=0x7ffe7a37c190, csb=csb@entry=0x7fb4ffefa9d8, blr_operator=<optimized out>, parse_context=parse_context@entry=true) at ../src/jrd/par.cpp:2296
#1 0x000000000046ea70 in PAR_parse_node (tdbb=tdbb@entry=0x7ffe7a37c190, csb=csb@entry=0x7fb4ffefa9d8, expected=expected@entry=5) at ../src/jrd/par.cpp:3086
#2 0x000000000046f41c in PAR_parse_node (tdbb=tdbb@entry=0x7ffe7a37c190, csb=csb@entry=0x7fb4ffefa9d8, expected=1) at ../src/jrd/par.cpp:2967
#3 0x000000000046e2ef in PAR_parse_node (tdbb=tdbb@entry=0x7ffe7a37c190, csb=csb@entry=0x7fb4ffefa9d8, expected=expected@entry=1) at ../src/jrd/par.cpp:3177
#4 0x000000000046fd29 in PAR_parse_node (tdbb=tdbb@entry=0x7ffe7a37c190, csb=0x7fb4ffefa9d8, expected=expected@entry=0) at ../src/jrd/par.cpp:3299
#5 0x0000000000472c62 in PAR_parse (tdbb=tdbb@entry=0x7ffe7a37c190, csb=..., blr=blr@entry=0x7e2260 <_ZL6jrd_20> "\004\002\004", blr_length=blr_length@entry=56, internal_flag=internal_flag@entry=true, dbginfo_length=dbginfo_length@entry=0, dbginfo=0x0) at ../src/jrd/par.cpp:729
#6 0x000000000066e399 in CMP_compile2 (tdbb=tdbb@entry=0x7ffe7a37c190, blr=blr@entry=0x7e2260 <_ZL6jrd_20> "\004\002\004", blr_length=blr_length@entry=56, internal_flag=internal_flag@entry=true, dbginfo_length=dbginfo_length@entry=0, dbginfo=dbginfo@entry=0x0) at ../src/jrd/cmp.cpp:609
#7 0x00000000005f7fc7 in store_message (handle=<synthetic pointer>, message=0xa6d500 <trigger_messages>, tdbb=0x7ffe7a37c190) at ../temp/boot/jrd/ini.cpp:2252
#8 INI_format (owner=<optimized out>, charset=<optimized out>) at ../temp/boot/jrd/ini.cpp:803
#9 0x0000000000457d79 in jrd8_create_database (user_status=0x7ffe7a37c9d0, filename=0x7fb504828388 "/home/dam/work/debian/firebird/2.5/empty.fdb", handle=0x7ffe7a37c678, dpb_length=<optimized out>, dpb=<optimized out>) at ../src/jrd/jrd.cpp:2249
#10 0x000000000043bc71 in isc_create_database (user_status=user_status@entry=0x7ffe7a37c9d0, file_length=file_length@entry=0, file_name=<optimized out>, public_handle=public_handle@entry=0x7ffe7a37c9c8, dpb_length=dpb_length@entry=0, dpb=dpb@entry=0x0) at ../src/jrd/why.cpp:2068
#11 0x0000000000406dff in main (argc=<optimized out>, argv=0x7ffe7a37cc68) at ../src/utilities/create_db.cpp:22

Omitting -Ox flags makes the full build pass, including creation of various databases, super and classic. Using -O1 segfaults again, but with a more verbose backtrace:

Program terminated with signal SIGSEGV, Segmentation fault.
#0 0x0000000000459e66 in par_relation (tdbb=tdbb@entry=0x7fffdac6d240, csb=csb@entry=0x2b7ff085d9d8, blr_operator=blr_operator@entry=75, parse_context=parse_context@entry=true) at ../src/jrd/par.cpp:2296
2296 csb->csb_rpt[stream].csb_relation = relation;
[Current thread is 1 (Thread 0x2b7fed17fb80 (LWP 8048))]
(gdb) thread apply all bt

Thread 4 (Thread 0x2b7ff07f1700 (LWP 8051)):
#0 pthread_cond_wait@@GLIBC_2.3.2 () at ../sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S:185
#1 0x0000000000674e46 in ISC_event_wait (event=0x2b7fed16d140, value=value@entry=1, micro_seconds=micro_seconds@entry=0) at ../src/jrd/isc_sync.cpp:1884
#2 0x0000000000550b89 in Jrd::LockManager::blocking_action_thread (this=this@entry=0x2b7fed150430) at ../src/lock/lock.cpp:1580
#3 0x0000000000554bbb in Jrd::LockManager::blocking_action_thread (arg=arg@entry=0x2b7fed150430) at ../src/lock/../lock/lock_proto.h:468
#4 0x000000000057bb39 in (anonymous namespace)::ThreadArgs::run (this=<synthetic pointer>) at ../src/jrd/ThreadStart.cpp:128
#5 (anonymous namespace)::threadStart (arg=0x2b7fed155778) at ../src/jrd/ThreadStart.cpp:139
#6 0x00002b7fee17e454 in start_thread (arg=0x2b7ff07f1700) at pthread_create.c:334
#7 0x00002b7fee47cfdd in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:109

Thread 3 (Thread 0x2b7ff03ef700 (LWP 8049)):
#0 0x00002b7fee1862a7 in futex_abstimed_wait_cancelable (private=0, abstime=0x0, expected=0, futex_word=0x2b7fed157030) at ../sysdeps/unix/sysv/linux/futex-internal.h:205
#1 do_futex_wait (sem=sem@entry=0x2b7fed157030, abstime=0x0) at sem_waitcommon.c:111
#2 0x00002b7fee186354 in __new_sem_wait_slow (sem=0x2b7fed157030, abstime=0x0) at sem_waitcommon.c:181
#3 0x00002b7fee1863e9 in __new_sem_wait (sem=sem@entry=0x2b7fed157030) at sem_wait.c:29
#4 0x000000000040aa60 in Firebird::SignalSafeSemaphore::enter (this=0x2b7fed157030) at ../src/common/classes/semaphore.cpp:132
#5 0x0000000000423211 in (anonymous namespace)::shutdownThread () at ../src/jrd/why.cpp:933
#6 0x000000000057bb39 in (anonymous namespace)::ThreadArgs::run (this=<synthetic pointer>) at ../src/jrd/ThreadStart.cpp:128
#7 (anonymous namespace)::threadStart (arg=0x2b7fed155778) at ../src/jrd/ThreadStart.cpp:139
#8 0x00002b7fee17e454 in start_thread (arg=0x2b7ff03ef700) at pthread_create.c:334
#9 0x00002b7fee47cfdd in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:109

Thread 2 (Thread 0x2b7ff05f0700 (LWP 8050)):
#0 0x00002b7fee186475 in futex_abstimed_wait_cancelable (private=0, abstime=0x2b7ff05efe70, expected=0, futex_word=0x2b7fed151588) at ../sysdeps/unix/sysv/linux/futex-internal.h:205
#1 do_futex_wait (sem=sem@entry=0x2b7fed151588, abstime=abstime@entry=0x2b7ff05efe70) at sem_waitcommon.c:111
#2 0x00002b7fee18653f in __new_sem_wait_slow (sem=0x2b7fed151588, abstime=0x2b7ff05efe70) at sem_waitcommon.c:181
#3 0x00002b7fee1865f2 in sem_timedwait (sem=sem@entry=0x2b7fed151588, abstime=abstime@entry=0x2b7ff05efe70) at sem_timedwait.c:36
#4 0x000000000040ab99 in Firebird::SignalSafeSemaphore::tryEnter (this=this@entry=0x2b7fed151588, seconds=seconds@entry=1800, milliseconds=1800000, milliseconds@entry=0) at ../src/common/classes/semaphore.cpp:171
#5 0x00000000004cc0b3 in Jrd::ConfigStorage::touchThreadFunc (this=this@entry=0x2b7fed1515b8) at ../src/jrd/trace/TraceConfigStorage.cpp:352
#6 0x00000000004cc1cb in Jrd::ConfigStorage::touchThread (arg=arg@entry=0x2b7fed1515b8) at ../src/jrd/trace/TraceConfigStorage.cpp:338
#7 0x000000000057bb39 in (anonymous namespace)::ThreadArgs::run (this=<synthetic pointer>) at ../src/jrd/ThreadStart.cpp:128
#8 (anonymous namespace)::threadStart (arg=0x2b7fed155778) at ../src/jrd/ThreadStart.cpp:139
#9 0x00002b7fee17e454 in start_thread (arg=0x2b7ff05f0700) at pthread_create.c:334
#10 0x00002b7fee47cfdd in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:109

Thread 1 (Thread 0x2b7fed17fb80 (LWP 8048)):
#0 0x0000000000459e66 in par_relation (tdbb=tdbb@entry=0x7fffdac6d240, csb=csb@entry=0x2b7ff085d9d8, blr_operator=blr_operator@entry=75, parse_context=parse_context@entry=true) at ../src/jrd/par.cpp:2296
#1 0x000000000045df62 in PAR_parse_node (tdbb=tdbb@entry=0x7fffdac6d240, csb=csb@entry=0x2b7ff085d9d8, expected=expected@entry=5) at ../src/jrd/par.cpp:3086
#2 0x000000000045cfad in PAR_parse_node (tdbb=tdbb@entry=0x7fffdac6d240, csb=csb@entry=0x2b7ff085d9d8, expected=expected@entry=1) at ../src/jrd/par.cpp:2967
#3 0x000000000045e433 in PAR_parse_node (tdbb=tdbb@entry=0x7fffdac6d240, csb=csb@entry=0x2b7ff085d9d8, expected=expected@entry=1) at ../src/jrd/par.cpp:3177
#4 0x000000000045f7e0 in PAR_parse_node (tdbb=tdbb@entry=0x7fffdac6d240, csb=0x2b7ff085d9d8, expected=expected@entry=0) at ../src/jrd/par.cpp:3299
#5 0x0000000000460dc4 in PAR_parse (tdbb=tdbb@entry=0x7fffdac6d240, csb=..., blr=blr@entry=0x783420 <jrd_20> "\004\002\004", blr_length=blr_length@entry=56, internal_flag=internal_flag@entry=true, dbginfo_length=dbginfo_length@entry=0, dbginfo=0x0) at ../src/jrd/par.cpp:729
#6 0x0000000000623b22 in CMP_compile2 (tdbb=tdbb@entry=0x7fffdac6d240, blr=blr@entry=0x783420 <jrd_20> "\004\002\004", blr_length=blr_length@entry=56, internal_flag=internal_flag@entry=true, dbginfo_length=dbginfo_length@entry=0, dbginfo=dbginfo@entry=0x0) at ../src/jrd/cmp.cpp:609
#7 0x00000000005b7b3d in store_message (handle=<synthetic pointer>, message=0xa0b4c0 <trigger_messages>, tdbb=0x7fffdac6d240) at ../temp/boot/jrd/ini.cpp:2252
#8 INI_format (owner=<optimized out>, charset=<optimized out>) at ../temp/boot/jrd/ini.cpp:803
#9 0x0000000000447d76 in jrd8_create_database (user_status=user_status@entry=0x7fffdac6d8a0, filename=<optimized out>, handle=handle@entry=0x7fffdac6d668, dpb_length=<optimized out>, dpb=dpb@entry=0x7fffdac6d510 "\001M") at ../src/jrd/jrd.cpp:2249
#10 0x000000000042eb73 in isc_create_database (user_status=user_status@entry=0x7fffdac6d8a0, file_length=file_length@entry=0, file_name=<optimized out>, public_handle=public_handle@entry=0x7fffdac6d89c, dpb_length=dpb_length@entry=0, dpb=dpb@entry=0x0) at ../src/jrd/why.cpp:2068
#11 0x0000000000405824 in main (argc=<optimized out>, argv=0x7fffdac6da48) at ../src/utilities/create_db.cpp:22

When the optimization level affects the result of program execution, that is most likely a compiler issue, isn't it?
Certainly one can check the value of the variable 'stream' in Thread 1 / frame 0. Normally it's a small non-negative value. It looks like a UCHAR -> SSHORT conversion triggers an incorrect result somewhere. And yes, this smells like a compiler bug.
(gdb) explore stream
The value of 'stream' is of type 'const SSHORT' which is a typedef of type 'const short'
'stream' is a scalar value of type 'const short'.
stream = -13016

The value seems to come from the nextStream method, which uses the csb_n_stream member. The latter, however, seems uninitialized to me:

(gdb) explore csb
'csb' is a pointer to a value of type 'Jrd::CompilerScratch'
Continue exploring it as a pointer to a single value [y/n]: y
The value of '*csb' is a struct/class of type 'Jrd::CompilerScratch' with the following fields:
pool_alloc<(BlockType)15> = <Enter 0 to explore this base class of type 'pool_alloc<(BlockType)15>'>
csb_blr_reader = <Enter 1 to explore this field of type 'Jrd::BlrReader'>
csb_node = <Enter 2 to explore this field of type 'Jrd::jrd_nod *'>
csb_external = <Enter 3 to explore this field of type 'Jrd::ExternalAccessList'>
csb_access = <Enter 4 to explore this field of type 'Jrd::AccessItemList'>
csb_variables = <Enter 5 to explore this field of type 'Jrd::vec<Jrd::jrd_nod*> *'>
csb_resources = <Enter 6 to explore this field of type 'Jrd::ResourceList'>
csb_dependencies = <Enter 7 to explore this field of type 'Jrd::NodeStack'>
csb_fors = <Enter 8 to explore this field of type 'Firebird::Array<Jrd::RecordSource*, Firebird::EmptyStorage<Jrd::RecordSource*> >'>
csb_exec_sta = <Enter 9 to explore this field of type 'Firebird::Array<Jrd::jrd_nod*, Firebird::EmptyStorage<Jrd::jrd_nod*> >'>
csb_invariants = <Enter 10 to explore this field of type 'Firebird::Array<Jrd::jrd_nod*, Firebird::EmptyStorage<Jrd::jrd_nod*> >'>
csb_current_nodes = <Enter 11 to explore this field of type 'Firebird::Array<Jrd::jrd_node_base*, Firebird::EmptyStorage<Jrd::jrd_node_base*> >'>
csb_n_stream = <Enter 12 to explore this field of type 'USHORT'>
csb_msg_number = <Enter 13 to explore this field of type 'USHORT'>
csb_impure = <Enter 14 to explore this field of type 'SLONG'>
csb_g_flags = <Enter 15 to explore this field of type 'USHORT'>
csb_pool = <Enter 16 to explore this field of type 'Firebird::MemoryPool &'>
csb_dbg_info = <Enter 17 to explore this field of type 'Firebird::DbgInfo'>
csb_map_field_info = <Enter 18 to explore this field of type 'Jrd::MapFieldInfo'>
csb_map_item_info = <Enter 19 to explore this field of type 'Jrd::MapItemInfo'>
csb_domain_validation = <Enter 20 to explore this field of type 'Firebird::MetaName'>
csb_view = <Enter 21 to explore this field of type 'Jrd::jrd_rel *'>
csb_view_stream = <Enter 22 to explore this field of type 'USHORT'>
csb_remap_variable = <Enter 23 to explore this field of type 'USHORT'>
csb_validate_expr = false .. (Value of type 'bool')
csb_returning_expr = false .. (Value of type 'bool')
csb_rpt = <Enter 26 to explore this field of type 'Firebird::HalfStaticArray<Jrd::CompilerScratch::csb_repeat, 5ul>'>
Enter the field number of choice: 12
The value of '(*csb).csb_n_stream' is of type 'USHORT' which is a typedef of type 'unsigned short'
'(*csb).csb_n_stream' is a scalar value of type 'unsigned short'.
(*csb).csb_n_stream = 52521

So it is indeed a USHORT -> SSHORT conversion (induced by the types used to pass the value), but the source of the problem seems to be the very large value in csb.csb_n_stream. 52521 == 0xcd29, which looks like garbage at first glance.
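The two numbers reported by gdb can be reconciled with a small sketch. It assumes USHORT and SSHORT are 16-bit typedefs, as the gdb output states; `nextStreamAsShort` and `checkObservedValues` are hypothetical helpers written for this illustration, not Firebird code. Since `nextStream` ends with `return csb_n_stream++;`, a counter holding garbage value 52520 would yield stream = -13016 and leave 52521 behind, which is consistent with what was observed.

```cpp
#include <cassert>
#include <cstdint>

// Assumed to match the typedefs reported by gdb in the comment above.
typedef std::uint16_t USHORT;
typedef std::int16_t SSHORT;

// Hypothetical stand-in for the call path: post-increment the counter
// (as `return csb_n_stream++;` does) and store the old value in a
// signed 16-bit variable.
inline SSHORT nextStreamAsShort(USHORT& counter)
{
    const USHORT old = counter++;
    // Values above 32767 wrap to negative numbers on two's-complement
    // targets such as x86-64 (implementation-defined before C++20).
    return static_cast<SSHORT>(old);
}

// Replays the values seen in the gdb session.
inline void checkObservedValues()
{
    USHORT counter = 52520;                  // garbage left in the object
    const SSHORT stream = nextStreamAsShort(counter);
    assert(stream == -13016);                // matches 'stream' in frame 0
    assert(counter == 52521);                // matches csb_n_stream
    assert(counter == 0xcd29);               // same value in hex
}
```

A negative `stream` used as an index into `csb_rpt` would then point far outside the array, which fits the crash at par.cpp:2296.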
CompilerScratch uses operator new from class pool_alloc, which zeroes memory. Could the issue be caused by the wrong operator new being used, leaving the memory (including csb_n_stream) uninitialized?

A good technique I use to debug this type of problem is:
- Disable Linux memory address space randomization
- Run the code, get the address of this variable
- Watch the address of this variable in gdb
- Run again and check every change to this address

Setting a watchpoint for the address tells me the memory is indeed not initialized. Here's the list of places where the memory location of csb.csb_n_stream is changed:
Hardware watchpoint 2: *(USHORT *) 0x7ffff36b7b00
Old value = <unreadable>
New value = 31920
Firebird::BePlusTree<Firebird::BlockInfo, unsigned long, Firebird::MemoryPool::InternalAllocator, Firebird::BlockInfo, Firebird::DefaultComparator<unsigned long> >::add (this=0x7ffff36b79e0, item=..., accessor=accessor@entry=0x7ffff36b79f8) at ../src/include/../common/classes/tree.h:705
705 return true;
(gdb) c
Continuing.

Hardware watchpoint 2: *(USHORT *) 0x7ffff36b7b00
Old value = 31920
New value = 30232
_wordcopy_fwd_aligned (dstp=<optimized out>, srcp=<optimized out>, len=8) at wordcopy.c:118
118 wordcopy.c: No such file or directory.
(gdb) c
Continuing.

Hardware watchpoint 2: *(USHORT *) 0x7ffff36b7b00
Old value = 30232
New value = 30856
Firebird::MemoryPool::internal_deallocate (this=this@entry=0x7ffff36b79d8, block=block@entry=0x7ffff36b7888) at ../src/common/classes/alloc.cpp:1897
1897 }
(gdb) c
Continuing.

Hardware watchpoint 2: *(USHORT *) 0x7ffff36b7b00
Old value = 30856
New value = 30232
Firebird::MemoryPool::internal_alloc (this=this@entry=0x7ffff36b79d8, size=size@entry=32, upper_size=upper_size@entry=0, type=type@entry=0) at ../src/common/classes/alloc.cpp:1535
1535 addFreeBlock(current_block);
(gdb) c
Continuing.

Hardware watchpoint 2: *(USHORT *) 0x7ffff36b7b00
Old value = 30232
New value = 30856
Firebird::BePlusTree<Firebird::BlockInfo, unsigned long, Firebird::MemoryPool::InternalAllocator, Firebird::BlockInfo, Firebird::DefaultComparator<unsigned long> >::add (this=0x7ffff36b79e0, item=..., accessor=accessor@entry=0x7ffff36b79f8) at ../src/include/../common/classes/tree.h:705
705 return true;
(gdb) c
Continuing.

Hardware watchpoint 2: *(USHORT *) 0x7ffff36b7b00
Old value = 30856
New value = 30968
Firebird::BePlusTree<Firebird::BlockInfo, unsigned long, Firebird::MemoryPool::InternalAllocator, Firebird::BlockInfo, Firebird::DefaultComparator<unsigned long> >::add (this=0x7ffff36b79e0, item=..., accessor=accessor@entry=0x7ffff36b79f8) at ../src/include/../common/classes/tree.h:705
705 return true;
(gdb) c
Continuing.

Hardware watchpoint 2: *(USHORT *) 0x7ffff36b7b00
Old value = 30968
New value = 27944
Firebird::BePlusTree<Firebird::BlockInfo, unsigned long, Firebird::MemoryPool::InternalAllocator, Firebird::BlockInfo, Firebird::DefaultComparator<unsigned long> >::add (this=0x7ffff36b79e0, item=..., accessor=accessor@entry=0x7ffff36b79f8) at ../src/include/../common/classes/tree.h:705
705 return true;
(gdb) c
Continuing.

Hardware watchpoint 2: *(USHORT *) 0x7ffff36b7b00
Old value = 27944
New value = 27945
0x0000000000462701 in Jrd::CompilerScratch::nextStream (this=0x7ffff36b79d8, check=<optimized out>) at ../src/jrd/../jrd/exe.h:853
853 return csb_n_stream++;

There's no zeroing-out, it seems. How do I check whether the right memory allocator is used?

csb is an instance of class CompilerScratch, which inherits from pool_alloc:
class CompilerScratch : public pool_alloc<type_csb>

Inside pool_alloc:

void* operator new(size_t s, MemoryPool& p ) { return p.calloc(s); }
void* operator new[](size_t s, MemoryPool& p) { return p.calloc(s); }

And MemoryPool::calloc() zero-initializes the memory. If gcc6 decides to use some different operator new() rather than pool_alloc::operator new(), then Firebird has no chance of working.

As found by Richard Biener (our toolchain developer), Firebird 2.5 builds fine with "-std=gnu++98 -fno-lifetime-dse". The gcc documentation for -fno-lifetime-dse says
-fno-lifetime-dse
In C++ the value of an object is only affected by changes within its lifetime: when the constructor begins, the object has an indeterminate value, and any changes during the lifetime of the object are dead when the object is destroyed. Normally dead store elimination will take advantage of this; if your code relies on the value of the object storage persisting beyond the lifetime of the object, you can use this flag to disable this optimization. To preserve stores before the constructor starts (e.g. because your operator new clears the object storage) but still treat the object as dead after the destructor, you can use -flifetime-dse=1. The default behavior can be explicitly selected with -flifetime-dse=2. -flifetime-dse=0 is equivalent to -fno-lifetime-dse.

IMHO this may explain the behaviour observed in previous comments; the case of operator new clearing object storage is explicitly mentioned as an example.

Here are my findings from building trunk with gcc 6:
First, I just committed a change so that GPRE generates code compatible with C++14.

I build in C++14 mode with these options:

-fno-sized-deallocation: because C++14 passes the size to the delete operator. Instead of using this option, it would be better to check __cplusplus >= 201402L and make the appropriate change in the sources.
-fno-delete-null-pointer-checks: because of the ->as and ->is methods that we call on null pointers. There may be other similar things.

These options are not needed:
-fno-lifetime-dse
-flifetime-dse=1

But I think -flifetime-dse=1 should be safe.

(8) should be fixed now (it produced errors with MSVC14, not warnings)
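The pattern discussed in this thread (an operator new that clears storage, and a constructor that relies on it) can be made independent of the -flifetime-dse setting by initializing members explicitly. The following is a minimal sketch under that assumption; `ZeroedBase`, `Scratch` and `firstStreamOfFreshScratch` are hypothetical names for illustration and do not reproduce Firebird's pool_alloc or CompilerScratch.

```cpp
#include <cstddef>
#include <cstdlib>
#include <new>

// Hypothetical stand-in for a base class whose operator new clears storage,
// similar in spirit to `return p.calloc(s);` quoted earlier in the thread.
struct ZeroedBase
{
    static void* operator new(std::size_t size)
    {
        void* p = std::calloc(1, size);
        if (!p)
            throw std::bad_alloc();
        return p;
    }

    static void operator delete(void* p)
    {
        std::free(p);
    }
};

struct Scratch : ZeroedBase
{
    // Explicit initialization: the counter no longer depends on stores made
    // before the constructor started, so the compiler may discard those
    // stores (the default -flifetime-dse=2 behaviour) without changing
    // what the program does.
    Scratch() : n_stream(0) {}

    unsigned short nextStream() { return n_stream++; }

    unsigned short n_stream;
};

// Allocates a Scratch through the zeroing operator new and returns the
// first stream number it hands out.
inline unsigned short firstStreamOfFreshScratch()
{
    Scratch* s = new Scratch;
    const unsigned short first = s->nextStream();
    delete s;
    return first;
}
```

Without the member initializer, the value read by `nextStream()` would come only from the pre-constructor clearing, which is exactly the store the gcc documentation quoted above says may be eliminated.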
For information, here is the build log for Firebird 3.0.0.32483 under Fedora with gcc 6:
https://copr-be.cloud.fedoraproject.org/results/makowski/firebird/fedora-rawhide-x86_64/00175576-firebird/