Decompression overran buffer after rollback [CORE5422] #2082

Closed
firebird-automations opened this issue Dec 15, 2016 · 11 comments

Comments

@firebird-automations
Collaborator

Submitted by: prenosil (prenosil)

This is the same issue as CORE5392 (for FB3.0) and CORE5420 (for FB4.0), and I suspect that CORE5419 is the same problem.

I copied the test from CORE5420 with some minor changes: the step that adds a primary key is removed because it is unnecessary, and forced writes are set ON (in my tests on FB2.5.6 I can't reproduce the crash with FW OFF).

A few notes:
- the probability of a crash is higher with CpuAffinityMask = 1 than with CpuAffinityMask = 0
- when testing on a real table/data, the crash occurs even without "alter table test add ..."
- exactly 2 minutes after "Decompression overran" there is a "deadlock" entry in firebird.log
- no error occurs when
  - GC is turned off in the connect string (isc_dpb_no_garbage_collect), or
  - GC is set to Cooperative in .conf, or
  - the Embedded version is used
- when testing a nightly FB3.0 snapshot the problem seems fixed (it did not occur even after >400 runs of the test scripts), i.e. the fix from CORE5392 works
- the latest snapshot 2.5.7.27030-0_x64 crashes too
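
For reference, the server-side settings mentioned in the notes could look roughly like the fragment below. This is only a sketch, assuming the standard firebird.conf parameter names GCPolicy and CpuAffinityMask of a 2.5 SuperServer install; the values are the ones discussed in this ticket, not a recommendation.

```
# firebird.conf (sketch, values taken from the notes above)

# Workaround seen in testing: with cooperative GC the error did not occur.
GCPolicy = cooperative

# Policy under which the error was reproduced (the SuperServer default):
#GCPolicy = combined

# Affinity setting that made the crash more likely in these tests.
CpuAffinityMask = 1
```

The server has to be restarted for changes in firebird.conf to take effect.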

==========

shell del E:\TEMP\Test.fdb 2>nul;
create database 'localhost:E:\TEMP\Test.fdb';

connect 'localhost:E:\TEMP\Test.fdb' no garbage_collect;

create domain dm_longutf as varchar(8000) character set utf8;
recreate table test (id int not null, a int);
commit;

set echo on;

set term ^;
execute block as
declare i int;
declare n int = 100000;
begin
while (n>0) do insert into test(id, a) values(:n, :n) returning :n-1 into n;
end
^
set term ;^
commit;

alter table test add b dm_longutf default '' not null;
commit;

update test set a=2;
rollback;

set list on;
-- this leads to decompression overran buffer (179), file: sqz.cpp line: 282
set echo on;
update test set a=3;
commit;

==========

Commits: 90a46fa 835c78c dd52257 0eb85af 10c6a14 8d1110a

====== Test Details ======

The bug is 100% reproduced on 2.5.7.27030 (18-dec-2016) and fixed on 27038.
A test already exists (see core_5392.fbt), so I only changed its min_version to 2.5.7.

@firebird-automations
Collaborator Author

Commented by: @dyemanov

Ivan, just FYI: CORE5392 was a regression, and the problematic code does not exist in v2.x. So it seems that v2.5 also has races with the background GC thread that can lead to the same issue, but the reason is somewhat different (although possibly related). Thus, so far I doubt it's fixed by the patch for CORE5392; maybe it's just hidden in v3 now.

BTW, is the "Environment" ticket field really correct? Based on your description, it should rather contain "GC set to Mixed / Background".

@firebird-automations
Collaborator Author

Commented by: prenosil (prenosil)

Sorry, I copy/pasted the wrong row. GC should be the default one, i.e.
GCPolicy = combined

@firebird-automations
Collaborator Author

Modified by: @hvlad

assignee: Vlad Khorsun [ hvlad ]

@firebird-automations
Collaborator Author

Commented by: @hvlad

Looking at the nature of the bug and the patch (just committed into B2_5_Release), I'd say it is very old.
It was probably hidden somehow before.
I must note that I couldn't reproduce it until the affinity mask was set to use more than one core.

@firebird-automations
Collaborator Author

Modified by: @hvlad

Version: 3.0.1 [ 10730 ]

Version: 3.0.0 [ 10740 ]

Version: 4.0 Initial [ 10621 ]

Version: 2.5.5 [ 10670 ]

Version: 2.5.4 [ 10585 ]

Version: 2.5.3 Update 1 [ 10650 ]

Version: 2.1.7 [ 10651 ]

Version: 2.5.3 [ 10461 ]

Version: 2.5.2 Update 1 [ 10521 ]

Version: 2.5.2 [ 10450 ]

Version: 2.5.1 [ 10333 ]

Version: 2.5.0 [ 10221 ]

Fix Version: 3.0.2 [ 10785 ]

Fix Version: 4.0 Alpha 1 [ 10731 ]

Fix Version: 2.5.7 [ 10770 ]

@firebird-automations
Collaborator Author

Commented by: @hvlad

Ivan,
could you please test the next snapshot build of 2.5.7?

@firebird-automations
Collaborator Author

Commented by: prenosil (prenosil)

Tested intensively by two people; even after several hundred cycles the error did not occur any more. Thanks a lot.

@firebird-automations
Collaborator Author

Modified by: @hvlad

status: Open [ 1 ] => Resolved [ 5 ]

resolution: Fixed [ 1 ]

@firebird-automations
Collaborator Author

Commented by: @hvlad

Thanks!

@firebird-automations
Collaborator Author

Modified by: @pavel-zotov

status: Resolved [ 5 ] => Resolved [ 5 ]

QA Status: No test => Covered by another test(s)

Test Details: The bug is 100% reproduced on 2.5.7.27030 (18-dec-2016) and fixed on 27038.
A test already exists (see core_5392.fbt), so I only changed its min_version to 2.5.7.

Test Specifics: [Architecture (SS/CS) specific]

@firebird-automations
Collaborator Author

Modified by: @pavel-zotov

status: Resolved [ 5 ] => Closed [ 6 ]
