Issue Details (XML | Word | Printable)

Key: CORE-3921
Type: Bug Bug
Status: Closed Closed
Resolution: Fixed
Priority: Major Major
Assignee: Vlad Khorsun
Reporter: Pavel Zotov
Votes: 1
Watchers: 3
Operations

If you were logged in you would be able to see more operations.
Firebird Core

"record disappeared (186), file: vio.cpp line: 408" + CPU 100% (when bugcheckabort=1 and sweep starts at gap ~21000)

Created: 11/Sep/12 04:39 AM   Updated: 02/Sep/16 08:20 PM
Component/s: Engine
Affects Version/s: None
Fix Version/s: 3.0 Alpha 2, 2.5.3

File Attachments: 1. Zip Archive trace_sweep_vio_error.zip (1 kB)

Issue Links:
Relate
 

QA Status: Deferred
Test Details:
I've prepared test but:
1) can not reproduce bug on WI-V2.5.2.26540 (release; checked on both SC and SS) and have no intermediate snapshot WI-V2.5.2.26536 that was specified in the start message; also this bug absent in 2.5.1;
2) test lasts about 2 minutes on regular PC (cpu 3.0 GHz, ram 2gb), so I prefer to refrain from committing such test in fbt-repo.

The only note: test has final command which launches ISQL with 'drop database; quit;' statements, but it does not work: database file remains on disk, despite that there is no any attachments at this point (including trace session). Checked on 2.5.2, the same on 2.5.7.27025 (SS and SC).

2 dimitr, hvlad: i can send .py script to you if it's needed.


 Description  « Hide
Intro: create database 't0.fdb'; commit; quit;
####

session #1
##########
C:\1INSTALL\FIREBIRD\Data>isql localhost:C:\1INSTALL\FIREBIRD\Data\t0.fdb -n
Database: localhost:C:\1INSTALL\FIREBIRD\Data\t0.fdb
SQL> recreate table tfix1(id int primary key, s01 varchar(1000));
SQL> recreate table tfix2(id int primary key, s01 varchar(1000));
SQL> commit;
SQL> insert into tfix1 values(-1,'tfix1');
SQL> insert into tfix2 values(-2,'tfix2');
SQL> commit;
SQL>
SQL> set transaction read committed;
SQL> set term ^;
SQL> execute block as
CON> declare n int = 200000;
CON> declare k int;
CON> begin
CON> while (n>0) do begin
CON> --insert into ttmp(id,s01) values(:n, rpad('',1000,'x'));
CON> insert into tfix1(id,s01) values(:n, rpad('',1000,'x'));
CON> insert into tfix2(id,s01) values(:n, rpad('',1000,'x'));
CON> n=n-1;
CON> end
CON> end^
SQL> set term ;^

session #2 // start it after execute block in session #1 has finished
##########
commit;
set transaction read committed;
set term ^;
execute block as
  declare n int = 21000;
  declare k int;
begin
  while (n>0) do begin
    in autonomous transaction do select 1 from rdb$database into :k;
    n=n-1;
  end
end^
set term ;^
commit;

session #1 // switch in it only after EB in session #2 has finished
##########
rollback;

/* at this point we have:
Database "T0.FDB"
Database header page information:
<...>
Page size 4096
ODS version 11.2
Oldest transaction 5
Oldest active 6
Oldest snapshot 6
Next transaction 21008
<...>
Database dialect 3
Creation date Sep 11, 2012 8:09:28
Attributes force write

    Variable header data:
*END*
*/

session #3
##########
Start user trace with the following config file:
<database %[\\/]((t0|t1|t2).fdb)|(test001)>
  enabled true
  log_connections true
  log_transactions true

  log_statement_prepare true
  log_statement_free true

  log_statement_start true
  log_statement_finish true

# log_trigger_start true
  log_trigger_finish true

  log_blr_requests true

  log_errors true
  log_sweep true # <<<<<<<<<<<<<< ENABLE sweep info to be logged <<<<<<<<<<<<<<<

  print_perf true
  print_blr true

  time_threshold 0
  max_sql_length 4000
  max_blr_length 1000
  max_dyn_length 1000
  max_log_size 0
  #log_procedure_finish true
</database>

session #1
#########
update tfix2 set s01='' where id<0; -- стартует sweep, в трейсе по нему появляется туча сообщений


session #2
#########
SQL> select count(*) from tfix1;

       COUNT
============
           1

After this point I got standard 'critical error windows' on my Windows-2000 machine (fb_inet_server.exe - The application has requested the Runtime to terminate it in an unusial way).
Trace window has been stopped (like if I press Ctrl-C in it) and the following messages appeared in firebird.log:

BALAHA Tue Sep 11 08:25:40 2012
Sweep is started by SWEEPER
Database "C:\1INSTALL\FIREBIRD\DATA\T0.FDB"
OIT 5, OAT 6, OST 6, Next 21009


BALAHA Tue Sep 11 08:25:50 2012
Database: C:\1INSTALL\FIREBIRD\DATA\T0.FDB
internal Firebird consistency check (record disappeared (186), file: vio.cpp line: 408)


BALAHA (Client) Tue Sep 11 08:26:11 2012
XNET error: Server shutdown detected

No any actions could be done in both the sessions #1 and 2:
SQL> select count(*) from tfix1;
Statement failed, SQLSTATE = 08006
Unable to complete network request to host "localhost".
-Error writing data to the connection.

Trace text has been attached in ZIP.

 All   Comments   Change History   Subversion Commits      Sort Order: Ascending order - Click to sort in descending order
Pavel Zotov added a comment - 11/Sep/12 04:42 AM
PS.

ISQL Version: WI-V2.5.2.26536 Firebird 2.5
Server version:
Firebird/x86/Windows NT (access method), version "WI-V2.5.2.26536 Firebird 2.5"
Firebird/x86/Windows NT (remote server), version "WI-V2.5.2.26536 Firebird 2.5/tcp (balaha)/P12"
Firebird/x86/Windows NT (remote interface), version "WI-V2.5.2.26536 Firebird 2.5/tcp (balaha)/P12"
on disk structure version 11.2