gfix -sweep makes "reconnect" when it is removed from mon$attachments by delete command (issued in another window) [CORE4337] #1535
Comments
Modified by: @pavel-zotov Attachment: gdb-gfix-unsuccessful-detach-via-delete-from-mon_attachments.zip [ 12423 ] Attachment: trace-when-gfix-unsuccessful-detach-via-delete-from-mon_attachments.zip [ 12424 ]
Commented by: @pavel-zotov PS. Please note that the attachment ID of gfix changed on iteration #2, but the process PID stayed the same.
Modified by: @pavel-zotov summary: gfix makes "reconnect" when it is removed from mon$attachments by delete command (issued in another window) => gfix -sweep makes "reconnect" when it is removed from mon$attachments by delete command (issued in another window)
Modified by: @dyemanov Regression: 3.0 Alpha 2 [ 10560 ]
Modified by: @hvlad assignee: Vlad Khorsun [ hvlad ]
Modified by: @dyemanov Fix Version: 3.0.0 [ 10048 ]
Commented by: @hvlad Pavel, could you repeat this test to see whether it still reproduces?
Commented by: @pavel-zotov Done on LI-V3.0.0.32251 (SS and SC). Tail from firebird.log: oel64 Tue Dec 29 16:13:30 2015
Modified by: @hvlad status: Open [ 1 ] => Resolved [ 5 ] resolution: Duplicate [ 3 ] Fix Version: 3.0 RC2 [ 10048 ] =>
Modified by: @pcisar status: Resolved [ 5 ] => Closed [ 6 ]
Modified by: @pavel-zotov QA Status: No test
Modified by: @pavel-zotov status: Closed [ 6 ] => Closed [ 6 ] QA Status: No test => Done successfully
Submitted by: @pavel-zotov
Attachments:
gdb-gfix-unsuccessful-detach-via-delete-from-mon_attachments.zip
trace-when-gfix-unsuccessful-detach-via-delete-from-mon_attachments.zip
Scenario.
1) Create a new database with the default page size (4096) and set forced writes (FW) to OFF.
2) Run the following DDL strictly in NON-interactive mode, i.e. isql path/database -i script.sql:
recreate table t(id int primary key, s01 varchar(36) , s02 varchar(36) , s03 varchar(36) );
commit;
create index t_s01 on t(s01);
create index t_s02 on t(s02);
create index t_s03 on t(s03);
commit;
set term ^;
execute block as
begin
begin
execute statement 'create sequence g';
when any do begin end
end
end^
set term ;^
alter sequence g restart with 0;
commit;
-- 5'000'000 ==> 1700Mb, ~7 min
set stat on;
set term ^;
execute block as
declare n int = 5000000;
begin
while (n>0) do
insert into t(id, s01, s02, s03)
values( :n +iif( mod(:n,1000)=0, 0*gen_id(g,1000), 0),
uuid_to_char(gen_uuid()),
uuid_to_char(gen_uuid()),
uuid_to_char(gen_uuid())
) returning :n-1 into n;
end^
set term ;^
set echo on;
commit;
select count(*) from t;
delete from t;
commit;
set echo off;
show version;
show database;
set echo on;
exit;
/* this script produces a lot of garbage record versions that gfix -sweep must remove later */
3) Create a simple script for non-interactive gathering of stack traces (I named it "gdb_backtrace_batch.script"):
----
thread apply all bt full
quit
yes
----
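A minimal sketch of how this batch file might be generated and then used, assuming bash and gdb on PATH. The helper name `make_gdb_batch` and the heredoc approach are illustrative and not part of the original test; the three command lines are copied verbatim from the listing above.

```shell
# Hypothetical helper: write the gdb command file shown above to the path in $1.
make_gdb_batch() {
  cat > "$1" <<'EOF'
thread apply all bt full
quit
yes
EOF
}

make_gdb_batch ./gdb_backtrace_batch.script

# Assumed usage (not executed here): attach to a running process by PID
# and dump full backtraces of all its threads into a log file.
# gdb -q -x ./gdb_backtrace_batch.script /opt/fb30trnk/bin/gfix "$gfixpid" > gfix.gdb.txt 2>&1
```

Passing the PID after the executable path makes gdb attach to the live process, which is how the main script below collects its traces.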
4) Create an auxiliary .sql script that attempts to remove the gfix attachment from mon$attachments:
-- file: gfixkill_eb.sql
set list on;
select * from mon$database;
commit;
set term ^;
execute block returns(dts_before timestamp, deleted_attach_id int , dts_after timestamp) as
begin
dts_before=cast('now' as timestamp);
deleted_attach_id=-1; -- if this value remains, no gfix attachment was found
for
select mon$attachment_id
from mon$attachments
where mon$remote_process containing 'gfix'
into
deleted_attach_id
as cursor
tcur
do
delete from mon$attachments where current of tcur;
dts_after=cast('now' as timestamp);
suspend;
end^
set term ;^
set stat off;
set echo on;
show database;
commit;
exit;
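The result of gfixkill_eb.sql could be checked mechanically rather than by eye. Below is a rough sketch, assuming that isql with `set list on` prints each output column as `NAME <spaces> value` on its own line; the helper name `report_detach` is hypothetical, and -1 is the sentinel set in the execute block above.

```shell
# Hypothetical helper: read isql "set list on" output on stdin and report
# whether the execute block deleted a gfix attachment.
report_detach() {
  # Take the value of the last DELETED_ATTACH_ID row (empty if there is none).
  id=$(awk '$1 == "DELETED_ATTACH_ID" { v = $2 } END { print v }')
  case "$id" in
    ""|-1) echo "no gfix attachment found" ;;
    *)     echo "deleted gfix attachment: $id" ;;
  esac
}

# Assumed usage (not executed here):
# $fbhome/bin/isql localhost/$fbport:$dbname -i gfixkill_eb.sql | report_detach
```

Comparing the reported ID across iterations would make the "reconnect" visible, since the attachment ID changes while the gfix PID stays the same.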
5) Create the main .sh script (to be run under a Linux shell):
# file: gfixtest.sh
clear
fbhome=/opt/fb30trnk
fbport=3333
fbname=firebird
dbname=/var/db/fb30/gfixtest30.fdb
gdb_batch_file=./gdb_backtrace_batch.script
delay=20
i=1
killall -9 gfix 2>/dev/null
echo $(echo -n $(date +'%Y-%m-%d %H:%M:%S.%N')|cut -c1-24) sweep starting:
set -x
###################################################
$fbhome/bin/gfix -sweep localhost/$fbport:$dbname &
###################################################
set +x
fbpid=$(ps aux|grep /opt/fb30trnk/bin/firebird|grep -v grep|awk '{print $2}')
gfixpid=$(ps aux|grep $fbhome/bin/gfix|grep -v "defunct\|grep"|awk '{print $2}')
echo +++++++++++++++++++++++++++++++++++
echo alive \"gfix -sweep\" process: $gfixpid, firebird process: $fbpid
echo +++++++++++++++++++++++++++++++++++
ps $gfixpid
echo gather initial stacktrace of running gfix...
gdb -q -x $gdb_batch_file $fbhome/bin/gfix $gfixpid 1>logs/gfix_started_$(date +'%y%m%d_%H%M%S').gdb.txt 2>&1
gdb -q -x $gdb_batch_file $fbhome/bin/$fbname $fbpid 1>logs/firebird_when_gfix_started_$(date +'%y%m%d_%H%M%S').gdb.txt 2>&1
echo done:
ls -l logs/*.gdb.txt
while :
do
gfixpid=$(ps aux|grep $fbhome/bin/gfix|grep -v "defunct\|grep"|awk '{print $2}')
echo . . . . . . . . . . . iter N $i . . . . . . . . . . . . . .
echo before isql: alive gfix process: \>\>\> $gfixpid \<\<\<
echo $(echo -n $(date +'%Y-%m-%d %H:%M:%S.%N')|cut -c1-24) now wait $delay seconds before killing it...
sleep $delay
echo ....................................................................
echo $(echo -n $(date +'%Y-%m-%d %H:%M:%S.%N')|cut -c1-24) attempt to detach gfix process...
set -x
$fbhome/bin/isql localhost/$fbport:$dbname -i gfixkill_eb.sql
set +x
echo $(echo -n $(date +'%Y-%m-%d %H:%M:%S.%N')|cut -c1-24) result of attempt $i to detach gfix: check that there is NO alive gfix pid:
# again!
gfixpid=$(ps aux|grep $fbhome/bin/gfix|grep -v "defunct\|grep"|awk '{print $2}')
if [ -n "$gfixpid" ]; then
echo after isql: alive gfix process: \>\>\> $gfixpid \<\<\<
echo $(echo -n $(date +'%Y-%m-%d %H:%M:%S.%N')|cut -c1-24) starting gather stacktrace for gfix and firebird processes.
gdb4gfixlog=logs/gfix_alive_$(date +'%y%m%d_%H%M%S').gdb.txt
gdb4fblog=logs/firebird_when_gfix_alive_$(date +'%y%m%d_%H%M%S').gdb.txt
set -x
gdb -q -x $gdb_batch_file $fbhome/bin/gfix $gfixpid 1>$gdb4gfixlog 2>&1
gdb -q -x $gdb_batch_file $fbhome/bin/$fbname $fbpid 1>$gdb4fblog 2>&1
set +x
echo $(echo -n $(date +'%Y-%m-%d %H:%M:%S.%N')|cut -c1-24) finish gathered stacktrace for gfix and firebird processes:
ls -l $gdb4gfixlog $gdb4fblog
else
echo NO gfix process found. Bye!..
exit
fi
echo $(echo -n $(date +'%Y-%m-%d %H:%M:%S.%N')|cut -c1-24) finish iter $i
i=$((i+1))
done
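The script repeats the same `ps aux|grep|grep -v|awk` pipeline three times to find the live gfix PID. As a rough sketch only, that lookup could be factored into one function that filters `ps aux`-style text from stdin, which also lets it be checked against canned input without a running server. The name `pids_of` is hypothetical, and the PID is assumed to be in column 2, as in the original awk call.

```shell
# Hypothetical helper: print the PIDs (column 2) of lines on stdin that contain
# the literal string $1, skipping defunct processes and the grep command itself.
pids_of() {
  awk -v pat="$1" 'index($0, pat) && !/defunct/ && !/grep/ { print $2 }'
}

# Canned ps-style lines standing in for real `ps aux` output (values are made up).
printf '%s\n' \
  'firebird 101 0.0 0.1 1000 100 ? S 16:13 0:00 /opt/fb30trnk/bin/gfix -sweep localhost/3333:/var/db/fb30/gfixtest30.fdb' \
  'firebird 102 0.0 0.0 0 0 ? Z 16:13 0:00 [/opt/fb30trnk/bin/gfix] <defunct>' \
  'root 103 0.0 0.0 900 80 pts/0 S+ 16:13 0:00 grep /opt/fb30trnk/bin/gfix' \
  | pids_of /opt/fb30trnk/bin/gfix

# Assumed usage inside the script: gfixpid=$(ps aux | pids_of "$fbhome/bin/gfix")
```

Using `index()` matches the path as a plain string, so dots and slashes in `$fbhome` are not interpreted as a regular expression.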
6) Make a subdirectory 'logs' under the current folder and update the settings in gfixtest.sh for your environment:
fbhome=/opt/fb30trnk
fbport=3333
fbname=firebird
dbname=/var/db/fb30/gfixtest30.fdb
7) Run gfixtest.sh
This script is NOT able to detach the gfix -sweep process on the FIRST attempt; it takes TWO attempts.
Attached files:
1) gdb-gfix-unsuccessful-detach-via-delete-from-mon_attachments.zip - stack traces for the GFIX and FIREBIRD processes at two moments:
1.1) when gfix -sweep had just started
1.2) when gfix could not be detached (iteration #2 of the shell script)
2) trace-when-gfix-unsuccessful-detach-via-delete-from-mon_attachments.zip - trace and shell script output from another run of this test.