Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Placing "Srp" at the start of authentification plugins list (on server) can lead to multiple (up to 42) fails to connect with error "SQLSTATE = 08006". Mostly on Classic, also can be on SuperClassic. Never on SuperServer. Requires FB run as APP. [CORE5050] #5337

Closed
firebird-automations opened this issue Dec 19, 2015 · 14 comments

Comments

@firebird-automations
Copy link
Collaborator

Submitted by: @pavel-zotov

Attachments:
logs-with-failed-first-attempt-to-connect.7z
c5050-upd.7z

1. Create SQL script (I've named it 'c4868.sql' because this is next step of investigation core-4868 issues):

set term ^;
create or alter procedure sys_get_fb_arch (
a_connect_with_usr varchar(31) default 'SYSDBA'
,a_connect_with_pwd varchar(31) default 'masterke'
) returns(
fb_arch varchar(50)
) as
declare cur_server_pid int;
declare ext_server_pid int;
declare att_protocol varchar(255);
declare v_test_sttm varchar(255);
declare v_fetches_beg bigint;
declare v_fetches_end bigint;
begin

\-\- Aux SP for detect FB architecture\.

select a\.mon$server\_pid, a\.mon$remote\_protocol
from mon$attachments a
where a\.mon$attachment\_id = current\_connection
into cur\_server\_pid, att\_protocol;

if \( att\_protocol is null \) then
    fb\_arch = 'Embedded';
else if \( upper\(current\_user\) = upper\('SYSDBA'\)
          and rdb$get\_context\('SYSTEM','ENGINE\_VERSION'\) NOT starting with '2\.5' 
          and exists\(select \* from mon$attachments a 
                     where a\.mon$remote\_protocol is null
                           and upper\(a\.mon$user\) in \( upper\('Cache Writer'\), upper\('Garbage Collector'\)\)
                    \) 
        \) then
    fb\_arch = 'SuperServer';
else
    begin
        v\_test\_sttm =
            'select a\.mon$server\_pid \+ 0\*\(select 1 from rdb$database\)'
            \|\|' from mon$attachments a '
            \|\|' where a\.mon$attachment\_id = current\_connection';

        select i\.mon$page\_fetches
        from mon$io\_stats i
        where i\.mon$stat\_group = 0  \-\- db\_level
        into v\_fetches\_beg;
    
        execute statement v\_test\_sttm
        on external
             'localhost:' \|\| rdb$get\_context\('SYSTEM', 'DB\_NAME'\)
        as
             user a\_connect\_with\_usr
             password a\_connect\_with\_pwd
             role left\('R' \|\| replace\(uuid\_to\_char\(gen\_uuid\(\)\),'\-',''\),31\)
        into ext\_server\_pid;
    
        in autonomous transaction do
        select i\.mon$page\_fetches
        from mon$io\_stats i
        where i\.mon$stat\_group = 0  \-\- db\_level
        into v\_fetches\_end;
    
        fb\_arch = iif\( cur\_server\_pid is distinct from ext\_server\_pid, 
                       'Classic', 
                       iif\( v\_fetches\_beg is not distinct from v\_fetches\_end, 
                            'SuperClassic', 
                            'SuperServer'
                          \) 
                     \);
    end

fb\_arch = fb\_arch \|\| ' ' \|\| rdb$get\_context\('SYSTEM','ENGINE\_VERSION'\);

suspend;

end

^ -- sys_get_fb_arch
set term ;^
commit;

create or alter view v_info as
select
mon$user as who_ami,
mon$auth_method as my_auth,
rdb$get_context('SYSTEM','CLIENT_PROCESS') as my_exe,
mon$client_version as my_fbdll,
current_connection as my_attach,
p.fb_arch
from mon$attachments
left join sys_get_fb_arch('SYSDBA', 'masterke') p on 1=1
where mon$attachment_id = current_connection
;

set list on;

set term ^;
execute block returns(
msg varchar(50),
fb_arch type of column v_info.fb_arch,
who_ami type of column v_info.who_ami,
my_auth type of column v_info.my_auth,
my_exe type of column v_info.my_exe,
my_fbdll type of column v_info.my_fbdll,
my_attach type of column v_info.my_attach
) as
begin

for
select 'From main connection', fb_arch, who_ami, my_auth, my_exe, my_fbdll, my_attach
from v_info
into msg, fb_arch, who_ami, my_auth, my_exe, my_fbdll, my_attach
do
suspend;

for
execute statement
'select ''From connection via ES/EDS'', fb_arch, who_ami, my_auth, my_exe, my_fbdll, my_attach '
||' from v_info'
on external 'localhost:' || rdb$get_context('SYSTEM','DB_NAME')
as user 'SYSDBA' password 'masterke'
into msg, fb_arch, who_ami, my_auth, my_exe, my_fbdll, my_attach
do
suspend;
end
^
set term ;^
rollback;

2. Create batch file with name that is matched to SQL (except extension, of course; so, its name here is: 'c4868.bat'):

@echo off
setlocal enabledelayedexpansion enableextensions

@Rem path to ISQL on client machine:
set fbc=C:\MIX\firebird\fb30

@Rem Remote host, port and DB alias:
set host=192.168.0.220
set port=3333
set dbnm=e30

set pbak=%path%
set path=%fbc%;%pbak%

@Rem Define current timestamp in order to add its value to LOGS:
set vbs=%~n0.tmp.vbs
del %vbs% 2>nul
(
echo strDate = DatePart("yyyy",Date^) _
echo ^& Right("0" ^& DatePart("m",Date^), 2^) _
echo ^& Right("0" ^& DatePart("d",Date^), 2^)
echo strTime = Right("0" ^& Hour(Now^), 2^) _
echo ^& Right("0" ^& Minute(Now^), 2^)
@Rem echo ^& Right("0" ^& Second(Now^), 2^)
echo Wscript.Echo( strDate ^& "_" ^& strTime^)
) >>%vbs%

for /f %%i in ('cscript /nologo %vbs%') do set dts=%%i
del %vbs% 2>nul

set log=%~n0.%dts%.log
set err=%~n0.%dts%.err
set tmp=%~n0.tmp
del %log% 2>nul
del %err% 2>nul

echo Starting probes at !time!
set k=0
:m1
set /a k=!k!+1
set msg=!time!. Probe #⁠ !k!

echo \!msg\!
echo \!msg\!\>\>%log%

%fbc%\\isql %host%/%port%:%dbnm% \-z \-i %\~n0\.sql 1\>\>%log% 2\>\>%tmp%

findstr /i SYSDBA %log% \>nul
if not errorlevel 1 goto end

for /f "tokens=\*" %%a in \('findstr /i /c:"statement failed" %tmp%'\) do \(
  echo \!time\! %%a \>\>%err%
\)
del %tmp% 2\>nul

echo \!time\!\. FAILED\.

ping \-n 1 127\.0\.0\.1 \>nul

goto :m1

:end
if not exist %err% (
echo !time!. Got connection at FIRST attempt, no logs required.
type %log%
del %log%
) else (
echo !time! Got connection only at !k!th attempt, check %err%
)
del %tmp% 2>nul
set path=%pbak%

3. Replace in batch settings of: FBC, HOST, PORT and DBNM to yours.

4. Change firebird.conf on server to:

AuthClient = Srp,Legacy_Auth,Win_Sspi
AuthServer = Srp,Legacy_Auth
BugCheckAbort=1
ExternalFileAccess = Restrict /var/db/fb30
FileSystemCacheThreshold = 65536K
LockHashSlots = 22111
LockMemSize = 64M
MaxUserTraceLogSize = 99999
RemoteServicePort = 3333
Servermode = Classic
TempCacheLimit = 2147483647
TempDirectories = /tmp/firebird
UserManager = Srp
WireCrypt = Disabled

NOTE: 'Srp' must be at 1st position in Auth* lists.
Client-side firebird.conf can be with either 'Legacy' or 'Srp' in 1st position of plugins - this seems has no matter.

5. Open two console windows: 1) on server host (I use FB on Linux, so get this via Putty); 2) on client machine (Windows XP).

6. Switch to client console and type: C4868.BAT - but do NOT press Enter now.
7. Stop FB if it is running as service. Type in command prompt: /opt/fb30/bin/firebird -- but also do NOT press Enter for now.

8. Check that when you will press Alt-Tab then Windows will switch between server & client console windows and do not 'jumps' to some other window (if exists).

9. Switch to server console and press Enter.
10. Swicth === !! AS FAST AS YOU CAN !! == to client console (by pressing Alt-Tab) and press Enter there.

When FB is launched as Classic (as it show above) one may to get almost in every 3/4 runs following (in the client console):

Starting probes at 13:13:54,80
13:13:54,80. Probe #⁠ 1
13:13:55,97. FAILED.
13:13:56,00. Probe #⁠ 2
13:13:57,18. FAILED.
13:13:57,21. Probe #⁠ 3
13:13:58,38. FAILED.
13:13:58,39. Probe #⁠ 4
13:13:59,58. FAILED.
13:13:59,60. Probe #⁠ 5
13:14:00,80. FAILED.
13:14:00,82. Probe #⁠ 6
13:14:02,00. FAILED.
13:14:02,02. Probe #⁠ 7
13:14:03,21. FAILED.
13:14:03,22. Probe #⁠ 8
13:14:04,60 Got connection only at 8th attempt, check c4868.20151219_1313.err

Most often there are 25 attempts before connection can be established (for Classic).

When FB is launched as SuperClassic this strange behaviour is harder to reproduce, and it seems that one may to get it only at FIRST running since parameter 'Servermode' was changed to SC. Any subsequent launched of batch will report that connection is established immediatelly, at 1st attempt.

When FB is launched as SuperServer this effect could not be reproduced by me.

When 'Legacy' is placed at 1st position on server firebird.conf this effect does not appear.

When FB is running as service this effect also does not appear.

IMPORTANT NOTES.

1. *** WAIT *** at least 20-30 seconds after FB application process is stopped (by pressing Ctrl-C). Do not launch it again just after you change config if this time was not elapsed yet!

2. If you change Servermode to SuperClassic you will be able to get failed attempts only ONCE, even if later you will stop FB application and wait more than several minutes. You have to change Servermode to Classic and do at least one launch before change it back to SuperClassic. MISTERIOUS but this is what I've got on my environment.

3. When you will play only with Servermode = Classic you will get failed attempts almost every time (I got approx. in 75% of cases), so you can leave config unchanged. But remember: you have to WAIT at least 20-30 seconds before subsequent run of batch.

Bulk of logs with failed 1st and subsequent attaches see in attached file.

PS. It's very pity that one may not to obtain actual server config parameters in ISQL (by some new fields in rdb$database ?) and log them :(

@firebird-automations
Copy link
Collaborator Author

Modified by: @pavel-zotov

Attachment: logs-with-failed-first-attempt-to-connect.7z [ 12854 ]

@firebird-automations
Copy link
Collaborator Author

Commented by: @pavel-zotov

One more issue: it seems that the problem exists only when trying to connect to FB on Linux.
This is another batch that can be used on localhost (i.e. win_client -> win_server, both client & server are on localhost):

@echo off
setlocal enabledelayedexpansion enableextensions

@Rem path to ISQL on client machine:
set fbc=C:\MIX\firebird\fb30

@Rem Remote host, port and DB alias:
set host=localhost
set port=3333
set dbnm=e30
set usr=SYSDBA
set pwd=masterke

set pbak=%path%
set path=%fbc%;%pbak%

(
echo RemoteServicePort = 3333
echo Servermode = Classic
echo AuthServer = Srp,Legacy_Auth
echo AuthClient = Srp,Legacy_Auth,Win_Sspi
echo WireCrypt = Disabled
) > %fbc%\firebird.conf

@Rem Define current timestamp in order to add its value to LOGS:
set vbs=%~n0.tmp.vbs
del %vbs% 2>nul
(
echo strDate = DatePart("yyyy",Date^) _
echo ^& Right("0" ^& DatePart("m",Date^), 2^) _
echo ^& Right("0" ^& DatePart("d",Date^), 2^)
echo strTime = Right("0" ^& Hour(Now^), 2^) _
echo ^& Right("0" ^& Minute(Now^), 2^)
@Rem echo ^& Right("0" ^& Second(Now^), 2^)
echo Wscript.Echo( strDate ^& "_" ^& strTime^)
) >>%vbs%

for /f %%i in ('cscript /nologo %vbs%') do set dts=%%i
del %vbs% 2>nul

set log=%~n0.%dts%.log
set err=%~n0.%dts%.err
set tmp=%~n0.tmp
del %log% 2>nul
del %err% 2>nul

set msg=!time!. Start FB as application.
echo !msg!
echo !msg!>>%log%
start /min %fbc%\firebird.exe -a

echo Starting probes at !time!
set k=0
:m1
set /a k=!k!+1
set msg=!time!. Probe #⁠ !k!

echo \!msg\!
echo \!msg\!\>\>%log%

%fbc%\\isql %host%/%port%:%dbnm% \-z \-i %\~n0\.sql 1\>\>%log% 2\>\>%tmp%

findstr /i SYSDBA %log% \>nul
if not errorlevel 1 goto end

for /f "tokens=\*" %%a in \('findstr /i /c:"statement failed" %tmp%'\) do \(
  echo \!time\! %%a \>\>%err%
\)
del %tmp% 2\>nul

echo \!time\!\. FAILED\.

ping \-n 1 127\.0\.0\.1 \>nul

goto :m1

:end
if not exist %err% (
echo !time!. Got connection at FIRST attempt, no logs required.
type %log%
del %log%
) else (
echo !time! Got connection only at !k!th attempt, check %err%
)
del %tmp% 2>nul
set path=%pbak%

echo Killing FB process. . .
pskill firebird 2>nul
ping -n 3 127.0.0.1>nul
pslist 2>nul | findstr /i firebird

This batch requires Sysinternal utilities 'pskill' (for terminate FB process) and (optionally) 'pslist' .
But I could not get any fails on attempting to establish connection using this batch, despite that time 'gap' between launching FB process and running 1st probe is almost zero.

@firebird-automations
Copy link
Collaborator Author

Commented by: @hvlad

Pavel,

do you think

Statement failed, SQLSTATE = 08006

is complete error message ? This is all i can see in your log's

Also, please, stop put looong sources of batch files into description\comments (unless you refers to it contents) - we have attacments for it

@firebird-automations
Copy link
Collaborator Author

Commented by: @pavel-zotov

> is complete error message ? This is all i can see in your log's

yes, I took only 1st line of error message.

This is full text of error for one iteration:

Statement failed, SQLSTATE = 08006
Unable to complete network request to host "192.168.0.220".
-Failed to establish a connection.
After line 0 in file c5050-upd.sql
Use CONNECT or CREATE DATABASE to specify a database
Use CONNECT or CREATE DATABASE to specify a database

I've slightly changed batch & sql in order this error could be saved as it is delivered to client - see attached file 'c5050-upd.7z'.

And one else issue: on server console (where we run FB as application) one may easy to get during client make attempts to connect:

Unable to complete network request to host "oel64".
-Error while listening for an incoming connection.
-Address already in use

I got it three times (when ServerMode = Classic).

FB process terminated at this moment. And firebird.log for this error contains:

oel64 Sun Dec 20 16:50:00 2015
INET/inet_error: bind errno = 98

oel64 Sun Dec 20 16:50:00 2015
startup:INET_connect:
Unable to complete network request to host "oel64".
Error while listening for an incoming connection.
Address already in use

ISQL Version: WI-V3.0.0.32239 Firebird 3.0 Release Candidate 2
Server version:
Firebird/Linux/AMD/Intel/x64 (access method), version "LI-V3.0.0.32239 Firebird 3.0 Release Candidate 2"
Firebird/Linux/AMD/Intel/x64 (remote server), version "LI-V3.0.0.32239 Firebird 3.0 Release Candidate 2/tcp (oel64)/P13"
Firebird/Windows/Intel/i386 (remote interface), version "WI-V3.0.0.32239 Firebird 3.0 Release Candidate 2/tcp (csprog)/P13"
on disk structure version 12.0

> please, stop put looong sources ... we have attacments for it

Yes, you're right. But once upon a time (a few years ago) there was epic fail with FB tracker and lot of attachments were lost.
I keep remember this before doing any post here :-)

@firebird-automations
Copy link
Collaborator Author

Modified by: @pavel-zotov

Attachment: c5050-upd.7z [ 12855 ]

@firebird-automations
Copy link
Collaborator Author

Modified by: @AlexPeshkoff

assignee: Alexander Peshkov [ alexpeshkoff ]

@firebird-automations
Copy link
Collaborator Author

Commented by: @AlexPeshkoff

What sysdba passwords are set in srp and legacy user managers?

@firebird-automations
Copy link
Collaborator Author

Commented by: @pavel-zotov

I always use 'masterke' (eight characters, without 'y' in 9th position) - both for 2.5 and 3.0

@firebird-automations
Copy link
Collaborator Author

Commented by: @AlexPeshkoff

In that case I must say that I can't reproduce an issue. I've connected from linux, but taking into an account that bugreport depends upon linux server architecture it does not matter. Batch was replaced with single line on bash.

#⁠ ps ax|grep firebird
2513 pts/8 S+ 0:00 grep --colour=auto firebird

#⁠ sleep 30;./firebird& for i in `seq 100`; do ./isql -b -i error.sql -user sysdba -pas masterke localhost:employee && break; done;killall firebird
[1] 2517
Server started successfully
Server started successfully
Server started successfully
Server started successfully

MSG From main connection
FB_ARCH Classic 3.0.0
WHO_AMI SYSDBA
MY_AUTH Srp
MY_EXE /usr/home/firebird/trunk/gen/Debug/firebird/bin/isql
MY_FBDLL LI-V3.0.0.32241-dev Firebird 3.0 Release Candidate 2
MY_ATTACH 122

MSG From connection via ES/EDS
FB_ARCH Classic 3.0.0
WHO_AMI SYSDBA
MY_AUTH Srp
MY_EXE /usr/home/firebird/trunk/gen/Debug/firebird/bin/firebird
MY_FBDLL LI-V3.0.0.32241-dev Firebird 3.0 Release Candidate 2
MY_ATTACH 126

You see that connection is tried immediately after server is started and it's successful.

What am I missing?

@firebird-automations
Copy link
Collaborator Author

Commented by: @pavel-zotov

> connection is tried immediately after server is started and it's successful.

Yes, I have the same result on my Linux env for your script.

The only that I can suppose is that in my test connections established from remote host (with Windows client; who knows, may be OS difference in client / server also matters).

@firebird-automations
Copy link
Collaborator Author

Commented by: @AlexPeshkoff

I have not noticed at once the following:

oel64 Sun Dec 20 16:50:00 2015
startup:INET_connect:
Unable to complete network request to host "oel64".
Error while listening for an incoming connection.
Address already in use

This is known issue with linux - port is not always becomes free immediately after process completion. Nothing to fix in FB.

@firebird-automations
Copy link
Collaborator Author

Modified by: @AlexPeshkoff

status: Open [ 1 ] => Resolved [ 5 ]

resolution: Won't Fix [ 2 ]

@firebird-automations
Copy link
Collaborator Author

Commented by: @pavel-zotov

IMO, ticket has been closed before time.
1) Why such effect as unable to establish connect at 1st attempt (because of "port is not always becomes free" or for some other reason) can be reproduced only when FB is running as application and, more, only when arch = CS or SC but never on SS ?
2) may be some suggestion about linux tuning could be placed here (I mean what to change there in order port to be free instantly)

PS. Can anyone who has access to linux server from windows machine repeat actions which I've specified in the starting post ? TIA.

@firebird-automations
Copy link
Collaborator Author

Modified by: @pcisar

status: Resolved [ 5 ] => Closed [ 6 ]

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment