
Key: CORE-6347
Type: Bug
Status: Resolved
Resolution: Fixed
Priority: Blocker
Assignee: Vlad Khorsun
Reporter: Virgo Pärna
Votes: 0
Watchers: 7

Firebird Core

New connections to database server sometimes stall, when there is existing connection to database.

Created: 30/Jun/20 08:48 AM   Updated: 28/Jul/20 06:50 AM
Component/s: None
Affects Version/s: 4.0 Beta 2, 3.0.6
Fix Version/s: 4.0 RC 1, 3.0.7

File Attachments: 1. Zip Archive firebird.exe_200630_122419.zip (156 kB)

Environment:
Windows 10 64 bit
Firebird 3.0.6 32 bit
ServerMode is commented out - so it should be Super.
CPU: AMD Athlon II X2 250 dual core 3 GHz

QA Status: No test


Description
When there is already an existing connection to one database on the server, attempting to connect to the server sometimes stalls, even when trying to connect to databases that do not exist. When that happens, the original open connection also stalls.
I wrote a PowerShell script that connects to 36 databases (using the ADO.NET database driver), of which only the first exists. I connected to the first database in FlameRobin and then executed the script. It successfully connected to the first database, failed to open the next 5 (because they do not exist), and then stalled while attempting to connect to the next database. When I tried to disconnect FlameRobin from the first database, it stalled as well.
The problem does not appear on Firebird 3.0.5.
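
For readers who want to reproduce the steps above, the following is a minimal sketch of the same loop in Python. It is not the reporter's PowerShell script. The function name `probe_connections`, the `connect` callable and the timeout value are all hypothetical; `connect` is expected to wrap whatever driver call opens a connection and to raise on error. Each attempt runs in its own thread so that a stalled connect is reported as "stalled" instead of hanging the whole run.

```python
import threading
from typing import Callable, Dict, List


def probe_connections(connect: Callable[[str], object],
                      targets: List[str],
                      timeout_s: float = 10.0) -> Dict[str, str]:
    """Try each target in order and classify it as 'ok', 'failed' or 'stalled'.

    `connect` is a hypothetical callable supplied by the caller; it should
    open a connection to the given database and raise on error.
    """
    results: Dict[str, str] = {}
    for target in targets:
        outcome: Dict[str, str] = {}

        def attempt(t: str = target, out: Dict[str, str] = outcome) -> None:
            try:
                connect(t)
                out["status"] = "ok"
            except Exception:
                # A missing database is expected to end up here.
                out["status"] = "failed"

        # Daemon thread: a connect that never returns must not block exit.
        worker = threading.Thread(target=attempt, daemon=True)
        worker.start()
        worker.join(timeout_s)

        # No status after the timeout means the attempt is still blocked.
        status = outcome.get("status", "stalled")
        results[target] = status
        if status == "stalled":
            break  # stop at the first stall, as in the report
    return results
```

With 36 targets of which only the first exists, the pattern described in this report would show up as one "ok", a few "failed" entries, and then a "stalled" entry.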

Virgo Pärna added a comment - 30/Jun/20 08:56 AM
OK, an existing connection is probably not actually required. After switching to the debug version, I managed to reproduce the freeze on the second run, when trying without an existing connection. Will try to generate a dump.

Virgo Pärna added a comment - 30/Jun/20 09:29 AM
zipped dump file generated with procdump firebird.exe
Firebird is from Firebird-3.0.6.33328-0_Win32_pdb.zip file.

Sean Leyne added a comment - 30/Jun/20 03:36 PM
This ticket seems related to other connection issues reported related to 3.0.6 [CORE-6346, CORE-6347 and CORE-6348]

Virgo Pärna added a comment - 02/Jul/20 11:40 AM
I'm now having trouble recreating the error. Managed it only once. The only difference is that Eset antivirus was updated and the computer restarted. But it did occur once when I was specifically trying to recreate the error. Maybe it was caused by an external factor. But then again, it did happen once today.

Virgo Pärna added a comment - 03/Jul/20 01:40 PM
Happened again, when not trying to replicate. But this time it resolved itself after 3 minutes... So it is difficult to duplicate.

Vlad Khorsun added a comment - 05/Jul/20 07:24 AM
1. The memory dump contains too few bits of process memory:

WinDBG: User Mini Dump File: Only registers, stack and portions of memory are available

thus I can't say much about issue. Always produce full memory dump, please.


2. I see one idle worker thread.

3. The listener thread seems to be hung in accept():

00 ntdll+0x71e4c
01 mswsock+0x14e44
02 ws2_32+0x146ef
03 ws2_32+0x14657
04 firebird!os_utils::accept+0x15
05 firebird!select_accept+0x28
06 firebird!select_multi+0x65
07 firebird!rem_port::select_multi+0x1a
08 firebird!SRVR_multi_thread+0x16c
09 firebird!inet_connect_wait_thread+0x85
0a firebird!threadStart+0x74
0b msvcr100!endthreadex+0x3a
0c msvcr100!endthreadex+0xe4
0d kernel32+0x16359
0e ntdll+0x67c24
0f ntdll+0x67bf4

Here os_utils::accept() just calls the WinSock accept() function.

This is all the dump tells me. I see no variables, no state, almost nothing :(


It could really be related to CORE-6348.
To check this, you may try disabling WireCompression with 3.0.6.33328.
You may also try running the current snapshot build of v3 and leaving the WireCompression setting as it is.

It could also be related to ESET, btw. It actively interferes with the network stack, and there have been issues because of this in the past.
Antivirus software on a database server is not a good idea in any case.

Vlad Khorsun added a comment - 05/Jul/20 07:27 AM
Sean,

> This ticket seems related to other connection issues reported related to 3.0.6 [CORE-6346, CORE-6347 and CORE-6348]

It is NOT related to CORE-6346. I can state that after looking at the dump provided.
And it is really related to CORE-6347, as it IS CORE-6347 :)

Virgo Pärna added a comment - 06/Jul/20 05:58 AM
What is the best way to create a full dump? I did try to create a full dump with procdump (procdump -mp), but the resulting file was 23 MB even after compressing it with zip.

Vlad Khorsun added a comment - 06/Jul/20 08:10 PM
According to the procdump docs, you should use -ma.

Vlad Khorsun added a comment - 11/Jul/20 03:27 PM
Please try the next snapshot build.

Vlad Khorsun added a comment - 13/Jul/20 09:04 AM
I consider it fixed; please check the snapshot build.

Virgo Pärna added a comment - 14/Jul/20 09:26 AM
Updated to the snapshot. Hopefully it will not happen anymore. Unfortunately it was not easily reproducible, but it happened again today (with 3.0.6): memory usage of the server was over 40 MB and the dump was over 80 (with procdump -ma). But let's see if it happens again.