Issue Details (XML | Word | Printable)

Key: CORE-5436
Type: Bug Bug
Status: Resolved Resolved
Resolution: Fixed
Priority: Major Major
Assignee: Vlad Khorsun
Reporter: Kovalenko Dmitry
Votes: 0
Watchers: 5
Operations

If you were logged in you would be able to see more operations.
Firebird Core

[FB3 SC] Server hangs (under load test)

Created: 05/Jan/17 01:03 PM   Updated: 13/Jun/18 10:40 PM
Component/s: Engine
Affects Version/s: 3.0.2
Fix Version/s: 3.0.4, 4.0 Beta 1

Environment:
Firebird 3.0.2.32651 SuperClassic x64, Win10, IBProvider Test System.

Also reported on Linux too.
Also reported with Classic Server.
Issue Links:
Relate
 

QA Status: No test


 Description  « Hide
SuperClassic hangs under load test.

-------------------
Two test processes (with 4 test theads) with own (two) databases.

All test threads are waiting the answer from "database attach" operation.

------------------------------------- [Error at client side]
[THR:001544] [05.01.2017 11:44:09] [test] Verify Arrays [TBL_CS__NONE::CHAR_ARRAY__8]. ArrayRwMode: direct. BindArrayType: array_iunknown__ibp_ss_wc. CType:UNICODE_FSS Truncate Char: false
[THR:001544] [05.01.2017 11:44:29] ERROR: [array.002.rw_api.build_mode__use_cmd.unicode.TBL_CS__NONE.CHAR_ARRAY__8.chars__WIN1251.write__array_iunknown__ibp_ss_wc.ctype__UTF8] Data Source initialization
1. [Unknown DBMS]: Error occurred during login, please check server firebird.log for details
2. [LCPI.IBProvider.3]: Ошибка подключения к базе данных.
Неопознанная ошибка

COM Error Code: E_FAIL

------------------------------------- [Error in firebird.log]
HOME4 Thu Jan 5 11:44:19 2017
Database: D:\PROGRAM_FILES\FIREBIRD_3_0_X64\SECURITY3.FDB
page 0, page type 1 lock denied (216)


HOME4 Thu Jan 5 11:44:29 2017
Database:
page 0, page type 1 lock denied (216)


HOME4 Thu Jan 5 11:44:29 2017
Authentication error
page 0, page type 1 lock denied

------------------
Dump file, Sources, PDB, EXE, DLL - http://dropmefiles.com/qlTUB

 All   Comments   Change History   Subversion Commits      Sort Order: Ascending order - Click to sort in descending order
Kovalenko Dmitry added a comment - 22/Apr/17 05:43 AM
Gutten morgen.

Firebird 3.0.3.32714 ClassicServer (x64)

-----------
Dump and other files: http://dropmefiles.com/dkCA8

----------- Firebird log
HOME4 Sat Apr 22 07:53:01 2017
Database: D:\DATABASE\FB_03_0_0\IBP_TEST_FB30_D3_2.GDB
page 9631149, page type 5 lock denied (216)


HOME4 Sat Apr 22 07:54:11 2017
Database: D:\DATABASE\FB_03_0_0\IBP_TEST_FB30_D3_2.GDB
page 9627759, page type 4 lock denied (216)


HOME4 Sat Apr 22 07:55:25 2017
Database: D:\PROGRAM_FILES\FIREBIRD_3_0_X64\SECURITY3.FDB
page 0, page type 1 lock denied (216)


HOME4 Sat Apr 22 07:55:35 2017
Database:
page 0, page type 1 lock denied (216)


HOME4 Sat Apr 22 07:55:35 2017
Authentication error
page 0, page type 1 lock denied

-------------- Last error from test log

[THR:013432] [22.04.2017 07:55:35] ERROR: [charsets.txt_params.disable_defer.auto_commit.unicode.TBL_CS__WIN1251.VARCHAR_ARRAY__32.cn_NONE.user_WIN1251.chars_WIN1251.param_adVarBinary_Array.bind_bstr_array.small.array_type_17] Data Source initialization
1. [Unknown DBMS]: Error occurred during login, please check server firebird.log for details
2. [LCPI.IBProvider.3]: Ошибка подключения к базе данных.
Неопознанная ошибка

Vlad Khorsun added a comment - 23/Apr/18 07:04 AM
There is (private) confirmation by user that patch helps

John Franck added a comment - 26/Apr/18 09:16 AM
I've had similar problem yesterday on a production database, this is my firebird.log entry, there's nothing before or after that's related to this event:

-------------

sqlserver Wed Apr 25 17:59:53 2018
        Database:
        page 0, page type 1 lock denied (216)

-------------

The server has stopped responding to any subsequent request, but there were no errors on the client side, simply it hangs on connecting.

As this is a production system, this was quite a big problem, fortunately it's occurred in a low working load time frame.

I don't know if it's caused by the same problem tracked here, but just to prevent before it eventually occurs again, is there a way to patch my system?

Vlad Khorsun added a comment - 26/Apr/18 09:57 AM
> I don't know if it's caused by the same problem tracked here, but just to prevent before it eventually occurs again, is there a way to patch my system?

You could try current snapshot build, the patch is already committed there.

John Franck added a comment - 27/Apr/18 08:13 AM - edited
> You could try current snapshot build, the patch is already committed there.

Can snapshots be considered stable to be used in a production system?

Vlad Khorsun added a comment - 27/Apr/18 08:23 AM
> Are there binaries ready to use or should I compile them?

https://www.firebirdsql.org/en/snapshot-builds/

> (I'm not used with compilation on linux)

This ticket is about Windows only ! Read description carefully, please !
The patch committed is Windows specific and confirmed on Windows only.
And there was no reports about this issue on non-Windows systems so far.

John Franck added a comment - 27/Apr/18 08:32 AM
Sorry, I did a search for the exact error I got on my firebird.log and this issue come out, I didn't realized this was Windows only.

Well, so I got the same error on Linux x64. Should I file a new issue?

Vlad Khorsun added a comment - 27/Apr/18 08:39 AM
Yes, create new ticket, please

Vlad Khorsun added a comment - 13/Jun/18 11:23 AM
New reports from users allows to find another reason for this issue.
Thus i re-open the ticket

Vlad Khorsun added a comment - 13/Jun/18 11:28 AM
Thanks to John Kilin i have stack trace and lock-table dump now.

Stack trace

 # Call Site
00 ntdll!ZwWaitForSingleObject+0xa
01 KERNELBASE!WaitForSingleObjectEx+0x98
02 engine12!Firebird::Semaphore::tryEnter+0x21
03 engine12!Firebird::Dumper::writeDumpAsync+0xa5
04 engine12!Jrd::dumpMemory+0x13e
05 engine12!Jrd::LockManager::deadlock_walk+0x55
06 engine12!Jrd::LockManager::deadlock_walk+0x1a6
07 engine12!Jrd::LockManager::deadlock_scan+0x79
08 engine12!Jrd::LockManager::wait_for_request+0x481
09 engine12!Jrd::LockManager::grant_or_que+0xb3
0a engine12!Jrd::LockManager::enqueue+0x2eb
0b engine12!enqueue+0x8f
0c engine12!LCK_lock+0x151
0d engine12!lock_buffer+0x140
0e engine12!CCH_fetch_lock+0xf8
0f engine12!CCH_fetch+0x50
10 engine12!bump_transaction_id+0x6f
11 engine12!transaction_start+0x13d
12 engine12!TRA_start+0x13e
13 engine12!start_transaction+0xc6
14 engine12!Jrd::JAttachment::startTransaction+0x195
15 engine12!Firebird::IAttachmentBaseImpl<Jrd::JAttachment,Firebird::CheckStatusWrapper,Firebird::IReferenceCountedImpl<Jrd::JAttachment,Firebird::CheckStatusWrapper,Firebird::Inherit<Firebird::IVersionedImpl<Jrd::JAttachment,Firebird::CheckStatusWrapper,Firebird::Inherit<Firebird::IAttachment> > > > >::cloopstartTransactionDispatcher+0xa1
16 fbclient!Why::YAttachment::startTransaction+0x6e
17 fbclient!isc_start_multiple+0x181
18 fbclient!isc_start_transaction+0x13a
19 Legacy_Auth!Auth::SecurityDatabase::lookup_user+0xc0
1a Legacy_Auth!Auth::SecurityDatabase::verify+0x13b
1b Legacy_Auth!Auth::SecurityDatabaseServer::authenticate+0x4d9
1c Legacy_Auth!Firebird::IServerBaseImpl<Auth::SecurityDatabaseServer,Firebird::CheckStatusWrapper,Firebird::IAuthImpl<Auth::SecurityDatabaseServer,Firebird::CheckStatusWrapper,Firebird::Inherit<Firebird::IPluginBaseImpl<Auth::SecurityDatabaseServer,Firebird::CheckStatusWrapper,Firebird::Inherit<Firebird::IReferenceCountedImpl<Auth::SecurityDatabaseServer,Firebird::CheckStatusWrapper,Firebird::Inherit<Firebird::IVersionedImpl<Auth::SecurityDatabaseServer,Firebird::CheckStatusWrapper,Firebird::Inherit<Firebird::IServer> > > > > > > > >::cloopauthenticateDispatcher+0x9f
1d firebird!`anonymous namespace'::ServerAuth::authenticate+0x3d9
1e firebird!continue_authentication+0xdf
1f firebird!process_packet+0x179
20 firebird!loopThread+0x1a8
21 firebird!threadStart+0x50
22 msvcr100!endthreadex+0x43
23 msvcr100!endthreadex+0xdf
24 kernel32!BaseThreadInitThunk+0x22
25 ntdll!RtlUserThreadStart+0x34

Lock table:

LOCK BLOCK 83136
Series: 3, State: 3, Size: 8, Length: 8, Data: 0
Key: 0001:000000, Flags: 0x00, Pending request count: 3
Hash que (1): forward: 420, backward: 420
Requests (4): forward: 83064, backward: 92280
Request 83064, Owner: 78328, State: 3 (3), Flags: 0x00
Request 90408, Owner: 78872, State: 0 (6), Flags: 0x02
Request 90480, Owner: 78632, State: 0 (6), Flags: 0x02
Request 92280, Owner: 78328, State: 0 (6), Flags: 0x0A


Self-deadlock is happens: Owner 78328 hold SH lock for header page and acquires EX lock on the same page.
I.e. original lock was lost and not released.