Key: CORE-5830
Type: Bug
Status: Closed
Resolution: Fixed
Priority: Critical
Assignee: Alexander Peshkov
Reporter: Daniel Mazur
Votes: 0
Watchers: 5
Firebird Core

Encryption Interface crashing Firebird process when working on big db file (6.7GB)

Created: 19/May/18 09:45 AM   Updated: 24/May/18 06:25 AM
Component/s: API / Client Library
Affects Version/s: 3.0.3
Fix Version/s: 3.0.4, 4.0 Beta 1

Environment: Windows 10 x64

QA Status: Cannot be tested


Description
I have written an encryption plugin. It works fine, but yesterday it broke up a database file. At first I thought this was an issue with encryption length etc. I checked my code, and it still encrypts/decrypts other databases, except one which is 6.7 GB in size. I decided to try the original code written by Mr. Peshkov (cryptDB.pas) from the FB directory, and it still crashes the Firebird process. So my conclusion is that Firebird can't handle encryption of big files. The same DB file was encrypted several times before, when it was smaller (around 6 GB), and it worked fine.

The DB file is broken after the crash and can't be fixed (encrypted data at the beginning of the file, the rest unencrypted). The process stops right after ALTER DATABASE encrypt.

I wish I could share this big DB file, but it contains secret company data (client info etc.), so maybe creating a big DB file with random data would reproduce the bug in another environment.

Comments
Vlad Khorsun added a comment - 19/May/18 09:56 AM
Do you have a crash dump?

Daniel Mazur added a comment - 20/May/18 09:18 AM - edited
This is the output from WinDBG when I try to connect to the broken 6.7 GB file:

ModLoad: 00000001`10000000 00000001`10064000 C:\Program Files\Firebird\Firebird_3_0\plugins\KAMELEONCRT64.DLL
ModLoad: 00007ffb`2cda0000 00007ffb`2ce65000 C:\WINDOWS\System32\oleaut32.dll
(cd4.1d80): Access violation - code c0000005 (first chance)
First chance exceptions are reported before any exception handling.
This exception may be expected and handled.
*** ERROR: Symbol file could not be found. Defaulted to export symbols for C:\Program Files\Firebird\Firebird_3_0\plugins\Engine12.DLL -
Engine12+0x1d7b63:
00007ffa`eadd7b63 41ff5250 call qword ptr [r10+50h] ds:534b4c49`575c3a93=????????????????
*** ERROR: Symbol file could not be found. Defaulted to export symbols for C:\Program Files\Firebird\Firebird_3_0\MSVCR100.dll -

Here is the output on ALTER DATABASE encrypt WITH plugin:

ModLoad: 00000001`10000000 00000001`10064000 C:\Program Files\Firebird\Firebird_3_0\plugins\PLUGIN.DLL << encryption plugin from example code (xor 5)
ModLoad: 00007ffb`2cda0000 00007ffb`2ce65000 C:\WINDOWS\System32\oleaut32.dll
(1e44.1714): Access violation - code c0000005 (first chance)
First chance exceptions are reported before any exception handling.
This exception may be expected and handled.
ntdll!RtlEnterCriticalSection+0xd:
00007ffb`2d040bcd f00fba710800 lock btr dword ptr [rcx+8],0 ds:00000000`00000028=????????
*** ERROR: Symbol file could not be found. Defaulted to export symbols for C:\Program Files\Firebird\Firebird_3_0\plugins\Engine12.DLL -
*** ERROR: Symbol file could not be found. Defaulted to export symbols for C:\Program Files\Firebird\Firebird_3_0\MSVCR100.dll -
0:016> g
(1e44.1714): Access violation - code c0000005 (!!! second chance !!!)
ntdll!RtlEnterCriticalSection+0xd:
00007ffb`2d040bcd f00fba710800 lock btr dword ptr [rcx+8],0 ds:00000000`00000028=????????
0:016> g
(1e44.1714): Access violation - code c0000005 (first chance)
First chance exceptions are reported before any exception handling.
This exception may be expected and handled.
ntdll!RtlEnterCriticalSection+0xd:
00007ffb`2d040bcd f00fba710800 lock btr dword ptr [rcx+8],0 ds:00000000`00000028=????????

Call Stack

 # Call Site
00 ntdll!RtlEnterCriticalSection+0xd
01 Engine12+0x1d7ba6
02 Engine12+0x1d8669
03 Engine12!firebird_plugin+0x13bc60
04 MSVCR100!endthreadex+0x43
05 MSVCR100!endthreadex+0xdf
06 KERNEL32!BaseThreadInitThunk+0x14
07 ntdll!RtlUserThreadStart+0x21



My conclusion:

Firebird had none of the plugin's functions (from the plugin's memory region) on the call stack at the time of the crash, so I think this is an issue in engine12.dll.
It may fail to handle the database file for encryption due to an overflow (??)
There is an Access Violation:
lock btr dword ptr [rcx+8],0 ds:00000000`00000028
RCX is 0x20 at the time of the crash.
This is the third instruction in RtlEnterCriticalSection:

ntdll!RtlEnterCriticalSection:
00007ffb`2d040bc0 4883ec28 sub rsp,28h
00007ffb`2d040bc4 65488b042530000000 mov rax,qword ptr gs:[30h]
00007ffb`2d040bcd f00fba710800 lock btr dword ptr [rcx+8],0 ds:00000000`00000028=????????


Also, right before the crash at Engine12+0x1d7ba6 we have this:

00007ffa`eadd7b8e e89da31b00 call Engine12!firebird_plugin+0x127400 (00007ffa`eaf91f30) << setting rdi+0x20 to null I assume
00007ffa`eadd7b93 488b4f20 mov rcx,qword ptr [rdi+20h] << here we got 0 in RCX
00007ffa`eadd7b97 4883c120 add rcx,20h << RCX is now 0x20
00007ffa`eadd7b9b 48894c2440 mov qword ptr [rsp+40h],rcx
00007ffa`eadd7ba0 ff1522362500 call qword ptr [Engine12!firebird_plugin+0x1c0698 (00007ffa`eb02b1c8)] << jump to RtlEnterCriticalSection, then crash
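
The address arithmetic behind the faulting address 0x28 can be reproduced in a few lines. This is only a sketch: `FakeCriticalSection` is a hypothetical stand-in that assumes the lock word sits 8 bytes into the x64 critical section (as `lock btr dword ptr [rcx+8]` suggests), and `kMutexOffset` is taken from the `add rcx,20h` instruction above. Neither name comes from Firebird or Windows headers.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical stand-in for the start of an x64 critical section:
// an 8-byte field followed by the 32-bit lock word (assumed layout).
struct FakeCriticalSection {
    std::uint64_t debugInfo;      // offset 0
    std::int32_t  lockCount;      // offset 8, the dword hit by `lock btr [rcx+8],0`
    std::int32_t  recursionCount; // offset 12
};

// Offset of the mutex inside the object loaded from [rdi+20h],
// as implied by `add rcx,20h` in the disassembly.
constexpr std::uintptr_t kMutexOffset = 0x20;

// Address the CPU touches when the object pointer has the given value.
constexpr std::uintptr_t faultingAddress(std::uintptr_t object) {
    return object + kMutexOffset + offsetof(FakeCriticalSection, lockCount);
}

// A null object pointer yields exactly the address WinDbg reported.
static_assert(faultingAddress(0) == 0x28, "null object => access at 0x28");
```

So a faulting address of 0x28 is consistent with a null pointer having been loaded from `[rdi+20h]`, rather than with a damaged critical section.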


EDIT: Forgot to send the .DMP file (made with WinDBG); the crashing thread is 0x1714
http://ge.tt/4kYgBrp2

Vlad Khorsun added a comment - 20/May/18 09:45 AM
The dmp file is useless, as it shows a usual process stopped on a breakpoint, sorry

Could you provide a full memory dump at the moment of the crash (AV)?

Vlad Khorsun added a comment - 20/May/18 09:48 AM
Also, please explain what "break up" means here:

> I have written an encryption plugin. It works fine, but yesterday it broke up a database file.

What error(s) did you see?
What messages were put into firebird.log at that moment?

Also, could you try to validate the database with gfix -v -full?

Daniel Mazur added a comment - 20/May/18 10:32 AM - edited
http://ge.tt/3gxBCrp2
Here is the DMP at a breakpoint right before RtlEnterCriticalSection
The DMP file was made via Task Manager
If I'm doing it wrong, please tell me how to do it properly

There wasn't any information in the log file, only that firebird was closed abnormally, and the access violation code in decimal

> What error(s) did you see?
Access Violation
> Also, could you try to validate the database with gfix -v -full?
No output in the console after running gfix -v -full path -user SYSDBA -pa pwd, so I think the file is fine
But there is a change:
when I ran ALTER DATABASE encrypt on the gfixed file, it crashed after 2-3 seconds and then the server died
After a restart I could connect to this "encrypted" file, but the server dies right after connecting (Access Violation in Engine12, but in another place)

Maybe you could give me your e-mail and I will send you credentials for AnyDesk or any other RD application, so you can check it on your own.

Vlad Khorsun added a comment - 20/May/18 11:44 AM
> Here is the DMP at a breakpoint right before RtlEnterCriticalSection
What do you expect I should see here? I really don't understand :(

> If I'm doing it wrong, please tell me how to do it properly
I assume you have a broken database and Firebird crashes every time you try to connect to it.
If that is correct, you may attach WinDbg to the running firebird.exe before the crash, then reproduce
the crash; WinDbg should stop and show you the exception. At this point you may save a full memory dump
using the command .dump /ma <file.dmp>
Or you may use WER to create a crash dump, see
https://msdn.microsoft.com/en-us/library/windows/desktop/bb787181.aspx

> There wasn't any information in the log file, only that firebird was closed abnormally, and the access violation code in decimal
>
> > What error(s) did you see?
> Access Violation

Exact and full message, please

> > Also, could you try to validate the database with gfix -v -full?
> No output in the console after running gfix -v -full path -user SYSDBA -pa pwd, so I think the file is fine
That is good, but did you validate the "broken file"?
Are you sure gfix did not crash?
There should be messages in firebird.log about validation start and finish - are they both present?

> Maybe you could give me your e-mail and I will send you credentials for AnyDesk or any other RD application, so you can check it on your own.
Sorry, I have no time to do it online

Daniel Mazur added a comment - 20/May/18 12:33 PM
gfix: Validation finished: 0 errors, 0 warnings, 0 fixed (on the non-crypted file)
gfix: Validation finished: 0 errors, 0 warnings, 0 fixed (on the DB file that crashes FB, with only a few records crypted)

The line you asked about: "C:\Program Files\Firebird\Firebird_3_0\firebird.exe": terminated abnormally (4294967295)

About this dump: there was a breakpoint at 01 Engine12+0x1d7ba0 when it was taken, so you can read the state of the registers and the stack in WinDBG.
This is the line before the call to RtlEnterCriticalSection with the null pointer, which is what actually crashes the whole process.
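
The exit code in the firebird.log line above is printed as an unsigned decimal, which hides what it is. A small sketch (the helper names `toHex` and `asSigned` are made up for illustration) shows that 4294967295 is 0xFFFFFFFF, i.e. -1 as a signed 32-bit value, whereas the access violation code c0000005 reported by WinDbg would read 3221225477 in decimal.

```cpp
#include <cstdint>
#include <cstdio>
#include <string>

// Format a 32-bit process exit code as hex, the way debuggers print it.
std::string toHex(std::uint32_t code) {
    char buf[11]; // "0x" + 8 hex digits + terminating NUL
    std::snprintf(buf, sizeof buf, "0x%08X", static_cast<unsigned>(code));
    return buf;
}

// Reinterpret the same 32 bits as a signed value.
std::int32_t asSigned(std::uint32_t code) {
    return static_cast<std::int32_t>(code);
}
```

Calling `toHex(4294967295u)` gives the hex form of the logged value, and `toHex(3221225477u)` gives the form to compare against the WinDbg output.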

Daniel Mazur added a comment - 20/May/18 12:35 PM - edited
Now I will try to debug with the PDB version of FB

EDIT:

Breakpoint 0 hit
Engine12!Jrd::CryptoManager::cryptThread+0x6e0:
00007ff9`7b3f7ba0 ff1522362500 call qword ptr [Engine12!_imp_EnterCriticalSection (00007ff9`7b64b1c8)] ds:00007ff9`7b64b1c8={ntdll!RtlEnterCriticalSection (00007ff9`b3bd0bc0)}
0:017> g
(22b8.1f68): Access violation - code c0000005 (first chance)
First chance exceptions are reported before any exception handling.
This exception may be expected and handled.
ntdll!RtlEnterCriticalSection+0xd:
00007ff9`b3bd0bcd f00fba710800 lock btr dword ptr [rcx+8],0 ds:00000000`00000028=????????

This happened during ALTER DATABASE encrypt with Alex Peshkov's example plugin.
Now I will try to add the source files to the debugger to get a clear call stack and find the problematic piece of code.

Vlad Khorsun added a comment - 20/May/18 01:17 PM
Does it look like this one?

00 ntdll!RtlEnterCriticalSection+0xd
01 engine12!Jrd::CryptoManager::cryptThread+0x6e6 [c:\fb30\b3_0_release\prod_r3_0_3\firebird3\src\jrd\cryptomanager.cpp @ 970]
02 engine12!`anonymous namespace'::cryptThreadStatic+0x9 [c:\fb30\b3_0_release\prod_r3_0_3\firebird3\src\jrd\cryptomanager.cpp @ 63]
03 engine12!threadStart+0x50 [c:\fb30\b3_0_release\prod_r3_0_3\firebird3\src\common\threadstart.cpp @ 93]
04 msvcr100!endthreadex+0x43
05 msvcr100!endthreadex+0xdf
06 kernel32!BaseThreadInitThunk+0x14
07 ntdll!RtlUserThreadStart+0x21


Daniel Mazur added a comment - 20/May/18 01:33 PM
https://i.imgur.com/m48IVaK.png

Yeah, exactly. Here is the screenshot with source files.
cryptomanager.cpp, function void CryptoManager::cryptThread(), lines:

ThreadContextHolder tdbb(att->att_database, att, &status_vector);
tdbb->tdbb_quantum = SWEEP_QUANTUM; << crash here, tdbb is not initialized for sure, the question is why?

Vlad Khorsun added a comment - 20/May/18 02:03 PM
I see a slightly different line 970:

RefPtr<JAttachment> jAtt(REF_NO_INCR, dbb.dbb_provider->attachDatabase(&status_vector,
dbb.dbb_database_name.c_str(), writer.getBufferLength(), writer.getBuffer()));
check(&status_vector);

MutexLockGuard attGuard(*(jAtt->getStable()->getMutex()), FB_FUNCTION); <<< here
Attachment* att = jAtt->getHandle();

and the reason for the crash is obvious - jAtt refers to a NULL attachment

jAtt class Firebird::RefPtr<Jrd::JAttachment>
 ptr 0x00000000`0296eba8 class Jrd::JAttachment *
  att 0x00000000`00000000 class Jrd::StableAttachmentPart *

Looks like some problem in the attachDatabase() call above.

Is it an attach to the 'broken database'?
Is it the same database where validation found no problems?

Daniel Mazur added a comment - 20/May/18 03:27 PM - edited
Exactly, gfix found no problems in either file. I'm a novice in the FB code, so I could not find the real reason in this specific case.
The non-crypted DB file, let's call it ORIG.GDB, I can work with without any problems.
When I execute ALTER DATABASE encrypt, it crashes after a few moments. When I compared both files, ORIG.GDB and ENCR.GDB, they differed in a few KB at the beginning. Firebird can't handle ENCR.GDB anymore after the failed ALTER; binary reconstruction would be needed if someone didn't have a backup of ORIG.GDB.

That's why this is critical. I trusted Firebird's encryption interface and sadly hit this issue. The database we are discussing in this thread comes from the "production" of a company which gets many queries per day. The DB got broken and they lost a few hours of data, which is very, very painful.
A patch is needed, because the new European law on data security leads to a situation where companies encrypt their databases. In my opinion, encryption in Firebird in this state is unstable.

So now, what can I do to avoid the crash? ORIG.GDB is handled normally; gfix didn't have any trouble scanning the file.

@EDIT:

>jAtt refers to NULL attachment

there is an if (!att) line below; shouldn't it be true in this case, when jAtt.getHandle returns NULL?
@EDIT2:
Now I see I had a different line 970, so getHandle is AV'ing here

Vlad Khorsun added a comment - 20/May/18 04:20 PM
Look at the source file with the correct line numbers:

https://github.com/FirebirdSQL/firebird/blob/R3_0_3/src/jrd/CryptoManager.cpp


The crash is here:

MutexLockGuard attGuard(*(jAtt->getStable()->getMutex()), FB_FUNCTION);

the reason is that jAtt->getStable() == NULL
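
The failure mode described here, locking a mutex reached through a pointer that turns out to be NULL, can be sketched with stand-in types. Everything below is hypothetical: `StablePart`, `AttachmentHandle` and `lockAttachment` are illustrative names, not the Firebird classes, and the null check only shows the general defensive pattern. It is not the patch that was eventually committed.

```cpp
#include <mutex>
#include <stdexcept>

// Hypothetical stand-in for the object that owns the mutex.
struct StablePart {
    std::mutex mtx;
    std::mutex* getMutex() { return &mtx; }
};

// Hypothetical stand-in for an attachment whose stable part may be missing.
struct AttachmentHandle {
    StablePart* stable = nullptr;
    StablePart* getStable() const { return stable; }
};

// Returns a held lock, or throws instead of dereferencing a null pointer.
std::unique_lock<std::mutex> lockAttachment(const AttachmentHandle* handle) {
    if (!handle || !handle->getStable())
        throw std::runtime_error("attachment has no stable part");
    return std::unique_lock<std::mutex>(*handle->getStable()->getMutex());
}
```

With a check of this kind, a failed attach would surface as a reportable error in the crypt thread instead of an access violation inside the mutex code.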


BTW, could you check with the current snapshot build of 3.0.4?

Daniel Mazur added a comment - 20/May/18 07:07 PM
Well, it is the same on the 3.0.4 snapshot
Very strange, because if the DB file were broken, attach() from interface.cpp would fail; now it returns NULL, which leads to the AVs

Vlad Khorsun added a comment - 21/May/18 08:45 AM
Let's take a look from another side.
You said you wrote an encryption plugin. You also said you tried the example encryption plugin, cryptDB.pas.
Did you build it with Delphi?
Is the bitness of the plugin the same as Firebird's?
cryptDB.pas does not set IsMultiThread to true (it should be done in the Initialization section, iirc); do you set it in your plugin?
Could you try to build cryptDB.pas with IsMultiThread := true and try it?

Daniel Mazur added a comment - 21/May/18 09:00 AM
IsMultiThread := true does not help

I'm compiling it with FPC, as stated in the header of cryptdb.pas

Daniel Mazur added a comment - 21/May/18 11:33 AM
Bitness is the same, otherwise it couldn't be loaded.
Tried the 32-bit version of Firebird with the Peshkov plugin compiled under Delphi - same issue, NULL jAtt
Only one thing was different: when I tried to open the broken DB file after ALTER DATABASE, it showed me the error isc_attach_database failed

Daniel Mazur added a comment - 22/May/18 06:18 AM
What can I do now? Just wait?

Vlad Khorsun added a comment - 22/May/18 06:32 AM
We think we found the reason for the AV, now looking for a correct fix for it

Daniel Mazur added a comment - 22/May/18 06:37 AM
Yeah, I have spoken with Mr. Peshkov, now I know everything.
Thank you, this feature is urgently needed

Vlad Khorsun added a comment - 22/May/18 04:53 PM
Please try the next snapshot build (after 3.0.4.32974)

Daniel Mazur added a comment - 23/May/18 07:49 AM
With version 3.0.4.32977, where you changed the pointers to AutoPtr, the 6.7 GB database encrypts and decrypts fine. I will also check on both x86 and x64 and with my plugin, but I'm pretty sure it will work fine now.


My proposal is to add information about the CORE-5830 fix to the Firebird encryption documentation, or to commit it into the latest official version, to avoid further problems for other users.
Thank you