Encryption Interface crashing Firebird process when working on big db file (6.7GB) [CORE5830] #6091

Closed
firebird-automations opened this issue May 19, 2018 · 27 comments

Comments

@firebird-automations
Collaborator

Submitted by: Daniel Mazur (danielmazur)

I have written an encryption plugin. It worked fine, but yesterday it broke a database file. At first I thought this was an issue with encryption length etc. I checked my code, and it still encrypts and decrypts other databases correctly, except for one that is 6.7 GB in size. I then tried the original code written by Mr. Peshkov (cryptDB.pas) from the FB directory, and it still crashes the Firebird process. So my conclusion is that Firebird can't handle encryption of big files. The same DB file was encrypted several times before, when it was smaller (around 6 GB), and it worked fine.

The DB file is broken after the crash and can't be fixed (encrypted data at the beginning of the file, the rest unencrypted). The process stops right after ALTER DATABASE encrypt.

I wish I could share this big DB file, but it contains secret company data (client info etc.), so maybe recreating a big DB file with random data would reproduce the bug in another environment.

Commits: 6bc775c fe04d32 01b1088 42d8dc1

@firebird-automations
Collaborator Author

Modified by: Daniel Mazur (danielmazur)

description: added the following paragraph (the rest of the text is unchanged):

The DB file is broken after the crash and can't be fixed (encrypted data at the beginning of the file, the rest unencrypted). The process stops right after ALTER DATABASE encrypt.

@firebird-automations

Commented by: @hvlad

Do you have a crash dump ?

@firebird-automations

Commented by: Daniel Mazur (danielmazur)

This is the output from WinDbg when I try to connect to the broken 6.7 GB file:

ModLoad: 00000001`10000000 00000001`10064000 C:\Program Files\Firebird\Firebird_3_0\plugins\KAMELEONCRT64.DLL
ModLoad: 00007ffb`2cda0000 00007ffb`2ce65000 C:\WINDOWS\System32\oleaut32.dll
(cd4.1d80): Access violation - code c0000005 (first chance)
First chance exceptions are reported before any exception handling.
This exception may be expected and handled.
*** ERROR: Symbol file could not be found. Defaulted to export symbols for C:\Program Files\Firebird\Firebird_3_0\plugins\Engine12.DLL -
Engine12+0x1d7b63:
00007ffa`eadd7b63 41ff5250 call qword ptr [r10+50h] ds:534b4c49`575c3a93=????????????????
*** ERROR: Symbol file could not be found. Defaulted to export symbols for C:\Program Files\Firebird\Firebird_3_0\MSVCR100.dll -

Here is the output on ALTER DATABASE encrypt WITH plugin:

ModLoad: 00000001`10000000 00000001`10064000 C:\Program Files\Firebird\Firebird_3_0\plugins\PLUGIN.DLL << encryption plugin from example code (xor 5)
ModLoad: 00007ffb`2cda0000 00007ffb`2ce65000 C:\WINDOWS\System32\oleaut32.dll
(1e44.1714): Access violation - code c0000005 (first chance)
First chance exceptions are reported before any exception handling.
This exception may be expected and handled.
ntdll!RtlEnterCriticalSection+0xd:
00007ffb`2d040bcd f00fba710800 lock btr dword ptr [rcx+8],0 ds:00000000`00000028=????????
*** ERROR: Symbol file could not be found. Defaulted to export symbols for C:\Program Files\Firebird\Firebird_3_0\plugins\Engine12.DLL -
*** ERROR: Symbol file could not be found. Defaulted to export symbols for C:\Program Files\Firebird\Firebird_3_0\MSVCR100.dll -
0:016> g
(1e44.1714): Access violation - code c0000005 (!!! second chance !!!)
ntdll!RtlEnterCriticalSection+0xd:
00007ffb`2d040bcd f00fba710800 lock btr dword ptr [rcx+8],0 ds:00000000`00000028=????????
0:016> g
(1e44.1714): Access violation - code c0000005 (first chance)
First chance exceptions are reported before any exception handling.
This exception may be expected and handled.
ntdll!RtlEnterCriticalSection+0xd:
00007ffb`2d040bcd f00fba710800 lock btr dword ptr [rcx+8],0 ds:00000000`00000028=????????

Call stack:

# Call Site
00 ntdll!RtlEnterCriticalSection+0xd
01 Engine12+0x1d7ba6
02 Engine12+0x1d8669
03 Engine12!firebird_plugin+0x13bc60
04 MSVCR100!endthreadex+0x43
05 MSVCR100!endthreadex+0xdf
06 KERNEL32!BaseThreadInitThunk+0x14
07 ntdll!RtlUserThreadStart+0x21

My conclusion:

At the time of the crash, none of the plugin's functions (from the plugin's memory region) were on the call stack, so I think this is an issue in engine12.dll.
It may fail to handle the database file for encryption due to an overflow (??).
There is an access violation:
lock btr dword ptr [rcx+8],0 ds:00000000`00000028
RCX is 0x20 at the time of the crash.
This is the third instruction in RtlEnterCriticalSection.

ntdll!RtlEnterCriticalSection:
00007ffb`2d040bc0 4883ec28 sub rsp,28h
00007ffb`2d040bc4 65488b042530000000 mov rax,qword ptr gs:[30h]
00007ffb`2d040bcd f00fba710800 lock btr dword ptr [rcx+8],0 ds:00000000`00000028=????????

Also, right before the crash at Engine12+0x1d7ba6 we have this:

00007ffa`eadd7b8e e89da31b00 call Engine12!firebird_plugin+0x127400 (00007ffa`eaf91f30) << setting rdi+0x20 to null I assume
00007ffa`eadd7b93 488b4f20 mov rcx,qword ptr [rdi+20h] << here we got 0 in RCX
00007ffa`eadd7b97 4883c120 add rcx,20h << RCX is now 0x20
00007ffa`eadd7b9b 48894c2440 mov qword ptr [rsp+40h],rcx
00007ffa`eadd7ba0 ff1522362500 call qword ptr [Engine12!firebird_plugin+0x1c0698 (00007ffa`eb02b1c8)] << jump to RtlEnterCriticalSection, then crash
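
The address in the fault can be cross-checked with a little arithmetic. The sketch below is a minimal C++ illustration, assuming two hypothetical layouts (`FakeStablePart`, `FakeCriticalSection`) whose member offsets are chosen only to match the `add rcx,20h` and `[rcx+8]` operands in the listings above; they are not Firebird's or Windows' actual definitions. With a null base pointer the computed address comes out as 0x28, the same value WinDbg reports after `ds:`.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical stand-in for a critical section object. Only the offset of
// lockCount (0x08) matters here; it mirrors the "[rcx+8]" operand.
struct FakeCriticalSection
{
    std::uint64_t debugInfo;   // offset 0x00
    std::int32_t lockCount;    // offset 0x08
};

// Hypothetical stand-in for the object whose mutex is being locked.
// The 0x20 bytes of padding mirror the "add rcx,20h" instruction.
struct FakeStablePart
{
    unsigned char padding[0x20];
    FakeCriticalSection mutex; // offset 0x20
};

// Address that the "lock btr" instruction would touch for an object
// located at `base`.
constexpr std::uintptr_t faultingAddress(std::uintptr_t base)
{
    return base
        + offsetof(FakeStablePart, mutex)
        + offsetof(FakeCriticalSection, lockCount);
}

// Small self-check: a null base pointer leads to address 0x28.
inline void selfCheck()
{
    assert(faultingAddress(0) == 0x28);
}
```

A near-zero address like this is a typical sign that a member of a NULL object was accessed, which fits the assumption that the pointer loaded from `[rdi+20h]` was zero.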

EDIT: Forgot to send .DMP File (with WinDBG), crashing thread 0x1714
http://ge.tt/4kYgBrp2

@firebird-automations

Commented by: @hvlad

The dmp file is useless, as it just shows an ordinary process stopped at a breakpoint, sorry.

Could you provide a full memory dump taken at the moment of the crash (AV)?

@firebird-automations

Commented by: @hvlad

Also, please explain what "break up" means here:

> I have written encryption plugin, it works fine but yesterday it break up the file.

What error(s) did you see?
What messages were written to firebird.log at that moment?

Also, could you try to validate the database with gfix -v -full?

@firebird-automations

Commented by: Daniel Mazur (danielmazur)

http://ge.tt/3gxBCrp2
Here is the DMP taken at a breakpoint right before RtlEnterCriticalSection.
The DMP file was made via Task Manager.
If I'm doing it wrong, please tell me how to do it properly.

There wasn't any information in the log file, only that firebird was closed abnormally, plus the access violation code in decimal.

> What error(s) did you see ?
Access Violation
> Also, you could try to validate database with gfix -v -full ?
No output in the console after running gfix -v -full path -user SYSDBA -pa pwd, so I think the file is fine.
But there is a change:
when I ran ALTER DATABASE encrypt on the gfixed file, it crashed after 2-3 seconds and then the server died.
After a restart I could connect to this "encrypted" file, but the server dies right after the connection (access violation in Engine12, but in another place).

Maybe you could give me your e-mail and I will send you credentials for AnyDesk or another remote desktop application, so you can check it on your own.

@firebird-automations

Commented by: @hvlad

> Here is DMP on breakpoint right before RtlEnterCriticalSection
What do you expect me to see here? I really don't understand :(

> If I doing it wrong, please tell me how to do it property
I assume you have a broken database and Firebird crashes every time you try to connect to it.
If that is correct, you may attach WinDbg to the running firebird.exe before the crash, then reproduce
the crash; WinDbg should stop and show you the exception. At this point you may save a full memory dump
using the command .dump /ma <file.dmp>
Or you may use WER to create a crash dump, see
https://msdn.microsoft.com/en-us/library/windows/desktop/bb787181.aspx

> There wasn't any information in log file, only at firebird was closed abnormally and access violation code in decimal
>
> > What error(s) did you see ?
> Access Violation

Exact and full message, please

> > Also, you could try to validate database with gfix -v -full ?
> No output in console after running gfix -v -full path -user SYSDBA -pa pwd, so I think file is fine
That is good, but did you validate the "broken file"?
Are you sure gfix did not crash?
There should be messages in firebird.log about validation start and finish - are they both present?

> Maybe you will give me your e-mail and I will send your credentails to AnyDesk or any other RD application so you will check it on your own.
Sorry, I have no time to do it online.

@firebird-automations

Commented by: Daniel Mazur (danielmazur)

gfix: Validation finished: 0 errors, 0 warnings, 0 fixed (on the non-encrypted file)
gfix: Validation finished: 0 errors, 0 warnings, 0 fixed (on the DB file that crashes FB, with only a few records encrypted)

The line you asked about: "C:\Program Files\Firebird\Firebird_3_0\firebird.exe": terminated abnormally (4294967295)
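
The number in that log line is easier to read once converted. As a minimal sketch (the helper names `toHex` and `asSigned` are made up for this illustration), the following C++ shows that 4294967295 is 0xFFFFFFFF, i.e. -1 when the same 32 bits are read as a signed value, and that the access violation code c0000005 from the WinDbg output corresponds to 3221225477 in decimal.

```cpp
#include <cassert>
#include <cstdint>
#include <cstdio>
#include <string>

// Render a 32-bit status or exit code as 0xXXXXXXXX.
std::string toHex(std::uint32_t code)
{
    char buf[11]; // "0x" + 8 hex digits + terminating NUL
    std::snprintf(buf, sizeof buf, "0x%08lX", static_cast<unsigned long>(code));
    return std::string(buf);
}

// Reinterpret the same 32 bits as a signed value (two's complement).
std::int32_t asSigned(std::uint32_t code)
{
    return static_cast<std::int32_t>(code);
}
```

Here `toHex(4294967295u)` gives `0xFFFFFFFF` and `asSigned(4294967295u)` gives `-1`, so the logged value looks like a generic "-1" exit code rather than the access violation code itself, which would appear as 3221225477.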

About this dump: there is a breakpoint at 01 Engine12+0x1d7ba0 at the time of the dump, so you can read the state of the registers and the stack in WinDbg.
This is the instruction just before the call to RtlEnterCriticalSection with a null pointer, which is what actually crashes the whole process.

@firebird-automations

Commented by: Daniel Mazur (danielmazur)

Now I will try debugging with the PDB version of FB

EDIT:

Breakpoint 0 hit
Engine12!Jrd::CryptoManager::cryptThread+0x6e0:
00007ff9`7b3f7ba0 ff1522362500 call qword ptr [Engine12!_imp_EnterCriticalSection (00007ff9`7b64b1c8)] ds:00007ff9`7b64b1c8={ntdll!RtlEnterCriticalSection (00007ff9`b3bd0bc0)}
0:017> g
(22b8.1f68): Access violation - code c0000005 (first chance)
First chance exceptions are reported before any exception handling.
This exception may be expected and handled.
ntdll!RtlEnterCriticalSection+0xd:
00007ff9`b3bd0bcd f00fba710800 lock btr dword ptr [rcx+8],0 ds:00000000`00000028=????????

This happens during ALTER DATABASE encrypt with Alex Peshkov's example plugin.
Now I will try to add the source files to the debugger to get a clear call stack and find the problematic piece of code.

@firebird-automations

Commented by: @hvlad

Does it look like this one?

00 ntdll!RtlEnterCriticalSection+0xd
01 engine12!Jrd::CryptoManager::cryptThread+0x6e6 [c:\fb30\b3_0_release\prod_r3_0_3\firebird3\src\jrd\cryptomanager.cpp @ 970]
02 engine12!`anonymous namespace'::cryptThreadStatic+0x9 [c:\fb30\b3_0_release\prod_r3_0_3\firebird3\src\jrd\cryptomanager.cpp @ 63]
03 engine12!threadStart+0x50 [c:\fb30\b3_0_release\prod_r3_0_3\firebird3\src\common\threadstart.cpp @ 93]
04 msvcr100!endthreadex+0x43
05 msvcr100!endthreadex+0xdf
06 kernel32!BaseThreadInitThunk+0x14
07 ntdll!RtlUserThreadStart+0x21

@firebird-automations

Commented by: Daniel Mazur (danielmazur)

https://i.imgur.com/m48IVaK.png

Yes, exactly. Here is the screenshot with the source files.
In cryptomanager.cpp, function void CryptoManager::cryptThread(), the lines are:

ThreadContextHolder tdbb(att->att_database, att, &status_vector);
tdbb->tdbb_quantum = SWEEP_QUANTUM; << crash here, tdbb is not initialized for sure, the question is why?

@firebird-automations

Commented by: @hvlad

My line 970 looks a bit different:

			RefPtr<JAttachment> jAtt(REF_NO_INCR, dbb.dbb_provider->attachDatabase(&status_vector,
				dbb.dbb_database_name.c_str(), writer.getBufferLength(), writer.getBuffer()));
			check(&status_vector);

			MutexLockGuard attGuard(*(jAtt->getStable()->getMutex()), FB_FUNCTION);   <<< here
			Attachment* att = jAtt->getHandle();

and the reason for the crash is obvious - jAtt refers to NULL attachment

jAtt class Firebird::RefPtr<Jrd::JAttachment>
ptr 0x00000000`0296eba8 class Jrd::JAttachment *
att 0x00000000`00000000 class Jrd::StableAttachmentPart *

Looks like some problem in the attachDatabase() call above.

Is this an attach to the 'broken database'?
Is it the same database where validation found no problems?

@firebird-automations

Commented by: Daniel Mazur (danielmazur)

Exactly, gfix found no problems in either file. I'm a novice in the FB code, so I could not find the real reason in this specific case.
With the non-encrypted DB file, let's call it ORIG.GDB, I can work without any problems.
When I execute ALTER DATABASE encrypt, it crashes after a few moments. When I compared both files, ORIG.GDB and ENCR.GDB, they differed in a few KB at the beginning. Firebird can't handle ENCR.GDB anymore after the failed ALTER; a binary reconstruction is needed if someone doesn't have a backup of ORIG.GDB.

That's why this is critical. I trusted Firebird's encryption interface and sadly ran into this issue. The database we are talking about in this thread comes from the "production" system of a company that gets many queries per day. The DB got broken and they lost a few hours of data, which is very, very painful.
A patch is needed, because the new European law about data security leads to a situation where companies encrypt their databases. In my opinion, encryption in Firebird is unstable in this state.

So now, what can be done to avoid the crash? ORIG.GDB is handled normally, and gfix didn't have any trouble scanning the file.

@edit:

>jAtt refers to NULL attachment

There is an if (!att) line below; shouldn't it evaluate to true in this case, when jAtt.getHandle returns NULL?
@edit2:
Now I see I have a different line 970, so getHandle is AV'ing here

@firebird-automations

Commented by: @hvlad

Look at the source file with the correct line numbers:

https://github.com/FirebirdSQL/firebird/blob/R3_0_3/src/jrd/CryptoManager.cpp

The crash is here:

MutexLockGuard attGuard(*(jAtt->getStable()->getMutex()), FB_FUNCTION);

the reason is that jAtt->getStable() == NULL

BTW, could you check with the current snapshot build of 3.0.4?
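
The failure mode described above, dereferencing the result of `getStable()` without checking it, can be illustrated in isolation. The sketch below uses hypothetical stand-in types (`StablePart`, `AttachmentRef`) and a made-up helper `lockAttachment`, with `std::mutex` in place of Firebird's own classes. It does not claim to reflect how the issue was actually fixed; it only shows the kind of guard that turns a NULL stable part into a reportable failure instead of an access violation.

```cpp
#include <cassert>
#include <mutex>

// Hypothetical stand-in for the "stable" part that owns the mutex.
struct StablePart
{
    std::mutex mtx;
    std::mutex* getMutex() { return &mtx; }
};

// Hypothetical stand-in for the attachment reference returned by an attach call.
struct AttachmentRef
{
    StablePart* stable = nullptr;
    StablePart* getStable() const { return stable; }
};

// Lock the attachment's mutex only when every pointer on the way is valid.
// Returns false instead of dereferencing NULL.
bool lockAttachment(AttachmentRef* jAtt)
{
    if (!jAtt)
        return false;

    StablePart* sAtt = jAtt->getStable();
    if (!sAtt)
        return false; // the unchecked case that crashed in this thread

    std::lock_guard<std::mutex> guard(*sAtt->getMutex());
    // ... work that needs the locked attachment would go here ...
    return true;
}
```

With a default-constructed `AttachmentRef`, `lockAttachment` returns `false`; once `stable` points at a live `StablePart`, it takes the lock and returns `true`.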

@firebird-automations

Commented by: Daniel Mazur (danielmazur)

Well, on the 3.0.4 snapshot it is the same.
Very strange, because if the DB file were broken, attach() from interface.cpp would fail; instead it returns NULL, which leads to the AV.

@firebird-automations

Commented by: @hvlad

Let's look at it from another side.
You said you wrote an encryption plugin. You also said you tried the encryption plugin from the examples, cryptDB.pas.
Did you build it with Delphi?
Is the bitness of the plugin the same as Firebird's?
cryptDB.pas does not set IsMultiThread to true (it should be done in the Initialization section, iirc); do you set it in your plugin?
Could you try to build cryptDB.pas with IsMultiThread := true and try it?

@firebird-automations

Commented by: Daniel Mazur (danielmazur)

IsMultiThread := true does not help

I'm compiling it with FPC, as stated in the header of cryptdb.pas

@firebird-automations

Commented by: Daniel Mazur (danielmazur)

The bitness is the same, otherwise it couldn't be loaded.
I tried the 32-bit version of Firebird with the Peshkov plugin compiled under Delphi - same issue, NULL jAtt.
Only one thing was different: when I tried to open the broken DB file after ALTER DATABASE, it showed me the error isc_attach_database failed

@firebird-automations

Commented by: Daniel Mazur (danielmazur)

What can I do now? Just wait?

@firebird-automations

Commented by: @hvlad

We think we have found the reason for the AV and are now looking for a correct fix for it

@firebird-automations

Commented by: Daniel Mazur (danielmazur)

Yeah, I have spoken with Mr. Peshkov, now I know everything.
Thank you, this feature is urgently needed

@firebird-automations

Modified by: @AlexPeshkoff

assignee: Alexander Peshkov [ alexpeshkoff ]

@firebird-automations

Commented by: @hvlad

Try the next snapshot build (after 3.0.4.32974), please

@firebird-automations

Commented by: Daniel Mazur (danielmazur)

In version 3.0.4.32977, where you changed the pointers to AutoPtr, the 6.7 GB database encrypts and decrypts fine. I will also check on both x86 and x64 and with my plugin, but I'm pretty sure it will work fine now.

My proposal is to add information about the CORE5830 fix to the Firebird encryption documentation, or to commit it into the latest official version, to avoid further problems for other users.
Thank you

@firebird-automations

Modified by: @AlexPeshkoff

status: Open [ 1 ] => Resolved [ 5 ]

resolution: Fixed [ 1 ]

Fix Version: 4.0 Beta 1 [ 10750 ]

Fix Version: 3.0.4 [ 10863 ]

@firebird-automations

Modified by: @pavel-zotov

status: Resolved [ 5 ] => Resolved [ 5 ]

QA Status: No test => Cannot be tested

@firebird-automations

Modified by: @pavel-zotov

status: Resolved [ 5 ] => Closed [ 6 ]
