Firebird crashes due to concurrent operations with expression indices [CORE5980] #6232

Closed
firebird-automations opened this issue Jan 7, 2019 · 22 comments

Comments

@firebird-automations
Collaborator

Submitted by: Svend Meyland Nicolaisen (smndk)

Attachments:
Capture.PNG

We have an application which creates a Firebird database from scratch, populating it with a large number of tables, views, indices, stored procedures etc. On servers running on hardware that is slower than our production systems normally are, the Firebird server crashes every time our application tries to create the database. The statement that apparently crashes the server does not crash it when executed at any other time, so I think the crash is a result of some or all of the previous statements. Also, the server does not crash on faster systems (at least not every time; I might have seen it once or twice), which makes me think there is some sort of race condition in the server.

I have tried to connect WinDbg to the server, and an access violation repeatedly begins to occur several statements before the fatal statement. All of the statements (except for the fatal one), however, execute as expected. Several access violations occur before the server crashes (too many to count). I have attached a screen dump of the call stack when one of the access violations occurs.

I am experiencing the problem with Firebird 64 bit server version 2.5.6 and 2.5.8 running as SuperServer on Windows. I haven't tested with other versions of Firebird.

Our software works perfectly on all our production systems, but these crashes worry me a bit, as they indicate a problem that could appear at any time on any system.

I don't know if this is a problem that the Firebird developer group will look into, or if the 2.5 version is permanently closed. If it is something you will look into, I will gladly help with further information.

Kind regards,
Svend

Commits: 99ca7f0 f8a2a36 50ca09a

====== Test Details ======

See details in .fbt

@firebird-automations

Modified by: Svend Meyland Nicolaisen (smndk)

Attachment: Capture.PNG [ 13315 ]

@firebird-automations

Commented by: @hvlad

A reproducible test case would be ideal to have.
Or, at least, a full memory dump taken when the first AV happens.

@firebird-automations

Commented by: Svend Meyland Nicolaisen (smndk)

Hi Vlad
A zipped full memory dump is larger than 10 MB. How can I upload that?

@firebird-automations

Commented by: @hvlad

Please upload it to any file-sharing service and send me the URL (http://ge.tt for example).

Also, could you try running the script with the current snapshot build of 2.5.9?

http://web.firebirdsql.org/downloads/snapshot_builds/win/2.5/

@firebird-automations

Commented by: Svend Meyland Nicolaisen (smndk)

Here you go:

http://ge.tt/4LZpsnt2

I will try 2.5.9.

@firebird-automations

Commented by: Svend Meyland Nicolaisen (smndk)

I think there are problems with the download page for the snapshot builds. I am not able to download any of them.

@firebird-automations

Commented by: Svend Meyland Nicolaisen (smndk)

I have a small correction to my description of the problem. It is the last statement that initiates the access violations, which result in a server crash. All other statements execute successfully with no access violations.

@firebird-automations

Commented by: @hvlad

I also see the problem with the snapshot host.
I'll inform you when it gets fixed.

@firebird-automations

Commented by: @hvlad

Two things FYI:
- I downloaded your dump and am investigating it
- snapshots can be downloaded now

@firebird-automations

Commented by: Svend Meyland Nicolaisen (smndk)

Great!

I am experiencing the same issues with the snapshot build.

@firebird-automations

Modified by: @hvlad

assignee: Vlad Khorsun [ hvlad ]

@firebird-automations

Commented by: @hvlad

Steps to reproduce:

0. Run Firebird SS and connect two isql sessions to the same database.

1. in isql session 1 run:

create table t1 (id int);
create index t1_idx on t1 computed by (id + 0);
insert into t1 values (1);
commit;

set plan on;
select * from t1 where id + 0 = 1;
exit;

output is:

PLAN (T1 INDEX (T1_IDX))

          ID
============
           1

2. in isql session 2 run:

alter index t1_idx active;

The server crashes and the output is:

Statement failed, SQLSTATE = 08006
connection lost to database

Note that a release build might not crash, while a debug build crashes every time.
This is because the debug build fills released memory with some non-zero bytes.
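
The difference between debug and release builds can be illustrated with a toy model. This is only a sketch in Python, not Firebird's allocator: `ToyPool`, `FILL_BYTE` and `read_after_free` are hypothetical names, and the fill value is an assumption (the comment above only says "some non-zero bytes"). It shows why a stale read can look harmless when freed memory keeps its old content, and why it turns into obvious garbage when freed memory is overwritten.

```python
import struct

# Hypothetical fill pattern; the real value used by debug builds is not
# stated in this thread.
FILL_BYTE = 0xEE


class ToyPool:
    """A tiny stand-in for a memory pool backed by a bytearray."""

    def __init__(self, size, fill_on_free):
        self.memory = bytearray(size)
        self.fill_on_free = fill_on_free  # True models a debug build

    def store_u32(self, offset, value):
        struct.pack_into("<I", self.memory, offset, value)

    def load_u32(self, offset):
        return struct.unpack_from("<I", self.memory, offset)[0]

    def free(self, offset, length):
        # A debug-style pool overwrites released memory; a release-style
        # pool leaves the old bytes in place.
        if self.fill_on_free:
            self.memory[offset:offset + length] = bytes([FILL_BYTE]) * length


def read_after_free(fill_on_free):
    pool = ToyPool(16, fill_on_free)
    pool.store_u32(0, 42)    # a field of a live object
    pool.free(0, 4)          # the object is released; offset 0 is now stale
    return pool.load_u32(0)  # the stale read


release_value = read_after_free(fill_on_free=False)
debug_value = read_after_free(fill_on_free=True)
print(hex(release_value), hex(debug_value))
```

In the release-style run the stale read still returns the old value, so the bug can go unnoticed; in the debug-style run it returns the fill pattern, which, if it were then used as a pointer, would fail immediately.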

@firebird-automations

Commented by: @hvlad

The fix is committed; please try the next snapshot build (after 2.5.9.27125).

@firebird-automations

Commented by: Svend Meyland Nicolaisen (smndk)

Fantastic! My application runs without problems with the 2.5.9.27126 snapshot build.

@firebird-automations

Commented by: @hvlad

Thanks for confirmation

@firebird-automations

Modified by: @hvlad

status: Open [ 1 ] => Resolved [ 5 ]

resolution: Fixed [ 1 ]

Fix Version: 4.0 Beta 1 [ 10750 ]

Fix Version: 2.5.9 [ 10862 ]

Fix Version: 3.0.5 [ 10885 ]

@firebird-automations

Commented by: Omacht András (aomacht)

Hi Vlad!
Can you explain briefly what was the cause of the error?
Thanks!
András

@firebird-automations

Commented by: @hvlad

The issue is related to expression indices and the shared metadata cache in SuperServer.

It can happen when the internal request used to calculate the index key saves a pointer to the attachment that first loaded and compiled this request.
Then this attachment is released and some other attachment goes on to recompile the index expression. When the old request is released,
"its" attachment (if present) is accessed, but that attachment was already released some time ago.
The bug is that the pointer to the attachment in a (shared) request should not be kept after the request has executed.
The same is true for the shared requests of procedures and triggers, but there was no such bug there.

Hope it helps ;)
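
The sequence described above can be sketched as a small model. This is an illustration in Python, not the engine's C++ code: the `Attachment` and `SharedRequest` classes, the `clear_after_execute` flag and the `scenario` function are all hypothetical, and "clear the pointer after execution" is only one reading of the explanation, not a description of the actual commits. A released attachment is modelled with a flag so that touching it raises an error instead of crashing.

```python
class Attachment:
    """Stand-in for a database connection (attachment)."""

    def __init__(self, name):
        self.name = name
        self.released = False

    def release(self):
        self.released = True

    def touch(self):
        # In the real server this would be a read through a dangling pointer.
        if self.released:
            raise RuntimeError(f"use after release: {self.name}")


class SharedRequest:
    """Stand-in for the cached request that computes the index key."""

    def __init__(self, clear_after_execute):
        self.clear_after_execute = clear_after_execute
        self.attachment = None

    def execute(self, attachment):
        self.attachment = attachment
        attachment.touch()              # normal use while the request runs
        if self.clear_after_execute:
            self.attachment = None      # do not keep the pointer afterwards

    def release(self):
        if self.attachment is not None:  # "its" attachment, if present
            self.attachment.touch()


def scenario(clear_after_execute):
    request = SharedRequest(clear_after_execute)
    session1 = Attachment("session 1")
    request.execute(session1)   # session 1 loads and runs the request
    session1.release()          # session 1 disconnects
    try:
        # Session 2 recompiles the index expression, so the old shared
        # request is released.
        request.release()
    except RuntimeError:
        return "crash"
    return "ok"


buggy = scenario(clear_after_execute=False)
fixed = scenario(clear_after_execute=True)
print(buggy, fixed)
```

With the pointer kept, releasing the old request touches an attachment that is already gone; with the pointer cleared after execution, the release has nothing stale to touch.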

@firebird-automations

Commented by: Omacht András (aomacht)

Yes Vlad, thanks!

@firebird-automations

Modified by: @dyemanov

summary: Firebird 2.5.6 & 25.8 server crash => Firebird crashes due to concurrent operations with expression indices

@firebird-automations

Modified by: @pavel-zotov

status: Resolved [ 5 ] => Resolved [ 5 ]

QA Status: No test => Done with caveats

Test Details: See details in .fbt

Test Specifics: [Architecture (SS/CS) specific]

@firebird-automations

Modified by: @pavel-zotov

status: Resolved [ 5 ] => Closed [ 6 ]
