
Key: CORE-3050
Type: Bug
Status: Closed
Resolution: Fixed
Priority: Major
Assignee: Vlad Khorsun
Reporter: Neil Pickles
Votes: 0
Watchers: 0
Firebird Core

Race condition in LocksCache::get() could lead to AV in the engine

Created: 16/Jun/10 12:54 PM   Updated: 02/Dec/11 10:59 AM
Component/s: Engine
Affects Version/s: 2.1.3, 2.0.6
Fix Version/s: 2.1.4, 2.0.7

Time Tracking:
Not Specified

Environment:
Windows XP SP3, x86 hardware.
SuperServer
Issue Links:
Duplicate
 

Planning Status: Unspecified


Description
I originally thought this was related to CORE-2900, as that was the only other issue I could find showing "C:\Program Files\Firebird\Firebird_2_1\bin\fbserver.exe": terminated abnormally (4294967295). However, after Vlad Khorsun interpreted a couple of crash dump files I supplied, I was told to open a new issue, as this is apparently unrelated to CORE-2900.

We have a client with 100 sites, each with 3 to 4 Windows XP SP3 PCs running our EPOS application. We were originally running Firebird v1.5 but upgraded to v2.1.3 after we came across a number of issues with v1.5; the advice seemed to be to upgrade, as those issues weren't going to be fixed in v1.5 any time soon, if ever, not even in v1.5.6. We had been running a number of sites on v2.1.3 for many months without any issues like this.

After we had finished migrating all their sites to v2.1.3, and had backed up and restored the databases to ODS 11.1, we began to see a number of occasions where, seemingly at random, Firebird would just stop responding on the server machine. A quick restart of Firebird was all that was required to get things moving again, but this was still a problem, as it could occur many times a day at some sites and never at others.

The firebird.log file typically shows this:

SVRA0000 (Client) Sun May 23 13:09:01 2010
"C:\Program Files\Firebird\Firebird_2_1\bin\fbserver.exe": terminated abnormally (4294967295)

SVRA0000 (Client) Sun May 23 13:09:01 2010
INET/inet_error: read errno = 10054

SVRA0000 (Client) Sun May 23 13:09:01 2010
INET/inet_error: read errno = 10054

SVRA0000 (Client) Sun May 23 13:09:01 2010
INET/inet_error: read errno = 10054

SVRA0000 (Client) Sun May 23 13:09:01 2010
INET/inet_error: send errno = 10054

SVRA0000 (Client) Sun May 23 13:09:01 2010
INET/inet_error: send errno = 10054

SVRA0000 (Client) Sun May 23 13:09:01 2010
INET/inet_error: send errno = 10054

SVRA0000 (Client) Sun May 23 13:09:01 2010
INET/inet_error: send errno = 10054

SVRA0000 (Client) Sun May 23 13:09:01 2010
INET/inet_error: send errno = 10054

SVRA0000 (Client) Sun May 23 13:09:01 2010
INET/inet_error: send errno = 10054

SVRA0000 (Client) Sun May 23 13:09:01 2010
REMOTE INTERFACE/gds__detach: Unsuccesful detach from database.
Uncommitted work may have been lost

SVRA0000 (Client) Sun May 23 13:09:01 2010
REMOTE INTERFACE/gds__detach: Unsuccesful detach from database.
Uncommitted work may have been lost

SVRA0000 (Client) Sun May 23 13:09:01 2010
INET/inet_error: send errno = 10054

SVRA0000 (Client) Sun May 23 13:09:01 2010
INET/inet_error: send errno = 10054

SVRA0000 (Client) Sun May 23 13:09:01 2010
INET/inet_error: read errno = 10054

SVRA0000 (Client) Sun May 23 13:09:01 2010
INET/inet_error: read errno = 10054

SVRA0000 (Client) Sun May 23 13:09:03 2010
Guardian starting: "C:\Program Files\Firebird\Firebird_2_1\bin\fbserver.exe"

SVRA0000 (Client) Sun May 23 13:19:00 2010
INET/inet_error: read errno = 10054

I have made sure that all clients are using the correct GDS32.DLL that was produced by the v2.1.3 installer routine.

Some details of a couple of crash dumps and Dr Watson logs are on the CORE-2900 issue thread.

As I have said, sometimes it falls over several times in a day; at other times it runs for days, weeks or even months without a problem. All sites are pretty much identical, and in terms of Firebird configuration and our EPOS application they are identical.
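
To compare how often the server falls over at each site, the log excerpt above can be tallied per day. The following is only a minimal sketch: it assumes every entry starts with a header line in the form shown above (host name, optional "(Client)", timestamp) followed by one or more message lines, and that the system uses an English locale for day and month names. The names `parse_entries`, `crashes_per_day` and `as_signed32` are hypothetical helpers, not part of Firebird. For reference, 10054 is the Winsock code WSAECONNRESET (connection reset by peer), and 4294967295 is 0xFFFFFFFF, i.e. -1 when read as a signed 32-bit value.

```python
import re
from collections import Counter
from datetime import datetime

# Header line of a log entry as it appears in the excerpt above (assumed format):
#   SVRA0000 (Client) Sun May 23 13:09:01 2010
HEADER = re.compile(
    r"^(?P<host>\S+)\s+(?:\((?P<role>\w+)\)\s+)?"
    r"(?P<ts>\w{3} \w{3} +\d{1,2} \d{2}:\d{2}:\d{2} \d{4})\s*$"
)


def parse_entries(text):
    """Return a list of (timestamp, message) tuples, one per log entry."""
    entries = []
    current_ts, lines = None, []
    for raw in text.splitlines():
        line = raw.strip()  # also tolerates indented message lines
        match = HEADER.match(line)
        if match:
            # A new header closes the previous entry, if there was one.
            if current_ts is not None:
                entries.append((current_ts, "\n".join(lines)))
            stamp = " ".join(match.group("ts").split())  # collapse padding
            current_ts = datetime.strptime(stamp, "%a %b %d %H:%M:%S %Y")
            lines = []
        elif line and current_ts is not None:
            lines.append(line)
    if current_ts is not None:
        entries.append((current_ts, "\n".join(lines)))
    return entries


def crashes_per_day(entries):
    """Count 'terminated abnormally' entries per calendar day."""
    return Counter(
        ts.date() for ts, message in entries if "terminated abnormally" in message
    )


def as_signed32(value):
    """Reinterpret an unsigned 32-bit exit code as a signed integer."""
    return value - (1 << 32) if value >= (1 << 31) else value
```

Calling `crashes_per_day(parse_entries(open("firebird.log").read()))` on each site's log would give a per-day crash count that can be compared across sites; the file path here is just an example.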

Neil Pickles added a comment - 16/Jun/10 12:55 PM
You can now download the latest crash dump file I have, zipped with 7-Zip, from http://news.csy.co.uk/leedscrashdump2.7z ; it is 96 MB.

Vlad Khorsun added a comment - 16/Jun/10 02:45 PM
This is again a different issue, and *probably* it is already fixed in 2.1.4; see CORE-2698.
More details will follow after a more careful analysis.

BTW, the problematic code is absent in v2.5.

Neil Pickles added a comment - 17/Jun/10 07:10 AM
What about the first crash dump that was attached to the CORE-2900 thread?

Is that a Firebird issue or a development environment issue?

Neil Pickles added a comment - 17/Jun/10 07:17 AM
Something I forgot to mention when describing the issue above is that we are working with quite large databases for this client: larger than 7 GB as a single file.

Vlad Khorsun added a comment - 17/Jun/10 07:38 AM
> What about the first crash dump that was attached to the CORE-2900 thread?
> Is that a Firebird issue or a development environment issue?

This is a bug in Firebird, related to the process shutdown code. Could you create a separate ticket for it?

Vlad Khorsun added a comment - 01/Jul/10 01:54 PM
Better reflect the nature of the bug.