
Key: CORE-2507
Type: Bug
Status: Closed
Resolution: Fixed
Priority: Major
Assignee: Vlad Khorsun
Reporter: Richard Wesley
Votes: 0
Watchers: 1
Firebird Core

Intermittent CreateFile failure

Created: 12/Jun/09 06:31 PM   Updated: 12/Nov/09 05:54 PM
Component/s: Engine
Affects Version/s: 2.0.0, 1.5.4, 2.0.1, 2.0.2, 2.0.3, 1.5.5, 2.1.0, 2.0.4, 2.1.1, 2.0.5, 2.1.2
Fix Version/s: 2.0.6, 2.1.4

Time Tracking:
Not Specified

File Attachments: 1. File B24294.TDE (4.58 MB)
2. File B24294.twbx (432 kB)
3. PNG File B24294a.png (62 kB)
4. PNG File B24294b.png (64 kB)

Environment: Windows XP SP 2

Planning Status: Considered for inclusion


Description
We have been seeing an intermittent problem with embedded Firebird
2.1.1 trying to open a local read-only database under Windows XP. The
database has been opened read-only by multiple processes and
periodically we get the following in our logs:

--- FB Error ---------------------------------------------------------------
File: db\FirebirdProtocol.cpp, Line: 2381
Status: 335544373
operating system directive CreateFile failed
-The system cannot find the file specified.
----------------------------------------------------------------------------

The line referenced is just calling isc_attach_database. Usually the
call succeeds and the file is definitely there.

The strange thing is that the Windows logging for the CreateFile call
shows two calls, one failing and then one succeeding, but Firebird
seems to not notice the second success!

After a little digging, it looks like the error is being generated in
ISC_map_file. I am now suspicious of gds__prefix_lock's ability to
come up with a unique name in this situation...



Comments
Claudio Valderrama C. added a comment - 13/Jun/09 04:23 PM
One thing we can do to track this is to add the file name to the error message, but only if
- it's embedded
- you're dba
- you're db owner.

C.

Stefan Schultze added a comment - 15/Jun/09 09:50 PM
We have a similar problem that might be related:

We have a C# Windows Service that also uses Firebird Embedded 2.1.1. Our customers frequently report that this Windows Service sometimes doesn't come up with Windows. The error is logged when opening a database connection (isc_attach_database is called internally) and also says "operating system directive CreateFile failed".

I am very sure that the database file is always there because we check the existence of the database file before trying to open it.

One more thing that might be of interest is that this problem occurs far more often on newer Windows platforms (e.g. Windows Vista and Windows Server 2008). We have never heard of the same problem from customers on Windows 2000, and there are quite a lot of them.

Richard Wesley added a comment - 19/Jun/09 06:20 PM
This is a screenshot from Tableau showing the calls to CreateFile during our server startup when the error occurred. Notice that all processes manage to create the lock file successfully except the first one. Even stranger, the initial create succeeds, but the process is told that the file does not exist about 0.8 ms later. Maybe this is some kind of timing bug in Windows?

Richard Wesley added a comment - 19/Jun/09 06:21 PM
This is the full sequence of events for all files.

Richard Wesley added a comment - 19/Jun/09 06:23 PM
This is a Firebird database containing the data in the first two images.

Richard Wesley added a comment - 19/Jun/09 06:25 PM
This is the Tableau workbook used to generate the first two images.
Free reader is available at <http://www.tableausoftware.com/products/reader-download>.

Richard Wesley added a comment - 19/Jun/09 06:38 PM
The call that is failing appears to be using the "Open" disposition instead of the "OpenIf" disposition. This can be seen by looking at the corresponding [Details] field for the failure record.

Richard Wesley added a comment - 19/Jun/09 07:04 PM
We suspect that the problem is that the lock file already exists. This is likely in a server startup scenario.

The logic in isc_sync.cpp appears not to handle this case correctly in embedded mode: it opens the file with FILE_FLAG_DELETE_ON_CLOSE and sets file_exists to true, even though the CloseHandle call a few lines further down will cause the file to be deleted, so it will NOT exist!

The workaround should be to delete the lock file on process startup. The process can generate the file name from the PID: fb_PID.lck in the user's temporary directory.

Vlad Khorsun added a comment - 20/Jun/09 09:47 AM
This could happen only if the TEMP directory already contains an fb_pid.lck file. That is possible if, some time ago, a process with the same PID crashed and Windows did not remove the file (opened with FILE_FLAG_DELETE_ON_CLOSE!).

I think we should remove FILE_FLAG_DELETE_ON_CLOSE from the first call of CreateFile.
Objections?

Note that v2.5 does not have this issue at all.
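To make the reasoning concrete, here is a toy in-memory model of the sequence as this thread describes it. It is not Firebird's code and does not call the Windows API; `ToyFileSystem` and `attachSucceeds` are hypothetical names, and the control flow (in particular the branch for a freshly created file) is an assumption based on the comments, so the real logic in isc_sync.cpp may differ.

```cpp
#include <set>
#include <string>

// Toy stand-in for the file system: just a set of file names.
struct ToyFileSystem
{
    std::set<std::string> files;

    // Mimics an open-or-create call followed by closing the handle.
    // With deleteOnClose set, the file is gone once the handle is closed.
    // Returns whether the file existed before the call.
    bool openAlwaysThenClose(const std::string& name, bool deleteOnClose)
    {
        const bool existed = files.count(name) != 0;
        files.insert(name);
        if (deleteOnClose)
            files.erase(name);
        return existed;
    }

    // Mimics the "Open" disposition: succeeds only if the file is present.
    bool openExisting(const std::string& name) const
    {
        return files.count(name) != 0;
    }
};

// Returns true if the modelled attach sequence succeeds.
bool attachSucceeds(ToyFileSystem& toyFs, const std::string& lockFile, bool deleteOnClose)
{
    const bool fileExists = toyFs.openAlwaysThenClose(lockFile, deleteOnClose);
    if (fileExists)
        return toyFs.openExisting(lockFile);  // fails if the close deleted the file
    toyFs.files.insert(lockFile);             // assumed: a fresh file is created again
    return true;
}
```

In this model a stale fb_PID.lck makes the attach fail only when the first open uses delete-on-close; dropping the flag from the first call, as proposed above, lets the second open find the file.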

Vlad Khorsun added a comment - 20/Jun/09 09:51 AM
It is easy to reproduce:

1. Start an application that uses embedded Firebird (isql_embed, for example), and don't connect to any database
2. Create a file in %TEMP% named fb_PID.lck, where PID is the process ID of the application above
3. Connect to any database from that application
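Step 2 can be scripted with a small helper such as the sketch below. `plantStaleLockFile` is a hypothetical name; the directory and PID are passed in by the caller, so on Windows you would supply the %TEMP% directory and the PID of the already running application.

```cpp
#include <filesystem>
#include <fstream>
#include <string>

// Hypothetical helper: plants an empty fb_<pid>.lck in `dir`, imitating a
// lock file left behind by a crashed process that had the same PID.
// Returns the path of the file, or an empty path if it could not be created.
std::filesystem::path plantStaleLockFile(const std::filesystem::path& dir, unsigned long pid)
{
    const std::filesystem::path path = dir / ("fb_" + std::to_string(pid) + ".lck");
    std::ofstream file(path);  // creates the file, or truncates an existing one
    return file ? path : std::filesystem::path();
}
```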

Dmitry Yemanov added a comment - 20/Jun/09 10:13 AM
No objections from me. At least, this is worth trying ;-)