2.5 beta 1 has huge, fast-growing log files [CORE2462] #2875

Closed
firebird-automations opened this issue May 13, 2009 · 27 comments

Comments

@firebird-automations
Collaborator

Submitted by: Hannes Lowette (hanneslowette)

Jira_subtask_outward CORE2645
Relate to DNET275
Relate to CORE2656

Attachments:
BigLogFiles.zip

After installing Firebird 2.5 beta 1, the engine creates gigantic log files (around 40 GB per week) on a simple development server that is not heavily used.

Eventually, this brings the whole machine to its knees once the disk reaches 0 bytes of free space.

I haven't figured out yet whether it can be fixed in the settings file; for now I just delete the log file from time to time.

Commits: 14e9ed6

Commented by: Hannes Lowette (hanneslowette)

The log file is full of entries like this:

SMS-SERVER2 (Server) Wed May 13 11:55:57 2009
INET/inet_error: send errno = 10054
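
A rough way to quantify how fast such entries pile up is to group them by message and by timestamp. The sketch below is Python and not part of the original report. It assumes each entry is a header line of the form `<host> (<process>) <ctime-style timestamp>` followed by a message line, as in the excerpt above. The `summarize` helper and the `HEADER` pattern are hypothetical names, and the second sample entry is a duplicate added only for illustration.

```python
import re
from collections import Counter
from datetime import datetime

# Assumed header shape: "<host> (<process>) <ctime-style timestamp>".
HEADER = re.compile(
    r"^(?P<host>\S+)\s+\((?P<proc>[^)]*)\)\s+"
    r"(?P<ts>\w{3} \w{3} +\d+ \d\d:\d\d:\d\d \d{4})$"
)


def summarize(lines):
    """Count log entries per message text and per one-second timestamp."""
    per_message = Counter()
    per_second = Counter()
    current_ts = None
    for raw in lines:
        line = raw.strip()
        if not line:
            continue  # blank lines only separate entries
        match = HEADER.match(line)
        if match:
            # Collapse the double space ctime uses for single-digit days.
            stamp = " ".join(match.group("ts").split())
            current_ts = datetime.strptime(stamp, "%a %b %d %H:%M:%S %Y")
        elif current_ts is not None:
            # Count only the first message line that follows a header.
            per_message[line] += 1
            per_second[current_ts] += 1
            current_ts = None
    return per_message, per_second


# Entry copied from the report, repeated once for illustration only.
sample = """\
SMS-SERVER2 (Server) Wed May 13 11:55:57 2009
INET/inet_error: send errno = 10054

SMS-SERVER2 (Server) Wed May 13 11:55:57 2009
INET/inet_error: send errno = 10054
""".splitlines()

messages, seconds = summarize(sample)
print(messages.most_common(3))   # most frequent messages
print(max(seconds.values()))     # busiest second, in entries
```

On a real firebird.log the same function could be fed an open file object instead of the inline sample.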

Commented by: @hvlad

a) Which server version did you use before v2.5?
b) INET error 10054 means the client connection was aborted (look up this error code in MSDN). Search for the cause of the network errors.
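
For readers who want to decode the numbers in the log without looking them up each time, here is a minimal Python sketch. It is not Firebird code. The table holds only a few Winsock codes with the names Microsoft documents for them (10038 is the other code reported later in this thread), and `describe` and `WINSOCK_ERRORS` are hypothetical names chosen for illustration.

```python
import re

# Small subset of Winsock error codes; names follow Microsoft's documentation.
WINSOCK_ERRORS = {
    10038: ("WSAENOTSOCK", "socket operation on a non-socket"),
    10053: ("WSAECONNABORTED", "software caused connection abort"),
    10054: ("WSAECONNRESET", "connection reset by peer"),
}

ERRNO_PATTERN = re.compile(r"errno = (\d+)")


def describe(message):
    """Return a readable description of the errno in a log message, or None."""
    match = ERRNO_PATTERN.search(message)
    if match is None:
        return None
    code = int(match.group(1))
    name, text = WINSOCK_ERRORS.get(code, ("UNKNOWN", "not in this table"))
    return f"{code} {name}: {text}"


print(describe("INET/inet_error: send errno = 10054"))
# prints "10054 WSAECONNRESET: connection reset by peer"
```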

Commented by: @cincuranet

If it's a development server, the 10054 error may be caused by killing your application in the debugger, which forces the connection to close.

Commented by: Hannes Lowette (hanneslowette)

Vlad:
a. We used 2.1.1 until we started our .NET Entity Framework development. We started using various 2.5 snapshot builds after the alpha (the alpha still had the left outer join bug that prevented us from generating accurate models from the database, so snapshots were our only option). This problem started to arise when we started switching development machines to 2.5 beta 1 in order to get a uniform environment again.
b. I already checked the network, and everything seems to be OK on that side. Our network is rock solid for every other application.

Jiri:
Yes, this seems to be about the only plausible cause for dropped connections. We are currently doing a lot of .NET development using the 2.5 beta Data Provider (which is working like a charm, by the way; thanks for all the work on that).

So ... I ran the following tests:

1. Stop FB service
2. Delete firebird.log
3. Start FB service
4. Wait a while
observation: no logfile entries
5. Make a couple of connections using IBExpert.
observation: no logfile entries
6. Do some .NET development using EF.
observation: starts writing log entries like hell.

About 1800 per second.

There is no way this machine lost 1800 connections per second from 2 developers in 2 minutes.

So either something is off in the .NET dataprovider, or the engine is getting into an infinite loop.

By the way, stopping the client machines does not stop the log file writing. We had 9 GB overnight today when nobody was at the office (developers take their laptops home).
The whole machine hits 100% CPU time on its Firebird core, so I guess 1800 is more of a hardware limitation.
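
As a sanity check on these figures, the arithmetic can be sketched in Python. This is only a back-of-the-envelope estimate: the entry size is assumed from the two-line sample quoted earlier in this thread plus one blank separator line, and the real file may differ (tabs, CRLF line endings, longer host names). The 1800 entries per second, 9 GB and 40 GB per week come from the report; everything else is an assumption.

```python
# One log entry as quoted in this thread, plus an assumed blank separator line.
ENTRY = (
    "SMS-SERVER2 (Server) Wed May 13 11:55:57 2009\n"
    "INET/inet_error: send errno = 10054\n"
    "\n"
)
ENTRY_BYTES = len(ENTRY.encode("ascii"))
ENTRIES_PER_SECOND = 1800  # peak rate observed in the test above
GB = 10**9                 # decimal gigabytes, an assumption


def bytes_written(seconds, rate=ENTRIES_PER_SECOND, entry_bytes=ENTRY_BYTES):
    """Bytes appended to the log after `seconds` at a constant entry rate."""
    return seconds * rate * entry_bytes


per_hour_gb = bytes_written(3600) / GB
hours_for_9_gb = 9 * GB / bytes_written(3600)
# Average entry rate implied by 40 GB per week, if every entry had this size.
weekly_average_rate = 40 * GB / (7 * 24 * 3600) / ENTRY_BYTES

print(f"entry size: {ENTRY_BYTES} bytes")
print(f"peak rate: {per_hour_gb:.2f} GB per hour")
print(f"9 GB would take about {hours_for_9_gb:.1f} hours at the peak rate")
print(f"40 GB/week averages about {weekly_average_rate:.0f} entries per second")
```

Under these assumptions the peak rate is on the order of half a gigabyte per hour, so 9 GB in one night and 40 GB in a week are the same order of magnitude as a loop that runs for many hours at a time.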

Commented by: @hvlad

Do you have a small test application that reproduces the issue?

Commented by: Hannes Lowette (hanneslowette)

I've been trying to write an application that reproduces this situation consistently, but without success.

I found it to be in the events part of the code. Using a small .NET program to raise and catch events, I get into this situation from time to time. Sometimes the loop is infinite, sometimes it stops after a few MB of log file.

I'll attach the project and the database backup to this issue, but you might need a number of tries before it goes wrong.

Commented by: Hannes Lowette (hanneslowette)

.fbk and test project

You might have trouble reproducing the problem. Just try a few times and keep an eye on your CPU and your logfile.
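
Since the problem only shows up on some runs, watching the log size automatically may be easier than checking by hand. The following Python sketch is one possible way to do that and is not part of the attached test project. The function names, the 100,000 bytes per second threshold and the polling interval are all assumptions to adjust for your setup.

```python
import os
import time


def growth_rate(path, interval=5.0, sleep=time.sleep):
    """Return how fast the file grew, in bytes per second, over one interval."""
    before = os.path.getsize(path)
    sleep(interval)
    after = os.path.getsize(path)
    return (after - before) / interval


def watch(path, threshold=100_000, interval=5.0, rounds=12):
    """Print the growth rate a few times and flag suspiciously fast growth."""
    for _ in range(rounds):
        rate = growth_rate(path, interval)
        flag = "  <-- runaway logging?" if rate > threshold else ""
        print(f"{time.strftime('%H:%M:%S')} {rate:,.0f} B/s{flag}")


# Usage (placeholder path, replace with the location of your firebird.log):
# watch("firebird.log")
```

Start it right before running the test program; a rate that stays above the threshold after the clients have exited suggests the server has entered the loop.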

Modified by: Hannes Lowette (hanneslowette)

Attachment: BigLogFiles.zip [ 11452 ]

Commented by: Hannes Lowette (hanneslowette)

Running the same code on a 2.1 server does NOT cause this problem.

My colleague tried this tonight.

Modified by: @AlexPeshkoff

assignee: Alexander Peshkov [ alexpeshkoff ]

Commented by: @AlexPeshkoff

Hannes, I have problems running BigLogFiles.vshost.exe under Mono. Can you build it in the same manner as BigLogFiles.exe, i.e. without get_HostingProcessInitialized? Currently I get the following error:

The following assembly referenced from /usr/home/firebird/tests/BigLogFiles/BigLogFiles/BigLogFiles/bin/Debug/BigLogFiles.vshost.exe could not be loaded:
Assembly: Microsoft.VisualStudio.HostingProcess.Utilities.Sync

Without vshost I've failed to reproduce the problem.

Commented by: Hannes Lowette (hanneslowette)

I have no experience whatsoever with Mono. So I'm not sure I can get you what you need. Maybe someone with Visual Studio 2008 (Jiri maybe, as we're using his driver) can try this and reproduce the problem.

I do know that:

- Using this program and this database, I do manage to reproduce the problem on any system I've thrown it at. It might take a few times before I get the problem to show up.
- We have it on our development server very very frequently (we kill and restart FB every half hour or so) if people are coding.
- The latest snapshot builds still have this problem (we try new snapshots whenever you throw them on the site)

Commented by: Hannes Lowette (hanneslowette)

On another note ... how can we leverage our Silver sponsorship to raise the priority of this issue?

Modified by: @dyemanov

assignee: Alexander Peshkov [ alexpeshkoff ] => Dmitry Yemanov [ dimitr ]

Commented by: @dyemanov

It seems I can reproduce the issue. Already investigating it.

Modified by: @dyemanov

status: Open [ 1 ] => In Progress [ 3 ]

Modified by: @dyemanov

status: In Progress [ 3 ] => Open [ 1 ]

Modified by: @dyemanov

status: Open [ 1 ] => Resolved [ 5 ]

resolution: Fixed [ 1 ]

Fix Version: 2.5 RC1 [ 10300 ]

Commented by: Hannes Lowette (hanneslowette)

Using the latest snapshot build (29/06/2009), we still have this issue.

I suggest you fix it ... or the bird gets it:

http://smstiming.fast-inet.com/Data/ClientFiles/63/sparky_hung.jpg

Modified by: Hannes Lowette (hanneslowette)

priority: Minor [ 4 ] => Major [ 3 ]

Version: 2.5 Beta 2 [ 10300 ]

Commented by: Hannes Lowette (hanneslowette)

2.5 beta 2 still does this!

Modified by: @cincuranet

Link: This issue relate to DNET275 [ DNET275 ]

Modified by: @dyemanov

Link: This issue relate to CORE2656 [ CORE2656 ]

Modified by: @pcisar

status: Resolved [ 5 ] => Closed [ 6 ]

Commented by: Vladimir Krapotkin (krapotkin)

I have found the same issue in Firebird-2.5.0.26074_1_Win32
Win2003SP3, Win 7 Pro
Applications are connected through localhost:xxxx
From some moment X on, fbserver writes "unable to complete request to localhost"
and the log grows very fast with errors 10038 and 10054

The issue also reproduces in 2.1.4, but the number of bad requests is about 100 times lower than in 2.5

Modified by: @pavel-zotov

QA Status: No test

Modified by: @pavel-zotov

status: Closed [ 6 ] => Closed [ 6 ]

QA Status: No test => Cannot be tested
