Bad performance / slow response when many concurrent sorts are executed [CORE3989] #4321

Closed
firebird-automations opened this issue Nov 20, 2012 · 11 comments

Comments

@firebird-automations
Collaborator

Submitted by: @pavel-zotov

Votes: 2

The issue manifests itself as slow server response under high load when many concurrent connections are performing external sorts (i.e. PLAN SORT). A backtrace shows that many threads spend an unexpectedly long time in the mmap/munmap system calls. Besides slowing down the particular connection, this also blocks AST delivery and thus affects other connections as well. In Classic / SuperClassic it may result in temporary server freezes (up to multiple seconds).

The problem is that every sort, regardless of its size, needs 128KB of memory for the first-level buffer, and allocations that big are redirected to the operating system. With a high number of relatively small (read: cheap) sorts, the engine tends to spend more time mapping and unmapping memory than sorting the records.

The proposed solution is to cache a few recently used sort buffers and reuse them in subsequent sorts rather than going through the system memory manager. Field testing has shown this to be an effective measure.
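
As an illustration only, the caching idea described above could be sketched as follows. This is not the code from the referenced commits: the class name `SortBufferCache`, the cache limit of 4 buffers, and the use of `std::mutex` are all assumptions made for the example. Only the 128KB buffer size comes from the description.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <mutex>
#include <utility>
#include <vector>

// All names here are hypothetical; they are not taken from the Firebird source.
constexpr std::size_t kSortBufferSize = 128 * 1024; // first-level buffer size from the report
constexpr std::size_t kMaxCachedBuffers = 4;        // "a few"; the exact number is an assumption

class SortBufferCache {
public:
    using Buffer = std::unique_ptr<unsigned char[]>;

    // Return a recently released buffer if one is cached; otherwise allocate a new one.
    Buffer acquire() {
        {
            std::lock_guard<std::mutex> guard(mutex_);
            if (!free_.empty()) {
                Buffer buf = std::move(free_.back());
                free_.pop_back();
                ++hits_;
                return buf;
            }
        }
        // Cache miss: fall back to the allocator (the slow path the issue describes).
        return Buffer(new unsigned char[kSortBufferSize]);
    }

    // Keep the buffer for reuse unless the cache is already full.
    void release(Buffer buf) {
        assert(buf != nullptr);
        std::lock_guard<std::mutex> guard(mutex_);
        if (free_.size() < kMaxCachedBuffers)
            free_.push_back(std::move(buf));
        // Otherwise buf is freed when it goes out of scope.
    }

    std::size_t cached() const {
        std::lock_guard<std::mutex> guard(mutex_);
        return free_.size();
    }

    std::size_t hits() const {
        std::lock_guard<std::mutex> guard(mutex_);
        return hits_;
    }

private:
    mutable std::mutex mutex_;
    std::vector<Buffer> free_; // most recently released buffer is at the back
    std::size_t hits_ = 0;
};
```

In this sketch a sort would call `acquire()` when it starts and `release()` when it finishes, so a stream of small sorts keeps recycling the same few buffers instead of mapping and unmapping memory each time. Capping the cache bounds the memory held idle; the right cap would have to be determined by measurement.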

Commits: b7372ba 7118059 9f60a27 FirebirdSQL/fbt-repository@4bfa9e8 FirebirdSQL/fbt-repository@0be7c2c

@firebird-automations
Collaborator Author

Modified by: @dyemanov

assignee: Dmitry Yemanov [ dimitr ]

@firebird-automations
Collaborator Author

Modified by: @dyemanov

status: Open [ 1 ] => In Progress [ 3 ]

@firebird-automations
Collaborator Author

Modified by: @dyemanov

reporter: Dmitry Yemanov [ dimitr ] => Pavel Zotov [ tabloid ]

@firebird-automations
Collaborator Author

Modified by: @dyemanov

description: The field tesing has proved that being an effective measure. => The field testing has proved that being an effective measure. (The rest of the description is unchanged.)

@firebird-automations
Collaborator Author

Commented by: @dyemanov

The "easy fix" has been committed to v2.5.3. A more intelligent solution will be investigated for v3.0.

@firebird-automations
Collaborator Author

Modified by: @dyemanov

Fix Version: 2.5.3 [ 10461 ]

@firebird-automations
Collaborator Author

Modified by: @dyemanov

status: In Progress [ 3 ] => Open [ 1 ]

@firebird-automations
Collaborator Author

Modified by: @dyemanov

status: Open [ 1 ] => Resolved [ 5 ]

resolution: Fixed [ 1 ]

Fix Version: 3.0 Alpha 2 [ 10560 ]

@firebird-automations
Collaborator Author

Modified by: @pcisar

status: Resolved [ 5 ] => Closed [ 6 ]

@firebird-automations
Collaborator Author

Modified by: @pavel-zotov

status: Closed [ 6 ] => Closed [ 6 ]

QA Status: No test

@firebird-automations
Collaborator Author

Modified by: @pavel-zotov

status: Closed [ 6 ] => Closed [ 6 ]

QA Status: No test => Cannot be tested
