Classic Server could hung with (near) 100% CPU load [CORE4615] #4930
Comments
Modified by: @hvlad assignee: Vlad Khorsun [ hvlad ] |
Commented by: @hvlad Disable checkouts in PIO code when AST is handled. In its current state v3 doesn't require this fix. |
Modified by: @hvlad status: Open [ 1 ] => Resolved [ 5 ] resolution: Fixed [ 1 ] Fix Version: 2.5.4 [ 10585 ] |
Commented by: @hvlad Reassigned ticket author to the most recent user who helped to investigate the issue |
Modified by: @hvlad reporter: Vlad Khorsun [ hvlad ] => Yurij [ yurij ] |
Commented by: Konstantin Streletsky (streletsky) Some details: at the time of the hang we saw in processes: Process Name, PercentProcessorTime. In mon$statements there were only ordinary client queries ... Those clients perform similar actions (select, insert, update) and usually use one table with about 10k records. Interestingly, the old server works fine (without hangs), and the only difference is that it runs 32-bit Firebird. One week has passed since we installed the 32-bit version of Firebird instead of the 64-bit one on the new server, and there have been no hangs. Hope this helps to test and fix the issue. |
Commented by: @hvlad Konstantin, thank you for sharing this but... it is almost useless as we can't know if this is the same issue or something different. |
Commented by: @hvlad Another customer still experienced this issue even after the initial fix. |
Modified by: @pcisar status: Resolved [ 5 ] => Closed [ 6 ] |
Submitted by: Yurij (yurij)
There are a few reports from several customers that under some (not well understood) conditions Classic Server (or SuperClassic) could
stop responding. At least one process could use almost 100% of a CPU core, with almost no I/O. The issue is very rare: Firebird could
work for days or weeks without a problem.
A memory dump shows very deep recursive calls of the CCH\downgrade() function. Sometimes, in SuperClassic, we also see
another thread running very deep calls of the CCH\write_buffer() function.
I have never reproduced it, so I don't know the exact reason for this issue. One idea is that while the AST thread writes pages and
cleans dependencies, a worker thread doing some work (garbage collection of a very long version chain, for example) re-creates
the same dependencies, forcing the AST thread to clean them again and again.
In an attempt to fix it, we disabled engine checkouts while a thread handles an AST routine. This makes the worker thread wait while the AST is processed.
Note that before v2.5 the engine always worked this way. Customers with a private build were satisfied, and I decided to commit the patch.
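
To make the suspected mechanism easier to picture, here is a minimal single-threaded toy model in C++. It is only a sketch of the hypothesis described above, not Firebird code: `ToyCache`, `ToyWorker`, `CheckoutDisabler`, `writeAll` and `handleAst` are hypothetical names invented for illustration, and the "worker" is simulated by a callback that runs at the point where the AST handler would check out of the engine.

```cpp
#include <cassert>
#include <set>

// Toy stand-in for the page cache state (hypothetical, not Firebird's structures).
struct ToyCache {
    std::set<int> dirtyDeps;        // pages that must be written before the AST completes
    bool checkoutsEnabled = true;   // may the AST handler yield the engine during I/O?
};

// RAII guard: disables checkouts for its lifetime and restores the old setting after.
class CheckoutDisabler {
public:
    explicit CheckoutDisabler(ToyCache& c) : cache(c), saved(c.checkoutsEnabled) {
        cache.checkoutsEnabled = false;
    }
    ~CheckoutDisabler() { cache.checkoutsEnabled = saved; }
    CheckoutDisabler(const CheckoutDisabler&) = delete;
    CheckoutDisabler& operator=(const CheckoutDisabler&) = delete;
private:
    ToyCache& cache;
    bool saved;
};

// Simulated worker: each time it gets to run, it re-creates the dependency
// that was just cleaned, until its budget of work is used up.
struct ToyWorker {
    int remaining;
    void runDuringCheckout(ToyCache& cache, int page) {
        if (remaining > 0) {
            --remaining;
            cache.dirtyDeps.insert(page);
        }
    }
};

// Writes pages until no dependencies are left; returns the number of page writes.
int writeAll(ToyCache& cache, ToyWorker& worker) {
    int writes = 0;
    while (!cache.dirtyDeps.empty()) {
        int page = *cache.dirtyDeps.begin();
        cache.dirtyDeps.erase(cache.dirtyDeps.begin());
        ++writes;
        // A checkout during the write lets the worker run and undo the cleanup.
        if (cache.checkoutsEnabled)
            worker.runDuringCheckout(cache, page);
    }
    return writes;
}

// AST handler, optionally running with checkouts disabled (the idea of the fix).
int handleAst(ToyCache& cache, ToyWorker& worker, bool disableCheckouts) {
    if (disableCheckouts) {
        CheckoutDisabler guard(cache);
        return writeAll(cache, worker);
    }
    return writeAll(cache, worker);
}
```

In this model, with checkouts enabled the number of writes grows with the amount of work the worker has queued, which is the "again and again" behaviour suspected above. With checkouts disabled the handler writes each dependency once and the worker only proceeds afterwards. The real engine involves separate threads and lock manager ASTs, which this sketch deliberately leaves out.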
Commits: 3c74b75 1f5527b 4efcd0a FirebirdSQL/fbt-repository@68244df FirebirdSQL/fbt-repository@6fe15d7 FirebirdSQL/fbt-repository@c072361