possible memory leak ? INET/select_wait: select failed, errno = 10055 followed by SRVR_multi_thread/RECEIVE: error on main_port, shutting down [CORE3316] #3683
Comments
Commented by: vander clock stephane (arkadia) It crashed again just now. This time there is nothing in firebird.log, and all processors went to 100% CPU utilisation (see attached picture). I was forced to kill the fb_inet_server.exe process manually. I took a memory dump (4 GB in size) before killing the process.
Modified by: vander clock stephane (arkadia) Attachment: fullcpu.jpg [ 11893 ]
Commented by: vander clock stephane (arkadia) OK, I can now reproduce the bug. It looks like simply doing : At the beginning, for the first 10 minutes, only one CPU is at 100% (that looks normal), but after 10 minutes the CPU usage drops for a few seconds, and immediately afterwards all 8 CPUs go up to 100%, as in the picture! I will download the database locally to do some tests, but it is a huge database (100+ GB).
Commented by: @dyemanov What is the data type of pictures.name?
Commented by: vander clock stephane (arkadia) VARCHAR(50) CHARACTER SET ISO8859_1 COLLATE ISO8859_1
Commented by: vander clock stephane (arkadia) OK, the validation of the database has just finished (it took 24 hours). The results are: Number of record level errors: 1 Now, why there are so many errors in the database is another story...
Commented by: @hvlad Stephane, I'm sure most of the errors are not significant and were caused by killing the Firebird process. Also, please attach the crash dump you have here. About the 10055 errors, this will probably help: http://support.microsoft.com/default.aspx?scid=kb;EN-US;196271
Commented by: vander clock stephane (arkadia) Dear Vlad, > i'm sure most of errors is not significant and caused by killing of firebird process. Yes, that is possible... The firebird.log contains only SERVER12 Fri Jan 28 01:31:16 2011 followed by SERVER12 Fri Jan 28 01:31:16 2011 About the crash dump: it is 100 MB, and as it may contain sensitive data I would prefer to send it to you privately. Is that possible? stephane
Commented by: @hvlad I mean the firebird.log located on the computer where you ran the validation. It contains descriptions for every one of the (1+501+80) errors detected. As for the crash dump, could you send me a download link (privately, of course)?
Commented by: vander clock stephane (arkadia) Dear Vlad, regarding the firebird.log, I was mistaken: I ran gfix on a temporary machine and forgot to take the firebird.log at the end of the process. I still have the temporary machine, but it is in another office; I will fetch the log the next time I go there. In any case, I don't think it was caused by database corruption, because today, on a freshly backed-up and restored database, we encountered the exact same error: INET/select_wait: select failed, errno = 10055 followed by SRVR_multi_thread/RECEIVE: error on main_port, shutting down and the server became "frozen". I carefully read http://support.microsoft.com/default.aspx?scid=kb;EN-US;196271 about the 10055 error, but I remain doubtful, because it says the default maximum number of ephemeral TCP ports is 5000. Does that mean I am using all 5000 TCP connections? How is that possible? I have at most 100 clients connected to Firebird at that time, and it is a dedicated Firebird server! And as far as I know, Firebird uses ephemeral TCP ports only for events? Thanks in advance.
Commented by: vander clock stephane (arkadia) Attached is the firebird.log file from the gfix run. But as I also received the 10055 error yesterday after a fresh backup/restore, I am sure that the "corruption" of the database is not connected to it... Also, as I still use the official release without the bug fix for the event problem, is it possible that the error is connected to that? I don't know, perhaps some loop inside fb_inet_server.exe that tries to open all TCP ports? We use a lot of event connections (one per client). Thanks in advance.
Modified by: vander clock stephane (arkadia) Attachment: firebird.log [ 11904 ]
Commented by: vander clock stephane (arkadia) Today the sweep of the database using gfix never returned :( It has been running for more than 12 hours (yesterday it took only 2 hours)... I am getting desperate :(
Commented by: @hvlad > i read carrefully the http://support.microsoft.com/default.aspx?scid=kb;EN-US;196271 about the 10055 but i stay doubtful because Every TCP connection has 2 endpoints. The ports we are talking about belong to the "remote" endpoint, while the "local" endpoint is <server_ip>:<server_port> (localhost:gds_db, for example). The port number is assigned by the OS and does not indicate the number of connections. Moreover, sockets are not reused by the OS immediately after disconnection, so new sockets allocated for new connections get increasing port numbers. If there are a lot of short-lived connections, it is easy to obtain "high" port numbers for new connections. Could you check how many TCP connections exist in the system at the time when you have the problem? Use either netstat -p tcp -n or TcpView by SysInternals. > also as i still use the official release without the BugFix on the Event probleme, is it possible that it's connected to it ? i don't know some It is possible, although I don't remember any issues with the 10055 error. Anyway, a few bugs related to events were fixed (e.g. CORE3119 and CORE3170), so using the current snapshot of 2.5.1 is highly desirable.
Commented by: @hvlad Searching the tracker, I found one issue regarding error 10055: CORE1791. Hope this helps.
Commented by: vander clock stephane (arkadia) Currently (while I don't have any problem) netstat -p tcp -n shows me this: TCP 61.213.12.116:3389 95.214.25.89:3472 ESTABLISHED That does not look like too much? Anyway, tonight I will also deploy the current snapshot!
Commented by: vander clock stephane (arkadia) > Searching tracker i found one issue re. error 10055, this is CORE1791. Of course there is no antivirus (it is a dedicated Firebird server on Windows 2008 R2), and the Windows firewall has an exception for the fb_inet_server.exe process...
Commented by: @hvlad I counted 56 connections at 61.213.12.120:3050
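A rough way to obtain this kind of per-endpoint count is to tally the rows of `netstat -p tcp -n` output. The Python sketch below is only an illustration and not part of Firebird: the function name `count_tcp_connections` is hypothetical, and the sample rows with 192.0.2.x addresses are made up for demonstration (only the 61.213.12.116:3389 row appears in this thread).

```python
from collections import Counter


def count_tcp_connections(netstat_output: str) -> Counter:
    """Count TCP rows per (local endpoint, state) in `netstat -p tcp -n` text."""
    counts = Counter()
    for line in netstat_output.splitlines():
        fields = line.split()
        # Data rows have four fields: TCP <local addr:port> <remote addr:port> <state>.
        # Header and blank lines have a different field count and are skipped.
        if len(fields) == 4 and fields[0].upper() == "TCP":
            _proto, local, _remote, state = fields
            counts[(local, state)] += 1
    return counts


# Illustrative input. The 192.0.2.x rows are hypothetical sample data.
SAMPLE = """\
  Proto  Local Address          Foreign Address        State
  TCP    61.213.12.116:3389     95.214.25.89:3472      ESTABLISHED
  TCP    192.0.2.10:3050        192.0.2.21:49152       ESTABLISHED
  TCP    192.0.2.10:3050        192.0.2.22:49153       ESTABLISHED
  TCP    192.0.2.10:3050        192.0.2.23:49154       TIME_WAIT
"""

counts = count_tcp_connections(SAMPLE)
for (local, state), n in sorted(counts.items()):
    print(f"{local:<22} {state:<12} {n}")
```

On a live server the text could be captured with `subprocess.run(["netstat", "-p", "tcp", "-n"], capture_output=True, text=True).stdout` and passed to the same function. Grouping by state as well as by endpoint makes it easier to tell established connections apart from sockets that are still lingering after disconnection.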
Commented by: vander clock stephane (arkadia) 61.213.12.120 => the IP of our Firebird server. 61.213.12.120 runs a default installation of Windows 2008 R2 64-bit Standard, so I think that by default it is possible to have many more than 63 TCP connections? How can I check that I can have more than 63 TCP connections on 61.213.12.120? I am sure the answer is yes, though... Thanks in advance.
Commented by: Roman Vanicek (roman) I confirm this (or a very similar) bug. My setup is: web server (Linux+Apache+PHP) and database server (Windows 2000 Server SP4 + Firebird 2.5 SuperServer). We had been using Firebird 1.5 and 2.0 SuperServer in the same setup, and this error did not appear in the logs. Now the server runs fine for about two weeks and then "freezes". The client gets the message "Unable to complete network request", and this pair of messages appears in the server log: INET/select_wait: select failed, errno = 10013 But the server does not shut down; it stays in the frozen state, in which it does not receive connections. Restarting the Firebird service helps, and it immediately runs fine again. I have tested opening 100 connections at the same time and running some selects on each one, and this works fine. I will attach my log file shortly.
Modified by: Roman Vanicek (roman) Attachment: firebird.log [ 11951 ]
Commented by: Nikolay Ponomarenko (pnv82) It seems we have the same issue. When I stop the frozen server, there is additional info in the log about the 63 connection limit: GROUND02 (Server) Mon Jun 04 12:21:47 2012 The server computer has no antivirus or firewall and is used as a web server.
Submitted by: vander clock stephane (arkadia)
Is duplicated by CORE3439
Attachments:
fullcpu.jpg
firebird.log
firebird.log
Hello,
I received this error in the log file today:
SERVER12 Fri Jan 28 01:31:16 2011
INET/select_wait: select failed, errno = 10055
followed by
SERVER12 Fri Jan 28 01:31:16 2011
SRVR_multi_thread/RECEIVE: error on main_port, shutting down
and the server became "frozen". It does not stop; it simply continues to accept new connections but never answers them, making all the client applications look "frozen".
INET/select_wait: select failed, errno = 10055 means:
An operation on a socket or pipe was not performed because the system lacked sufficient buffer space or because a queue was full.
Is this generally caused by a memory leak somewhere?
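For reference, the two Winsock error numbers quoted in this thread (10055 and 10013) map to standard Windows Sockets names. The Python sketch below is an illustration only: `extract_errno` and `describe_winsock_error` are hypothetical helper names, and the regular expression assumes the firebird.log line format shown above.

```python
import re
from typing import Optional

# Winsock error codes reported in this thread, with their standard names.
WINSOCK_ERRORS = {
    10013: ("WSAEACCES", "access to the socket was forbidden by its access permissions"),
    10055: ("WSAENOBUFS", "insufficient buffer space, or a queue was full"),
}

# Assumed log format: "... select failed, errno = <number>"
ERRNO_PATTERN = re.compile(r"errno\s*=\s*(\d+)")


def extract_errno(log_line: str) -> Optional[int]:
    """Return the numeric errno found in a log line, or None if there is none."""
    match = ERRNO_PATTERN.search(log_line)
    return int(match.group(1)) if match else None


def describe_winsock_error(code: int) -> str:
    """Return a short description for the codes listed in WINSOCK_ERRORS."""
    name, text = WINSOCK_ERRORS.get(code, ("UNKNOWN", "not listed in this table"))
    return f"{code} ({name}): {text}"


line = "INET/select_wait: select failed, errno = 10055"
code = extract_errno(line)
if code is not None:
    print(describe_winsock_error(code))
```

Note that WSAENOBUFS describes an exhausted operating system resource (buffer space or a queue); on its own it does not show whether a leak in the server process is responsible.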
Yesterday the server also crashed in the same way, but with nothing in the firebird.log (frozen, not answering any connection, but not refusing them either).
This time, however, I have taken crash dump files from both yesterday's and today's crashes; they can probably explain which loop the server was stuck in that kept it from shutting down or answering clients.
stephane