Client library could hang infinitely waiting for a reply packet on a forcibly disconnected server socket [CORE3387] #3753
Comments
Commented by: @dyemanov This is the second report of such a problem already. It seems we have a regression. |
Commented by: @dyemanov Which v2.5.1 build did you use previously that didn't have this issue? |
Commented by: vander clock stephane (arkadia) Hard to say. With the original 2.5 I also had this kind of issue (much more often, though I'm not sure it's the same one). I used the beta releases between the original 2.5 and 2.5 build 2630 for too short a time to say which of them didn't have this issue. If you want, I can try to attach a debugger to fb_inet_server.exe to see what's happening? |
Commented by: vander clock stephane (arkadia) I forgot to ask: just tell me whether I can kill the "frozen" clients or whether you need them for some tests ... |
Commented by: vander clock stephane (arkadia) The same error again. Today the behavior is a little different: one client had been running for several days with 2 connections to the FB server, and yesterday the client became frozen. When I look in the $MON table on the server side I see only the EVENT connection. This bug has become very important because it blocks our application all the time :( |
Commented by: vander clock stephane (arkadia) Today it's a little different. On the server I have: Thu Mar 24 11:07:35 2011 and on the client: Unable to complete network request to host "xxx". BUT I have several other applications on the same client machine, connected to the FB server, that are still working correctly! This error (10054 = A connection was forcibly closed by a peer.) looks more like a bug than a network problem. |
Commented by: vander clock stephane (arkadia) What is the status of this bug? It appears about once a week on average, but it's severe because when it happens it "freezes" the client application :( More info (I'm probably repeating myself, but just in case I forgot something): We have one client application that runs full time in the background. This client opens 2 connections to the database, one for query and one for select/update. Every second this client application runs a query against the database (select + insert/update), so the connection is never idle for more than 1 second. But after days of work (1, 2, 7, etc.) the connection freezes, waiting for an answer from the server that never arrives. On the server side (when it's frozen), I can only see (in the $MON table) the EVENT connection, but not the second connection that does select/update. In the log of the client application I see that the second connection has been frozen since 04-04-11 14:43:19 FBSERVER Mon Apr 04 14:43:48 2011 Any idea? Do I need to update the current version of the FB server and the FB client DLL? Currently client = legacy 2.5 and server = FB 26230 |
Commented by: @dyemanov Different solutions are being tested on the customer site who has it more or less reproducible. I will report here as soon as we have something concrete found. |
Commented by: vander clock stephane (arkadia) Thanks Dmitry. As soon as you have something, you can also send it to me to test ... |
Commented by: @hvlad The fix is committed (into v2.5 source tree), please try the next snapshot build |
Commented by: vander clock stephane (arkadia) Thanks Vlad! But in the meantime Dmitry sent me 3 different versions to test. I'm testing them now; the 1st version still has the bug, and I'm currently testing the second... Do I need to stop my tests and use the next snapshot build instead? stephane |
Commented by: @dyemanov The issue should be resolved in the snapshot build, so it's worth trying it now. If it still locks up, you can return to my builds :-) |
Commented by: vander clock stephane (arkadia) Bad news: unfortunately it doesn't seem to be resolved. I now have a client that has already been waiting 2 hours for the server's answer and has thus entered a "frozen" state :( On the server side I can see one EVENT connection with the client, but no other connection (I looked in the $MON table). On the server I just have something like this at the time the client started waiting for the server's answer: FB version 26263 ... Any idea? |
Commented by: vander clock stephane (arkadia) I confirm: 24 hours have now passed and the client is still frozen :( |
Commented by: @dyemanov What do you mean by the "EVENT connection"? Do you set up a separate connection to wait for events and this connection is not used for any client queries? |
Commented by: vander clock stephane (arkadia) At the beginning, when the client is launched, 2 connections are opened to the server: here we see that the query connection is lost (no longer present in the $MON table) BUT the Event connection is still shown in the $MON table and still alive! This mostly shows that the problem does not look like a "hardware network" problem. It's not really hard to reproduce this bug (on our database); it appears after 1/2 days of running. If you want, I can make a dump of the client to find out what it is waiting for? |
Commented by: @dyemanov I'm not sure the dump would tell us something useful, but if it's possible for you then please proceed and send it to me by email (or make available for download and send me the link). |
Commented by: vander clock stephane (arkadia) I made a little mistake: when I tried to launch the debugger to make the dump, I closed the client by mistake :( |
Commented by: vander clock stephane (arkadia) Hello everyone, I now have confirmation that this bug is still present :( One of my client applications has been waiting for the server's answer for 2 days now. This time I set up more logging in my client application and I can say that * The client opens 2 connections, one for events and one for queries the exact SQL query was NB: this SQL is valid, just to show you in case. On the server side I can see the Event connection in the $MON table, but not the Queries connection. Running netstat on the client computer shows this: Active Connections Proto Local Address Foreign Address State PID NB: I removed from this report the connections not related to WinRETranslator.exe (our client application). Running netstat on the server, I can find only these rows connected to our client application: Active Connections Proto Local Address Foreign Address State So I cannot find on the server the line for TCP 94.210.76.118:49165 94.210.76.120:3050 ESTABLISHED 2132 Thanks for your help! |
Commented by: vander clock stephane (arkadia) Sorry, I made a little error in pasting the netstat result. Running netstat on the client computer shows this: Active Connections Proto Local Address Foreign Address State PID NB: I removed from this report the connections not related to WinRETranslator.exe (our client application). Running netstat on the server, I can find these rows connected to our client application: Active Connections Proto Local Address Foreign Address State So I cannot find on the server the line for TCP 94.210.76.118:49165 94.210.76.120:3050 ESTABLISHED 2132 stephane |
Commented by: @hvlad Stephane, please explain : TCP 94.210.76.118:49164 94.210.76.120:3050 ESTABLISHED 2132 and 3 of them is for regular database connection. |
Commented by: vander clock stephane (arkadia) Hmm, I looked at the code and it's possible that more than 2 connections are opened for the queries. In fact, another connection can be opened for the "log", which simply stores in a table all the queries executed by the first connection. |
Commented by: vander clock stephane (arkadia) OK, 5 days have passed and the client is still waiting for the server's answer :( netstat still shows this on the client: Proto Local Address Foreign Address State PID and on the server still the same: Proto Local Address Foreign Address State so it's really strange that the client show I waited a few days, hoping a timeout or something like that would close the connection, but no :( |
Commented by: @dyemanov what about the memory dump for the client app? Also, does number of fb_inet_server processes correspond to select count(*) from mon$attachments? |
Commented by: vander clock stephane (arkadia) Damn ... I just closed the client app today before reading your message :( > Also, does number of fb_inet_server processes correspond to select count(*) from mon$attachments? I'm on Firebird SuperClassic, how can I see that? |
Commented by: @dyemanov Ah, sorry, I thought you were using Classic. |
Commented by: vander clock stephane (arkadia) OK, I have another client application that has also been frozen for 8 days in the same situation. Thanks! |
Commented by: vander clock stephane (arkadia) I'm currently testing the new version of FBClient, and it looks fine for now (but I need one or two weeks to know for sure). But I also want to say that 4 days ago I was forced to hard-kill some clients (because of server overload). In the $MON table I can see 221 connections, and with netstat I can see around 286 established connections. So just to say that the trouble seems to be not only on the fbclient.dll side but also on the server side ... |
Commented by: vander clock stephane (arkadia) The new version of FBClient that Dmitry sent me looks like it fixes the bug! Congratulations Dmitry and many thanks! I think we can close this bug as resolved now! Thanks again |
Commented by: @dyemanov Thanks for the good news. The fix will be committed today and thus will be available in tomorrow's snapshot build. |
Modified by: @dyemanov assignee: Dmitry Yemanov [ dimitr ] |
Modified by: @dyemanov summary: Server and client are connected, but server not anwser to client and client is waiting indefinitively server answer ! => Client library could hang infinitely waiting for a reply packet on a forcibly disconnected server socket |
Commented by: @dyemanov While fixing CORE1763, it was decided not to set the SO_KEEPALIVE flag for the client socket. But the lack of keep-alive packets has now been proven to cause the problems in this ticket, supposedly due to some networking (or firewall related) troubles in the production environment. The solution is to turn keep-alive packets on. |
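For readers unfamiliar with the flag, here is a minimal sketch of what enabling SO_KEEPALIVE on a TCP client socket looks like. It uses Python's standard `socket` module purely for illustration; it is not the Firebird client code, and the helper name `make_keepalive_socket` is hypothetical.

```python
import socket


def make_keepalive_socket() -> socket.socket:
    """Create a TCP client socket with keep-alive probes enabled.

    Hypothetical helper for illustration only, not part of fbclient.
    """
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    # SO_KEEPALIVE asks the OS to probe an idle connection periodically,
    # so a peer that silently vanished is eventually reported as an error
    # instead of leaving a blocking read waiting forever.
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE, 1)
    return sock


sock = make_keepalive_socket()
# Read the option back to confirm it is set (non-zero means enabled).
keepalive_enabled = sock.getsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE) != 0
sock.close()
```

Note that the probe timing comes from operating system settings, and the default idle time before the first probe is often long (commonly two hours), so a dead connection may still take a while to be detected.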
Modified by: @dyemanov status: Open [ 1 ] => Resolved [ 5 ] resolution: Fixed [ 1 ] Fix Version: 2.5.1 [ 10333 ] Fix Version: 3.0 Alpha 1 [ 10331 ] |
Commented by: vander clock stephane (arkadia) One thing: does gbak use fbclient.dll or its own connection protocol? Because I also have one backup that has been frozen for several days ... But I guess that gbak does not use fbclient.dll and that the bug in gbak is similar to the bug in fbclient.dll? |
Commented by: @dyemanov GBAK is a regular client application, i.e. it uses fbclient.dll. Perhaps this is a different issue. |
Commented by: vander clock stephane (arkadia) Sorry for the late answer, I was travelling. On the client where gbak runs, netstat shows: on the server, nada :( The gbak has been frozen since 24 May ... gbak:25900000 records written I just made a dump of the gbak and put it on the FTP in case you need it... Thanks! |
Commented by: vander clock stephane (arkadia) Another remark: on the Firebird server I can see 4 EVENT connections still alive that were opened by clients that went away several weeks ago! These connections were not opened by clients that have the new FBClient.dll (I don't know if it matters, because here it's on the server side). The server version is still 26230 |
Commented by: vander clock stephane (arkadia) Hello, I now often have gbak freezing (about once a week). But apart from gbak, I have far fewer frozen connections than before!! gbak is still connected now (for several days). If you want I can also do a memory dump (or something else?) Thanks! |
Commented by: @dyemanov It's worth trying v2.5.1 RC1 regardless. A memory dump (ideally, for both frozen GBAK and the server it's supposedly connected to) is appreciated, as well as the network stats (also, for both server and client) for the moment. |
Commented by: vander clock stephane (arkadia) OK, in more detail: on the gbak computer netstat shows: TCP 93.122.12.118:52939 93.122.12.120:3050 ESTABLISHED On the Firebird server computer ... nothing! The gbak client is neither in the $MON table nor in netstat! So it simply looks like gbak doesn't detect the disconnection from the server and stays connected forever :( I put a copy of the gbak dump file on the FTP server whose credentials I sent you by email last time ... Thanks! |
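The symptom described above (client socket ESTABLISHED, nothing on the server, client blocked forever) is what a blocking read does when no reply ever comes. As a rough illustration only, and not what fbclient or gbak actually do, the sketch below shows how an application-level receive timeout bounds that wait. The helper `recv_with_timeout` and the local listener that never replies are hypothetical stand-ins.

```python
import socket


def recv_with_timeout(sock, nbytes, timeout_s):
    """Return received bytes, or None if nothing arrives within timeout_s.

    Hypothetical helper for illustration; not part of any Firebird API.
    """
    sock.settimeout(timeout_s)
    try:
        return sock.recv(nbytes)
    except socket.timeout:
        return None


# Stand-in for a server that accepted the connection but never answers.
listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
listener.bind(("127.0.0.1", 0))  # port 0: let the OS pick a free port
listener.listen(1)

client = socket.create_connection(listener.getsockname())
server_side, _ = listener.accept()

# Without a timeout this recv would block indefinitely; with one, the
# caller gets control back and can decide to reconnect or report an error.
reply = recv_with_timeout(client, 1024, 0.2)

for s in (client, server_side, listener):
    s.close()
```

A timeout like this is a different technique from the keep-alive fix adopted in this ticket: it limits how long the caller waits, while keep-alive lets the OS notice the dead peer.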
Modified by: @pcisar status: Resolved [ 5 ] => Closed [ 6 ] |
Modified by: @pavel-zotov QA Status: No test |
Modified by: @pavel-zotov status: Closed [ 6 ] => Closed [ 6 ] QA Status: No test => Cannot be tested |
Submitted by: vander clock stephane (arkadia)
Is related to CORE1763
Hello,
I have several clients that have been running for some days now, waiting for the server to answer them.
So in the end they look "frozen"!
On the server side, in the $MON table, I can see that the clients have been attached since 3/11/2011 (so 5 days), but the server doesn't answer them! The clients have been waiting 5 days now for an answer from the server that never arrives ...
The server version is 26230. The clients use the original 2.5 client DLL.
I haven't killed the clients (or the server) yet, so I can still do some things on them if you need (like a full dump?)
Thanks in advance
Commits: 070718b e04bf21 9087961