  #1  
Old 11-03-2016, 10:49 AM
Demi-God
 
Join Date: Jan 2002
Posts: 1,289
Deadlock in TCPConnection::ClearBuffers from FinishDisconnect

After a code inspection, I am fairly confident my world server hit this issue, and the current eqemu code still appears to be susceptible to it.

https://github.com/EQEmu/Server/blob...ction.cpp#L293 -> https://github.com/EQEmu/Server/blob...ction.cpp#L310 -> https://github.com/EQEmu/Server/blob...ction.cpp#L504

Along that call chain we lock MState a second time while it is already held, which cannot succeed, so it becomes a deadspin/deadlock, whatever you want to call it.

The lock of MState in ClearBuffers needs to be removed: https://github.com/EQEmu/Server/blob...ction.cpp#L504
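To illustrate the kind of change described above, here is a minimal sketch using std::mutex, which is non-recursive (locking it twice from the same thread is undefined behavior and typically hangs). The Connection class, its member names, and the ClearBuffersLocked helper are all hypothetical; this is not the EQEmu TCPConnection code, and the sketch makes no claim about how EQEmu's own Mutex class behaves. The idea is that the path which already holds the lock calls an unlocked helper instead of re-entering the locking function.

```cpp
#include <cstddef>
#include <mutex>
#include <vector>

// Hypothetical stand-in for a connection class; NOT the EQEmu TCPConnection.
class Connection {
public:
    // Queue one byte for sending, under the state lock.
    void QueueByte(unsigned char b) {
        std::lock_guard<std::mutex> lock(state_mutex_);
        buffer_.push_back(b);
    }

    // Public entry point: takes the lock once, then calls the unlocked helper.
    void ClearBuffers() {
        std::lock_guard<std::mutex> lock(state_mutex_);
        ClearBuffersLocked();
    }

    // Disconnect path: it already holds the lock, so it must call the
    // unlocked helper. Calling the public ClearBuffers() here would try to
    // lock state_mutex_ a second time on the same thread.
    void FinishDisconnect() {
        std::lock_guard<std::mutex> lock(state_mutex_);
        connected_ = false;
        ClearBuffersLocked();
    }

    bool Connected() {
        std::lock_guard<std::mutex> lock(state_mutex_);
        return connected_;
    }

    std::size_t Pending() {
        std::lock_guard<std::mutex> lock(state_mutex_);
        return buffer_.size();
    }

private:
    // Caller must already hold state_mutex_.
    void ClearBuffersLocked() { buffer_.clear(); }

    std::mutex state_mutex_;
    bool connected_ = true;
    std::vector<unsigned char> buffer_;
};
```

With this split, a caller can queue data, call FinishDisconnect(), and then see Pending() return 0 without the thread ever locking the same mutex twice.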

So far no issues in my testing, but it's hard to test all disconnect paths manually, so I have to let it run.
__________________
www.eq2emu.com
EQ2Emu Developer
Former EQEMu Developer / GuildWars / Zek Seasons Servers
Member of the "I hate devn00b" club.
  #2  
Old 11-07-2016, 09:23 AM
Demi-God
 
Join Date: Jan 2002
Posts: 1,289

Just to update: it seems I was wrong about this being the source of the deadlock. I still don't think it's a good idea for us to lock a mutex we have already locked; it seems like bad design, and this might not be the only place it happens.

In any case, I reviewed net.cpp and found some older code that was delaying the reconnect, which I removed (so the reconnect now relies solely on the 10-second timer instead of something like 120+ seconds).

What I did see is that last night our first reconnect attempt to the eqemu LS failed; typically I don't even see a reconnect attempt, just the ending-thread error. I also added a log message inside the AutoInitLoginServer thread creation in net.cpp to track this:


20366 [11.06. - 22:37:22] [COMMON__THREADS] Ending TCPConnectionLoop with thread ID -54917376
20366 [11.06. - 22:37:24] [WORLD__INIT_ERR] Not all login servers are connected, calling AutoInitLoginServer.
20366 [11.06. - 22:37:24] [WORLD__LS] Connecting to login server: login.eqemulator.net:5998
20366 [11.06. - 22:37:24] [COMMON__THREADS] Starting TCPConnectionLoop with thread ID -546068736
20366 [11.06. - 22:37:34] [WORLD__INIT_ERR] Not all login servers are connected, calling AutoInitLoginServer.
20366 [11.06. - 22:37:34] [WORLD__LS] Connecting to login server: login.eqemulator.net:5998
20366 [11.06. - 22:37:34] [WORLD__LS] Connected to Loginserver: login.eqemulator.net:5998

I will continue monitoring to see if any issues happen again, but maybe this shorter retry is helping the situation.
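For reference, the retry behavior in the log (two AutoInitLoginServer calls 10 seconds apart, with no extra delay on top) can be sketched as a plain interval timer. This is only an illustration under my own assumptions: RetryTimer and Check are hypothetical names, not the timer class or code in net.cpp.

```cpp
#include <chrono>

// Hypothetical fixed-interval retry timer; NOT the EQEmu Timer class.
class RetryTimer {
public:
    using Clock = std::chrono::steady_clock;

    // The first retry becomes due one interval after `start`.
    RetryTimer(std::chrono::seconds interval, Clock::time_point start)
        : interval_(interval), next_(start + interval) {}

    // Returns true when a retry is due, then re-arms for the next interval.
    // No additional back-off is applied on top of the interval.
    bool Check(Clock::time_point now) {
        if (now < next_) {
            return false;
        }
        next_ = now + interval_;
        return true;
    }

private:
    std::chrono::seconds interval_;
    Clock::time_point next_;
};
```

A main loop would call Check(RetryTimer::Clock::now()) on each pass and attempt the login server connection only when it returns true, so a failed first attempt is retried after one more interval rather than after a much longer delay.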
Everquest is a registered trademark of Daybreak Game Company LLC.
EQEmulator is not associated or affiliated in any way with Daybreak Game Company LLC.
Except where otherwise noted, this site is licensed under a Creative Commons License.
       