A possible linkdeath fix
Submitting this for your consideration.
A friend and I have been playing on our own eqemu server for a couple of weeks now. Unfortunately his ISP seems to suffer from occasional bouts of packet loss. At random but frequent intervals he would be running along, fighting or just standing around, and find that he could receive chat messages but could not send his own, cast spells, /con mobs, etc.
A look at the logs revealed that his client kept retransmitting certain packets, or fragments of a larger sequence, for which it apparently had not received an acknowledgment from the server. His client continues to send the same packet or sequence of packets over and over until it eventually gives up and he gets the "you have been disconnected" message.
The current code in EQStream.cpp logs but otherwise ignores packets received from the client that are older than the current sequence number. It seemed logical to me to have the server send an out-of-order ack to the client to make it happy:
--- EQEmu-0.7.0-1118/common/EQStream.cpp.orig 2008-06-25 16:50:09.000000000 +0000
+++ EQEmu-0.7.0-1118/common/EQStream.cpp 2008-06-25 16:50:47.000000000 +0000
@@ -178,6 +178,8 @@
_raw(NET__DEBUG, seq, p);
//kludge to see if it helps:
//SendAck(GetLastAckSent());
+
+ SendOutOfOrderAck(seq);
} else {
// In case we did queue one before as well.
EQProtocolPacket *qp=RemoveQueue(seq);
@@ -222,6 +224,8 @@
} else if (check == SeqPast) {
_log(NET__DEBUG, _L "Duplicate OP_Fragment: Expecting Seq=%d, but got Seq=%d" __L, NextInSeq, seq);
_raw(NET__DEBUG, seq, p);
+
+ SendOutOfOrderAck(seq);
} else {
// In case we did queue one before as well.
EQProtocolPacket *qp=RemoveQueue(seq);
This change may very well be a kludge, but it seems to be working well (for the last week or so). Even though my friend is still suffering from some packet loss here and there, he no longer goes linkdead and the game is now playable.
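To illustrate the idea behind the patch, here is a minimal, self-contained sketch of a receiver that re-acknowledges packets it has already seen instead of silently dropping them. It is not the EQStream code: `SeqOrder`, `compare_sequence`, `Receiver`, and `Ack` are hypothetical names, the half-window comparison is an assumed way of handling 16-bit wraparound, and the out-of-order ack is modeled as a simple flag rather than a real packet.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical stand-ins for the stream's sequence classification.
enum class SeqOrder { Past, InOrder, Future };

// Classify `got` relative to `expected`, treating the 16-bit sequence
// space as circular so that a wrap from 65535 to 0 still counts as "future".
SeqOrder compare_sequence(uint16_t expected, uint16_t got) {
    if (got == expected) return SeqOrder::InOrder;
    // Distance ahead of `expected`, modulo 2^16.
    uint16_t ahead = static_cast<uint16_t>(got - expected);
    return ahead < 0x8000 ? SeqOrder::Future : SeqOrder::Past;
}

// Record of an acknowledgment the receiver would put on the wire.
struct Ack {
    bool out_of_order;  // true for the "I already have this one" style of ack
    uint16_t seq;
};

struct Receiver {
    uint16_t next_in_seq = 0;
    std::vector<Ack> acks;  // stand-in for actually sending packets

    void on_packet(uint16_t seq) {
        switch (compare_sequence(next_in_seq, seq)) {
        case SeqOrder::InOrder:
            acks.push_back({false, seq});
            ++next_in_seq;  // wraps naturally at 65535
            break;
        case SeqOrder::Past:
            // The sender evidently never saw our earlier ack, so answer
            // again rather than only logging the duplicate.
            acks.push_back({true, seq});
            break;
        case SeqOrder::Future:
            // A real stream would queue the packet for later; omitted here.
            break;
        }
    }
};
```

Feeding this receiver sequence numbers 0, 1, 0 leaves `next_in_seq` at 2 and records a third ack for sequence 0 flagged as out of order, which is the behavior the two added `SendOutOfOrderAck(seq)` calls are meant to provide.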
Congdar
06-25-2008, 02:21 PM
I've been looking for a fix in this area. The error I was getting comes from this same EQStream.cpp file:
if(CompareSequence(NextOutSeq, seq_send) == SeqFuture) {
_log(NET__ERROR, _L "Tried to write a packet beyond the end of the queue! (%d is past next out %d)" __L, seq_send, NextOutSeq);
I would get huge log files with this message and the same seq_send and NextOutSeq for every line. Somebody suggested remming out that log line and that did stop the huge log files but didn't solve the problem. I'll test your fix and see how it goes.
moydock
06-25-2008, 03:44 PM
I worship the ground you walk on if you managed to fix this bug...
trevius
06-25-2008, 06:32 PM
HOLY SHIT, I AM ADDING THIS IN NOW LOL!
I knew it had to be something like that that was causing the bug. Hopefully this resolves the issue for good. This is one of the biggest issues I have had on my server during peak times. With this resolved, I know quite a few players that will be ecstatic lol!
If this actually works to resolve this bug, the entire community owes you a huge thanks lol! And this should get added into the source as soon as it is verified to work.
cavedude
06-25-2008, 07:42 PM
This is going into TGC with its next reboot, thank you :)
MNWatchdog
06-26-2008, 01:41 AM
Hopefully this will fix some login and zoning issues too, where neither completes reliably.
It's not as much of an issue on my cable connection, but when I'm using the city of Minneapolis's WiFi network, it happens far more often.
If this does solve the issue, I can finally cancel my cable connection, saving me $30/month by using the city's WiFi instead of cable.
I have to give them credit that the city's network works as well as it does, though. Speed-wise, for browsing it's decent. I'm paying $15/mo for the 3Mb connection compared to Comcast's $45 for 6Mb.
I can't promise that this fix will be the proverbial silver bullet, particularly for servers with dozens of players (versus my maximum of two), but I am fairly confident that it should at least help a little.
Typically, when the problem that I was having occurred, the debug zone log would fill with messages about duplicate OP_Packet and OP_Fragment with repeating sequence numbers (it still does, but at least now the server is taking some kind of action). The client debug log would indicate that it requested status for X and that it never came.
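For anyone who wants to check whether their own logs show the same pattern, here is a rough sketch that tallies how often each sequence number appears in duplicate-packet log lines. It assumes the lines contain the text produced by the format string in the patch ("Duplicate ...: Expecting Seq=%d, but got Seq=%d"); the function name `count_duplicate_seqs` is made up for this example, and the exact prefix your log puts in front of each line may differ.

```cpp
#include <cassert>
#include <cstdio>
#include <map>
#include <string>
#include <vector>

// Count occurrences of each "got Seq" value among duplicate-packet log lines.
// Lines that do not match the expected wording are skipped.
std::map<int, int> count_duplicate_seqs(const std::vector<std::string>& lines) {
    std::map<int, int> counts;
    for (const std::string& line : lines) {
        if (line.find("Duplicate") == std::string::npos) continue;
        std::string::size_type pos = line.find("Expecting Seq=");
        if (pos == std::string::npos) continue;
        int expected = 0;
        int got = 0;
        // Parse from the start of "Expecting Seq=..." onward.
        if (std::sscanf(line.c_str() + pos,
                        "Expecting Seq=%d, but got Seq=%d",
                        &expected, &got) == 2) {
            ++counts[got];
        }
    }
    return counts;
}
```

A sequence number with a very high count would point to the client retransmitting the same packet over and over, as described above.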
Teknocrat
06-26-2008, 05:08 PM
You are amazing if this fixes the issue!
trevius
07-01-2008, 06:22 AM
This code has been on my server for 4 days so far and not a single report of the "lag bug" since. I don't even think I have had to reboot it since I put this code in and normally I do every 2 days or so.
Looks like this one is probably good to be added to the source ASAP. Interested to hear if other servers running this have seen similar results.
I'll add it in the next few days if no one's experienced any issues by then.
trevius
07-04-2008, 05:24 AM
Just wanted to mention that 2 of my players that have had this bug chronically have just posted saying that they haven't seen it happen a single time since this code was added about a week ago.
trevius
07-25-2008, 05:57 PM
This one definitely needs to be added in the next release! It is a verified fix as far as I am concerned. Still have not had a single player report this bug since I added this on my server.
trevius
07-25-2008, 06:22 PM
Sorry, just saw that lol. Didn't see you already just put a release out. Going through and posting in the others I think should be in the next one.
Congdar
09-09-2008, 11:12 AM
I'm still seeing this error... Did this fix make it in? If it did, it wasn't the final solution :(
Tried to write a packet beyond the end of the queue!
This is filling up log files. I could just rem out the line in the code, but that doesn't fix whatever is wrong, only hides it. Maybe somebody can convince me that this isn't really an error?
trevius
09-09-2008, 05:16 PM
This fix doesn't stop out of order packets from coming in, it just gives the server a way to handle them.
The out-of-order packets issue that fills the log files is something different. I find that it is normally caused by players with high latency, like from China. I don't think there is really any possible solution to it, since it seems to be network related. Of course, I could always be wrong :P
cavedude
09-09-2008, 05:37 PM
Of course, I could always be wrong :P
I believe you to be correct. The packets are coming in out of order, so it is a network issue. There isn't anything EQEmu can do besides deal with it. PEQ too sees its logs fill up from people in Asian countries, where there are fewer links to the rest of the world.
Congdar
09-09-2008, 08:56 PM
Hmm. OK. Well, it is a weird bug, as the packet numbers increase a little bit but it isn't long before the numbers never change... like it's in a loop with the same packet number printed and not changing. Commenting out the logging would clear up the logs, but I think it will still be looping and causing server/zone lag. Another reason I think a packet is stuck in a loop is that the logs continue to grow, even after the player has logged off the server. The code has a continue; statement... is there a way to exit instead without causing too much of a hiccup in the network connection?
trevius
09-09-2008, 11:23 PM
I am guessing you are running a Windows Server?
I used to have it flood my logs insanely like that, but it was only on Windows. You should be able to edit your log.ini file to stop it from logging those types of errors. But that doesn't exactly fix the problem. I did notice that Windows servers have a major resource issue that builds up over time. I am fairly certain that the resource issue is tied to the player ghosting issue, which is probably the biggest problem with Windows servers. And it is just a guess, but I am willing to bet that the error you are seeing is related to player ghosts. Player ghosts are characters that hang in the game after the player has logged off or exited EQ. I bet that if one of these issues is fixed, it will fix the other.
I'm still seeing this error... Did this fix make it in? If it did, it wasn't the final solution :(
Tried to write a packet beyond the end of the queue!
Unfortunately the change I made won't (shouldn't, anyway!) have an effect on the issue you are seeing. As far as I can tell from a cursory look at the code, the "Tried to write a packet beyond the end of the queue" problem is related to server => client communications, whereas my change primarily affects client => server communications.