Players going semi link dead


HurtinuDaily
03-06-2008, 02:42 PM
Does anyone remember the bug where people would go halfway link dead? Players' spell bars would be greyed out and melees could not attack, but mobs and other players could still attack the player. From what I remember it was resolved, but I cannot remember what fixed it. Thanks in advance!

AndMetal
03-06-2008, 05:53 PM
There are actually quite a few things that can cause it. In my experience, it usually has something to do with spells, usually one that isn't set up properly somewhere. I actually posted a bug report (http://www.eqemulator.net/forums/showthread.php?t=24459) about it, which might help to determine if it's the same problem.

On the same note, relogging usually fixes the issue, although you sometimes have to /q.

HurtinuDaily
03-07-2008, 01:30 PM
It happens to melees too. It's like the client is waiting to receive a packet back from the server. It's very annoying, to say the least.

jasper
03-08-2008, 01:10 AM
I keep having this problem on my War's.. only been my War so far.. 2 of them on 2 different servers.

Gets on my nerves.. heh

cavedude
03-08-2008, 03:14 AM
"It's like the client is waiting to receive a packet back from the server."

That's exactly what is happening. Any function can cause this if it is missing an opcode or isn't properly implemented.

moydock
03-11-2008, 10:10 AM
Yeah, this bug plagues me. Is there any way to add code that says, if there's a problem with an opcode, ignore it and go on?

Also, I've noticed that if you make a change in your inventory while half linkdead, it will remember that upon logging back in. But it won't remember any movement you may have made.

Also, it seems to happen a lot more to some people than others.

So_1337
03-11-2008, 10:40 AM
My girlfriend's bard has been unplayable for some time now due to this. I've even had her remake the character two different times. All three versions wind up the same; it gets to where she can't log in. Then, when she can, she crashes whole zones when she does =P She just started playing a warrior instead, heh.

Had a thread going here about this same issue (http://www.eqemulator.net/forums/showthread.php?t=24054), including some of the op_code errors I was seeing. However, just seems something that has to be dealt with. Admittedly, I've seen it become a lot rarer lately than in past versions.

trevius
03-11-2008, 11:41 AM
I am curious if this bug is seen on Linux servers. If it isn't, there may be some adjustments that can be made on Windows to stop it.

I put in a small change this afternoon on my server running on Windows XP SP2. It has to do with how many connections Windows allows per second. If I get feedback saying it has stopped the problem completely, I will definitely post it in the forums here. Otherwise, if it didn't stop it completely, then it probably didn't affect it at all. It is hard to judge if you are helping an issue like this since it is so sporadic.

I am trying to make adjustments one at a time to see if it helps resolve this issue or the player ghosting issue. Both of those problems are high on my list of bugs I would like worked out, since they both affect the players and the server overall in a huge and annoying way. I am also curious if player ghosting is only a Windows thing and not seen on servers running on Linux at all.

cavedude
03-11-2008, 12:43 PM
The bard issue is a completely different situation. That's a zone crash caused by something in the AA listing code.

As I stated above, the "bugged" state is caused when something is missing/broken in the code: the client sends a packet to the server, but the server doesn't know how to respond, so it sends nothing back. The client will do nothing but wait for the packet to arrive. Anything can cause this, and if I think about it I can probably come up with several different and unrelated situations where this can occur. Since we are unable to change the client's behavior (say, to force it to continue on without getting a receipt packet for a function), there is nothing to be done but correct the root issue, which is not always easy to find.

Generally speaking, this isn't a symptom of connection issues. Those will usually cause full-fledged link deaths or crashes to desktop/login.
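
To illustrate the point (and the earlier question about "ignoring" a bad opcode), here is a minimal Python sketch. It is not the emulator's actual code; the opcode values, handler names, and the dispatch table are all hypothetical. It shows that a server can log and skip an opcode it has no handler for, but that still leaves the client without the reply it is waiting on.

```python
from typing import Callable, Dict, List, Optional

# Hypothetical opcode values, chosen only for illustration.
OP_EXAMPLE_HANDLED = 0x01
OP_EXAMPLE_MISSING = 0x02


def handle_example(payload: bytes) -> bytes:
    # A properly implemented handler always produces a reply for the client.
    return b"ok"


# Only one opcode has a handler; the other stands in for a missing or
# broken implementation on the server side.
HANDLERS: Dict[int, Callable[[bytes], bytes]] = {
    OP_EXAMPLE_HANDLED: handle_example,
}

# Opcodes the server received but could not answer, kept for debugging.
unhandled_log: List[int] = []


def dispatch(opcode: int, payload: bytes) -> Optional[bytes]:
    """Return the reply for a packet, or None when no handler exists."""
    handler = HANDLERS.get(opcode)
    if handler is None:
        # The server "ignores it and goes on", but no reply is sent,
        # so a client that blocks on the reply stays stuck.
        unhandled_log.append(opcode)
        return None
    return handler(payload)
```

Logging the unanswered opcodes this way at least narrows down which handler is missing, which is the root issue that has to be fixed.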

So_1337
03-11-2008, 02:00 PM
Sorry, you're right. Bard is unrelated, but it led me to find all those other op_code errors in the thread I linked. I side-track myself sometimes. Was mostly making the point that I found with her bard, which is that in some cases there's nothing we can do but wait until the problem is found.

I know you mentioned pick pocket is an easy one to produce. I think that rogue poison (the applying portion) is another, though it doesn't bug the client entirely.

moydock
03-11-2008, 02:46 PM
Often when this bug occurs I'm just running around. Not sure what opcodes I'd be sending that wouldn't get an answer there. Although, often I'm sending tells or using /w all. Perhaps if someone is between zoning or already half LD themselves, it causes you to go half LD... lol, that would be interesting.

trevius
03-11-2008, 04:14 PM
This bug does seem to be caused sometimes by doing absolutely nothing on the client side. I know I have definitely seen this happen in certain customized zones MUCH more commonly than normal PEQ zones. On one server, I would get this bug almost every time if I tried to load up more than 1 character into DSP or BoT. Even if I only logged 1 in, and hit absolutely no buttons, and then loaded a second one in, etc, the first time I would try to cast a spell or do a /who all or attack anything on any character other than the first one I loaded, it would show that the character had this bug.

I am not sure what exactly would cause this, but my point is that it definitely seems to occur when players are doing nothing out of the ordinary. And for some reason, certain zones seem to cause the problems more than others. I have 100% confirmed that from previous testing in BoT and DSP on the other server I played on vs other custom zones or standard PEQ ones.

I don't know why certain people experience this more than others, but some people see it when multi-boxing and others see it when they are playing just 1 character. But, some people don't see it at all, or hardly ever.

If this is due to the client waiting for a certain packet back from the server, couldn't that also be caused by the packet getting dropped, and not only by the server failing to send the packet that the client is waiting for?

I just find it hard to believe that certain zones can cause this problem when my client is doing absolutely nothing. And that it can happen on all of my characters that I am boxing and almost always leave the first character in the zone unaffected. I would think that if this was an issue with the client waiting on something from the server, but not receiving it because the server doesn't know how to reply to an odd request, that this issue would be repeatable 100% and would be caused by certain actions. I have been unable to find anything common between what is causing this issue, other than certain custom zones in some cases.

I am not trying to question Cavedude's information, because he is far more knowledgeable about the emu than I will ever be, but it would be nice if we could get this figured out and resolved, so I am adding all info I can think of.

I made a change on my server today that I hope will help connectivity on my server. Here is the post I added to my forums to give a basic idea of what it is supposed to do:

As an update on my work towards making the server more stable and to try to lessen issues like the "lag bug" and other disconnects, I made a minor change on the server. The short of it is that Windows XP SP2 has TCP/IP that is hard coded (cannot be adjusted) to only allow a max of 10 simultaneous connections per second. It also has a sort of limiter that restricts a certain amount of connections. This wouldn't affect most people at all on their home PCs; the limits were put in place by MS to keep worms that may have infected your home PC from going crazy making an unlimited amount of connections to infect other computers or whatever. Since I am running a server, my PC requires different settings than a standard home PC.

The new change I made was replacing the TCP/IP files with some that bump the max connections per second up to 50. This doesn't mean that I can only have 50 players on my server, it only means my PC can now deal with 50 per second. Previously when it was set for 10 per second, I believe it would have to kinda switch back and forth to a different 10 connections every second. I think that means most packets for more connections than 10 will have to get buffered and dealt with later. I imagine that might have caused some issues with connectivity and I really hope this new change will make a difference. With it now being set to 5x as high as it was before, we may see significant improvement, with fewer disconnects and "lag bugs". I wouldn't even be surprised if the default XP SP2 files I was using may have been causing the player ghosts. I haven't tried running a Linux server yet, but as far as I have read I haven't been able to find 1 person mentioning having ghosts on their server when they run Linux. So, I imagine it is something that is tied to Windows.

This is all part of me trying to increase server stability on Windows and increase performance on my server. Hopefully, eventually I can reduce lag even when there are a ton of players on.

cavedude
03-11-2008, 11:58 PM
Oh crap, I didn't know this was happening when the player was doing nothing. Don't tell me THAT bug has returned :( We had a similar issue a while back that got worse and worse until the Emu was practically unplayable by many. I forget who fixed it, but it did eventually get corrected. And when I say many, I don't mean all. That was the bitch: some people never saw this bug, even at the worst times.

What do you mean when you say "standard" PEQ zones? Is that as opposed to custom zones, or ones with altered content from PEQ? Do those zones typically cause more problems?

To answer:

"If this is due to a the client waiting for a certain packet back from the server, couldn't that be caused by the packet getting dropped instead of the just when the server doesn't send the correct packet that the client is waiting for?"

No, because after a bit the client will ask the server to resend. If it was just lag, it will and the client will be happy. If it's a situation where the server doesn't know how to reply, the client sits there with its thumb up its ass (pardon my French).
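
The difference between the two cases can be sketched in a few lines of Python. This is only a toy model under stated assumptions: the function names are made up, and the retry cap exists only so the example terminates (the real client, as described above, just keeps waiting).

```python
from typing import Callable, Optional


def wait_for_reply(send_request: Callable[[], Optional[bytes]],
                   max_resends: int = 5) -> Optional[bytes]:
    """Ask once, then re-ask up to max_resends times; None means no reply."""
    for _ in range(1 + max_resends):
        reply = send_request()
        if reply is not None:
            return reply
    return None


def lossy_server(drop_first: int) -> Callable[[], Optional[bytes]]:
    """Server that knows the answer, but its first drop_first replies are lost."""
    state = {"calls": 0}

    def send() -> Optional[bytes]:
        state["calls"] += 1
        return None if state["calls"] <= drop_first else b"ack"

    return send


def silent_server() -> Optional[bytes]:
    """Server with no idea how to answer this request: it never replies."""
    return None
```

With a lossy link, `wait_for_reply(lossy_server(2))` eventually gets its `b"ack"` because a resend succeeds. With `silent_server`, no number of resends helps, which matches the "bugged" state.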

I guess I should start using Windows a bit more. Both of my test servers and PEQ itself are Linux, which seems to work fine for me, at least in this regard. I noticed a few things that were off in Windows when I compiled my new installer that I attempted to patch up, but I am sure there are plenty more. Though, I do know KLS uses Windows exclusively (maybe?) and Wildcard I am certain uses both.

So_1337
03-12-2008, 12:42 AM
I have, very rarely, had it happen while doing nothing. Most notably, I recall it happening in Plane of Storms. We'd be doing nothing more than zoning everyone in to start getting ready for BoT keys. However, I can't attest that there was nothing going on at the time. Someone may have been buffing and possibly glitched everyone else as well?

However, this same behavior occurs if your connection should drop for, say, fifteen seconds or so while doing either something or nothing. Our router at work was acting up for about a week, where it would do exactly that. (I knew it was connection related because we host a Ventrilo server on the server box as well, and we would all lose that connection as well.)

The lag bar would consistently go out to about 83%, then jog back up to 100% in the midst of whatever we were doing. Some people would recover, some wouldn't. Those who didn't would have the behavior described above. The server sure does its best job to jog you back up to where the server is, but it doesn't always take.

cavedude
03-12-2008, 01:18 AM
I'll have a look at the PoStorms scripts, though off the top of my head I don't think there is anything weird going on in that regard. I do know that Storms is one of the most heavily populated zones in the db, NPC-wise. That may contribute.

When you lose connection, those who don't recover will eventually get booted. It may take a while but it does happen (Titanium client is VERY lenient with timeout periods). That's a true LD situation. Communication between both the server and client has failed.

In the "bugged" state, you never get kicked (or at least I've never been kicked, even after making dinner, eating it, and coming back). I guess it's because the client is still chatting away, so the server knows it's still there and never boots it. So in that case, we still have one-way communication.
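
A rough Python sketch of why the two cases end differently, assuming the server boots a client purely on how long it has gone without hearing from it. The class, method names, and the 90-second timeout are illustrative assumptions, not values taken from the emulator.

```python
from dataclasses import dataclass

# Assumed timeout in seconds, for illustration only.
TIMEOUT = 90.0


@dataclass
class ClientSession:
    # Time (in seconds) at which the server last received any packet.
    last_heard: float

    def on_packet(self, now: float) -> None:
        """Record that the client sent something, whatever it was."""
        self.last_heard = now

    def should_boot(self, now: float, timeout: float = TIMEOUT) -> bool:
        """True once the client has been silent for longer than the timeout."""
        return now - self.last_heard > timeout
```

A truly link-dead client stops calling `on_packet`, so `should_boot` eventually turns true. A bugged client keeps sending, `last_heard` keeps moving forward, and the server never has a reason to drop it.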

So_1337
03-12-2008, 02:19 AM
Hrm. I think I should clarify: When I said that some don't "recover", I mean from being bugged. Everyone's connection recovered. We've had it happen on boss fights before, and if nothing else, your character will still be auto-attacking away. So depending on which class it is (boxed rangers, especially), it doesn't hurt to leave them DPSing until the end of the fight, and then you can camp them out and back to un-bug them. The easiest way to tell if a character recovered from being bugged or not in that situation was to see if the mob actually died on their screen, or if it was stuck at 0% health indefinitely.

I probably shouldn't have brought it up, sorry. It's related, in that it has the same symptoms, but it's certainly not from the same cause.

I thought that the high NPC population might be a contributing factor in Storms, as it's not like a script had been executed yet with us just zoning in and grouping up. (Other than the script that makes the corpses at zone-in actually lay down on the ground.) Heck, it may well have been a connection hiccup that caused the problem. In fact, I don't think I've seen anyone get bugged from doing literally nothing since we upgraded our pipeline. We used to run far too many people for the connection we had, and would see the problem often in zones with large populations, like Ssra and PoStorms.

For instance, we used to be able to run about 9 people with no trouble. Any time we'd get over that, a random person would usually get bugged. The server and clients just couldn't pass information back and forth quickly enough. And yes, sometimes it would happen just from doing nothing. Especially if while someone was doing nothing, there were people doing lots of things elsewhere =P

moydock
03-12-2008, 08:22 AM
I'm liking the windows server max connections idea.

I run Windows XP Pro for my server as well, but I've always got it locked and never have more than 5 people on it. And I've never experienced the half-d/c on my server. And I've spent hundreds of hours deving it.

However, on the tz/vz server I get this bug nonstop. Some days more than others.

What opcodes does the server send to the client while he's sitting still? That may be a good thing to make sure is functioning properly.

trevius
03-12-2008, 09:45 AM
Here is a post on my server forums that shows a few players reporting the problem on my server, which is running emu version 1099 atm. The earliest emu version I have run was 1071.



It seems like the complaints about this have died down since I added in my zone resetters. Since those keep the number of connections under control somewhat (with player ghosts), that leans me even more towards thinking this could be a bandwidth/connection issue.

I can't imagine why some people would see this no matter what class of character they play. It isn't like they are seeing it every time they are clicking their epic or casting 1 particular spell. They see it at random times.

I have been doing some research into it possibly being a connection issue and I will keep people up-to-date here on what I find. And if I can find a good solution, I will start a new thread for it.

Last night, I flashed my Linksys router with a Linux-based firmware called DD-WRT. It has a TON more features than my previous firmware. It was a bit harder to get it running properly, but it seems pretty smooth already. The reason I did it was to get more options with how my bandwidth is used, and I haven't gotten into that part just yet. BUT, when I loaded it, I found that it has many great features that I think will really help me narrow down some network performance issues and maybe even help troubleshoot player ghosting as well as the "lag bug" issues. I can now see my router CPU utilization, memory utilization and how it is being divided up, buffer utilization, total number of connections and a full list of every IP that is connected to each of my machines, and it even shows a real-time chart of bandwidth utilization for LAN/WAN/Wireless for in and out totals. This is some really awesome stuff! I am a network admin, so I get into this kinda thing :P It makes me wish I had changed it long ago.

I still have a lot to learn about customizing my new router firmware to see how much I can really tweak it, but it is already set to run at least as well as the old one. I am hoping that with all of this new real-time info I can see easily, it will help to narrow down certain issues.

I currently have my server log directory renamed to keep it from getting to gigs in size after a week or so. So, I might have to rename it back for a few days to see if I can compare the debugs with what I am seeing on the router when these problems occur. I will be sure to post my findings here when I get around to that. I still have more content to work on. My network and server are by no means unstable or in bad shape, so this is a secondary thing I am trying to work on when I can. The goal is to get something out that will help all other servers to increase performance and reduce disconnects and bugs.

trevius
03-12-2008, 10:27 AM
Oops, messed up the link to my thread for examples of this bug:


http://sh.makeforum.org/lots-of-server-discons-for-anyone-else-t103.html

So_1337
03-12-2008, 11:42 AM
"It's just happened to me 4 or so times in the past 30 mins or so... There are only 49 people logged in, and I've never had it happen this bad even when there were 60ish people in."
What kind of download/upload rate is your server's connection running on? That seems like a lot of people unless you've got some heavy-duty speed and a router that can handle the throughput. Your download really doesn't matter, it's gonna be the upload that's your limiting factor.

So_1337
03-12-2008, 11:45 AM
"Server stats are:
OS: Windows XP Pro
AMD 3400+ CPU
3GB RAM
Dual Hard Drives for easy database backups
Cable Internet Connection"
Server box seems well enough, but even cable can have issues with enough concurrent connections. Hit http://www.speedtest.net/ on your server box (or any computer on the same LAN) and let us know what kind of upload you're getting.

trevius
03-12-2008, 11:05 PM
Ya, with my new router firmware, I can finally tell exactly how many connections are being used and how much bandwidth etc. It seems that with about 45ish people on, my upload utilization jumps around a bit, but goes as high as the mid 300s. I think my upload is supposed to be 512, but from speed tests in the past, I normally get in the 650 range.

So, getting more than 60 on is probably starting to push my limit on upload. Seems like there isn't much I can do about that, and I expected there to be a limit around there.
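
As a back-of-the-envelope check on those numbers, here is a small Python sketch. It assumes "mid 300s" means roughly 350 at 45 players, that all figures are in the same unit (presumably kbit/s, which the post doesn't state), and that upload scales linearly with player count. The function names are made up for illustration.

```python
def per_player_upload(total_upload: float, players: int) -> float:
    """Average upload used per connected player, in the caller's unit."""
    return total_upload / players


def max_players(upload_cap: float, per_player: float) -> int:
    """Largest whole number of players that fits under the upload cap."""
    return int(upload_cap // per_player)


# Assumption: "mid 300s" peak with "about 45ish" players is taken as 350 / 45.
per_player = per_player_upload(350.0, 45)

# Rated cap (512) versus what past speed tests suggested (650).
rated_limit = max_players(512.0, per_player)
measured_limit = max_players(650.0, per_player)
```

Under these assumptions the rated 512 cap works out to about 65 players at peak usage, and the 650 figure to about 83, so a ceiling somewhere past 60 is in line with the estimate above.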

But that is not the point of this thread. The point is mainly about the lag bug issue. I am still looking into it on my side. So far most of the tweaks I have done seem to have lowered player ghosts to a minimum and reduced the occurrences of the lag bug. Again, this is hard to confirm, since both are sporadic.

So_1337
03-13-2008, 01:11 AM
Right, but it's seeming more and more like your issues might be bandwidth related. I was having trouble keeping more than two groups connected on a business-class DSL connection (3 Mb down/700k up) that was mostly limited by a junky, off-the-shelf router whose throughput seemed to be the bottleneck. You seem much better off in regards to the router, but I'd bet dollars to donuts that you don't see this happening at times of low population (12 people and under).

It's exactly akin to the 'bugged' state we discussed earlier in the thread that is caused by missing op_codes and the client and server getting out of sync. It's much easier to miss an op_code or crucial piece of data if your server's connection can't push things to the client. We used to see people on my server getting bugged very often any time we had more than 12 (twelve!) characters logged in. Since we upgraded to our DS1 connection, we've never seen it happen except from spells/skills and op_codes, etc.

jasper
03-16-2008, 12:20 AM
So I'm still seeing this problem on other char types, not just my War as mentioned. I get it a lot.. and it seems like there is no reason for it.

Today I was just running along.. tried to con a mob and got no response. Did the /who all, which I use to see if the bug has happened.. and yup, was bugged. I know I did nothing special to cause it.. I was just running from one zone to another, EC to WC.

Joetuul
01-14-2009, 05:38 PM
I have recently come across this bug myself. Here are some details that I'm hoping will help the devs...

When I sit in the Nexus, I am fine for hours on end, but if I sit in Natimbi for too long this happens to me. Could it be a problem with specific zones? Is there a fix for this that just hasn't been posted? Even if I am actively doing something in Natimbi it still does it. I have to keep doing /who all just to make sure I am not bugged; if I am, I /q out and log back in just fine. No one else on my server has reported this happening to them.

Any help for me? Or am I SOL?

trevius
01-14-2009, 06:46 PM
This problem was resolved months ago in a code update. If you are running old code, the best solution would be to upgrade it to current (from the SVN).