Fix for the Windows Server lag issue
Hello. Regarding the lag issues on some Windows platforms (http://www.eqemulator.org/forums/showthread.php?t=42311): it looks like, for some reason, Windows Server trips over itself when libuv is run in a while loop with UV_RUN_NOWAIT, and is much happier with UV_RUN_DEFAULT. Here's the commit: https://github.com/peterigz/Server/c...d9354b846bddb0
I'm currently running this on The Hidden Forest and all lag issues have now been resolved. Before, 18 in zone would lag heavily for us and drop clients; so far we've had 50+ in zone with no problem. Hopefully we can stress test with more soon, but either way it's way better than it was. I'll give it some more time in case any other issues arise, then I can do a pull request. I can't check Linux, so maybe someone can build on that platform to confirm everything is still OK there; I can't see any reason why it wouldn't be. |
The libuv documentation for UV_RUN_NOWAIT says:
"Poll for i/o once but don’t block if there are no pending callbacks. Returns zero if done (no active handles or requests left), or non-zero if more callbacks are expected (meaning you should run the event loop again sometime in the future)." We might just need to actually check the return value and run the event loop sooner. |
I'm assuming that for anyone using older code (a few months old or more), where zone/main.cpp did not exist, this diff would be applied to zone/net.cpp instead? Would that work?
|
I think you'd probably be OK to do that, Huppy. You just have to be careful to merge in only the relevant changes; I'm not sure how much has changed in net/main.cpp since your version.
|
Anyone else tested this yet?
|
Nope, but it would be cool if, assuming it works, it gets pushed to the main source soon.
|
Quote:
Thanks for investigating this; it is always appreciated when people take the initiative to improve the codebase for the community. The UV library definitely polls the underlying systems differently on each platform, so it would make sense that this causes, to some degree, the symptoms we've been seeing on Windows. The sleeps were in place because KLS was still going through refactoring at the time, and they can probably be removed. As far as testing goes, I'd like to see other populated Windows servers give this a try, and then test it on a branch via PEQ. I see no reason not to at least run the loop in the default mode, which is what we should be doing anyway; for the reasons above, KLS left it as is for the time being. I know you mentioned that it seemed to work after your changes, but we have also had confirmed fixes on Windows servers with over 100 people in a zone, and then eventually someone runs into the symptoms again. So, to be sure this is the issue we were looking into, I'd just want a few others to validate it. I think most servers with 30+ (subtract p99) are on Linux currently, so I'm not sure who could test right now, but either way they can report their findings here and we can go from there. Nice find! |
I'm not one who likes to patch changes into files by hand. If there were a branch with this fix in it, I'd mess with it.
|
I have a windows server so if I get time this weekend I'll give it a run.
Edit: Pushed the changes to the server this morning. It will probably be tonight before enough people are online to see results. |
30+ in zone, plus pets and 24 bots. No lag. As Peter found, it actually seems to be running a little better.
|