Thread: lag problems
  #18  
Old 02-16-2019, 09:30 PM
Drakiyth
Dragon
 
Join Date: Apr 2012
Posts: 549

Quote:
Originally Posted by Akkadius
Again, these are completely unrelated.

Just because you saw a bunch of disk activity and a bunch of queries in a file doesn't mean that it's the reason for the lag. If you have an improperly tuned MySQL server, along with something enabled that is pegging your MySQL server, that is another thing, and I'm happy to help diagnose those with you.

I want you to contrast all of what you observed with PEQ's disk activity:

http://peq.akkadius.com:19999/#menu_...late;help=true

PEQ has over 800 players right now and stays at maybe around 1MB/s of writes, if at all, with occasional bursts. IO operations stay at a very, very low level even for 800 players.

Client::Save is a very light operation; there's maybe a handful of INSERTs or REPLACE INTOs that occur, which are all sub-10ms. We could use fewer Client::Save calls in general, but it really isn't the problem here.

You don't need to turn on the MySQL general log when you can see exactly what a zone process is doing by enabling MySQL logging at the process level. Even if you pipe the general log to another drive, it is still overhead for the MySQL process.

https://github.com/EQEmu/Server/wiki...-System#gm-say

In the `logsys_categories` table you can shut off any category you are piping to file
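
A rough sketch of what shutting off file output for one category could look like. This is an illustration only: it assumes the table has `log_category_description` and `log_to_file` columns, uses an in-memory SQLite table as a stand-in for the real MySQL database, and the category names are made up. Check your own schema before running anything like the UPDATE against a live server.

```python
import sqlite3


def make_demo_db():
    """Build an in-memory stand-in for logsys_categories (assumed columns)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE logsys_categories ("
        " log_category_id INTEGER PRIMARY KEY,"
        " log_category_description TEXT,"
        " log_to_console INTEGER,"
        " log_to_file INTEGER,"
        " log_to_gmsay INTEGER)"
    )
    # Hypothetical category names, both currently piped to file.
    conn.executemany(
        "INSERT INTO logsys_categories VALUES (?, ?, ?, ?, ?)",
        [(1, "Example Category A", 0, 1, 0), (2, "Example Category B", 0, 1, 0)],
    )
    conn.commit()
    return conn


def disable_file_logging(conn, category_description):
    """Set log_to_file to 0 for one category; return how many rows changed."""
    cur = conn.execute(
        "UPDATE logsys_categories SET log_to_file = 0 "
        "WHERE log_category_description = ?",
        (category_description,),
    )
    conn.commit()
    return cur.rowcount


def file_logging_levels(conn):
    """Return {description: log_to_file} for every category."""
    rows = conn.execute(
        "SELECT log_category_description, log_to_file FROM logsys_categories"
    )
    return dict(rows.fetchall())
```

The same UPDATE statement is the part that would carry over to MySQL, assuming the column names match your install.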

Back to the Network Issue

We know exactly what's going on with the network issue because we've taken CPU snapshot profiles during the problem. It's just not a quick "fix", and we typically choose to go through a very careful staged approach before reintroducing this into mainline because of the complex factors involved.

The reason we've seen this far less on PEQ is because PEQ has an OC'ed 5GHz core processor, DDR4 memory and NVMe datacenter SSDs. When a zone process goes into resend storm logic, it can keep up with the very aggressive resend logic just enough until the client either disconnects from their own terrible connection or the client itself recovers.

There is still a breaking point with our hardware, however; it just takes a lot more to get there. If we had over 100 toons in a zone on PEQ and something produced enough resend logic (like raid combat spam burning), it would trip the same inflection point that most folks are seeing on their Windows nodes at 20-40 people in a zone with 2.6GHz-ish processors and whatever else they're using on their boxes. Even with over 100 toons it is still very rare to see it, just because of the very tight hardware that is being utilized.
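
To make the "inflection point" idea concrete, here is a toy model. It is not the actual netcode: the function, its parameters, and every number used with it are invented for illustration. It only shows how unprocessed resend work stays at zero while per-tick capacity exceeds demand, and grows without bound once demand passes capacity.

```python
def simulate_backlog(clients, resends_per_client, capacity_per_tick, ticks):
    """Toy model: return the unprocessed resend backlog after each tick.

    Each tick, every client adds `resends_per_client` units of work and the
    zone process clears at most `capacity_per_tick` units. All values are
    hypothetical; nothing here is measured from a real server.
    """
    backlog = 0
    history = []
    for _ in range(ticks):
        backlog += clients * resends_per_client      # new resend work arrives
        backlog = max(0, backlog - capacity_per_tick)  # process what we can
        history.append(backlog)
    return history


# Same load, two made-up hardware capacities.
fast_box = simulate_backlog(clients=30, resends_per_client=10,
                            capacity_per_tick=500, ticks=5)
slow_box = simulate_backlog(clients=30, resends_per_client=10,
                            capacity_per_tick=200, ticks=5)
print(fast_box)  # backlog never builds up
print(slow_box)  # backlog grows every tick
```

In this sketch the faster box never falls behind, while the slower one accumulates 100 more units of work each tick, which is the kind of runaway the quoted description suggests.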

Regardless, you shouldn't need the above specs to run a server; that is not the point at all. The point is why we've not run into this issue up until now: most of our code QA goes through PEQ, and our hardware has been masking the problem. Before we released the new netcode overhaul to mainline we went through several iterations of issues and drastically improved our overall netcode utilization, which I am still super stoked about to this day. We just have this one issue plaguing people, and we will have it resolved soon, so stay tuned for updates.
Akkadius,

I just want to say that the Varlyndria players and I really appreciate everything you and the main EQ Devs are doing to fix this lag issue. I can only imagine the frustration it brings. One thing I have done for my hub zone is create public instances that players can travel to. This helps relieve congestion if lag starts occurring in the non-instanced zone. I encourage any server owner to do the same while this issue remains.

Here's to a quick recovery so we can all once again enjoy a solid number of players in the same zone with no issues.