Zone Threads
I'm trying to track down the LD (link dead) issues encountered when a server is under load stress. It could be any number of things, so I might as well start at the bottom.
Here's a breakdown of the threads in EQEMu's zone servers. If you see something that should be shifted, or potential improvements to the logic, please comment.

Main Thread
Function: main()

TCP Thread
Function: TCPConnectionLoop()

UDP Thread
Function: EQNetworkServerLoop()

Async TCP Thread
One optimization I have seen so far is on the UDP Thread.

Currently, we are receiving data on the socket and blocking the rest of the thread's work until that packet has been decrypted. I believe this could be optimized by pushing the processing onto another thread and allowing the UDP thread to continue its socket operations. Thoughts?
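In case it helps the discussion, here's a rough sketch of the kind of hand-off I mean (this is not the current zone netcode; RawPacket, PacketQueue, and DecryptAndDispatchLoop are made-up names): the socket thread does nothing but recvfrom() and a queue push, and a separate worker thread pulls packets off the queue and does the decrypt/CRC work.

Code:
// Sketch only -- not the actual zone netcode. The socket thread pushes raw
// datagrams; a worker thread does the expensive decrypt/dispatch work.
#include <condition_variable>
#include <cstdint>
#include <mutex>
#include <queue>
#include <utility>
#include <vector>

struct RawPacket {
    std::vector<uint8_t> data;   // bytes exactly as they came off the wire
};

class PacketQueue {
public:
    void Push(RawPacket p) {
        {
            std::lock_guard<std::mutex> lock(mtx_);
            queue_.push(std::move(p));
        }
        cv_.notify_one();                 // wake the worker
    }

    RawPacket Pop() {                     // blocks until a packet is available
        std::unique_lock<std::mutex> lock(mtx_);
        cv_.wait(lock, [this] { return !queue_.empty(); });
        RawPacket p = std::move(queue_.front());
        queue_.pop();
        return p;
    }

private:
    std::queue<RawPacket> queue_;
    std::mutex mtx_;
    std::condition_variable cv_;
};

// Worker thread body: decryption, CRC checks, and dispatch all happen here,
// so the socket thread never stalls on them.
void DecryptAndDispatchLoop(PacketQueue& q) {
    for (;;) {
        RawPacket p = q.Pop();
        // ... decrypt p.data, verify it, hand it to the zone logic ...
        (void)p;
    }
}

// The UDP thread's loop then shrinks to: recvfrom() into its buffer, copy the
// received bytes into a RawPacket, q.Push(std::move(packet)), repeat.

The cost is the extra copy into the RawPacket; whether that copy is cheaper than doing the decryption inline on the socket thread is the thing worth measuring.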
Re: Zone Threads
Hey! This means that EQEMu won't be efficient on Token Ring or FDDI! Bah! :P

Regards, krich
1) Shouldn't the /who all only be refreshed per call to that, or is that reference for server-side use, so the server itself knows who is online?
2) Wouldn't it be easier to keep a single (or multiple) virtual connection(s) alive (statically) and route through that, than to keep opening one per packet and then closing it?
Async TCP Thread
If I remember correctly, this is a brief thread that handles the connecting process of the TCP socket and ends once the connection is established; I thought it was no longer used, however.

Async DB Thread
Allows database queries to be run without blocking the main thread.

Reference: http://msdn.microsoft.com/msdnmag/issues/1000/Winsock/
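For what it's worth, the shape of the async-DB idea looks something like this in modern C++ terms (a sketch only; RunQuery and the placeholder SQL are hypothetical stand-ins, not the real EQEMu database API): the main loop fires the query at a worker and polls for the result instead of blocking on it.

Code:
// Sketch of off-thread database work: the main loop launches a query on a
// worker thread and polls for the result instead of blocking on it.
#include <chrono>
#include <future>
#include <string>
#include <vector>

using Row = std::vector<std::string>;

// Hypothetical stand-in for a blocking call into the MySQL client library.
std::vector<Row> RunQuery(const std::string& sql) {
    (void)sql;
    return {};                                   // stub result
}

void MainLoopTick() {
    static std::future<std::vector<Row>> pending;

    // Fire the query off to another thread; the main loop keeps running.
    if (!pending.valid()) {
        pending = std::async(std::launch::async, RunQuery,
                             std::string("SELECT ..."));   // placeholder SQL
    }

    // Poll without blocking; only consume the result once it's ready.
    if (pending.wait_for(std::chrono::seconds(0)) == std::future_status::ready) {
        std::vector<Row> rows = pending.get();
        // ... use rows (e.g. build a response packet) ...
        (void)rows;
    }
}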
Most people tweak their MTUs to reduce fragmentation, and you're right, most are a lot lower than 1500.
I'm not sure how Windows handles it, but I think 576 is considered an 'optimized' MTU for a dialup connection.
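As a quick back-of-the-envelope illustration of the fragmentation cost (assuming a plain 20-byte IPv4 header and the rule that every fragment payload except the last must be a multiple of 8 bytes):

Code:
// Rough fragment count for a given IP payload size and link MTU.
#include <cstdio>

int CountFragments(int ip_payload_bytes, int mtu) {
    const int per_fragment = ((mtu - 20) / 8) * 8;   // usable payload per fragment
    return (ip_payload_bytes + per_fragment - 1) / per_fragment;
}

int main() {
    // A full 1500-byte Ethernet-sized packet (1480 bytes of payload after the
    // IP header) crossing a 576-byte-MTU dialup link:
    std::printf("%d fragments\n", CountFragments(1480, 576));   // prints 3
    return 0;
}

So a full-size 1500-byte packet crossing a 576-byte-MTU link turns into three fragments, and losing any one of them costs the whole datagram.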
Note: I'm not 100% sure of any of this, and it's being written rather hastily since I'm busy, but meh. I also haven't had time to look over the net code.

The 1518-byte recvfrom() buffer is fine. This buffer is for what the server is receiving, not what it is sending. Even if someone is using a higher MTU than normal, the packet may still be getting fragmented by a router along the path(?). Reading the MTU of the server really does you no good either; you would need to know the MTU of the clients. I also believe the netcode reuses this buffer for every packet. It would be bothersome to figure out which connection the packet belongs to, then allocate the buffer size you want and read the packet in. Also note that I can only think of a handful of packets that even come close to this size. On the server-sending side: items, mass spawns, player profile, guild list, /who, petitions, and maybe some of the GM commands. As for packets approaching this size that the client sends, the only one that comes to mind is /bug, and obviously it is not used often.
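For anyone following along, the receive pattern being described is roughly this (a simplified sketch, not the actual EQEMu source; HandlePacket is a hypothetical stand-in):

Code:
// Sketch of the receive pattern being described. A single static buffer,
// sized to a full Ethernet frame (1518 bytes), is reused for every datagram;
// only after recvfrom() do we know which connection the packet belongs to.
#include <sys/socket.h>
#include <netinet/in.h>
#include <cstdint>

void ReceiveLoop(int udp_socket) {
    static uint8_t buffer[1518];          // oversized is harmless: it's static,
                                          // and only 'len' bytes are ever used
    sockaddr_in from{};
    socklen_t from_len = sizeof(from);
    for (;;) {
        ssize_t len = recvfrom(udp_socket, buffer, sizeof(buffer), 0,
                               reinterpret_cast<sockaddr*>(&from), &from_len);
        if (len <= 0)
            continue;
        // Look up the connection by 'from' (IP:port), then hand it exactly
        // 'len' bytes -- the unused tail of the buffer is never copied.
        // HandlePacket(from, buffer, len);   // hypothetical dispatch call
        from_len = sizeof(from);          // reset for the next call
    }
}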
I believe the only pertinent MTU in any connection is the one between *here* and *there* :p.

To be more precise, here := the node yer on and there := the next hop (usually your nearest friendly router). Along the way, even if the packet traverses 23 hops or whatever, each point-to-point MTU can be different. Hence fragmentation, the flags to deal with it, and the entire TCP suite. But when coding, one needn't worry about the rest of the net, just your neighbor. You can reasonably expect each hop to be as optimized MTU-wise as its admin can work out.

Anywhoo, I still think it might be worthwhile to test one server with an odd MTU between it and its default gateway, and see if that server, when used, produces *way* too many fragmented packets, or just a 'reasonable' amount, whatever you want that reasonable threshold to be. I don't know your code like you do, but kathgar was right in focusing on the packet types the server deals with that are larger than said MTU. Smaller ones don't matter, of course. So if you do a /who all and it generates a packet, say, 2k large, the buffer identified by merth might cause problems due to the CPU time required to reassemble the fragments and due to the various bits of network lag that slow down the receipt of all the frags of a given packet. Gotta wait for em all to be there, and all. Just food for thought. Enjoy the meal. And I'm tired and likely blithering on too long to folks who already got the gist of the post... g'nite y'all.
Again, that is the size of the buffer for *RECEIVING* packets, NOT *SENDING* packets. The command the client sends for /who all is NOT that big. The RESPONSE, which would NOT be in this buffer, would be much larger (depending on your admin level and the number of players). Also note that this is all UDP, with custom control code.
Actually the EQ network layer never allows a packet bigger than ~550-ish bytes, so you could lower it to that. ;p (Probably so that there would never be IP-layer fragmentation on dialup connections.) But it's irrelevant; there's no downside to having it oversized (wasting what, 1k per zone load? It's a static buffer, and the oversize isn't passed along).

Thinking about this, the fix is probably to change the socket to blocking mode, make another thread, and leave it blocked on recv() forever. That's probably the Linux way of handling this, and it should work on Windows too - however, I'm guessing it'll cause problems on zone shutdown: getting that recv() call to unblock for a reason other than incoming data. And technically krich was right, PPP or PPPoE frames != "ethernet frame". ;p
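On the shutdown worry: one common workaround (sketch only, POSIX-style; on Windows SO_RCVTIMEO takes a DWORD of milliseconds rather than a timeval) is to give the socket a receive timeout so the blocked recv() wakes up periodically and re-checks a shutdown flag.

Code:
// Sketch of a dedicated blocking-recv() thread that can exit cleanly at
// zone shutdown, using a receive timeout plus an atomic run flag.
#include <atomic>
#include <cerrno>
#include <cstdint>
#include <sys/socket.h>
#include <sys/time.h>

std::atomic<bool> g_zone_running{true};

void BlockingRecvLoop(int udp_socket) {
    timeval tv{};
    tv.tv_sec = 1;                       // wake at least once per second
    tv.tv_usec = 0;
    setsockopt(udp_socket, SOL_SOCKET, SO_RCVTIMEO, &tv, sizeof(tv));

    uint8_t buffer[1518];
    while (g_zone_running.load()) {
        ssize_t len = recv(udp_socket, buffer, sizeof(buffer), 0);
        if (len < 0) {
            if (errno == EAGAIN || errno == EWOULDBLOCK)
                continue;                // timeout: just re-check the flag
            break;                       // real socket error
        }
        // ... hand the packet off for decryption/dispatch ...
    }
}

// Zone shutdown just flips the flag; the thread exits within a second.
// void ShutdownZone() { g_zone_running.store(false); }

The other usual trick is to have the main thread send the socket a one-byte dummy datagram addressed to its own bound port so recv() returns immediately; the timeout is just the simplest mostly-portable option.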
Linux has both POSIX.4 async I/O and a completion-ports port. I also think that read() is non-reentrant and we should use mmap()... I don't know; I'm sick, this is a quick post, and I don't have my resources available at the moment.
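For reference, POSIX ("POSIX.4") AIO looks roughly like this (a minimal sketch; it's really aimed at files, and on Linux glibc services the request from a helper thread anyway, so whether it buys anything for sockets is debatable):

Code:
// Minimal POSIX AIO read sketch (link with -lrt on older glibc).
#include <aio.h>
#include <cstdio>
#include <fcntl.h>
#include <unistd.h>

int main() {
    int fd = open("/etc/hostname", O_RDONLY);    // any readable file
    if (fd < 0)
        return 1;

    char buffer[256];
    aiocb cb{};
    cb.aio_fildes = fd;
    cb.aio_buf    = buffer;
    cb.aio_nbytes = sizeof(buffer);
    cb.aio_offset = 0;

    if (aio_read(&cb) != 0)                      // queue the read; returns immediately
        return 1;

    // ... the thread is free to do other work while the read is in flight ...

    const aiocb* const list[] = { &cb };
    aio_suspend(list, 1, nullptr);               // wait for completion
    ssize_t n = aio_return(&cb);                 // bytes read (or -1 on error)
    if (n > 0)
        std::fwrite(buffer, 1, static_cast<size_t>(n), stdout);

    close(fd);
    return 0;
}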