EQEmulator Forums


joedit 05-04-2021 05:55 PM

Strange server behavior
 
Hi all,

My private server ran fine for 3 months. For the past 2 weeks, some of my characters have been losing connectivity to the server while playing. It now happens almost every time I have about 18-24 toons (MQ2-E3) raiding in a zone. When the toons are standing around doing nothing, the connection is fine, but soon after the raid engages, all of those characters lose connectivity after pulling/killing a few monsters.

However, at the same time, I checked the network connection to the server and it was fine, and all other characters not in the same raid/zone were still connected to the server without issues. At any given time there were no more than 36 characters connected to the server.

I ran server_status.sh and it returned with:

Akka's Linux Server Launcher
World: UP Zones: (1093/30) UCS: UP Queryserv: UP

I have never seen the zones showing anything but 30/30. It has always been 30/30 for me. I ran htop and noticed the memory utilization was very close to maxing out, so the response time was quite slow, but the machine and the eqemulator software were still running, and my characters in other zones were still fine.

The characters that lost connectivity were kicked to the character selection screen, and I was not able to log them back in. I tried logging in after rebooting the client computer, but all the characters from the raid zone were stuck at the character selection screen, while characters that had not been in the raid were able to log in fine. I have to reboot the server in order to log in the stuck characters.

I don't think a zone count of (1093/30) is normal. I tried building a new server and restoring the database, but I saw the same behavior on the newly built server with the existing database restored. Does anyone have an idea about what is going on, or can anyone point me in the right direction for troubleshooting the issue?

**Additional info**
I checked the logs, and the zone has been crashing with a database error. Any thoughts?

[04-24-2021 :: 22:03:51] [Crash] [ZoneServer] [Info] Database Error: Lost connection, attempting to recover

Thanks.

iraxion 05-21-2021 11:31 AM

I can't help with the actual issue (I guess the zone crashes for some reason or other when you begin your raid for real). But I had this
Quote:

I ran server_status.sh and it returned with:

Akka's Linux Server Launcher
World: UP Zones: (1093/30) UCS: UP Queryserv: UP
happen on my machine here as well. Whenever a zone crashed, the number of "zone" processes shot up (and after a while they would consume all available memory). I think I have found the culprit. To get rid of that you might try the following patch

Code:

*** common/crash.cpp.orig      Thu Dec 31 19:27:00 2020
--- common/crash.cpp    Mon Jan 11 16:28:22 2021
***************
*** 135,140 ****
--- 135,145 ----
        name_buf[readlink("/proc/self/exe", name_buf, 511)] = 0;
        int child_pid = fork();
        if (!child_pid) {
+              signal(SIGABRT, SIG_DFL);
+              signal(SIGFPE, SIG_DFL);
+              signal(SIGFPE, SIG_DFL);
+              signal(SIGSEGV, SIG_DFL);
+
                int fd = open(temp_output_file.c_str(), O_RDWR | O_CREAT, S_IRUSR | S_IWUSR);

                dup2(fd, 1); // redirect output to stderr

and recompile.
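
A possible reason the patch helps, offered as a reading of the diff and not as confirmed EQEmu behavior: signal handlers installed with signal() are inherited by a child created with fork(). If the forked helper hits one of those signals before gdb takes over, the inherited crash handler would run again inside the child, where it could fork yet another helper. Resetting the signals to SIG_DFL in the child lets it simply die instead. (The patch as posted lists SIGFPE twice; the repeat is harmless.) The sketch below is not EQEmu code. The names crash_handler and child_outcome are hypothetical, and the handler only exits with a marker code so the difference is easy to observe on Linux.

```cpp
#include <csignal>
#include <string>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

// Hypothetical stand-in for a crash handler. A real handler might fork a
// helper process; this one only exits with a marker code.
extern "C" void crash_handler(int) {
    _exit(42);
}

// Installs the handler, forks a child that raises SIGABRT, and reports how
// the child ended. reset_in_child mirrors the idea of the patch: restore the
// default disposition in the child before anything else can go wrong.
std::string child_outcome(bool reset_in_child) {
    std::signal(SIGABRT, crash_handler);

    pid_t pid = fork();
    if (pid < 0) {
        std::signal(SIGABRT, SIG_DFL);
        return "fork failed";
    }
    if (pid == 0) {
        if (reset_in_child) {
            std::signal(SIGABRT, SIG_DFL);
        }
        std::raise(SIGABRT);  // simulate the child crashing
        _exit(0);             // only reached if the signal did not end us
    }

    int status = 0;
    waitpid(pid, &status, 0);
    std::signal(SIGABRT, SIG_DFL);  // leave the caller's process clean

    if (WIFSIGNALED(status)) {
        return "killed by signal";  // default action: the child just dies
    }
    if (WIFEXITED(status) && WEXITSTATUS(status) == 42) {
        return "handler ran";       // inherited handler ran in the child
    }
    return "other";
}
```

Calling child_outcome(false) shows the inherited handler running inside the child, while child_outcome(true) shows the child being terminated by the signal with no handler involved.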

If you would like to have stack traces for crashes in the log files, I recommend making sure that "sudo gdb" actually works, without asking for a password, for the user running eqemu.
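
One way to check this ahead of time is to run sudo in non-interactive mode, so that it fails instead of prompting. The sketch below assumes sudo and gdb are on the PATH of the user running eqemu and relies on sudo's -n option. The helper names command_succeeds and passwordless_sudo_gdb_works are hypothetical.

```cpp
#include <cstdlib>
#include <string>
#include <sys/wait.h>

// Hypothetical helper: runs a shell command with its output discarded and
// reports whether it exited with status 0.
bool command_succeeds(const std::string& command) {
    const std::string quiet = command + " >/dev/null 2>&1";
    int status = std::system(quiet.c_str());
    return status != -1 && WIFEXITED(status) && WEXITSTATUS(status) == 0;
}

// Assumption: with -n, sudo exits with an error if a password would be
// required, so success here suggests "sudo gdb" works without a prompt.
bool passwordless_sudo_gdb_works() {
    return command_succeeds("sudo -n gdb --version");
}
```

Run passwordless_sudo_gdb_works() as the same user that runs the server; a false result suggests the crash handler would not be able to launch gdb through sudo unattended.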

(I've been lurking here for quite a while now... just running a small family-type server myself, mostly vanilla eqemu+peq with very few customizations. Question from my side: how do I contribute bugfixes to the code, or to peq data, if I have any (like the above)? Just post on the forums? I tried to find a "how to submit bugfix" on the forums and on the wiki but didn't find any... well maybe I didn't look closely enough. :) If there is such a thing please just point me there. TIA!)

