View Full Version : zone segfaulting.


discore
04-24-2002, 02:18 PM
Hello. I am attempting to run an EQEmu server off of a Debian 2.4.17 machine.

Firstly I had a problem since my machine is multihomed and it wasn't getting the correct interface. The_Coder's patch fixed that nicely.

Now I can connect to the server, and create a character, but when I attempt to login the zone process segfaults and I see "Zone not available" in EQ.

Here is straced output of the zone binary:
<zone loads, no errors, and goes to sleep>
nanosleep({0, 1000000}, NULL) = 0
gettimeofday({1019699449, 383551}, NULL) = 0
nanosleep({0, 1000000}, Map: Maps\halas.map not found.
NULL) = 0
<insert less than a second of nanosleep() spam here>
nanosleep({0, 1000000}, NULL) = 0
gettimeofday({1019699449, 837093}, NULL) = 0
nanosleep({0, 1000000}, 0) = -1 EINTR (Interrupted system call)
--- SIGSEGV (Segmentation fault) ---
+++ killed by SIGSEGV +++

It happens right after it tries to load the nonexistent .map file. I have read that these files aren't meant to exist yet, so I assume that is supposed to happen. Looking at the Map.cpp source, it seems to make a clean enough exit if the map isn't found. I can't figure out why it would segfault instead.

Also, the world binary complains a bit:
Removing zoneserver from ip:127.0.0.1 port:51429 (166.70.x.x:7998)
Zoneserver send() failed, assumed disconnect: #7 127.0.0.1:51427 (166.70.x.x:7996)
send_socket=15, Status: -1, Errorcode: Broken pipe
Removing zoneserver from ip:127.0.0.1 port:51427 (166.70.x.x:7996)
Zoneserver send() failed, assumed disconnect: #8 127.0.0.1:51428 (166.70.x.x:7997)
send_socket=16, Status: -1, Errorcode: Broken pipe


Just to check, I assume this shell script will work fine for starting zones:
(discore@chalkdust /usr/local/eqemu)% cat startzones.sh
./zone . <eth0's ip> 7995 127.0.0.1 &
./zone . <eth0's ip> 7996 127.0.0.1 &
./zone . <eth0's ip> 7997 127.0.0.1 &
./zone . <eth0's ip> 7998 127.0.0.1 &
./zone . <eth0's ip> 7999 127.0.0.1 &

Like I said, all 5 load without any errors. I have tried multiple things in the 4th field: 'localhost', '<eth0's ip>', and such. /etc/hosts has all the proper info as well. <eth0's ip> also matches the worldaddress= in the .ini file. I removed the account= and password= lines from the .ini, as per The_Coder's instructions on the NAT thread, but I doubt that's relevant.

Any ideas?

theCoder
04-27-2002, 08:16 AM
I can't tell offhand what the problem is (I haven't had this problem, but I haven't tested 0.3.1 extensively -- finals coming up and all...)

Your world log is probably just reflecting the fact that the world server noticed the zone server had disappeared mysteriously (segfaulted) and was removing it from its list of zone servers.

The fourth parameter to zone should only affect how zone connects to the world server, so it can be anything that can resolve to the host world is running on (localhost, 127.0.0.1, public IP, internal IP, etc). The NAT patch binds to the interface specified in the second parameter (just checked to make sure).
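For anyone unsure what "binds to the interface" means here, this is roughly the idea: a minimal sketch, not the actual patch code. The helper name bind_to_address is made up for illustration, and the choice of a UDP socket is an assumption. The point is only that passing a specific address to bind(), instead of INADDR_ANY, pins the socket to one interface on a multihomed box.

```cpp
#include <arpa/inet.h>
#include <netinet/in.h>
#include <sys/socket.h>
#include <unistd.h>
#include <cstring>

// Hypothetical helper, NOT taken from the EQEmu source or the NAT patch.
// Binds a UDP socket (assumed for illustration) to one specific local IPv4
// address. Returns the socket descriptor, or -1 on failure.
int bind_to_address(const char* ip, unsigned short port) {
    sockaddr_in addr;
    std::memset(&addr, 0, sizeof(addr));
    addr.sin_family = AF_INET;
    addr.sin_port = htons(port);

    // inet_pton returns 1 only if ip is a valid dotted-quad IPv4 string.
    if (inet_pton(AF_INET, ip, &addr.sin_addr) != 1) {
        return -1;
    }

    int fd = socket(AF_INET, SOCK_DGRAM, 0);
    if (fd < 0) {
        return -1;
    }

    // Binding to a concrete address rather than INADDR_ANY means the kernel
    // will not pick the interface for us.
    if (bind(fd, reinterpret_cast<sockaddr*>(&addr), sizeof(addr)) != 0) {
        close(fd);
        return -1;
    }
    return fd;
}
```

If a call like bind_to_address("<eth0's ip>", 7995) fails, the address most likely does not belong to any local interface.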

I'm not sure what exactly is going on from the strace. It looks like there's a Sleep() call that is failing. Really, it's usleep which is called under the Linux build (see common/unix.cpp), which I guess calls nanosleep(?), which is failing. AFAIK, there are no direct calls in the code to either nanosleep or usleep (except through Sleep in unix.cpp). Based on the parameters to the call of nanosleep, it looks like the call was Sleep(1). There are 5 calls of Sleep(1) in zone/* and 2 in common/* (from grep). I'd bet that the problem is around one of them. Perhaps some debug printf's (or cout, as seems to be the style in the emu) would be helpful in determining where it's happening. I know there are several debug printouts already in the code, but you have to turn the debug level up (I think at least one is debug_level in common/EQPacketManager.cpp). However, I'm not sure that those would help here since they seem to be debugs for the networking code.
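Something along these lines is what I mean by debug prints around the Sleep(1) calls. This is only a sketch: the Sleep() below is my guess at what the wrapper in common/unix.cpp boils down to (milliseconds in, usleep out), and TracedSleep, TRACE_SLEEP and g_traced_sleeps are names I made up, not anything in the emu source.

```cpp
#include <cstdio>
#include <unistd.h>   // usleep

// Assumed shape of the Linux Sleep() wrapper: takes milliseconds and
// forwards to usleep(), which takes microseconds. The real code may differ.
void Sleep(unsigned int ms) {
    usleep(ms * 1000);
}

// Hypothetical counter, only here so the tracing can be checked.
static unsigned long g_traced_sleeps = 0;

// Hypothetical helper: report the call site on stderr, then sleep.
// The last line printed before the crash shows which Sleep() was reached.
void TracedSleep(unsigned int ms, const char* file, int line) {
    std::fprintf(stderr, "Sleep(%u) at %s:%d\n", ms, file, line);
    std::fflush(stderr);
    ++g_traced_sleeps;
    Sleep(ms);
}

// Swap Sleep(1) for TRACE_SLEEP(1) at each of the suspect call sites.
#define TRACE_SLEEP(ms) TracedSleep((ms), __FILE__, __LINE__)
```

Replacing each of the 7 Sleep(1) calls with TRACE_SLEEP(1) and watching stderr should narrow down which one is the last to run before the segfault.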

Also, are you running 0.3.1, or the patch 0.3.1.1? Or another older version?