EQEmulator Forums


cubber 03-14-2010 12:01 PM

Death causes LD, corpse loss, and zone segfault
 
Every time a character dies on my server, the zone crashes. The character comes back into the game in the same zone, near where they died, naked and without a corpse.

/var/log/messages shows this almost instantly upon character death:

Code:

Mar 14 11:49:59 eqsrv kernel: zone[14527]: segfault at 87c9000 ip 0815ac0e sp bfd2f78c error 4 in zone[8048000+319000]
I have tried compiling multiple SVN revision/PEQ DB combinations to no avail. I have gone all the way back to rev 1038 and up to 1265, and the problem still persists. I have also tried compiling the source on a different machine and then moving it over to the server.

Months back, when I originally updated to rev 1038, I did not have this issue, so I am thinking that a system update may have been the culprit. I run Gentoo x86 and update on a weekly basis. The machine has only what is necessary to run EQEmu, since it is a dedicated server.
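For readers unfamiliar with the kernel's segfault line: the number after "error" is the x86 page-fault error code, and zone[8048000+319000] gives the base address and size of the mapping that contains the faulting instruction. A minimal C++ sketch of how those fields can be decoded follows. The helper names decode_fault and offset_in_image are hypothetical and exist only for illustration.

```cpp
#include <cstdint>

// Decoded view of the x86 page-fault error code printed after "error".
struct FaultInfo {
    bool protection_violation;  // bit 0: 0 = page not present, 1 = protection fault
    bool write;                 // bit 1: 0 = read access, 1 = write access
    bool user_mode;             // bit 2: 1 = fault happened in user mode
};

// Split the error code into its three low flag bits.
FaultInfo decode_fault(unsigned error_code) {
    return FaultInfo{
        (error_code & 0x1u) != 0,
        (error_code & 0x2u) != 0,
        (error_code & 0x4u) != 0,
    };
}

// Offset of the faulting instruction inside the mapped image,
// i.e. ip minus the base printed in "zone[base+size]".
std::uint32_t offset_in_image(std::uint32_t ip, std::uint32_t base) {
    return ip - base;
}
```

For the line above, error 4 decodes to a user-mode read of a page that is not mapped, and ip 0815ac0e lies 0x112c0e bytes into the zone image.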

Congdar 03-14-2010 03:17 PM

Is there a difference between a character dying with bots and without bots?
What other logging messages do you have, like from /eqemu/logs/?

cubber 03-14-2010 03:27 PM

No difference. I have compiled the server with bots enabled and without. Either way, as soon as a character dies the zone segfaults.

The error logs don't show anything; I just found the one in /var/log/messages. I will compile a server again and pastebin some of my logs.

cubber 03-14-2010 04:02 PM

OK, this test run was with a clean PEQ 1265 DB and SVN rev 1265 compiled without bots.

Created a new character, loaded into glooming deep and attacked the first NPC I could find. Died and went LD.

/var/log/messages:

Code:

Mar 14 15:46:18 cronic kernel: zone[14518]: segfault at 88e5000 ip 0815abee sp bfdb8d2c error 4 in zone[8048000+319000]
I pasted the following logs with wgetpaste

eqemu_world.log:

http://dpaste.com/171904/

eqemu_zone.log:

http://dpaste.com/171906/

eqemu_error_zone.log

http://dpaste.com/171907/

eqemu_error_world.log

http://dpaste.com/171909/

The tail of eqemu_debug_world.log, since it was too large to paste and the rest just had the normal world boot stuff:

Code:

14505 [03.14. - 15:45:38] [WORLD__CLIENT] testuser: Character creation request from testuser LS#1 (192.168.42.30:48392) :
14505 [03.14. - 15:45:38] [WORLD__CLIENT] testuser: Name: Gonnadyealot
14505 [03.14. - 15:45:38] [WORLD__CLIENT] testuser: Race: 6  Class: 13  Gender: 1  Deity: 206  Start zone: 189
14505 [03.14. - 15:45:38] [WORLD__CLIENT] testuser: STR  STA  AGI  DEX  WIS  INT  CHA    Total
14505 [03.14. - 15:45:38] [WORLD__CLIENT]testuser:  60  80  90  75  83  134  60    582
14505 [03.14. - 15:45:38] [WORLD__CLIENT] testuser: Face: 4  Eye colors: 3 3
14505 [03.14. - 15:45:38] [WORLD__CLIENT] testuser: Hairstyle: 2  Haircolor: 11
14505 [03.14. - 15:45:38] [WORLD__CLIENT] testuser: Beard: 255  Beardcolor: 255
14505 [03.14. - 15:45:38] [WORLD__CLIENT] Validating char creation info...
14505 [03.14. - 15:45:38] [WORLD__CLIENT] Found 0 errors in character creation request
14505 [03.14. - 15:45:38] [WORLD__CLIENT] testuser: Current location: tutorialb  18.00, -147.00, 20.00
14505 [03.14. - 15:45:38] [WORLD__CLIENT] testuser: Bind location: tutorialb  18.00, -147.00, 20.00
14505 [03.14. - 15:45:38] [WORLD__CLIENT] testuser: Character creation successful: Gonnadyealot
14505 [03.14. - 15:45:43] [WORLD__CLIENT] testuser: Attempting autobootup of tutorialb (189:0)
14505 [03.14. - 15:45:44] [WORLD__ZONE] [5] Setting to 'tutorialb' (189:0)
14505 [03.14. - 15:45:44] [WORLD__CLIENT] testuser: Entering zone tutorialb (189:0)
14505 [03.14. - 15:45:44] [WORLD__ZONE] [5] [tutorialb] Broadcasting a world time update
14505 [03.14. - 15:45:45] [WORLD__ZONE] [5] [tutorialb] Setting to 'tutorialb' (189:0)
14505 [03.14. - 15:45:45] [WORLD__CLIENT] testuser: Sending client to zone tutorialb (189:0) at mydomain.net:7004
14505 [03.14. - 15:45:45] [WORLD__CLIENT] testuser: Client disconnected (not active in process)
14505 [03.14. - 15:46:19] [WORLD__ZONELIST] Removing zoneserver #5 at :7004
14505 [03.14. - 15:46:19] [WORLD__ZONELIST] Hold Zones mode is ON - rebooting lost zone
14505 [03.14. - 15:46:19] [WORLD__LAUNCH] zone: dynamic_01 reported state STOPPED (1 starts)
14505 [03.14. - 15:46:29] [WORLD__LAUNCH] zone: dynamic_01 reported state STARTED (2 starts)
14505 [03.14. - 15:46:29] [WORLD__ZONE] New TCP connection from 127.0.0.1:45555
14505 [03.14. - 15:46:29] [WORLD__CONSOLE] New zoneserver #6 from 127.0.0.1:45555
14505 [03.14. - 15:46:29] [WORLD__ZONE] [6] Zone started with name dynamic_01 by launcher zone
14505 [03.14. - 15:46:29] [WORLD__ZONE] [6] Auto zone port configuration.  Telling zone to use port 7005

and the tail of eqemu_debug_zone.log

Code:

14553 [03.14. - 15:46:29] [ZONE__INIT] Entering sleep mode
14553 [03.14. - 15:46:29] [NET__IDENTIFY] Registered patch 6.2
14553 [03.14. - 15:46:29] [NET__IDENTIFY] Registered patch Titanium
14553 [03.14. - 15:46:29] [NET__IDENTIFY] Registered patch SoF
14553 [03.14. - 15:46:29] [COMMON__THREADS] Main thread running with thread id -1223702832
14553 [03.14. - 15:46:29] [NET__WORLD] Connected to World: localhost:9000
14553 [03.14. - 15:46:29] [ZONE__WORLD] World indicated port 7005 for this zone.
14553 [03.14. - 15:46:29] [ZONE__INIT] Starting EQ Network server on port 7005
14553 [03.14. - 15:46:29] [COMMON__THREADS] Starting EQStreamFactoryReaderLoop with thread ID -1312003216
14553 [03.14. - 15:46:29] [COMMON__THREADS] Starting EQStreamFactoryWriterLoop with thread ID -1320395920

and the tail of the dynamic zone log for the zone I was in

Code:

[Debug] [ZONE__INIT] Loaded default rule set 'default'
[Debug] [ZONE__INIT] Loading Tasks
[Debug] [ZONE__INIT] Loading embedded perl XS
[Debug] [ZONE__INIT] Loading quests
[Quest] Starting Log: logs/eqemu_quest_zone.log
[Quest] Tying perl output to eqemu logs
[Quest] Creating EQEmuIO=HASH(0x8466468)
[Quest] Creating EQEmuIO=HASH(0x84667b0)
[Quest] Loading perlemb plugins.
[Quest] Loading perl commands...


cubber 03-14-2010 04:11 PM

Also to note I have gone as far as trying different versions of mysql as well as deleting my .ccache directory before building.

cubber 03-14-2010 04:18 PM

For anyone who may care, here is the world file for my Gentoo EQEmu server. These packages and their dependencies are all that is installed on this system besides the EQEmu server. Everything installed on the system is at the current stable version as of 3/12/2010.

Code:

app-admin/logrotate
app-admin/showconsole
app-admin/sudo
app-admin/syslog-ng
app-admin/tmpreaper
app-admin/tmpwatch
app-arch/rar
app-arch/unrar
app-arch/unzip
app-arch/zip
app-editors/vim
app-misc/screen
app-portage/gentoolkit
app-text/tree
app-text/wgetpaste
dev-db/mysql
dev-perl/IO-stringy
dev-perl/Net-Telnet
dev-util/ccache
dev-util/lafilefixer
dev-util/subversion
mail-client/mailx
net-analyzer/netselect
net-fs/nfs-utils
net-misc/dhcpcd
net-misc/netkit-telnetd
net-misc/ntp
sys-apps/pciutils
sys-apps/slocate
sys-boot/grub
sys-fs/lvm2
sys-kernel/genkernel
sys-kernel/gentoo-sources
sys-kernel/gentoo-sources:2.6.31-r6
sys-kernel/module-rebuild
sys-power/acpid
sys-process/vixie-cron

package.keywords

Code:

#NTP
net-misc/ntp caps

#Openldap
net-nds/openldap sasl

#Subversion
dev-util/subversion -apache2

And make.conf (EQEmu does not use this file when building; it only applies to packages installed via Portage):

Code:

# These settings were set by the catalyst build script that automatically
# built this stage.
# Please consult /usr/share/portage/config/make.conf.example for a more
# detailed example.
CFLAGS="-O2 -march=native -pipe -fomit-frame-pointer"
CXXFLAGS="${CFLAGS}"
# WARNING: Changing your CHOST is not something that should be done lightly.
# Please consult http://www.gentoo.org/doc/en/change-chost.xml before changing.
CHOST="i686-pc-linux-gnu"
MAKEOPTS="-j2"
LINGUAS="en"
VIDEO_CARDS="nvidia"
INPUT_DEVICES="keyboard mouse"
PORTAGE_NICENESS="19"
APACHE_MODULES=""
ALSA_CARDS=""
ALSA_PCM_PLUGINS=""
LCD_DEVICES=""
FEATURES="ccache"
CCACHE_DIR="/var/tmp/ccache"
CCACHE_SIZE="2G"

USE="-X -xorg -isdnlog -gtk -gtk2 -gnome -qt3 -qt4 -arts -kde -ipv6 mmx mysql pni sse sse2"


GENTOO_MIRRORS="http://www.gtlib.gatech.edu/pub/gentoo http://gentoo.osuosl.org/ "

SYNC="rsync://rsync.namerica.gentoo.org/gentoo-portage"


Derision 03-14-2010 04:34 PM

The first step in diagnosing a segfault in Linux is to enable core dumps in the shell in which you start eqlaunch:
Code:

ulimit -c unlimited
Then, once the zone crashes, you should have a core file, either just called 'core' or 'core.<process number>'.

Fire up gdb:
Code:

gdb <path to zone executable> <core file name>
Then once gdb has loaded up and gives you the (gdb) prompt, get a backtrace:
Code:

(gdb) bt
That should tell you the source file and line number that the crash occurred at, with a backtrace of how it got there. If you get that far, post the backtrace here and I'll take a look at it.
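The ulimit -c command adjusts the shell's RLIMIT_CORE resource limit, which processes started from that shell inherit. Where a launcher cannot conveniently be started from such a shell, roughly the same effect can be had from inside the process. The following is only a sketch assuming a POSIX system; enable_core_dumps is a hypothetical helper, not something in the EQEmu source, and it raises the soft limit to the hard limit rather than forcing "unlimited".

```cpp
#include <sys/resource.h>

// Raise the soft core-file size limit to the hard limit for this process.
// Returns true on success. Child processes started afterwards inherit the limit.
bool enable_core_dumps() {
    struct rlimit lim{};
    if (getrlimit(RLIMIT_CORE, &lim) != 0) {
        return false;
    }
    lim.rlim_cur = lim.rlim_max;  // the soft limit may not exceed the hard limit
    return setrlimit(RLIMIT_CORE, &lim) == 0;
}
```

If the hard limit itself is 0, no core file will be written either way, and the limit has to be raised by the administrator first.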

cubber 03-14-2010 04:51 PM

the core dump:

Code:

gdb zone core

warning: Can not parse XML syscalls information; XML support was disabled at compile time.
GNU gdb (Gentoo 7.0 p2) 7.0
Copyright (C) 2009 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.  Type "show copying"
and "show warranty" for details.
This GDB was configured as "i686-pc-linux-gnu".
For bug reporting instructions, please see:
<http://bugs.gentoo.org/>...
Reading symbols from /opt/eqemu/zone...done.
[New Thread 2054]
[New Thread 2040]
[New Thread 2046]
[New Thread 2053]

warning: Can't read pathname for load map: Input/output error.
Reading symbols from /usr/lib/gcc/i686-pc-linux-gnu/4.3.4/libstdc++.so.6...(no debugging symbols found)...done.
Loaded symbols for /usr/lib/gcc/i686-pc-linux-gnu/4.3.4/libstdc++.so.6
Reading symbols from /lib/libdl.so.2...(no debugging symbols found)...done.
Loaded symbols for /lib/libdl.so.2
Reading symbols from /usr/lib/libmysqlclient.so.15...(no debugging symbols found)...done.
Loaded symbols for /usr/lib/libmysqlclient.so.15
Reading symbols from /lib/libz.so.1...(no debugging symbols found)...done.
Loaded symbols for /lib/libz.so.1
Reading symbols from /lib/libcrypt.so.1...(no debugging symbols found)...done.
Loaded symbols for /lib/libcrypt.so.1
Reading symbols from /lib/libnsl.so.1...(no debugging symbols found)...done.
Loaded symbols for /lib/libnsl.so.1
Reading symbols from /lib/libm.so.6...(no debugging symbols found)...done.
Loaded symbols for /lib/libm.so.6
Reading symbols from /usr/lib/libssl.so.0.9.8...(no debugging symbols found)...done.
Loaded symbols for /usr/lib/libssl.so.0.9.8
Reading symbols from /usr/lib/libcrypto.so.0.9.8...(no debugging symbols found)...done.
Loaded symbols for /usr/lib/libcrypto.so.0.9.8
Reading symbols from /usr/lib/libperl.so.1...(no debugging symbols found)...done.
Loaded symbols for /usr/lib/libperl.so.1
Reading symbols from /lib/libpthread.so.0...(no debugging symbols found)...done.
Loaded symbols for /lib/libpthread.so.0
Reading symbols from /lib/libutil.so.1...(no debugging symbols found)...done.
Loaded symbols for /lib/libutil.so.1
Reading symbols from /lib/libc.so.6...(no debugging symbols found)...done.
Loaded symbols for /lib/libc.so.6
Reading symbols from /usr/lib/gcc/i686-pc-linux-gnu/4.3.4/libgcc_s.so.1...(no debugging symbols found)...done.
Loaded symbols for /usr/lib/gcc/i686-pc-linux-gnu/4.3.4/libgcc_s.so.1
Reading symbols from /lib/ld-linux.so.2...(no debugging symbols found)...done.
Loaded symbols for /lib/ld-linux.so.2
Reading symbols from /lib/librt.so.1...(no debugging symbols found)...done.
Loaded symbols for /lib/librt.so.1
Reading symbols from /lib/libnss_files.so.2...(no debugging symbols found)...done.
Loaded symbols for /lib/libnss_files.so.2
Reading symbols from ./libEMuShareMem.so...done.
Loaded symbols for ./libEMuShareMem.so
Core was generated by `./zone dynamic_04 zone'.
Program terminated with signal 11, Segmentation fault.
#0  0x0815abee in CRC32::Update (
    buf=0x815aca7 "\367\320\203\304\f\303\220WVS\203\354 \213t$0\213\\$4\307D$\034\377\377\377\377\203\373\003v\025\205\366t\021\271\004", bufsize=141720844,
    crc32=91) at ../common/crc32.cpp:247
247            );
(gdb)


Derision 03-14-2010 05:02 PM

It appears to be crashing in the assembler in common/crc32.cpp.

Maybe a GCC version related issue (I use 4.1.1 without issue).

A quick thing to try rather than downgrading your GCC version would be to edit common/crc32.cpp and change line 175 from:
Code:

#elif defined(i386)
to
Code:

#elif defined(i386xxx)
and recompile, so it should fall back to using the C version of the CRC code instead of the assembler.
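For context, a portable "C version" of a CRC-32 routine is typically a table-driven loop like the sketch below. This is not the code from common/crc32.cpp. The polynomial (0xEDB88320), the initial value, the final inversion and the function names are all assumptions chosen to match the common CRC-32 convention, and the project's own table and conventions may differ.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>

// Build the 256-entry lookup table for the reflected polynomial 0xEDB88320.
static std::array<std::uint32_t, 256> make_crc_table() {
    std::array<std::uint32_t, 256> table{};
    for (std::uint32_t i = 0; i < 256; ++i) {
        std::uint32_t c = i;
        for (int bit = 0; bit < 8; ++bit) {
            c = (c & 1u) ? (0xEDB88320u ^ (c >> 1)) : (c >> 1);
        }
        table[i] = c;
    }
    return table;
}

// Fold `size` bytes into a running CRC value, one table lookup per byte.
std::uint32_t crc32_update(const unsigned char* buf, std::size_t size, std::uint32_t crc) {
    static const std::array<std::uint32_t, 256> table = make_crc_table();
    for (std::size_t i = 0; i < size; ++i) {
        crc = table[(crc ^ buf[i]) & 0xFFu] ^ (crc >> 8);
    }
    return crc;
}

// Conventional CRC-32 of a whole buffer: start at all ones, invert at the end.
std::uint32_t crc32_buffer(const unsigned char* buf, std::size_t size) {
    return crc32_update(buf, size, 0xFFFFFFFFu) ^ 0xFFFFFFFFu;
}
```

A loop like this relies on no particular registers, so it does not depend on how a given GCC version handles inline-assembly clobbers. The cost is some speed compared with hand-written assembler.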

cubber 03-14-2010 05:04 PM

I am using sys-devel/gcc-4.3.4. I will try that change and post my results.

snorkle 03-14-2010 05:09 PM

I had to make some changes to get the old VZTZ source to work in Linux. I'm not sure what revision their source was based on but I ended up scrapping it for 8.0. Here's one of the things I had to change to get it to work with newer gcc versions:

Code:

vztzfebsource-read-only/common$ svn diff crc32.cpp
Index: crc32.cpp
===================================================================
--- crc32.cpp  (revision 7)
+++ crc32.cpp  (working copy)
@@ -112,6 +112,14 @@
 #undef i386    //darwin seems to think we are generating PIC, and we clobber ebx
 #endif

+/* Some 64bit systems do not like the i386 assembly code below. However, some 64bit
+  systems do work with the assembly code below. We #undef i386 to be on the safe
+  side if we are compiling 64bit. */
+
+#ifdef __x86_64__
+#undef i386
+#endif
+
 uint32 CRC32::Update(const int8* buf, uint32 bufsize, uint32 crc32) {
 #if defined(WIN32)
    // Register use:
@@ -167,8 +175,8 @@
 #elif defined(i386)
        register uint32  val __asm ( "ax" );
        val = crc32;
-
 __asm __volatile (
+      "push  %%ebx\n"
        "xorl  %%ebx, %%ebx\n"
        "movl  %1, %%esi\n"
        "movl  %2, %%ecx\n"
@@ -232,9 +240,10 @@
        "xorb  2(%%esi), %%bl\n"
        "xorl  (%%edi,%%ebx,4), %%eax\n"
    "2:\n"
+      "pop  %%ebx\n"
        :
        : "a" (val), "g" (buf), "g" (bufsize)
-      : "bx", "cx", "dx", "si", "di"
+      : "cx", "dx", "si", "di"
    );

    return val;


cubber 03-14-2010 05:13 PM

This worked!

Quote:

Originally Posted by Derision (Post 185292)
A quick thing to try rather than downgrading your GCC version would be to edit common/crc32.cpp and change line 175 from #elif defined(i386) to #elif defined(i386xxx) and recompiling, so it should fall back to using the C version of the CRC code instead of the assembler.

Thanks a bunch!


All times are GMT -4.