INFO: Core Files (core.####)


Xorith
11-21-2004, 07:23 AM
Hello,

I'm an experienced Linux programmer, and one of the major tools for developers is a bug report of any kind. One way to tell that a bug exists is by finding core files in the directory you run your executables from.

A core file usually looks like this:

eqemu@laguz:~/wicked/bin$ ls -la
total 60160
drwxr-xr-x 6 eqemu eqemu 616 Nov 21 13:41 .
drwxr-xr-x 3 eqemu eqemu 72 Nov 17 06:56 ..
-rw-r--r-- 1 eqemu eqemu 692 Nov 17 08:01 LoginServer.ini
drwxr-xr-x 2 eqemu eqemu 7920 Nov 16 03:42 Maps
-rw-r--r-- 1 eqemu eqemu 1310 Nov 17 07:40 addon.ini
-rwxr-xr-x 1 eqemu eqemu 16227 Nov 17 06:56 cleanipc
-rw-r--r-- 1 eqemu eqemu 233 Nov 20 00:04 commands.pl
*** -rw------- 1 eqemu eqemu 8359936 Nov 19 21:34 core.6379
*** -rw------- 1 eqemu eqemu 32518144 Nov 20 10:45 core.7013
*** -rw------- 1 eqemu eqemu 3334144 Nov 20 10:45 core.7017
-rw-r--r-- 1 eqemu eqemu 249 Nov 17 07:40 db.ini
-rw-r--r-- 1 eqemu eqemu 29 Nov 21 04:36 eqtime.cfg
-rwxr-xr-x 1 eqemu eqemu 1435488 Nov 21 13:41 libEMuShareMem.so
drwxr-xr-x 2 eqemu eqemu 712 Nov 18 18:49 logs
-rw-r--r-- 1 eqemu eqemu 3266 Oct 2 16:25 plugin.pl
drwxr-xr-x 5 eqemu eqemu 152 Nov 18 18:45 quests
-rwx------ 1 eqemu eqemu 38 Nov 17 07:50 shutdown
-rw-r--r-- 1 eqemu eqemu 3778243 Nov 17 08:42 spells_us.txt
-rwxr-x--- 1 eqemu eqemu 991 Nov 19 14:09 startup
-rwxr-xr-x 1 eqemu eqemu 4319085 Nov 21 13:41 world
-rwxr-xr-x 1 eqemu eqemu 21563191 Nov 21 13:41 zone
eqemu@laguz:~/wicked/bin$

(Note the lines starting with ***)

The file name format is core.####, where the number is the process ID of the program that crashed.
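
As a small sketch of that naming scheme, the shell function below pulls the process ID back out of a core file name. The name core_pid is made up for illustration, and it assumes the core.#### pattern shown above.

```shell
# core_pid: print the PID embedded in a core file name such as core.6379.
# Hypothetical helper; assumes the "core.<pid>" naming described above.
core_pid() {
    name=$(basename "$1")
    case "$name" in
        core.|core.*[!0-9]*) return 1 ;;   # suffix is empty or not purely numeric
        core.*) printf '%s\n' "${name#core.}" ;;
        *) return 1 ;;                     # not a core.<pid> name at all
    esac
}

# Example: list every core file in the current directory with its PID.
# for f in core.*; do
#     [ -e "$f" ] && echo "$f came from PID $(core_pid "$f")"
# done
```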

As you can see, they're sometimes fairly large. That's because a core file is a dump of everything the program was doing, from its memory to the functions it was in the middle of. It's a snapshot of the entire process at the moment it crashed.

Core files are useful because they help a developer find where a crash originated. To get at this information, you use a tool commonly found on Linux systems called 'gdb'.

GDB stands for GNU Debugger, and it lets you examine what a program was doing right when it crashed. You can usually find the exact part of the code that caused the crash, though this isn't always the case.

The syntax for GDB is:
eqemu@laguz:~/wicked/bin$ gdb --core=core.12345 ./zone

Note that you have to tell GDB what program you believe crashed. It won't always be the zone program, but the best way to figure it out is to check your log files. You can do this with the 'tail' command.

eqemu@laguz:~/wicked/bin/logs$ tail -10 world.log

The above will show you the last 10 lines of your world log, assuming it's called 'world.log'. If you shut the server down yourself, you should see something that says 'Got signal', followed by a number. If instead it looks like the log of a server that's still running, there's a good chance it crashed.
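
If you'd rather not eyeball the log every time, the same check can be sketched as a shell function. The name log_shows_signal is made up here, and it only looks for the 'Got signal' text mentioned above, so treat it as a rough hint rather than proof of anything.

```shell
# log_shows_signal: succeed if the last N lines of a log mention 'Got signal'.
# Hypothetical helper; N defaults to 10, matching the tail example above.
log_shows_signal() {
    log_file=$1
    lines=${2:-10}
    tail -n "$lines" "$log_file" | grep -q 'Got signal'
}

# Example:
# if log_shows_signal logs/world.log; then
#     echo "world shut down on a signal"
# else
#     echo "no shutdown message; world may have crashed"
# fi
```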

You can also simply run gdb --core=<file> and read the first few lines GDB prints:


Core was generated by `./zone everfrost 127.0.0.1 8027 127.0.0.1.
Program terminated with signal 11, Segmentation fault.


When you bring up GDB, it'll load all the symbols it needs to figure out what went wrong. A tip here: KEEP YOUR SOURCE CODE. Also, make sure you debug a core against the executable built from the same source, not a previous compile. Otherwise the backtrace just won't make sense.

When you get into GDB, the first command you want to run is 'bt', which stands for 'backtrace'. It gives you a list of the functions that were on the stack when the program crashed. The list is in order: the topmost one, Frame #0, was the last function to be called. The one under it, Frame #1, is the function that called Frame #0, and so on all the way down the list. The last entry is the one that started it all, usually main() or a thread's start routine.
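
If all you want is the backtrace in a text file, GDB can be run non-interactively. This is only a sketch: save_backtrace and the default output name backtrace.txt are names I picked, while --batch and -ex are standard GDB options.

```shell
# save_backtrace: write the backtrace from a core file to a text file.
# Hypothetical helper. Usage: save_backtrace ./zone core.12345 [output.txt]
save_backtrace() {
    exe=$1
    core=$2
    out=${3:-backtrace.txt}
    if [ ! -f "$exe" ] || [ ! -f "$core" ]; then
        echo "save_backtrace: need an existing executable and core file" >&2
        return 1
    fi
    if ! command -v gdb >/dev/null 2>&1; then
        echo "save_backtrace: gdb is not installed" >&2
        return 1
    fi
    # --batch makes GDB exit once its commands finish; -ex runs one command.
    gdb --batch -ex bt "$exe" "$core" > "$out" 2>&1
}
```

The resulting file holds GDB's startup messages followed by the same frame list you'd get from typing bt by hand.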

Simply posting the backtrace list is helpful enough, but sometimes someone working on the code might need more information. It'd be a good idea to save your core file, along with the executable that generated it. You could do something like this to archive it:
eqemu@laguz:~/wicked/bin$ tar -vczf zonecore1.tgz core.12345 zone
That archives the two major files needed. Even better would be to also archive the source:
eqemu@laguz:~/EQEmuCVS/Source$ tar -vczf zonesrc.tgz zone

You can change the filenames to whatever you prefer.

A few other commands you can use while in GDB:

The 'frame' command sets the current frame to whatever number you provide. Let's say you want to investigate the top-most frame: 'frame 0'. This will output something similar to the top line on the backtrace.

You can get information on variables used in the function by using the 'info' command after you select a 'frame':

info args -- This shows the arguments passed to the function.
info locals -- This shows any local variables used inside the function.

Note: see 'help info' for more information

You can also 'print' a variable:

print tmp
print *tmp

Note that in the case of pointers, if you don't include a *, it'll simply print the memory address. If you want to see what's in the variable, include the *.
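
The commands above can also be collected into a GDB command file and replayed against a core. A rough sketch follows; write_gdb_script and inspect.gdb are made-up names, and the variable-specific print commands are left out since they depend on your code.

```shell
# write_gdb_script: write a GDB command file that inspects one stack frame.
# Hypothetical helper. Usage: write_gdb_script 0 inspect.gdb
write_gdb_script() {
    frame_no=${1:-0}
    script=${2:-inspect.gdb}
    case "$frame_no" in
        *[!0-9]*) echo "write_gdb_script: frame must be a number" >&2; return 1 ;;
    esac
    {
        echo "bt"
        echo "frame $frame_no"
        echo "info args"
        echo "info locals"
    } > "$script"
}

# Then run it against the core, for example:
# gdb --batch -x inspect.gdb ./zone core.12345
```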

Hope this was helpful -- submit corrections please. :)

RangerDown
11-22-2004, 03:26 AM
IMO, *this* is worthy of a sticky.

Thanks a lot for the guide.

Doodman
11-22-2004, 08:39 AM
Just make sure you compiled all the source with the -ggdb switch so that the executable has full symbol tables.

It makes the executables larger, but it keeps enough information for a more useful debugging session.
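
One way to check whether an executable you already built carries debug information is to look for a .debug_info section in it. This is a sketch that assumes readelf from binutils is installed and that your compiler emits DWARF debug info (the usual case on Linux); has_debug_info is a made-up name.

```shell
# has_debug_info: succeed if an ELF executable carries a .debug_info section.
# Hypothetical helper; fails if the file is missing or readelf is not installed.
has_debug_info() {
    [ -f "$1" ] || return 1
    command -v readelf >/dev/null 2>&1 || return 1
    # readelf -S lists the section headers of the file.
    readelf -S "$1" 2>/dev/null | grep -q '\.debug_info'
}

# Example:
# has_debug_info ./zone && echo "zone has debug info" || echo "zone has no debug info"
```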

If you run your zones/world stripped of symbol information, you can go recompile (assuming you've not made changes) with -ggdb and then use gdb on your new zone/world with that core. GDB will probably complain as follows, but should take it just fine:
warning: exec file is newer than core file.
BTW, in general it drops core.pid for multithreaded applications (where pid is the process ID of the thread that cored) and just plain "core" for single threaded apps.

You can also see what application generated the core file, if you don't have gdb installed for some reason (not that the core file will do you much good, at least on that system):
% strings core.10415 | head
tdH$
CORE
CORE
zone
./zone qeynos 192.168.0.5 7101 192.168.0.5
CORE
zone
-panel
CORE
CORE

Xorith
11-22-2004, 09:23 AM
Thanks for clarifying it all, Doodman. I've only ever used GDB in the MUD community, and a few smaller projects. :)

kathgar
11-23-2004, 12:24 AM
Core files will be truncated to the current limit, set by ulimit -c. You can also change this in /etc/limits on some systems.

NOTE: 0 means 0 bytes, NOT unlimited as on UNIX systems.
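
A quick way to see where your current shell stands is sketched below. core_limit_status is a made-up name; it just wraps the shell's ulimit builtin.

```shell
# core_limit_status: report whether this shell will write core files.
# Hypothetical helper built on the shell's ulimit builtin.
core_limit_status() {
    limit=$(ulimit -c)
    if [ "$limit" = "0" ]; then
        echo "disabled"
    else
        echo "enabled ($limit)"
    fi
}

# To lift the limit for servers started from this shell (this can fail
# if the hard limit is lower):
# ulimit -c unlimited
```

The limit is per shell, so it needs to be raised in the shell or startup script that actually launches world and zone.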