I'm scratching my head on this one.
I've been running my server happily since last summer, and everything was working just fine last week.
I come in today and find the machine locked up.
After power cycling, X wouldn't start, but removing the NVIDIA drivers and updating seemed to fix that. A newer kernel was installed during this process. I have since booted under the previous kernel, and that did not change the behavior described below.
Once booted, I go to start up eqemu and receive an error message about the shared memory region being the wrong size (rebuild him bigger stronger etc). I kill the servers and remove the IPC regions.
After restarting, I am able to select a server with the client, but the client's screen goes black (with the gold arrow). It sits there for a bit and then returns to server select. (About one minute passes from login until the return to server select.)
I had been making some database changes last week, so I restored from a known good database. The behavior remains the same. So I don't think any database changes are to blame.
Comparing a snapshot of the log directory before selecting a server and afterwards shows that only the login server's log is modified. A diff of the two versions:
Code:
< [Client] [02.13.12 - 16:41:31] Found sequence and play of 5 1
< [Network Trace] [02.13.12 - 16:41:31] dumping packet of size 20
< 05 00 00 00 00 00 00 00 - 00 00 00 00 00 00 00 00 | ................
< 01 00 00 00 | ....
< [Network Trace] [02.13.12 - 16:41:31] Sending play response with following data, allowed 1, sequence 5, server number 1, message 101
< [Network Trace] [02.13.12 - 16:41:31] dumping packet of size 20
< 05 00 00 00 00 00 00 00 - 00 00 01 65 00 00 00 00 | ...........e....
< 01 00 00 00 | ....
< [Network Trace] [02.13.12 - 16:41:31] Sending play response for Harcourt.
< [Network Trace] [02.13.12 - 16:41:31] dumping packet of size 20
< 05 00 00 00 00 00 00 00 - 00 00 01 65 00 00 00 00 | ...........e....
< 01 00 00 00 | ....
< [Network] [02.13.12 - 16:41:32] Client disconnected from the server, removing client.
< [Network] [02.13.12 - 16:42:33] New SoD client connection from 129.120.60.236:55652
< [Network] [02.13.12 - 16:42:33] Application packet recieved from client (size 14)
< [Network] [02.13.12 - 16:42:33] Session ready recieved from client.
< [Network] [02.13.12 - 16:42:33] Session ready indicated logged in from world(unsupported feature), disconnecting.
This looks like the same response that previously successful logins received.
Again, that is the only logfile that changes on the server.
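For anyone wanting to repeat the before/after comparison of the log directory, here is a minimal sketch of one way to do it. It is not the exact procedure I used; the function names `snapshot` and `changed_files` are made up for illustration, and the log directory path is whatever your install uses.

```python
import hashlib
from pathlib import Path


def snapshot(log_dir):
    """Map each file's path (relative to log_dir) to a SHA-256 digest of its contents."""
    root = Path(log_dir)
    digests = {}
    for path in sorted(root.rglob("*")):
        if path.is_file():
            rel = str(path.relative_to(root))
            digests[rel] = hashlib.sha256(path.read_bytes()).hexdigest()
    return digests


def changed_files(before, after):
    """Return the sorted names of files that were added, removed, or modified."""
    names = set(before) | set(after)
    return sorted(n for n in names if before.get(n) != after.get(n))
```

Take one `snapshot` before selecting a server in the client and another after the client drops back to server select, then pass both to `changed_files`. Hashing whole files only says which logs changed; a regular diff is still needed to see what changed inside them.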
The server is a modified r1976 (although no code changes in a few months). I know this is from July, but it has been working fine for what I'm doing.
My rough read on this is that the response is sent but not received by the client. Both machines are on the same switch, and both have working networks (can browse the web). I was able to log onto a live server from the client machine. Can anyone think of a good next step to troubleshoot this?
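One check that might narrow this down is a plain UDP round trip between the two machines. This is only a sketch under assumptions: it assumes the client/server traffic in question is UDP, the helper names `echo_once` and `udp_roundtrip` are made up, and the host and port are placeholders. A test on a spare port says nothing certain about the ports eqemu itself uses.

```python
import socket
import threading


def echo_once(sock):
    """Wait for one datagram on an already-bound UDP socket and send it back."""
    data, addr = sock.recvfrom(2048)
    sock.sendto(data, addr)


def udp_roundtrip(host, port, payload=b"ping", timeout=2.0):
    """Send one datagram and report whether the same bytes come back in time."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as s:
        s.settimeout(timeout)
        try:
            s.sendto(payload, (host, port))
            data, _ = s.recvfrom(2048)
        except OSError:
            # Covers timeouts as well as "port unreachable" style errors.
            return False
        return data == payload
```

The idea is to bind a UDP socket on the server, run `echo_once` on it, and call `udp_roundtrip` from the client machine with the server's address and the same port. A `False` result would suggest datagrams are being dropped somewhere in one direction or the other; a `True` result would point back at the eqemu processes themselves.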
Here's hoping that the problem miraculously fixes itself after the forum post.