View Full Version : Odd Client Disconnects and Server Crashes


Uleat
07-05-2016, 11:04 PM
Has anyone recently experienced issues with indeterminable server crashes or client disconnects upon zoning?


Possible factors:

- Bots Enabled (Verified occurrence in non-bot enabled code)
- Perl Support (Verified occurrence in non-perl enabled code)
- Lua Support (Verified occurrence in non-lua enabled code)
- Windows Server


EDIT: I define 'recently' as within the last 3 months or so, using up-to-date code.

Shin Noir
07-06-2016, 12:01 AM
If so, please consult your doctor.
(My code was last git pulled from the master branch about 1 month ago, I don't use bots, and I don't have these symptoms)

Proxeeus
07-06-2016, 02:41 AM
No crashes per se on my side, but I've been experiencing a couple of client disconnects here and there using the latest code (bots enabled + Windows server).

These disconnects were few and usually happened during the first 5 or so minutes of my server coming online and me logging in to a zone - I'd then get kicked back to the Character screen.

No other disconnections were noticed past that point, though!

Uleat
07-06-2016, 01:22 PM
Thanks for the input guys!


I did replicate the issue in non-bot code..so, that contributor is out.


TBH, it looks like the zone server is spooling up and shutting down before the client is able to contact it (5000ms timer.)

After the shutdown, there does appear to be a TCP re-connection from zone to world. (I have 3 dynamic zones for solo play - no static.)

I need to test the #zone command to see if the problem persists there..but, any world->zone login or zone->zone login can create the issue. (Tested UF, RoF and RoF2)


My test was zoning from 'Crescent Reach' to 'Blightfire Moors'

I know that Moors is a large zone..but, that shouldn't be a factor. I have successfully made that transition hundreds of times in the past.


EDIT:

There is at least one other player experiencing this.

I was trying to help him update his code and database and we were weeding out issues when this particular behavior popped up.

I logged in to see if I could replicate it and did so in my 'clean' code and database localhost server.


UPDATE:

I built a clean server with no bots and no perl/lua support.

The issue still persists..

Uleat
07-06-2016, 04:59 PM
Logs for affected world and zone servers:


world: http://wiki.eqemulator.org/i?Module=Pastebin&Paste=jTKJ91ei

zone: http://wiki.eqemulator.org/i?Module=Pastebin&Paste=Ln371PZj


EDIT: I changed the zone auto-shutdown value from 5 to 120 seconds with no effect.

joligario
07-06-2016, 06:42 PM
So you aren't getting any actual world/zone crash dumps? Those packets are definitely gobbledygook...

Uleat
07-06-2016, 07:01 PM
Nothing..and I enabled even more log cats after posting those.

The only changes locally since it worked last are the routine repo pulls and db updates.

Uleat
07-06-2016, 07:09 PM
[07-06-2016 :: 16:48:55] [World Server] Zone bootup timer expired, bootup failed or too slow.

joligario
07-06-2016, 09:05 PM
HD or ram corruption, perhaps

Uleat
07-06-2016, 09:41 PM
I don't have any issues (atm) with anything else on my system..and I push the limits of 8GB with apps sometimes.

I'm running Win7 pro and the other user has Windows Server 2012.


I know most ppl that update their servers regularly are probably Linux users..so, if it's a Windows-related issue, it may be up to a month or more before others start to see it.

(That issue with linking the dll's from static to dynamic went a few months before others started reporting the problem.)

Mortykins
07-06-2016, 10:12 PM
I can't wait until this bug is fixed; it has been a plague for the players on Raid Addicts. Really appreciate you looking into this, folks, and hope to see a fix soon from these efforts.

Mort
www.raidaddicts.org

DanCanDo
07-06-2016, 10:36 PM
I tried replicating this, but no bananas. I updated source today (one build with bots enabled and one without). I zoned all over, including crescent > blightfire, and #zone'd all over, and didn't get any problems. The only issue I have ever had in the past is a client crashing once in a while: after I do some DB edits and start the server up, the client crashes when I try to enter world, but it will only do it once and the server sees it as linkdead. That only happens once in a blue moon, though. But I've never crashed from zoning. Win 7 Ultimate OS.

Uleat
07-06-2016, 10:37 PM
I'm starting to lean towards a timeout issue and map loading.

High detail zone maps (moors, springs, mesa) take forever to load and I think the bootup timer may be expiring.

May not be the issue..but, I'll keep digging in this hole for awhile.

DanCanDo
07-06-2016, 10:45 PM
I agree; just zoning in to moors takes a good 20+ secs (and my client box is not short on gaming hardware).

Uleat
07-06-2016, 10:54 PM
I got this message in the client when I was actually able to get into moors:

ZServer BROADCASTS, 'OkxyHwt: TIjUX HOQLGj qSUDgljcSt XoWA'


I have a ton of logging enabled atm..but, it's all going to file and nothing additional to gmsay.

joligario
07-06-2016, 10:58 PM
What is your current ruleset setting for World:ZoneAutobootTimeoutMS ?

joligario
07-06-2016, 10:58 PM
I got this message in the client when I was actually able to get into moors:




I have a ton of logging enabled atm..but, it's all going to file and nothing additional to gmsay.

That's awesome...

Uleat
07-06-2016, 11:04 PM
World:ZoneAutobootTimeoutMS: 120000
Zone:AutoShutdownDelay: 5000
Zone:ClientLinkdeadMS: 180000

joligario
07-06-2016, 11:07 PM
Ok, normal.

It would be funny if zone server was drunk or using different language and getting garbled messages...

Uleat
07-06-2016, 11:22 PM
ROFL!!


Quoted for emphasis..

DanCanDo
07-07-2016, 12:16 AM
Not sure if this is relevant or not. When a zone shuts down (5 secs as per rules), should it always take 5 secs? I just noticed (below) that one time it was 5 secs and another time 10 secs. (I'm just learning here.) The zonings were both in the same client session (UF), zoning from crescent to moors.

[07-06-2016 :: 20:20:57] [Zone Server] Dropping client: Process=false, ip=192.168.0.10 port=52041
[07-06-2016 :: 20:21:02] [Status] Zone Shutdown: crescent (394)
[07-06-2016 :: 20:21:02] [Normal] Zone shutdown: going to sleep

[07-06-2016 :: 20:22:44] [Zone Server] Dropping client: Process=false, ip=192.168.0.10 port=52045
[07-06-2016 :: 20:22:54] [Status] Zone Shutdown: crescent (394)
[07-06-2016 :: 20:22:54] [Normal] Zone shutdown: going to sleep
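
For anyone who wants to measure that gap from their own zone logs, here is a minimal Python sketch. It assumes the timestamp format shown in the lines above and pairs each "Dropping client" line with the next "Zone Shutdown" line; the function names are made up for illustration and are not part of the server.

```python
from datetime import datetime

# Sample lines copied from the zone log excerpt above.
LOG_LINES = """\
[07-06-2016 :: 20:20:57] [Zone Server] Dropping client: Process=false, ip=192.168.0.10 port=52041
[07-06-2016 :: 20:21:02] [Status] Zone Shutdown: crescent (394)
[07-06-2016 :: 20:22:44] [Zone Server] Dropping client: Process=false, ip=192.168.0.10 port=52045
[07-06-2016 :: 20:22:54] [Status] Zone Shutdown: crescent (394)
""".splitlines()


def parse_stamp(line):
    # The timestamp sits between the first '[' and ']',
    # e.g. "07-06-2016 :: 20:20:57".
    stamp = line[1:line.index("]")]
    return datetime.strptime(stamp, "%m-%d-%Y :: %H:%M:%S")


def shutdown_delays(lines):
    # Seconds between each client drop and the zone shutdown that follows it.
    delays = []
    dropped_at = None
    for line in lines:
        if "Dropping client" in line:
            dropped_at = parse_stamp(line)
        elif "Zone Shutdown" in line and dropped_at is not None:
            delays.append((parse_stamp(line) - dropped_at).total_seconds())
            dropped_at = None
    return delays


print(shutdown_delays(LOG_LINES))  # [5.0, 10.0]
```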

Uleat
07-07-2016, 01:15 PM
5 seconds is the minimum length of time.

If the zone isn't processed again until 10 seconds after the timer was set, that is still ok..but, I might question what was going on for the extra 5 seconds.
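
The "minimum, not exact" behavior is what you would expect from a timer that is only checked when the process loop comes around to it. Below is a minimal Python sketch of that idea. PolledTimer and first_fire are hypothetical names for illustration, not the emulator's actual timer code, and the poll times are invented.

```python
class PolledTimer:
    """Hypothetical timer that only reports expiry when it is checked."""

    def __init__(self, delay_ms):
        self.delay_ms = delay_ms
        self.started_at = None

    def start(self, now_ms):
        self.started_at = now_ms

    def check(self, now_ms):
        # True once at least delay_ms has elapsed since start().
        return (self.started_at is not None
                and now_ms - self.started_at >= self.delay_ms)


def first_fire(timer, poll_times_ms):
    # Return the first poll time at which the timer is seen as expired.
    for now in poll_times_ms:
        if timer.check(now):
            return now
    return None


timer = PolledTimer(5000)
timer.start(0)
# Loop polls regularly: expiry is noticed right at 5 seconds.
print(first_fire(timer, [1000, 3000, 5000, 7000]))  # 5000
# Loop is busy between 3 s and 10 s: expiry is only noticed at 10 seconds.
print(first_fire(timer, [1000, 3000, 10000]))  # 10000
```

In the second case the timer expired at 5 seconds, but nothing looked at it until 10, which is one way a 5-second rule could show up as a 10-second gap in the log.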

Uleat
07-07-2016, 01:28 PM
I think I found the source of this particular issue - V2 maps! (at least the larger ones..and probably on a slow/low time-slice computer)

[with 'moors' map installed]

[07-07-2016 :: 12:09:51] [Error] MARK - Pre-zone->Init()
[07-07-2016 :: 12:09:52] [Error] MARK - Post-zone->Init()
[07-07-2016 :: 12:09:52] [Error] MARK - Pre-LoadMapFile()
[07-07-2016 :: 12:09:52] [Error] MARK - MapV2
[07-07-2016 :: 12:11:17] [Error] MARK - Post-LoadMapFile()
[07-07-2016 :: 12:11:17] [Error] MARK - Pre-LoadWaterMapfile()
[07-07-2016 :: 12:11:17] [Error] MARK - Post-LoadWaterMapfile()
[07-07-2016 :: 12:11:17] [Error] MARK - Pre-LoadPathFile()
[07-07-2016 :: 12:11:17] [Error] Path File Maps/moors.path not found.
[07-07-2016 :: 12:11:17] [Error] MARK - Post-LoadPathFile()
[07-07-2016 :: 12:11:17] [Error] MARK - Pre-'loglevel'
[07-07-2016 :: 12:11:17] [Error] MARK - Post-'loglevel'
[07-07-2016 :: 12:11:17] [Error] MARK - Pre-worldserver.SetZoneData()
[07-07-2016 :: 12:11:17] [Error] MARK - Post-worldserver.SetZoneData()


[with 'moors' map removed]

[07-07-2016 :: 12:33:59] [Error] MARK - Pre-zone->Init()
[07-07-2016 :: 12:34:01] [Error] MARK - Post-zone->Init()
[07-07-2016 :: 12:34:01] [Error] MARK - Pre-LoadMapFile()
[07-07-2016 :: 12:34:01] [Error] MARK - Post-LoadMapFile()
[07-07-2016 :: 12:34:01] [Error] MARK - Pre-LoadWaterMapfile()
[07-07-2016 :: 12:34:01] [Error] MARK - Post-LoadWaterMapfile()
[07-07-2016 :: 12:34:01] [Error] MARK - Pre-LoadPathFile()
[07-07-2016 :: 12:34:01] [Error] Path File Maps/moors.path not found.
[07-07-2016 :: 12:34:01] [Error] MARK - Post-LoadPathFile()
[07-07-2016 :: 12:34:01] [Error] MARK - Pre-'loglevel'
[07-07-2016 :: 12:34:01] [Error] MARK - Post-'loglevel'
[07-07-2016 :: 12:34:01] [Error] MARK - Pre-worldserver.SetZoneData()
[07-07-2016 :: 12:34:01] [Error] MARK - Post-worldserver.SetZoneData()


Plus, I was able to get in the zone with no issue in the removed map scenario.


I tried up'ing the 'World:ZoneAutobootTimeoutMS' rule to 3 minutes with no change in behavior.


Is there an internal timeout in the client that might cause it to stop communicating with the server if a certain length of time goes by without any data transfer - say, a minute and twenty-five seconds?
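
The minute and twenty-five seconds can be read straight off the MARK lines. As a rough sketch, the Python below pairs each "Pre-" marker with its "Post-" marker and prints the elapsed time. It assumes the temporary MARK log format shown above; step_durations is a made-up helper name.

```python
from datetime import datetime

# First few MARK lines from the "moors map installed" run above.
MARKS = """\
[07-07-2016 :: 12:09:51] [Error] MARK - Pre-zone->Init()
[07-07-2016 :: 12:09:52] [Error] MARK - Post-zone->Init()
[07-07-2016 :: 12:09:52] [Error] MARK - Pre-LoadMapFile()
[07-07-2016 :: 12:09:52] [Error] MARK - MapV2
[07-07-2016 :: 12:11:17] [Error] MARK - Post-LoadMapFile()
[07-07-2016 :: 12:11:17] [Error] MARK - Pre-LoadWaterMapfile()
[07-07-2016 :: 12:11:17] [Error] MARK - Post-LoadWaterMapfile()
""".splitlines()


def step_durations(lines):
    # Map each step name to the seconds between its Pre- and Post- markers.
    started = {}
    durations = {}
    for line in lines:
        if "MARK - " not in line:
            continue
        stamp = line[1:line.index("]")]
        when = datetime.strptime(stamp, "%m-%d-%Y :: %H:%M:%S")
        label = line.split("MARK - ", 1)[1]
        if label.startswith("Pre-"):
            started[label[4:]] = when
        elif label.startswith("Post-") and label[5:] in started:
            name = label[5:]
            durations[name] = (when - started.pop(name)).total_seconds()
    return durations


for name, seconds in step_durations(MARKS).items():
    print(f"{name}: {seconds:.0f}s")
# zone->Init(): 1s
# LoadMapFile(): 85s
# LoadWaterMapfile(): 0s
```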

wirepuller134
07-07-2016, 02:30 PM
Don't know if it helps or not, but I use an iMac running CrossOver with the RoF client.
Our server is updated every Saturday night. It has bots enabled and mercs disabled. No static zones. It runs on Windows 7 Pro 64, with an AMD Athlon 64 X2 dual-core 1.9 processor and 4 gigs of RAM. It has an antique 120GB primary drive, which the servers run on, and two 3TB drives for media storage. The little computer runs headless on our network, with an FTP server as a backend for Kodi and BeyondTV, and it also runs a WoW emulation server (4.3.4 TrinityCore with playerbots). So it is quite busy. But we have only 2 players in the house; occasionally a friend logs in from Texas to play.

Now on to the asked question. When zoning into the Moors, if the zone isn't already booted up, it takes about 35 seconds but there is no disconnect. But we do see a yellow zserver broadcast saying "world server connection lost" after we finish zoning. While zoning back and forth after the zone was booted, it takes about 15 seconds to zone in and we don't see the message. Lavastorm took a long time to load as well the first time in. So far those are the only 2 with long loading times. The last server crash we had was back in May, while I was working on an Expedition (the system Akadias made a long time ago, yes we still use it!).....I caused it. So with a week-old server, no crashes or client disconnects. My wife plays daily and she would be all over me if she was having issues. She mentioned the loading times of those 2 zones to me when I told her about this thread.


Edit: the FTP server is serving media to 3 televisions running Kodi.

Uleat
08-05-2016, 11:09 PM
This appears to be 100% identified: http://www.eqemulator.org/forums/showthread.php?t=40805

..and fixed by using zone mmf files.

xarisd
12-22-2020, 04:09 PM
Man, I wish that last link didn't go to an invalid thread, but I will look for and fix zone mmf files, whatever those are. When this problem came up on me last night after a server backup, I noticed that I can create a new character and they load up just fine, but I still can't log in to my old characters.

Furthermore, what started all this was enabling the merc code, which, after going through all the steps, wiped out my merchants server-wide. What I think is that whoever got the merc merchant code done made it change some rule or line of code so that every merchant everywhere needs to have the same rules as mercenary merchants, or.. well, no, that seems to be the case. Anyway, long story short, I had to reinstall my entire server because HeidiSQL is clunky and doesn't know how to restore from backup, nor does it seem to know how to let go of a database that has been shift-deleted. So after all the table reloads, I have this exact problem from... 2016.

I have no problem gravedigging, as you can no doubt tell, a habit I've gotten into from dead links, links that go nowhere, and posters who tell us nevermind, they fixed it, and never bother to say how. I'm a necromancer by necessity, as back in the old days when Google was just a search engine and not a multi-trillion dollar evil empire focused on controlling the world, I could find all my relevant information from current year.

Interestingly enough, I don't know if this contributes at all, but it seems there's leftover merc code from the deleted server still floating around in my new one. I'm thinking I need to perform a complete fresh install, maybe even on a sub-login account, and try from scratch. Sorry for gravedigging, but dead links are what they are. I'll try to figure out what a zone mmf is, but hopefully someone, anyone can tell me what and where they are so I can fix them.

Vexyl
12-22-2020, 04:16 PM
Does this help? http://web.archive.org/web/20161111235955/http://www.eqemulator.org/forums/showthread.php?t=40805

Huppy
12-22-2020, 04:27 PM
MMF's are a thing of the past.....

xarisd
12-22-2020, 04:29 PM
That helps hugely, thank you, Vexyl. Although, now, I don't know if it helps at all because of what Huppy said. Feels like one step forward, two steps back.

Huppy
12-22-2020, 04:39 PM
I don't know if it helps at all because of what Huppy said.

As I posted in another thread, which I "think" you read, things change quite frequently in this project. Reading old threads is risky, because they may contain content that is obsolete now.

Not sure what you're doing as far as "a server" goes. Are you trying to revive an old server you had? If that is the case, are you trying to run it with updated, current binaries? You could run into issues, under various circumstances, depending on what you're doing or trying to accomplish.

xarisd
12-22-2020, 05:26 PM
Yeah, I'm really starting to see that now. I need to be more careful with what I do.

No, just trying to restore my server back to a few days ago because of screwed up code. It's RoF2 to RoF2, nothing weird. It's decided to throw this error at me, so what I've done since posting is redo each table one at a time to be sure I have everything replicated exactly as it was, based on the SQL backups I made. The tables I'm working with are:
"Account", "Account_IP", "Account_Rewards", "Character_Activities", "Character_Alternate_Abilities", "Character_Alt_Currency", "Character_Auras", "Character_Bind", "Character_Buffs", "Character_Currency", "Character_Data", "Character_Enabled_Tasks", "Character_Item_Recasts", "Character_Languages", "Character_Memmed_Spells", "Character_Pet_Info", "Character_Skills", "Character_Spells", Tasks, Create_Combinations, Inventory, Event_Logs, and Timers.

This is the second fresh install I've done, but I think I know where I messed up last night when I started working on the restore. I thought I had everything figured out until I ran into the problem of "[World Server] Zone bootup timer expired, bootup failed or too slow." I tried logging as the new character as well as the old, EQ locked up for about two minutes, then started moving again on the character select. I looked at the CLI window and it said "[World Server] Zone bootup timer expired, bootup failed or too slow."

I also found out that I apparently don't have the zone file for misty thicket, which would explain why the Rivervale PoK stone never worked, as just to test I ran a halfling to Misty to see if a new character could cross zone lines. Logged back in and that new halfling couldn't log in, nor could the human that I made, ran to PoK, and logged out. I'm hoping that with this fresh install the problem is solved once and for all. Then I can focus on finding a MistyThicket file for my client.

Huppy
12-22-2020, 05:37 PM
You might find this link handy ;)
http://www.projecteq.net/forums/index.php?threads/fix-read-here-for-zoning-into-tox-old-zones.15199/

xarisd
12-22-2020, 05:39 PM
Okay, it's fixed for the time being, although I really hate having to go scorched earth to fix anything. I feel like there was a much more elegant solution, a scalpel rather than an atom bomb. I'm not adding any queries or code from now on, my lesson has been learned. Like I indicated in the other thread, I'm just going to be happy with what I have and not care about how green the grass is anywhere else.

I do still want that Misty Thicket client map, though - not sure how that got missed. I invented halfling bards and I insist on playing them in their start zone! That and Misty Thicket always was a fun place back in the day, a nice comfortable zone for levelling a newbie with death traps all around. You know, like Kithicor at night. I'll never forget the first time I did that and was dead from reds not two minutes later. Never knew zombies could run that fast!

Huppy
12-22-2020, 06:01 PM
Not sure if you're aware, but there is a file in your main server folder called eqemu_server.pl
Normally, in Windows, you should be able to just double-click that file to run it and get a menu to pop up. It has various functions, including an option to back up your entire database.