EQEmulator Forums

EQEmulator Forums (https://www.eqemulator.org/forums/index.php)
-   Support::General Support (https://www.eqemulator.org/forums/forumdisplay.php?f=598)
-   -   Pet losing some - not all - his gear (https://www.eqemulator.org/forums/showthread.php?t=38505)

noudess 07-22-2014 10:18 PM

Ok...

I added a named semaphore and created it in eqlaunch before anything else.

Then I use it to access the pet inventory in the functions in zonedb.cpp.

It seems to me this is:

1) a bigger issue - since the zone I am leaving doesn't other work in Save() before the zone I land in starts accessing it..

or

2) Somehow another mechanism was meant to handle this and isn't for pet inventory.

I can show, so far, that without my fix, the zone I land in can read the content of the character_pet_inventory table before the zone I am leaving has it completely updated... thus the problem.

This is all on my Linux server.

I can post the named semaphore fix, but I'm not happy with it and I'm hoping someone more knowledgeable with how the old zone and new zone get ordered might step in and slap me up some ;)

noudess 07-23-2014 01:16 PM

My solution has evolved to a named semaphore in the client class. This semaphore is named after the character_id for the client. It is created on successful arrival in a zone (if you're coming from another zone, the CREAT ends up opening the one you left from (if its in use). There can, of course be multiple semaphores going on at once, for various characters, but they will clean themselves up on logout.

It's used when you save and read pet information.

It works, and I leave no semaphores around on exit.

I still feel that someone with a better idea of how db coordination is done between zones on zoning might have a more straightforward fix - or maybe tweak of existing mechanism.

Should I post about it on development? My fix works, but I'd think a patch more in tune with the original thought process might be better long term. I just can't see what that is.. :)

[ edit: expanded the critical section to start when the zoning flag is set and to end at client's destructor. The new zone now has perfect pet data. I also moved my check on the new zone to before we read in character buffs - with the wrong timing - these could get scrogged as well.

With 15 items, I could make the pet bug show up alot - and now its always perfect, And no semaphore baggage, all cleaned up.]

Let me know if you guys want the patch posted. My code is older, but I think the changes are small enough you can fit them in if you want them.

Players happy - they were sick of losing pet gear occasionally. ]

noudess 07-29-2014 09:56 AM

My fix works. No more pet gear lost. You guys want it? I don't mind posting the details.

Akkadius 07-29-2014 10:45 AM

Quote:

Originally Posted by noudess (Post 232292)
My fix works. No more pet gear lost. You guys want it? I don't mind posting the details.

Go ahead and post a diff

vsab 07-30-2014 08:42 AM

Looks like you have lots to contribute- it would be way easier for the dev's to integrate your changes if you forked the source on github ( https://github.com/EQEmu/Server ), made the change and then submitted a pull request.

noudess 07-30-2014 01:52 PM

I was really hoping that one of the core guys would pick this up, if it is going to be useful to the community. My fix works fine for my server (we have less than 20 players). Since it involved the use of a named semaphore, I'm loath to apply a diff to a branch and have someone just roll that in. For player bases that are more normal (I assume much larger than mine), Im worried that abnormal aborts, etc, might require semaphore cleanup and server restarts.

I've not run into that yet, but it seems inevitable.

So why did I bother posting? I guess I wanted to point the race condition on zoning and hope that someone more core to the project might have a better idea for the mainstream solution.

As an interim step, I'll post my diffs here.

noudess 07-30-2014 02:07 PM

I believe this is all the diffs. My code is divergent from the base, but this should give context and intent. Basically, I'm grabbing a named semaphore in the instance of client in the zone process when we zone and not releasing it until all pertinent data is saved. The recipient zone tries to grab the same semaphore, which ends up delaying the recipient in cases where saving had not finished yet. I've seen it delay, and solve the pet gear issue. The code in the zone error functions are to release the semaphore in the originating zone if the zoning process is blocked in some way (flags, etc). Note there is a diff semaphore per char id, so multiple zoning clients don't slow each other down.

My fear, is other hard crashes after the zoning client grabs the sem, but before it is released.

Also note, we close/delete the sem on both sides, all the time, to avoid issues with leftover semaphores. Its temping to leave the sem there for future zoning, but the dynamic state of things makes that very risky.

Tons of printfs left in the code so you can test. Remove as you see fit.


Code:

=== modified file 'zone/zoning.cpp'
--- zone/zoning.cpp        2013-07-09 15:28:09 +0000
+++ zone/zoning.cpp        2014-07-28 14:16:48 +0000
@@ -36,6 +36,13 @@
 #endif
 
        zoning = true;
+
+        // Lock a semaphore so the new client won't read
+        // until we're done writing out data
+        // This will get released on client exit
+        OpenZoningSem();
+        WaitZoningSem();
+
        if (app->size != sizeof(ZoneChange_Struct)) {
                LogFile->write(EQEMuLog::Debug, "Wrong size: OP_ZoneChange, size=%d, expected %d", app->size, sizeof(ZoneChange_Struct));
                return;
@@ -282,6 +289,11 @@
 }
 
 void Client::SendZoneCancel(ZoneChange_Struct *zc) {
+
+        // Not zoning, so client destructor won't be doing this.
+        PostZoningSem();
+        CloseZoningSem();
+
        //effectively zone them right back to where they were
        //unless we find a better way to stop the zoning process.
        SetPortExemption(true);
@@ -302,6 +314,10 @@
 {
        LogFile->write(EQEMuLog::Error, "Zone %i is not available because target wasn't found or character insufficent level", zc->zoneID);
       
+        // Not zoning, so client destructor won't be doing this.
+        PostZoningSem();
+        CloseZoningSem();
+
        SetPortExemption(true);
 
        EQApplicationPacket *outapp;

=== modified file 'zone/client.cpp'
--- zone/client.cpp        2014-01-11 17:57:12 +0000
+++ zone/client.cpp        2014-07-24 15:25:58 +0000
@@ -230,6 +230,7 @@
 zonesummon_ignorerestrictions = 0;
 zoning = false;
 zone_mode = ZoneUnsolicited;
+zoning_sem = (sem_t *) NULL;
 proximity_x = FLT_MAX;        //arbitrary large number
 proximity_y = FLT_MAX;
 proximity_z = FLT_MAX;
@@ -435,8 +436,107 @@
 
 entity_list.RemoveClient(this);
 UninitializeBuffSlots();
+
+// Clear the semaphore since this client is dying - zone/crash/logout
+PostZoningSem();
+CloseZoningSem();
 }
 
+// Added these functions to make sure saving of data finished before new
+// zone loaded it in.
+
+void Client::OpenZoningSem()
+        {
+        printf("Trying to CREATE a semaphore for char id %d.\n", CharacterID());
+
+        // Use a semaphore named after character id.
+        sprintf(zoning_sem_name,"%d-%s", CharacterID(),name);
+
+    zoning_sem=sem_open(zoning_sem_name, O_CREAT, 0644, 1);
+       
+        if (zoning_sem == SEM_FAILED)
+                {
+                printf("Cannot open zoning semaphore for char id %d.\n", CharacterID());
+                }
+        fflush(stdout);
+        }
+
+int Client::WaitZoningSem()
+        {
+        int        ret;
+
+        printf("Trying to WAIT a semaphore for char id %d.\n", CharacterID());
+
+        ret = sem_trywait(zoning_sem);
+
+        if (ret != 0)
+                {
+                printf("LOCKED - WAITING\n");
+                ret = sem_wait(zoning_sem);
+                }
+
+        if (ret != 0)
+                {
+                printf("Error waiting on zoning semaphore.\n");
+                }
+
+        fflush(stdout);
+        return ret;
+        }
+
+int Client::PostZoningSem()
+        {
+        int ret=0;
+
+        printf("Trying to POST a semaphore for char id %d.\n", CharacterID());
+
+        // Its possible for the sem to not exist, if the client never zoned.
+        if (zoning_sem != (sem_t *) NULL)
+                {
+                ret= sem_post(zoning_sem);
+               
+                if (ret != 0)
+                        {
+                        printf("Error posting on zoning semaphore.\n");
+                        }
+                }
+
+        fflush(stdout);
+        return ret;
+        }
+
+int Client::CloseZoningSem()
+        {
+        int ret;
+
+        // This gets called by destructor, when the sem should already be closed
+        // So lets not do extra work if so.
+        if (zoning_sem != (sem_t *) NULL)
+                {
+                printf("Trying to CLOSE a semaphore for char id %d.\n", CharacterID());
+
+                ret = sem_close(zoning_sem);
+
+                if (ret != 0)
+                        {
+                        printf("Error closing on zoning semaphore.\n");
+                        }
+
+                ret = sem_unlink(zoning_sem_name);       
+
+                if (ret != 0)
+                        {
+                        printf("Error unlinking on zoning semaphore.\n");
+                        }
+                }
+       
+        fflush(stdout);
+
+        zoning_sem = (sem_t *) NULL;
+
+        return ret;
+        }
+
 void Client::SendLogoutPackets() {
 
 EQApplicationPacket* outapp = new EQApplicationPacket(OP_CancelTrade, sizeof(CancelTrade_Struct));

=== modified file 'zone/client.h'
--- zone/client.h        2013-05-02 18:37:13 +0000
+++ zone/client.h        2014-07-23 19:11:48 +0000
@@ -19,6 +19,7 @@
 #define CLIENT_H
 class Client;
 
+#include <semaphore.h>
 #include "../common/timer.h"
 #include "../common/ptimer.h"
 #include "../common/emu_opcodes.h"
@@ -491,6 +492,10 @@
        inline uint32        GetGold()                const { return m_pp.gold; }
        inline uint32        GetPlatinum()        const { return m_pp.platinum; }
 
+        void        OpenZoningSem();
+        int                WaitZoningSem();
+        int                PostZoningSem();
+        int                CloseZoningSem();
 
        /*Endurance and such*/
        void        CalcMaxEndurance();        //This calculates the maximum endurance we can have
@@ -1303,6 +1308,8 @@
        uint8        zonesummon_ignorerestrictions;
        ZoneMode zone_mode;
 
+        sem_t        *zoning_sem;
+        char        zoning_sem_name[75];
 
        Timer        position_timer;
        uint8        position_timer_counter;

=== modified file 'zone/client_packet.cpp'
--- zone/client_packet.cpp        2014-01-18 17:25:31 +0000
+++ zone/client_packet.cpp        2014-07-24 22:16:53 +0000
@@ -8971,7 +8971,13 @@
        }
 
 
+        // Obtain a lock that proves the old zone is done
+        // Updating this data.
+        OpenZoningSem();
+        WaitZoningSem();
+
        if (spells_loaded)
+
        {
                for(uint32 z=0;z<MAX_PP_MEMSPELL;z++)
                {
@@ -9154,6 +9160,7 @@
                rest_timer.Start(m_pp.RestTimer * 1000);
 
        database.LoadPetInfo(this);
+
        //this was moved before the spawn packets are sent
        //in hopes that it adds more consistency...
        //Remake pet
@@ -9169,9 +9176,15 @@
                }
                m_petinfo.SpellID = 0;
        }
+
        // Moved here so it's after where we load the pet data.
        if(!GetAA(aaPersistentMinion))
                memset(&m_suspendedminion, 0, sizeof(PetInfo));
+
+        // I think this was the last thing we need to load.
+        // Release the Semaphore.
+        PostZoningSem();
+        CloseZoningSem();
 
        ////////////////////////////////////////////////////////////
        // Server Zone Entry Packet


noudess 08-13-2014 03:49 PM

This is still working for me after weeks. Again, I have less than 10 players, so a large playerbase might be an issue if crashes end up leaving the named sem around. Pet gear loss completely fixed however, and it safeguards against other data loss as well.


All times are GMT -4. The time now is 01:46 AM.

Powered by vBulletin®, Copyright ©2000 - 2025, Jelsoft Enterprises Ltd.