View Full Version : COMMITTED: UCS crash fix
Zothen
05-03-2011, 05:59 AM
Okay, I think it makes sense to post my fix here, since my UCS isnt crashing anymore.
I think I found the bug in UCS that is causing a crash.
Debugging showed up a halt in DBCore::Open() after a call of mysql_real_connect() using an uninitialized mysql struct.
The struct gets uninitialized whenever theres a connection error of some sort, because mysql.close() makes it invalid. Without another call to mysql.init() all following sql connects/commands will fail, resulting in a crash.
So we just need to enter a single line of code... :
(May need some more testing, but mine is running stable now for 20 hours.)
bool DBcore::Open(int32* errnum, char* errbuf) {
if (errbuf)
errbuf[0] = 0;
LockMutex lock(&MDatabase);
if (GetStatus() == Connected)
return true;
if (GetStatus() == Error)
{
mysql_close(&mysql); // <- Makes struct 'mysql' invalid!
mysql_init(&mysql); // <- Initialize structure again
}
if (!pHost)
return false;
/*
Quagmire - added CLIENT_FOUND_ROWS flag to the connect
otherwise DB update calls would say 0 rows affected when the value already equalled
what the function was tring to set it to, therefore the function would think it failed
*/
int32 flags = CLIENT_FOUND_ROWS;
if (pCompress)
flags |= CLIENT_COMPRESS;
if (pSSL)
flags |= CLIENT_SSL;
// crashed here, because 'mysql' wasnt valid after an error followed by mysql.close()
if (mysql_real_connect(&mysql, pHost, pUser, pPassword, pDatabase, pPort, 0, flags)) {
pStatus = Connected;
return true;
}
else {
if (errnum)
*errnum = mysql_errno(&mysql);
if (errbuf)
snprintf(errbuf, MYSQL_ERRMSG_SIZE, "#%i: %s", mysql_errno(&mysql), mysql_error(&mysql));
pStatus = Error;
return false;
}
}
trevius
05-03-2011, 06:23 AM
Sounds good to me. I will give it a shot and commit it when I have a few minutes. Thanks for the submission!
Zothen
05-03-2011, 06:30 AM
Sure thing, happy to contribute to this project.
sorvani
05-03-2011, 11:42 AM
My UCS has been running for 24+ hours now after this change. (Windows server)
Secrets
05-03-2011, 03:23 PM
Shouldn't you just return false, then initialize the mysql connection after that if it is false? That makes more sense to me.
Zothen
05-04-2011, 02:19 AM
I guess youre right regarding the return false. With my quick and dirty fix you may not get any database error messages again (some may think thats a good thing though LOL).
But on the other hand that way its possible that the UCS will just quit clean after it encounters a database error. I am not sure about that.
I would leave init where it is, since its the only place where the mysql struct gets closed (except in the destructor), so theres no reason to call it elsewhere.
joligario
05-04-2011, 08:55 AM
I think what he may be saying is that instead of closing mysql, return false instead? Then the connection wouldn't need to be re-initialized.
sorvani
05-04-2011, 11:54 AM
came down to my office and looked at the PC running my test server and noticed that UCS crashed over night. :(
Zothen
05-04-2011, 02:40 PM
Were any players on during that time?
sorvani
05-04-2011, 03:29 PM
Not when i came down no. It is mainly just used by me to test things.
I had toons in and out yesterday most of the day and it was running.
Went to bed early and when i came down to my office this morning UCS was crashed.
UCS only crashes on a new log in that I've noticed, but then the times that I've seen it actually crash were when I logged in a toon after being offline for hours and no one else had logged in during the interim.
I restarted the UCS right when I seen it crashed this morning and I have had a couple toons in and out of my server this afternoon and it hasn't crashed again yet.
Zothen
05-04-2011, 03:52 PM
My UCS is running in visual studio debug mode all of the time. Got a database error actually today, but no crash.
Zothen
05-05-2011, 02:34 AM
I analyzed the UCS code a bit further and I am now 99.99% ;) sure that my fix is the way it is supposed to be.
When the function dbcore::RunQuery() results in a connection/database error, it is calling itself again for exactly 1 retry, but on that 2nd call it checks for the status. If pStatus = error then dbcore::Open() is called.
Open() calls mysql_close() on any error status and is supposed to reconnect in the same call, but will definitetely fail without mysql_init(). So placing a FALSE after the "close" or "init" would prevent Open() to reconnect.
All that is missing now is a confirmation message, that the recovery of the encountered error was successful.
bool DBcore::RunQuery(const char* query, int32 querylen, char* errbuf, MYSQL_RES** result, int32* affected_rows, int32* last_insert_id, int32* errnum, bool retry) {
_CP(DBcore_RunQuery);
if (errnum)
*errnum = 0;
if (errbuf)
errbuf[0] = 0;
bool ret = false;
LockMutex lock(&MDatabase);
if (pStatus != Connected)
Open();
#if DEBUG_MYSQL_QUERIES >= 1
char tmp[120];
strn0cpy(tmp, query, sizeof(tmp));
cout << "QUERY: " << tmp << endl;
#endif
if (mysql_real_query(&mysql, query, querylen)) {
if (mysql_errno(&mysql) == CR_SERVER_GONE_ERROR)
pStatus = Error;
if (mysql_errno(&mysql) == CR_SERVER_LOST || mysql_errno(&mysql) == CR_SERVER_GONE_ERROR) {
if (retry) {
cout << "Database Error: Lost connection, attempting to recover...." << endl;
ret = RunQuery(query, querylen, errbuf, result, affected_rows, last_insert_id, errnum, false);
if ( !ret )
cout << "Reconnection to database successful." << endl;
}
else {
...
...
...
Unfortunetely, that doesn't explain the latest crash of yours, sorvani. This may be an other error I have not encountered yet.
sorvani
05-05-2011, 09:50 AM
Looked at my server this morning and the chat service was still running. Looked at the character_ table and it look like only a single random toon signed on after the last time I was on.
I started my client and connected. As soon I clicked enter world the UCS crashed. With the standard WIndows XP please report dialog box.
In the command window:
Database Error: Lost connection, attempting to recover....
Character TimeLastOn
TestBeasti 1304535868 <- working
GMSorvani 1304535876 <- working
XXXXXX 1304540215 <- working
TestBeasti 1304602955 <- crashed at enter world
Zothen
05-05-2011, 09:54 AM
Aye, thats the error I encountered last time, but for me it doesnt crash, it actually recovers now.
sorvani
05-05-2011, 10:12 AM
Looked at my build again and found an error on my side. When I added the mysql_init I had a syntax error. Rebuilt and restarted.
sorvani
05-05-2011, 10:59 PM
Just got in and glanced at my server, no one had logged in all day (12 hours), so I logged in a toon and wham. DB err lost connection, but no crash :) it reconnected just fine.
Thanks for the fix!
Zothen
05-06-2011, 01:50 AM
WOOT! Grats! :)
Zothen
05-16-2011, 02:59 AM
Okay, I found a cosmetic error, as I interpreted the return value of RunQuery() wrong. RunQuery() returns true when successful, so we need to edit a line:
bool DBcore::RunQuery(const char* query, int32 querylen, char* errbuf, MYSQL_RES** result, int32* affected_rows, int32* last_insert_id, int32* errnum, bool retry) {
_CP(DBcore_RunQuery);
if (errnum)
*errnum = 0;
if (errbuf)
errbuf[0] = 0;
bool ret = false;
LockMutex lock(&MDatabase);
if (pStatus != Connected)
Open();
#if DEBUG_MYSQL_QUERIES >= 1
char tmp[120];
strn0cpy(tmp, query, sizeof(tmp));
cout << "QUERY: " << tmp << endl;
#endif
if (mysql_real_query(&mysql, query, querylen)) {
if (mysql_errno(&mysql) == CR_SERVER_GONE_ERROR)
pStatus = Error;
if (mysql_errno(&mysql) == CR_SERVER_LOST || mysql_errno(&mysql) == CR_SERVER_GONE_ERROR) {
if (retry) {
cout << "Database Error: Lost connection, attempting to recover...." << endl;
ret = RunQuery(query, querylen, errbuf, result, affected_rows, last_insert_id, errnum, false);
// if ( !ret ) // <- Wrong return value
if ( ret )
cout << "Reconnection to database successful." << endl;
}
else {
...
...
...
trevius
05-16-2011, 04:55 AM
lmao, I forgot about that. I meant to correct that before I committed it. I saw that, but was working on other stuff at the time and didn't get back to it. Thanks for the reminder. I will try to get that in unless someone else does before I find time to.
Zothen
05-16-2011, 05:44 AM
I was wondering why I didnt get a reconnection message after my last crash, so I rechecked the code. :)
sorvani
05-16-2011, 11:38 AM
UCS has been running great though, thanks Zothen
Zothen
05-16-2011, 11:49 AM
Glad to hear :)
joligario
05-16-2011, 12:03 PM
Gotcha covered Trev. Committed in r1901.
vBulletin® v3.8.11, Copyright ©2000-2025, vBulletin Solutions Inc.