Using foreign keys and cascades [Archive]

View Full Version : Using foreign keys and cascades

Rhodan

11-10-2006, 05:06 AM

I'm looking through the database and seeing that its very easy to end up with orphaned data. Say you have a grid with 40 waypoints and delete the grid table entry while forgetting to remove the 40 waypoints. You now have 40 rows of data in limbo.

Same goes with quest globals. Say charID #5623 goes out and racks up 2000 quest global entries and then the player deletes the character. Now there are 2000 rows of data orphaned and sitting in the database forever.

There is a method to associate tables and data with other tables and data so that if one is changed or deleted, the other follows suit - changing with the original table or being deleted along with the original.

I'm wondering, has this been discussed and rejected or has nobody thought to use it yet?

John Adams

11-10-2006, 07:23 AM

That would be optimizing the database to be more efficient, and I am sure it's been thought of or considered. Available time might be a factor why it has not happened yet - or compatibility across versions of MySQL. You are correct, though - cascading deletes across table relationships is what grown up databases do. However, another good reason to NOT do this is the accidental removal of one innocent record could "cascade" throughout your database and delete things you didn't mean to delete.

Not sure of the advantages. Do you honestly think 20k extra data is going to somehow lag your database any? Buy a real machine. :)

sesmar

11-10-2006, 08:36 AM

This is actually a very good point and from what I have seen Orphaned records is a problem in the database; however, I do not think that cascading updates/deletes would be the best way to fix the problem. Simply because someone could easily make a mistake when customizing their own DB and do exactly as John pointed out. I believe something a little smarter would be required. I have thought about writing a application tool that would systematically search through the DB looking for orphan records and then present a report to the user letting them know what it found and where it found it. Then the user would be able to decided which entries they want to delete and the program would delete them. I do intend to do something along these lines as soon as I am done with current projects I am working on.

Rhodan

11-11-2006, 04:16 AM

My brother is having a look at the database now. He's been creating commercial database systems since the 1970s and currently gets flown around to create/update/repair payroll and inventory systems for some of the largest companies in the Province.

He's hooked on the game and wants to see if he can make the database easier to maintain. I had a chat with him last night and he too isn't a huge fan of cascading deletes (though in some instances he says it just makes too much sense not to). He prefers external utilities and/or stored procedures.

Last night he was just getting familiar with MySQL's syntax (he's mostly Oracle and MSSQL) and wants to peruse the database for orphaned rows etc.

Another thing we talked about was index numbers. I described how most tables use the zone number as the basis for row ID to make it easier for us humans to figure out what zone the record is for. I also talked about the idea of reserving the initial set of IDs for the main database developer (e.g. peq devs use 3000-3999 for zone 30) and add-on people would use a higher numbered system. Say we leave 1000 zones for the origninal database devs (ID 0 to ID 100999) and add-on devs would start at 110000.

One issue with existing utilities is that when they create a new index number they typically just get the highest index number and add 1. While this is safe, it doesn't follow the convention of using the zone number nor does it make use of gaps in the index sequence caused by deletes. Say I made 25 spawns in zone 113000 and deleted spawn # 12. If a new spawn is added, it should choose the lowest free ID, #12 and not #26. 1000 spawn IDs might seem like a lot but if you edit the zone semi-frequently, deleting old spawns and creating new ones, it wouldn't take long to run out of IDs if you didn't go back and re-use the "free" IDs.

Although I know about transactions (*safe* database operations that can be rolled back if you don't like the results) I've never actually used them. I'm assuming my brother would ONLY use transactions, showing the user the results before the user commits to the changes. I'm going to have to get him to show me how that works ;)

Anyway, the first thing he's going to do is figure out the table relationships then he's going to start writing utilities to check for and clean up orphans. After that he'll start looking at simplifying deletes and creating new IDs within the numbering convention.

Of course, in a couple weeks he has to fly off to Radium or some other logging town and work on someone's payroll system, so don't expect tons of stuff to be posted in the next month.

John Adams

11-11-2006, 11:48 AM

My background is MSSQL as well, and imo, your bro should have no difficulty seeing how things currently work - but like me, freaking out that MySQL is so limited in what it can do comparitively. MySQL works for what we're doing, and it's free. Two great perks. :) I'm sure many a DBA have reviewed this beast in the past, yet here it stays.

My interested would be in getting it faster, rather than worrying about orphaned data or ID ranges. But you'd be amazed how well MySQL runs on a dedicated FC4 box doing nothing but service MySQL to the Emu. Time to buy more hardware. :)

Rhodan

11-11-2006, 11:58 AM

Well, here's the first thing he came up with. It shows all the orphaned waypoints in the database.

select a.*
from grid_entries a
left outer join grid b on a.zoneid = b.zoneid and a.gridid = b.id
where
b.zoneid is null
order by
zoneid, gridid, number

I got 231 orphaned records in the ax_peq18d database.

I'll have to see what else he came up with during the day while I was at work.

sesmar

11-11-2006, 01:53 PM

I got 231 orphaned records in the ax_peq18d database.

That is not that bad. Where you really want to look is the Spawn Groups.

SELECT
spawngroup.id,
spawnentry.spawngroupID,
spawnentry.npcID,
npc_types.name
FROM
spawnentry
LEFT JOIN spawngroup
ON spawngroup.id = spawnentry.spawngroupID
LEFT JOIN npc_types
ON npc_types.id = spawnentry.npcID
WHERE
spawngroup.id IS NULL
ORDER BY
npc_types.name

4187 Records. And if you also look at the number of them that have a null NPC ID as well. That is more surprising.

Some other things that I have been looking into to help increase performance is Indexing; however, the current indexing seems to be what is needed.

fathernitwit

11-24-2006, 06:18 AM

just to answer questions... we tend not to use transactions, stored proceedures, foreign keys, constraints and cascading deletes because honestly... mysql sucks at being a "grown up" database... plus most of them are rather new features anyhow.. if we desired a strong DB, we would use postgres.

mysql is fast, and what makes it fast is that it dosent support (or at least didnt used to) all of the features listed above. I do agree that external utilties is the safest way to "clean" the DB of crap which might build up over time. That said, nobody is running a server (and likely ever will) on a large enough scale to make cleaning the DB really matter.

John Adams

11-28-2006, 02:17 AM

@ Rhodan -- neener, told ya so. :)

Missed me? I'm baaack!