Relational databases are powerful for many reasons, but you cannot deny the ‘relational’ part is pretty compelling. Rows or ‘tuples’ in a table are comprised of related data. So for one row in an Employee table, you can assume that each value or column pertains to that particular Employee.
Of course you can’t – or rather shouldn’t – put everything about said employee in a single row or table. I hate using absolutes, but in general it’s good practice to normalize your data. Just because Oracle allows you to use 1,000 columns in a table doesn’t mean you should.
On a side note – curious why developers are crazy about the NoSQL movement? Cassandra just announced (story) they will support 2,000,000,000 (that’s 2 BILLION) columns per table.
So odds are your database has more than one table, or at least your application is using more than one table. There is a fair chance that some of these tables will be RELATED. The ANSI SQL standard allows for these relationships to be defined and enforced via the database. To quote the ANSI shopping site -
“Defines the data structures and basic operations on SQL-data. Provides functional capabilities for creating, accessing, maintaining, controlling, and protecting SQL-data.”
The ‘controlling and protecting’ piece of that quote is not accidental. The database is setup to protect the data. A smart DBA once said that protecting the data was the core of their job. Foreign Keys are one of the databases mechanisms for protecting the integrity of the data across tables. For example if you delete a department from your HR system, you want to make sure you don’t have any employees assigned to that department after the fact. Or going the other direction, if you hire a new employee, you want to make sure they belong to a department. A foreign key constraint can be employed to enforce this business rule.
So why do application developers INSIST on hard-coding the relationship into their applications?
Has this ever happened to you? The first day on the job, you get your database credentials, you login to the database to answer a question for someone, and you have no idea how the data is structured? And to make matters worse, there are no foreign keys, so the database doesn’t know either.
And this is not limited to folks of the Oracle persuasion. I see this rampant across the relational database spectrum (yes, you too MSFT SQL Server.)
I have a few theories why this happens:
Maybe there was a kernel of truth in some of these assumptions at one point, but I do not see anything that supports those theories today.
Pity the poor people who have to query this data.
First you have to dig to find out how the data is related. Most modelling tools will rely on foreign keys to document relationships. Smarter tools will make a best-guess from common-named column names, but this is just a guess.
Any time you write a query against these related tables, you will need to write the JOIN manually. Your database IDE will be handicapped, and it will take you longer to write your code as a result. Hey, how fun is this to write?
SELECT ...
FROM SH.COSTS
JOIN SH.PROMOTIONS
ON (COSTS.PROMO_ID = PROMOTIONS.PROMO_ID)
JOIN SH.TIMES
ON (COSTS.TIME_ID = TIMES.TIME_ID)
JOIN SH.PRODUCTS
ON (COSTS.PROD_ID = PRODUCTS.PROD_ID)
JOIN SH.CHANNELS
ON (COSTS.CHANNEL_ID = CHANNELS.CHANNEL_ID)
JOIN SH.SALES
ON (SALES.PROMO_ID = PROMOTIONS.PROMO_ID)
JOIN SH.COUNTRIES
ON (CUSTOMERS.COUNTRY_ID = COUNTRIES.COUNTRY_ID)
JOIN SH.CUSTOMERS
ON (SALES.CUST_ID = CUSTOMERS.CUST_ID)
JOIN SH.TIMES
ON (SALES.TIME_ID = TIMES.TIME_ID)
JOIN SH.PRODUCTS
ON (SALES.PROD_ID = PRODUCTS.PROD_ID)
JOIN SH.CHANNELS
ON (SALES.CHANNEL_ID = CHANNELS.CHANNEL_ID)
The big boys should know better. But they don’t.
It’s easy to assume that this only happens with emerging companies. But I see it in enterprise class solutions. In some cases they will define some foreign keys, but not all of them. I JUST DON’T GET IT.
So developers and architects – please explain it to me – why are foreign keys such a foreign concept?
OK Jeff, we get the point, but what are we supposed to do, redo the entire database design?
That would make me and all of your customers very happy. It won’t make you any more money, and sadly that is probably going to prevent it from ever happening. But going forward, if you are going to build something, do it right.
I love you guys.
I’ve been a little hard some people on this post, and I am more than ready to soften my position if the evidence supports it. I want everyone to remember that ‘I love you guys.’


Twitter
RSS
GooglePlus
Facebook
Response: Normalization Myths that Really Make Me Crazy
Jun 15, 2011 @ 21:31:04
[...] foreign keys to preserve the relationships IN the database. I’ve talked about this before here, you can click here to hear me rant even more if [...]