Disclaimer: When I wrote this I was not an employee of Oracle, a RDBMS vendor. I am now, and you should know that.
First, please go read Karen Lopez’s rant on Normalization Myths here.
Now, a bit of my experience with relational database theory, design, and practice.
As a senior Computer Science undergraduate I took a 200 level course on ‘Databases.’ We were taught set theory, a bit of algebra, Boyce-Codd, what it means to have a ‘relational database’, first, second, third, and Boyce-Codd normal forms, and we used Oracle for our examples and lab work.
In my first (and only interview) coming out of school, the President of the software company asked me to explain what 3NF really meant. I totally flubbed the answer, but still managed to land the job. That was about 13 years ago, and I think I have a pretty good practical understanding of database design.
What I have a really good understanding of now is the frustration and problems caused by software vendors that hard-code their relations into their software and completely neglect the database. It’s so common that I would say it happens more often than not. One of the biggest offenders is in fact a database vendor, AKA Oracle!
Why don’t vendors have foreign keys in their apps?
Great question! I don’t have a good answer. My theory is that most systems are designed by developers who don’t understand databases or how their data works. But, here are the most common responses I hear from architects or DBAs when ‘confronted’ about the lack of normalization and foreign keys in their database:
- It’s bad for performance
- We need a design that can work on multiple database platforms
- Muttering and staring at shoes or something over your shoulder.
I am not going to debunk the first two points, but I will say that I think they are both junk. Here is one example that proves adding foreign keys actually IMPROVES performance.
What I Think is Really Happening
Someone a really, really long time ago heard that foreign keys caused performance issues or didn’t understand why they were important, so they weren’t considered during the design phase. In many cases there may have been no data architect and instead an application developer designed the data model.
Their system ‘works’ and now many years later it would be extremely expensive to fix the model. No one wants to spend money on a problem that isn’t creating problems for the end-users. To expect a vendor to completely re-design their database with no promise of additional revenue will only cause more disappointment.
Who Really Suffers
The people that support the system suffer. If they need to fix bad data or figure out what tables are in play for a complete ‘record’, then it’s going to be a nightmare. The business analysts suffer because it will take them many months to get the relationships straight when the presence of foreign keys would literally draw the picture for them in minutes. And of course the data suffers. Since the application is the only one preserving the business rules, any one that accesses the data outside the application is likely to screw it up.
I hate, hate, hate this. It’s unacceptable. If I were a customer evaluating a multi-million dollar ERP or CRM system, I would totally give mega-bonus points to a vendor that attempted to normalize their data model and used foreign keys to preserve the relationships IN the database. I’ve talked about this before here, if you can’t get enough of my foreign key rants
What Can You Do?
If you are ever tasked to build an application and you are going to store data in a database, spend some time and design a proper data model. If you don’t know how to do that, find someone who does and pay them good $$$ for their advice. This is a design decision that will affect you and your customers for decades.
PS I wanted to call this post Turd Normal Form, but I chickened out