Relational database theory part 1—How I got here

In which the author reflects on some of the better aspects of a generally misspent youth.

It must have been 1995, and I was working as a developer at LargeFinCo*, primarily as a mainframe COBOL coder. I had gotten pretty good at using the IBM IMS database, which in computer science terms is a navigational database: one basically navigates an inverted tree structure. The database administrators were promoting the use of DB2, IBM’s relational database product.

I was conservative, technologically, and skeptical of all new things from IBM due to their track record of releasing buggy or short-lived products. I resisted the use of DB2. A friend expressed his excitement about DB2 and relational databases in general, and at some point I thought “Maybe there are some new database concepts I ought to learn about”. I purchased a copy of Chris Date’s book “An Introduction to Database Systems”, and this small act, as can sometimes happen in life, changed the trajectory of my career.

In a high school math class, perhaps in 1976, I recall set operations being discussed while a Venn diagram was displayed on the chalkboard. Little did I know that the day or two we spent on set theory was a glimpse of my destiny. So, back to LargeFinCo in 1995: Date’s book had the two qualities I like to see in a technology book, in that it was authoritative and comprehensive. It turned out I had a knack for writing SQL, the relational database query language, and I quickly became a go-to guy at work for people with difficult SQL questions, which remains true today. At some point, perhaps a few years later, I was exposed to another relational product, Sybase SQL Server. This provided another glimpse into my future, as that product evolved into Microsoft SQL Server.

LargeFinCo, in the 1990s, had another process which proved indispensable to me: Database Design meetings. On a project which would use DB2 or Sybase, or perhaps even IMS back then, the developers, system users, and database administrators (DBAs) held a series of meetings in a room with lots of whiteboards. The DBAs would take the system users through a series of questions about their data and prepare entity-relationship diagrams, which formed the database design they would implement. At the time, they used crow’s foot notation to indicate cardinality among tables. Perhaps I bonded to it as the first approach I saw, as a baby animal bonds to the first thing it sees upon birth, because I still use crow’s foot notation today, some 30 years later.

As with many computer design discussions, there were disputes among the participants about various points of the design, and at times these devolved into raised voices and anger, mine not least among them. It turned out that in an environment where the smallest nuances of design were endlessly debated, I received a comprehensive understanding of the design concepts, brand new to me then, which have formed the foundation of all of my design work since. Reflecting on the decades, I have seen that the fundamentals of relational database design have not changed much from Chris Date’s articulation of them, and the process of meeting with system users to document how their data can be properly contained within relational database tables has changed very little as well. The moral of this story is that once you learn the fundamentals, you are good to go.

I have many system implementations under my belt, and one thing I noticed pretty early on with relational databases was that once properly designed and implemented, the systems using these databases tended to be stable and the data accurate. When they were improperly designed or implemented, I saw systems which were unstable, required constant attention from programmers, and had multiple data quality problems. One indicator of an improperly designed relational database is that developers and report writers use ‘select distinct’ frequently in their queries, even at multiple levels of a complex query, to suppress duplicate rows that a sound design would not have produced.
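A minimal sketch of that symptom may help. It assumes two hypothetical tables, customer and orders, and uses Python’s built-in sqlite3 module rather than DB2 or Sybase, purely so that it can be run anywhere. The customer table has no primary key, so the same customer has been loaded twice; ‘select distinct’ hides the duplicated join rows, but the underlying data is still wrong and a total computed over the same join is doubled.

```python
import sqlite3

# In-memory database; table and column names are illustrative assumptions.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customer (customer_id INTEGER, name TEXT);  -- no primary key declared
    CREATE TABLE orders (order_id INTEGER PRIMARY KEY, customer_id INTEGER, amount INTEGER);
    INSERT INTO customer VALUES (1, 'Acme'), (1, 'Acme');    -- same customer loaded twice
    INSERT INTO orders VALUES (100, 1, 250);
""")

join_sql = """
    SELECT {distinct} c.name, o.order_id, o.amount
    FROM orders o JOIN customer c ON c.customer_id = o.customer_id
"""

# The plain join returns the single order once per duplicate customer row.
plain_rows = conn.execute(join_sql.format(distinct="")).fetchall()

# DISTINCT collapses the duplicates, so the report looks correct.
distinct_rows = conn.execute(join_sql.format(distinct="DISTINCT")).fetchall()

# An aggregate over the same join still counts the order twice.
total = conn.execute("""
    SELECT SUM(o.amount)
    FROM orders o JOIN customer c ON c.customer_id = o.customer_id
""").fetchone()[0]

print(len(plain_rows), len(distinct_rows), total)
```

The one real order appears twice in the plain join and once with DISTINCT, while the summed amount is twice the true figure. Declaring customer_id as the primary key would have rejected the second load and removed the need for DISTINCT altogether.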

Relational database theory is beautiful to those who find certain mathematical concepts beautiful, and it is fundamentally a mathematical creature. The theory is determinate and not complicated, once the fundamentals are mastered. Since getting past the initial ‘boot camp’ learning process, I have found it enjoyable and satisfying to build a solid and reliable system.

I note that over the years there have been data storage concepts that tout the advantage of not having the strict structure of proper relational design. One, called ‘NoSQL’, appears to have been named in defiance of relational database theory. There seems to be some sort of emotional appeal to freedom from rules and a minimal learning curve, which makes things ‘easy’ to use. Some years ago I noted Chris Date arguing endlessly in internet forums (or pre-internet blogs) against those who advocated for inaccurate relational designs, or who dismissed the entire concept as obsolete. Yet, as concepts and products have come into fashion and fallen out of it, a properly designed relational database has continued to provide a powerful and accurate storage mechanism in the right circumstances.

Certainly a relational database is not the appropriate tool for all storage processes, and at times I wonder to what degree even an entire organization lacks the skills to build a good relational database. I have seen more than one vendor product with hundreds, or even thousands, of tables with zero primary keys declared and zero enforced referential integrity.
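As a rough illustration of what those missing declarations buy you, here is a small sketch under the same assumptions as before: hypothetical customer and orders tables, and Python’s built-in sqlite3 module standing in for a full database server. The helper try_insert is my own name for illustration. With a primary key and a foreign key declared, the database itself refuses a duplicate customer and an order pointing at a customer who does not exist.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite leaves foreign key enforcement off unless it is switched on per connection.
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
    CREATE TABLE customer (
        customer_id INTEGER PRIMARY KEY,
        name        TEXT NOT NULL
    );
    CREATE TABLE orders (
        order_id    INTEGER PRIMARY KEY,
        customer_id INTEGER NOT NULL REFERENCES customer (customer_id),
        amount      INTEGER NOT NULL
    );
    INSERT INTO customer VALUES (1, 'Acme');
""")

def try_insert(sql):
    """Run one INSERT and report whether the database accepted it."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(sql)
        return "accepted"
    except sqlite3.IntegrityError as exc:
        return f"rejected: {exc}"

# Second row with the same key: blocked by the primary key.
duplicate = try_insert("INSERT INTO customer VALUES (1, 'Acme again')")
# Order for customer 99, who does not exist: blocked by the foreign key.
orphan = try_insert("INSERT INTO orders VALUES (100, 99, 250)")
# Order for an existing customer: allowed.
valid = try_insert("INSERT INTO orders VALUES (101, 1, 250)")

for label, outcome in [("duplicate", duplicate), ("orphan", orphan), ("valid", valid)]:
    print(label, "->", outcome)
```

Without the two constraints, all three inserts would succeed, and the bad rows would only surface later as data quality problems in reports.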

As I reflect on a 45-year career as a programmer, and as a man nearing retirement (or at least retirement age), I am proud of much of what I have learned and done. Relational database theory and SQL coding have in part paid for my house, cars, and the raising of my children. All of this was based on a passing thought I had one day: “Maybe there are some new database concepts I ought to learn about”.

* In my career, I spent many years working at large financial companies which I will refer to as LargeFinCo. They had large mainframe systems with DB2, and some had Sybase SQL Server. The specific identity of these companies is, ultimately, of no importance.

Mark C Knutson

All content Copyright 2023 Mark C Knutson
