Ice Pond Software - Reporting Tools

Ice Pond Software
Custom Database & Software Development

Databases - First Normal Form (1NF)

Recall that 1NF sets the very basic rules for an organized database:

Eliminate duplicative columns from the same table.
Create separate tables for each group of related data and identify each row with a unique column - the primary key (PK).

What do these rules mean when contemplating the practical design of a database? It is actually quite simple.

The first rule dictates that we must not duplicate data within the same row of a table. Within the database community, this concept is referred to as the atomicity of a table. Tables that comply with this rule are said to be atomic. Let's explore this principle with a classic example - a table within a human resources database that stores the manager-subordinate relationship. For the purposes of our example, we'll impose the business rule that each manager may have one or more subordinates while each subordinate may have only one manager.

Intuitively, when creating a list or spreadsheet to track this information, we would draw up something like this:

Manager	Subordinate1	Subordinate2	Subordinate3	Subordinate4
Bob	Jim	Mary	Beth
Mary	Mike	Jason	Carol	Mark
Jim	Alan

However, recall the first rule imposed by 1NF: eliminate duplicative columns from the same table. Clearly, the Subordinate1-Subordinate4 columns are duplicative. Take a moment and ponder the problems raised by this scenario. Jim only has one subordinate - the Subordinate2-Subordinate4 columns are simply wasted storage space (a precious database commodity). Furthermore, Mary already has 4 subordinates - what happens if she takes on another employee? The whole table structure would require modification.

At this point, a second bright idea usually occurs to database novices: We don't want to have more than one column and we want to allow for a flexible amount of data storage. Let's try something like this:

Manager	Subordinates
Bob	Jim, Mary, Beth
Mary	Mike, Jason, Carol, Mark
Jim	Alan

This solution is closer, but it also falls short of the mark. The subordinates column is still duplicative and non-atomic. What happens when we need to add or remove a subordinate? We need to read and write the entire contents of the table. That's not a big deal in this situation, but what if one manager had one hundred employees? Also, it complicates the process of selecting data from the database in future queries.

Here's a table that satisfies the first rule of 1NF:

Manager	Subordinate
Bob	Jim
Bob	Mary
Bob	Beth
Mary	Mike
Mary	Jason
Mary	Carol
Mary	Mark
Jim	Alan

Now, what about the second rule: identify each row with a unique column or set of columns (the primary key)? You might take a look at the table above and suggest the use of the subordinate column as a primary key. In fact, the subordinate column is a good candidate for a primary key due to the fact that our business rules specified that each subordinate may have only one manager. However, the data that we've chosen to store in our table makes this a less than ideal solution. What happens if we hire another employee named Jim? How do we store his manager-subordinate relationship in the database?

It's best to use a truly unique identifier (like an employee ID or SSN) as a primary key. Our final table would look like this:

Manager	Subordinate
182	143
182	201
182	123
201	156
201	041
201	187
201	196
143	202

Now, the table is in first normal form.

On to ... Second Normal Form