Normalization 102: Atomicity -- A place
for everything and everything in its place.
I walked into a friend's workshop last weekend and noticed that
he had an outline on a pegboard for each of his tools. There was one
empty slot for a screwdriver which was in his hand. I remarked how
well organized he was. He said that by keeping his tools organized
he was able to save time because he knew exactly where everything
was. In addition, if something was not where it was supposed to be,
he knew it immediately just by glancing at his pegboard.
We'll continue our ongoing discussion of normalization by
addressing a topic in this month's newsletter inspired by my
friend's organized workshop: atomicity. Atomicity is a critical
feature of a normalized database. Like the tools on the pegboard, a
well-designed database will put each data element in its correct
place. This month, we'll define atomicity and simplify the jargon to
help you understand a basic building block of a solid database
application. In addition, we'll cover a few more tricks to show you
how to manipulate data elements when the need arises.
First you take the Pieces Apart (aka
Atomicity) |
|
The concept of atomicity is quite simple: break data into
its smallest component piece and put each piece into its own
field. The idea here is that before you design a database, you
must analyze your data and break it down to the lowest level
of detail. Data in this low level of detail is then stored in
the database as a data element. This concept was inspired by
atoms, the smallest units of a molecule - hence atomicity.
This is the reason that good database design dictates
separate fields for first and last names. Take the following
two tables as an example
In the table CustomerWrong on the left, the first name and
last name are joined into a single field. This works just fine
when we need an alphabetical list of customers with their last
name first and first name last. But if we need to do anything
else, say write a form letter (which would require changing
the order of the names), we would have to extract elements
from (or parse, in database lingo, which we'll cover next
month) the single field FullName into two separate pieces and
then reorder the first name and last name elements.
It is possible, in this example, to extract the last name
(Hawke) and the first name (Thomas), but it does require some
effort. This is the same reason we separate City, State and
Zip code into separate fields. Of course we don't take this to
the extreme of separating a Street address into Street Name
and number, because for most databases there is never (and we
have to be comfortable that we really mean never, either
currently or in the future) a need to separate those elements.
|
Then You Put the Pieces Back Together |
|
It's much easier to start with separate fields for
FirstName and LastName (as shown in the CustomerRight table
above on the right)and then manipulate the pieces as we need
them. For instance, if we wanted to create a list of clients
with their last name and then first name, we would create a
query as follows:
When we run this query, we get the following:
What we've done here is called concatenation, essentially
hooking data elements from separate fields together. We
created a query to pull data elements together in a common
field in the order that we want them. What we are doing here
is hooking three things together: the LastName, the FirstName,
and the stuff we want between them.
To understand this, let's look at each element. The title
of the column created in the query is "Full Name", which we
established by typing it into in design view in the Field row
followed by a colon (See the Design View picture above). This
tells Access to show the label "Full Name" as the column
heading when it displays the column in the query. We follow
this with [LastName], the field name holding the customer's
last name in the Customer table. If your field names have no
spaces (like LastName), then the square brackets are added
automatically, otherwise you'll have to add them manually. The
brackets tell Access that it's dealing with a field name.
The next part of the expression is the ampersand symbol
"&", which is a signal telling Access that you want to
join two things together. Next, we tell Access what we want
between the fields LastName and FirstName. In this case, we
want a comma and then a space, so we follow the first
ampersand with a double quote, a comma, a space, and another
double quote. The opening quotes tell Access that a string is
coming up; whatever we type between the opening and the
closing quotes will be hooked on to whatever was on the other
side of the ampersand. The closing double quotes tell Access
that this is the end of the string. Lastly, we add one more
ampersand and the First Name field [FirstName]
This may seem a bit confusing, but once you understand the
concept of concatenation and have done it a few times, you'll
discover that it's an easy technique. The possibilities with
this technique are endless; it can be used to join fields such
as city, state and zip code (which should also be stored in
separate fields as discussed above) when making an address
label.
|
The Advantages of Atomicity |
|
You might be saying to yourself, "Well I'm sure the only
way I'll ever use these pieces of data is in one specific
order, so I might as well start with them that way". If you
know with absolute certainty that this is the case, you may be
one of those rare people living in a static world. Most people
live in a changing, unpredictable environment. As such, they
have to be ready to manipulate their data in order to meet the
dynamic demands of their particular situation.
But remember, better safe than sorry. If you strive toward
atomicity in the beginning and start with each piece of data
in its own field, then you'll always be able to put them
together in any order as you need them. More importantly,
you'll be able to sort and filter your data by the individual
pieces whenever needed. For instance if you've properly
atomicized city, state and zip into their own fields and you
need to sort by zip code for bulk mailing purposes, it's a
much easier process if zip code is in its own field to begin
with.
|
Conclusion |
|
Getting to the smallest piece of data and putting into its
own field is a major step in developing a flexible database.
With this approach, we can manipulate the pieces of data to
give us what we want in whatever order we need them.
When you're designing your tables, keep this principle in
mind and you'll be laying a brick into what will become a
solid foundation.
|
Shameless Plug Department |
|
As I mentioned last month, I spoke at the Boston Access
Users Group's October meeting about Self Updating Combo Boxes.
As result of that meeting, I've been invited to give the same
presentation to the Connecticut Access Users group (CTAUG) in
Farmington, Connecticut.
I'll be there on Wednesday, December 8. For those of you in
the Central Connecticut /Greater New York area who would like
to see this technique in action, watch the CTAUG page for
details.
|
|
Tip of the Month - An Easy Way to Select All Fields in
a Query |
|
When designing a query we frequently want to pull all the
fields into the results.
You probably know that you can select the * in the design
view and that will show all the fields when you run the query.
But let's say you want to get all the fields into the query
design view so that you can change the order that they appear
or to set criteria.
You could of course pull them in one-by one, but this can
be tedious. A much easier way is to double-click on the table
title in the query design view. This will highlight all of the
fields, then you can just drag them into the bottom half of
the query saving lots of time. |
|