Custom Software
The Access Wizard Newsletter Tips Tricks and Traps for Access Users and Developers
December 2004

Parsing – or the Art of Pulling Things Apart

In most of life putting things together is a whole lot more difficult then taking things apart. With databases, however, just the opposite is true. Pulling things apart requires significantly more effort than putting them together. In last month's issue we talked about atomicity and how easy it is to put elements together as long as they are stored in their smallest logical pieces. As you may recall, I encouraged you to carefully consider atomicity as you are planning and designing your databases, and I also showed you how to put elements together as needed.

This month we'll tackle a problem that I frequently see - the case where multiple data elements are stored in a single field. To get them apart we'll use parsing.

In this issue
  • Tip of the Month - Recover from those Oh-Oh Moments!
  • The Building Blocks of Parsing
  • Step 1: Locating the Lever
  • Step 2: Isolating the Last Name
  • Step 3: Isolating the First Name
  • Conclusion
  • Trap of the Month

  • The Building Blocks of Parsing

    Let's say you have a customer database and for whatever reason, the first and last names of your customers are stored in a single field called FullName, as shown below:

    What do you do if you want only the last name of each customer, or the first and last name in a different order (as would be required if you were writing a form letter to your customers)? The answer is parsing, or taking data apart. You can write a query to isolate, or parse, the data elements you need.

    Let's imagine that we want to write a form letter to our customers. We'll need to parse the field FullName in our database to extract their first and last names to change the order. Our plan of attack has three basic steps:


    Step 1: Locating the Lever

    First, determine how you can separate the pieces of data. In our example, we have a comma between the last name and first name, so we can use the comma as a lever to separate the two pieces. Other characters, like semi-colons, slashes, or dashes will also work, as long as they are used consistently in a given field. Spaces can work too, but they're a little tougher to work with since they may appear multiple times.

    In this first step, we'll write a query using an Access function called "InStr" to determine where the comma falls in the string. InStr, which can be used to locate a character or a fixed string of characters, will give us a number that tells us the character location of our lever (in this case, a comma) in our FullName string containing [LastName], [FirstName].

    Here is an example of a query using the function InStr to determine the position of the comma in the field FullName:

    To use the function InStr, we have three arguments within the parentheses: first is which character we should start searching from (in our case the very first character - 1; second, the name of the field (FullName); and third, the character (or string of characters) we want to locate, in quotes. These arguments are separated by commas.

    When you run this query, you get the following results:


    Step 2: Isolating the Last Name

    Now that we know where the comma occurs in the string, the second step is to run a query to extract everything to the left of the comma and put it in a separate field labeled Last Name. We do this by utilizing a function called "Left". The Left function returns all the characters to the left of the position of the lever (in our case, the comma) that we identified using the InStr function in step one.

    So we create a query as follows:

    To use the function Left, we have two arguments within the parentheses: the first argument is the string we are starting with, and the second argument is the number of characters to return. This piece is just a tad tricky since it changes from record to record, that's why we're using the InStr function (from Step one) as our second argument.

    As you can see the InStr function is nested as an argument within the Left function in this query. The innermost function InStr returns a number that determines the position of the comma in the string of characters that make up [LastName], [FirstName] (our FullName field). Then, the function Left extracts all the characters to the left of the comma. Notice the -1 at the end of the line? This is actually a mathematical calculation that tells Access that we don't want the comma-we want only the characters starting one to the left of the comma and continuing leftward to the beginning of the string. These characters will be reported in a column labeled Last Name.

    When we run this we finally get what we want:


    Step 3: Isolating the First Name

    Although the process of isolating the first name is similar to isolating the last name, it is slightly more complicated. At the onset, let me warn that this step is actually more complicated to explain then it is to do. This is due to the fact that we have to determine the endpoint of the string to the right of the comma (our lever and reference point) to isolate the first name (theoretically, the string could be infinitely long). Stay with me, because once you understand the principle behind this technique, you'll be able to experiment to get the results you want.

    To parse the first name, you'll use a routine similar to isolating the last name, except this time you'd use the "Right" function along with another function called "Len". The Len function takes a single argument -- the string you want to know the length of. In our case we use Len(FullName). For both "Hawke, Thomas" and "Litter, Kitty" that function returns 13, for "Nasium, Jim" we get 11. Keep in mind that things like commas, spaces and periods are characters, even though in some cases (like spaces) we can't see them. We'll use the Len function and a mathematical calculation to isolate the first name string.

    Here is the query, which shows three nested functions (InStr, Len, and Right):

    To use the function Right, we have two arguments within the parentheses: the first argument is the string we are starting with (FullName), and the second argument indicates the number of characters we want to return.

    Again, the tricky part of this step is that we have to tell Access a number that depends on the string we start with. When we used the Left function, the query returned all the characters from the comma to the left, all the way back to the beginning of the string. In this case, our string to the right of the comma could be infinitely long. That is why it is necessary to do some math in the query.

    When you run this you get:

    OK, so let's pull this one apart to make sure it's clear:

  • We use the InStr function to determine where the comma lies in the string FullName. Remember, this function returns a number.
  • We use the function Len to give us a number that tells us the length of the string FullName.
  • We tell Access that we want the rightmost characters from the string from one space after the comma.

    As you can see in the query, we are doing a mathematical calculation to get the number of characters we want:

    [the total number of characters] - [the length of the string up to the comma] - 1 (to get rid of the space between the comma and the first name).

    This gives us the number of characters in the first name.

    At this point you have parsed out the First Name and the Last Name into separate fields. Writing the greeting in your form letter is as simple as joining (or concatenating) them together in the correct order as we talked about in the November Wizard.

    You can imagine that if you had a middle initial or middle name as well, life would become even more complex. In that case you could use similar techniques and focus on the string after you've stripped out the first name. Once it gets to that level of complexity though, it's much easier to do it in Visual Basic for Applications (VBA) code. VBA is considerably easier to troubleshoot, and it allows you to visualize more clearly what's going on. Check out your local bookstore or library if you are interested in learning more about VBA. They may have some books to get you started.


  • Conclusion

    As you can see, parsing involves considerably more work than the joining (or concatenating) we did last month. This is a factor which should motivate you take the time to analyze (and atomize) your data in the beginning, and set up your tables and fields to avoid this problem altogether.

    However, when you are faced with more than one piece of data meshed into a single field, you have the tools of parsing available to you to help pull them apart.


    Trap of the Month

    Have you ever noticed that sometimes when you open a form you get the property box? I personally find this behavior of Access really annoying. What's going on is that because you had the property box open when you last saved the form in design view, you get it when you run the form. To make this problem go away, follow these steps:

  • Open the form in design view
  • Select the form by clicking the grey box in the upper left hand corner (you should already have the properties box visible, but if not choose View, Properties)
  • Click on the tab marked Other
  • In "Allow Design Changes" at the bottom of the list, choose "Design View Only"


  • Tip of the Month - Recover from those Oh-Oh Moments!

    Have you been working in Access and had one of those "Oh-Oh" moments? You just accidentally hit the backspace key and deleted something, or changed something that you really didn't want to change. In many cases you can recover using "undo". Simply hit Ctrl-Z or choose "undo" from the Edit menu. It won't work in every single case (sometimes what you do is permanent), but if you run into a situation where you wish you could take back those last keystrokes, give "undo" a try and see if you can easily recover.

    Quick Links...

    Register Now

    Newsletter Archive

    Related Topics

    More About Us



    Join our mailing list!

    Forward email

    This email was sent to jim@custom-software.biz, by jim@custom-software.biz
    Powered by

    Custom Software | - | Westford | MA | 01886