Previous Page
Next Page

Hack 54. Use Regular Expressions in Access Queries

Sometimes wildcards aren't enough. With a little hacking, you can use regular expressions in your queries.

Although Access allows for some powerful string matching (see "Wildcard characters and the Like operator" in the Access Help system), sometimes you require an even more powerful solution. Microsoft added the ability to use regular expressions back in Version 5.0 of its Windows Scripting Engine, bringing it up to par with JavaScript. You can use this power inside an Access query as well.

Although the advanced details of regular expressions are beyond the scope of this hack, this example will get you started if you are new to the subject. If you need more information, I recommend the book Mastering Regular Expressions (O'Reilly).

In many cases it's possible to work around the lack of built-in regular expressions using Access's wildcard characters and multiple calls to different string functions, such as Left, Mid, Right, Len, and so on. However, once you see what you can do with a single custom function call, you can imagine the advanced possibilities and time savings.

5.16.1. Creating the Custom Function

The first thing we need to do is create a function that can be called from our Access query that ties into the Microsoft Scripting Runtime library.

This hack assumes the machine you are running has the latest version of Microsoft's Scripting Engine installed. If you are unsure, visit http://www.microsoft.com/scripting.


The following code uses the CreateObject function so that you don't have to check the Referenceeach time the code is placed in a new database:

     Public Function RegExp(strString As String, _
         strRegExp As String, Optional bolIgnoreCase As Boolean = False) As
     Boolean
         Dim re As Object
         Set re = CreateObject("vbscript.RegExp")
         re.Pattern = strRegExp
         re.IgnoreCase = bolIgnoreCase
         If re.Test(strString) Then
             RegExp = True
         Else
             RegExp = False
         End If
     End Function

The function has two required parameters: the string being matched against and the string that contains the regular expression. The third, optional parameter tells the function whether to match the regular expression while ignoring the case; the default won't ignore the case.

5.16.2. Creating an Example Query

As an example, let's look at verifying part numbers by finding those that don't match a given criterion. Many times, you might receive data from multiple people and platforms that needs to be cleaned before going into a master database. Let's say that part numbers for a factory have the following criteria:

  • They must start with a capital PN or a capital P.

  • The next two positions must be numeric.

  • The next position must be a capital letter (AZ).

  • The next three to four positions must be numeric.

  • The next five to six positions must be capital letters (AZ).

Examples of part numbers that meet the criteria include PN12W123ABCDE and P12W123ABCDE. Examples that don't meet the criteria include PN12W13ABCDE (only two digits after the W) and 12W123ABCDE (doesn't start with PN or P).

Given the set of criteria for the part number, here's the regular expression:

	"^(PN|P)[0-9][0-9][A-Z][0-9]{3,4}[A-Z]{5,6}$"

As mentioned earlier, these regular expressions can become quite overwhelming until you get used to them. If you aren't familiar with them, I strongly recommend additional reading to learn the full power of these expressions. To better understand it, let's break down this expression:


^

Tells the expression to start at the beginning of the string


(PN|P)

Says to match the characters PN or P


[0-9][0-9]

Tells the expression to match two digits, both in the range 0 through 9


[A-Z]

Says to match a single character A through Z


[0-9]{3,4}

Says to match a single digit 0 through 9 at least three times and a maximum of four times


[A-Z]{5,6}

Says to match a single character A through Z at least five times and a maximum of six times

Figure 5-53 shows the layout for a query to find part numbers that don't match our criteria.

Figure 5-53. Calling the RegExp function from a query


Running the query in Figure 5-53 returns the part numbers that do not match our given criteria so that you can review them before placing them into a master database. Although you can do this without tapping into the power of regular expressions, it requires a much more involved solution.

5.16.3. Hacking the Hack

As you discover the power of regular expressions, you will find them to be very robust for all kinds of text processing. Another handy trick is to use them to verify text input on a form. To do so, call the custom RegExp function from the BeforeUpdate event of the text box. If it returns false, set the Cancel parameter variable to true, which clears the input on the text box.

You can even add an advanced feature to your application, which allows the user to do searches based on her own regular expressions!

Steve Huff

    Previous Page
    Next Page