Previous Page
Next Page

Hack 55. Test for Duplication

Before you insert multiple entries into master tables in a busy data-entry environment, you'll need a custom validation process to avoid duplicated data.

Just because a database is deployed on the server doesn't mean the entire application must be in that server copy. A common approach is to put the data in the server database and distribute the forms to the local client computers, inside another Access file. This is a typical Access version of a client/server application.

Because the client installations are Access databases, using tables in the client databases opens up possibilities. One useful technique is to have new data entries go into local tables first, and later to bulk-insert them into the master table or tables on the server.

The heart of this technique is that entry operators have local tables that mirror the server tables. Their forms are bound to the local tables and all entry is done locally. At the end of the day, or at scheduled times throughout the day, a process runs that takes the data out of the local tables and moves it to the server tables.

Here is where the advantage lies. The insert process gathers the input from all the entry operators' client tables and tests for duplication before the actual insert into the server tables. Normally, such an interim validation would be overkill because data usually is validated upon entry. However, the point here isn't really to validate data in terms of correct content, but rather, to see if duplicate records were entered during input.

This is certainly a possibility in a busy customer service, sales, or telemarketing operation. For example, in an environment where phone calls come in, it is possible that Jane places an order for something and, an hour later, her husband Joe places an order for the same item. If different operators handled the husband and wife, no one would be the wiser that this is a duplicate order. Even the two created records might not be actual duplicates because the first name is different in each record. But if a custom-designed validation process is used, these two records can be flagged as duplicates because at least the address is the same in both records.

You also can test for near duplication on the server, so this begs the question: why bother with the separate table-entry approach? The answer is performance. If all entry goes straight to the server tables, and the custom duplication process runs on the larger tables, there could be some issues with speed.

Another issue to consider is how far back in time to look for duplicates. With the local approach, the test certainly is done at least at the end of the day, if not during scheduled times throughout the day, all for that day's processing. Involving older records in the server tables isn't necessary. An order placed twice in one day is probably a duplicate. An order that resembles one placed last week or last month is probably a repeat order.

Andrea Moss

    Previous Page
    Next Page