|[ Team LiB ]|
Recipe 1.8 Use a Query to Retrieve a Random Set of Rows
You need to be able to retrieve a random set of rows from a table or a query so you can identify a random sample for a research study. You can't find a way to make this happen in the normal query design grid. What's the trick to getting a random sample of a certain number of rows?
The solution to this problem is not quite as simple as it might first appear, because of the way Access attempts to optimize the use of function calls in queries. You can call a VBA function to generate a random value for each row, but to ensure that your function runs for each row, and not just once, you need to feed it a value from the row. Once you've generated the random numbers, you can sort by that random column and use a Top Values query to select a random group.
In 01-08.MDB, open tblRandom. This table includes 50 rows of data. Your goal is to pull five randomly selected rows for this set of data. To do this, follow these steps:
The general concept behind this solution is simple: you add a new column to your query, fill it with a list of random numbers, sort on those random numbers, and retrieve the top n rows, where n is a number between 1 and the number of rows in your underlying data. There's only one complicating factor: to create the random number, you need to call a function for each row. Access tries to optimize such a function call and will call it only once for the entire set of data, unless the function call involves a field in the data. That is, if you replace the call to acbGetRandom (in Step 3) with a simpler call directly to Access's random number function (Rnd), you'll find that every value in every row will be exactly the same. Access's query engine thinks that the function has nothing to do with data in the query, so it calls the function only once. This makes the random number meaningless, as the whole point of using a random number is to generate a different one for each row.
The workaround, though, is simple: pass a field, any field, as a parameter to the function you call. That way, Access believes that the return value from the function is dependent on the data in each row and so calls the function once per row, passing to it the field you specify in the expression. The acbGetRandom function doesn't really care about the value you pass it, because its only goal is to get a random number and return that back to the query. Once you successfully place a random number in each row Access will sort the data based on that number, because you specified Ascending for the column's sorting.
Finally, by specifying the TopValues property for the query, you're asking Access to return only that many rows as the result set of the query. If you want a certain percentage of the total rows, change it by adding the % sign after the Top value.
The acbGetRandom function includes a call to the VBA Randomize subroutine. By calling Randomize, you're asking Access to give you a truly random result every time you call the function. If you omit this call, Access gives you the same series of random numbers each time you start it up and run this query. If you want a repeatable series of random rows, remove the call to Randomize. If you want a different set of rows each time you run the query, leave the Randomize statement where it is.
Because Access will pass a field value to the acbGetRandom function for each and every row of data in your data source, you'll want to optimize this function call as much as you can. If possible, use either a very short text field (zip code, for example) or, even better, an integer. You must pass some value, but you want it to be as small as possible to minimize the amount of information that must be moved around for each row of the data.
|[ Team LiB ]|