Section 6.10. List Generation

Use map instead of for when generating new lists from old.

A for loop is so convenient that it's natural to reach for it in any situation where a fixed number of list elements is to be processed. For example:

    my @sqrt_results;
    for my $result (@results) {
        push @sqrt_results, sqrt($result);
    }

But code like that can be very inefficient, because it has to perform a separate push for every transformed element. Those pushes usually require a series of internal memory reallocations, as the @sqrt_results array repeatedly fills up. It is possible to preallocate space in @sqrt_results, but the syntax to do that is a little obscure, which doesn't help readability:

    my @sqrt_results;

    # Preallocate as many elements as @results already has...
    $#sqrt_results = $#results;

    for my $next_sqrt_result (0..$#results) {
        $sqrt_results[$next_sqrt_result] = sqrt $results[$next_sqrt_result];
    }

You also have to use an explicit index if you preallocate. You can't use push, because the array already contains the preallocated (undefined) elements, so push would append each new value after them.
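To see why push and preallocation don't mix, consider the following minimal sketch. The sample values in @results are made up purely for illustration:

```perl
use strict;
use warnings;

# Hypothetical sample data, for illustration only
my @results = (1, 4, 9);

my @sqrt_results;

# Preallocate three (undefined) elements...
$#sqrt_results = $#results;

# ...then push, which appends AFTER the preallocated elements
for my $result (@results) {
    push @sqrt_results, sqrt($result);
}

# @sqrt_results now holds six elements: (undef, undef, undef, 1, 2, 3)
print scalar(@sqrt_results), "\n";    # prints "6"
```

The three square roots end up in slots 3 to 5, leaving the preallocated slots 0 to 2 undefined, which is almost certainly not what was intended.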

The alternative is to use Perl's built-in map function. This function is specifically aimed at those situations when you want to process a list of values, to create some kind of related list. For example, to produce a list of square roots from a list of numbers:

    my @sqrt_results = map { sqrt $_ } @results;

Some of the benefits of this approach are very obvious. For a start, there's less code, so (provided you know what map does) the code is significantly easier to understand. Less code also means there are likely to be fewer bugs, as there are fewer places for things to go wrong.
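For readers less familiar with map, here is a small self-contained sketch of what it does. The sample values in @results are an assumption, chosen only so that the output is easy to check:

```perl
use strict;
use warnings;

# Hypothetical sample data, for illustration only
my @results = (1, 4, 9, 16);

# map evaluates the block once per element, with $_ set to that element,
# and returns the list of all the block's results
my @sqrt_results = map { sqrt $_ } @results;

print "@sqrt_results\n";    # prints "1 2 3 4"
```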

There are a couple of other advantages that aren't quite as obvious. For example, when you use map, most of the looping and list generation is done in heavily optimized compiled C code, not in interpreted Perl. So it usually runs considerably faster.
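The size of the speed difference depends on your Perl version, platform, and data, so it's worth measuring rather than assuming. The following sketch uses the standard Benchmark module to compare the two approaches. The subroutine names and the 10,000-element sample list are hypothetical, chosen only for this illustration:

```perl
use strict;
use warnings;
use Benchmark qw( cmpthese );

# Hypothetical sample data: 10,000 non-negative numbers
my @results = (1..10_000);

# The for-and-push version
sub sqrt_via_for {
    my @sqrt_results;
    for my $result (@results) {
        push @sqrt_results, sqrt($result);
    }
    return @sqrt_results;
}

# The map version
sub sqrt_via_map {
    return map { sqrt $_ } @results;
}

# Run each version for at least one CPU second, then print a comparison table
cmpthese(-1, {
    for_push => sub { my @sqrt = sqrt_via_for() },
    map      => sub { my @sqrt = sqrt_via_map() },
});
```

No particular result is shown here, because the numbers vary from one machine to the next.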

In addition, the map knows in advance exactly how many elements it will eventually process, so it can preallocate sufficient space in the list it's returning. Or rather it can usually preallocate sufficient space. If the map's block returns more than one value for each element of the original list, then extra allocations will still be necessary. But, even then, not as many as the equivalent series of push statements would require.
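As an illustration of a block that returns more than one value per element, the following sketch builds a lookup table from each number to its square root. The sample data and the %sqrt_of hash are assumptions introduced for this example:

```perl
use strict;
use warnings;

# Hypothetical sample data, for illustration only
my @results = (1, 4, 9);

# The block returns two values (the number and its square root) for each
# element, so the list that map returns is twice as long as @results
my %sqrt_of = map { ($_, sqrt $_) } @results;

print "$sqrt_of{9}\n";    # prints "3"
```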

Finally, on a more abstract level, a map is almost always used to transform a sequence of data, so seeing a map immediately suggests to the reader that a data transformation is intended. And the syntax of the function makes it easy to visually locate both the transformation itself (what's in the braces) and the data it's being applied to (what's after the braces).