8.7. String Evaluations

Avoid string eval.

There are numerous reasons why the string form of eval:

    use Carp;
    use English qw( -no_match_vars );

    eval $source_code;
    croak $EVAL_ERROR if $EVAL_ERROR;
    # ALWAYS check for an error after any eval

is better avoided. For a start, it has to re-invoke the parser and the compiler every time you call it, so it can be expensive and can cause unexpected processing delays, especially if the eval is inside a loop.
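To make that cost concrete, here is a minimal sketch that evaluates the same expression inside a loop in two ways. The expression string and all variable names are hypothetical, chosen only for illustration. If a string eval genuinely cannot be avoided, compiling it once into a subroutine and calling that subroutine inside the loop at least pays the parsing cost a single time:

```perl
use strict;
use warnings;
use Carp;
use English qw( -no_match_vars );

# Hypothetical expression that only becomes known at run time
my $expression = '$n * 2 + 1';

# Expensive: the string is re-parsed and re-compiled on every iteration
my $total_slow = 0;
for my $n (1..1000) {
    my $value = eval $expression;
    croak $EVAL_ERROR if $EVAL_ERROR;
    $total_slow += $value;
}

# Cheaper: compile the string once into a subroutine, then just call it
my $compiled_sub = eval "sub { my (\$n) = \@_; return $expression }";
croak $EVAL_ERROR if $EVAL_ERROR;

my $total_fast = 0;
for my $n (1..1000) {
    $total_fast += $compiled_sub->($n);
}
```

Both loops compute the same total; only the number of compilations differs. The second form still has every other drawback of string eval described in this section.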

More importantly, a string eval doesn't provide compile-time warnings on the code that it creates. It does produce run-time warnings, of course, but encountering those warnings then depends on the thoroughness of your testing regime (see Chapter 18).

This is a serious problem, because writing code that generates other code that is then eval'd is typically much harder (and therefore more error-prone) than writing normal code. And code-generating code is likewise very much harder to maintain.

Perhaps the most common rationale for using a string eval is to create new subroutines that are built around some expression the user supplies. For example, you might need to generate a range of sorting routines using different, user-provided keys. Example 8-1 demonstrates how to do that with a string eval.

Example 8-1. Creating subroutines via run-time compilation

sub make_sorter {
    my ($subname, $key_code) = @_;
    my $package = caller(  );

    # Create and compile the source of a new subroutine in the caller's namespace
    eval qq{
        # Go to the caller's namespace...
        package $package;

        # Define a subroutine of the specified name...
        sub $subname {

            
            # That subroutine does a Schwartzian transform...
            return map  { \$_->[0] }                    # 3. Return original value
                   sort { \$a->[1] cmp \$b->[1] }       # 2. Compare keys
                   map  { my (\$key) = do {$key_code};  # 1. Extract keys as asked,
                          [\$_, \$key];                 #    and cache with values
                        }
                        \@_;                            # 0. Sort full arg list
        }
    };

    # Confirm that the eval worked...
    use Carp;
    use English qw( -no_match_vars );
    croak $EVAL_ERROR if $EVAL_ERROR;

    return;
}

# and then...

make_sorter(sort_sha => q{ sha512($_)    } );   # sorts by SHA-512 of each value
make_sorter(sort_ids => q{ /ID:(\d+)/xms } );   # sorts by ID field from each value
make_sorter(sort_len => q{ length        } );   # sorts by length of each value

# and later...

@names_shortest_first = sort_len(@names);
@names_digested_first = sort_sha(@names);
@names_identity_first = sort_ids(@names);

That approach certainly works, provided you get all your backslashes in the right places. But it leaves make_sorter( ) at the run-time mercy of whoever calls it. If the caller passes a key-extraction string that is not itself valid code:

    make_sorter(sort_sha => q{ sha512($_  } );

then the error message will be generated only at runtime, and even then it will not be particularly informative:

    syntax error at (eval 11) line 11, at EOF
    Global symbol "$key" requires explicit package name at (eval 11) line 12.
            main::make_sorter('sort_sha',' sha512($_     ') called at demo.pl line 42

Worse still, if (like almost everyone else) you forget to include the post-eval error test in make_sorter( ):

        croak $EVAL_ERROR if $EVAL_ERROR;

then no error message will be seen at all. Or, rather, no error message will be produced until callers attempt to call their new sort_sha( ) subroutine, at which point they'll be curtly and unhelpfully informed:

    Undefined subroutine &sort_sha called at demo.pl line 86.
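The silent failure is easy to reproduce in isolation. The following sketch assumes a hypothetical piece of generated source with a deliberate syntax error; the subroutine name broken_sorter() is invented for the demonstration:

```perl
use strict;
use warnings;
use English qw( -no_match_vars );

# Hypothetical generated source with a deliberate syntax error (missing ')')
my $source_code = 'sub broken_sorter { return length($_[0] }';

# The eval itself does not die; it merely sets $EVAL_ERROR...
eval $source_code;
my $error = $EVAL_ERROR;    # copy immediately, before another eval can reset it

# ...so without a check, the only symptom is that the subroutine never exists
my $sub_exists = defined(&broken_sorter) ? 1 : 0;

print "eval failed: $error"               if $error;
print "broken_sorter() was not defined\n" if !$sub_exists;
```

If the two print statements were removed, this program would run to completion without reporting anything, which is exactly the situation a caller of the unchecked make_sorter( ) would be in.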

A cleaner solution is to use anonymous subroutines to specify each key extractor. Then you can use another anonymous subroutine to implement each new sorter, and install those sorters yourself using the Sub::Installer module, as shown in Example 8-2.

Example 8-2. Creating subroutines via anonymous closures

# Generate a new sorting routine whose name is the string in $sub_name
# and which sorts on keys extracted by the subroutine referred to by $key_sub_ref
sub make_sorter {
    my ($sub_name, $key_sub_ref) = @_;

    # Create a new anonymous subroutine that implements the sort...
    my $sort_sub_ref = sub {
        # Sort using the Schwartzian transform...
        return map  { $_->[0] }                 # 3. Return original value
               sort { $a->[1] cmp $b->[1] }     # 2. Compare keys
               map  { [$_, $key_sub_ref->()] }  # 1. Extract key, cache with value
                    @_;                         # 0. Perform sort on full arg list
    };

    # Install the new anonymous sub into the caller's namespace
    use Sub::Installer;
    caller->install_sub($sub_name, $sort_sub_ref);

    return;
}

# and then...
make_sorter(sort_sha => sub{ sha512($_)  } );
make_sorter(sort_ids => sub{ /^ID:(\d+)/ } );
make_sorter(sort_len => sub{ length      } );

# and later...

@names_shortest_first = sort_len(@names);
@names_digested_first = sort_sha(@names);
@names_identity_first = sort_ids(@names);

In this second version, instead of passing make_sorter( ) the key-extracting code as a string, you pass it a small anonymous subroutine. The critical difference is that these anonymous subroutines are syntax checked at compile time, so you'll get a compile-time error if there's a mistake in any of them. For example:

    make_sorter(sort_sha => sub{ sha512($_  } );

would produce a fatal compile-time error that points directly at the problem:

    syntax error at demo.pl line 42, near "$_  }"

Assuming it's passed a valid key extractor, the make_sorter( ) subroutine then creates its own anonymous subroutine (which it temporarily stores in $sort_sub_ref). Once again, this anonymous subroutine is syntax checked at compile time, so any errors are likely to be discovered before the code is shipped, whether or not make_sorter( ) is ever actually called during testing. And because the subroutine is real code, there's no need to "backslash" any of it, so it's easier to read and understand.

The $sort_sub_ref subroutine implements the requested sorting algorithm, but in this version it extracts its keys by calling the key-extraction subroutine that was passed to make_sorter( ) (and which is now held in $key_sub_ref).
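The way the key extractor receives its data is easy to miss: it is called with no arguments and reads the current element from $_, which map has aliased to each argument in turn. A minimal sketch of that mechanism, using a hypothetical build_sorter( ) that returns the sorter instead of installing it:

```perl
use strict;
use warnings;

# Hypothetical variant of make_sorter() that returns the new sorter
sub build_sorter {
    my ($key_sub_ref) = @_;

    # The returned closure keeps $key_sub_ref alive after build_sorter() exits
    return sub {
        return map  { $_->[0] }
               sort { $a->[1] cmp $b->[1] }
               map  { [$_, $key_sub_ref->()] }   # key sub reads the current $_
                    @_;
    };
}

# Sort by the last character of each value
my $by_last_char = build_sorter( sub { substr $_, -1 } );
my @ordered      = $by_last_char->(qw( xc za yb ));

print "@ordered\n";    # prints "za yb xc"
```

Because the comparison uses cmp, keys are compared as strings, just as in Example 8-2.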

Finally, having created the new sort subroutine, make_sorter( ) installs it in the caller's namespace, using the facilities provided by the Sub::Installer module. Once this module has been loaded, every package namespace automatically has its own install_sub( ) method. Subroutines can then be installed in a particular namespace simply by calling that namespace's install_sub( ) method and passing it the name by which the subroutine is to be known, followed by a reference to the subroutine itself.
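For readers curious about what such an installation involves, the following sketch does it with core Perl only, by assigning the subroutine reference to a typeglob in the target package. It is an illustration under that assumption, not Sub::Installer's actual implementation, and install_in_package( ) is a hypothetical helper:

```perl
use strict;
use warnings;

# Hypothetical helper: make $sub_ref callable as $sub_name in $package
sub install_in_package {
    my ($package, $sub_name, $sub_ref) = @_;

    no strict 'refs';                          # symbolic reference, this block only
    *{"${package}::${sub_name}"} = $sub_ref;   # fills the glob's CODE slot

    return;
}

sub make_sorter {
    my ($sub_name, $key_sub_ref) = @_;
    my $package = caller;                      # namespace of whoever called us

    my $sort_sub_ref = sub {
        return map  { $_->[0] }
               sort { $a->[1] cmp $b->[1] }
               map  { [$_, $key_sub_ref->()] }
                    @_;
    };

    install_in_package($package, $sub_name, $sort_sub_ref);

    return;
}

make_sorter(sort_len => sub { length });

my @sorted = sort_len(qw( ccc a bb ));
print "@sorted\n";    # prints "a bb ccc"
```

Confining the `no strict 'refs'` to one small helper keeps the rest of the code under full strictures, which is much of the appeal of delegating this step to a module.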

In other words, each time this second version of make_sorter( ) is called, it takes the key-extractor subroutine it was passed and wraps that key-extractor up in a new anonymous sorting subroutine, which it then installs back in the namespace where make_sorter( ) was originally called. But every piece of code involved in that process is checked at compile time. No run-time evaluations are required.