
12.3. String Boundaries

Use \A and \z as string boundary anchors.

Even if you don't adopt the previous practice of always using /m, using ^ and $ with their default meanings is a bad idea. Sure, you know what ^ and $ actually mean in a Perl regex. But will those who read or maintain your code know? Or is it more likely that they will misinterpret those metacharacters in the ways described earlier?

Perl provides markers that always, and unambiguously, mean "start of string" and "end of string": \A and \z (capital A, but lowercase z). They mean "start/end of string" regardless of whether /m is active. They mean "start/end of string" regardless of what the reader thinks ^ and $ mean.

They also stand out well. They're unusual. They're likely to be unfamiliar to the readers of your code, in which case those readers will have to look them up, rather than blithely misunderstanding them.
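
The claim that \A and \z ignore /m, while ^ and $ do not, can be checked with a small sketch. The sample string and the variable names here are hypothetical, chosen only for illustration:

```perl
use strict;
use warnings;

# Hypothetical sample data: three lines, no trailing newline.
my $text = "alpha\nbeta\ngamma";

# Count the positions at which each anchor matches, without and with /m.
# (The "= () =" idiom counts the matches of a /g pattern in list context.)
my $caret_plain  = () = $text =~ m{^}g;     # start of string only
my $caret_multi  = () = $text =~ m{^}gm;    # start of every line
my $dollar_plain = () = $text =~ m{$}g;     # end of string only (for this data)
my $dollar_multi = () = $text =~ m{$}gm;    # end of every line
my $start_plain  = () = $text =~ m{\A}g;    # start of string
my $start_multi  = () = $text =~ m{\A}gm;   # still start of string
my $end_plain    = () = $text =~ m{\z}g;    # end of string
my $end_multi    = () = $text =~ m{\z}gm;   # still end of string

print "^  matches $caret_plain time(s) without /m, $caret_multi with /m\n";
print "\$  matches $dollar_plain time(s) without /m, $dollar_multi with /m\n";
print "\\A matches $start_plain time(s) without /m, $start_multi with /m\n";
print "\\z matches $end_plain time(s) without /m, $end_multi with /m\n";
```

For this three-line string, ^ and $ each match once without /m and three times with it, whereas \A and \z match exactly once either way.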

So rather than:

    
    # Remove leading and trailing whitespace...
    $text =~ s{^ \s* | \s* $}{}gx;

use:

    # Remove leading and trailing whitespace...
    $text =~ s{\A \s* | \s* \z}{}gxm;

And when you later need to match line boundaries as well, you can just use ^ and $ "naturally":

    # Remove leading and trailing whitespace, and any -- line...
    $text =~ s{\A \s* | ^-- [^\n]* $ | \s* \z}{}gxm;

The alternative (in which ^ and $ each have three distinct meanings in different contexts) is unnecessarily cruel:

    
    # Remove leading and trailing whitespace, and any -- line...
    $text =~ s{^ \s* | (?m: ^-- [^\n]* $) | \s* $}{}gx;
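
To see that the two substitutions do the same job, here is a minimal sketch that runs both on the same input. The sample text and the variable names ($sample, $with_anchors, $with_carets) are assumptions made for the example, not part of the guideline:

```perl
use strict;
use warnings;

# Hypothetical input: leading and trailing whitespace plus one "--" line.
my $sample = "  \n  Title\n-- a comment\nBody\n\n";

# Recommended form: \A and \z for the string, ^ and $ (under /m) for lines.
my $with_anchors = $sample;
$with_anchors =~ s{\A \s* | ^-- [^\n]* $ | \s* \z}{}gxm;

# Alternative form: ^ and $ change meaning inside the (?m: ...) group.
my $with_carets = $sample;
$with_carets =~ s{^ \s* | (?m: ^-- [^\n]* $) | \s* $}{}gx;

# Both leave "Title", an empty line where the -- line was, then "Body".
print "Same result\n" if $with_anchors eq $with_carets;
print "[$with_anchors]\n";
```

Note that the "--" line's content is removed but its newline is not, so an empty line remains between "Title" and "Body". The results agree for this input; the difference between the two forms lies in how easily a reader can tell which boundary each anchor refers to.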
