
12.21. Alternations

Always use character classes instead of single-character alternations.

Individually testing for single-character alternatives:

    if ($cmd !~ m{\A (?: a | d | i | q | r | w | x ) \z}xms) {
        carp "Unknown command: $cmd";
        next COMMAND;
    }

may make your regex slightly more readable. But that gain isn't sufficient to compensate for the heavy performance penalty this approach imposes. Furthermore, the cost of testing separate alternatives this way increases linearly with the number of alternatives to be tested.

The equivalent character class:


    if ($cmd !~ m{\A [adiqrwx] \z}xms) {
        carp "Unknown command: $cmd";
        next COMMAND;
    }

does exactly the same job, but about 10 times faster. And its cost stays the same no matter how many characters are later added to the set.
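One way to check the speed claim on your own system is a small benchmark along the following lines. This is only a sketch: the sample inputs in `@cmds` and the variable names `$ALTERNATION` and `$CHAR_CLASS` are assumptions made for illustration, and the ratio you measure may differ considerably from the figure quoted above, depending on your version of perl and your data.

```perl
use strict;
use warnings;
use Benchmark qw( cmpthese );

# Hypothetical sample input: valid commands plus a few invalid ones
my @cmds = qw( a d i q r w x z b foo );

# The two patterns under comparison, precompiled once
my $ALTERNATION = qr{\A (?: a | d | i | q | r | w | x ) \z}xms;
my $CHAR_CLASS  = qr{\A [adiqrwx] \z}xms;

# Sanity check: both patterns must accept and reject the same inputs
for my $cmd (@cmds) {
    my $alt_ok   = $cmd =~ $ALTERNATION ? 1 : 0;
    my $class_ok = $cmd =~ $CHAR_CLASS  ? 1 : 0;
    die "Patterns disagree on '$cmd'" if $alt_ok != $class_ok;
}

# Run each version for at least one CPU second and print a comparison table
cmpthese( -1, {
    alternation => sub { my $matched = grep { $_ =~ $ALTERNATION } @cmds; return $matched; },
    char_class  => sub { my $matched = grep { $_ =~ $CHAR_CLASS  } @cmds; return $matched; },
});
```

The sanity check runs before the timing so that a typo in either pattern shows up as a failure rather than as a misleading speed difference.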

Sometimes a set of alternatives will contain both single- and multicharacter alternatives:

    if ($quotelike !~ m{\A (?: qq | qr | qx | q | s | y | tr ) \z}xms) {
        carp "Unknown quotelike: $quotelike";
        next QUOTELIKE;
    }

In that case, you can still improve the regex by aggregating the single characters:


    if ($quotelike !~ m{\A (?: qq | qr | qx | [qsy] | tr ) \z}xms) {
        carp "Unknown quotelike: $quotelike";
        next QUOTELIKE;
    }

Sometimes you can then factor out the commonalities of the remaining multicharacter alternatives into an additional character class:


    if ($quotelike !~ m{\A (?: q[qrx] | [qsy] | tr ) \z}xms) {
        carp "Unknown quotelike: $quotelike";
        next QUOTELIKE;
    }
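When refactoring a regex in stages like this, it can be worth confirming that every version still accepts exactly the same strings. The following is a minimal sketch of such a check. The hash `%QUOTELIKE_PATTERN`, the helper `accepted_by()`, and the list of candidate strings are all hypothetical names and data introduced for this example.

```perl
use strict;
use warnings;

# The three versions of the pattern shown above
my %QUOTELIKE_PATTERN = (
    original   => qr{\A (?: qq | qr | qx | q | s | y | tr ) \z}xms,
    aggregated => qr{\A (?: qq | qr | qx | [qsy] | tr ) \z}xms,
    factored   => qr{\A (?: q[qrx] | [qsy] | tr ) \z}xms,
);

# Hypothetical test inputs: the seven valid quotelikes plus some near misses
my @candidates = qw( qq qr qx q s y tr qw m t r qqq ty );

# Return the candidates accepted by the named pattern, as a single string
sub accepted_by {
    my ($version) = @_;
    my $pattern = $QUOTELIKE_PATTERN{$version};
    return join q{ }, grep { $_ =~ $pattern } @candidates;
}

# Every refactored version must accept exactly what the original accepts
my $expected = accepted_by('original');
for my $version (qw( aggregated factored )) {
    my $got = accepted_by($version);
    die "Version '$version' accepts [$got], expected [$expected]"
        if $got ne $expected;
}

print "All versions accept: $expected\n";
```

A check like this only covers the candidates you list, so include strings that are close to the valid ones, not just the valid ones themselves.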
