8.9. Substrings

Use 4-arg substr instead of lvalue substr.

The substr builtin is unusual in that it can be used as an lvalue (i.e., a target of assignment). So you can write things like:

    substr($addr, $country_pos, $COUNTRY_LEN)
        = $country_name{$country_code};

This statement first locates the substring of the string in $addr which starts at $country_pos and runs for $COUNTRY_LEN characters. Then that substring is replaced with the string in $country_name{$country_code}. Effectively, it's an assignment into part of the string value in a variable.
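
As a minimal, runnable sketch of that behaviour: the address string, the offset calculation, and the contents of %country_name below are made-up sample values for illustration, not data from the original example.

```perl
use strict;
use warnings;

# Hypothetical sample data, chosen only for illustration
my $addr         = '10 Main St, Springfield, AU';
my $COUNTRY_LEN  = 2;
my $country_pos  = length($addr) - $COUNTRY_LEN;   # the code sits at the end
my %country_name = (AU => 'Australia', US => 'United States');

# Read the two-letter code with an ordinary three-argument substr
my $country_code = substr($addr, $country_pos, $COUNTRY_LEN);

# Assigning to substr replaces that part of $addr in place
substr($addr, $country_pos, $COUNTRY_LEN)
    = $country_name{$country_code};

print "$addr\n";    # prints "10 Main St, Springfield, Australia"
```

Note that the replacement need not be the same length as the substring it replaces; the string in $addr grows or shrinks to fit.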

But to readers who are unused to this particular feature, an assignment to a function call can be confusing, or even scary, and therefore less comprehensible. So substr assignments become an issue of maintainability.

Of course, it's not hard to look up the perlfunc manual and learn about the special semantics of substr assignments, so their impact on maintainability is marginal. Then again, almost every maintainability issue is, by itself, marginal. It's only collectively that subtleties, clevernesses, and esoterica begin to sabotage comprehensibility. And it's only collectively that obviousness, straightforwardness, and conformity to standards can help to enhance it. Every small choice when coding contributes in one direction or the other.

However you choose to assess their cognitive load, there is another problem with assignments to substrings: they're relatively slow. The call to substr has to locate the required substring, create an interim representation of it, return that interim representation, perform the assignment to it, re-identify the required substring, and then replace it.

To avoid those extra steps, in Perl 5.6.1 and later substr also comes in a four-argument form. That is, if you provide a fourth argument to the function, that argument is used as the string with which to replace the substring identified by the first three arguments. So the previous example could be rewritten more efficiently as:

    substr $addr, $country_pos, $COUNTRY_LEN, $country_name{$country_code};

Because that assignment now takes place within the original call, there's no need to create and return an interim representation, and no effort wasted re-identifying the substring during the assignment. That means a four-argument substr call is always faster than the equivalent assignment to a three-argument substr call.
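
One way to check both the equivalence and the speed claim on your own perl is sketched below. The sample string, the offset, and the iteration count are assumptions made for illustration; the timing uses cmpthese from the core Benchmark module, and the size of any difference is likely to vary with the Perl version and build, so treat the numbers it prints as a rough indication only.

```perl
use strict;
use warnings;
use Benchmark qw(cmpthese);

# Hypothetical sample values, for illustration only
my $template    = '10 Main St, Springfield, AU';
my $country_pos = 25;    # offset of 'AU' in $template
my $COUNTRY_LEN = 2;
my $replacement = 'Australia';

# Both forms should leave the string in the same state
my $via_lvalue = $template;
substr($via_lvalue, $country_pos, $COUNTRY_LEN) = $replacement;

my $via_four_arg = $template;
my $old_text = substr $via_four_arg, $country_pos, $COUNTRY_LEN, $replacement;

print "lvalue:   $via_lvalue\n";
print "four-arg: $via_four_arg\n";
print "replaced: $old_text\n";    # four-arg substr returns the text it replaced

# Rough timing comparison. Each sub copies the template first, so the
# string does not grow from one iteration to the next.
cmpthese(1_000_000, {
    lvalue   => sub { my $s = $template; substr($s, $country_pos, $COUNTRY_LEN) = $replacement; return },
    four_arg => sub { my $s = $template; substr $s, $country_pos, $COUNTRY_LEN, $replacement; return },
});
```

The four-argument call also returns the substring that was replaced, which the assignment form cannot do in a single step.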
