13.1. Exceptions

Throw exceptions instead of returning special values or setting flags.

Returning a special error value on failure, or setting a special error flag, is a very common error-handling technique. Collectively, they're the basis for virtually all error notification from Perl's own built-in functions^[*].

^[*] For example, the builtins eval, exec, flock, open, print, stat, and system all return special values on error. Unfortunately, they don't all use the same special value. Some of them also set a flag on failure. Sadly, it's not always the same flag. See the perlfunc manpage for the gory details.
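To make that inconsistency concrete, here is a minimal sketch contrasting two of those builtins. The path is a hypothetical one that is assumed not to exist, and the child process is just the running perl (via $^X) asked to exit with a non-zero code:

```perl
use strict;
use warnings;

# Hypothetical path, assumed not to exist on this machine
my $missing = '/nonexistent-dir-for-demo/file.txt';

# open signals failure with a FALSE return value (and sets $!)...
my $open_ok = open my $fh, '<', $missing;

# ...whereas system signals success with a false value (0),
# and failure with a non-zero wait status (also left in $?)
my $status    = system $^X, '-e', 'exit 3';
my $exit_code = $status >> 8;

print "open failed\n"                    if !$open_ok;
print "child exited with $exit_code\n"   if $status != 0;
```

The same "if it's false, something went wrong" test that is right for open is exactly wrong for system, which is one more reason these checks are so easy to get wrong or leave out.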

Error notification via flags and return values has a serious flaw: flags and return values can be silently ignored. And ignoring them requires absolutely no effort on the part of the programmer. In fact, in a void context, ignoring return values is Perl's default behaviour. Ignoring an error flag that has suddenly appeared in a special variable is just as easy: you simply don't bother to check the variable.

Moreover, because ignoring a return value is the void-context default, there's no syntactic marker for it. So there's no way to look at a program and immediately see where a return value is deliberately being ignored, which means there's also no way to be sure that it's not being ignored accidentally.
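As a small illustration of how little effort it takes, consider the following sketch (the path is hypothetical and assumed not to exist). The first open is called in a void context, so its failure simply vanishes; the second is the same call with its return value captured and tested:

```perl
use strict;
use warnings;

# Hypothetical path, assumed not to exist on this machine
my $missing = '/nonexistent-dir-for-demo/file.txt';

# Void context: open fails, returns false, and nobody is listening...
open my $ignored_fh, '<', $missing;

# Same call, but this time the return value is captured and checked...
my $opened = open my $checked_fh, '<', $missing;
if (!$opened) {
    print "Could not open $missing: $!\n";
}
```

Nothing in the first call marks it as a deliberate decision to ignore the result. It looks exactly the same as an oversight.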

The bottom line: regardless of the programmer's (lack of) intention, an error indicator is being ignored. That's not good programming.

Ignoring error indicators frequently causes programs to propagate errors in entirely the wrong direction, as happens in Example 13-1.

Example 13-1. Returning special error values

# Find and open a file by name, returning the filehandle
# or undef on failure...
sub locate_and_open {
    my ($filename) = @_;

    # Check acceptable directories in order...
    for my $dir (@DATA_DIRS) {
        my $path = "$dir/$filename";

        # If file exists in an acceptable directory, open and return it...
        if (-r $path) {
            open my $fh, '<', $path;
            return $fh;
        }
    }

    # Fail if all possible locations tried without success...
    return;
}

# Load file contents up to the first <DATA/> marker...
sub load_header_from {
    my ($fh) = @_;

    # Use DATA tag as end-of-"line"...
    local $/ = '<DATA/>';

    # Read to end-of-"line"...
    return <$fh>;
}

# and later...

for my $filename (@source_files) {
    my $fh = locate_and_open($filename);
    my $head = load_header_from($fh);
    print $head;
}

Within the locate_and_open( ) subroutine, the call to open is simply assumed to work, and the filehandle ($fh) is then immediately returned, whatever the actual outcome of the open. Presumably, the expectation is that whoever calls locate_and_open( ) will check whether the return value was a valid filehandle.

Except, of course, "whoever" doesn't. Instead of testing for failure, the main for loop takes the failure value and immediately propagates it "across" the block, to the rest of the statements in the loop. That causes the call to load_header_from( ) to propagate the error value "downwards". And it's in that subroutine that the attempt to treat the failure value as a filehandle eventually kills the program:

    readline() on unopened filehandle at demo.pl line 28.

Code like that, where an error is reported in an entirely different part of the program from where it actually occurred, is particularly onerous to debug.

Of course, you could argue that the fault lies squarely with whoever wrote the loop, for using locate_and_open( ) without checking its return value. And, in the narrowest sense, that's entirely correct. But the deeper fault lies with whoever actually wrote locate_and_open( ) in the first place. Or, at least, whoever assumed that the caller would always check its return value.

Humans simply aren't like that. Rocks almost never fall out of the sky, so humans soon conclude that they never do, and stop looking up for them. The rains almost always come in Spring, so humans assume that they always will, and stop sacrificing unbelievers to Tlaloc to make it happen. Fires rarely break out in their homes, so humans soon forget that they might, and stop testing their smoke detectors every month. In the same way, programmers inevitably abbreviate "almost never fails" to "never fails", and then simply stop checking.

That's why so very few people bother to verify their print statements:

    if (!print 'Enter your name: ') {
        print {*STDLOG} warning => 'Terminal went missing!'
    }

It's human nature to "trust but not verify".

And human nature is why returning an error indicator is not best practice. Errors are (supposed to be) unusual occurrences, so error markers will almost never be returned. Those tedious and ungainly checks for them will almost never do anything useful, so eventually they'll be quietly omitted. After all, leaving the tests off almost always works just fine. It's so much easier not to bother. Especially when not bothering is the default!

The second shortcoming of return values as failure markers is the implicit assumption that the caller of a subroutine will be in a position to do anything about a failure that's reported. That's not always the case, especially when a complex procedure has been carefully factored into several nested levels of subroutine calls. If the immediate caller can't recover from the error, it will have to return a failure value itself, which its caller will then have to test for. And if that caller can't resolve the problem, it too will have to return a failure value, and so on. Every subroutine in the call chain will have to explicitly check for a returned failure value, and then explicitly pass it back up the call tree. That explicit propagation of errors increases the amount of unproductive infrastructure code required around any subroutine call that might fail, which, in turn, reduces the overall readability of the code and offers new opportunities for subtle flow-of-control errors to creep in.
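A rough sketch of what that looks like in practice is shown below. The three subroutines (read_config, load_settings, start_app) and the path are hypothetical, invented purely to illustrate a nested call chain; only the outermost caller can do anything useful about the failure, yet every level has to test for it and hand it back:

```perl
use strict;
use warnings;

# Hypothetical three-level call chain; all names are illustrative only.

# Innermost level: return the file's contents, or undef on failure...
sub read_config {
    my ($path) = @_;
    open my $fh, '<', $path
        or return;
    local $/;                        # Slurp mode
    return scalar <$fh>;
}

# Middle level: can't fix a missing file, so test and pass the failure up...
sub load_settings {
    my ($path) = @_;
    my $text = read_config($path);
    return if !defined $text;
    return { raw => $text };
}

# Outer level: can't fix it either, so test and pass the failure up again...
sub start_app {
    my ($path) = @_;
    my $settings = load_settings($path);
    return if !defined $settings;
    return 1;
}

# Top level: the only place that can actually respond...
my $started = start_app('/nonexistent-dir-for-demo/app.conf');
print "Startup failed\n" if !$started;
```

Two of the three failure tests exist only to relay bad news, and by the time that news reaches the top, the reason for the failure has been lost along the way.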

Don't return special error values when something goes wrong; throw an exception instead. The great advantage of exceptions is that they reverse the usual default behaviours, bringing untrapped errors to immediate and urgent attention^[*]. On the other hand, ignoring an exception requires a deliberate and conspicuous effort: you have to provide an explicit eval block to neutralize it.

^[*] The imminent prospect of program termination concentrates the programmer's mind wonderfully.

Exceptions also avoid the need to explicitly test-and-return failure values. Instead, error indicators are automatically propagated upwards, bypassing any callers who are unable to cope with them.
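The following sketch illustrates that automatic propagation, under the assumption of a hypothetical three-level call chain (read_config, load_settings, and start_app are invented names, as is the path). The innermost subroutine throws with croak from the standard Carp module; the intermediate levels contain no error handling at all:

```perl
use strict;
use warnings;
use Carp qw( croak );

# Hypothetical call chain; all subroutine names and the path are illustrative.

# Innermost level: throw an exception on failure...
sub read_config {
    my ($path) = @_;
    open my $fh, '<', $path
        or croak "Could not open $path: $!";
    local $/;                        # Slurp mode
    return scalar <$fh>;
}

# Intermediate levels: no failure tests required...
sub load_settings {
    my ($path) = @_;
    return { raw => read_config($path) };
}

sub start_app {
    my ($path) = @_;
    my $settings = load_settings($path);
    return 1;
}

# Only the level that can actually respond needs to trap the exception...
my $started = eval { start_app('/nonexistent-dir-for-demo/app.conf') };
my $error   = $@;
print "Startup failed: $error" if !$started;
```

The exception passes straight through load_settings( ) and start_app( ) without either of them having to mention it, and the original error message arrives intact at the eval.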

The locate_and_open( ) subroutine would be much cleaner and more robust if the errors within it threw exceptions:


    
    use Carp;    # Provides croak( ) and carp( )

    # Find and open a file by name, returning the filehandle
    # or throwing an exception on failure...
    sub locate_and_open {
        my ($filename) = @_;

        # Check acceptable directories in order...
        for my $dir (@DATA_DIRS) {
            my $path = "$dir/$filename";

            # If file exists in an acceptable directory, open and return it...
            if (-r $path) {
                open my $fh, '<', $path
                    or croak( "Located $filename at $path, but could not open" );
                return $fh;
            }
        }

        # Fail if all possible locations tried without success...
        croak( "Could not locate $filename" );
    }

    # and later...

    for my $filename (@source_files) {
        my $fh = locate_and_open($filename);
        my $head = load_header_from($fh);
        print $head;
    }

Notice that the main for loop didn't change at all. The developer using locate_and_open( ) still assumes that nothing can go wrong. But now there's some justification for that expectation, because if anything does go wrong, the loop code will be automatically terminated by the exception that's thrown.

If the maintainers of the for loop wanted it to survive a failure, they could easily (and explicitly) ensure that with an eval block:


    for my $filename (@source_files) {
        if (my $fh = eval { locate_and_open($filename) }) {
            my $head = load_header_from($fh);
            print $head;
        }
        else {
            carp "Couldn't access $filename. Skipping it\n";
        }
    }
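If the loop also needs to know why a file was skipped, the text of the exception is available in $@ immediately after the eval. The sketch below is one possible way to record it. It uses a hypothetical stand-in, locate_and_open_stub( ), that always fails, so that the example can run without any real data directories:

```perl
use strict;
use warnings;
use Carp qw( croak );

# Hypothetical stand-in for locate_and_open( ) that always fails...
sub locate_and_open_stub {
    my ($filename) = @_;
    croak "Could not locate $filename";
}

my @skipped;

FILE:
for my $filename (qw( a.txt b.txt )) {
    my $fh = eval { locate_and_open_stub($filename) };

    # On failure, copy $@ straight away, before anything else can reset it...
    if (!defined $fh) {
        my $reason = $@;
        push @skipped, "$filename ($reason)";
        next FILE;
    }

    # (Normal processing of $fh would go here)
}

print "Skipped: $_\n" for @skipped;
```

Copying $@ into a lexical variable as the very first step is a defensive habit: any later eval, even one hidden inside another subroutine, can overwrite it.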

Exceptions are a better choice even if you are the careful type who religiously checks every return value for failure:

    SOURCE_FILE:
    for my $filename (@source_files) {

        my $fh = locate_and_open($filename);
        next SOURCE_FILE if !defined $fh;

        my $head = load_header_from($fh);
        next SOURCE_FILE if !defined $head;

        print $head;
    }

Constantly checking return values for failure clutters your code with validation statements, often greatly decreasing its readability. In contrast, exceptions allow an algorithm to be implemented without having to intersperse any error-handling infrastructure at all. The error-handling code can be completely factored out of the code and either relegated to after the surrounding eval (see "OO Exceptions" later in this chapter) or else dispensed with entirely:


    for my $filename (@source_files) {
        # Just ignore any source files that don't load...
        eval {
            my $fh = locate_and_open($filename);
            my $head = load_header_from($fh);
            print $head;
        };
    }