
10.10. Power Slurping

Slurp a stream with Perl6::Slurp for power and simplicity.

Reading in an entire input stream is common enough, and the do {...} idiom is ugly enough, that the next major version of Perl (Perl 6) will provide a built-in function to handle it directly. Appropriately, that builtin will be called slurp.

Perl 5 doesn't have an equivalent builtin, and there are no plans to add one, but the future functionality is available in Perl 5 today, via the Perl6::Slurp CPAN module. Instead of:

    my $text = do { local $/; <$file_handle> };

you can just write:

    use Perl6::Slurp;

    my $text = slurp $file_handle;

which is cleaner, clearer, more concise, and consequently less error-prone.

The slurp( ) subroutine is also much more powerful. For example, if you have only the file's name, you would have to write:

    my $text = do {
        open my $fh, '<', $filename or croak "$filename: $OS_ERROR";
        local $/;
        <$fh>;
    };

which almost seems more trouble than it's worth. Or you can just give slurp( ) the filename directly:

    my $text = slurp $filename;

and it will open the file and then read in its full contents for you.
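To make the "filename or filehandle" convenience concrete, here is a minimal core-Perl sketch of the same idea. The slurp_any( ) helper and the temporary file are hypothetical, invented purely for illustration; this is not how Perl6::Slurp is implemented, and it assumes that anything passed as a reference is an already-open filehandle.

```perl
use strict;
use warnings;
use Carp qw( croak );
use English qw( -no_match_vars );
use File::Temp qw( tempfile );

# Hypothetical helper (not part of Perl6::Slurp): accepts either
# a filename or a reference to an already-open filehandle
sub slurp_any {
    my ($source) = @_;

    my $fh;
    if (ref $source) {
        $fh = $source;                 # Assume it's an open filehandle
    }
    else {
        open $fh, '<', $source
            or croak "$source: $OS_ERROR";
    }

    local $/;                          # No record separator: read everything
    return scalar <$fh>;
}

# Create a small temporary file to read back (illustrative data only)
my ($tmp_fh, $tmp_name) = tempfile( UNLINK => 1 );
print {$tmp_fh} "line 1\nline 2\n";
close $tmp_fh or croak "$tmp_name: $OS_ERROR";

# Slurp it by name...
my $from_name = slurp_any($tmp_name);

# ...and again via a filehandle
open my $in, '<', $tmp_name or croak "$tmp_name: $OS_ERROR";
my $from_handle = slurp_any($in);
close $in or croak "$tmp_name: $OS_ERROR";
```

Even this cut-down version needs a dozen lines of careful code, which is a fair argument for letting a well-tested module do the work instead.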

In a list context, slurp( ) acts like a regular <> or readline, reading in every line separately and returning them all in a list:

    my @lines = slurp $filename;

The slurp( ) subroutine also has a few useful features that <> and readline lack. For example, you can ask it to automatically chomp each line before it returns:

    my @lines = slurp $filename, {chomp => 1};

or, instead of removing the line-endings, it can convert each one to some other character sequence (say, '[EOL]'):

    my @lines = slurp $filename, {chomp => '[EOL]'};
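For comparison, the following sketch approximates both chomp behaviours using only core Perl. The sample text and variable names are made up for illustration, and the code assumes newline-terminated lines; it shows roughly what you would have to write by hand, not what the module does internally.

```perl
use strict;
use warnings;
use English qw( -no_match_vars );

# Sample input, read through an in-memory filehandle (illustrative data only)
my $sample = "first\nsecond\nthird\n";

# Roughly equivalent to {chomp => 1}: strip the newline from each line
open my $fh, '<', \$sample or die "in-memory file: $OS_ERROR";
my @chomped = <$fh>;
close $fh or die "in-memory file: $OS_ERROR";
chomp @chomped;

# Roughly equivalent to {chomp => '[EOL]'}: replace each newline with a marker
open $fh, '<', \$sample or die "in-memory file: $OS_ERROR";
my @marked = <$fh>;
close $fh or die "in-memory file: $OS_ERROR";
for my $line (@marked) {
    $line =~ s{\n\z}{[EOL]}xms;
}
```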

or you can change the input record separator, just for that particular call to slurp( ), without having to monkey with the $/ variable:


    # Slurp chunks...
    my @paragraphs = slurp $filename, {irs => $EMPTY_STR};

Setting the input record separator to an empty string causes <> or slurp to read "paragraphs" instead of lines, where each "paragraph" is a chunk of text ending in two or more newlines.
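The paragraph-reading behaviour can be seen with the standard <> operator alone. The sketch below uses an invented sample string and an in-memory filehandle, and localizes $/ by hand, which is exactly the chore that the irs option spares you.

```perl
use strict;
use warnings;
use English qw( -no_match_vars );

# Three "paragraphs", separated by varying numbers of empty lines
# (illustrative data only)
my $sample = "para one, line 1\npara one, line 2\n\n\n\npara two\n\npara three\n";

open my $fh, '<', \$sample or die "in-memory file: $OS_ERROR";

my @paragraphs = do {
    local $/ = q{};    # Empty string selects paragraph mode
    <$fh>;             # List context: one element per paragraph
};

close $fh or die "in-memory file: $OS_ERROR";

# Report how many records were read
print scalar(@paragraphs), " paragraphs read\n";
```

Note that the two lines of the first paragraph stay together in a single element, because a lone newline does not end a record in paragraph mode.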

You can even use a regular expression to specify the input record separator, instead of the plain string that Perl's standard $/ variable restricts you to:


    # Read "human" paragraphs (separated by one or more blank or whitespace-only lines)...
    my @paragraphs = slurp $filename, {irs => qr/\n \s* \n/xms};
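Since $/ cannot hold a pattern, the nearest core-Perl approximation is to read the whole text and then split it on the same regex. The sketch below does that on an invented sample string. Be aware of one difference: split( ) throws the separators away, so the result resembles a slurp that also chomps each record, rather than one that keeps each record's terminator.

```perl
use strict;
use warnings;

# Paragraphs separated by a whitespace-only line and by an empty line
# (illustrative data only)
my $sample = "alpha\nbeta\n  \t\ngamma\n\ndelta\n";

# Split wherever a newline is followed by optional whitespace
# and then another newline
my @paragraphs = split m{\n \s* \n}xms, $sample;

# Show each paragraph with visible delimiters
for my $paragraph (@paragraphs) {
    print "[$paragraph]\n";
}
```

The whitespace-only line between "beta" and "gamma" counts as a paragraph break here, which paragraph mode (an empty $/) would not recognize.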
