While in most cases we process a text file line by line, sometimes the work is easier when the entire content of the file is in memory in a single scalar variable.

One such case is when we need to replace Java is Hot with Jabba the Hutt in a text file, and the phrase might be split across two lines.

(This is probably only funny to programmers who are Star Wars fans and who speak English with a bad Hungarian accent, as I do. Or maybe not even to them.)

In any case you can escape now and read more about Jabba the Hutt.

This is what we have in the data.txt file:

Java is Hot

Java is
Hot

use strict;
use warnings;
use 5.010;

my $file = 'data.txt';
open my $fh, '<', $file or die;
$/ = undef;
my $data = <$fh>;
close $fh;
print $data;

$data =~ s/Java\s+is\s+Hot/Jabba The Hutt/g;
say '-' x 30;

print $data;

Running the above Perl program we get the following output:

Java is Hot

Java is
Hot
------------------------------
Jabba The Hutt

Jabba The Hutt

Explanation

The $/ variable is the Input Record Separator in Perl. When we put the read-line operator in scalar context, for example by assigning to a scalar variable with $x = <$fh>, perl reads from the file up to and including the Input Record Separator, which is, by default, the newline \n.

What we did here is assign undef to $/. With no record separator defined, the read-line operator has nothing to stop at, so it reads until the end of the file. This is called slurp mode, because of the sound the file makes when we read it.
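To see the difference between the two modes without touching a real file, here is a small sketch that reads the same content twice from an in-memory filehandle. The variable names ($content, $first_line, $whole) are made up for this illustration, and the in-memory handle is just a convenience so the snippet runs on its own.

```perl
use strict;
use warnings;

# Sample content kept in a string so the sketch needs no file on disk.
my $content = "Java is Hot\n\nJava is\nHot\n";

# Default $/ (newline): one read in scalar context returns one line.
open my $in, '<', \$content or die "Could not open in-memory handle: $!";
my $first_line = <$in>;
close $in;

# $/ set to undef: one read in scalar context returns everything.
# This is a global change, just like in the program above.
$/ = undef;
open $in, '<', \$content or die "Could not open in-memory handle: $!";
my $whole = <$in>;
close $in;

printf "line-by-line read: %d characters\n", length $first_line;
printf "slurp-mode read:   %d characters\n", length $whole;
```

The first read stops right after the first newline, while the second one returns the whole content in one go.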

The big problem with the above code is that $/ is a global variable. Thus, if we change $/ in one place in our code, it changes the behavior of perl in other places in our code, and even in third-party modules.

So it is better to localize it:

Localize the change

use strict;
use warnings;
use 5.010;

my $file = 'data.txt';
my $data;
{
    open my $fh, '<', $file or die;
    local $/ = undef;
    $data = <$fh>;
    close $fh;
}
print $data;

$data =~ s/Java\s+is\s+Hot/Jabba The Hutt/g;
say '-' x 30;

print $data;

We have 3 changes in this code:

  • We put the local keyword in front of the assignment to $/. This makes sure that when the enclosing block ends, $/ returns to whatever value it had before.
  • For this we needed an enclosing block, so we added a pair of curly braces around the code-snippet dealing with the file.
  • The third change is that we had to declare the $data variable outside of the block, or it would go out of scope when the block ends.
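If you would like to convince yourself that local really restores $/, the following sketch checks the variable before, inside, and after the block. It uses an in-memory filehandle and a few illustrative variable names ($before, $inside, $after, $slurped) that are not part of the original program.

```perl
use strict;
use warnings;

my $content = "Java is\nHot\n";
open my $fh, '<', \$content or die "Could not open in-memory handle: $!";

# Record whether $/ is defined at each stage.
my $before = defined $/ ? 'defined' : 'undef';
my ($inside, $slurped);
{
    local $/ = undef;    # in effect only until the closing brace
    $inside  = defined $/ ? 'defined' : 'undef';
    $slurped = <$fh>;    # reads the whole content
}
my $after = defined $/ ? 'defined' : 'undef';
close $fh;

print "before the block: $before\n";
print "inside the block: $inside\n";
print "after the block:  $after\n";
```

Code that runs after the block, including code in other modules, sees the original value of $/ again.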

Creating a slurp function

In the third iteration of the code, we create a separate function called slurp that will get the name of the file and return the content as a single string. This allows us to hide the code-snippet at the end of the program or even in a separate file. It also makes it reusable, so instead of copying it to other places where we might need the same functionality we can just call the slurp function.

This makes the main body of our code much nicer.

use strict;
use warnings;
use 5.010;

my $file = 'data.txt';
my $data = slurp($file);

print $data;

$data =~ s/Java\s+is\s+Hot/Jabba The Hutt/g;
say '-' x 30;

print $data;

sub slurp {
    my $file = shift;
    open my $fh, '<', $file or die;
    local $/ = undef;
    my $cont = <$fh>;
    close $fh;
    return $cont;
}

Of course we could further improve our slurp function by setting the encoding to UTF-8 and by providing a better error message in case one of the system calls fails.
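A possible version with those two improvements might look like the sketch below. The name slurp_utf8 is just something picked for this example, and the wording of the error messages is an assumption; adjust both to your taste.

```perl
use strict;
use warnings;

# Hypothetical helper: read a whole UTF-8 encoded file into one string.
sub slurp_utf8 {
    my ($file) = @_;

    # Decode the content as UTF-8 while reading, and say what went wrong
    # ($! holds the system error) if the file cannot be opened.
    open my $fh, '<:encoding(UTF-8)', $file
        or die "Could not open '$file' for reading: $!";

    local $/ = undef;    # slurp mode, limited to this function call
    my $cont = <$fh>;

    close $fh
        or die "Could not close '$file': $!";

    return $cont;
}
```

It is called the same way as before: my $data = slurp_utf8('data.txt');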

File::Slurp

In the article replacing a string in a file we had a similar example, except that there we used the read_file function of the File::Slurp module.
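For reference, a minimal sketch of the File::Slurp variant could look like this. It assumes the File::Slurp module is installed from CPAN, and it creates a temporary file standing in for data.txt so that the snippet runs on its own.

```perl
use strict;
use warnings;
use File::Slurp qw(read_file);
use File::Temp qw(tempfile);

# Create a throwaway file with the same content as data.txt.
my ($tmp_fh, $file) = tempfile(UNLINK => 1);
print {$tmp_fh} "Java is Hot\n\nJava is\nHot\n";
close $tmp_fh or die "Could not close '$file': $!";

# In scalar context read_file returns the whole file as a single string.
my $data = read_file($file);

$data =~ s/Java\s+is\s+Hot/Jabba The Hutt/g;
print $data;
```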

Path::Tiny

An even better solution is to use the Path::Tiny module. It exports the path function that gets a path to a file as a parameter and returns an object. We can then call the slurp method on that object:

use Path::Tiny qw( path );

my $file = 'data.txt';
my $data = path($file)->slurp;