In this part of the Perl tutorial we are going to see how to read from a file in Perl.

At this time, we are focusing on text files.

In this article we see how to do this with core perl, but there are more modern and nicer ways to do this using Path::Tiny to read files.

There are two common ways to open a file depending on how would you like to handle error cases.

Exception

Case 1: Throw an exception if you cannot open the file:

use strict;
use warnings;

my $filename = 'data.txt';
open(my $fh, '<:encoding(UTF-8)', $filename)
  or die "Could not open file '$filename' $!";

while (my $row = <$fh>) {
  chomp $row;
  print "$row\n";
}

Warn or keep silent

Case 2: Give a warning if you cannot open the file, but keep running:

use strict;
use warnings;

my $filename = 'data.txt';
if (open(my $fh, '<:encoding(UTF-8)', $filename)) {
  while (my $row = <$fh>) {
    chomp $row;
    print "$row\n";
  }
} else {
  warn "Could not open file '$filename' $!";
}

Explanation

Let's see them explained:

First, using a text editor, create a file called 'data.txt' and add a few lines to it:

First row
Second row
Third row

Opening the file for reading is quite similar to how we opened it for writing, but instead of the "greater-than" (>) sign, we are using the "less-than" (<) sign.

This time we also set the encoding to be UTF-8. In most of the code out there you will see only the "less-than" sign.

use strict;
use warnings;

my $filename = 'data.txt';
open(my $fh, '<:encoding(UTF-8)', $filename)
  or die "Could not open file '$filename' $!";

my $row = <$fh>;
print "$row\n";
print "done\n";

Once we have the filehandle we can read from it using the same readline operator that was used for reading from the keyboard (STDIN). This will read the first line of the file. Then we print out the content of $row and print "done" just to make it clear we reached the end of our example.

If you run the above script you will see it prints

First row

done

Why is there an empty row before the "done" you might ask.

That's because the readline operator read all the line, including the trailing newline. When we used print() to print it out, we added a second newline.

As with the case of reading from STDIN, here too, we usually don't need that trailing newline so we will use chomp() to remove it.

Reading more than one line

Once we know how to read one line we can go ahead and put the readline call in the condition of a while loop.

use strict;
use warnings;

my $filename = $0;
open(my $fh, '<:encoding(UTF-8)', $filename)
  or die "Could not open file '$filename' $!";

while (my $row = <$fh>) {
  chomp $row;
  print "$row\n";
}
print "done\n";

Every time we reach the condition of the while loop, first it will execute my $row = <$fh>, the part that will read the next line from the file. If that line has anything in it, that will evaluate to true. Even empty lines have a newline at the end, so when we read them, the $row variable will contain a \n that will evaluate to true in boolean context.

After we read the last line, in the next iteration the readline operator (<$fh>) will return undef which is false. The while-loop will terminate.

An edge-case

There is an edge-case though when the very last line has a single 0 in it, without a trailing newline. The above code would evaluate that line to false and the loop would not be executed. Fortunately, Perl is actually cheating here. In this very specific case (reading a line from a file within a while-loop), perl will actually act as if you wrote while (defined my $row = <$fh>) { and so even such lines will execute properly.

open without die

The above way of handling files is used in Perl scripts when you absolutely have to have the file opened or there is no point in running your code. For example when the whole job of your script is to parse that file.

What if this is an optional configuration file? If you can read it you change some settings, if you cannot read you just use the defaults.

In that case the second solution might be a better way to write your code.

if (open(my $fh, '<:encoding(UTF-8)', $filename)) {
  while (my $row = <$fh>) {
    chomp $row;
    print "$row\n";
  }
} else {
  warn "Could not open file '$filename' $!";
}

In this case we check the return value of open. If it is true we go ahead and read the content of the file.

If it failed we give a warning using the built-in warn function but don't throw an exception. We don't even need to include the else part:

if (open(my $fh, '<:encoding(UTF-8)', $filename)) {
  while (my $row = <$fh>) {
    chomp $row;
    print "$row\n";
  }
}