Writing to files with Perl
Lots of Perl programs deal with text files such as configuration files or log files, so in order to make our knowledge useful it is important at an early stage to learn about file handling.
Let's first see how we can write to a file, because that seems to be easier.
This article shows how to write to a file using core Perl. There are much simpler and more readable ways to do that using Path::Tiny.
Before you can write to a file you need to open it, asking the operating system (Windows, Linux, OSX, etc.) to open a channel for your program to "talk to" the file. For this Perl provides the open function with a slightly strange syntax.
examples/open_file_for_writing.pl
use strict;
use warnings;

my $filename = 'report.txt';
open(my $fh, '>', $filename) or die "Could not open file '$filename' $!";
print $fh "My first report generated by perl\n";
close $fh;
print "done\n";
This is a good working example and we'll get back to it, but let's start with a simpler example:
Simple example
examples/open_file_for_writing_simple.pl
use strict;
use warnings;

open(my $fh, '>', 'report.txt');
print $fh "My first report generated by perl\n";
close $fh;
print "done\n";
This still needs some explanation. The open function gets 3 parameters.
The first one, $fh, is a scalar variable we just defined inside the open() call. We could have defined it earlier, but usually it is cleaner to do it inside, even if it looks a bit awkward at first. The second parameter defines the way we are opening the file. In this case, this is the greater-than sign (>), which means we are opening the file for writing. The third parameter is the path to the file that we would like to open.
When this function is called it puts a special value into the $fh variable. It is called a file-handle. We don't care much about the content of this variable; we will just use the variable later. Just remember, the content of the file is still only on the disk and NOT in the $fh variable.
Once the file is open we can use the $fh file-handle in a print() statement. It looks almost the same as the print() in other parts of the tutorial, but now the first parameter is the file-handle and there is no(!) comma after it.
The print() call above will print the text into the file.
Then with the next line we close the file-handle. Strictly speaking this is not required in Perl. Perl will automatically and properly close all the file-handles when the variable goes out of scope, at the latest when the script ends. In any case, explicitly closing the files is considered good practice.
The last line of the script, print "done\n", is only there so the next example will be clearer:
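The automatic closing described above can be observed with a small sketch. The file name scope_demo.txt and the extra pair of braces are assumptions made for this illustration only; they are not part of the article's examples. The -s operator returns the size of a file in bytes.

```perl
use strict;
use warnings;

my $filename = 'scope_demo.txt';   # hypothetical file name

{
    open(my $fh, '>', $filename) or die "Could not open file '$filename' $!";
    print $fh "written inside the block\n";
    # No explicit close here.
}
# $fh went out of scope at the closing brace, so Perl flushed the
# buffered output and closed the file for us.

my $size = -s $filename;   # size of the file on disk, in bytes
print "$filename is $size bytes long\n";
```

If the file-handle were still open and the output still sitting in a buffer, the reported size could be 0. Because the handle was closed at the end of the block, the full line is already on disk.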
Error handling
Let's take the above example again and replace the filename with a path that does not exist. For example write:
open(my $fh, '>', 'some_strange_name/report.txt');
If you run the script now you will get an error message:
print() on closed filehandle $fh at ...
done
Actually this is only a warning; the script keeps running and that's why we see the word "done" printed on the screen.
Furthermore, we only got the warning because we explicitly asked for warnings with the use warnings statement. Try commenting out the use warnings line and you will see that the script is now silent when it fails to create the file. So you won't even notice it until the customer, or - even worse - your boss, complains.
Nevertheless it is a problem. We tried to open a file. We failed but then still tried to print() something to it.
We'd better check if the open() was successful before proceeding.
Luckily the open() call itself returns TRUE on success and FALSE on failure, so we could write this:
Open or die
open(my $fh, '>', 'some_strange_name/report.txt') or die;
This is the "standard" open or die idiom. Very common in Perl.
die is a function call that will throw an exception and thus exit our script.
"open or die" is a logical expression. As you know from the previous part of the tutorial, the "or" short-circuits in Perl (as in many other languages). This means that if the left part is TRUE, we already know the whole expression will be TRUE, and the right side is not executed. On the other hand, if the left hand side is FALSE then the right hand side is also executed and the result of that is the result of the whole expression.
In this case we use this short-circuit feature to write the expression.
If the open() is successful then it returns TRUE and thus the right part never gets executed. The script goes on to the next line.
If the open() fails, then it returns FALSE. Then the right side of the or is also executed. It throws an exception, which exits the script.
In the above code we don't check the actual resulting value of the logical expression. We don't care. We only used it for the "side effect".
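The short-circuit behaviour can be tried without touching any file. In this minimal sketch the subs succeed, fail and handler are hypothetical stand-ins (for a successful open(), a failed open() and die respectively); each one records that it was called.

```perl
use strict;
use warnings;

my @calls;   # records which of the hypothetical subs actually ran

sub succeed { push @calls, 'succeed'; return 1; }   # stands in for a successful open()
sub fail    { push @calls, 'fail';    return 0; }   # stands in for a failed open()
sub handler { push @calls, 'handler'; return 1; }   # stands in for die

succeed() or handler();   # left side is TRUE: handler() is skipped
fail()    or handler();   # left side is FALSE: handler() runs

print join(', ', @calls), "\n";   # prints: succeed, fail, handler
```

handler appears only once in the output, after fail, which is exactly how die is reached only when open() fails.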
If you try the script with the above change you will get an error message:
Died at ...
and it will NOT print "done".
Better error reporting
Instead of just calling die without a parameter, we could add some explanation of what happened.
open(my $fh, '>', 'some_strange_name/report.txt') or die "Could not open file 'some_strange_name/report.txt'";
will print
Could not open file 'some_strange_name/report.txt' ...
It is better, but at some point someone will try to change the path to the correct directory ...
open(my $fh, '>', 'correct_directory_with_typo/report.txt') or die "Could not open file 'some_strange_name/report.txt'";
...but you will still get the old error message, because they changed the path only in the open() call and not in the error message:
Could not open file 'some_strange_name/report.txt' ...
It is probably better to use a variable for the filename:
my $filename = 'correct_directory_with_typo/report.txt';
open(my $fh, '>', $filename) or die "Could not open file '$filename'";
Now we get the correct error message, but we still don't know why it failed. Going one step further we can use $! - a built-in variable of Perl - to print out what the operating system told us about the failure:
my $filename = 'correct_directory_with_typo/report.txt';
open(my $fh, '>', $filename) or die "Could not open file '$filename' $!";
This will print
Could not open file 'correct_directory_with_typo/report.txt' No such file or directory ...
That's much better.
With this we got back to the original example.
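Ending the script with die is not always what you want. As a sketch of an alternative, the same return value of open() can be used to hand the error message back to the caller. The sub write_report and the directory name no_such_directory_for_demo are hypothetical, and the sketch assumes that directory does not exist.

```perl
use strict;
use warnings;

# Hypothetical helper: returns an empty string on success,
# or an error message that includes $! on failure.
sub write_report {
    my ($filename) = @_;

    open(my $fh, '>', $filename)
        or return "Could not open file '$filename' $!";
    print $fh "My first report generated by perl\n";
    close $fh;
    return '';
}

# Assumption: this directory does not exist, so open() fails.
my $error = write_report('no_such_directory_for_demo/report.txt');
print $error ? "$error\n" : "report written\n";
```

The exact text that $! adds depends on the operating system and its language settings, which is why it is worth including in the message instead of guessing the reason.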
Greater-than?
The greater-than sign in the open call might look a bit odd, but if you know command line redirection it will be familiar to you. Otherwise just think of it as an arrow showing the direction of the data flow: into the file on the right hand side.
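As with redirection on the command line, opening an existing file with '>' discards its previous content. The following sketch shows this under a few assumptions: the file name overwrite_demo.txt is hypothetical, and the file is read back using the '<' mode, which opens a file for reading and is not covered in this article.

```perl
use strict;
use warnings;

my $filename = 'overwrite_demo.txt';   # hypothetical file name

open(my $first, '>', $filename) or die "Could not open file '$filename' $!";
print $first "first version\n";
close $first;

# Opening the same file with '>' again throws away the old content.
open(my $second, '>', $filename) or die "Could not open file '$filename' $!";
print $second "second version\n";
close $second;

# Read the file back to see what is left in it.
open(my $in, '<', $filename) or die "Could not open file '$filename' $!";
my @lines = <$in>;
close $in;

print @lines;   # prints: second version
```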
Non-Latin characters?
In case you need to handle characters that are not in the ASCII table, you'll probably want to save them as UTF-8. To do that you need to tell Perl that you are opening the file with UTF-8 encoding.
open(my $fh, '>:encoding(UTF-8)', $filename) or die "Could not open file '$filename'";
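To see what the encoding layer does, here is a minimal sketch that writes a word containing one non-ASCII letter and compares the number of characters with the number of bytes that end up on disk. The file name utf8_demo.txt and the sample word are assumptions made for the illustration; the letter is written as the escape \x{fc} so the sketch does not depend on the encoding of the source file.

```perl
use strict;
use warnings;

my $filename = 'utf8_demo.txt';   # hypothetical file name
my $city     = "Z\x{fc}rich";     # \x{fc} is the letter u with an umlaut

open(my $fh, '>:encoding(UTF-8)', $filename)
    or die "Could not open file '$filename' $!";
print $fh "$city\n";
close $fh;

my $chars = length $city;   # counted in characters
my $bytes = -s $filename;   # counted in bytes on disk, newline included
print "$chars characters in the string, $bytes bytes in the file\n";
```

The string has 6 characters, but the file is larger than 6 bytes plus the newline, because UTF-8 stores the non-ASCII letter in two bytes.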
Published on 2012-12-20