Open file to read and write in Perl, oh and lock it too
If you don't have time to read this, just use
open my $fh, '+<', $filename or die;
If you have time, read on.
In most cases when you need to update a file, the best strategy is to read the entire file into memory, make the changes, and then write the whole file back. Unless, of course, the file is too big to fit in memory, which is a separate story.
In most of these cases it is OK to:
- open the file for reading
- read the whole content
- make changes in memory
- open the file again, this time for writing
- write out the whole content
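The steps above can be sketched roughly as follows. The `update_file` helper, the sample content, and the temporary file are all hypothetical, made up for this illustration. Note that nothing here protects against another process touching the file between the two `open` calls.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Hypothetical helper: read the whole file, let $change (a code reference)
# transform the content in memory, then write everything back.
sub update_file {
    my ($file, $change) = @_;

    open my $in, '<', $file or die "Cannot open $file for reading: $!";
    my $content = do { local $/; <$in> };    # slurp the whole file
    close $in;

    $content = $change->($content // '');

    # Opening with '>' empties the file before we write the new content.
    open my $out, '>', $file or die "Cannot open $file for writing: $!";
    print {$out} $content;
    close $out or die "Cannot close $file: $!";
    return;
}

# Sample data in a temporary file (an assumption for the demo).
my ($tmp_fh, $file) = tempfile();
print {$tmp_fh} "color = red\nsize = 10\n";
close $tmp_fh;

update_file($file, sub { my ($text) = @_; $text =~ s/red/blue/; return $text });

open my $check, '<', $file or die "Cannot open $file: $!";
my $result = do { local $/; <$check> };
close $check;
unlink $file;

print $result;
```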
However, sometimes you need to make this operation "atomic", that is, you need to make sure no other process will change the file while you are changing it.
OK, so just to clarify, you probably never want other processes to modify the file while you do it, but in most cases you don't have to worry about that as no other processes are dealing with the file.
What happens when there are competing processes? Even if that is the same script?
This script is a counter that for every invocation increases the number in the counter.txt file by one and prints it to the screen:
examples/counter_plain.pl
use strict;
use warnings;

my $file = 'counter.txt';
my $count = 0;
if (open my $fh, '+<', $file) {
    $count = <$fh>;
    close $fh;
}
$count++;
print "$count\n";
if (open my $fh, '>', $file) {
    print $fh $count;
    close $fh;
}
Create a file called counter.txt with a single 0 in it and then run:
perl counter_plain.pl
several times. You'll see the number incremented as expected.
What if several people invoke the script at the same time?
To demonstrate that we will run the script 1000 times in two separate windows.
If you are using Linux or Mac you can use the following Bash snippet:
for x in {1..1000}; do perl counter_plain.pl; done
I have not tried this on Windows, and because it has a different file-locking methodology the results might be totally different.
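If you execute the above command in two terminals at more or less the same time, you'll see the numbers progressing, but they'll not reach 2,000. They might even get reset to 1 from time to time as the file operations of two instances of the script collide: opening the file with '>' empties it immediately, so another instance that reads at that moment finds nothing and starts counting again from 1.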
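To make the collision easier to see, here is a small sketch that replays the two problematic interleavings inside a single script, so the result is the same on every run. The `read_count` and `write_count` helpers and the temporary file are hypothetical. They just split the unlocked counter into its two halves.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Hypothetical helper: the "read" half of the unlocked counter.
sub read_count {
    my ($file) = @_;
    open my $fh, '<', $file or die "Cannot open $file: $!";
    my $count = <$fh>;
    close $fh;
    return $count // 0;    # an empty file reads as undef, treat it as 0
}

# Hypothetical helper: the "write" half of the unlocked counter.
sub write_count {
    my ($file, $count) = @_;
    open my $fh, '>', $file or die "Cannot open $file: $!";
    print {$fh} $count;
    close $fh or die "Cannot close $file: $!";
    return;
}

my ($tmp_fh, $file) = tempfile(UNLINK => 1);
close $tmp_fh;
write_count($file, 5);

# Lost update: both "processes" read before either one writes.
my $first  = read_count($file);
my $second = read_count($file);
write_count($file, $first + 1);
write_count($file, $second + 1);
my $after_lost_update = read_count($file);
print "two increments starting from 5 gave $after_lost_update\n";

# Reset: '>' empties the file as soon as it is opened, so a reader that
# arrives before the writer has printed anything sees an empty file.
open my $writer, '>', $file or die "Cannot open $file: $!";
my $seen_by_reader = read_count($file);
print {$writer} 7;
close $writer;
print "the reader saw $seen_by_reader while the writer had the file open\n";
```

The first message reports 6 instead of 7 because one increment was lost. The second reports 0, which is why the real script would print 1 next.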
Locking
On Unix-like operating systems, such as Linux and macOS, we can use the native file locking mechanism via the flock function of Perl.
For this however we need to open the file for both reading and writing.
examples/counter_lock.pl
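use strict;
use warnings;
use Fcntl qw(:flock :seek);

my $file = 'counter.txt';
my $count = 0;
if (open my $fh, '+<', $file) {
    # Wait here until we hold an exclusive lock on the file
    flock($fh, LOCK_EX) or die "Cannot lock $file - $!\n";
    $count = <$fh>;
    $count++;
    print "$count\n";
    # Rewind to the beginning and empty the file before writing
    seek $fh, 0, SEEK_SET;
    truncate $fh, 0;
    print $fh $count;
    # Closing the filehandle also releases the lock
    close $fh;
}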
In this script we
- Open the file for reading and writing
- Ask for an exclusive lock on it. (Wait till we get it).
- Read the file content.
- Make the changes in memory. (increment by 1)
- Rewind the filehandle to the beginning using the seek function.
- Remove the content of the file using truncate.
- Write the new content
- Close the file (and by that free the lock)
We could not open the file separately, once for reading and once for writing, because closing the filehandle always frees the lock. Another instance of our script could then come in between the two open calls of our instance.
We needed to rewind the filehandle (using seek) so we write the new content at the beginning of the file and not at the end.
In this case we did not have to truncate the file as the new content is never going to be shorter than the old content (after all the numbers are only incrementing), but in the general case it is a better practice. It will ensure that we don't have left-over content from the previous version of the file.
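To see what the left-over content looks like, here is a minimal sketch in which the new content is shorter than the old one. The `set_content`, `overwrite`, and `slurp` helpers and the values 100 and 99 are made up for this illustration.

```perl
use strict;
use warnings;
use Fcntl qw(:seek);
use File::Temp qw(tempfile);

# Hypothetical helper: replace the whole file with $content.
sub set_content {
    my ($file, $content) = @_;
    open my $fh, '>', $file or die "Cannot open $file: $!";
    print {$fh} $content;
    close $fh or die "Cannot close $file: $!";
    return;
}

# Hypothetical helper: write $new_content at the beginning of the file
# through a '+<' filehandle, optionally calling truncate first.
sub overwrite {
    my ($file, $new_content, $truncate) = @_;
    open my $fh, '+<', $file or die "Cannot open $file: $!";
    seek($fh, 0, SEEK_SET) or die "Cannot seek in $file: $!";
    if ($truncate) {
        truncate($fh, 0) or die "Cannot truncate $file: $!";
    }
    print {$fh} $new_content;
    close $fh or die "Cannot close $file: $!";
    return;
}

# Hypothetical helper: return the whole content of the file.
sub slurp {
    my ($file) = @_;
    open my $fh, '<', $file or die "Cannot open $file: $!";
    my $content = do { local $/; <$fh> };
    close $fh;
    return $content // '';
}

my ($tmp_fh, $file) = tempfile(UNLINK => 1);
close $tmp_fh;

set_content($file, '100');
overwrite($file, '99', 0);
my $without = slurp($file);
print "without truncate: $without\n";    # the last 0 of "100" survives

set_content($file, '100');
overwrite($file, '99', 1);
my $with = slurp($file);
print "with truncate: $with\n";
```

Without truncate the file ends up containing 990, because only the first two characters were overwritten. With truncate it contains 99.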
If you try to run this script 1,000 times each in two separate windows you'll see it reaches 2,000 as expected.
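If you would rather not juggle two terminals, the same check can be sketched in a single script that forks a few worker processes. This assumes a Unix-like system where fork and flock behave natively. The `increment_locked` helper, the number of workers and rounds, and the temporary file are all hypothetical choices for the demo.

```perl
use strict;
use warnings;
use Fcntl qw(:flock :seek);
use File::Temp qw(tempfile);

# Hypothetical helper: one locked read-increment-write cycle,
# following the same steps as the locking script.
sub increment_locked {
    my ($file) = @_;
    open my $fh, '+<', $file or die "Cannot open $file: $!";
    flock($fh, LOCK_EX) or die "Cannot lock $file: $!";
    my $count = <$fh> // 0;
    $count++;
    seek($fh, 0, SEEK_SET) or die "Cannot seek in $file: $!";
    truncate($fh, 0) or die "Cannot truncate $file: $!";
    print {$fh} $count;
    close $fh or die "Cannot close $file: $!";    # flushes and frees the lock
    return $count;
}

# Demo sizes (assumptions): 2 competing processes, 200 increments each.
my ($workers, $rounds) = (2, 200);

my ($tmp_fh, $file) = tempfile();
print {$tmp_fh} 0;
close $tmp_fh;

my @pids;
for my $worker (1 .. $workers) {
    my $pid = fork();
    die "Cannot fork: $!" unless defined $pid;
    if ($pid == 0) {
        # Child process: compete with the others for the same file.
        increment_locked($file) for 1 .. $rounds;
        exit 0;
    }
    push @pids, $pid;
}
waitpid($_, 0) for @pids;

open my $fh, '<', $file or die "Cannot open $file: $!";
my $final = <$fh>;
close $fh;
unlink $file;

print "final count: $final\n";
```

With the lock in place the final count should equal the number of workers multiplied by the number of rounds, no matter how the processes interleave.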
Published on 2017-12-07