Bug in the for-loop of Perl? - B::Deparse to the rescue

The other day I got an email asking for my help. The reader had a short Perl snippet that traversed the same nested data structure twice: once using C-style for-loops, and once using Perl-style foreach-loops.

They acted differently.

A beginner would stare at it for a long time and not understand the problem. Not yet trusting Perl, he might jump to the conclusion that there is a bug in Perl, or that the behavior of for has changed.

A seasoned Perl programmer would see the problem immediately. And he would be wrong.

Here is how you can find out what the problem is:

The code

#!/usr/bin/perl

use strict;
use warnings;

my %hash = (
    'chr1' => [
        ['start','end','cat'],
        ['raj','end','cat']
    ],
    'chr2' => [
        ['start','end','cat'],
        ['start','end','cat']
    ]
);

print "Using C-ish for-loop syntax version\n";

foreach my $key (keys %hash) {
    my $j;
    my $i;
    print "$key: ";
    for ($i=0, $i < 2, ++$i ) {
        for ($j=0, $j<3, ++$j) {
             print "$hash{$key}[$i][$j]  ";
        }
    }
    print "\n";
}

print "\n\nUsing Perl-ish for-loop syntax version\n";

foreach my $key (keys %hash) {
    print "$key: ";
    for my $i (0..1) {
       for my $j (0..2) {
           print "$hash{$key}[$i][$j]  ";
       }
    }
    print "\n";
}

The output:

Using C-ish for-loop syntax version
chr1: end  end  end  end  end  end  end  end  end  
chr2: end  end  end  end  end  end  end  end  end  


Using Perl-ish for-loop syntax version
chr1: start  end  cat  raj  end  cat  
chr2: start  end  cat  start  end  cat

When I got the script, the first thing I checked was whether it had use strict; and use warnings;. It had both. Great!

The indentation also looked good. So what could cause the different behavior?

For a second I wondered whether keys might return the keys in a different order in the two calls, but that should not happen on an unmodified hash, and even if it did, it would not explain this output.

Interestingly, the author of this code used foreach for the outer loops and for in the 4 inner loops. Twice he used it C-style and twice Perl-style. I am not sure if he was aware that for and foreach are synonyms, and that Perl decides which kind of loop to use based on the syntax inside the parentheses.
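A minimal sketch of that synonym behavior, with made-up variable names: the keyword does not matter, only what is inside the parentheses.

```perl
use strict;
use warnings;

my @c_style;
# "foreach" keyword with C-style syntax: three parts separated by semicolons
foreach (my $i = 0; $i < 3; $i++) {
    push @c_style, $i;
}

my @perl_style;
# "for" keyword with Perl-style syntax: a list of values
for my $n (0 .. 2) {
    push @perl_style, $n;
}

print "C-style loop written with foreach: @c_style\n";
print "Perl-style loop written with for:  @perl_style\n";
```

Both loops visit 0, 1 and 2.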

Anyway, I still did not know what the problem was, but there was one small thing bothering me. In the first case, $i and $j are declared outside of the for loops. Let's fix that and use the following as the first loop:

foreach my $key (keys %hash) {
    print "$key: ";
    for (my $i=0, $i < 2, ++$i ) {
        for (my $j=0, $j<3, ++$j) {
             print "$hash{$key}[$i][$j]  ";
        }
    }
    print "\n";
}

Running the script now gave me this error:

$ perl for-each.pl 
Global symbol "$i" requires explicit package name at for-each.pl line 21.
Global symbol "$i" requires explicit package name at for-each.pl line 21.
Global symbol "$j" requires explicit package name at for-each.pl line 22.
Global symbol "$j" requires explicit package name at for-each.pl line 22.
Execution of for-each.pl aborted due to compilation errors.

What the ??? How did this happen?

I was quite frustrated by this time. You see, I had just woken up and had not eaten anything yet.

Clearly Perl and I have a misunderstanding here.

So I wondered what Perl thinks about this code.

B::Deparse - the magic wand

That's where I brought in the magic wand, aka. B::Deparse. It can tell me what Perl thinks I wrote.

So I ran the original script using B::Deparse:

$ perl -MO=Deparse for-each.pl

use warnings;
use strict;
my(%hash) = ('chr1', [['start', 'end', 'cat'], ['raj', 'end', 'cat']], 'chr2', [['start', 'end', 'cat'], ['start', 'end', 'cat']]);
print "Using C-ish for-loop syntax version\n";
foreach my $key (keys %hash) {
    my $i;
    my $j;
    print "${key}: ";
    foreach $_ ($i = 0, $i < 2, ++$i) {
        foreach $_ ($j = 0, $j < 3, ++$j) {
            print "$hash{$key}[$i][$j]  ";
        }
    }
    print "\n";
}
print "\n\nUsing Perl-ish for-loop syntax version\n";
foreach my $key (keys %hash) {
    print "${key}: ";
    foreach my $i (0 .. 1) {
        foreach my $j (0 .. 2) {
            print "$hash{$key}[$i][$j]  ";
        }
    }
    print "\n";
}

It replaced all occurrences of for with foreach. That's strange. I would have expected it to show the C-style for loops with the for keyword. Not only that, but take a closer look at the first internal loop. It says foreach $_ ($i = 0, $i < 2, ++$i). Why is it iterating with $_, the default variable?

That's bad. I thought we were iterating over $i. After all, we use $i in the expression that prints the value from %hash.

That's when the light came on. Perl thinks we had a Perl-style foreach loop there while we wanted a C-style for loop.

The way Perl can differentiate between the two loops is that the Perl-style foreach loop has a list of values in the parentheses, while the C-style for loop has 3 parts separated by ;.

By mistake, we had a , (comma) separating the 3 parts of the for loop instead of a ; (semicolon), and so Perl thought we meant a foreach loop iterating over the three values $i=0, $i < 2, and ++$i instead of a C-style for loop.

A silly mistake that causes this strange problem because of the duality of the for-loop in Perl.
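This also accounts for the nine copies of end in the output. The following is a minimal sketch, reduced from the loop in the script (the @seen array is added only for illustration). Perl evaluates the whole list before the first iteration, so by the time the body runs, $i has already been set to 0 and then incremented to 1, and it stays there for all three iterations.

```perl
use strict;
use warnings;

my @seen;    # value of $i observed in each iteration
my $i;

# Commas instead of semicolons: this is a foreach loop over a 3-element list.
# The list ($i = 0, $i < 2, ++$i) is fully evaluated before the loop starts.
for ($i = 0, $i < 2, ++$i) {
    push @seen, $i;
}

print scalar(@seen), " iterations, \$i was: @seen\n";
# prints: 3 iterations, $i was: 1 1 1
```

The inner loop does the same with $j, so the script prints $hash{$key}[1][1], which is end, 3 x 3 = 9 times for each key.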

The fixed version

The correct way to write those internal for-loops is as follows. This version also moves the my declarations inside the for expressions.

foreach my $key (keys %hash) {
    print "$key: ";
    for (my $i=0; $i < 2; ++$i ) {
        for (my $j=0; $j<3; ++$j) {
             print "$hash{$key}[$i][$j]  ";
        }
    }
    print "\n";
}

Conclusion

There are cases when we disagree with Perl. B::Deparse can help us understand what Perl thought of our code.
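B::Deparse can also be used from inside a script, not only from the command line. The following is a minimal sketch using its coderef2text method; the two anonymous subs are made up for illustration, and the exact formatting of the output may differ between Perl versions.

```perl
use strict;
use warnings;
use B::Deparse;

my $deparse = B::Deparse->new;

# Comma-separated: Perl parses this as a foreach loop over a 3-element list.
my $with_commas = $deparse->coderef2text(sub {
    my $i;
    for ($i = 0, $i < 2, ++$i) { print "$i\n" }
});

# Semicolon-separated: a genuine C-style for loop.
my $with_semicolons = $deparse->coderef2text(sub {
    for (my $i = 0; $i < 2; ++$i) { print "$i\n" }
});

print "With commas:\n$with_commas\n\n";
print "With semicolons:\n$with_semicolons\n";
```

The first deparsed body should contain a foreach loop, while the second should not, which is the same hint that gave away the bug in this article.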

Written by
Gabor Szabo

Published on 2014-02-27
