Perl
Preface
Content
- Introduction to writing Perl
- Many code snippets that can be used as base or as example
- Introduction to basic syntax
- Many examples explained
- Many examples where not all the syntax is explained
Introdcution
Installation
- On UNIX/Linux you usually have it installed in /usr/bin/perl
- On Microsoft Windows install ActivePerl from ActiveState
- Better yet, download Strawberry Perl
- DWIM Perl for Windows (and Linux) - not maintained any more.
- Citrus Perl
- Installing and getting started with Perl
- Download and install Perl
Editors
IDEs
- Komodo of ActiveState
- Eclipse with EPIC or Perlipse
- Jetbrains IntelliJ IDEA + perl plugin.
- Padre, the Perl IDE (bundled with DWIM Perl) (not maintained any more)
- Perl Editor
Perl on the command line
- -e|options
- -v|options
On the command line one can type:
- perl -e "print 42"
- perl -v
- perl -V
- perl -E "say 42"
See Perl One-liners.
First script
- string
- number
- "
- ;
- \n
- sh-bang
#!/usr/bin/env perl
use v5.8.7; # use 5.008007;
use strict;
use warnings;
print "Hello world\n";
print 42, "\n";
run it by typing
perl hello_world.pl
On unix you can also make it executable:
chmod u+x hello_world.pl
and then run like:
./hello_world.pl
A couple of notes
- Strings and numbers
- Strings must be quoted, you can use special characters such as "\n"
- The print statement (Output) - gets comma delimitered list of things
- ; after every statement
Safety net
- -w|options
- strict
- warnings
- diagnostics
#!/usr/bin/perl
use strict;
use warnings;
You should always use them as they are a safety net helping reduce mistakes.
It is usually very hard to add this safety net after you already have some code.
If the warnings you get don't make sense add
use diagnostics;
line and you will get more verbose warnings.
Why are use warnings and use strict so important even in small (< 100 lines) scripts ?
-
Help avoiding trouble with recursive functions
-
Help avoiding typos in variable names
-
Disable unintentional symbolic references
-
Reduce debugging time
-
Enable/enforce better coding standard => cleaner code, maintainability
First sub
- sub
#!/usr/bin/env perl
use v5.8.7; # use 5.008007;
use strict;
use warnings;
sub main {
print "Hello world\n";
print 42, "\n";
}
main();
- Putting all your code in small(!) functions will make your code better.
First say
- say
#!/usr/bin/env perl
use v5.10; # use 5.010;
use strict;
use warnings;
sub main {
say "Hello world";
say 42;
}
main();
print vs. say
say "hello"; # print a newline at the end. (available since perl 5.10)
print "hello"; # print as it is
print "hello\n"; # print a newline at the end.
Debugging
-
-d|options
-
debug
-
Devel::ptkdb
-
Running script perl mycode.pl
-
Use the built-in debugger perl -d mycode.pl
-
Install Devel::ptkdb and run ** perl -d:ptkdb mycode.pl**
-
IDEs have debuggers
-
Include print statements in the critical places of your code.
-
use Data::Dumper qw(Dumper);
Avoid global variables
- Not really Perl specific, but declare variables as late as possible!
- Don't use global variables!
- You will thank yourself a year later.
Keep it Simple - KISS
- A script has to be simple, readable and reusable.
- Sophisticated optimization and algorithms are less interesting.
Comments
# Comments for other developers
print 42; # the answer
Perl documentation
-
perldoc
-
perldoc perl
-
perldoc perlsyn
-
perldoc perlfunc
-
perldoc -f print
-
perldoc -q sort
-
perldoc perlrun
-
perldoc strict
-
perldoc warnings
Web based: perldoc
Core Perl documentation and CPAN module documentation
POD - Plain Old Documentation
- POD
- =head1
- =head2
#!/usr/bin/perl
use strict;
use warnings;
print "Hello, there is no more code here\n";
=head1 Explaining how PODs work
Documentation starts any time there is a =tag
at the beginning of a line (tag can be any word)
and ends where there is a =cut at the beginning
of a line.
Around the =tags you have to add empty rows.
A few example tags:
Main heading =head1
Subtitle =head2
Start of indentation =over 4
element =item *
end of indentation =back
Documentation of PODs can be found in B<perldoc perlpod>
See a few examples:
=head1 Main heading
text after main heading
=head2 Less important title
more text
some text shown verbatim
more verbatim text typed in indented to the right
=over 4
=item *
Issue
=item *
Other issue
=back
documentation ends here
=cut
print "Just documentation\n";
perl examples/firstperl/documentation.pl
perldoc examples/firstperl/documentation.pl
Exercise: Hello world
Try your environment:
- Make sure you have access to the right version of Perl (5.8.x) or newer
- Check you can read the documentation. (perldoc)
- Check if you have a good editor with syntax highlighting
- Write a simple script that prints Hello world
- Add comments to your code
- Add user documentation to your code
Scalars
Scalars intro
A single piece of data either a number or a string is called a 'scalar' in Perl.
Scalar values
- undef
- a number
- a string
- a reference to any other data structure or function
undef
- undef in Perl
- NULL in SQL
- None in Python
- null in PHP or JavaScript
Numbers - integers, real or floating-point
integer (decimal)
No need for quotes around the value.
26
1_234_567 # like 1,234,567 in human writing
integer (hex/oct/binary)
0x1a # hex also written as hex("1a");
032 # oct also written as oct("32");
0b11010 # binary also written as oct("0b11010");
# all 3 equal to 26 decimal
real or floating-point
3.5e+3 # 3500
Strings
"Hello world";
or
'Hello world';
Scalar variables (use my)
-
my
-
scalar variables
-
Scalar variables always start with a $ sign, name is alphanumeric (a-zA-Z0-9) and underscore (_)
-
A scalar variable can hold a string, a number, a reference to another data-structure, undef
-
Value assignment to variable is done by the = sign
-
Use the my keyword to declare variables (optional but recommended)
$this_is_a_long_scalar_variable
$ThisIsAlsoGoodButWeUseItLessInPerl
$h
$H # $h and $H are two different variables
#!/usr/bin/perl
use strict;
use warnings;
my $greeting = "Hello world\n";
my $the_answer = 42;
my $name; # undef
print $greeting;
print $the_answer, "\n";
$name = 'Foo Bar';
print $name, "\n";
$the_answer = "Hi, you two";
The value can be changed any time.
Sigils and variables
- $
- @
- %
- &
-
Variables in Perl are Sigil + variable name:
- $ - scalar
- @ - array
- % - hash
- & - function (only used in references)
-
-
- everything (just in very extreme code)
-
$name_of_a_scalar
@name_of_an_array
%name_of_a_hash
Common error messages
- undef
Global symbol "$x" requires explicit package name at ..
You need to declare the variable $x by my
.
Use of uninitialized value $x in ... at ...
$x contained undef
.
Name "main::x" used only once: possible typo at ...
What it said. It probably refers to $x.
Can't locate Module/NameX.pm in @INC (@INC contains: ... )
You probably have "use Module::NameX" in your code meaning you are trying to load Module::NameX. Either there is a typo in the name of the module (e.g. in our case it is probably called Module::Name) or you need to install Module::NameX.
Scalar found where operator expected at ...
Probably a , is missing between parameters of a function?
Greeting with a name, Variable interpolation
- interpolation|scalar
#!/usr/bin/perl
use strict;
use warnings;
my $name = "Foo";
print "Hello ", $name, " - how are you ?\n";
print "Hello $name - how are you ?\n";
User Input
- <STDIN>
- STDIN
#!/usr/bin/perl
use strict;
use warnings;
print "Enter your name, please: ";
my $name = <STDIN>;
print "Hello $name - how are you ?\n";
- STDIN - Standard Input (usually it is the keyboard)
- Reading one line (till ENTER) from STDIN
$ perl examples/scalars/read_from_stdin.pl
Enter your name, please: Foo
Hello Foo
- how are you ?
There is this problem of the newline
chomp
- chomp
#!/usr/bin/perl
use strict;
use warnings;
print "Enter your name, please: ";
my $name = <STDIN>;
chomp $name;
print "Hello $name - how are you ?\n";
chomp
will remove the new line "\n" character from the end of the string if there was one.
Numerical Operators
- operators|numerical
-
-
-
- /
- %
- assignment
- shorthand
- =
- ==
- *=
- variable interpolation
- operator precedence
- operator associativity
#!/usr/bin/perl
use strict;
use warnings;
my $x = 3;
my $y = 11;
my $z = $x + $y;
print "$z\n"; # 14
$z = $x * $y;
print "$z\n"; # 33
print $y / $x, "\n"; # 3.66666666666667
$z = $y % $x; # (modulus)
print "$z\n"; # 2
$z += 14; # is the same as $z = $z + 14;
print "$z\n"; # 16
$z++; # is the same as $z = $z + 1;
$z--; # is the same as $z = $z - 1;
$z = 23 ** 2; # exponentiation
print "$z\n"; # 529
See also perldoc perlop
for all the operators.
Autoincrement
- auto-increment
- auto-decrement
- ++
- --
#!/usr/bin/perl
use strict;
use warnings;
my $x = 7;
# Postfix ++ increments AFTER the OLD value was used
my $y = $x++;
print "y = $y, x = $x\n"; # y = 7, x = 8,
$x = 7;
$y = ++$x;
print "y = $y, x = $x\n"; # y = 8, x = 8
String Operators
- operators|string
- .
- x
#!/usr/bin/perl
use strict;
use warnings;
my $x = "Hello";
my $y = "World";
# . is the concatenation operator, ataching ons string after the other
my $z = $x . " " . $y; # the same as "$x $y"
print $z, "\n"; # Hello World
my $w = "Take " . (2 + 3); # you cannot write "Take (2 + 3)" here
print "$w\n"; # Take 5
$z .= "! "; # the same as $z = $z . "! ";
print "'$z'\n"; # 'Hello World! '
# x is the string repetition operator
my $q = $z x 3;
print "'$q'\n"; # 'Hello World! Hello World! Hello World! '
See also perldoc perlop
for all the operators.
String - Number conversion
- automatic conversion
#!/usr/bin/perl
use strict;
use warnings;
print 3 . "", "\n";
print 3.1 . "", "\n";
print "3" + 0, "\n";
print "3.1" + 0, "\n";
print "3x" + 0, "\n"; # warning: Argument "3x" isn't numeric in addition (+)
print "3\n" + 0, "\n";
print "3x7" + 0, "\n"; # warning: Argument "3x7" isn't numeric in addition (+)
print "" + 0, "\n"; # warning: Argument "" isn't numeric in addition (+)
print "z" + 0, "\n"; # warning: Argument "z" isn't numeric in addition (+)
print "z7" + 0, "\n"; # warning: Argument "z7" isn't numeric in addition (+)
Conditional statements: if
- if
- ==
- indentation
Sometimes based on some condition a piece of code has to be executed or not.
#!/usr/bin/perl
use strict;
use warnings;
print "First number: ";
my $x = <STDIN>;
chomp $x;
print "Second number: ";
my $y = <STDIN>;
chomp $y;
if ($y == 0) {
print "Cannot divide by zero\n";
} else {
my $z = $x / $y;
print "The result is $z\n";
}
Syntax of if statement
- else
- elsif
- indentation
{} are always required
if (COND) {
STATEMENTs;
}
if (COND) {
STATEMENTs;
} else {
STATEMENTs;
}
if (COND_1) {
A_STATEMENTs;
} else {
if (COND_2) {
B_STATEMENTs;
} else {
if (COND_3) {
C_STATEMENTs;
} else {
D_STATEMENTs;
}
}
}
if (COND_1) {
A_STATEMENTs;
} elsif (COND_2) {
B_STATEMENTs;
} elsif (COND_3) {
C_STATEMENTs;
} else {
D_STATEMENTs;
}
Comparison operators
- comparison operators
- ASCII
- ==
- <=
- eq
- ne
- lt
Two sets of relation operators. One is to compare numerically the other is to compare based on the ASCII table
| Numeric | String (ASCII) | Meaning | | == | eq | equal | | != | ne | not equal | | < | lt | less than | | > | gt | greater than | | <= | le | less than or equal | | >= | ge | greater than or equal |
Compare values - examples
| Expression | Value | | "12.0" == 12 | TRUE | | "12.0" eq 12 | FALSE | | 2 < 3 | TRUE | | 2 lt 3 | TRUE | | 12 > 3 | TRUE | | 12 gt 3 | FALSE ! | | "foo" == "" | TRUE ! (Warning) | | "foo" eq "" | FALSE | | "foo" == "bar" | TRUE ! (Warning) | | "foo" eq "bar" | FALSE |
When reading from STDIN you can always expect a string
#!/usr/bin/perl
use strict;
use warnings;
my $input = <STDIN>;
chomp $input;
if ($input == "") { # wrong! use eq
# empty string
}
undef, the initial value and defined
$q = $x + 1; # is 1 + warning as the default number is 0
$w = $y . "abc"; # is 'abc' + warning as the default string is ""
$z++; # is 1 - no warning
You can check if the variable already has a value or if it still undef:
if (not defined $x) { # to avoid warning
$x = 0;
}
$x = $x + 1;
Boolean expressions (logical operators)
- and
- or
- not
- xor
- &&
- ||
| and | && | | or | || | | not | ! | | xor | |
if (COND and COND) {
}
if (COND or COND) {
}
if (not COND) {
}
See also perldoc perlop
for precedence and associativity tables
and/or use () to define the order of evaluation.
Boolean values: TRUE and FALSE
- undef
- defined
- boolean values
- TRUE
- FALSE
if ($z) {
# $z is true
}
The FALSE values:
undef
""
0 0.0 00000 0e+10
"0"
All other values, such as the following are TRUE:
1
"00"
"0\n"
In many cases the separation must be between "real" values and undefined values.
For that you can use the defined
function:
if (defined $x) {
# $x is defined (not undef)
}
Your Salary is in Danger - Short-Circuit
- short circuit
If perl already knows the final value of a boolean expression after computing
only part of it, perl will NOT calculate the rest of the expression:
if ($my_money > 1_000_000 or $my_salary > 10_000) {
# I can live well
}
if ($my_money > 1_000_000 or $my_salary++ > 10_000) {
# I can live well
}
Exercise: Rectangular
Write a program that computes the area of a rectangular ($length * $width) and prints it. Use two variables and hard code their values.
Exercise: Rectangular prompt
Modify the previous program to prompt for the two values (on two separate lines)
Exercise: Rectangular warn
Modify the previous area-calculator program to print a warning if one of the values is negative and make the area 0 sized.
Exercise: Concatenation
Script that gets two strings (on two separate lines) and prints the concatenated version.
Exercises: Simple Calculator
Write a script that will ask for a number, an operator (+,*,-,/)
and another number. Compute the result and print it out.
Solution: Rectangular
#!/usr/bin/perl
use strict;
use warnings;
my $length = 3;
my $width = 5;
my $area = $length * $width;
print "$area\n";
Solution: Rectangular prompt
#!/usr/bin/perl
use strict;
use warnings;
print "length: ";
my $length = <STDIN>;
print "width: ";
my $width = <STDIN>;
my $area = $length * $width;
print "$area\n";
Solution: Rectangular warn
#!/usr/bin/perl
use strict;
use warnings;
print "length: ";
my $length = <STDIN>;
chomp($length);
print "width: ";
my $width = <STDIN>;
chomp($width);
if ( $length < 0 ) {
$length = 0;
print "WARN: The first value was negative\n";
}
if ( $width < 0 ) {
$width = 0;
print "WARN: The second value was negative\n";
}
my $area = $length * $width;
print "The area is $area\n";
Solution: Concatenation
#!/usr/bin/perl
use strict;
use warnings;
print "Type in two strings, I'll print them out concatenated\n\n";
print "The first string: ";
my $first = <STDIN>;
chomp($first);
print "The second string: ";
my $second = <STDIN>;
print ("\nThe concatenated string: ", $first . $second);
# alternative solution:
# print ("The concatenated string: $first$second");
# don't forget to chomp the input strings.
# if you use them as numbers then you don't have to chomp.
Solution: Simple Calculator
#!/usr/bin/perl
use strict;
use warnings;
print "Type in 2 numbers and an operator and I'll print the results\n\n";
print "First number: ";
my $first = <STDIN>;
chomp($first);
print "Second number: ";
my $other = <STDIN>;
chomp($other);
print "The operator: ";
my $oper = <STDIN>;
chomp($oper);
my $result;
if ($oper eq "+") { $result = $first + $other; }
if ($oper eq "-") { $result = $first - $other; }
if ($oper eq "*") { $result = $first * $other; }
if ($oper eq "/") {
if ($other == 0) {
print "\nCannot divide by 0\n";
$result = "ERROR";
} else {
$result = $first / $other;
}
}
print "\nResult of $first $oper $other = $result\n";
# What if the given operator is not one of the 4 ?
Solution: Simple Calculator (using eval)
#!/usr/bin/perl
use strict;
use warnings;
print "Type in 2 numbers and an operator and I'll print the results\n\n";
print "First number: ";
my $first = <STDIN>;
chomp($first);
print "Second number: ";
my $other = <STDIN>;
chomp($other);
print "The operator: ";
my $oper = <STDIN>;
chomp($oper);
my $result = eval "$first $oper $other";
print "\nResult of $first $oper $other = $result\n";
String functions (length, lc, uc, index)
-
length
-
lc
-
uc
-
ucfirst
-
lcfirst
-
length STRING - number of characters
-
lc STRING - lower case
-
uc STRING - upper case
-
lcfirst STRING - lower case
-
ucfirst STRING - upper case
use 5.010;
use strict;
use warnings;
my $name = 'Foo Bar';
say length $name; # 7
say length "Hello World!\n"; # 13
say length 19; # 2
String functions (index, rindex)
-
index
-
rindex
-
index STRING, SUBSTRING - the location of a substring given its content
-
rindex STRING, SUBSTRING - the location of a substring given its content
#!/usr/bin/perl
use strict;
use warnings;
my $s = "The black cat jumped from the green tree";
print index $s, "ac"; # 6
print "\n";
print index $s, "e"; # 2
print "\n";
print index $s, "e", 3; # 18
print "\n";
print index $s, "dog"; # -1
print "\n";
print rindex $s, "e"; # 39
print "\n";
print rindex $s, "e", 38; # 38
print "\n";
print rindex $s, "e", 37; # 33
print "\n";
String functions: length, lc, uc, index, substr
String functions (substr)
- substr
substr STRING, OFFSET, LENGTH - the content of a substring given its location
#!/usr/bin/perl
use strict;
use warnings;
my $s = "The black cat climbed the green tree";
my $z;
$z = substr $s, 4, 5;
print "$z\n"; # $z = black
$z = substr $s, 4, -11;
print "$z\n"; # $z = black cat climbed the
$z = substr $s, 14;
print "$z\n"; # $z = climbed the green tree
$z = substr $s, -4;
print "$z\n"; # $z = tree
$z = substr $s, -4, 2;
print "$z\n"; # $z = tr
$z = substr $s, 14, 7, "jumped from";
print "$z\n"; # $z = climbed
print "$s\n"; # $s = The black cat jumped from the green tree
Strings - Double quoted
use 5.010;
use strict;
use warnings;
say "normal string"; # normal string
say "two\nlines"; # two
# lines
say "another 'string'"; # another 'string'
my $name = "Foo";
say "Hello $name, how are you?"; # Hello Foo, how are you?
# say "His "real" name is Foo"; # ERROR - Bareword found where operator expected
say "His \"real\" name is Foo"; # His "real" name is Foo
say "His \"real\" name is \"$name\""; # His "real" name is "Foo"
say qq(His "real" name is "$name"); # His "real" name is "Foo"
say qq(His "real" name is ($name)); # His "real" name is (Foo)
# say qq(His "real" name )is( $name); # ERROR - Bareword found where operator expected
say qq[His "real" name )is ($name)]; # His "real" name )is (Foo)
In double quoted strings you can use the following:
- Backslash escapes sequences like \n \t see in perldoc perlop
- Variable interpolation
Strings - Single quoted
- q
print 'one string'; # one string
print 'a\n'; # a\n
print 'a $name'; # a $name
print 'another "string"'; # another "string"
There are only two special characters in this kind of string the ' and the \ at the end of the string
print 'a'b'; # ERROR - perl will see the string 'a'
# and something attached to it
print 'a\'b'; # a'b
print q(His "variable" name '$name'\n); # His "variable" name is '$name'\n
Scope of variables
- scope
Variables defined within a block {} are hiding more global variables with the same name. They are destructed when leaving the block.
#!/usr/bin/perl
use strict;
use warnings;
{
my $email = 'foo@bar.com';
print "$email\n"; # foo@bar.com
}
# print $email;
# $email does not exists
# Global symbol "$email" requires explicit package name at ...
Scope of variables - 2
#!/usr/bin/perl
use strict;
use warnings;
my $lname = "Bar";
print "$lname\n"; # Bar
{
print "$lname\n"; # Bar
$lname = "Other";
print "$lname\n"; # Other
}
print "$lname\n"; # Other
Scope of variables - 3
#!/usr/bin/perl
use strict;
use warnings;
my $fname = "Foo";
print "$fname\n"; # Foo
{
print "$fname\n"; # Foo
my $fname = "Other";
print "$fname\n"; # Other
}
print "$fname\n"; # Foo
Scope of variables - 4
#!/usr/bin/perl
use strict;
use warnings;
{
my $name = "Foo";
print "$name\n"; # Foo
}
{
my $name = "Other";
print "$name\n"; # Other
}
Scope of variables - 5
#!/usr/bin/perl
use strict;
use warnings;
my $fname = "Foo";
print "$fname\n"; # Foo
package Other;
use strict;
use warnings;
print "$fname\n"; # Foo
my $lname = 'Bar';
print "$lname\n"; # Bar
package main;
print "$fname\n"; # Foo
print "$lname\n"; # Bar
Random numbers
#!/usr/bin/perl
use strict;
use warnings;
my $num = rand(); # returns a random number: 0 <= NUMBER < 1
my $n = rand(100); # returns a number: 0 <= NUMBER < 100
my $i = int(3.12); # returns the whole part of the number (3 in this case)
my $number = int(rand(100)); # returns a whole number: 0 <= NUMBER < 100
Here documents - double quotes
- here document
- <<
use strict;
use warnings;
my $fname = 'Foo';
my $lname = 'Bar';
my $str = <<"END_REPORT";
Daily user report
------------------
User details:
Name: $fname $lname
END_REPORT
print $str;
Daily user report
------------------
User details:
Name: Foo Bar
Here documents - single quotes
use strict;
use warnings;
my $str = <<'END_STRING';
Some text
$this_is_not_interpolated
more text
END_STRING
print $str;
Some text
$this_is_not_interpolated
more text
Exercise: Number Guessing game
Using the rand() function the computer "thinks" about a whole number between 1 and 200.
The user has to guess the number. After the user types in his guess the computer tells if this was bigger or smaller than the number it generated.
At this point there is no need to allow the user to guess several times.
Solution: Number Guessing game
#!/usr/bin/perl
use strict;
use warnings;
my $N = 200;
my $hidden = 1 + int rand $N;
print "Please guess between 1 and $N\n";
my $guess = <STDIN>;
chomp $guess;
if ($guess < $hidden) {
print "$guess is too small\n";
}
if ($guess > $hidden) {
print "$guess is too big\n";
}
if ($guess == $hidden) {
print "Heureka!\n";
}
Files
die, warn, exit
- die
- warn
- exit
- STDERR
exit() - exits from the program
warn() - writes to STDERR
die() - writes to STDERR and exits from the program (raising exception)
warn "This is a warning";
This is a warning at script.pl line 132.
If no \n at the end of the string both warn and die add the
name of file and line number and possibly the chunk of the input.
while loop
Another tool to create a loop is by using while
.
while (EXPRESSION) { BLOCK }
my $i = 100;
while ($i > 0) {
# do something
$i--;
}
while (1) {
# a nice infinite loop
}
Loop controls: next, last
-
next
-
last
-
next - evaluate the loop condition and if it is true go to the next iteration. (continue in other languages)
-
last - exit the loop. (break in other languages)
-
redo - start the iteration again without evaluating the loop condition.
{% embed include file="src/examples/perlarrays/loop_control.txt)
#!/usr/bin/perl
use strict;
use warnings;
my $counter = 0;
my $total = 0;
while (1) {
$counter++;
my $num = rand(1);
# print "Debug: $num $total\n";
if ($num < 0.2) {
next;
}
$total += $num;
if ($total > 3) {
last;
}
### next jumps here ###
}
### last jumps here ###
print "Counter: $counter Total: $total\n"
'First line'
'Line after empty line'
'Another text line'
Number of non empty rows: 3 out of a total of 6
Opening file for reading
- open
While working over most of the operating systems today, no program can
access a file directly. This is in order to allow the Operating System
to apply user rights.
Before you can read from a file you have to ask the Operating System to "open"
it for you. When opening a file you provide a variable that will become your
handle to the opened file. It is called a filehandle.
#!/usr/bin/perl
use strict;
use warnings;
my $filename = "input.txt";
open(my $fh, "<", $filename);
close $fh;
Opening a file
- append
- write
#!/usr/bin/perl
use strict;
use warnings;
my $filename = "some_filename";
open(my $fhb, "<", $filename); # read
open(my $fhc, ">", $filename); # write
open(my $fhd, ">>", $filename); # append
open(my $fhe, "+<", $filename); # read and write
Opening a file - error handling
-
$!
-
$! - error message from the Operating system
#!/usr/bin/perl
use strict;
use warnings;
my $filename = "input.txt";
if (open my $in, "<", $filename) {
# do your thing here
# no need to explicitly close the file
} else {
warn "Could not open file '$filename'. $!";
}
# here the $in filehandle is not accessible anymore
A more Perlish way to open a file and exit with error message if you could not open the file:
#!/usr/bin/perl
use strict;
use warnings;
my $filename = "input.txt";
open(my $fh, "<", $filename) or die "Could not open file '$filename'. $!";
# do your thing here
close $fh;
Opening a missing file
#!/usr/bin/perl
use strict;
use warnings;
if (open my $fh, '<', "other_data.txt") {
# should do here something
} else {
warn $!;
}
The error message we get:
No such file or directory at examples/files-perl/open_missing_file.pl line 7.
Read one line from a file
#!/usr/bin/perl
use strict;
use warnings;
# Reading a line from a file (or rather from a filehandle)
my $filename = "input.txt";
if (open my $data, "<", $filename) {
my $line = <$data>;
print $line;
} else {
warn "Could not open file '$filename': $!";
}
Process an entire file line by line (while, cat)
-
while
-
while - executes as long as there is something in $line, as long as there are lines in the file
-
Loop over file (name hard-coded) and print every line (UNIX cat)
#!/usr/bin/perl
use strict;
use warnings;
my $filename = "input.txt";
open(my $fh, "<", $filename) or die "Could not open '$filename'\n";
while (my $line = <$fh>) {
print $line;
}
Instead of printing the line you could do anything with it.
Write to a file
#!/usr/bin/perl
use strict;
use warnings;
my $filename = "report.txt";
open my $fh, '>', $filename or die "Could not open file '$filename' $!";
my $title = "Report by: Foo Bar";
print $fh "$title\n";
print $fh "-" x length $title, "\n";
Sum of numbers in a file
3
7
23
-17
98
12
#!/usr/bin/perl
use strict;
use warnings;
# given a file with a number on each row, print the sum of the numbers
my $sum = 0;
my $filename = "numbers.txt";
open(my $fh, "<", $filename) or die "Could not open '$filename'\n";
while (my $line = <$fh>) {
$sum += $line;
}
print "The total value is $sum\n";
Analyze the Apache log file
127.0.0.1 - - [10/Apr/2007:10:39:11 +0300] "GET / HTTP/1.1" 500 606 "-" "Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.8.1.3) Gecko/20061201 Firefox/2.0.0.3 (Ubuntu-feisty)"
127.0.0.1 - - [10/Apr/2007:10:39:11 +0300] "GET /favicon.ico HTTP/1.1" 200 766 "-" "Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.8.1.3) Gecko/20061201 Firefox/2.0.0.3 (Ubuntu-feisty)"
139.12.0.2 - - [10/Apr/2007:10:40:54 +0300] "GET / HTTP/1.1" 500 612 "-" "Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.8.1.3) Gecko/20061201 Firefox/2.0.0.3 (Ubuntu-feisty)"
139.12.0.2 - - [10/Apr/2007:10:40:54 +0300] "GET /favicon.ico HTTP/1.1" 200 766 "-" "Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.8.1.3) Gecko/20061201 Firefox/2.0.0.3 (Ubuntu-feisty)"
127.0.0.1 - - [10/Apr/2007:10:53:10 +0300] "GET / HTTP/1.1" 500 612 "-" "Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.8.1.3) Gecko/20061201 Firefox/2.0.0.3 (Ubuntu-feisty)"
127.0.0.1 - - [10/Apr/2007:10:54:08 +0300] "GET / HTTP/1.0" 200 3700 "-" "Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.8.1.3) Gecko/20061201 Firefox/2.0.0.3 (Ubuntu-feisty)"
127.0.0.1 - - [10/Apr/2007:10:54:08 +0300] "GET /style.css HTTP/1.1" 200 614 "http://pti.local/" "Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.8.1.3) Gecko/20061201 Firefox/2.0.0.3 (Ubuntu-feisty)"
127.0.0.1 - - [10/Apr/2007:10:54:08 +0300] "GET /img/pti-round.jpg HTTP/1.1" 200 17524 "http://pti.local/" "Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.8.1.3) Gecko/20061201 Firefox/2.0.0.3 (Ubuntu-feisty)"
127.0.0.1 - - [10/Apr/2007:10:54:21 +0300] "GET /unix_sysadmin.html HTTP/1.1" 200 3880 "http://pti.local/" "Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.8.1.3) Gecko/20061201 Firefox/2.0.0.3 (Ubuntu-feisty)"
217.0.22.3 - - [10/Apr/2007:10:54:51 +0300] "GET / HTTP/1.1" 200 34 "-" "Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.8.1.3) Gecko/20061201 Firefox/2.0.0.3 (Ubuntu-feisty)"
217.0.22.3 - - [10/Apr/2007:10:54:51 +0300] "GET /favicon.ico HTTP/1.1" 200 11514 "-" "Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.8.1.3) Gecko/20061201 Firefox/2.0.0.3 (Ubuntu-feisty)"
217.0.22.3 - - [10/Apr/2007:10:54:53 +0300] "GET /cgi/pti.pl HTTP/1.1" 500 617 "http://contact.local/" "Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.8.1.3) Gecko/20061201 Firefox/2.0.0.3 (Ubuntu-feisty)"
127.0.0.1 - - [10/Apr/2007:10:54:08 +0300] "GET / HTTP/0.9" 200 3700 "-" "Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.8.1.3) Gecko/20061201 Firefox/2.0.0.3 (Ubuntu-feisty)"
217.0.22.3 - - [10/Apr/2007:10:58:27 +0300] "GET / HTTP/1.1" 200 3700 "-" "Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.8.1.3) Gecko/20061201 Firefox/2.0.0.3 (Ubuntu-feisty)"
217.0.22.3 - - [10/Apr/2007:10:58:34 +0300] "GET /unix_sysadmin.html HTTP/1.1" 200 3880 "http://pti.local/" "Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.8.1.3) Gecko/20061201 Firefox/2.0.0.3 (Ubuntu-feisty)"
217.0.22.3 - - [10/Apr/2007:10:58:45 +0300] "GET /talks/Fundamentals/read-excel-file.html HTTP/1.1" 404 311 "http://pti.local/unix_sysadmin.html" "Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.8.1.3) Gecko/20061201 Firefox/2.0.0.3 (Ubuntu-feisty)"
#!/usr/bin/perl
use strict;
use warnings;
my $file = "examples/files/apache_access.log";
open my $fh, '<', $file or die "Could not open '$file': $!";
my $local = 0;
my $remote = 0;
while (my $line = <$fh>) {
my $length = index ($line, " ");
my $ip = substr($line, 0, $length);
if ($ip eq "127.0.0.1") {
$local++;
} else {
$remote++;
}
}
print "Local: $local Remote: $remote\n";
Encoding and UTF-8
use strict;
use warnings;
my $file = "data.txt";
open(my $fh, '<:encoding(utf8)', $file) or die "Could not open '$file'\n";
...
Open files in the old way
In old version of perl (before 5.6) we could not use scalar variables as file handles so we used uppercase letters such as XYZ or INPUT, QQRQ or FILEHANDLE.
Also the function had only 2 parameters.
#!/usr/bin/perl
use strict;
use warnings;
my $filename = "data.txt";
open(FH, ">$filename") or die;
print FH "data";
close FH;
open(FH, $filename) or die;
my $line = <FH>;
close FH;
Security problems.
Being global, difficult to pass as parameter to functions.
Don't Open Files in the old way
Binary mode
#!/usr/bin/perl
use strict;
use warnings;
my $filename = "input.txt";
open(my $fh, "<", $filename) or die;
binmode($fh);
Reading from file, read, eof
- read
- eof
In Perl we usually care about lines of input so the above is enough. Still some like to read files with chunks of arbitrary length. read puts the read string to the variable passed to the function and returns the number of characters actually read READ_LENGTH = read FILEHANDLE,SCALAR,LENGTH
#!/usr/bin/perl
use strict;
use warnings;
# reading in 30 characters:
open my $in, "<", $0 or die $!;
my $expected = 30;
my $buf;
my $actual = read $in, $buf, $expected;
if ($actual < $expected) {
print "reached end of file\n";
}
# returns TRUE if we are at the end of file.
eof($in)
tell, seek
- tell
- seek
For our purposes a file is a line of characters. After a bunch of read and/or write operations we need to tell where are we on that line ?
LOCATION = tell FILEHANDLE
We might also want to move within that file
seek FILEHANDLE, OFFSET, WHENCE
WHENCE:
0 from beginning of file
1 from current location
2 from end of file
OFFSET:
+/- number of bytes to move
the important values are:
seek $fh, 0,0; # go to the beginning of the file
seek $fh, 0,2; # go to the end of the file
truncate
- truncate
# Sometimes you need to
truncate FILEHANDLE, LENGTH;
#!/usr/bin/perl
use strict;
use warnings;
my $new = $ARGV[0];
my $filename = "file.txt";
open my $fh, "+<", $filename or die "Could not open $!\n";
my $old = <$fh>;
seek $fh, 0, 0; # move to the beginning of the file
print $fh $new;
truncate $fh, length $new; # cut the file to the new size
Exercise: Add more statistics
Take the script from the previous example (examples/files-perl/count_sum.pl" %} and in addition to the sum of the numbers print also
minimum
maximum
average
median and standard deviation are probably too difficult for now.
Exercise: Write report to file
Take the exercise creating statistics of the numbers.txt file and write the results to the numbers.out file.
minimum: -17
maximum: 98
total: 126
count: 6
average: 21
You might need to look up the documentation of the printf command in order to align the columns.
Exercise: Analyze Apache - number of successful hits
In the Apache log file (examples/files-perl/apache_access.log) after the "GET something HTTP/1.1" part there is the result code of the requests. 200 is OK the rest might be some failure.
Please create a report showing how many of the hits were successful (200) and how many were something else.
Could you put all the lines in either of the categories?
Solution: Add more statistics
#!/usr/bin/perl
use strict;
use warnings;
my $total = 0;
my $count = 0;
my $min;
my $max;
my $filename = "numbers.txt";
open(my $fh, "<", $filename) or die "Could not open '$filename'\n";
while (my $line = <$fh>) {
chomp $line;
$total += $line;
if (not $count) {
$min = $line;
$max = $line;
}
$count++;
if ($line < $min) {
$min = $line;
}
if ($line > $max) {
$max = $line;
}
}
if (not defined $min) {
print "No values were given\n";
} else {
print "Min: $min Max: $max Total: $total count: $count Average: ",
$total / $count, "\n";
}
Solution: Write report to file
#!/usr/bin/perl
use strict;
use warnings;
my $total = 0;
my $count = 0;
my $min;
my $max;
my $filename = "examples/files/numbers.txt";
open(my $fh, "<", $filename) or die "Could not open '$filename'\n";
while (my $line = <$fh>) {
chomp $line;
$total += $line;
if (not $count) {
$min = $line;
$max = $line;
}
$count++;
if ($line < $min) {
$min = $line;
}
if ($line > $max) {
$max = $line;
}
}
open my $out, '>', 'numbers.out';
if (not defined $min) {
print $out "No values were given\n";
} else {
printf($out "Minimum: %5s\n", $min);
printf($out "Maximum: %5s\n", $max);
printf($out "Total: %5s\n", $total);
printf($out "Count: %5s\n", $count);
printf($out "Average: %5s\n", $total / $count);
}
Solution: Analyze Apache - number of successful hits
#!/usr/bin/perl
use strict;
use warnings;
my $file = "examples/files/apache_access.log";
open my $fh, '<', $file or die "Could not open '$file': $!";
my $good = 0;
my $bad = 0;
my $invalid = 0;
while (my $line = <$fh>) {
chomp $line;
my $request = q( HTTP/1.1" );
my $start = index ($line, $request);
if ($start < 0) {
$request = q( HTTP/1.0" );
$start = index ($line, $request);
}
if ($start < 0) {
#print "ERROR: Unrecognized Line: $line\n";
$invalid++;
next;
}
my $end = index($line, " ", $start + length($request));
my $result = substr($line,
$start + length($request),
$end - $start - length($request));
#print "$start, $end '$result'\n";
if ($result eq "200") {
$good++;
} else {
$bad++;
}
}
print "Good: $good\n";
print "Bad: $bad\n";
print "Invalid: $invalid\n";
# Disclaimer: this is not an optimal solution.
# We will see a much better one after learning functions, regular expressions
Lists and Arrays
Lists and Arrays intro
While a scalar is single value a list is a bunch of single values. A scalar value can be stored in a scalar variable. A list can be stored in a list variable. It is called and array.
List Literals, list ranges
- ()
- ..
- qw
A list is an ordered series of scalar values separated by comma and enclosed in parentheses.
The scalar values themselves can be references to other data structures.
(An array, explained later is a variable holding the content of a list.)
Examples of lists:
(1, 5.2, "apple pie") # 3 values
('string', 42, 2.3, undef, ['one', 'two'], { 'name' => 'Foo Bar' }) # 6 values
($x, $y, $z) # We can also use scalar variables as elements of a list
($x, 3, "foobar") # or we can mix them
Special case, all of them "words":
("apple", "banana", "peach", "blueberry") # is the same as
qw(apple banana peach blueberry) # quote word
Special case, range operator:
(1 .. 10) # same as
(1,2,3,4,5,6,7,8,9,10)
List Assignment
my ($x, $y, $z);
my ($x, $y, $z) = (1, "apple pie", 3.14);
($x, $y, $z) = f(); # where f() returns (2, 3, 7);
# nearly the same as $x=2; $y=3; $z=7;
($x, $y, $z) = f(); # where f() returns (8, 1, 5, 9);
# ignore 9
($x, $y, $z) = f(); # where f() returns (3, 4);
# $z will be undef
A regular question on job interviews:
How can we swap the values of 2 variables, let say $x and $y?
loop over elements of list with foreach
-
foreach
-
for
-
list
-
foreach ITEM (LIST) {BLOCK}
-
my - in the foreach loop
#!/usr/bin/perl
use strict;
use warnings;
foreach my $color ("Blue", "Yellow", "Brown", "White") {
print "$color\n";
}
Blue
Yellow
Brown
White
Create an Array, loop over with foreach
- @
#!/usr/bin/perl
use strict;
use warnings;
my @colors = ("Blue", "Yellow", "Brown", "White");
print "@colors\n";
foreach my $color (@colors) {
print "$color\n";
}
Blue Yellow Brown White
Blue
Yellow
Brown
White
Perl for loop explained with examples
Array Assignment
You can also mix the variables on the right side and if there are arrays on the right side the whole thing becomes one flat array !
use strict;
use warnings;
my @x = ('foo', 'bar');
my @y = ('moose', 'barney');
my @z = (@x, @y); # ('foo', 'bar', 'moose', 'barney');
print "@z\n"; # foo bar moose barney
my $owner = 'Moose';
my @tenants = qw(Foo Bar);
my @people = ($owner, 'Baz', @tenants); # ('Moose', 'Baz', 'Foo', 'Bar')
print "@people\n"; # Moose Baz Foo Bar
my ($x, @y, @z);
($x, @y) = f(); # where f() returns (1, 2, 3, 4);
# $x is 1; @y is (2, 3, 4)
($x, @y, @z) = f(); # where f() returns (1, 2, 3, 4);
# $x is 1; @y is (2, 3, 4) @z is empty: ()
@z = (); # Emptying an array
Debugging an array
- Data::Dumper
- Dumper
- @
use strict;
use warnings;
use Data::Dumper qw(Dumper);
my @names_1 = ('Moose', 'Barney', 'Foo', 'Bar');
print "@names_1\n"; # Moose Barney Foo Bar
my @names_2 = ('Moose', 'Barney', 'Foo Bar');
print "@names_2\n"; # Moose Barney Foo Bar
print Dumper \@names_1;
print Dumper \@names_2;
Moose Barney Foo Bar
Moose Barney Foo Bar
$VAR1 = [
'Moose',
'Barney',
'Foo',
'Bar'
];
$VAR1 = [
'Moose',
'Barney',
'Foo Bar'
];
foreach loop on numbers
foreach my $i (1..10) {
print "$i\n";
}
1
2
3
4
5
6
7
8
9
10
Array index (menu)
-
$#array
-
$array[0]
-
$#array - the largest index
-
$array[1] - array elements are scalar
#!/usr/bin/perl
use strict;
use warnings;
my $color;
my @colors = ("Blue", "Yellow", "Brown", "White");
print "Please select a number:\n";
foreach my $i (0..$#colors) {
print "$i) $colors[$i]\n";
}
my $num = <STDIN>;
chomp($num);
if (defined $colors[$num]) {
$color = $colors[$num];
} else {
print "Bad selection\n";
exit;
}
print "The selected color is $color\n";
- $array[-1]}
Load Module
-
Scalar::Util
-
looks_like_number
-
is_number
-
looks_like_number
#!/usr/bin/perl
use strict;
use warnings;
use Scalar::Util qw(looks_like_number);
my $color;
my @colors = ("Blue", "Yellow", "Brown", "White");
print "Please select a number:\n";
foreach my $i (0..$#colors) {
print "$i) $colors[$i]\n";
}
my $num = <STDIN>;
chomp($num);
if (looks_like_number($num) and defined $colors[$num]) {
$color = $colors[$num];
} else {
print "Bad selection\n";
exit;
}
print "The selected color is $color\n";
Command line parameters
-
@ARGV
-
$0
-
@ARGV - all the arguments on the command line
-
$ARGV[0] - the first argument
-
$0 - name of the program
-
perl read_argv.pl blue
#!/usr/bin/perl
use strict;
use warnings;
my $color = $ARGV[0];
if (not defined $color) {
my @colors = ("Blue", "Yellow", "Brown", "White");
print "Please select a number:\n";
foreach my $i (0..$#colors) {
print "$i) $colors[$i]\n";
}
my $num = <STDIN>;
chomp($num);
if (defined $colors[$num]) {
$color = $colors[$num];
} else {
print "Bad selection\n";
exit;
}
}
print "The selected color is $color\n";
Process command line parameters, use modules
-
use
-
|reference
-
Getopt::Long
-
scalar reference
-
perl process_command_line.pl --color blue
#!/usr/bin/perl
use strict;
use warnings;
use Getopt::Long qw(GetOptions);
my $color;
GetOptions("color=s" => \$color) or die "Usage: $0 [--color COLOR]\n";
if (not defined $color) {
my @colors = ("Blue", "Yellow", "Brown", "White");
print "Please select a number:\n";
foreach my $i (0..$#colors) {
print "$i) $colors[$i]\n";
}
my $num = <STDIN>;
chomp($num);
if (defined $colors[$num]) {
$color = $colors[$num];
} else {
print "Bad selection\n";
exit;
}
}
print "The selected color is $color\n";
Module documentation
process CSV file
-
split
-
CSV
-
split
{% embed include file="src/examples/perlarrays/process_csv_file.csv)
#!/usr/bin/perl
use strict;
use warnings;
my $file = 'process_csv_file.csv';
if (defined $ARGV[0]) {
$file = $ARGV[0];
}
my $sum = 0;
open(my $data, '<', $file) or die "Could not open '$file' $!\n";
while (my $line = <$data>) {
chomp $line;
my @fields = split /;/, $line;
$sum += $fields[2];
}
print "$sum\n";
process csv file (short version)
while (<>) {
$sum += (split /;/)[2];
}
print "$sum\n";
Use the following command to run the script:
`perl examples/perlarrays/process_csv_file_short.pl examples/perlarrays/process_csv_file.csv`
One-liner sum numbers in CSV file
perl -n -a -F; -e '$sum += $F[2]; END {print $sum}' examples/perlarrays/process_csv_file.csv
-n = loop over lines but do NOT print them
-a = autosplit by ' ' and assign to @F
-F; = replace the split string by ';'
The END
block gets executed at the end of the execution and only once.
process csv file using Text::CSV
- Text::CSV
- Text::CSV_XS
What if there is a field with embedded comma (,)?
{% embed include file="src/examples/perlarrays/process_csv_file_module.csv)
#!/usr/bin/perl
use strict;
use warnings;
use Text::CSV;
my $csv = Text::CSV->new({
sep_char => ';'
});
my $file = 'process_csv_file_module.csv';
if (defined $ARGV[0]) {
$file = $ARGV[0];
}
my $sum = 0;
open(my $data, '<', $file) or die "Could not open '$file'\n";
while (my $line = <$data>) {
chomp $line;
if ($csv->parse($line)) {
my @fields = $csv->fields();
$sum += $fields[2];
} else {
warn "Line could not be parsed: $line\n";
}
}
print "$sum\n";
process csv file using Text::CSV
- Text::CSV
- Text::CSV_XS
What if there is a field with embedded newline?
{% embed include file="src/examples/perlarrays/process_csv_file_module_newline.csv)
#!/usr/bin/perl
use strict;
use warnings;
use Text::CSV;
my $file = 'process_csv_file_module.csv';
if (defined $ARGV[0]) {
$file = $ARGV[0];
}
my $csv = Text::CSV->new ({
binary => 1,
auto_diag => 1,
sep_char => ';'
});
my $sum = 0;
open(my $data, '<:encoding(utf8)', $file) or die "Could not open '$file'\n";
while (my $fields = $csv->getline( $data )) {
$sum += $fields->[2];
}
print "$sum\n";
Join
- join
my @fields = qw(Foo Bar foo@bar.com);
my $line = join ";", @fields;
print "$line\n"; # Foo;Bar;foo@bar.com
Labels
Normally the last and next keywords are related to the innermost loop. In some cases that's not good. Perl allows us to define labels in front of loops and then to use those labels in conjunction with last or next to go to the last or next iteration of the specified loop.
use strict;
use warnings;
OUTER:
while (1) {
print "start outer\n";
while (1) {
print "start inner\n";
last OUTER;
print "end inner\n";
}
print "end outer\n";
}
print "done\n";
Exercise: Make the color selector user friendly
Take the examples/perlarrays/process_command_line.pl script ( the color selector ) and make it more user friendly by showing the numbers starting from 1 (and not from 0).
Exercise: improve the color selector
Take the examples/perlarrays/process_command_line.pl script and improve it in several ways:
- Check if the value given on the command line is indeed one of the possible values and don't let other colors pass.
- Allow a --force flag that will disregard the previously implemented restriction. Meaning
- --color Purple should still report error but
- --color Purple --force should accept this color as well.
Exercise: Improve the Number Guessing game
Let the user guess several times (with responses each time) till he finds the hidden number. Base your code on the solution from the previous exercise. ( examples/scalars/number_guessing.pl )
Allow the user to type
n - skip this game and start a new one (generate new number to guess)
s - show the hidden value (cheat)
d - debug mode
(It is a toggle.
Pressing once the system starts to show the current
number to guess every time before asking the user for new input
pressing again, turns off the behavior.
)
m - move mode
(It is a toggle.
Pressing once the object will start to move a little bit after
every step. Pressing again will turn this feature off.)
x - exit
Now I can tell you that what you have is actually a 1 dimensional space fight and you are trying to guess the distance of the enemy space ship. As it is not a sitting duck, after every shot the spaceship can randomly move +2-2.
Extra exercise:
- For training purposes you might want to limit the outer space to 0-100.
- Make sure the enemy does not wander off the training field.
- Give warning if the user shoots out of space.
- Keep track of the minimum and maximum number of hits (in a file).
Solution: improved color selector
#!/usr/bin/perl
use strict;
use warnings;
use Getopt::Long qw(GetOptions);
my $color;
my $force;
GetOptions(
"color=s" => \$color,
"force" => \$force,
) or exit;
my @colors = ("Blue", "Yellow", "Brown", "White");
if ($color and not $force) {
my $valid_color;
foreach my $c (@colors) {
if ($c eq $color) {
$valid_color = 1;
next;
}
}
if (not $valid_color) {
print "The color '$color' is not valid.\n";
$color = '';
}
}
if (not $color) {
print "Please select a number:\n";
foreach my $i (0..$#colors) {
print "$i) $colors[$i]\n";
}
my $num = <STDIN>;
chomp($num);
if (defined $colors[$num]) {
$color = $colors[$num];
} else {
print "Bad selection\n";
exit;
}
}
print "The selected color is $color\n";
Solution: Improve the Number Guessing game
#!/usr/bin/perl
use strict;
use warnings;
my $N = 200;
my $debug;
my $moving;
my $scores = 'scores.txt';
my ($min, $max);
if (open my $in, '<', $scores) {
($min, $max) = <$in>;
chomp ($min, $max);
}
print <<'END_TXT';
x - exit
q - quit
n - next game
s - show target
d - toggle debug mode
m - toggle - allow object to move or not
END_TXT
GAME:
while (1) {
my $hidden = 1 + int rand $N;
my $count = 0;
while (1) {
print "Please guess between 1 and $N :";
if ($debug) {
print " ($hidden) ";
}
print "\n";
my $guess = <STDIN>;
$count++;
chomp $guess;
if ($guess eq 'x' or $guess eq 'q') {
last GAME;
}
if ($guess eq 'n') {
last;
}
if ($guess eq 'd') {
$debug = $debug ? 0 : 1;
next;
}
if ($guess eq 'm') {
$moving = $moving ? 0 : 1;
next;
}
if ($guess eq "s") {
print "The hidden value is $hidden\n";
next;
# is it fair to let the player keep playing after seeing the result?
# if not replace the next by last
}
if ($guess < 1 or $guess > $N) {
warn "You shot ($guess) in the outer space.\n";
}
if ($guess < $hidden) {
print "$guess is too small\n";
}
if ($guess > $hidden) {
print "$guess is too big\n";
}
if ($guess == $hidden) {
print "Heureka!\n";
last;
}
if ($moving) {
$hidden += int(rand 5)-2;
# move target by -2 .. +2
# don't wander off the space
if ($hidden > $N) {
$hidden = $N;
}
if ($hidden < 1) {
$hidden = 1;
}
}
}
if (not defined $min) {
$min = $count;
}
if (not defined $max) {
$max = $count;
}
$min = $count < $min ? $count : $min;
$max = $count > $max ? $count : $max;
}
if (defined $min) {
open my $out, '>', $scores or die;
print $out "$min\n";
print $out "$max\n";
}
Advanced Arrays
The year 19100
First, let's talk about time.
- time}
- localtime}
- gmtime}
$t = time(); # 1021924103
# returns a 10 digit long number,
# the number of seconds since 00:00:00 UTC, January 1, 1970
$x = localtime($t); # returns a string like Thu Feb 30 14:15:53 1998
$z = localtime(); # returns the string for the current time
$z = localtime(time - 60*60*24*365);
# returns the string for a year ago, same time, well almost
@y = localtime($t); # an array of time values:
# 53 15 14 30 1 98 4 61 0
# the 9 values are the following:
# 0 1 2 3 4 5 6 7 8 (the index)
($sec,$min,$hour,$mday,$mon,$year,$wday,$yday,$isdst) = localtime(time);
# The localtime function is aware of what is on the left side of the = sign !!!!
# OK but where does that 19100 come from ?
$mon 0..11
$min 0..59
$sec 0..60
$year YEAR-1900 # for example 2000-1900 = 100
# but people used "19$year" instead of 1900 + $year
# which is 19100 instead of 2000
gmtime is the same just gives the time as it is in Greenwich. The year of 19100
SCALAR and LIST Context
- SCALAR context
- LIST context
my @a = ("zero", "one", "two", "three");
my @b = @a; # LIST context
my $c = @a; # SCALAR context
if (@a) {
}
while (@a) {
}
Scalar and List context in Perl
Context Sensitivity
- scalar()}
Every operator creates a 'context' let's see a few examples
Assignment to a scalar variable creates SCALAR context:
$x = localtime();
$x = @z;
$x = SCALAR
Assignment to an array creates LIST context:
@y = localtime();
@y = @z;
@y = LIST
# Expressions providing SCALAR context
$x = SCALAR;
$y[3] = SCALAR;
8 + SCALAR
"Foo: " . SCALAR
if (SCALAR) { ... }
while (SCALAR) { ... }
scalar(SCALAR)
# Expressions providing LIST context:
@a = LIST;
($x, $y) = LIST;
($x) = LIST;
foreach $x (LIST) {...}
join ";", LIST
print LIST
# example
@a = qw(One Two Three);
print @a; # OneTwoThree print LIST
print 0 + @a; # 3 SCALAR + SCALAR
print scalar(@a); # 3 scalar(SCALAR)
see also perldoc -f function-name
Filehandle in scalar and list context
#!/usr/bin/perl
use strict;
use warnings;
my $file = "numbers.txt";
open(my $fh, '<', $file) or die "Could not open '$file'";
# reading in SCALAR context (line by line) and processing each line
while (my $row = <$fh>) {
chomp $row;
print "READ: $row\n";
}
open (my $other_fh, '<', $file) or die "Could not open '$file'";
# reading in LIST context all the lines at once
my @rows = <$other_fh>;
chomp @rows;
print "READ ", scalar(@rows), " lines\n";
slurp mode
- $/
- slurp
#!/usr/bin/perl
use strict;
use warnings;
my $file = "numbers.txt";
# slurp mode
my $all;
{
open(my $fh, '<', $file) or die "Could not open '$file'\n";
local $/ = undef;
$all = <$fh>;
}
File::Slurp
- File::Slurp
- slurp
#!/usr/bin/perl
use strict;
use warnings;
use File::Slurp qw(slurp);
my $filename = $ARGV[0];
if (not defined $filename) {
die "Usage: $0 FILENAME\n";
}
my $text = slurp($filename);
my @lines = slurp($filename);
Diamond operator
#!/usr/bin/perl
use strict;
use warnings;
while (my $line = <>) {
print $line;
}
pop, push
- pop
- push
#!/usr/bin/perl
use strict;
use warnings;
my @names = ("Foo", "Bar", "Baz");
my $last_name = pop @names;
print "$last_name\n"; # Baz
print "@names\n"; # Foo Bar
push @names, "Moo";
print "@names\n"; # Foo Bar Moo
1
27
-13
2
push example
use strict;
use warnings;
my $file = 'data.txt';
my @positive;
open my $fh, '<', $file or die "Could not open $file : $!";
while (my $line = <$fh>) {
if ($line > 0) {
push @positive, $line;
}
}
print @positive;
stack (pop, push) Reverse Polish Calculator
- stack
#!/usr/bin/perl
use strict;
use warnings;
my @stack;
while (1) {
print '$ ';
my $in = <STDIN>;
chomp $in;
if ($in eq "q") { last; }
if ($in eq "c") {
pop @stack;
next;
} # fetch the last value
if ($in eq "*") {
my $x = pop(@stack);
my $y = pop(@stack);
push(@stack, $x*$y);
next;
}
if ($in eq "+") {
my $x = pop(@stack);
my $y = pop(@stack);
push(@stack, $x + $y);
next;
}
if ($in eq "/") {
my $x = pop(@stack);
my $y = pop(@stack);
push(@stack, $y / $x);
next;
}
if ($in eq "-") {
my $x = pop(@stack);
my $y = pop(@stack);
push(@stack, $y - $x);
next;
}
if ($in eq "=") {
print pop(@stack), "\n";
next;
}
push @stack, $in;
}
shift, unshift
- shift
- unshift
#!/usr/bin/perl
use strict;
use warnings;
my @names = ("Foo", "Bar", "Baz");
my $first = shift @names;
print "$first\n"; # Foo
print "@names\n"; # Bar Baz
unshift @names, "Moo";
print "@names\n"; # Moo Bar Baz
FIRST = shift ARRAY;
unshift ARRAY, VALUEs;
queue (shift, push)
- queue
#!/usr/bin/perl
use strict;
use warnings;
my @people = ("Foo", "Bar");
while (@people) {
my $next_person = shift @people;
print "$next_person\n"; # do something with this person
print "Type in more people:";
while (my $new = <STDIN>) {
chomp $new;
if ($new eq "") {
last;
}
push @people, $new;
}
print "\n";
}
shift
#!/usr/bin/perl
use strict;
use warnings;
my $filename = shift or die "Usage: $0 FILENAME\n";
shift defaults to shift @ARGV
Another usage of the short circuit
Slight bug.
Does it matter?
reverse
- reverse
use strict;
use warnings;
# LIST context
my @names = qw(Foo Bar Baz);
my @reverses_names = reverse @names;
print "@reverses_names\n"; # Baz Bar Foo
# SCALAR context
my $string = "abcd";
my $reversed_string = reverse $string;
print "$reversed_string\n"; # dcba
Sort
- sort
- $a
- $b
- cmp
- <=>
- spaceship operator
#!/usr/bin/perl
use strict;
use warnings;
my @words = qw(Foo Moo Bar);
my @sorted_words = sort @words;
print "@sorted_words\n"; # Bar Foo Moo
my @data = (11, 2, 23, 12);
my @sorted = sort @data;
print "@sorted\n"; # 11 12 2 23
my @sorted_ascii = sort {$a cmp $b} @data;
my @sorted_numeric = sort {$a <=> $b} @data;
print "@sorted_numeric\n"; # 2 11 12 23
my @sorted_by_length
= sort {length($a) <=> length($b)} @data;
my @sorted_by_length_and_ascii
= sort {
length($a) <=> length($b)
or
$a cmp $b
} @data;
my @sorted_by_abc = sort {lc($a) cmp lc($b)} @data;
my @sorted_abc_ascii
= sort {
lc($a) cmp lc($b)
or
$a cmp $b
} @data;
Ternary operator
- ternary operator
- ? :
my $var;
if (COND) {
$var = A;
} else {
$var = B;
}
my $var = COND ? A : B;
Count digits
23 34 9512341
3 34 2452345 5353 67 22
42136357013412
42 5 65 64
#!/usr/bin/perl
use strict;
use warnings;
my $filename = shift or die "Usage: $0 filename\n";
my @count;
open(my $fh, "<", $filename)
or die "Could not open '$filename': $!";
while (my $line = <$fh>) {
chomp $line;
my @chars = split //, $line;
foreach my $c (@chars) {
if ($c ne " ") {
$count[$c]++;
}
}
}
foreach my $i (0..9) {
print "$i ", ($count[$i] ? $count[$i] : 0), "\n";
}
Exercise: Color selector
Take the solution from the previous chapter (you can use the file examples/perlarrays/color_selector.pl ) and add the following features:
- Read the names of the colors from a file called colors.txt
- Let the user pass the name of the color file using the --filename FILENAME option.
Exercise: sort numbers
Take the file examples/perlarrays/count_digits.txt
from the previous example and sort the numbers (not the digits).
Exercise: sort mixed string
In a file we have the string where each string has a single letter at the beginning and then a number. Sort them based on the number only, disregarding the letter.
Input:
A4
B1
A27
A3
Expected output:
B1
A3
A4
A27
File:
A1
A27
C1
B1
B12
A38
B3
A3
Expected output from sample file
A1
C1
B1
B3
A3
B12
A27
A38
Exercise: sort mixed string 2
Take the above example and change it so now we'll take in account the letter too. Sort the strings based on the first letter, and among values with the same leading letter, sort them according to the numbers.
Input
A4
B1
A3
A27
Expected output
A3
A4
A27
B1
Expected output from sample file
A1
A3
A27
A38
B1
B3
B12
C1
Solution: color selector files
#!/usr/bin/perl
use strict;
use warnings;
use Getopt::Long qw(GetOptions);
my $color;
my $filename = "examples/colors.txt";
my $force;
GetOptions(
"color=s" => \$color,
"filename=s" => \$filename,
"force" => \$force,
) or exit;
open(my $fh, "<", $filename)
or die "Could not open '$filename' for reading: $!";
my @colors = <$fh>;
chomp @colors;
if ($color and not $force) {
my $valid_color;
foreach my $c (@colors) {
if ($c eq $color) {
$valid_color = 1;
next;
}
}
if (not $valid_color) {
print "The color '$color' is not valid.\n";
$color = '';
}
}
if (not $color) {
print "Please select a number:\n";
foreach my $i (0..$#colors) {
print "$i) $colors[$i]\n";
}
my $num = <STDIN>;
chomp($num);
if (defined $colors[$num]) {
$color = $colors[$num];
} else {
print "Bad selection\n";
exit;
}
}
print "The selected color is $color\n";
Solution: sort numbers
#!/usr/bin/perl
use strict;
use warnings;
my $filename = shift or die "Usage: $0 filename\n";
open(my $fh, "<", $filename)
or die "Could not open '$filename': $!";
my @numbers;
while (my $line = <$fh>) {
chomp $line;
my @num = split / /, $line;
push @numbers, @num;
}
my @sorted = sort {$a <=> $b} @numbers;
foreach my $n (@sorted) {
print "$n\n";
}
Solution: sort mixed strings
#!/usr/bin/perl
use strict;
use warnings;
my $file = 'sort_mixed_strings.txt';
if (@ARGV) {
$file = shift;
}
open(my $fh, '<', $file) or die "Could not open '$file'\n";
my @data = <$fh>;
chomp @data;
my @sorted = sort { substr($a, 1) <=> substr($b, 1) } @data;
foreach my $v (@sorted) {
print "$v\n";
}
Solution: sort mixed strings
#!/usr/bin/perl
use strict;
use warnings;
my $file = 'sort_mixed_strings.txt';
if (@ARGV) {
$file = shift;
}
open(my $fh, '<', $file) or die "Could not open '$file'\n";
my @data = <$fh>;
chomp @data;
my @sorted = sort {
substr($a, 0, 1) cmp substr($b, 0, 1)
or
substr($a, 1) <=> substr($b, 1) } @data;
foreach my $v (@sorted) {
print "$v\n";
}
List::Util
List::Util provides functions such as
- max
- min
- sum
It resides in a distribution called Scalar-List-Utils
Advanced: Multi dimensional array
- @
#!/usr/bin/perl
use strict;
use warnings;
my @matrix;
$matrix[0][0] = 0;
$matrix[1][1] = 11;
$matrix[1][2] = 12;
#print "$matrix\n";
print "$matrix[0]\n"; # ARRAY(0x814dd90)
print "$matrix[1][1]\n"; # 11
use Data::Dumper qw(Dumper);
print Dumper \@matrix;
ARRAY(0x5a7b4c)
11
$VAR1 = [
[
0
],
[
undef,
11,
12
]
];
Actually what we have is a simple array and each element of that
array can be another (anonymous) array (reference).
Functions and Subroutines
Why to use subroutines?
We use subroutines in order to
- make the script more modular
- reuse code for repeated tasks
- easier to debug
- easier to maintain
Subroutines
- sub
- subroutines
- functions
- return
#!/usr/bin/perl
use strict;
use warnings;
my $sum = add(2, 3);
print "$sum\n";
print add(5, 8), "\n";
my $result = add2(4, 7);
print "$result\n";
sub add {
my ($x, $y) = @_;
my $z = $x+$y;
return $z;
}
sub add2 {
my $x = shift;
my $y = shift;
return $x+$y;
}
sub add_ugly {
return $_[0]+$_[1];
}
- no prototypes needed
- no signature
- parentheses are optional
- functions always return a value
Variable number of parameters
#!/usr/bin/perl
use strict;
use warnings;
print sum(3, 7, 11, 21), "\n";
print sum(1, 2, 3), "\n";
sub sum {
my $sum = 0;
foreach my $v (@_) {
$sum += $v;
}
return $sum;
}
Subroutines and functions in Perl
Recursive subroutines: factorial
- recursive
Mathematical definition of factorial:
n! = n * (n-1) * (n-2) * ... 1
1! = 1
n! = n * (n-1)!
#!/usr/bin/perl
use strict;
use warnings;
my $n = shift or die "Usage: $0 NUMBER\n";
my $result = factorial($n);
print $result;
sub factorial {
my ($n) = @_;
if ($n == 1) {
return 1;
}
my $prev = factorial($n - 1);
return $n * $prev;
}
Recursive subroutines: Fibonacci
Recursive definition of Fibonacci
f(1) = 1
f(2) = 1
f(n) = f(n-1) + f(n-2)
#!/usr/bin/perl
use strict;
use warnings;
sub fib {
my $n = shift;
if ($n == 1 or $n == 2) {
return 1
}
return (fib($n-1)+fib($n-2)); # recursive calling
}
print fib(10); # calling the function
Sort using a function
#!/usr/bin/perl
use strict;
use warnings;
my @data = qw(foo Foo Bar Moo FooBar Name Moose Apple Ape More Peach);
my @sorted = sort by_length @data;
print "@data\n";
print "@sorted\n";
# foo Foo Bar Moo FooBar Name Moose Apple Ape More Peach
# Ape Bar Foo foo Moo More Name Apple Moose Peach FooBar
sub by_length {
length($a) <=> length($b)
or
lc($a) cmp lc($b)
or
$a cmp $b
}
Return a list
- return|list
Perl allows us to return any number of values.
#!/usr/bin/perl
use strict;
use warnings;
my @numbers = fibonacci(10);
print "@numbers\n";
sub fibonacci {
my $num = shift;
if ($num == 1) {
return (1);
}
if ($num == 2) {
return (1, 1);
}
my @fib = (1, 1);
foreach (3..$num) {
push @fib, $fib[-1]+$fib[-2];
}
return @fib;
}
Return several elements
#!/usr/bin/perl
use strict;
use warnings;
my ($sum, $multi) = calc(3, 4);
print "Sum: $sum\n"; # 7
print "Multi: $multi\n"; # 12
sub calc {
my ($x, $y) = @_;
return $x + $y, $x * $y;
}
Error handling with eval - files
-
eval {};
-
die|catching
-
try|eval
-
die - raise exception or throw exception
-
eval - catch exception - try in other languages)
#!/usr/bin/perl
use strict;
use warnings;
my @files = ('abc.txt', $0);
foreach my $file (@files) {
eval {
process($file);
print " processing of $file done\n";
};
if ($@) {
my $err = $@;
chomp $err;
warn "Error seen: $err";
}
}
sub process {
my ($file) = @_;
open my $fh, '<', $file or die "Could not open '$file' $!";
local $/ = undef;
my $all = <$fh>;
close $fh;
print "Size of $file: ", length($all), "\n";
}
Error handling with eval - two calls
#!/usr/bin/perl
use strict;
use warnings;
my $result;
my $x = 19;
my $y = 23;
eval {
$result = unstable_add_function($x, $y);
print "unstable done\n";
};
if ($@) {
chomp $@;
warn "Exception '$@' received\n";
$result = slow_but_stable_add($x, $y);
print "slow done\n";
}
print "Result: $result\n";
sub unstable_add_function {
if (rand() < 0.2) {
die "broken";
}
return $_[0]+$_[1];
}
sub slow_but_stable_add {
sleep (2);
return $_[0]+$_[1];
}
Exercise: Subroutines
Take the solution from the subroutines chapter (examples/files-perl/statistics.pl" %} and move the code in a subroutine called main()
Take the solution from the subroutines chapter ( examples/files-perl/write_report_to_file.pl ) and create 3 subroutines. read_file() and write_file() for the two main parts of the script and main() that will call these two subroutines.
Take ( examples/arrays/color_selector.pl ) and create two functions. One that handles the case when the user provided a value on the command line, and another function that shows the menu and accepts the value from there. Maybe also add a main() function to wrap the rest of the code as well.
Exercise: Number guessing in sub
Take the solution of the improved number guessing game
from examples/arrays/number_guessing.pl
and change it so some parts of it will be functions.
Specifically you can create subs for
- the moving of the spaceship
- checking the hit How would you indicate the need to 'last' from within the function?
Solution: Number guessing in sub
#!/usr/bin/perl
use strict;
use warnings;
my $N = 200;
my $debug;
my $moving;
my $scores = 'scores.txt';
my ($min, $max);
if (open my $in, '<', $scores) {
($min, $max) = <$in>;
chomp ($min, $max);
}
print <<'END_TXT';
x - exit
q - quit
n - next game
s - show target
d - toggle debug mode
m - toggle - allow object to move or not
END_TXT
GAME:
while (1) {
my $hidden = 1 + int rand $N;
my $count = 0;
while (1) {
print "Please guess between 1 and $N :";
if ($debug) {
print " ($hidden) ";
}
print "\n";
my $guess = <STDIN>;
$count++;
chomp $guess;
if ($guess eq 'x' or $guess eq 'q') {
last GAME;
}
if ($guess eq 'n') {
last;
}
if ($guess eq 'd') {
$debug = $debug ? 0 : 1;
next;
}
if ($guess eq 'm') {
$moving = $moving ? 0 : 1;
next;
}
if ($guess eq "s") {
print "The hidden value is $hidden\n";
next;
# is it fair to let the player keep playing after seeing the result?
# if not replace the next by last
}
if (check_hit($hidden, $guess)) {
last;
}
if ($moving) {
$hidden = move($hidden);
}
}
if (not defined $min) {
$min = $count;
}
if (not defined $max) {
$max = $count;
}
$min = $count < $min ? $count : $min;
$max = $count > $max ? $count : $max;
}
if (defined $min) {
open my $out, '>', $scores or die;
print $out "$min\n";
print $out "$max\n";
}
sub move {
my ($hidden) = @_;
$hidden += int(rand 5)-2; # move a bit
# don't wander off the space
if ($hidden > $N) {
$hidden = $N;
}
if ($hidden < 1) {
$hidden = 1;
}
return $hidden;
}
sub check_hit {
my ($hidden, $guess) = @_;
if ($guess < 1 or $guess > $N) {
warn "You shot ($guess) in the outer space.\n";
}
if ($guess < $hidden) {
print "$guess is too small\n";
}
if ($guess > $hidden) {
print "$guess is too big\n";
}
if ($guess == $hidden) {
print "Heureka!\n";
return 1;
}
return;
}
Hashes
What is a hash?
-
%
-
hash
-
associative array
-
Unordered group of key/value pairs where
-
key is a unique string
-
value is any scalar
-
Associative Arrays in PHP
-
Dictionary in Python
Uses of hashes
Mapping of single feature of many similar items:
- phone book (name => phone number)
- worker list (ID number => name)
- CGI: (fieldname => field value)
Features of an object:
- Information about a person (fname, lname, email, phone, ...)
Creating hashes
my %user;
%user = ("fname", "Foo", "lname", "Bar");
my %user = (
"fname", "Foo",
"lname", "Bar",
);
my %user = (
fname => "Foo",
lname => "Bar",
);
print $user{"fname"}, "\n";
print $user{fname}, "\n";
my $key = "fname";
print $user{$key}, "\n";
$user{fname} = 'Moo';
$user{email} = 'foo@bar.com';
Create hash from an array
my %user = qw(fname Foo lname Bar);
my @person = qw(fname Foo lname Bar);
my %user = @person;
my @foobar = %user;
print "@foobar\n"; # fname Foo lname Bar
# or
print "@foobar\n"; # lname Bar fname Foo
$user{phone} = '123-456'; # change the value of one element
# or add key/value pair
%user = (phone => '123-456'); # change the hash
# remove all previous elements from the hash
# add a single key/value pair
Hash in scalar context
A hash in LIST context returns its keys and values.
my @foobar = %user;
In SCALAR context:
if (%user) {
# the hash is not empty
}
Fetching data from hash
- keys
- sort keys
my @fields = keys %user;
foreach my $field (@fields) {
print "$field $user{$field}\n";
}
foreach my $field (keys %user) {
print "$field $user{$field}\n";
}
my @fields = keys %user;
my @sorted_fields = sort @fields;
foreach my $field (@sorted_fields) {
print "$field $user{$field}\n";
}
foreach my $field (sort keys %user) {
print "$field $user{$field}\n";
}
exists, delete hash element
- exists
- delete
- Data::Dumper
#!/usr/bin/perl
use strict;
use warnings;
my %phones;
$phones{Foo} = '111';
$phones{Bar} = '222';
$phones{Moo} = undef;
print defined $phones{Foo} ? "Foo: $phones{Foo}\n" : "Foo not defined\n";
print defined $phones{Moo} ? "Moo: $phones{Moo}\n" : "Moo not defined\n";
print defined $phones{Baz} ? "Baz: $phones{Baz}\n" : "Baz not defined\n";
print exists $phones{Moo} ? "Moo exists\n" : "Moo does not exist\n";
print exists $phones{Baz} ? "Baz exists\n" : "Baz does not exist\n";
delete $phones{Foo};
print exists $phones{Foo} ? "Foo exists\n" : "Foo does not exist\n";
Foo: 111
Moo not defined
Baz not defined
Moo exists
Baz does not exist
Foo does not exist
Multi dimensional hashes
- Data::Dumper
#!/usr/bin/perl
use strict;
use warnings;
my %grades;
$grades{"Foo Bar"}{Mathematics} = 97;
$grades{"Foo Bar"}{Literature} = 67;
$grades{"Peti Bar"}{Literature} = 88;
$grades{"Peti Bar"}{Mathematics} = 82;
$grades{"Peti Bar"}{Art} = 99;
#foreach my $name (sort keys %grades) {
# foreach my $subject (keys %{$grades{$name}}) {
# print "$name, $subject: $grades{$name}{$subject}\n";
# }
#}
use Data::Dumper qw(Dumper);
print Dumper \%grades;
$VAR1 = {
'Peti Bar' => {
'Art' => 99,
'Literature' => 88,
'Mathematics' => 82
},
'Foo Bar' => {
'Literature' => 67,
'Mathematics' => 97
}
};
Count words
#!/usr/bin/perl
use strict;
use warnings;
my $filename = shift or die "Usage: $0 filename\n";
my %count;
open(my $fh, "<", $filename)
or die "Could not open '$filename': $!";
while (my $line = <$fh>) {
chomp $line;
my @words = split / /, $line;
foreach my $word (@words) {
$count{$word}++;
}
}
foreach my $word (keys %count) {
print "$word : $count{$word}\n";
}
Exercise: Parse HTTP values
You get one line like the following:
fname=Foo&lname=Bar&phone=123&email=foo@bar.com
Build a hash table from it so:
print $h{fname}; # Foo
print $h{lname}; # Bar
...
Start with this file:
#!/usr/bin/perl
use strict;
use warnings;
use Data::Dumper qw(Dumper);
my @input = (
'fname=Foo&lname=Bar&email=foo@bar.com',
'ip=127.0.0.1&machine=foobar',
);
foreach my $str (@input) {
process($str);
}
sub process {
my %data;
# put your code here
print Dumper \%data;
}
$VAR1 = {
'email' => 'foo@bar.com',
'lname' => 'Bar',
'fname' => 'Foo'
};
$VAR1 = {
'ip' => '127.0.0.1',
'machine' => 'foobar'
};
Exercise: Improve the color selector
In the external file where we defined the colors, for each color
keep also a character that will be used to display the menu:
{% embed include file="src/examples/hashes/color_map.txt)
When displaying the menu show:
y) yellow
z) brown
b) black
e) blue
and wait till the user selects the appropriate letter.
Allow the user to provide a color on the command line
using the --color option and check if that is one of
the valid colors.
When reading in the file describing the menu options,
check for duplicate use of the same letter.
Exercise: Display scores
Read in a file where on each line there is a name and a score
with a comma between them.
Print them sorted based on name.
Then also print sorted based on score.
Foo,23
Bar,70
Baz,92
Bozo,17
Gozo,52
Dardon,20
Mekodra,23
Exercise: Analyze Apache log file
In the files section earlier we had a an example counting how many hits came from localhost and from other places.
Please improve that analyzer to provide a report: which client IP address were used and how many hits were from each IP address.
The log file can be found here: examples/files-perl/apache_access.log
Exercise: Parse variable width fields
# In a log file there are rows in which the first 16 and last 16 characters
# describe addresses while everything in between describes several commands
# Each command is built up by a leading character (A, B, C, D, etc) and a number
# of digits. The number of digits depend on the leading character.
#
# In this example we split up the data to commands and count how many times
# each command type was given.
#
1234567890123456A001B0002D00004C0000051234567890123456
1234567890123456A001A002D00004C0000051234567890123456
Solution: Parse HTTP values
#!/usr/bin/perl
use strict;
use warnings;
use Data::Dumper qw(Dumper);
my @input = (
'fname=Foo&lname=Bar&email=foo@bar.com',
'ip=127.0.0.1&machine=foobar',
);
foreach my $str (@input) {
process($str);
}
sub process {
my $str = shift;
my @pairs = split /&/, $str;
my %data;
foreach my $p (@pairs) {
my ($k, $v) = split /=/, $p;
$data{$k} = $v;
}
print Dumper \%data;
}
Solution: Improve the color selector
#!/usr/bin/perl
use strict;
use warnings;
use Getopt::Long qw(GetOptions);
my $color;
my $filename = "examples/color_map.txt";
GetOptions(
"color=s" => \$color,
"filename=s" => \$filename,
);
my %colors;
open(my $fh, "<", $filename)
or die "Could not open '$filename' for reading: $!";
while (my $line = <$fh>) {
chomp $line;
my ($color_name, $letter) = split / /, $line;
if ($colors{$letter}) {
warn
sprintf "%s appears to be allocated to both %s and %s\n",
$letter, $colors{$letter}, $color_name
} else {
$colors{$letter} = $color_name;
}
}
if ($color) {
my $valid_color;
foreach my $c (values %colors) {
if ($c eq $color) {
$valid_color = 1;
next;
}
}
if (not $valid_color) {
print "The color '$color' is not valid.\n";
$color = '';
}
}
if (not $color) {
print "Please select a number:\n";
foreach my $k (sort keys %colors) {
print "$k) $colors{$k}\n";
}
my $letter = <STDIN>;
chomp($letter);
if ($colors{$letter}) {
$color = $colors{$letter};
} else {
print "Bad selection\n";
exit;
}
}
print "The selected color is $color\n";
Solution: Display scores
#!/usr/bin/perl
use strict;
use warnings;
my $filename = shift or die "Usage: $0 FILENAME\n";
open(my $fh, "<", $filename) or die "Could not open '$filename'\n";
my %score_of;
while (my $line = <$fh>) {
chomp $line;
my ($name, $score) = split /,/, $line;
$score_of{$name} = $score;
}
foreach my $name (sort keys %score_of) {
printf "%-10s %s\n", $name, $score_of{$name};
}
print "--------------------------\n";
foreach my $name (sort {
$score_of{$b} <=> $score_of{$a}
} keys %score_of) {
printf "%-10s %s\n", $name, $score_of{$name};
}
Solution: Analyze Apache log file
#!/usr/bin/perl
use strict;
use warnings;
# examples/files/apache_access.log
my $file = shift
or die "Usage: $0 FILENAME\n";
open my $fh, '<', $file or die "Could not open '$file': $!";
my %count;
while (my $line = <$fh>) {
chomp $line;
my $length = index ($line, " ");
my $ip = substr($line, 0, $length);
$count{$ip}++;
}
foreach my $ip (keys %count) {
print "$ip $count{$ip}\n";
}
Solution: Parse variable width fields
#!/usr/bin/perl
use strict;
use warnings;
my %count;
my %length = (
A => 3,
B => 4,
C => 6,
D => 5,
);
my $filename = "examples/hashes/variable_width_fields.log";
if ($ARGV[0]) {
$filename = $ARGV[0];
}
open my $data, '<', $filename or die "Could not open '$filename' $!";
LINE:
while (my $line = <$data>) {
chomp $line;
if (substr($line, 0, 1) eq "#") {
next;
}
my $cmds = substr($line, 16, -16);
while ($cmds) {
my $c = substr($cmds, 0, 1, "");
if (not defined $length{$c}) {
warn "....";
next LINE;
}
my $cmd = substr($cmds, 0, $length{$c}, "");
$count{$c}++;
print "$c : $cmd\n";
}
}
print "-" x 80, "\n";
foreach my $c (keys %count) {
print "$c $count{$c}\n";
}
Shell to Perl
Shell and Perl intro
Manipulating Files and Directories
File test or -X operators
- -e
- -f
- -d
- -M
Before we try to read from a file or try to write to a file
we might want to check our rights, if we can do the required action at all.
For this there is a bunch of so called -X operators. Usually you use them in
an if statement:
if (-e "file.txt") {
print "File exists !\n";
}
- -e File (or directory) exists
- -r File (or directory) is readable by this user
- -w File (or directory) is writable by this user
- -x File (or directory) is executable by this user
- -f Entry is a file
- -d Entry is a directory
- -l Entry is a symbolic link
- -s Size of the file (hence also means 'file is not empty')
- -M Number of days between the modification date of a file and the start time of our script
Hence -s can be used either in an if statement or like this:
$size = -s $filename;
There are more such operators see `perldoc -f -x`
Running External Programs
- system
- ``
- qx
system()
can execute any external program. You pass to it the same
string as you would type on the command line.
It returns 0 on success and the exit code of the external program on failure. Hence the strange way we check if it fails.
Passing the program name and the parameters as an array is more secure as it does not involve invocation of a shell. There is no shell processing involved;
system("some_app.exe --option");
See perldoc -f system for more error handling
my $result = `some_app.exe --option`;
my @result = `some_app.exe --option`;
backticks `` are also know as qx{}
Open pipe for reading
- -|
Connect to STDOUT of find
#!/usr/bin/perl
use strict;
use warnings;
my @resp;
open my $ph, '-|', 'find /' or die;
while (my $line = <$ph>) {
push @resp, $line;
last if $line =~ /root/;
}
close $ph;
print @resp;
Open pipe for writing
- |-
Connect to STDIN of grep
#!/usr/bin/perl
use strict;
use warnings;
open my $ph, '|-', 'grep Perl ' or die "Cannot start process";
print $ph "aa\n";
print $ph "Perl was here\n";
print $ph "bb\n";
print $ph "ccPerl\n";
close $ph;
Pager
Connect to STDIN for paging
use strict;
use warnings;
if (not @ARGV) {
die "Usage: $0 command to page the output of the given command \n";
}
my $SIZE = 10;
my $cnt = $SIZE;
open my $ph, '-|', "@ARGV" or die;
while (my $line = <$ph>) {
print $line;
$cnt--;
if ($cnt <= 0) {
print "---";
my $in = <STDIN>;
chomp $in;
if ($in eq ' ') {
$cnt = $SIZE;
} elsif ($in eq 'q') {
last;
} else {
$cnt = 1;
}
}
}
close $ph;
sendmail.pl
Connect to STDIN of sendmail
#!/usr/bin/perl
use strict;
use warnings;
sendmail('Info <info@perlmaven.com>',
'hello world',
'text...',
'Noreply <noreply@perlmaven.com>');
sub sendmail {
my ($tofield, $subject, $text, $fromfield) = @_;
my $mailprog = "/usr/lib/sendmail";
open my $ph, '|-', "$mailprog -t -oi" or die $!;
print $ph "To: $tofield\n";
print $ph "From: $fromfield\n";
print $ph "Reply-To: $fromfield\n";
print $ph "Subject: $subject\n";
print $ph "\n";
print $ph "$text";
close $ph;
return ;
}
# Warning: do not use the above script in an environment
# where anyone can supply the fields in the header
# (To, From, Reply-To, Subject in this case)
# as this can create an open relay.
UNIX commands from the inside
- unlink
- rm
- del
- rename
- chmod
- chown
- cd
- chdir
- rmdir
- ln
- link
- symlink
- readlink
- glob
- %ENV
- $ENV{HOME}
You can run every external command using `system`
but it makes it platform dependent and might have more security implications.
The following calls are available from Perl.
There are more but we won't cover them now.
| | | UNIX | DOS | | unlink | FILENAME | rm | del | | rename | OLDFILE, NEWFILE | mv | ren | | chmod | MODE, FILE | chmod | - | | chown | UID, GID, FILE | chown | - | | chdir | DIRNAME | cd | cd | | mkdir | DIRNAME, PERM | mkdir | mkdir | | rmdir | DIRNAME | rmdir | rmdir | | link | OLDNAME, NEWNAME | ln | - | | symlink | OLDNAME, NEWNAME | ln -s | - | | readlink | LINKNAME | ls -l | - | | glob | WILDCARDS | ls -1 | dir | | opendir, readdir | | ls -1 | dir | | %ENV, $ENV{HOME} | | | |
my $uid = getpwnam($username);
my $gid = getgrnam($groupname);
How to remove, copy or rename a file with Perl
File globbing (wildcards)
#!/usr/bin/perl
use strict;
use warnings;
# File globbing
my @xml_files_in_current_dir = glob "*.xml";
my $bin_dir = "/home/foo/bin";
my @perl_files = glob "$bin_dir/*.pl $bin_dir/*.pm";
# my @xml_files_using_old_syntax = <*.xml>;
Rename files
-
ls
-
glob - directory listing
#!/usr/bin/perl
use strict;
use warnings;
foreach my $file (glob "*.xml") {
my $new = substr($file, 0, -3) . "html";
rename $file, $new;
}
Directory handles
- opendir
- readdir
For a platform independent approach use opendir and readdir.
In order to read the content of a directory (that is the list of the files) first we have to open the directory similarly to the way we opened a file but using the opendir function This way we get a directory handle which we can use in subsequent operations.
Once the directory was opened successfully we can use the function readdir in a loop to get the names of the files in that directory
#!/usr/bin/perl
use strict;
use warnings;
my $dir = shift or die "Usage: $0 DIRECTORY\n";
opendir my $dh, $dir or die "Cannot open $dir: $!\n";
while (my $entry = readdir $dh) {
if ($entry eq "." or $entry eq "..") {
next;
}
print "$entry\n";
}
closedir $dh;
in LIST context readdir returns all the files in the directory.
opendir(my $dh, "/etc") or die $!;
@files = readdir $dh;
File::HomeDir
#!/usr/bin/perl
use strict;
use warnings;
use File::HomeDir;
my $home = File::HomeDir->my_home;
my $docs = File::HomeDir->my_documents;
print "$home\n";
print "$docs\n";
More UNIX commands implemented in modules
- pwd
- cwd
- basename
- dirname
- cp
- copy
- mv
- move
| Module | Usage | Comment | | Cwd | $dir = cwd; | current working directory | | File::Copy | copy "oldfile", "newfile"; | | | | move "oldfile", "newfile"; | this works between file systems as well | | File::Basename | basename "/a/b/c/file.pl"; | file.pl | | | dirname "/a/b/c/file.pl"; | /a/b/c | | File::Path | mkpath("a/b/c") | (like mkdir -p) | | | rmtree("/") | | | File::Find | | | | File::Find::Rule | | |
File::Spec
- File::Spec
- catfile
use strict;
use warnings;
use File::Spec;
use Cwd qw(cwd);
my $dir = cwd;
my $f = File::Spec->catfile($dir, 'admin', 'project.txt');
print "$f\n"; # ...\admin\project.txt on Windows
# .../admin/project.txt on Linux
use File::Spec::Functions;
my $f = catfile($dir, 'admin', 'project.txt');
Change Copyright text in every source file in a directory hierarchy
-
File::Find
-
find
-
File::Find
-
reference to subroutine
#!/usr/bin/perl
use strict;
use warnings;
use File::Find qw(find);
use File::Slurp qw(read_file write_file);
my $dir = $ARGV[0] || '.';
find( \&change_file, $dir);
sub change_file {
my $name= $_;
if (not -f $name) {
return;
}
if (substr($name, -3) ne ".pl") {
return;
}
print "$name\n";
my $data = read_file($name);
$data =~ s/Copyright Old/Copyright New/g;
# Let's not ruin our example files....
my $backup = "$name.bak";
rename $name, $backup,
write_file($name, $data);
return;
}
Exercise: Tree
- tree
Implement tree: prints a tree structure of a given directory.
All filenames are printed and subdirectories are properly indented.
$ tree.pl .
.
subdir_1
file_1_in_subdir_1
file_2_in_subdir_1
subdir_2
subdir_2_1
file_1_in_subdir_2_1
file_1_in_subdir_2
Implement the previous one using File::Find
Implement the previous one using File::Find::Rule
Exercise: Check for UID 0
In UNIX/Linux the information about users is kept in the
/etc/passwd file. Each line represents a user. The fields
in each line are as follows:
username, password,UID,GID,Gecos,home directory,shell
Today the passwords are usually kept separately hence in this file you will only
see an x in the second field.
When someone breaks in to a UNIX/Linux machine she might try to setup a user with UID 0 in order to gain root (superuser) access to the machine. Please check the following file and print a message if there is a user with 0 as UID which is NOT the root user.
root:x:0:0:root:/root:/bin/bash
daemon:x:1:1:daemon:/usr/sbin:/bin/sh
bin:x:2:2:bin:/bin:/bin/sh
sys:x:3:3:sys:/dev:/bin/sh
sync:x:4:65534:sync:/bin:/bin/sync
cracker:x:0:99:games:/usr/games:/bin/sh
games:x:5:60:games:/usr/games:/bin/sh
Exercise: CPU load
Check if the load on the computer is over a certain threshold and log the event to syslog. See the 3rd number from the right of the uptime command or the first number in the /proc/loadavg file on linux. The file looks like this:
0.15 0.26 0.72 1/128 9230
A sample file
Exercise: MANIFEST file
We have a file called MANIFEST listing all the required files in a distribution. One file on every line.
Check if all the file exist.
Check if there are files that are not listed in MANIFEST
Optionally allow for wildcards in the file ?
Example MANIFEST file:
Makefile.PL
README
MANIFEST
lib/Module.pm
Expected solution:
File 'Makefile.PL' listed in MANIFEST is missing from package
File 'CHANGES' on disk but is not listed in MANIFEST
Solutions: Tree
- File::Find
- File::Find::Rule
#!/usr/bin/perl
use strict;
use warnings;
my $dir = '.';
if (@ARGV) {
$dir = $ARGV[0];
}
traverse_dir('', $dir, 0);
sub traverse_dir {
my ($dir, $thing, $depth) = @_;
my $path = ($dir ? "$dir/$thing" : $thing);
print " " x ($depth*3), "$thing\n";
return if not -d $path;
if (opendir my $dh, $path) {
while (my $entry = readdir $dh) {
next if $entry eq "." or $entry eq "..";
traverse_dir ($path, $entry, $depth+1);
}
} else {
print " " x ($depth*3-3), "#### Could not open $dir\n";
}
return;
}
#!/usr/bin/perl
use strict;
use warnings;
use File::Find;
if (not @ARGV) {
@ARGV = (".");
}
find (\&find_name, @ARGV);
sub find_name {
# $File::Find::name looks like:
# dir/subdir/subdir/file.ext
# split at / and at \ for both unix and windows
my @directories = split m{[/\\]}, $File::Find::name;
print " " x (@directories -1); # () needed for precedence
print "$_\n";
return;
}
#!/usr/bin/perl
use strict;
use warnings;
use File::Find::Rule;
my $dir = '.';
if (@ARGV) {
$dir = shift;
}
foreach my $thing (File::Find::Rule->in($dir)) {
my @parts = split m{/}, $thing;
print " " x @parts;
print "$parts[-1]\n";
}
Solutions: Check for UID 0
#!/usr/bin/perl
use strict;
use warnings;
# Look for users with 0 uid
open my $pw, "/etc/passwd" or die "Cannot open /etc/passwd\n";
while (<$pw>) {
my @items = split /:/;
if ($items[2] =~ /^0*$/) {
print ;
}
}
# Alternatively writen:
# if ((split /:/)[2] =~ /^0*$/) {
Solution: CPU load
#!/usr/bin/perl
use strict;
use warnings;
my $threshold = 0.01;
open my $fh, "/proc/loadavg" or die "Cannot read /proc/loadavg\n";
my $line = <$fh>;
my $load = (split " ", $line)[0];
if ($load > $threshold) {
system "/usr/bin/logger", "-p", "crit", "High LOAD: $load\n";
}
# you can add here the code for logging in a file and sending e-mail
# but be careful as if the load if high you won't be able to send the
# email.
Solution: MANIFEST file
#!/usr/bin/perl
use strict;
use warnings;
use File::Find::Rule;
use File::Spec;
my $dir = shift or die "Usage: $0 DIR\n";
if (not -d $dir) {
die "'$dir' is not a directory";
}
my $manifest_file = File::Spec->catfile($dir, 'MANIFEST');
if (not -f $manifest_file) {
die "We have not found the '$manifest_file' file\n";
}
open my $fh, '<', $manifest_file;
my @manifest = <$fh>;
chomp @manifest;
foreach my $mfile (@manifest) {
if (not -e File::Spec->catfile($dir, $mfile)) {
warn "File '$mfile' listed in MANIFEST is missing from package\n";
}
}
my %manifest;
foreach my $mfile (@manifest) {
$manifest{$mfile} = 1;
}
# advanced solution of the above:
# my %manifest = map { $_ => 1 } @manifest;
my @files = File::Find::Rule->file->relative->in($dir);
foreach my $file (@files) {
if (not $manifest{$file}) {
warn "File '$file' on disk but is not listed in MANIFEST\n";
}
}
Using Perl modules, using CPAN
Using modules exporting functions
use Cwd;
my $path = cwd;
Probably better this way, so the reader will know where each function comes from and we reduce the risk of redefining other functions by importing exactly the functions we want.
use Cwd ('cwd');
my $path = cwd;
also written as
use Cwd qw(cwd);
my $path = cwd;
You can also make sure not to import anything and the use fully qualified names.
use Cwd ();
my $path = Cwd::cwd;
Using Object Oriented modules
#!/usr/bin/perl
use strict;
use warnings;
use WWW::Mechanize;
my $mech = WWW::Mechanize->new;
$mech->get('http://perlmaven.com/');
$mech->follow_link( text_regex => qr/tutorial/ );
print $mech->content;
Selecting Module to use
Evaluating modules, getting information about them
Some interesting CPAN Modules: DateTime
- DateTime
- DateTime::Tiny
- DateTime::Format::ISO8601
- DateTime::Format::Strptime
- DateTime::Duration
- DateTime examples
Days passed
use strict;
use warnings;
use DateTime;
my $d1 = DateTime->new(year => 2010, month => 8, day => 7, second => 37);
my $d2 = DateTime->new(year => 2010, month => 6, day => 17);
my $sec = $d2->subtract_datetime_absolute($d1)->seconds;
print $sec/60/60/24, "\n";
my $dur = $d1->delta_days($d2);
print $dur->in_units('days'), "\n";
Some interesting CPAN Modules: Databases
Some interesting CPAN Modules: Web
- LWP
- WWW::Mechanize
- Catalyst
- Dancer2
- Mojolicious
- Template Toolkit
- Dancer related articles.
- Mojolicius related articles.
- Catalyst related articles.
- Checking a valueless checkbox with WWW::Mechanize
- Basic Authentication with LWP::UserAgent and HTTP::Request::Common
Some interesting CPAN Modules: common file format
- Spreadsheet::Read ...
- How to read an Excel file in Perl
- XML::Twig ...
- XML::RSS::SimpleGen
- XML::Atom::SimpleFeed
- Config::Tiny
- Text::CSV
- Text::CSV_XS
- CSV examples.
- JSON::MaybeXS
- JSON examples.
- YAML::XS
- YAML examples.
Some interesting CPAN Modules
- Log::Log4perl
- Log4perl examples
- Perl::Critic
- Perl::Critic articles
- File::Slurp
- File::Path
- Net::Telnet
- Gtk2
- Wx
- PAR
- WebService::GData::YouTube
- IMDB::Film
- Bio::Util::DNA
- BioPerl
Installing modules
- Use the packaging system of your OS
- cpan
- cpanm from cpanmin.us
- Ask your sysadmin.
Installing modules from the os vendor
Let's install WWW::Mechanize
Debian/Ubuntu:
aptitude search mechanize | grep perl
sudo aptitude install libwww-mechanize-perl
Fedora / RedHat
yum search Mechanize | grep perl
yum install WWW-Mechanize
ActivePerl:
ppm install WWW::Mechanize
Using CPAN.pm
$ cpan WWW::Mechanize
or
$ cpan
cpan> install WWW::Mechanize
or
$ perl -MCPAN -eshell
cpan> install WWW::Mechanize
Configure CPAN.pm
Need to configure CPAN.pm:
$ cpan
cpan> o conf to show the configuration options
cpan> o conf urllist http://cpan.pair.com/ a CPAN mirror that is close by
cpan> o conf prerequisites_policy follow
cpan> o conf makepl_arg (PREFIX=... LIB=...)
cpan> o conf commit # to save the changes
other interesting commands
cpan> install Module::Name
cpan> look Module::Name gets to subshell
cpan> force install Module::Name
cpan> notest install Module::Name
cpan> test Module::Name
Installing modules manually
Download the tar.gz file from metacpan.org:
$ wget URL
$ tar xzf distribution.tar.gz
$ cd distribution
$ perl Makefile.PL
$ make
$ make test
$ make install (as root or with sudo)
Without root rights
perl Makefile.PL PREFIX=/home/foobar/perlib LIB=/home/foobar/perlib/lib
Installing modules manually using Build.PL
Module::Build
perl Build.PL --install_base /home/foo/perl5 \
--install_path lib=/home/foo/perl5/lib
perl Build
perl Build test
perl Build install
On Windows
On Strawberry Perl it would be dmake instead of make and there is also nmake in some installations.
# on Unix
$ gunzip distribution.tar.gz
$ tar xf distribution.tar
Changing @INC
Set the environment variable PERL5LIB or PERLLIB for all the scripts
export PERL5LIB=/path/to/lib
Adding a path to the beginning of @INC. Good for the specific script
BEGIN {
unshift @INC, '/path/to/lib';
}
The same but with the standard tool:
use lib '/path/to/lib'; # good for the specific script
On the command line. Good for this invocation only.
perl -I /path/to/lib script.pl
How to change @INC to find Perl modules in non-standard locations
Changing @INC - Relative PATH
Directory layout
script/app.pl
lib/My/Module.pm
# relative path
use FindBin;
use File::Spec;
use lib File::Spec->catfile($FindBin::Bin, '..', 'lib');
# relative path
use File::Spec;
use File::Basename;
use lib File::Spec->catfile(
File::Basename::dirname(File::Spec->rel2abs($0)),
'..',
'lib');
local::lib
Download local::lib and follow the bootstrapping technique.
$ wget http://cpan.metacpan.org/authors/id/A/AP/APEIRON/local-lib-1.008004.tar.gz
$ tar xzf local-lib-1.008004.tar.gz
$ cd local-lib-1.008004
$ perl Makefile.PL --bootstrap
$ make test
$ make install
$ echo 'eval $(perl -I$HOME/perl5/lib/perl5 -Mlocal::lib)' >>~/.bashrc
$ source ~/.bashrc
$ cpan WWW::Mechanize
It will install itself in ~/perl5/lib you will need to add some code to ~/.bashrc as well to make it visible to all the perl scripts from your user. Then you can start installing modules using the regular cpan client.
CPAN.pm
export PERL5LIB=/home/gabor/perl5lib/lib
# This is CPAN.pm's systemwide configuration file. This file provides
# defaults for users, and the values can be changed in a per-user
# configuration file. The user-config file is being looked for as
# ~/.cpan/CPAN/MyConfig.pm.
$CPAN::Config = {
'build_cache' => q[10],
'build_dir' => q[/home/gabor/.cpan/build],
'cache_metadata' => q[1],
'cpan_home' => q[/home/gabor/.cpan],
'dontload_hash' => { },
'ftp' => q[/usr/kerberos/bin/ftp],
'ftp_proxy' => q[],
'getcwd' => q[cwd],
'gpg' => q[/usr/bin/gpg],
'gzip' => q[/bin/gzip],
'histfile' => q[/home/gabor/.cpan/histfile],
'histsize' => q[100],
'http_proxy' => q[],
'inactivity_timeout' => q[0],
'index_expire' => q[1],
'inhibit_startup_message' => q[0],
'keep_source_where' => q[/home/gabor/.cpan/sources],
'links' => q[/usr/bin/links],
'make' => q[/usr/bin/make],
'make_arg' => q[],
'make_install_arg' => q[],
'makepl_arg' => q[PREFIX=/home/gabor/perl5lib LIB=/home/gabor/perl5lib/lib],
'ncftpget' => q[/usr/bin/ncftpget],
'no_proxy' => q[],
'pager' => q[/usr/bin/less],
'prerequisites_policy' => q[follow],
'scan_cache' => q[atstart],
'shell' => q[/bin/bash],
'tar' => q[/bin/tar],
'term_is_latin' => q[1],
'unzip' => q[/usr/bin/unzip],
'urllist' => [q[http://mirror.mirimar.net/cpan/]],
'wget' => q[/usr/bin/wget],
};
1;
__END__
Installing modules on ActivePerl
C:> ppm
ppm> install Name-Of-Module
in case it returns a list of modules, pick up the correct number:
ppm> install 3
There are additional sites with ppm repositories once can find on Kobes Search
Add the repository to ppm and install modules from that place as well:
ppm> rep add uwin http://theoryx5.uwinnipeg.ca/ppms/
ppm> install IO-Socket-SSL
in ActiveState 5.6.x
ppm> set rep name URL
In case the computer is behind a company proxy you can configure the http_proxy environment variable and ppm will use the proxy:
set http_proxy=http://proxy.company.com:8080
CPAN::Reporter
cpan> install CPAN::Reporter
cpan> reload cpan
cpan> o conf init test_report
cpan> o conf commit
Exercise: setup local::lib
Follow the instruction in the documentation of local::lib to bootstrap on your machine.
Exercise: Module installation
Install the Acme::EyeDrops module from CPAN and write a script
to draw a camel. As you are not root, you might need to install it in
a local subdirectory.
Create a simple script that does some simple computation.
Create a script using Acme::EyeDrops that will use the above simple script as
source.
Save your camel in a file.
Run the file containing the camel using Perl.
Exercise: Read Excel file
There are two Excel files included, read the data from the two files.
examples/spreadsheet.xls
examples/person.xls
Solution: Module installation
#!/usr/bin/perl
=pod
search.cpan.org
search Acme::EyeDrops
download the latest Acme-EyeDrops gziped file
(for me it was Acme-EyeDrops-1.01.tar.gz)
mkdir modules (create a local directory where we'll install the module)
tar xzf Acme-EyeDrops-1.01.tar.gz
cd Acme-EyeDrops-1.01
perl Makefile.PL PREFIX=/home/user/modules LIB=/home/user/module/lib
(the full path to the directory you created for the modules)
make
make test
make install
Create a script called hello_world.pl that asks for your name and then
prints Hello NAME.
Run this script. See the camel.
Now run this script and redirect to another file
perl acme_camel.pl > camel.pl
Now run the camel:
perl camel.pl
=cut
use strict;
use warnings;
use lib qw (/home/user/modules/lib/);
use Acme::EyeDrops qw (sightly);
print sightly({
Shape => 'camel',
SourceFile => 'hello_world.pl',
});
Solution: Read Excel file
#!/usr/bin/perl
use strict;
use warningds;
use Spreadsheet::ParseExcel::Simple;
my $xls = Spreadsheet::ParseExcel::Simple->read("spreadsheet.xls");
foreach my $sheet ($xls->sheets) {
while ($sheet->has_data) {
my @data = $sheet->next_row;
print join "|", @data;
print "\n";
}
}
Applications
Simple uses of Perl
After leaning the syntax of the language let's see a few simple ways to use it in real life tasks.
Create Unix user account
#!/usr/bin/env perl
use strict;
use warnings;
# --fname Foo --lname Bar
# /usr/sbin/adduser --home /opt/bfoo --gecos "Foo Bar" bfoo
my $adduser = '/usr/sbin/adduser';
use Getopt::Long qw(GetOptions);
if (not @ARGV) {
usage();
}
my %opts;
GetOptions(\%opts,
'fname=s',
'lname=s',
'run',
) or usage();
if (not $opts{fname} or $opts{fname} !~ /^[a-zA-Z]+$/) {
usage("First name must be alphabetic");
}
if (not $opts{lname} or $opts{lname} !~ /^[a-zA-Z]+$/) {
usage("Last name must be alphabetic");
}
my $username = lc( substr($opts{lname}, 0, 1) . $opts{fname});
my $home = "/opt/$username";
print "Username: $username\n";
my $cmd = qq($adduser --home $home --gecos "$opts{fname} $opts{lname}" $username);
print "$cmd\n";
if ($opts{run}) {
system $cmd;
} else {
print "need to add the --run flag to actually execute\n";
}
sub usage {
my ($msg) = @_;
if ($msg) {
print "$msg\n\n";
}
print "Usage: $0 --fname FirstName --lname LastName --run\n";
exit;
}
Reporting file system diskspace usage (df)
- df
#!/usr/bin/env perl
use strict;
use warnings;
use Filesys::DfPortable qw(dfportable);
#my $df = dfportable("/", 1024 * 1024 * 1024);
my $df = dfportable("/", 1024);
print "Total Size: $df->{blocks} K\n";
print "Available: $df->{bfree} K\n";
print "Used: $df->{bused} K\n";
print "Percent Full: $df->{per} %\n";
print "Total available to me: $df->{bavail} K\n";
$ perl diskspace.pl ; df /
Total Size: 48062440 K
Available: 38720692 K
Used: 9341748 K
Percent Full: 20 %
Total available to me: 36279216 K
Filesystem 1K-blocks Used Available Use% Mounted on
/dev/sda1 48062440 9341748 36279216 21% /
Reporting diskspace usage on the mail server
- du
#!/usr/bin/env perl
use strict;
use warnings;
#
# Reporting disk usage on the mail server
#
# Run the script in a cron job
#
# 1) Report to Boss if there are people with large files
#
# 2) If a user has a file that is too big then ask him to remove the
# large e-mail from the mail server via web access
# This one has not been implemented yet
#
######################################################
use Mail::Sendmail qw(sendmail);
use Filesys::DfPortable qw(dfportable);
################## Limit Definitions
# the size of the /var/spool/mail/username file in bytes
my $report_to_boss_limit = 1_000_000;
my $report_to_user_limit = 500_000;
my $domain = '@company.com';
my $boss_email = 'boss@company.com';
my $from_email = 'Disk Usage Report <sysadmin@company.com>';
my $disk_space_percantage = 80;
my %file_size;
# each user has a file in that directory
foreach my $path (glob "/var/spool/mail/*") {
if ($path =~ /Save/) { # disregard the Save directory
next;
}
if ($path =~ /\.pop$/) { # disregard temporary .pop files
next;
}
$file_size{$path} = -s $path;
}
my $txt = "";
# sort files by size
foreach my $path (sort {$file_size{$b} <=> $file_size{$a}} keys %file_size) {
my $name = $path;
$name =~ s{/var/spool/mail/}{};
if ($file_size{$path} > $report_to_boss_limit) {
$txt .= "$name\t\t" . int ($file_size{$path}/1_000_000) . " MB\n";
}
if ($file_size{$path} > $report_to_user_limit) {
my $msg = "You are currently using $file_size{$path} bytes\n";
$msg .= "Please reduce it to under $report_to_user_limit\n";
sendmail (
To => "$name$domain",
From => $from_email,
Subject => 'Disk Usage Report' . localtime(),
Message => $msg,
);
}
}
my @disks = qw(/ /boot);
foreach my $disk (@disks) {
my $df = dfportable($disk, 1024);
if ($df->{per} > $disk_space_percantage) {
$txt .= "\n\nDiskspace is low\n\nUsing ";
$txt .= $df->{per} . "\% of the space on $disk\n";
}
}
if ($txt) {
$txt = "Disk Usage of /var/spool/mail on the incoming mail server\n" .
"Reporting users over $report_to_boss_limit bytes\n\n" .
$txt;
sendmail (
To => $boss_email,
From => $from_email,
Subject => 'Disk Usage Report' . localtime(),
Message => $txt,
);
}
A du like script
- du
#!/usr/bin/env perl
use strict;
use warnings;
use Filesys::DiskUsage qw(du);
if (not @ARGV) {
die "Usage: $0 DIRs\n";
}
my %sizes = du({'make-hash' => 1}, @ARGV);
foreach my $entry (sort { $sizes{$a} <=> $sizes{$b} } keys %sizes) {
print "$entry => $sizes{$entry}\n";
}
Send files by e-mail
- MIME
#!/usr/bin/env perl
use strict;
use warnings;
use File::Basename qw(basename);
use File::Slurp qw(read_file);
use Getopt::Long qw(GetOptions);
use MIME::Lite;
use Pod::Usage qw(pod2usage);
my $text = <<'END_TEXT';
<html>
<head>
<title>Hello</title>
</head>
<body>
<h1>World</h1>
</body>
</html>
END_TEXT
my %opt;
GetOptions(\%opt,
'from=s',
'to=s',
'cc=s',
'subject=s',
'textfile=s',
'smtp=s',
) or pod2usage();
if (not $opt{from} or
not $opt{to} or
not $opt{subject}
) {
pod2usage();
}
if ($opt{textfile}) {
$text = read_file( $opt{textfile} );
}
send_files(\%opt, $opt{subject}, $text, @ARGV);
sub send_files {
my ($opt, $subject, $message_body, @files) = @_;
my $msg = MIME::Lite->new(
From => $opt->{from},
To => $opt->{to},
Cc => $opt->{cc},
Subject => $subject,
Type => 'multipart/mixed'
) or die "Error creating multipart container: $!\n";
$msg->attach(
Type => ($message_body =~ /<html>/ ? 'text/html' : 'text/plain'),
Data => $message_body
) or die "Error adding the text message part: $!\n";
foreach my $filename (@files) {
$msg->attach(
Type => ($filename =~ /\.xls$/ ? 'application/xls' : 'text/plain'),
Path => $filename,
Filename => basename($filename),
Disposition => 'attachment'
) or die "Error adding $filename: $!\n";
}
if ($opt->{smtp}) {
$msg->send('smtp', $opt->{smtp}, Timeout => 60) or die $!;
} else {
$msg->send or die $!;
}
return;
}
=head1 SYNOPSIS
Sending and e-mail with or without attachements
perl send_files.pl
--from from@company.com
--to to@company.com
--subject "Subject line"
report.xls
--textfile path/to/content.txt
--smtp HOSTNAME
=cut
Read Excel file
- Excel
#!/usr/bin/env perl
use strict;
use warnings;
use Spreadsheet::ParseExcel::Simple qw();
my $xls = Spreadsheet::ParseExcel::Simple->read("spreadsheet.xls");
foreach my $sheet ($xls->sheets) {
while ($sheet->has_data) {
my @data = $sheet->next_row;
print join "|", @data;
print "\n";
}
}
How to create an Excel file with Perl.
Process file with fixed width records
#!/usr/bin/env perl
use strict;
use warnings;
# You need to parse a log file where the fields are fixed length long
# and have no delimiters
# The definition is as follows:
# LABEL: 4 chars
# SOURCE: 8 digits
# DESTINATION: 8 digits
# TYPE: 4 chars
# VALUE: 8 digits
my $file = shift or die "Usage: $0 pack.txt\n";
open(my $data, '<', $file) or die "Could not open '$file'\n";
while (my $line = <$data>) {
print $line;
chomp $line;
my ($label, $source, $dest, $type, $value) = unpack ("A4 A8 A8 A4 A8", $line);
print "LABEL: $label SOURCE: $source DEST: $dest TYPE: $type VALUE: $value\n";
}
XALD37845566974923342XYZ24023984
YQRW49327408234028434ERD24448009
Process file with multiline records
device = 234234
name = Big
address = 115.6.79.8
class = B
device = 234224
name = Big Blue
address = 115.6.69.8
class = B
alias = Foxbox
device = 234235
name = Big Green box
address = 115.6.79.1
class = G
owner = Boss
device = 334235
name = Small Yellow
address = 115.6.79.10
class = Y
#!/usr/bin/env perl
use strict;
use warnings;
=head1 DESCRIPTION
File have sections separated by empty lines
Each section has several field = value entries like this:
Given a value of the name field print out all the values in this section
device = 234234
name = Big
address = 115.6.79.8
class = B
=cut
if (@ARGV != 2) {
die "\n Usage: $0 filename name\n Try: $0 examples/config.txt Big\n\n";
}
my ($filename, $name) = @ARGV;
open(my $fh, "<", $filename) or die "Could not open '$filename' $!";
my %data;
while (my $line = <$fh>) {
chomp $line;
if ($line =~ /^\s*$/ and %data) {
if ($data{name} eq $name) {
foreach my $k (keys %data) {
printf "%-10s = %s\n", $k, $data{$k};
}
exit;
}
%data = ();
} else {
my ($field, $value) = split /\s*=\s*/, $line;
$data{$field} = $value;
}
}
How to read a CSV file using Perl?
Process multi field csv file
- csv
Name,ID,Input,Output
Big Venta,12,Left,Right
Small Elevator,7343124,Bottom,Top
Giant Ant,423235,Lower floor,Upper floor
#!/usr/bin/env perl
use strict;
use warnings;
use Text::CSV_XS qw();
use Data::Dumper qw(Dumper);
my $filename = shift or die "Usage: $0 FILENAME\n";
open(my $fh, "<", $filename) or die "Could not open '$filename': $!";
my $csv = Text::CSV_XS->new;
my $key = "Name";
my $header = <$fh>;
chomp $header;
$csv->parse($header);
my @header = $csv->fields;
my %data;
while (my $line = <$fh>) {
chomp $line;
$csv->parse($line);
my @cols = $csv->fields;
my %h;
@h{@header} = @cols;
$data{$h{$key}} = \%h;
}
print Dumper \%data;
Fetch web page
- http
- web
- GET
- LWP::Simple
#!/usr/bin/env perl
use strict;
use warnings;
use LWP::Simple qw(get);
my $page = get "http://perlmaven.com/";
if ($page) {
print "Site is alive\n";
} else {
print "Site is not accessible\n";
}
Generate web page
- CGI
- HTML::Template
We are building the HTML pages from a template utilizing the HTML::Template module from CPAN. Besides the plain HTML the template has additional TMPL_* tags that will be filled by the values by HTML::Template.
{% embed include file="src/examples/applications/html.tmpl)
This is a simple Perl script that should be installed to a CGIExec enabled directory of Apache. When the user hits this page the first time it displays a white page with only entry-box and a submit button on it. the user can fill the box.
#!/usr/bin/env perl
use strict;
use warnings;
use CGI;
use HTML::Template;
my $template = HTML::Template->new(filename => "examples/html.tmpl");
my $q = CGI->new;
print $q->header;
if ($q->param("text")) {
my $text = $q->param("text");
$template->param(echo => $text);
}
print $template->output
Parse XML file
- XML
- XML::Simple
<people>
<person id="1">
<fname>Josef</fname>
<lname>Kiss</lname>
<idnum>4342324234</idnum>
<children>
<child>
<id>3</id>
</child>
</children>
</person>
<person id="2">
<fname>Peter</fname>
<lname>Kiss</lname>
<idnum>8768996</idnum>
</person>
<person id="3">
<fname>Zoltan</fname>
<lname>Kiss</lname>
<idnum>09808760</idnum>
</person>
</people>
#!/usr/bin/env perl
use strict;
use warnings;
use XML::Simple qw(XMLin);
my $xml = XMLin("examples/simple.xml", ForceArray => 1);
#use Data::Dumper qw(Dumper);
#print Dumper $xml;
#exit;
print join "-", keys %{$xml->{person}};
print "\n";
foreach my $id (keys %{$xml->{person}}) {
printf "%-10s %-10s %-10s\n",
$xml->{person}{$id}{fname}[0],
$xml->{person}{$id}{lname}[0],
$xml->{person}{$id}{idnum}[0];
}
Database access using DBI and DBD::SQLite
- database
- DBI
- DBD
- SQLite
#!/usr/bin/env perl
use strict;
use warnings;
use DBI;
my $dbfile = "sample.db";
unlink $dbfile if -e $dbfile;
my $sql = <<'SCHEMA';
CREATE TABLE people (
id INTEGER PRIMARY KEY,
fname VARCHAR(100),
lname VARCHAR(100)
);
SCHEMA
my $dbh = DBI->connect("dbi:SQLite:dbname=$dbfile","","", {});
$dbh->do($sql);
#!/usr/bin/env perl
use strict;
use warnings;
use Getopt::Long qw(GetOptions);
use DBI qw();
my $action;
GetOptions("action=s" => \$action);
if (not $action or $action !~ /^(insert|selecta|selecth)$/) {
print <<"USAGE";
Usage:
$0 --action insert|selecta|selecth
USAGE
exit;
}
my $dbfile = "sample.db";
my $dbh = DBI->connect("dbi:SQLite:dbname=$dbfile","","", {});
if ($action eq "insert") {
insert();
}
if ($action eq "selecta") {
fetch_arrays();
}
if ($action eq "selecth") {
fetch_hashref();
}
sub insert {
my @people = (
['Foo', 'Bar'],
['Apple', 'Pie'],
);
foreach my $person (@people) {
$dbh->do("INSERT INTO people (id, fname, lname) VALUES(?, ?, ?)",
undef,
1, @$person);
}
return;
}
sub fetch_arrays {
my $sth = $dbh->prepare("SELECT lname, fname FROM people WHERE id = ?");
$sth->execute(1);
while (my @result = $sth->fetchrow_array()) {
print "lname: $result[0], fname: $result[1]\n";
}
$sth->finish;
return;
}
sub fetch_hashref {
my $sth = $dbh->prepare("SELECT lname, fname FROM people WHERE id = ?");
$sth->execute(1);
while (my $result = $sth->fetchrow_hashref("NAME_lc")) {
print "lname: $result->{lname}, fname: $result->{fname}\n";
}
$sth->finish;
return;
}
Simple Database access using Perl DBI and SQL
Net::LDAP
- LDAP
- Net::LDAP
#!/usr/bin/env perl
use strict;
use warnings;
use Net::LDAP;
my $server = "ldap.itd.umich.edu";
my $ldap = Net::LDAP->new( $server ) or die $@;
$ldap->bind;
my $result = $ldap->search(
base => "",
filter => "(&(cn=Jame*) (sn=Woodw*))",
attr => ["mail"],
);
$result->code && die $result->error;
printf "COUNT: %s\n", $result->count;
foreach my $entry ($result->entries) {
$entry->dump;
}
print "===============================================\n";
foreach my $entry ($result->entries) {
printf "%s <%s>\n",
$entry->get_value("displayName"),
($entry->get_value("mail") || '');
$entry->add ( "nephew" => "Foo Bar" );
$entry->replace ( "mail" => 'foo@bar.com');
my $res = $entry->update($ldap);
if ($res->code) {
warn "Failed to update: " . $res->error . "\n";
};
}
my $res = $ldap->add(
'cn=root, o=University of Michigan, c=US',
attr => [
cn => 'Foo Bar',
ou => 'My Place in the Universe',
mail => 'mail@address.com',
],
);
if ($res->code) {
warn "Failed to add: " . $res->error . "\n";
}
$ldap->unbind;
Reading from LDAP in Perl using Net::LDAP
Tie::File
- Tie::File
#!/usr/bin/env perl
use strict;
use warnings;
use Tie::File;
tie my @file, 'Tie::File', "data.txt" or die $!;
$file[7] = "Hello";
Extra slides
List all defined variables
#!/usr/bin/env perl
use v5.10;
use strict;
use warnings;
my $x;
my @y;
our $xx;
foreach my $variable (sort keys %main::) {
say $variable;
}
Perl internal variables
- $! - error of the last system call e.g. open file.
- $/ - input record separator, slurp mode.
- $_ - the default variable for many operations.
- $0 - Name of the current program.
- $1, $2, .. Regex match variables.
- $] - version number of Perl
- $a, $b - variables of sort.
- @ARGV - command line parameters.
- @_ - parameters of the current function.
- @INC - list of directories to search for modules.
- %INC - loaded modules and the path to them on the disk.
- %ENV - environment variables
- %SIG - signal handlers
- Full list at perldoc perlvar
What is Perl ?
-
Created by Larry Wall in 1987
-
Scripting language replacing awk, sed and shell scripts
-
Glue language - putting together tools provided for the OS
-
Free software distributed under GPL and the Artistic licenses
-
Available for most of the platforms (UNIX/Linux/Windows/Macintosh/VMS/main frame/...)
-
Generally speaking machine and OS independent but the system calls are OS specific
-
Interpreted language compiled to a machine independent byte tree and ...
-
... Executed immediately by the interpreter
-
C-like syntax
What does Perl stand for ?
- Perl is not an acronym it is what we call a backronym
- Practical Extraction and Reporting Language
- Pathologically Eclectic Rubbish Lister
- Objective: Making Easy Things Easy and Difficult Things Possible
- Motto: TMTOWTDI = There's More Than One Way To Do It
Scripting language ?
Some people talk about writing code in Perl as scripting and others as programming. Some say Perl is a scripting language others that it is a programming language.
Actually there is no such definition as scripting language. The real definitions are:
- Compiled languages
- Interpreted languages
Is Perl Interpreted or Compiled Language
- Compiled languages: C, C++
- Compiled and interpreted languages: Java, Perl
- Interpreted languages: BASIC (or Shell)
Perl is in the middle as it is compiled to a byte tree (similarly to Java) that is never saved as a file. It also requires the interpreter to be available with the Perl program.
Advantages of a compiled language:
- Can be faster especially in short applications
- Does not need a compiler/interpreter/Virtual Machine on the users machine (it can be distributed on its own)
Disadvantages of a compiled language:
- It has to be compiled separately to every platform
- Difficult to make it platform independent
Why use Perl ?
- Easy - for daily use
- Nearly unlimited - for high-level programming (no real time, no kernel, no device drivers please)
- Mostly fast - fast enough for the tasks you would do with Perl
- Can solve problems - the symbol of Perl is the camel, ugly but can handle difficult environment
What is it good for ?
- QA (test automation)
- System Administration
- Configuration Management
- Web backend (CGI/Database access)
- Quick and dirty programs (automating simple daily jobs)
- Biotechnology
- Prototypes
Its strengths
- Text processing (pattern matching)
- List processing
- Database access
- System language
Redirection by the user
In order to read a real file we have to wait for a later chapter or you can ask the user of the script to use redirection on the command line.
The user of out program might do the following to redirect the input and/or the output of our program.
myperl.pl < in.txt
myperl.pl > out.txt
myperl.pl < in.txt > out.txt
for loop
- for
for (INITIALIZE; TEST; INCREMENT) {
BODY;
}
for (my $i=0; $i < 10; $i++) {
print "$i\n";
}
# but a foreach loop might be better for the same looping:
foreach my $i (0..9) {
print "$i\n";
}
ASCII table
- chr
- ord
foreach my $i (32..128) {
print $i, " ", chr($i), "\n";
}
# the inverse function is ord()
print ord('a'); # 97
default variable $_
- $_
Default variable
my @fruits = qw(apple banana peach);
foreach my $fruit (@fruits) {
print $fruit;
}
foreach my $name (@people) {
print $name;
}
foreach (@people) {
print;
}
foreach $_ (@people) {
print $_;
}
Return values
Perl functions always returns a value. Either explicitly by calling return or the result of the last computation.
#!/usr/bin/perl
sub loop {
$i = 0;
while ($i < 10) {
$i++;
}
}
print loop(); # what is this value ?
print $i; # 10
Solving the above problem: always use return
Add return $i;
before the end of the subroutine.
UNIX file system, hard links symbolic links
- What is in a Directory ?
- What is in an Inode ?
- What is a symbolic link ?
- What is a hard link ?
What links can be between different partitions ?
(hint: only symbolic links as hard links are bound to the inode
number which is local to each partition)
stat, lstat
- stat
- lstat
In order to get information from the inode table you can use the stat system call
ARRAY = stat FILEHANDLE| FILENAME
@fields = stat ($filename);
@fields = stat ($fh);
$fields[4] is the UID
$fields[7] is the size in bytes
0 dev device number of file system
1 ino inode number
2 mode file mode (type and permissions)
3 nlink number of (hard) links to the file
4 uid numeric user ID of file's owner
5 gid numeric group ID of file's owner
6 rdev the device identifier (special files only)
7 size total size of file, in bytes
8 atime last access time in seconds since the epoch
9 mtime last modify time in seconds since the epoch
10 ctime inode change time (NOT creation time!) in seconds since the epoch
11 blksize preferred block size for file system I/O
12 blocks actual number of blocks allocated
for symbolic links use lstat
Compare values
#!/usr/bin/perl
use strict;
use warnings;
my $first = <STDIN>;
chomp $first;
my $other = <STDIN>;
chomp $other;
if ($first == $other) {
print "The two numbers are the same\n";
} else {
print "The two numbers are NOT the same\n";
}
if ($first eq $other) {
print "The two strings are the same\n";
} else {
print "The two strings are NOT the same\n";
}
if ($first > $other) {
print "First is a BIGGER number\n";
} else {
print "First is a smaller number\n";
}
if ($first gt $other) {
print "First is a BIGGER string\n";
} else {
print "First is a smaller string\n";
}
Bitwise Operations
- bitwise
- &
- |
- ~
- ^
- <<
- >>
The following operators work on the binary representation of the values
& AND
| OR
^ XOR
~ NOT (complement)
>> right shift
<< left shift
bitwise examples
#!/usr/bin/perl
use strict;
use warnings;
my $x = 5; # 00101
my $y = 3; # 00011
my $z = $x & $y; # 00001 = 1
$z = $x | $y; # 00111 = 7
$z = $x ^ $y; # 00110 = 6
$z = ~ 1; # 11111111111111111111111111111110 (32 chars) = 4294967294
$z = $x << 2; # 10100 = 20
$z = $x >> 2; # 00001 = 1
printf "%b", $x; # in order to print the binary form of a value
### Bitwise operators on strings
$a = "h l o";
$b = " e l ";
print $a | $b, "\n";
print $a ^ $b, "\n";
# hello
# HELLO
Filename creation based on date
# creating a filename based on the date YYYY-MM-DD-HH-MM-SS
@t = (localtime)[5,4,3,2,1,0];
$t[0]+=1900;
$t[1]++;
$filename = sprintf "%d-%0.2d-%0.2d-%0.2d-%0.2d-%0.2d", @t;
Debugging prints
# Scalar:
print STDERR "Debug: '$a'\n";
# Array:
print STDERR "Debug: @a\n";
print STDERR "Debug: ", join "|", @a, "\n";
# Hash:
print STDERR "Debug: ", join "|", %h, "\n";
print STDERR "Debug: ";
foreach (keys %h) { print STDERR "$_=>$h{$_}|";
print STDERR "\n";
print STDERR "Debug: ";
print STDERR "$_=>$h{$_}|" for keys %h;
print STDERR "\n";
Where is perl installed on your system ?
It can be installed in any place but usually you'll find it in one of the following places:
on UNIX:
/usr/bin/perl
/usr/local/bin/perl
on Windows:
c:\perl\bin\perl.exe
In addition you'll find a bunch of Perl standard modules installed somewhere on the system.
Send e-mail
-
mail
-
sendmail
-
Mail::Sendmail
-
Mail::Sendmail
#!/usr/bin/perl
use strict;
use warnings;
use Getopt::Long "GetOptions";
use Mail::Sendmail "sendmail";
my $to;
my $from;
my $help;
my $file;
GetOptions(
"to=s" => \$to,
"from=s" => \$from,
"help" => \$help,
"file=s" => \$file,
);
if ($help) {
usage();
}
if ($to and $from and $file) {
my ($subject, $message) = read_file($file);
my %mail = (
To => $to,
From => $from,
Subject => $subject,
Message => $message,
);
sendmail(%mail) or die $Mail::Sendmail::error;
} else {
usage();
}
sub usage {
print "Usage: $0\n";
print " --to TO\n";
print " --from FROM\n";
print " --file FILE\n";
print "\n";
print " --help\n";
print "\n";
print "The given FILE is going to be the content of the e-mail\n";
print "The first line of the file should be:\n";
print "Subject: and the subject itself\n";
print "\n";
exit;
}
sub read_file {
my ($file) = @_;
open(my $fh, "<", $file) or die "Could not open '$file'\n";
my $subject = <$fh>;
local $/ = undef;
my $message = <$fh>;
$subject =~ s/^Subject: //;
return ($subject, $message);
}
XML using Perl
XML sample data
We are going to use the following XML file and will try to answer te various questions that are listed in the comment at the top of the file.
<!--
- List all the currencies
- What is the currency in the USA
- What is the currency in the country with id 2 ?
- Replace the currency in Hungary to Euro
- Add another country called Beregengocia with currency
-->
<data>
<country id="1">
<name>USA</name>
<languages>
<language id="1">English</language>
<language id="2">Spanish</language>
</languages>
<currency>USD</currency>
</country>
<country id="2">
<name>Israel</name>
<languages>
<language id="1">Hebrew</language>
<language id="2">English</language>
<language id="3">Arabic</language>
<language id="4">Russian</language>
</languages>
<currency>NIS</currency>
</country>
<country id="3">
<name>Hungary</name>
<languages>
<language id="1">Hungarian</language>
</languages>
<currency>HUF</currency>
</country>
</data>
XML Definitions
- Tags, attributes
- Well formed
<tag>text</tag>
or an empty element:
<tag />
- Valid XML (DTD, XML Schema, RelexNG, Schematron)
- Namespaces, Entity references,
XML processing modes
- DOM - build an in-memory representation of the XML (tree)
- SAX - process the XML chunk-by-chunk (stream)
XML::Simple
Let's see some of the parameters controlling the way XML is read into memory by XML::Simple.
XML::Simple is, well, relatively simple but it can still be used in a number of ways. The most straight forward way is to import the XMLin function, and pass a filename to it or a scalar with XML data in it. In addition the XMLin subroutine accepts several options. Two of them are the most important: ForceArray and KeyAttr. Also interesting is KeepRoot.
ForceArray defaults to 0 KeyAttr defaults to ['name', 'key', 'id'] KeepRoot defaults to 0
'Array folding' feature is controlled by KeyAttr
#!/usr/bin/perl
use strict;
use warnings;
use XML::Simple qw(XMLin);
use Data::Dumper;
my $data = <<'END_XML';
<people>
<person id="1">
<name>Foo</name>
</person>
<person id="2">
<name>Bar</name>
<email id="3">bar@home</email>
<picture src="img.png" />
</person>
</people>
END_XML
my $xml = XMLin($data,
ForceArray => 1,
KeyAttr => [],
#KeepRoot => 1,
);
$Data::Dumper::Sortkeys = 1;
print Dumper $xml;
print "The name of the first person: $xml->{person}[0]{name}[0]\n";
print "The name of the second person: $xml->{person}[1]{name}[0]\n";
print "The id of the first person: $xml->{person}[0]{id}\n";
print "The id of the second person: $xml->{person}[1]{id}\n";
print "The id of the first email of the second person: $xml->{person}[1]{email}[0]{id}\n";
print "The first email of the second person: $xml->{person}[1]{email}[0]{content}\n";
print "The srt of the first picture of the second person: $xml->{person}[1]{picture}[0]{src}\n";
# There still is a difference on how to get to the content of an element with attributes
# (the email) and the content of an element without attribute (name).
Without any options this will generate something that is apparently a mess. This is caused by the fact that KeyAttr is trying to be nice and uses the value of the 'name' tag and the 'id' attribute as the keys to the hash it is generating.
#!/usr/bin/perl
use strict;
use warnings;
use XML::Simple;# qw(:strict);
use Data::Dumper;
my $xml = XMLin('examples/data.xml',
ForceArray => 0,
KeyAttr => [],
);
print Dumper $xml;
XML::Simple dumped
$VAR1 = {
'country' => {
'Israel' => {
'currency' => 'NIS',
'languages' => {
'language' => [
'Hebrew',
'English',
'Arabic',
'Russian'
]
},
'id' => '2'
},
'Hungary' => {
'currency' => 'HUF',
'languages' => {
'language' => 'Hungarian'
},
'id' => '3'
},
'USA' => {
'currency' => 'USD',
'languages' => {
'language' => [
'English',
'Spanish'
]
},
'id' => '1'
}
}
};
XML::Simple processing
#!/usr/bin/perl
use strict;
use warnings;
use XML::Simple qw(XMLin XMLout);
use Data::Dumper;
my $xml = XMLin('examples/data.xml',
ForceArray => 0,
KeyAttr => [],
);
print "List all currencies\n";
# KeyAttr is the default
#foreach my $country_name (keys %{ $xml->{country} }) {
# printf "%-10s %s\n",
# $country_name, $xml->{country}->{$country_name}->{currency};
#}
# KeyAttr => [],
foreach my $country (@{ $xml->{country} }) {
printf "%-10s %s\n", $country->{name}, $country->{currency}
}
#printf "The currency of the USA is %s.\n", $xml->{country}->{USA}->{currency};
foreach my $country (@{ $xml->{country} }) {
if ($country->{name} eq 'USA') {
printf "The currency of the USA is %s.\n", $country->{currency};
last;
}
}
#foreach my $country_name (keys %{ $xml->{country} }) {
# if ($xml->{country}->{$country_name}->{id} == 2) {
# printf "The currency of country id 2 is %s.\n",
# $xml->{country}->{$country_name}->{currency};
# last;
# }
#}
foreach my $country (@{ $xml->{country} }) {
if ($country->{id} == 2) {
printf "The currency of country id 2 is %s.\n",
$country->{currency};
last;
}
}
#$xml->{country}->{Hungary}->{currency} = 'Euro';
foreach my $country (@{ $xml->{country} }) {
if ($country->{name} eq 'Hungary') {
$country->{currency} = 'Euro';
last;
}
}
XMLout($xml,
OutputFile => 'out.xml',
RootName => 'data',
KeyAttr => [],
NoAttr => 1,
);
#KeyAttr => ['id'],
#NoAttr => 0,
XML::Parser
Originally written by Larry Wall, using expat written by James Clark allow several styles of parsing such as Stream, Object, Tree.
XML::Parser tree
#!/usr/bin/perl
use strict;
use warnings;
use XML::Parser;
use Data::Dumper;
my $parser = XML::Parser->new(Style => 'Tree');
my $tree = $parser->parsefile('examples/data.xml');
$Data::Dumper::Indent = 0;
print Dumper $tree;
__END__
print "\n----------------\n";
print "$tree->[0]\n"; # data
print "$tree->[1][3]\n"; # country
print "$tree->[1][4][0]{id}\n"; # 1
print "$tree->[1][4][3]\n"; # name
print "$tree->[1][4][4][2]\n"; # USA
print "$tree->[1][4][7]\n"; # languages
print "$tree->[1][7]\n"; # country
Run the code and see what is dumped out
Read the documentation of XML::Parser (search Tree)
XML::Parser tree results
$VAR1 =
[
'data',
[
{},
0,
'',
'country',
[
{'id' => '1'},
0,
'',
'name',
[
{},
0,
'USA'
],
0,
'',
'languages',
[
{},
0,
'',
'language',
[
{},
0,
'English'
],
0,
'',
'language',
[
{},
0,
'Spanish'
],
0,
''
],
0,
'',
'currency',
[
{},
0,
'USD'
],
0,
''
],
0,
'',
'country',
[
{'id' => '2'},
0,
'',
'name',[{},0,'Israel'],....
],
0,
'',
'country',
[
{'id' => '3'},
0,
'',
'name',[{},0,'Hungary'],....
],
]
];
XML::Parser
#!/usr/bin/perl
use strict;
use warnings;
use XML::Parser;
my $parser = XML::Parser->new(Style => 'Tree');
my $tree = $parser->parsefile('examples/data.xml');
# List all the currencies
foreach my $country (get_tag_subtrees($tree->[1], 'country')) {
my ($name_tree) = get_tag_subtrees($country, 'name');
my $name = $name_tree->[2];
foreach my $currency (get_tag_subtrees($country, 'currency')) {
printf "%-10s %s\n", $name, $currency->[2];
}
}
# get a country with a given name
foreach my $country (get_tag_subtrees($tree->[1], 'country')) {
my ($name_tree) = get_tag_subtrees($country, 'name');
next if $name_tree->[2] ne 'USA';
foreach my $currency (get_tag_subtrees($country, 'currency')) {
printf "The currency in the USA is %s.\n", $currency->[2];
}
}
foreach my $country (get_tag_subtrees($tree->[1], 'country')) {
next if $country->[0]{id} ne '2';
foreach my $currency (get_tag_subtrees($country, 'currency')) {
printf "The currency in the country with id 2 is %s.\n", $currency->[2];
}
}
sub get_tag_subtrees {
my ($tree, $name) = @_;
my @subtrees;
foreach my $i (1.. (@{ $tree }-1) / 2) {
if ($tree->[2*$i-1] eq $name) {
push @subtrees, $tree->[2*$i];
}
}
return @subtrees;
}
XML::LibXML
Low level library, Perl binding to libxml2 supports the standard
XML processing mode called Document Object Mode (DOM).
#!/usr/bin/perl
use strict;
use warnings;
use XML::LibXML;
my $parser = XML::LibXML->new;
my $doc = $parser->parse_file('examples/data.xml');
print $doc;
process_node( $doc->getDocumentElement, 0);
sub process_node {
my ($node, $depth) = @_;
print "$node\n";
return unless( $node->nodeType eq &XML_ELEMENT_NODE );
print " " x $depth, $node->nodeName, "\n";
foreach my $child ( $node->getChildnodes ) {
process_node( $child, $depth+1 );
}
}
Other Node types (taken from the source code)
XML_COMMENT_NODE
XML_TEXT_NODE
XML_CDATA_SECTION_NODE
XML_ELEMENT_NODE
XML_ENTITY_REF_NODE
XML_DOCUMENT_NODE
XML_HTML_DOCUMENT_NODE
XML_DOCUMENT_FRAG_NODE
XML_PI_NODE
XML_XINCLUDE_START
XML_XINCLUDE_END
XML_DTD_NODE
XML::XPath
Built on XML::Parser
#!/usr/bin/perl
use strict;
use warnings;
use XML::XPath;
my $xml = XML::XPath->new(filename => 'examples/data.xml');
my $name_nodes = $xml->find('/data/country');
foreach my $node ($name_nodes->get_nodelist) {
my $nn = $node->find('name');
foreach my $x ($nn->get_nodelist) {
print XML::XPath::XMLParser::as_string($x);
print "\n";
}
}
XML::Twig
#!/usr/bin/perl
use strict;
use warnings;
use XML::Twig;
my $twig = XML::Twig->new(
pretty_print => 'indented',
);
$twig->parsefile('examples/data.xml');
# print the content again (look at the pretty_print option)
# $twig->print;
my $root = $twig->root;
# print $root->name, "\n"; # data
# print $root->gi, "\n"; # data
#print $twig, "\n"; # HASH(0x.....)
#print $root->twig, "\n"; # HASH)0x.....)
#my $elem = $root->first_child;
#print $elem->gi, "\n"; # country
#print $elem->att('id'), "\n"; # 1
# replace the name of the root element by a new name
#$root->set_tag('new data');
#$twig->print;
use strict;
use warnings;
my $t= XML::Twig->new();
$t->parse( '<d><title>title</title><para>p 1</para><para>p 2</para></d>');
my $root= $t->root;
$root->set_tag( 'html'); # change doc to html
$title= $root->first_child( 'title'); # get the title
$title->set_tag( 'h1'); # turn it into h1
my @para= $root->children( 'para'); # get the para children
foreach my $para (@para)
{ $para->set_tag( 'p'); } # turn them into p
$t->print; # output the document
XML::Writer
Close to handwriting of the XML but can also check well formedness.
#!/usr/bin/perl
use strict;
use warnings;
use XML::Writer;
open my $out, '>', 'out.xml' or die;
my $writer = XML::Writer->new( OUTPUT => $out, NEWLINES => 0 );
$writer->startTag('data');
$writer->startTag('country', 'id' => 1);
$writer->startTag('name');
$writer->characters('USA');
$writer->endTag('name');
$writer->endTag('country');
$writer->endTag('data');
$writer->end();
close $out;
XML::Dumper
#!/usr/bin/perl
use strict;
use warnings;
use XML::Dumper;
open my $out, '>', 'out.xml' or die;
my $dumper = XML::Dumper->new();
my $perl = {
people => [
{
name => 'Foo',
phone => 123,
},
{
name => 'Bar',
phone => 456,
},
{
name => 'Baz',
phone => 789,
},
],
};
my $xml = $dumper->pl2xml($perl);
print $xml;
my $perl_again = $dumper->xml2pl( $xml );
use Data::Dumper;
print Dumper $perl, $perl_again;
Exercise: parse glade.xml
{% embed include file="src/examples/xml/entry.glade)
Given the XML file created by Glade write the following functions:
my @names = get_all_widget_names(); # all the names
my @widgets = get_all_widges(); # all the objects
my $widget = get_widget($name); # an object representing the widget
$widget->get_handler($name); # name of the handler function
$widget->get_property($name);
$widget->set_property($name, $value);
DTD validation
Using XML::LibXML;
XML::Checker
Specialized XML modules
SOAP::Lite
Gtk2::GladeXML
XML::RSS
XML::RSS::SimpleGen
XML::Atom
XML::Atom::SimpleFeed
Resources
See CPAN for uncounted number of XML related modules.
The perl-xml mailing list
http://aspn.activestate.com/ASPN/Mail/Browse/Threaded/perl-xml
http://xmltwig.com/
http://perl-xml.sourceforge.net/
http://www.xml.com
"Perl and XML" (first ed) By Erik T. Ray, Jason McIntosh, Published by O'Reilly
Many of our examples are taken from these places either directly or
after some transformation.
Parsing HTML File
HTML::Parser - fetch title
HTML::Parser
use strict;
use warnings;
use HTML::Parser ();
sub start_handler {
my ($elem, $self) = @_;
# print "start $elem\n";
return if $elem ne "title";
$self->handler(text => sub { print shift }, "dtext");
$self->handler(end => sub { shift->eof if shift eq "title"; }, "tagname,self");
}
my $p = HTML::Parser->new(api_version => 3);
$p->handler( start => \&start_handler, "tagname,self");
my $filename = shift || 'title_source.html';
$p->parse_file($filename) or die "Could not parse '$filename' $!";
print "\n";
<head>
<title>This is the title tag</title>
</head>
<body>
cont</body>
HTML::Parser
use strict;
use warnings;
use HTML::Parser;
use Data::Dumper qw(Dumper);
my $p = HTML::Parser->new(
api_version => 3,
start_h => [ \&start, "event, self, tagname, attr, "],
text_h => [ \&text, "event, self, dtext"],
end_h => [ 'end', "event, self, tagname"], # no point in attr
);
my $html = <<'END_HTML';
<body>
<ul>
<li>first elem</li>
<li>second <a href="http://url" id=42>link</a> elem
<li>third <b>bold</b> elem</li>
<!-- <li>commented out elem -->
<li>5th elem</li>
</ul>
<img src="/path/to/img.png" />
</body>
END_HTML
# callback can be either referencfe to subroutine (or anonymous sub)
# or name of sub
# does not call "end" when </li> was missed out, even if new <li> starts
# element that is both opening and closing tag will get a '/' key with a '/' value
# empty_element_tags => 1,
# will remove that entry and generate an end call after the start call
# in the attributes of the opening tag but no call to 'end'
sub start {
print Dumper \@_;
}
sub end {
print Dumper \@_;
}
sub text {
print Dumper \@_;
}
$p->parse($html);
HTML::Encode
use strict;
use warnings;
use HTML::Entities qw(decode_entities encode_entities);
print decode_entities("xéx"), "\n";
print decode_entities("xåx"), "\n";
print encode_entities(decode_entities("xåx")), "\n";
print encode_entities('This expression $x < 3'), "\n";
Autoclose
use strict;
use warnings;
use HTML::Parser ();
sub event_handler {
my ($event, $elem) = @_;
print "$event $elem\n";
}
my $p = HTML::Parser->new(api_version => 3);
$p->handler( start => \&event_handler, "event, tagname");
$p->handler( end => \&event_handler, "event, tagname");
$p->parse('<head><title>abc</title></head>');
$p->eof;
print "----\n";
$p->parse('<head><title>abc</head>');
$p->eof;
print "----\n";
$p->parse('<ul><li>abc</li><li>def</ul>');
$p->eof;
exit;
Appendix
Resources, How to get support for Perl ?
- Perl Maven free Perl tutorial and video recordings.
- PerlMonks - web forum
- Shared blog of Perl developers
- CPAN - Comprehensive Perl Archive Network
- MetaCPAN - Modern search facility for CPAN
- List of mailing lists.
- Perl Mongers - Local Perl user groups.
- ActiveState - Perl release, PDK, Komodo (IDE)
- perl.org
- perl.com
Recommended Books
- Randal L. Schwartz: Lerning Perl - O'Reilly This book covers a bit more than the Fundamentals course
- Modern Perl book
- Tom Christiansen: Perl Cookbook - O'Reilly Huge repository of examples and explanations covering lots of areas where you'll use Perl.
- Larry Wall: Programming Perl - O'Reilly The ultimate source for Perl written by the experts. This is NOT for the beginner !
- Peter J. Scott: Perl Medic (Addison Wesley) For people who need to maintain Perl code or who need to write maintainable Perl code.
- Damian Conway: Perl Best Practices - O'Reilly
More about files
Exercise: Create a counter
Each time I run the script I want to get a higher number.
Exercise: Create multiple counters
Create multiple counters separated by newlines
I run the script like this: counter.pl 3
This will increment the counter #3 by one and print the value.
Exercises: List old logfiles
List all the logfiles in the current directory that are older than 3 days
List all the log files in this directory and subdirectories that
are more than 3 days old.
Solution: Create a counter
#!/usr/bin/perl
use strict;
use warnings;
my $filename = "counter.txt";
if (not -e $filename) {
open my $fh, ">", $filename or die "Could not create counter file: $!";
print $fh 0;
}
open my $fh, "+<", $filename or die "Could not open counter: $!\n";
my $c = <$fh>;
chomp $c;
seek $fh, 0, 0;
truncate $fh, 0;
$c++;
print $c;
print $fh $c;
close $fh;
Solution: Create multiple counters
#!/usr/bin/perl
use strict;
use warnings;
unless (@ARGV) {
print "Usage: $0 <counter-id>\n";
exit;
}
my $id = shift @ARGV;
$id--; # because we index the counters from 1 and the array is from 0
my $filename = "multiple_counter.txt";
if (not -e $filename) {
open my $fh, ">", $filename or die "Could not create counter file: $!";
print $fh 0;
}
open my $fh, "+<", $filename or die "Could not open counter: $!\n";
my @c = <$fh>;
chomp @c;
seek $fh, 0, 0; # move to the beginning of the file
truncate $fh, 0; # cut the file to a certain size
$c[$id]++;
print $c[$id];
foreach my $v (@c) {
if (defined $v) {
print $fh "$v\n";
} else {
print $fh "\n";
}
}
close $fh;
Solutions: List old logfiles
perl -e 'for (<*.log>) {print "$_\n" if -M $_ > 3}'
#!/usr/bin/perl
use strict;
use warnings;
use File::Find 'find';
find({
wanted => \&old_files,
no_chdir => 1,
}, $ARGV[0] || '.');
sub old_files {
if (substr($_, -4) ne ".log") {
return;
}
if (-M $_ > 3) {
print "$_\n";
}
return;
}
Testing Demo for Perl
Testing Demo AUT
AUT = Application under Testing
{% embed include file="src/examples/testing-demo/lib/MyMath.pm)
use strict;
use warnings;
use feature 'say';
use MyMath qw(add);
if (scalar @ARGV != 2) {
die "Usage: $0 X Y\n";
}
say add(@ARGV);
use strict;
use warnings;
use feature 'say';
use MyMath qw(multiply);
if (scalar @ARGV != 2) {
die "Usage: $0 X Y\n";
}
say multiply(@ARGV);
perl -I lib add.pl
perl -I lib multiply.pl
Testing Demo using Test::More (success)
cpanm Test::More
use strict;
use warnings;
use Test::More;
use MyMath qw(add);
is add(2, 2), 4;
done_testing;
$ prove -l t/01-add.t
t/01-add.t .. ok
All tests successful.
Files=1, Tests=1, 0 wallclock secs ( 0.02 usr 0.00 sys + 0.04 cusr 0.00 csys = 0.06 CPU)
Result: PASS
Testing Demo using Test::More (failure)
use strict;
use warnings;
use Test::More;
use MyMath qw(add);
is add(2, 2), 4;
is add(2, 3), 5;
is add(0, 0), 0;
done_testing;
$ prove -l t/02-add.t
t/02-add.t .. 1/?
# Failed test at t/02-add.t line 8.
# got: '6'
# expected: '5'
# Looks like you failed 1 test of 3.
t/02-add.t .. Dubious, test returned 1 (wstat 256, 0x100)
Failed 1/3 subtests
Test Summary Report
-------------------
t/02-add.t (Wstat: 256 Tests: 3 Failed: 1)
Failed test: 2
Non-zero exit status: 1
Files=1, Tests=3, 1 wallclock secs ( 0.01 usr 0.02 sys + 0.04 cusr 0.00 csys = 0.07 CPU)
Result: FAIL
Testing Demo - Test coverage report
cpanm Devel::Cover
cover -delete
HARNESS_PERL_SWITCHES=-MDevel::Cover prove -l t/01-add.t
cover
$ cover
Reading database from .../cover_db
------------- ------ ------ ------ ------ ------ ------ ------
File stmt bran cond sub pod time total
------------- ------ ------ ------ ------ ------ ------ ------
lib/MyMath.pm 84.6 n/a n/a 80.0 0.0 0.0 75.0
t/01-add.t 100.0 n/a n/a 100.0 n/a 99.9 100.0
Total 92.5 n/a n/a 88.8 0.0 100.0 86.8
------------- ------ ------ ------ ------ ------ ------ ------
HTML output written to .../cover_db/coverage.html done.
Open cover_db/coverage.html
.
File processing
Filehandle in scalar and list context
#!/usr/bin/perl
use strict;
use warnings;
my $file = "numbers.txt";
open(my $fh, '<', $file) or die "Could not open '$file'";
# reading in SCALAR context (line by line) and processing each line
while (my $row = <$fh>) {
chomp $row;
print "READ: $row\n";
}
open (my $other_fh, '<', $file) or die "Could not open '$file'";
# reading in LIST context all the lines at once
my @rows = <$other_fh>;
chomp @rows;
print "READ ", scalar(@rows), " lines\n";
slurp mode
#!/usr/bin/perl
use strict;
use warnings;
my $file = "numbers.txt";
# slurp mode
my $all;
{
open(my $fh, '<', $file) or die "Could not open '$file'\n";
local $/ = undef;
$all = <$fh>;
}
Diamond operator
#!/usr/bin/perl
use strict;
use warnings;
while (my $line = <>) {
print $line;
}
File::Slurp
#!/usr/bin/perl
use strict;
use warnings;
use File::Slurp qw(slurp);
my $filename = $ARGV[0];
if (not defined $filename) {
die "Usage: $0 FILENAME\n";
}
my $text = slurp($filename);
my @lines = slurp($filename);
Other
Read CSV file as array of hashes
- csv
The csv function that can be imported from Text::CSV can read a CSV into memory, creating an array of hashes.
Element from the first row will be used as the keys of these hashes and elements from all the other rows will be used as the values of these hashes.
Planet name,Distance (AU),Mass
Mercury,0.4,0.055
Venus,0.7,0.815
Earth,1,1
Mars,1.5,0.107
Ceres,2.77,0.00015
Jupiter,5.2,318
Saturn,9.5,95
Uranus,19.6,14
Neptune,30,17
Pluto,39,0.00218
Charon,39,0.000254
use strict;
use warnings;
use Text::CSV qw(csv);
use Data::Dumper qw(Dumper);
my $filename = 'planets.csv';
my $solar_system = csv(
in => $filename,
headers => 'auto');
print Dumper $solar_system;
$VAR1 = [
{
'Planet name' => 'Mercury',
'Mass' => '0.055',
'Distance (AU)' => '0.4'
},
{
'Planet name' => 'Venus',
'Mass' => '0.815',
'Distance (AU)' => '0.7'
},
{
'Distance (AU)' => '1',
'Planet name' => 'Earth',
'Mass' => '1'
},
{
'Planet name' => 'Mars',
'Mass' => '0.107',
'Distance (AU)' => '1.5'
},
{
'Planet name' => 'Ceres',
'Mass' => '0.00015',
'Distance (AU)' => '2.77'
},
{
'Distance (AU)' => '5.2',
'Mass' => '318',
'Planet name' => 'Jupiter'
},
{
'Distance (AU)' => '9.5',
'Mass' => '95',
'Planet name' => 'Saturn'
},
{
'Distance (AU)' => '19.6',
'Mass' => '14',
'Planet name' => 'Uranus'
},
{
'Mass' => '17',
'Planet name' => 'Neptune',
'Distance (AU)' => '30'
},
{
'Distance (AU)' => '39',
'Mass' => '0.00218',
'Planet name' => 'Pluto'
},
{
'Mass' => '0.000254',
'Planet name' => 'Charon',
'Distance (AU)' => '39'
}
];
Sort array using sort and sort_by
use strict;
use warnings;
use List::UtilsBy qw(sort_by);
my @numbers = (2, -3, 4);
my @sorted = sort @numbers;
print "@sorted\n";
@sorted = sort { abs($a) <=> abs($b) } @numbers;
print "@sorted\n";
@sorted = sort_by { abs($_) } @numbers;
print "@sorted\n";
use strict;
use warnings;
use List::UtilsBy qw(sort_by);
sub func {
my ($val) = @_;
print "func $val\n";
return $val;
}
my @sorted = sort { func($a) <=> func($b) } (4, 3, 2, 1);
print "@sorted\n";
print "\n"; # looks like func is called n * log(n)
@sorted = sort_by { func($_) } (4, 3, 2, 1);
print "@sorted\n";
print "\n"; # func is only called n times
use strict;
use warnings;
use Text::CSV qw(csv);
use Data::Dumper qw(Dumper);
use List::UtilsBy qw(sort_by);
my $filename = 'planets.csv';
my $solar_system = csv(
in => $filename,
headers => 'auto');
#print Dumper $solar_system;
#my @sorted = sort_by { $_->{'Planet name'} } @$solar_system;
my @sorted = sort_by { $_->{'Mass'} } @$solar_system;
print Dumper \@sorted;
my @sorted = sort { $a->{'Planet name'} cmp $b->{'Planet name'} }
@$solar_system;
Range iterator
use strict;
use warnings;
use Range::Iter qw(range_iter);
my $iter = range_iter(1, 10, 2);
while (my $val = $iter->()) {
print "$val\n"; # 1, 3, 5, 7, 9
}
Benchmark
use strict;
use warnings;
use 5.010;
use Benchmark qw(:hireswallclock);
#say(timeit(1, sub { count(1000) })->real);
#say(timeit(1, sub { count(10000) })->real);
#timethis(1, sub { count(10000) }); # (warning: too few iterations for a reliable count)
#timethis(100, sub { count(1000) });
timethese(1, {
'1' => sub { count(1000) },
'10' => sub { count(10000) },
});
sub count {
my ($n) = @_;
for my $i (1 .. $n) {
for my $j (1 .. $n) {
my $square = $i+$j;
}
}
}
Bubble sort
use strict;
use warnings;
use 5.010;
use Benchmark qw(:hireswallclock);
my @numbers = (1 .. 4);
print "@numbers\n";
my @sorted = bubble_sort(@numbers);
print "@sorted\n";
#timethese(1, {
# '1000' => sub { bubble_sort(1..1000) },
# '10000' => sub { bubble_sort(1..10000) },
#});
#say(timeit(10, sub { bubble_sort(1..1000) })->real);
sub bubble_sort {
my @items = @_;
for my $ix (0 .. $#items) {
for my $jx (0 .. $#items-$ix-1) {
if ($items[$jx] < $items[$jx+1]) {
($items[$jx], $items[$jx+1]) = ($items[$jx+1], $items[$jx]);
print "@items\n";
}
}
}
return @items;
}
Insert sort
use strict;
use warnings;
use 5.010;
use Benchmark qw(:hireswallclock);
my @numbers = (1 .. 4);
print "@numbers\n";
my @sorted = insert_sort(@numbers);
print "@sorted\n";
#timethese(1, {
# '1000' => sub { insert_sort(1..1000) },
# '10000' => sub { insert_sort(1..10000) },
#});
#say(timeit(10, sub { insert_sort(1..1000) })->real);
sub insert_sort {
my @items = @_;
for my $ix (1 .. $#items) {
my $current_item = $items[$ix];
my $jx = $ix - 1;
while ($jx >= 0 and $current_item > $items[$jx]) {
$items[$jx+1] = $items[$jx];
$jx--;
}
$items[$jx+1] = $current_item;
#print "@items\n";
}
return @items;
}
Merge sort
use strict;
use warnings;
use 5.010;
use Benchmark qw(:hireswallclock);
my @numbers = (1 .. 4);
print "@numbers\n";
my @sorted = merge_sort(@numbers);
print "@sorted\n";
#timethese(1, {
# '1000' => sub { merge_sort(1..1000) },
# '10000' => sub { merge_sort(1..10000) },
# '100000' => sub { merge_sort(1..100000) },
#});
#say(timeit(10, sub { merge_sort(1..1000) })->real);
sub merge_sort {
my @items = @_;
#print "@items\n";
if (scalar(@items) <= 1) {
return @items;
}
my $half = int(scalar(@items) / 2);
my @left = merge_sort(@items[0 .. $half-1]);
my @right = merge_sort(@items[$half .. $#items]);
my @sorted_items = merge_sorted_arrays(\@left, \@right);
return @sorted_items;
}
sub merge_sorted_arrays {
my ($left, $right) = @_;
my @sorted_items;
my ($i, $j) = (0, 0);
while ($i < @$left and $j < @$right) {
if ($left->[$i] > $right->[$j]) {
push @sorted_items, $left->[$i];
$i++;
} else {
push @sorted_items, $right->[$j];
$j++;
}
}
if ($i < @$left) {
push @sorted_items, @$left[$i .. $#$left];
} else {
push @sorted_items, @$right[$j .. $#$right];
}
return @sorted_items;
}
- It can run out of the recursion limit of Perl
Quicksort
use strict;
use warnings;
use 5.010;
use Benchmark qw(:hireswallclock);
my @numbers = (1 .. 4);
print "@numbers\n";
my @sorted = quick_sort(@numbers);
print "@sorted\n";
#timethese(1, {
# '1000' => sub { quick_sort(1..1000) },
# '10000' => sub { quick_sort(1..10000) },
# '100000' => sub { quick_sort(1..100000) },
#});
#
#say(timeit(10, sub { quick_sort(1..1000) })->real);
sub quick_sort {
my @items = @_;
#print "@items\n";
if (scalar(@items) <= 1) {
return @items;
}
my $pivot = int(scalar(@items) / 2); # this could be the first element or a random element.
my @left = grep { $_ > $items[$picot] } @items;
my @right = grep { $_ < $items[$picot] } @items;
my @same = grep { $_ == $items[$picot] } @items;
return quick_sort(@left), @same, quick_sort(@right);
}
- The worst-case is if we happen to select the smallest element every time, then the complexity is O(n^2)
- It can run out of the recursion limit of Perl
Timsort
- Timsort
- Created for Python by Tim Peters
Traverse directory tree
use strict;
use warnings;
use File::Spec::Functions qw(catfile);
my ($name) = @ARGV;
die "Usage: $0 DIR/FILE\n" if not $name;
my $result = dir_walk($name);
sub dir_walk {
my ($name) = @_;
if (-f $name) {
print "$name\n";
return;
}
if (-d $name) {
if (opendir my $dh, $name) {
while (my $subname = readdir $dh) {
next if $subname eq '.' or $subname eq '..';
my $resuld = dir_walk(catfile($name, $subname));
}
} else {
warn "Could not open dir '$name'";
}
return;
}
warn "The '$name' is not a file and not a directory. Skipping.";
return;
}
Traverse directory tree with call back
use strict;
use warnings;
use File::Spec::Functions qw(catfile);
my ($name) = @ARGV;
die "Usage: $0 DIR/FILE\n" if not $name;
#my $result = dir_walk($name, \&print_name);
#
#sub print_name {
# print "$_[0]\n";
#}
{
my $total = 0;
sub collect {
my ($filename) = @_;
$total += -s $filename;
}
sub total {
return $total;
}
}
my $result = dir_walk($name, \&collect);
print total(), "\n";
sub dir_walk {
my ($name, $cb) = @_;
if (-f $name) {
$cb->($name);
return;
}
if (-d $name) {
if (opendir my $dh, $name) {
while (my $subname = readdir $dh) {
next if $subname eq '.' or $subname eq '..';
my $resuld = dir_walk(catfile($name, $subname), $cb);
}
} else {
warn "Could not open dir '$name'";
}
return;
}
warn "The '$name' is not a file and not a directory. Skipping.";
return;
}
Linear search in unorderd array
use strict;
use warnings;
my @planets = qw(Mercury Venus Earth Mars Jupiter Saturn);
sub search {
my ($name, @items) = @_;
for my $ix (0 .. $#items) {
if ($items[$ix] eq $name) {
return $ix;
}
}
return;
}
print search('Mars', @planets), "\n";
print search('Pluto', @planets), "\n";
Binary search in sorted array
use strict;
use warnings;
my @planets = qw(Earth Jupiter Mars Mercury Saturn Venus);
sub search {
my ($name, @items) = @_;
my $left = 0;
my $right = $#items;
while ($left < $right) {
my $current = $left + int(($right-$left)/2);
if ($items[$current] lt $name) {
$left = $current+1;
next;
}
if ($items[$current] gt $name) {
$right = $current-1;
next;
}
return $current;
}
return;
}
print search('Mars', @planets), "\n";
print search('Pluto', @planets), "\n";
# If there are duplicate matching elements then depending on the length of the array
# this might return the index of any of those items.
Convert curl command with -F to LWP::UserAgent
curl -X POST -F field=value -F name=Corion "https://httpbin.org/post" -H "accept: application/json"
use strict;
use warnings;
use LWP::UserAgent ();
use HTTP::Request::Common qw(POST);
my $ua = LWP::UserAgent->new(timeout => 10);
my $url = 'https://httpbin.org/post';
my %content = (
field => 'value',
name => 'Corion',
);
my $request = POST $url, 'Content-Type' => 'form-data', Content => \%content;
my $response = $ua->request($request);
print $response->decoded_content;
Modify time anomality in two files
use strict;
use warnings;
use feature 'say';
use Time::HiRes qw(sleep time stat);
{
open(my $fh, '>', 'first.txt');
$fh->flush;
close $fh;
}
say "before ", time;
#sleep(0.0006);
say "after ", time;
{
open(my $fh, '>', 'second.txt');
$fh->flush;
close $fh;
}
my $first = (stat('first.txt'))[9];
my $second = (stat('second.txt'))[9];
say "first $first";
say "second $second";
say $first == $second ? "same" : "different";
use strict;
use warnings;
use feature 'say';
use Time::HiRes qw(stat);
{
open(my $fh, '>', 'first.txt');
}
{
open(my $fh, '>', 'second.txt');
$fh->flush;
close $fh;
}
my $first = (stat('first.txt'))[9];
my $second = (stat('second.txt'))[9];
say "first $first";
say "second $second";
say $first == $second ? "same" : "different";