Nicodème Paul
Nicodeme.paul@unibas.ch
http://www2.biozentrum.unibas.ch/personal/schwede/Teaching/BixI-WS0607/frame.htm
01-11-06 1
What is programming?
Introduction to programming in Perl WS 2006/07: Bioinformatics I
Sum: 15 + 25 + 11 = ?
15 + 25 + 11
= 40 + 11
= 51
Program translator
[Figure: a program is translated into machine code (0101011) by a compiler or an interpreter, and runs on the computer's processor and memory.]
What is Perl?
• Text-processing language
• Glue language
Why do we use Perl?
• Simplicity
• Rapid prototyping
• Portability
A first example
# Pragmas
use strict; # Restrict unsafe constructs
use warnings; # Provide helpful diagnostics
# Assign 15 to $number1
my $number1 = 15;
# Assign 25 to $number2
my $number2 = 25;
# Assign 11 to $number3
my $number3 = 11;
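The example above stops after the three assignments. A minimal sketch of how it might continue, assuming the goal is the sum from the opening slide (15 + 25 + 11); the variable name $sum is an assumption and does not come from the slides:

```perl
use strict;
use warnings;

my $number1 = 15;
my $number2 = 25;
my $number3 = 11;

# Add the three scalars; $sum is a hypothetical name chosen for this sketch
my $sum = $number1 + $number2 + $number3;

print "Sum: $sum\n";    # prints "Sum: 51"
```

Saved as a file, the sketch can be run with "perl filename.pl".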
Scalar Data Type
A scalar variable holds a single value; its name starts with $ (think S for scalar).
$u = 17; $v = 3; $s = "Perl";
Operation        Expression   Result
Division         $u / $v      17 / 3 = 5.66666666667
Modulus          $u % $v      17 % 3 = 2
Exponentiation   $u ** $v     17 ** 3 = 4913
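The numeric rows of the table can be checked with a short script. This is only a sketch; the names $quotient, $remainder, $power and $whole are assumptions made for illustration:

```perl
use strict;
use warnings;

my $u = 17;
my $v = 3;

my $quotient  = $u / $v;        # floating-point division
my $remainder = $u % $v;        # integer remainder: 2
my $power     = $u ** $v;       # 17 * 17 * 17 = 4913
my $whole     = int($u / $v);   # drop the fractional part: 5

printf "%d / %d = %.4f\n", $u, $v, $quotient;   # prints "17 / 3 = 5.6667"
print "$u % $v = $remainder\n";
print "$u ** $v = $power\n";
print "int($u / $v) = $whole\n";
```

Note that / always performs floating-point division in Perl; int() is one way to get the whole-number part.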
Scalar Unary Operators
Numbers Strings
abs(expr) uc(expr)
sqrt(expr) lc(expr)
exit(expr) chop(variable)
exp(expr) chomp(variable)
int(expr) reverse(expr)
log(expr) length(expr)
→ perldoc -f function_name
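A few of the functions listed above, applied to sample values. The values and variable names are assumptions chosen for this sketch:

```perl
use strict;
use warnings;

my $number = -16.7;
my $text   = "Perl\n";

my $absolute = abs($number);     # 16.7
my $integer  = int($number);     # -16 (truncates toward zero)
my $root     = sqrt(16);         # 4

chomp($text);                    # removes the trailing newline: $text is now "Perl"
my $upper    = uc($text);        # "PERL"
my $lower    = lc($text);        # "perl"
my $size     = length($text);    # 4
my $backward = reverse($text);   # scalar assignment, so the string is reversed: "lreP"

print "$absolute $integer $root\n";
print "$upper $lower $size $backward\n";
```

reverse() behaves differently in list context, where it reverses the order of the list elements instead of the characters of a string.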
Context
$u = "12" + 5;
→ 17
$u = "12john" + 5;
→ 17
$u = "john12" + 5;
→ 5
use warnings;
$u = "john12" + 5;
→ Argument "john12" isn't numeric in addition (+) at line 3
→ 5
$u = "12" + 5;
→ 17
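The conversions above can be reproduced in one script. This sketch collects the warnings in an array instead of printing them; the handler and the names @warnings and %sum_for are assumptions made for illustration:

```perl
use strict;
use warnings;

# Collect warnings instead of printing them to STDERR
my @warnings;
local $SIG{__WARN__} = sub { push @warnings, $_[0] };

my %sum_for;
for my $text ("12", "12john", "john12") {
    # In numeric context Perl uses the leading digits of the string, or 0 if there are none
    $sum_for{$text} = $text + 5;
    print qq{"$text" + 5 = $sum_for{$text}\n};
}

print "Warnings raised: ", scalar(@warnings), "\n";
```

With warnings enabled, both "12john" and "john12" trigger an "isn't numeric" warning, while "12" converts silently.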
Array data type
Index   Value
0       35
1       12.4
2       "bye\n"
3       1.7e23
4       'Hi'
$data[0] = 35; $data[1] = 12.4; $data[2] = "bye\n"; $data[3] = 1.7e23; $data[4] = 'Hi';
An array variable name starts with @ (think a for array).
Array operators
→ perldoc -f function_name
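The slide's list of array operators did not survive, so the following is only a sketch using a few standard Perl array functions (push, pop, shift, unshift, sort, join) on the @data array from the previous slide; the choice of functions is an assumption:

```perl
use strict;
use warnings;

my @data = (35, 12.4, "bye\n", 1.7e23, 'Hi');

my $count      = scalar(@data);   # number of elements: 5
my $last_index = $#data;          # index of the last element: 4

push(@data, "new");               # append at the end
my $popped  = pop(@data);         # remove from the end: "new"
my $shifted = shift(@data);       # remove from the front: 35
unshift(@data, $shifted);         # put it back at the front

my @sorted = sort { $a <=> $b } (11, 3, 5);   # numeric sort
my $joined = join(",", @sorted);              # "3,5,11"

print "Count: $count, last index: $last_index\n";
print "Popped: $popped, shifted: $shifted\n";
print "Sorted: $joined\n";
```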
Search for a name
#!/usr/bin/perl
use strict;
use warnings;
Comparison operators
What is true or false?
• Anything else (any value that does not convert to the empty string, the string "0", or the number 0) is true.
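A small sketch that tests a handful of values for truth. The list of candidate values and the helper name is_true are assumptions chosen for illustration:

```perl
use strict;
use warnings;

# Hypothetical helper: report whether Perl treats a value as true
sub is_true {
    my ($value) = @_;
    return $value ? 1 : 0;
}

for my $value (0, "0", "", "0.0", "00", " ", "a", 1, -1) {
    my $verdict = is_true($value) ? "true" : "false";
    print qq{"$value" is $verdict\n};
}

# An undefined value is also false
my $undefined;
print "undef is ", (is_true($undefined) ? "true" : "false"), "\n";
```

The strings "0.0" and "00" are true: only the exact string "0" and the empty string are false among strings.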
Logical operators
! Use parentheses !
Conditional statements
• Simple
Statement if (Expression);
• Compound
if (Expression) Block
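Both forms side by side, as a minimal sketch. The variable $count and the messages are assumptions, and the else branch is standard Perl that is not shown on the slide:

```perl
use strict;
use warnings;

my $count = 7;
my @messages;

# Simple form: a single statement followed by the condition
push(@messages, "$count is positive") if ($count > 0);

# Compound form: the condition followed by a block
if ($count % 2 == 0) {
    push(@messages, "$count is even");
}
else {
    push(@messages, "$count is odd");
}

print "$_\n" for @messages;
```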
#!/usr/bin/perl
use strict;
use warnings;
Check for a name
#!/usr/bin/perl
use strict;
use warnings;
# Assumed example list of names (the declaration is missing from the slide)
my @names = ("John", "Peter", "Simon", "Dave", "Chris");
for (my $i = 0; $i < scalar(@names); $i = $i + 1) { # block start for the for loop
    if ($names[$i] eq "Simon") {
        print "Simon is found\n";
    }
} # end block for the for loop
#!/usr/bin/perl
use strict;
use warnings;
# Assumed example list of names (the declaration is missing from the slide)
my @names = ("John", "Peter", "Simon", "Dave", "Chris");
for (my $i = 0; $i < scalar(@names); $i = $i + 1) { # block start for the for loop
    if ($names[$i] eq "Simon") {
        print "Simon is found\n";
        last; # jump outside of the loop
    }
} # end block for the for loop
Loop statements
• Simple
• Compound
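The slide's examples did not survive, so this is only a sketch of the two forms using standard Perl loops; the @names list and the other variable names are assumptions:

```perl
use strict;
use warnings;

my @names = ("John", "Peter", "Simon");

# Simple form: one statement followed by a loop modifier
print "Hello $_\n" for (@names);

# Compound form: while loop with a block
my $i = 0;
while ($i < scalar(@names)) {
    print "$i: $names[$i]\n";
    $i = $i + 1;
}

# Compound form: foreach loop with a block
my $total_length = 0;
foreach my $name (@names) {
    $total_length += length($name);
}
print "Total length: $total_length\n";   # 4 + 5 + 5 = 14
```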
Hashes
Key     Value
John    5
Peter   3
Simon   11
Dave    1
Chris   4
%names
$names{"Dave"} = 1; $names{"Chris"} = 4;
A hash variable name starts with % (key/value pairs).
Check for a name
#!/usr/bin/perl
use strict;
use warnings;
if (exists $names{$key}) { # exists returns true if the key is in %names, otherwise false
    print "$key is found, his value is : $names{$key}\n";
}
else {
    print "$key is not found\n";
}
#!/usr/bin/perl
use strict;
use warnings;
my %names = (
    "John" => 5,
    "Peter" => 3,
    "Simon" => 11,
    "Dave" => 0,
    "Chris" => 4
);
my $key = "Simon";
if (exists $names{$key}) {
    print "$key is found, his value is : $names{$key}\n";
}
else {
    print "$key is not found\n";
}
Hash operators
→ perldoc -f function_name
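The slide's list of hash operators did not survive. The sketch below uses a few standard Perl hash functions (keys, exists, delete) on the %names hash from the earlier example; the choice of functions and the extra variable names are assumptions:

```perl
use strict;
use warnings;

my %names = (John => 5, Peter => 3, Simon => 11, Dave => 0, Chris => 4);

# keys returns the keys in no particular order, so sort them for stable output
for my $name (sort keys %names) {
    print "$name => $names{$name}\n";
}

my $count = scalar(keys %names);                # number of key/value pairs: 5

# exists tests for the key itself, not for the truth of its value
my $dave_exists = exists $names{Dave} ? 1 : 0;  # 1, although the value is 0
my $dave_true   = $names{Dave} ? 1 : 0;         # 0, because the value 0 is false

delete $names{Peter};                           # remove one key/value pair
my $peter_exists = exists $names{Peter} ? 1 : 0;

print "Pairs: $count, Dave exists: $dave_exists, Dave true: $dave_true\n";
print "Peter exists after delete: $peter_exists\n";
```

Testing $names{$key} directly would wrongly report Dave as missing, which is why the earlier example uses exists.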
#!/usr/bin/perl
use strict;
use warnings;
my $line;
print "Type something : ";
→ Ctrl-C to exit
Reading from a file
#!/usr/bin/perl
use strict;
use warnings;
print @names;
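Only the last line of the example above survived. A minimal sketch of reading a file into @names, assuming one name per line; the temporary file created with the core module File::Temp and its contents are assumptions made so that the sketch is self-contained:

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Create a small throwaway input file (hypothetical contents)
my ($out, $filename) = tempfile(UNLINK => 1);
print {$out} "John\nPeter\nSimon\n";
close($out) or die "Could not close $filename: $!\n";

# Open the file for reading and slurp all lines into an array
open(my $in, '<', $filename) or die "Could not open $filename: $!\n";
my @names = <$in>;    # list context: one element per line, newlines kept
close($in);

print @names;
print "Read ", scalar(@names), " lines\n";
```

The sketch uses a lexical filehandle and the three-argument form of open; the bareword handle IN used elsewhere in these slides works the same way for reading.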
#!/usr/bin/perl
use strict;
use warnings;
Input and output functions
→ perldoc -f function_name
Testing files
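The content of this slide did not survive. As a sketch only, here are four of Perl's standard file test operators (-e, -f, -r, -s); which operators the slide listed is not known, and the file names below are assumptions:

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Create a small throwaway file to test (hypothetical contents)
my ($fh, $filename) = tempfile(UNLINK => 1);
print {$fh} "ATTGTC\n";
close($fh) or die "Could not close $filename: $!\n";

my $missing = "$filename.missing";   # a path that should not exist

for my $path ($filename, $missing) {
    if (-e $path) {                  # -e: the path exists
        print "$path exists\n";
        print "  it is a plain file\n" if (-f $path);   # -f: regular file
        print "  it is readable\n"     if (-r $path);   # -r: readable by this user
        print "  it is not empty\n"    if (-s $path);   # -s: size greater than zero
    }
    else {
        print "$path does not exist\n";
    }
}
```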
Regular expressions
Input file : data.txt
>id1
ATTGTC
>id2
GGTCCT
>id3
TATGAAA
>id4
GTGTATA

#!/usr/bin/perl
use strict;
use warnings;
my $filename = "data.txt";
my $line;
my %data = ();
my $key;
open(IN, $filename) || die "Could not open $filename\n";
while ($line = <IN>) {
    chomp($line);
    if ($line =~ /^>/) { # check for ids using pattern matching
        $key = $line;
    }
    else {
        $data{$key} = $line;
    }
}
close(IN);
my @ids = keys %data;
my @sequences = values %data;
$, = " ";
print @ids, "\n", @sequences, "\n";
Regular expressions
EXPR =~ m/PATTERN/
m// Operator (Matching): searches the string in the scalar EXPR (or $_) for PATTERN; in scalar context the operator returns true (1) if successful, false ("") otherwise; in list context m// returns a list of substrings matched by any capturing parentheses in PATTERN; PATTERN undergoes double-quote interpolation.
Example: $line = ">id1" => $line =~ /^>/

VAR =~ s/PATTERN/REPLACEMENT/
s/// Operator (Substitution): searches the string in scalar variable VAR (or $_) for PATTERN and, if found, replaces the matched substring with the REPLACEMENT text; in scalar and list context s/// returns the number of times it succeeded; both PATTERN and REPLACEMENT undergo double-quote interpolation.
Example: $line = ">id1" => $line =~ s/>//

VAR =~ tr/SEARCHLIST/REPLACEMENTLIST/
tr/// Operator (Transliteration): scans the string in scalar variable VAR (or $_), character by character, and replaces each occurrence of a character found in SEARCHLIST with the corresponding character in REPLACEMENTLIST; in scalar and list context tr/// returns the number of characters replaced or deleted; SEARCHLIST is NOT a regular expression, and neither SEARCHLIST nor REPLACEMENTLIST undergoes full double-quote interpolation (backslash sequences are processed, but variables are not interpolated).
Example: $line = "id1" => $line =~ tr/a-z/A-Z/
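The three operators applied in sequence to the header line ">id1", as a minimal sketch. The variable names that hold the return values are assumptions made for illustration:

```perl
use strict;
use warnings;

my $line = ">id1";

# m//: scalar context returns true if the pattern matches
my $is_header = ($line =~ /^>/) ? 1 : 0;

# s///: returns the number of substitutions made; $line becomes "id1"
my $substitutions = ($line =~ s/>//);

# tr///: returns the number of characters replaced; $line becomes "ID1"
my $replaced = ($line =~ tr/a-z/A-Z/);

# m// in list context returns the text captured by the parentheses
my ($digits) = ($line =~ /(\d+)/);

print "header: $is_header, substitutions: $substitutions, replaced: $replaced\n";
print "line: $line, digits: $digits\n";
```

tr/// counts only the two lower-case letters here, because the digit 1 is not in the search list.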
Regular expressions
Input file : data.txt (as in the previous example)

#!/usr/bin/perl
use strict;
use warnings;
my $filename = "data.txt";
my $line;
my %data = ();
my $key;
open(IN, $filename) || die "Could not open $filename\n";
while ($line = <IN>) {
    chomp($line);
    if ($line =~ /^>/) { # check for ids using pattern matching
        $line =~ s/>//; # substitute > by nothing in id
        $line =~ tr/a-z/A-Z/; # translate lower case to upper case
        $key = $line;
    }
    else {
        $data{$key} = $line;
    }
}
close(IN);
my @ids = keys %data;
my @sequences = values %data;
$, = " ";
print @ids, "\n", @sequences, "\n";
Regular expressions
Symbol    Meaning
\...      Used to escape metacharacters (including itself) or to make the next character a metacharacter (like \s, \w, \n)
...|...   Alternation (match one or the other)
(...)     Grouping (treat as a unit)
[...]     Character class (match one character from a set)
^         True at the beginning of string (or sometimes after any newline)
$         True at the end of the string (or sometimes before any newline)
.         Match any one character (except newline, normally)
Example: $seq =~ /AAA$/
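A sketch that applies several of these symbols to the sequences from data.txt. The patterns other than /AAA$/ and the hash %found_in are assumptions chosen for illustration:

```perl
use strict;
use warnings;

my @sequences = ("ATTGTC", "GGTCCT", "TATGAAA", "GTGTATA");

# Record which sequences match each pattern (hypothetical bookkeeping hash)
my %found_in;

for my $seq (@sequences) {
    push(@{ $found_in{"ends with AAA"} },    $seq) if ($seq =~ /AAA$/);     # $ anchors at the end
    push(@{ $found_in{"starts with A/G"} },  $seq) if ($seq =~ /^[AG]/);    # ^ and a character class
    push(@{ $found_in{"has TGA or TGT"} },   $seq) if ($seq =~ /TG(A|T)/);  # grouping with alternation
    push(@{ $found_in{"has G, any, G"} },    $seq) if ($seq =~ /G.G/);      # . matches any one character
}

for my $label (sort keys %found_in) {
    print "$label: @{ $found_in{$label} }\n";
}
```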
Regular expressions
Quantifier Meaning
* Match 0 or more times (maximal)
+ Match 1 or more times (maximal)
? Match 0 or 1 time (maximal)
{COUNT} Match exactly COUNT times
{MIN,} Match at least MIN times (maximal)
{MIN,MAX} Match at least MIN times but not more than MAX times (maximal)
*? Match 0 or more times (minimal)
+? Match 1 or more times (minimal)
?? Match 0 or 1 time (minimal)
{MIN,}? Match at least MIN times (minimal)
{MIN,MAX}? Match at least MIN times but not more than MAX times (minimal)
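The difference between maximal and minimal matching shows up when the matched text is captured. A minimal sketch; the sequence and the variable names are assumptions:

```perl
use strict;
use warnings;

my $seq = "GGAAAATT";

my ($maximal)       = ($seq =~ /(A+)/);        # as many A's as possible: "AAAA"
my ($minimal)       = ($seq =~ /(A+?)/);       # as few A's as possible: "A"
my ($range_maximal) = ($seq =~ /(A{2,3})/);    # between 2 and 3, prefers 3: "AAA"
my ($range_minimal) = ($seq =~ /(A{2,3}?)/);   # between 2 and 3, prefers 2: "AA"
my ($exact)         = ($seq =~ /(G{2})/);      # exactly 2: "GG"

print "A+      : $maximal\n";
print "A+?     : $minimal\n";
print "A{2,3}  : $range_maximal\n";
print "A{2,3}? : $range_minimal\n";
print "G{2}    : $exact\n";
```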
Regular expressions
\D Nondigit [^0-9]
\s Whitespace [ \t\n\r\f]
\S Nonwhitespace [^ \t\n\r\f]
Subroutines or functions
Input file : data1.txt
>id1
ATTGTC
>id2
GGTCCT
>id3
TATGAAA
>id4
GTGTATA

Input file : data2.txt
>id5
ATAAAAA
>id6
GGAATTT
>id7
TATGATT
>id8
GTGTAAT

#!/usr/bin/perl
use strict;
use warnings;
my $filename1 = "data1.txt";
my $filename2 = "data2.txt";
my %data1 = get_data($filename1); # subroutine call
my %data2 = get_data($filename2); # subroutine call
$, = " ";
print keys %data1, "\n", values %data1, "\n";
print keys %data2, "\n", values %data2, "\n";
sub get_data {
    my $filename = shift(@_);
    my $key;
    my %tmp = ();
    open(IN, $filename) || die "Could not open $filename\n";
    while (my $line = <IN>) {
        chomp($line);
        if ($line =~ /^>/) {
            $line =~ s/>//;
            $line =~ tr/a-z/A-Z/;
            $key = $line;
        }
        else {
            $tmp{$key} = $line;
        }
    }
    close(IN);
    return %tmp;
}
Packages
• Recommended Books
– Beginner
» "Learning Perl", 4th Edition, by Randal Schwartz, Tom Phoenix & Brian D Foy