PERL
AGENDA
Introduction
How pattern matching works
Pattern Matching Operators
Special Characters in Pattern Matching
Pattern matching options
Pattern Substitution
Translation
Introduction
A pattern (or expression) is a sequence of characters to be searched for in a character string. In Perl, patterns are normally enclosed in slash characters. Ex - /def/
if PATTERN is found
The matching operator (m//) is used to find patterns in strings. One of its more common uses is to look for a specific string inside a data file. By default the matching operator searches the $_ variable; to search another variable, bind it to the pattern with =~. The standard delimiters are slashes, as in m/ /, and with slashes the leading m is optional. Other delimiters can be used, as in m!!, which is convenient when matching a path because the slashes in the path then need no escaping.
Refer reg.pl
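As a minimal sketch of the two points above (the default search of $_ and the alternative m!! delimiters), the following loops over two made-up path strings. The sample paths and the @hits array are assumptions for illustration and are not taken from reg.pl.

```perl
use strict;
use warnings;

# Hypothetical sample data; these paths are invented for the example.
my @lines = ('/usr/local/bin/perl', '/home/user/notes.txt');

my @hits;
for (@lines) {    # each element is aliased to $_ in turn
    # m!...! is matched against $_ and needs no escaping of the slashes
    push @hits, $_ if m!/usr/local!;
}

print "matched: @hits\n";
```

With slash delimiters the same pattern would have to be written as /\/usr\/local/, which is harder to read.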
Perl defines special operators that test whether a particular pattern appears in a character string.
The =~ operator tests whether a pattern is matched. Ex - $result = $var =~ /abc/; In this example, the value stored in the scalar variable $var is searched for the pattern abc. If abc is found, $result is assigned a true value (1); otherwise, $result is assigned a false value (the empty string, which counts as zero in numeric context).
The !~ operator is similar to =~, except that it checks whether a pattern is not matched. Because =~ and !~ produce either true or false as their result, these operators are ideally suited for use in conditional expressions.
Refer reg1.pl,reg5.pl
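The two binding operators can be sketched as follows. The string in $var and the variable names $found_abc and $lacks_def are assumptions chosen for the example; they do not come from reg1.pl or reg5.pl.

```perl
use strict;
use warnings;

my $var = 'xyzabc';    # hypothetical test string

# =~ is true when the pattern is found somewhere in $var
my $found_abc = ($var =~ /abc/) ? 1 : 0;

# !~ is true when the pattern is NOT found in $var
my $lacks_def = ($var !~ /def/) ? 1 : 0;

if ($var =~ /abc/) {
    print "abc is present\n";
}
if ($var !~ /def/) {
    print "def is absent\n";
}
```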
Special Characters in Pattern Matching
Variable Interpolation
Character Sequences
Alternation
Character Classes
Symbolic Character Classes
Anchors
Quantifiers
Pattern Memory
Word Boundaries
Quoting Meta-Characters
Variable Interpolation
Any variable that appears in a pattern is interpolated first, and the resulting pattern is then evaluated as a regular expression.
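A small sketch of interpolation, assuming two made-up variables ($word and $pat): the variable's contents are substituted into the pattern first, and any meta-characters they contain still act as meta-characters.

```perl
use strict;
use warnings;

my $word = 'cat';    # hypothetical search word
my $pat  = 'c.t';    # the dot stays a meta-character after interpolation

# /$word/ behaves like /cat/
my $in_text = ('concatenate' =~ /$word/) ? 'yes' : 'no';

# /$pat/ behaves like /c.t/, so it also matches "cut"
my $dot_hit = ('cut' =~ /$pat/) ? 'yes' : 'no';

print "cat in concatenate: $in_text\n";
print "c.t matches cut: $dot_hit\n";
```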
Character Sequences
A sequence of characters will match the identical sequence in the searched string. Ex - /def/ matches def, but not efd or dfe.
Alternation (|)
The alternation meta-character (|) lets us match more than one possible string.
Ex - m/a|b/; will match if either the "a" character or the "b" character is in the searched string.
You can use sequences of more than one character with alternation.
Ex - m/dog|cat/; will match if either of the strings "dog" or "cat" is in the searched string.
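The dog|cat example can be tried with a short sketch like the one below. The candidate strings are invented for illustration; note that the alternatives match anywhere in the string, including inside longer words.

```perl
use strict;
use warnings;

# Hypothetical candidate strings
my @candidates = ('hotdog stand', 'catalog', 'bird');

# Keep every string that contains "dog" or "cat"
my @pets = grep { m/dog|cat/ } @candidates;

print "matched: $_\n" for @pets;
```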
Character Classes []
The square brackets are used to create character classes. A character class matches any single character from the set listed between the brackets. The character class [0123456789] defines the class of decimal digits. [0-9a-f] defines the class of (lowercase) hexadecimal digits. Refer reg2
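As a sketch of the hexadecimal class, the following filters a few single-character strings. The inputs are assumptions for the example and are not taken from the referenced script.

```perl
use strict;
use warnings;

# Hypothetical single-character inputs
my @inputs = ('7', 'c', 'g', 'Z');

# [0-9a-f] matches one character that is a digit or a letter from a to f
my @hex = grep { /[0-9a-f]/ } @inputs;

print "hex digits: @hex\n";
```

Uppercase letters are not in the class as written; [0-9a-fA-F] would accept them as well.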
Symbolic Character Classes
\s - This symbol matches any whitespace character, such as a space, tab, or newline. It covers at least the characters in [ \t\n\r\f].
\S - This symbol matches any non-whitespace character. It is the opposite of \s.
\d - This symbol matches any digit. For ASCII text it is equivalent to [0-9].
\D - This symbol matches any non-digit character. For ASCII text it is equivalent to [^0-9].
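A quick way to see these symbols at work is a sketch such as the following. The sample line and variable names are assumptions made for the example.

```perl
use strict;
use warnings;

# Double quotes so that \t becomes a real tab character
my $line = "order 66\tshipped";    # hypothetical input

my $has_space      = ($line =~ /\s/)    ? 'yes' : 'no';   # a space or tab is present
my $has_two_digits = ($line =~ /\d\d/)  ? 'yes' : 'no';   # "66"
my $digit_nondigit = ($line =~ /\d\D/)  ? 'yes' : 'no';   # "6" followed by the tab
my $blank_has_text = ("  \n" =~ /\S/)   ? 'yes' : 'no';   # only whitespace, so no

print "whitespace: $has_space\n";
print "two digits: $has_two_digits\n";
print "digit then non-digit: $digit_nondigit\n";
print "non-whitespace in blank line: $blank_has_text\n";
```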
Anchors (^, $)
The caret (^) and the dollar sign ($) are used to anchor a pattern to the beginning and the end of the searched string. The caret is always the first character in the pattern when used as an anchor. For example, m/^one/; will only match if the searched string starts with the character sequence one. The dollar sign is always the last character in the pattern when used as an anchor. For example, m/end$/; will match only if the searched string ends with the character sequence end.
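The two anchor examples can be checked with a sketch like this one. The sample strings are invented for illustration.

```perl
use strict;
use warnings;

# Hypothetical sample strings
my @samples = ('one two', 'two one', 'the end', 'endless');

my @starts_with_one = grep { m/^one/ } @samples;   # "one" at the very start
my @ends_with_end   = grep { m/end$/ } @samples;   # "end" at the very end

print "starts with one: @starts_with_one\n";
print "ends with end: @ends_with_end\n";
```

Without the anchors, /one/ would also match "two one" and /end/ would also match "endless".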
Quantifiers
Perl provides several different quantifiers that let us specify how many times a given component must be present before the match is true. They are used when you don't know in advance how many characters need to be matched.
* - The component must be present zero or more times. Ex - ab*c matches ac, abc, abbc but doesn't match abb, bbc.
+ - The component must be present one or more times. Ex - ab+c matches abc, abbc, abbbc but doesn't match ac, abb.
? - The component must be present zero or one times. Ex - ab?c matches ac, abc but doesn't match abbc, abb.
{n} - The component must be present n times. Ex - ab{2}c matches only abbc.
{n,} - The component must be present at least n times. Ex - ab{2,}c matches abbc, abbbc but doesn't match abc.
{n,m} - The component must be present at least n times and no more than m times. Ex - ab{2,3}c matches abbc, abbbc.
Refer reg4.pl
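The quantifier examples above can be verified with a small table-driven sketch. It uses qr// to store compiled patterns, which is a standard Perl feature but is not covered in these slides; the extra test string abbbbc is an assumption added to show the upper bound of {2,3}.

```perl
use strict;
use warnings;

# Each entry: a compiled pattern and a string to test against it
my @cases = (
    [ qr/ab*c/,     'ac'     ],
    [ qr/ab*c/,     'abb'    ],
    [ qr/ab+c/,     'ac'     ],
    [ qr/ab+c/,     'abbbc'  ],
    [ qr/ab?c/,     'abbc'   ],
    [ qr/ab{2}c/,   'abbc'   ],
    [ qr/ab{2,}c/,  'abc'    ],
    [ qr/ab{2,3}c/, 'abbbc'  ],
    [ qr/ab{2,3}c/, 'abbbbc' ],
);

my @results;
for my $case (@cases) {
    my ($re, $str) = @$case;
    my $verdict = ($str =~ $re) ? 'match' : 'no match';
    push @results, $verdict;
    print "$str: $verdict\n";
}
```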
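Pattern Memory
Use parentheses for sub-patterns.
Sub-pattern matches are saved in $1, $2, $3, ...
$1, $2, $3 are commonly called back-references (inside the pattern itself, the same groups are written \1, \2, \3).
If a later match fails, $1, $2, ... are not reset; they keep the values from the last successful match, so test the match result before using them.
If a sub-pattern matches several times (for example under a quantifier), its variable holds the last matched value.
Refer reg6.pl, reg7.pl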
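A minimal sketch of pattern memory, assuming a made-up date string: the three parenthesized groups are copied out of $1, $2, $3 only after the match has succeeded, and a second match shows that a repeated group keeps its last value.

```perl
use strict;
use warnings;

my $date = '2024-05-17';    # hypothetical input

my ($year, $month, $day);
if ($date =~ /(\d{4})-(\d{2})-(\d{2})/) {
    # Copy the captures right away; a later successful match overwrites them
    ($year, $month, $day) = ($1, $2, $3);
    print "year=$year month=$month day=$day\n";
}

# A quantified group matches a, then b, then c; $1 holds the last one
my $last_letter;
if ('abc' =~ /([a-c])+/) {
    $last_letter = $1;
    print "last letter captured: $last_letter\n";
}
```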
Word Boundaries
The word-boundary pattern anchors, \b and \B, specify whether a matched pattern must be on a word boundary or inside a word.
The \b pattern anchor specifies that the pattern must be on a word boundary.
Ex - /\bdef/ matches only if def is the beginning of a word.
Ex - /def\b/ matches def and abcdef, but not defghi.
Ex - /\bdef\b/ matches only the word def, not abcdef or defghi.
The \B pattern anchor is the opposite of \b. \B matches only at a position that is not a word boundary, that is, inside a word.
Ex - /\Bdef/ matches abcdef, but not def.
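The word-boundary examples can be reproduced with the following sketch, which runs each pattern against the three words used above.

```perl
use strict;
use warnings;

my @words = ('def', 'abcdef', 'defghi');

my @word_start = grep { /\bdef/ }   @words;   # def begins a word
my @word_end   = grep { /def\b/ }   @words;   # def ends a word
my @whole_word = grep { /\bdef\b/ } @words;   # def is a complete word
my @inside     = grep { /\Bdef/ }   @words;   # def does not begin a word

print "\\bdef:    @word_start\n";
print "def\\b:    @word_end\n";
print "\\bdef\\b: @whole_word\n";
print "\\Bdef:    @inside\n";
```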
Quoting Characters
Another way to tell Perl that a special character is to be treated as a normal character is to precede it with the \Q escape sequence. When the Perl interpreter sees \Q, every character following the \Q is treated as a normal character until \E is seen.
Ex - /\Q^ab*/ matches any occurrence of the string ^ab*
Ex - /\Q^ab\E*/ matches ^a followed by zero or more occurrences of b
Ex - The pattern /\*+/ matches one or more occurrences of * in a string. To include a backslash in a pattern, we have to specify two backslashes: /\\+/ matches one or more backslashes.
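The effect of quoting is easiest to see with an interpolated variable. In this sketch the variable $s and the test strings are assumptions for illustration; \Q...\E and the backslash escapes are used exactly as described above.

```perl
use strict;
use warnings;

my $s = 'a+b';    # hypothetical string containing a meta-character

# Unquoted: + is a quantifier, so the pattern means "one or more a, then b"
my $unquoted = ('aab' =~ /$s/)      ? 'yes' : 'no';

# Quoted: + is a literal plus sign, so "aab" no longer matches
my $quoted   = ('aab' =~ /\Q$s\E/)  ? 'yes' : 'no';
my $literal  = ('a+b' =~ /\Q$s\E/)  ? 'yes' : 'no';

# Single characters can be escaped with a backslash instead
my $stars      = ('2**3' =~ /\*+/)       ? 'yes' : 'no';
my $backslash  = ('C:\\temp' =~ /\\+/)   ? 'yes' : 'no';

print "unquoted=$unquoted quoted=$quoted literal=$literal\n";
print "stars=$stars backslash=$backslash\n";
```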
The pattern /a[0123456789]c/ matches a, followed by any digit, followed by c. There is another way of writing this: the pattern /a[0-9]c/ matches a0c, a1c, a2c, and so on up to a9c. Similarly, the range [a-z] matches any lowercase letter, and the range [A-Z] matches any uppercase letter. The pattern /[A-Z][A-Z]/ matches any two uppercase letters. To match any uppercase letter, lowercase letter, or digit, we can use the following ranges: /[0-9a-zA-Z]/
Substitution
The substitution operator is s/pattern/replacement/. It searches a string for the text specified by the placeholder pattern. If it finds pattern, it replaces it with the string represented by the placeholder replacement.
$house = "henhouse";
$house =~ s/hen/dog/;
Now, $house contains "doghouse".
Translation
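The translation operator, tr///, replaces individual characters in a variable ($_ unless another variable is bound with =~). It requires two operands, like this: tr/a/z/; => This statement translates all occurrences of a into z. Refer reg9.pl, reg10.pl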
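A minimal sketch of tr/a/z/ bound to a variable other than $_. The word banana and the variable names are assumptions for illustration, not taken from reg9.pl or reg10.pl. The second statement relies on the fact that tr returns the number of characters it matched, and that an empty replacement list leaves the string unchanged.

```perl
use strict;
use warnings;

my $word = 'banana';    # hypothetical input

# Copy $word, then translate every a into z in the copy
(my $translated = $word) =~ tr/a/z/;

# With an empty replacement list nothing changes, but the count is returned
my $count = ($word =~ tr/a//);

print "$word -> $translated\n";
print "number of a characters: $count\n";
```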
Translation Options
Option   Description
c        This option complements the match character list. In other words, the translation is done for every character that does not match the character list.
d        This option deletes any character in the match list that does not have a corresponding character in the replacement list.
s        This option reduces repeated instances of matched characters to a single instance of that character.
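The three options can be sketched side by side as follows. The input strings and variable names are assumptions chosen to make each effect visible.

```perl
use strict;
use warnings;

# c: translate every character that is NOT a lowercase letter into *
(my $complemented = 'hello world') =~ tr/a-z/*/c;

# d: delete l and o, because they have no counterpart in the replacement list
(my $deleted = 'hello world') =~ tr/lo//d;

# s: squeeze each run of a repeated matched character down to one character
(my $squeezed = 'aaabbbccc') =~ tr/a-c//s;

print "c: $complemented\n";
print "d: $deleted\n";
print "s: $squeezed\n";
```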