Match Function
The `match` function attempts to match a regular expression pattern against the beginning of a string, with optional flags.
Here is the syntax for this function:
`re.match(pattern, string, flags=0)`
Where:
- `pattern` is the regular expression to be matched,
- `string` is the string to be searched; the pattern is matched only at the beginning of the string, and
- `flags` is an optional set of modifiers, which you can combine using bitwise OR (`|`).
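A minimal sketch of the call, to show that the pattern is tried only at the start of the string and that flags are combined with `|`. The sample string is the one used in the exercise later in this lesson; the variable names are illustrative only.

```python
import re

line = "Learn to Analyze Data with Scientific Python"

# re.match only tries the pattern at the start of the string.
at_start = re.match(r"Learn", line)
not_at_start = re.match(r"Analyze", line)

# Flags are combined with bitwise OR; re.I makes the match case-insensitive.
with_flags = re.match(r"learn", line, re.I | re.M)

print(at_start is not None)   # True
print(not_at_start)           # None
print(with_flags.group())     # Learn
```

Because `"Analyze"` does not sit at the beginning of `line`, the second call returns `None` even though the word occurs later in the string.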
Match Flags
Modifier | Description |
---|---|
`re.I` | Performs case-insensitive matching. |
`re.L` | Interprets words according to the current locale. This interpretation affects the alphabetic group (`\w` and `\W`), as well as word boundary behavior (`\b` and `\B`). |
`re.M` | Makes `$` match the end of a line and makes `^` match the start of any line. |
`re.S` | Makes a period (dot) match any character, including a newline. |
`re.U` | Interprets letters according to the Unicode character set. This flag affects the behavior of `\w`, `\W`, `\b`, `\B`. |
`re.X` | Ignores whitespace (except inside a set `[]` or when escaped by a backslash) and treats unescaped `#` as a comment marker. |
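The effect of a few of these flags can be sketched as follows. The sample strings and variable names are made up for illustration. Note that in Python 3, `str` patterns are already matched as Unicode by default, so `re.U` is rarely needed there.

```python
import re

# re.I: case-insensitive matching.
ignore_case = re.match(r"python", "PYTHON", re.I)

# re.S: the dot also matches a newline.
dot_default = re.match(r"a.b", "a\nb")
dot_all = re.match(r"a.b", "a\nb", re.S)

# re.M: $ also matches at the end of each line, not only at the end of the string.
first_line_default = re.match(r"\w+$", "first\nsecond")
first_line_multi = re.match(r"\w+$", "first\nsecond", re.M)

# re.X: whitespace in the pattern is ignored and # starts a comment.
version = re.match(r"""
    (\d+)   # major number
    \.      # literal dot
    (\d+)   # minor number
""", "3.14", re.X)

print(ignore_case is not None)     # True
print(dot_default)                 # None
print(dot_all is not None)         # True
print(first_line_default)          # None
print(first_line_multi.group())    # first
print(version.groups())            # ('3', '14')
```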
Return values
- The `re.match` function returns a match object on success and `None` upon failure.
- Use the `group(n)` or `groups()` method of the match object to get the matched expression; e.g., `group(0)` (or `group()` with no argument) returns the entire match, and `group(n)` returns subgroup `n`.
- The `groups()` method returns all matching subgroups in a tuple (empty if there weren't any).
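A short sketch of these return values, using made-up sample strings. Checking for `None` before calling `group()` avoids an `AttributeError` when nothing matches.

```python
import re

m = re.match(r"(\w+) (\w+)", "Isaac Newton, physicist")
if m is not None:
    print(m.group(0))   # Isaac Newton
    print(m.group(1))   # Isaac
    print(m.group(2))   # Newton
    print(m.groups())   # ('Isaac', 'Newton')

# No match at the start of the string: the result is None.
no_match = re.match(r"\d+", "no digits up front")
print(no_match)         # None

# A pattern without groups yields an empty tuple from groups().
no_groups = re.match(r"\w+", "Isaac")
print(no_groups.groups())   # ()
```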
Example 1
Let's find the words before and after the word `to`:
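A minimal sketch of what such code could look like. The pattern `(.*) to (.*?) .*` and the flags are assumptions chosen to fit the description of the two groups; the sample string is the one from the exercise at the end of this lesson.

```python
import re

line = "Learn to Analyze Data with Scientific Python"

# Assumed pattern: group 1 is the text before " to ",
# group 2 is the single word that follows it.
m = re.match(r"(.*) to (.*?) .*", line, re.M | re.I)

if m:
    print("m.group()  :", m.group())
    print("m.group(1) :", m.group(1))   # Learn
    print("m.group(2) :", m.group(2))   # Analyze
else:
    print("No match")
```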
The first group `(.*)` identified the string `Learn`, and the next group `(.*?)` identified the string `Analyze`.
Example 2
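`groups([default])` returns a tuple containing all the subgroups of the match, from 1 up to however many groups are in the pattern. The optional `default` argument is used for groups that did not participate in the match; it defaults to `None`.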
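A small sketch of `groups()` with and without a default value; the numeric sample strings are illustrative.

```python
import re

m1 = re.match(r"(\d+)\.(\d+)", "24.1632")
print(m1.groups())       # ('24', '1632')

# The second group is optional and does not participate in this match.
m2 = re.match(r"(\d+)\.?(\d+)?", "24")
print(m2.groups())       # ('24', None)
print(m2.groups("0"))    # ('24', '0')
```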
Example 3
`groupdict([default])` returns a dictionary containing all the named subgroups of the match, keyed by the subgroup name.
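A brief sketch of `groupdict()`, assuming a pattern with two named groups defined via the `(?P<name>...)` syntax; the group names and sample strings are made up for illustration.

```python
import re

m = re.match(r"(?P<first_name>\w+) (?P<last_name>\w+)", "Malcolm Reynolds")
print(m.groupdict())
# {'first_name': 'Malcolm', 'last_name': 'Reynolds'}

# The default fills in named groups that did not participate in the match.
number = re.match(r"(?P<whole>\d+)(\.(?P<frac>\d+))?", "24")
print(number.groupdict("0"))
# {'whole': '24', 'frac': '0'}
```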
Example 4
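Start and end. How can we match the start or end of a string? We can use the `\A` and `\Z` special sequences (the letters "A" and "Z", each preceded by a backslash): `\A` matches only at the start of the string, and `\Z` matches only at the end. With them we can match strings that start with a certain letter, and those that end with another.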
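A minimal sketch of both anchors. The word list and the letters being tested are assumptions made for illustration.

```python
import re

values = ["Learn", "Live", "Python"]

# \A anchors at the start of the string. re.match already starts there,
# so \A mainly makes the intent explicit.
starts_with_l = [v for v in values if re.match(r"\AL.+", v)]

# \Z anchors at the very end of the string.
ends_with_n = [v for v in values if re.match(r".+n\Z", v)]

print(starts_with_l)   # ['Learn', 'Live']
print(ends_with_n)     # ['Learn', 'Python']
```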
`start([group])` and `end([group])` return the indices of the start and end of the substring matched by `group`. See the next lesson for an example.
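As a quick preview, a sketch of how these indices relate to slicing. The pattern is the same assumed one as in Example 1, and the sample string comes from the exercise in this lesson.

```python
import re

line = "Learn to Analyze Data with Scientific Python"
m = re.match(r"(.*) to (.*?) .*", line)

# Indices of the second group within the original string.
print(m.start(2), m.end(2))           # 9 16
print(line[m.start(2):m.end(2)])      # Analyze

# With no argument, start() and end() refer to the whole match.
print(m.start() == 0, m.end() == len(line))   # True True
```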
The Match Function Exercise
Let’s assume that we have a string variable called `line`, where `line = "Learn to Analyze Data with Scientific Python"`, passed to a function called `regex_processor()` in the following code.
Can you find the word `"Analyze"` after the word `"to"`?
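One possible sketch of `regex_processor()`. Only the function name and the value of `line` come from the exercise; the signature, the pattern, and the choice to return the second group are assumptions.

```python
import re


def regex_processor(line):
    """Return the word that follows "to" in line, or None if nothing matches."""
    # Assumed pattern: group 1 is the text before " to ",
    # group 2 is the word right after it.
    m = re.match(r"(.*) to (.*?) .*", line, re.M | re.I)
    return m.group(2) if m else None


line = "Learn to Analyze Data with Scientific Python"
print(regex_processor(line))   # Analyze
```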
The first group `(.*)` identified the string `Learn`, and the next group `(.*?)` identified the string `Analyze`.