Python regex match

Learn about the Python regex match function

Match Function

The match function attempts to match a regular expression pattern at the beginning of a string, with optional flags.

Here is the syntax for this function:

re.match(pattern, string, flags=0)

Here, pattern is the regular expression to be matched; string is the string to search, where the pattern is matched only at the beginning of the string; and flags is an optional set of modifiers, which you can combine using bitwise OR (|).
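As a minimal sketch of the call (the sample string is illustrative, not part of the lesson), the following shows that re.match only succeeds when the pattern matches at the start of the string, and how two flags can be combined with |:

```python
import re

text = "Python regex match"

# The pattern matches at the very start of the string,
# so a match object is returned.
at_start = re.match(r"python", text, flags=re.I | re.M)

# "regex" occurs later in the string, so re.match returns None.
not_at_start = re.match(r"regex", text)

print(at_start.group())  # Python
print(not_at_start)      # None
```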

Match Flags

Modifier  Description
re.I      Performs case-insensitive matching.
re.L      Interprets words according to the current locale. This interpretation affects the alphabetic group (\w and \W), as well as word boundary behavior (\b and \B). In Python 3, this flag can only be used with bytes patterns.
re.M      Makes $ match the end of each line (not just the end of the string) and makes ^ match the start of each line (not just the start of the string).
re.S      Makes a period (dot) match any character, including a newline.
re.U      Interprets letters according to the Unicode character set. This flag affects the behavior of \w, \W, \b, \B. In Python 3, this is already the default for str patterns.
re.X      Ignores whitespace in the pattern (except inside a set [] or when escaped by a backslash) and treats an unescaped # as a comment marker.
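The effect of a few of these flags can be sketched as follows. The patterns and sample strings are made up for illustration; only the flags themselves come from the table above.

```python
import re

# re.S: by default "." does not match a newline; with re.S it does.
without_dotall = re.match(r"a.b", "a\nb")
with_dotall = re.match(r"a.b", "a\nb", re.S)
print(without_dotall)            # None
print(with_dotall is not None)   # True

# re.X: whitespace and "#" comments inside the pattern are ignored.
verbose = re.match(
    r"""
    (\d{4})   # four digits
    -         # a literal hyphen
    (\d{2})   # two digits
    """,
    "2020-05",
    re.X,
)
print(verbose.groups())          # ('2020', '05')

# re.M: "^" can match after a newline, but re.match still only tries
# the very start of the string, so this does not match.
multiline = re.match(r"^second", "first\nsecond", re.M)
print(multiline)                 # None
```

Note the last case: even in multiline mode, re.match does not look for a match at the start of later lines.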

Return values

  • The re.match function returns a match object on success and None on failure.
  • Use the group(n) or groups() method of the match object to get the matched text: group() or group(0) returns the entire match, and group(n) returns subgroup n.
  • The groups() method returns all matching subgroups in a tuple (empty if the pattern has no groups).
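The return values listed above can be sketched like this, using a short illustrative string and pattern that are not part of the lesson:

```python
import re

m = re.match(r"(\w+) (\w+)", "Learn to Analyze")

if m:
    print(m.group())      # Learn to   (entire match, same as group(0))
    print(m.group(1))     # Learn
    print(m.group(2))     # to
    print(m.groups())     # ('Learn', 'to')

# A pattern without groups yields an empty tuple.
no_groups = re.match(r"\w+", "Learn")
print(no_groups.groups())  # ()

# A failed match returns None, so check before calling group().
failed = re.match(r"\d+", "Learn")
print(failed)              # None
```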

Example 1

Let's find the words before and after the word "to".
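A minimal sketch of such a match is shown here. It assumes the sample sentence used in the exercise at the end of this lesson and a pattern built from the two groups (.*) and (.*?); the variable names are illustrative.

```python
import re

line = "Learn to Analyze Data with Scientific Python"

# (.*) captures everything before " to ",
# (.*?) lazily captures the single word after it.
match_obj = re.match(r"(.*) to (.*?) .*", line, re.M | re.I)

if match_obj:
    print("match_obj.group()  :", match_obj.group())
    print("match_obj.group(1) :", match_obj.group(1))  # Learn
    print("match_obj.group(2) :", match_obj.group(2))  # Analyze
else:
    print("No match!")
```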

The first group (.*) captured the string "Learn", and the second group (.*?) captured the string "Analyze".

Example 2

groups([default]) returns a tuple containing all the subgroups of the match, from 1 up to however many groups are in the pattern.
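As a small sketch of groups() and its default argument (the numeric sample strings are illustrative), note that default is the value reported for groups that did not take part in the match; if it is omitted, those groups are reported as None:

```python
import re

# Both groups participate in the match.
m1 = re.match(r"(\d+)\.(\d+)", "24.1632")
print(m1.groups())       # ('24', '1632')

# The second group is optional and does not participate here.
m2 = re.match(r"(\d+)\.?(\d+)?", "24")
print(m2.groups())       # ('24', None)
print(m2.groups("0"))    # ('24', '0')
```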

Example 3

groupdict([default]) returns a dictionary containing all the named subgroups of the match, keyed by the subgroup name.
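A short sketch of groupdict(), assuming a pattern with two named groups; the group names first and second and the sample string are made up for illustration:

```python
import re

# (?P<name>...) defines a named group.
m = re.match(r"(?P<first>\w+) (?P<second>\w+)", "Scientific Python")

print(m.groupdict())   # {'first': 'Scientific', 'second': 'Python'}
print(m.group("first"))  # Scientific
```

Named groups can also still be accessed by position, so m.group(1) and m.group("first") return the same text.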

Example 4

How can we match the start or end of a string? We can use the special sequences \A and \Z: the letters "A" and "Z", each preceded by a backslash. \A matches only at the start of the string, and \Z matches only at the end. With them we can match strings that start with a certain letter and end with another.
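A minimal sketch of \A and \Z, assuming we want the strings that both start and end with the letter "a" (the word list is illustrative):

```python
import re

values = ["apple", "area", "banana", "aria"]

# \A anchors at the start of the string, \Z at the end.
# re.match already starts at the beginning, so \A is written
# here only to make the intent explicit.
pattern = r"\Aa.*a\Z"

matches = [v for v in values if re.match(pattern, v)]
print(matches)   # ['area', 'aria']
```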

start([group]) and end([group]) return the indices of the start and end of the substring matched by group. See the next lesson for an example.
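As a brief preview, the indices can be sketched as follows, using an illustrative string. With no argument, start() and end() refer to the entire match.

```python
import re

line = "Learn to Analyze Data"
m = re.match(r"(\w+) to (\w+)", line)

print(m.start(), m.end())      # 0 16  (entire match)
print(m.start(2), m.end(2))    # 9 16  (second group)

# Slicing the original string with these indices gives the group text.
print(line[m.start(2):m.end(2)])  # Analyze
```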

The Match Function Exercise

Let’s assume that we have a string variable called line, where line = "Learn to Analyze Data with Scientific Python", and that it is passed to a function called regex_processor().

Can you find the word "Analyze" after the word "to"?
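One possible solution is sketched below. The lesson only names regex_processor(); its signature, its return value, and the exact pattern are assumptions made for illustration.

```python
import re

def regex_processor(line):
    """Return the word that follows "to", or None if there is no match.

    Hypothetical signature: the lesson does not specify one.
    """
    match_obj = re.match(r"(.*) to (.*?) .*", line)
    if match_obj is None:
        return None
    return match_obj.group(2)

line = "Learn to Analyze Data with Scientific Python"
print(regex_processor(line))   # Analyze
```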

The first group (.*) captured the string "Learn", and the second group (.*?) captured the string "Analyze".