Python Regular Expressions (re)

Regular expressions (regex) are patterns used to match sequences of characters in strings. Python provides the re module to work with regular expressions for tasks like searching, matching, replacing, and extracting specific patterns from text.

Importing the re Module

To use regular expressions in Python, you must first import the re module.

import re


Basic Regular Expression Functions

Here are the most commonly used functions in the re module:

  • re.match(): Determines if the regular expression matches at the beginning of the string.
  • re.search(): Searches for the first occurrence of the pattern in the string.
  • re.findall(): Returns all non-overlapping occurrences of the pattern in the string as a list.
  • re.sub(): Returns a new string in which occurrences of the pattern are replaced with a specified string.
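
As a rough side-by-side sketch, the four functions can be run against one string to see how they differ. The sample text and the variable names are made up for illustration.

```python
import re

# Sample text invented for this comparison.
text = "Learning Python: Python is fun"

# re.match() only looks at the start of the string, so it finds nothing here.
at_start = re.match(r"Python", text)

# re.search() scans the whole string and stops at the first occurrence.
first_hit = re.search(r"Python", text)

# re.findall() collects every non-overlapping match as a list of strings.
all_hits = re.findall(r"Python", text)

# re.sub() returns a new string; the original text is left unchanged.
replaced = re.sub(r"Python", "regex", text)

print(at_start)           # Output: None
print(first_hit.start())  # Output: 9
print(all_hits)           # Output: ['Python', 'Python']
print(replaced)           # Output: Learning regex: regex is fun
```

Note that re.match() and re.search() return None when nothing matches, so check the result before calling methods on it.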

Example 1: re.match()

import re

text = "Python is fun"
pattern = r"Python"

# Check if the text starts with "Python"
match = re.match(pattern, text)
if match:
    print("Match found!")  # Output: Match found!
else:
    print("No match.")


Example 2: re.search()

import re

text = "I love Python programming"
pattern = r"Python"

# Search for "Python" anywhere in the string
result = re.search(pattern, text)
if result:
    print("Pattern found!")  # Output: Pattern found!
else:
    print("Pattern not found.")
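
When a match is found, re.search() returns a match object rather than just True. A minimal sketch of reading the matched text and its position from that object, reusing the same sample string:

```python
import re

text = "I love Python programming"
result = re.search(r"Python", text)

if result:
    # group() returns the matched text; span() returns (start, end) indexes.
    print(result.group())  # Output: Python
    print(result.span())   # Output: (7, 13)
```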


Example 3: re.findall()

import re

text = "The rain in Spain falls mainly on the plain"
pattern = r"ain\b"  # Matches "ain" at the end of a word

# Find all occurrences of the pattern
matches = re.findall(pattern, text)
print(matches)  # Output: ['ain', 'ain', 'ain']  (from "rain", "Spain", "plain")
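
To get the whole words that contain "ain" rather than only the matched fragment, one option is to widen the pattern with \w* on both sides. This is a small sketch on the same sentence; the variable name words is just for illustration.

```python
import re

text = "The rain in Spain falls mainly on the plain"

# \w* on each side extends the match to the full word around "ain".
words = re.findall(r"\w*ain\w*", text)
print(words)  # Output: ['rain', 'Spain', 'mainly', 'plain']
```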


Example 4: re.sub()

import re

text = "The sky is blue. The ocean is blue."
pattern = r"blue"

# Replace "blue" with "green"
new_text = re.sub(pattern, "green", text)
print(new_text)  # Output: The sky is green. The ocean is green.


Special Characters in Regular Expressions

Regular expressions use special characters to represent specific patterns. Here are some common ones:

Character | Description | Example
.   | Matches any single character except a newline. | r"a.b" matches “acb”, “a7b”, etc.
^   | Matches the start of the string. | r"^Hello" matches “Hello world”.
$   | Matches the end of the string (or just before a trailing newline). | r"world$" matches “Hello world”.
*   | Matches 0 or more repetitions of the preceding element. | r"do*g" matches “dg”, “dog”, “dooog”, etc.
+   | Matches 1 or more repetitions of the preceding element. | r"do+g" matches “dog”, “dooog”, etc.
?   | Matches 0 or 1 occurrence of the preceding element. | r"do?g" matches “dg” or “dog”.
\d  | Matches any decimal digit (0-9). | r"\d" matches “1”, “9”, etc.
\w  | Matches any word character (letters, digits, and underscore). | r"\w" matches “a”, “9”, “_”, etc.
\s  | Matches any whitespace character (space, tab, newline). | r"\s" matches “ ”, “\t”, etc.
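
The special characters above can be checked with a short script. This is only a sketch: the sample strings are made up, and the quantifier patterns are wrapped in ^ and $ so that the whole string has to fit the pattern.

```python
import re

# Each entry: (pattern, a string expected to match, a string expected not to match).
# All sample strings are illustrative.
cases = [
    (r"a.b", "a7b", "ab"),                # . needs exactly one character between a and b
    (r"^Hello", "Hello world", "Say Hello"),
    (r"world$", "Hello world", "world peace"),
    (r"^do*g$", "dg", "dig"),             # * allows zero or more "o"
    (r"^do+g$", "dooog", "dg"),           # + requires at least one "o"
    (r"^do?g$", "dog", "doog"),           # ? allows at most one "o"
    (r"^\d\w\s$", "4_ ", "a_ "),          # digit, word character, whitespace
]

results = []
for pattern, good, bad in cases:
    matched_good = re.search(pattern, good) is not None
    matched_bad = re.search(pattern, bad) is not None
    results.append((matched_good, matched_bad))
    print(f"{pattern!r}: {good!r} -> {matched_good}, {bad!r} -> {matched_bad}")
```

Each printed line should show True for the first string and False for the second.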

Compiling Regular Expressions

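You can compile a regular expression with re.compile() to get a reusable pattern object. This keeps the pattern in one place and can save some overhead when the same pattern is used many times, although the re module also caches recently used patterns internally.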

import re

pattern = re.compile(r"\d+")  # Matches one or more digits
text = "Order number: 12345"

# Use the compiled pattern to search
result = pattern.search(text)
if result:
    print(f"Found: {result.group()}")  # Output: Found: 12345
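
The benefit of compiling shows up when one pattern is applied to many strings. A minimal sketch, assuming a hypothetical list of input lines and an illustrative name order_number for the compiled pattern:

```python
import re

# Compile once, then reuse the same pattern object for every line.
order_number = re.compile(r"\d+")

# Hypothetical sample lines, made up for this example.
lines = [
    "Order number: 12345",
    "Order number: 678",
    "No order placed",
]

found = []
for line in lines:
    result = order_number.search(line)
    # search() returns None when nothing matches, so check before calling group().
    found.append(result.group() if result else None)

print(found)  # Output: ['12345', '678', None]
```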


Conclusion

Regular expressions are a powerful tool for text processing in Python. Understanding how to use re functions and special characters will help you solve complex string-matching problems with ease. Practice using regex to unlock its full potential!