Introduction
Regular expressions (regex) let us match patterns in strings, which makes them a powerful tool for data processing and analysis. Word boundaries are a key regex feature: they let us match specific words or character sequences as whole words inside a larger string. In Python, the re module supports regular expressions, including word boundaries. This tutorial covers the basics of regex word boundaries in Python and demonstrates how to use them to match patterns in strings.
Table of Contents :
- Python Regex Word Boundary
- \b Word Boundary
- \B Non-Word Boundary
- How to use Word Boundaries with Quantifiers
- Negated Word Boundary
Python Regex Word Boundary :
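- A word boundary is a position in a string where a word character is adjacent to a non-word character. Word characters are letters, digits, and the underscore (the characters matched by \w). The start and end of the string also count as boundaries when the adjacent character is a word character.
- A word boundary is zero-width: it matches a position, not a character.
- Word boundaries can be used in regular expressions to match whole words.
- The two related anchors are :
- \b Word Boundary
- \B Non-Word Boundary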
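To make the idea of a boundary as a position concrete, the sketch below lists the indices at which \b matches in a short sample string. The helper name boundary_positions and the sample text are illustrative assumptions, not part of the re module.

```python
import re

def boundary_positions(text):
    # Hypothetical helper: \b is zero-width, so each match is an empty
    # string located at a boundary position.
    return [m.start() for m in re.finditer(r"\b", text)]

sample = "cat, dog"
print(boundary_positions(sample))  # [0, 3, 5, 8]
```

The four indices are the start of "cat", the end of "cat", the start of "dog", and the end of "dog". No boundary is reported between the comma and the space, because both are non-word characters.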
The \b Word Boundary :
- The \b anchor matches a word boundary position.
- Here's an example of using \b in a regular expression to match a whole word:
- Code Sample :
import re
pattern = r"\bapple\b"
string = "I have an apple and a pineapple."
result = re.findall(pattern, string)
print(result)
# Output: ['apple']. Only the whole word "apple" matches; the "apple" inside "pineapple" is skipped because no word boundary precedes it.
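The pattern in the sample is written as a raw string (the r prefix). That matters for \b in particular: in an ordinary Python string literal, "\b" is the backspace character, so the regex engine never receives a word boundary anchor. A minimal sketch of the difference, reusing the sample sentence (the variable names are illustrative):

```python
import re

text = "I have an apple and a pineapple."

# Raw string: the regex engine receives a backslash followed by "b".
raw_result = re.findall(r"\bapple\b", text)

# Ordinary string: "\b" is a single backspace character (ASCII 8),
# so the pattern searches for backspace + "apple" + backspace.
plain_result = re.findall("\bapple\b", text)

print(raw_result)             # ['apple']
print(plain_result)           # []
print(len(r"\b"), len("\b"))  # 2 1
```

Writing patterns as raw strings avoids this silent mismatch.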
The \B Non-Word Boundary :
- The \B anchor matches a position that is not a word boundary.
- Here's an example of using \B in a regular expression to match a partial word:
- Code Sample :
import re
pattern = r"\Bcat"
string = "The cat chased a bobcat."
result = re.findall(pattern, string)
print(result)
# Output: ['cat']. Only the "cat" inside "bobcat" matches, because \B requires that "cat" not start at a word boundary. The standalone word "cat" is skipped.
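The position of \B in the pattern decides which part of a word may match. The sketch below compares four variants against a small word list. The list and the helper words_matching are assumptions made for illustration.

```python
import re

words = ["cat", "bobcat", "catalog", "concatenate"]

def words_matching(pattern):
    # Hypothetical helper: keep the words in which the pattern is found.
    return [w for w in words if re.search(pattern, w)]

print(words_matching(r"\bcat\b"))  # ['cat']
print(words_matching(r"\Bcat"))    # ['bobcat', 'concatenate']
print(words_matching(r"cat\B"))    # ['catalog', 'concatenate']
print(words_matching(r"\Bcat\B"))  # ['concatenate']
```

\Bcat requires a word character before "cat", cat\B requires one after it, and \Bcat\B requires both.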
How to use Word Boundaries with Quantifiers :
- Word boundaries can be combined with quantifiers such as + and * to match whole words of any length.
- Here's an example of using \b and the + quantifier in a regular expression to match every whole word in a string:
- Code Sample :
import re
pattern = r"\b\w+\b"
string = "Hello, my name is John. I am 28 years old."
result = re.findall(pattern, string)
print(result)
# Output: ['Hello', 'my', 'name', 'is', 'John', 'I', 'am', '28', 'years', 'old']. Every whole word is matched; punctuation and spaces are left out.
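Other quantifiers and character classes combine with \b in the same way. As a sketch using the same sentence, the patterns below pick out four-character words, whole numbers, and capitalized words. The variable names are illustrative.

```python
import re

text = "Hello, my name is John. I am 28 years old."

# Exactly four word characters between two boundaries.
four_letter = re.findall(r"\b\w{4}\b", text)

# One or more digits forming a whole token.
numbers = re.findall(r"\b\d+\b", text)

# An uppercase letter followed by zero or more word characters.
capitalized = re.findall(r"\b[A-Z]\w*\b", text)

print(four_letter)  # ['name', 'John']
print(numbers)      # ['28']
print(capitalized)  # ['Hello', 'John', 'I']
```

Without the closing \b, the pattern \b\w{4} would also match the first four characters of longer words such as "Hello" and "years".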
Negated Word Boundary :
- The negated word boundary, \B, matches any position in a string that is not a word boundary, such as the gap between two letters inside a word.
- Here's an example of using \B on both sides of \w+ to match only the interior of words:
- Code Sample :
import re
pattern = r"\B\w+\B"
string = "Hello, my name is John. I am 28 years old."
result = re.findall(pattern, string)
print(result)
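# Output: ['ell', 'am', 'oh', 'ear', 'l']. Each match is the interior of a word: \B on both sides excludes the first and last characters, so words of one or two characters produce no match.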
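Because \B matches positions rather than characters, it can also mark the gaps inside a word. The sketch below is an illustration under assumed sample inputs: it inserts a hyphen at every non-boundary position of a single word, then shows the interior matches for a shorter sentence.

```python
import re

# Insert "-" at every position inside the word that is not a boundary.
spaced = re.sub(r"\B", "-", "word")
print(spaced)  # w-o-r-d

# Interior of each word: first and last characters are never included.
inner = re.findall(r"\B\w+\B", "Hello, my name is John.")
print(inner)  # ['ell', 'am', 'oh']
```

The start and end of "word" are boundaries, so no hyphen is added there.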