A Quick Guide to Regex in Python
Introduction
In this article, we will explore regular expressions (RegEx) and work with them using Python's re module, with the help of examples. A regular expression (RegEx) is a sequence of characters that defines a search pattern. Python provides a module called re for working with regular expressions, and we must import it before use. The module defines a number of functions and constants, which we will go through one by one along with the associated code.
The table below highlights all the important regex rulesets.
re.findall()
The re.findall() method returns a list of strings containing all matches. If the pattern is not found, re.findall() returns an empty list.
Code to extract numbers from a string:
import re
# Output: ['12', '89', '34']
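A minimal sketch of such an extraction follows. The sample string is an assumption (it is not from the original listing), chosen so that the result matches the output shown above; the variable names are illustrative.

```python
import re

# Hypothetical sample text, chosen so the result matches the output above.
text = "hello 12 hi 89. Howdy 34"

# \d+ matches one or more consecutive digits.
numbers = re.findall(r"\d+", text)
print(numbers)  # ['12', '89', '34']

# With no digits in the string, findall returns an empty list.
no_match = re.findall(r"\d+", "no digits here")
print(no_match)  # []
```

Note that the matches come back as strings; convert them with int() if numeric values are needed.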
re.split()
When the pattern matches, re.split() splits the string at each match and returns a list of the resulting substrings. If the pattern is not found, re.split() returns a list containing the original string. The function also accepts a maxsplit argument, the maximum number of splits to perform. It defaults to 0, which means no limit: the string is split at every match.
Example 1:
import re
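As a minimal sketch of splitting at every match, assuming a sample string that is not taken from the original listing:

```python
import re

# Hypothetical sample text (an assumption, not from the original listing).
text = "Twelve:12 Eighty nine:89 Nine:9."

# Split wherever one or more digits occur; the digits themselves are dropped.
parts = re.split(r"\d+", text)
print(parts)  # ['Twelve:', ' Eighty nine:', ' Nine:', '.']

# No match: the result is a one-element list holding the original string.
unchanged = re.split(r"\d+", "no digits here")
print(unchanged)  # ['no digits here']
```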
Example 2
import re
# Output: ['Twelve:', ' Eighty nine:89 Nine:9.']
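One way to arrive at the output shown above is to limit the split with maxsplit. This is a sketch under assumptions: the input string is inferred from that output (the digits of the first number are a guess), and the digit pattern is illustrative.

```python
import re

# Assumed input, inferred from the output shown above.
text = "Twelve:12 Eighty nine:89 Nine:9."

# maxsplit=1 stops after the first split; the rest of the string is kept whole.
first_only = re.split(r"\d+", text, maxsplit=1)
print(first_only)  # ['Twelve:', ' Eighty nine:89 Nine:9.']

# maxsplit=2 allows one more split before stopping.
first_two = re.split(r"\d+", text, maxsplit=2)
print(first_two)  # ['Twelve:', ' Eighty nine:', ' Nine:9.']
```

Passing maxsplit by keyword keeps the call readable and avoids confusing it with the flags argument.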
re.sub()
Syntax -> re.sub(pattern, replace, string)
The method returns a string in which every match of the pattern is replaced with the contents of the replace variable. If the pattern is not found, re.sub() returns the original string. re.sub() also accepts count as a fourth argument. If omitted, it defaults to 0, which replaces all occurrences.
Code to remove all whitespace:
import re
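A minimal sketch of whitespace removal with re.sub(), assuming a hypothetical multi-line sample string:

```python
import re

# Hypothetical sample text containing spaces and newlines.
text = "abc 12\nde 23 \n f45 6"

# \s+ matches any run of whitespace; replace each run with the empty string.
no_spaces = re.sub(r"\s+", "", text)
print(no_spaces)  # abc12de23f456

# No whitespace to replace: the original string comes back unchanged.
same = re.sub(r"\s+", "", "abc")
print(same)  # abc
```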
Example 2
import re
# Output:
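To illustrate the count argument described earlier, here is a hedged sketch. The sample string and the choice of count=1 are assumptions and are not claimed to reproduce the original listing.

```python
import re

# Hypothetical sample text.
text = "abc 12 de 23"

# count=1 replaces only the first run of whitespace.
once = re.sub(r"\s+", "", text, count=1)
print(once)  # abc12 de 23

# count=0 (the default) replaces every run.
every = re.sub(r"\s+", "", text, count=0)
print(every)  # abc12de23
```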
re.subn()
The difference between re.subn() and re.sub() is that re.subn() returns a tuple of two items: the new string and the number of replacements that were made.
Code to remove all whitespace:
import re
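A minimal sketch of the same whitespace removal using re.subn(), again on a hypothetical sample string:

```python
import re

# Hypothetical sample text with three runs of whitespace.
text = "abc 12 de 23"

# subn returns (new_string, number_of_replacements).
result = re.subn(r"\s+", "", text)
print(result)  # ('abc12de23', 3)

# The tuple can be unpacked directly.
new_text, replacements = result
print(new_text, replacements)  # abc12de23 3
```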
re.search()
The re.search() function takes two arguments: a pattern and a string. It looks for the first location where the RegEx pattern matches the string. If the search succeeds, re.search() returns a match object; otherwise it returns None.
Syntax -> match = re.search(pattern, str)
Example
import re
# Output: pattern found inside the string
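A sketch of a search that prints the message shown above. The sample string and the pattern are assumptions for illustration; \A anchors the match to the start of the string.

```python
import re

# Hypothetical sample text and pattern.
text = "Python is fun"

# Check whether the string starts with "Python".
match = re.search(r"\APython", text)

if match:
    print("pattern found inside the string")
else:
    print("pattern not found")

# A match object exposes the matched text and its start position.
print(match.group(), match.start())  # Python 0

# An unsuccessful search returns None.
missing = re.search(r"Java", text)
print(missing)  # None
```

Because a failed search returns None, test the result before calling methods such as group() on it.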
Using the r Prefix Before RegEx
When the r or R prefix is placed before a regular expression, it marks the string as a raw string. For instance, "\n" is a newline, whereas r"\n" is two characters: a backslash followed by n. The backslash is used to escape various characters, including all metacharacters. With the r prefix, however, Python treats the backslash as an ordinary character.
Example
import re
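A short sketch of what the r prefix changes, using hypothetical sample values:

```python
import re

# A normal string turns \n into one newline character;
# a raw string keeps the backslash and the n as two characters.
print(len("\n"))   # 1
print(len(r"\n"))  # 2

# Without the r prefix, every regex backslash has to be doubled.
print(r"\d+" == "\\d+")  # True

# Hypothetical sample text containing one literal backslash.
path = r"C:\temp"

# To match a literal backslash the regex needs \\,
# written r"\\" as a raw string or "\\\\" as a normal string.
backslashes = re.findall(r"\\", path)
print(len(backslashes))  # 1
print(backslashes == re.findall("\\\\", path))  # True
```

Writing patterns as raw strings keeps them close to the regex they represent, which is why most examples use the r prefix.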