Tasks studies - laboratory
Regular expressions are a powerful tool for manipulating strings.
They are available as libraries in most modern programming languages, including Python.
They are useful for two main tasks: checking whether a string matches a pattern, and performing substitutions in a string.
In Python, regular expressions are available via the `re` module, which is part of the standard library.
Once a regular expression is defined, you can use the `re.match` function to check if it matches the beginning of a string.
If it does, the function returns a match object; otherwise, it returns `None`.
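As a minimal sketch of the behavior just described (the sample strings are made up for illustration), `re.match` succeeds only when the pattern occurs at the very start of the string:

```python
import re

pattern = r"spam"

# The pattern occurs at position 0, so a match object is returned.
at_start = re.match(pattern, "spamspamspam")

# The pattern occurs later in the string, but not at the start, so the result is None.
not_at_start = re.match(pattern, "eggspam")

print(at_start is not None)  # True
print(not_at_start)          # None
```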
Other pattern matching functions include `re.search` and `re.findall`:

- `re.search` finds a pattern match anywhere in the string.
- `re.findall` returns a list of all substrings that match the pattern.

import re

pattern = r"spam"
match = re.search(pattern, "ssspamspamspamsp")
if match:
    # Print the end position of the first match
    print(match.end())
else:
    print("No match")
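The example above exercises only `re.search`. A comparable sketch for `re.findall`, reusing the same pattern and input string, might look like this:

```python
import re

pattern = r"spam"
text = "ssspamspamspamsp"

# findall returns every non-overlapping match as a list of strings.
matches = re.findall(pattern, text)

print(matches)       # ['spam', 'spam', 'spam']
print(len(matches))  # 3
```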
When it finds a match, the `search` function returns a match object with several methods providing details about the match:

- `group()` returns the matched string.
- `start()` and `end()` return the start and end positions of the first match.
- `span()` returns the start and end positions as a tuple.

import re

pattern = r"pam"
match = re.search(pattern, "eggspamsausage")
if match:
    print(match.group())  # pam
    print(match.start())  # 4
    print(match.end())    # 7
    print(match.span())   # (4, 7)
One of the most important functions that use regular expressions is `sub`:

re.sub(pattern, repl, string, count=0)

It replaces occurrences of `pattern` in `string` with `repl`.
All occurrences are replaced unless `count` is set to a positive number, which caps the number of replacements.
The function returns the modified string.
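import re

# Avoid naming a variable "str": that would shadow the built-in type.
text = "My name is David. Hi David."
pattern = r"David"
new_text = re.sub(pattern, "Amy", text)
print(new_text)  # My name is Amy. Hi Amy.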
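In the standard library, the optional limit on replacements is passed to `re.sub` as the `count` argument. A short sketch of limiting the substitution to the first occurrence, using the same sentence as above:

```python
import re

text = "My name is David. Hi David."

# count=1 limits the substitution to the first occurrence only.
first_only = re.sub(r"David", "Amy", text, count=1)

print(first_only)  # My name is Amy. Hi David.
```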
The first metacharacter is `.` (dot), which matches any character except a newline.
import re

pattern = r"gr.y"
if re.match(pattern, "grey"):
    print("Match 1")
if re.match(pattern, "gray"):
    print("Match 2")
if re.match(pattern, "blue"):
    print("Match 3")
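The newline exception can be checked directly. The sketch below also uses `re.DOTALL`, a standard-library flag that this text does not otherwise cover, purely for contrast:

```python
import re

pattern = r"gr.y"

# By default "." does not match a newline character.
default_result = re.match(pattern, "gr\ny")

# With re.DOTALL, "." matches any character, including a newline.
dotall_result = re.match(pattern, "gr\ny", flags=re.DOTALL)

print(default_result)             # None
print(dotall_result is not None)  # True
```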
Next, we have `^` and `$`:

- `^` matches the beginning of a string.
- `$` matches the end of a string.

import re

pattern = r"^gr.y$"
if re.match(pattern, "grey"):
    print("Match 1")
if re.match(pattern, "gray"):
    print("Match 2")
if re.match(pattern, "stingray"):
    print("Match 3")
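Because `re.match` already starts at the beginning of the string, the effect of the anchors is easier to see with `re.search`. A small sketch, reusing the word "stingray" from the example above:

```python
import re

# Without anchors, search finds "gray" inside the longer word.
unanchored = re.search(r"gr.y", "stingray")

# With ^ and $, the whole string must be exactly g, r, any character, y.
anchored = re.search(r"^gr.y$", "stingray")

print(unanchored.group())  # gray
print(anchored)            # None
```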
Character classes allow matching only a specific set of characters.
A character class is created by placing characters inside square brackets `[]`.
import re

pattern = r"[aeiou]"
if re.search(pattern, "grey"):
    print("Match 1")
if re.search(pattern, "qwertyuiop"):
    print("Match 2")
if re.search(pattern, "rhythm myths"):
    print("Match 3")
Used with `re.search`, the pattern `[aeiou]` matches any string that contains at least one of these vowels.
Character classes can also match ranges of characters:
- `[a-z]` matches any lowercase letter.
- `[A-Z]` matches any uppercase letter.
- `[0-9]` matches any digit.
- `[A-Za-z]` matches any letter (uppercase or lowercase).

import re

pattern = r"[A-Z][A-Z][0-9]"
if re.search(pattern, "LS8"):
    print("Match 1")
if re.search(pattern, "E3"):
    print("Match 2")
if re.search(pattern, "1ab"):
    print("Match 3")
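To see exactly which characters a range picks out, `re.findall` can list them. The sketch below joins the three test strings from the example above into one input; that combined string is only an illustration:

```python
import re

text = "LS8 E3 1ab"

# Every single digit in the text.
digits = re.findall(r"[0-9]", text)

# Two uppercase letters followed by a digit.
codes = re.findall(r"[A-Z][A-Z][0-9]", text)

print(digits)  # ['8', '3', '1']
print(codes)   # ['LS8']
```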
To negate a character class (match anything except the specified characters), use `^` as the first character inside the brackets.
import re

pattern = r"[^A-Z]"
if re.search(pattern, "this is all quiet"):
    print("Match 1")
if re.search(pattern, "AbCdEfG123"):
    print("Match 2")
if re.search(pattern, "THISISALLSHOUTING"):
    print("Match 3")
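A negated class is often combined with `re.sub` to strip unwanted characters. A minimal sketch, assuming a made-up phone number as input:

```python
import re

# Hypothetical input, used only for illustration.
phone = "+48 (123) 456-789"

# Remove every character that is not a digit.
digits_only = re.sub(r"[^0-9]", "", phone)

print(digits_only)  # 48123456789
```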
Quantifiers define how many times a pattern should be repeated.
| Metacharacter | Meaning |
|---|---|
| `*` | Zero or more repetitions |
| `+` | One or more repetitions |
| `?` | Zero or one repetition |
| `{x,y}` | Between x and y repetitions |
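The four quantifiers can be compared side by side. This sketch uses `re.fullmatch` (a standard-library function not otherwise used in this text), which requires the whole string to match and so makes the repetition limits visible; the patterns and strings are illustrative only:

```python
import re

# Each tuple is (pattern, candidate string).
cases = [
    (r"ab*", "a"),          # * allows zero "b"s
    (r"ab+", "a"),          # + requires at least one "b"
    (r"ab?", "abb"),        # ? allows at most one "b"
    (r"ab{2,3}", "abbb"),   # {2,3} allows two or three "b"s
]

results = [bool(re.fullmatch(p, s)) for p, s in cases]

print(results)  # [True, False, False, True]
```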
Example of `*` (zero or more repetitions):

import re

pattern = r"egg(spam)*"
if re.match(pattern, "egg"):
    print("Match 1")
if re.match(pattern, "eggspamspamegg"):
    print("Match 2")
if re.match(pattern, "spam"):
    print("Match 3")
Example of `+` (one or more repetitions):

import re

pattern = r"g+"
if re.match(pattern, "g"):
    print("Match 1")
if re.match(pattern, "gggggggggggggg"):
    print("Match 2")
if re.match(pattern, "abc"):
    print("Match 3")
Example of `?` (zero or one repetition):

import re

pattern = r"ice(-)?cream"
if re.match(pattern, "ice-cream"):
    print("Match 1")
if re.match(pattern, "icecream"):
    print("Match 2")
if re.match(pattern, "sausages"):
    print("Match 3")
if re.match(pattern, "ice--ice"):
    print("Match 4")
Example of `{x,y}` (between `x` and `y` repetitions):

import re

pattern = r"9{1,3}$"
if re.match(pattern, "9"):
    print("Match 1")
if re.match(pattern, "999"):
    print("Match 2")
if re.match(pattern, "9999"):
    print("Match 3")
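The string "9999" fails above because `re.match` must start at position 0 while `$` requires the match to end at the end of the string, and at most three nines are allowed in between. As a sketch of the difference, `re.search` with the same pattern does succeed, on the last three characters:

```python
import re

pattern = r"9{1,3}$"

# match is tied to the start of the string, so four nines are too many.
from_start = re.match(pattern, "9999")

# search may start later, so it finds the final three nines.
anywhere = re.search(pattern, "9999")

print(from_start)                        # None
print(anywhere.group(), anywhere.span()) # 999 (1, 4)
```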
Before starting, make sure you meet the following requirements:
All new projects are created in the same way:
1. Click Create New Project in the PyCharm Quick Start menu.
2. Select Django, then specify the Project Name and Location.
3. Expand Project Interpreter: New Virtualenv Environment, select Virtualenv, specify the location, and choose the base Python interpreter.
4. Expand More Settings and enter an Application Name.
5. Click Create. Your Django project is ready.