Contents
1. Regular expressions
3.1 Matching functions
3.2 Search and replace
3.3 Regular expression objects
2. Commonly used methods
3. Use of generic functions
4. Context managers
5. Decorators
6. Exceptions
6.1 Throwing and catching exceptions
1. Regular expressions
A regular expression is a special sequence of characters that describes a pattern, which makes it easy to check whether a string matches that pattern.
Python's built-in re module provides full regular expression support. The most commonly used functions are:
- re.match(): matches only at the beginning of the string, and returns None if there is no match there.
- re.search(): scans the whole string and returns the first match, or None if nothing matches.
- groups(): a method of the match object (not of the re module itself) that returns a tuple of all captured groups.
- re.sub(): replaces matches in a string.
- re.compile(): compiles a regular expression into a Pattern object, whose match() and search() methods can then be reused.
- re.finditer(): finds all substrings matched by the regular expression and returns them as an iterator of match objects.
- re.split(): splits the string at each match and returns a list.
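re.finditer() and re.split() are not demonstrated elsewhere in this article, so here is a minimal sketch of both. The sample string is reused from the later examples; the patterns are only illustrative.

```python
import re

text = "runoob 123 google 456"

# re.finditer() yields one Match object per non-overlapping match.
matches = [(m.group(), m.span()) for m in re.finditer(r"\d+", text)]
print(matches)  # [('123', (7, 10)), ('456', (18, 21))]

# re.split() cuts the string wherever the pattern matches.
parts = re.split(r"\s+", text)
print(parts)  # ['runoob', '123', 'google', '456']
```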
(1) Special character classes and their meanings
(2) Character classes
(3) Regular expression patterns
Pattern strings use a special syntax to represent a regular expression:
Letters and digits match themselves. Many letters and digits take on a different meaning when preceded by a backslash (for example, \d matches a digit). Punctuation characters that have a special meaning (such as . * + ? and parentheses) match themselves only when escaped with a backslash. The backslash itself must also be escaped with a backslash (\\). Because regular expressions often contain backslashes, it is best to write them as raw strings: r'\t' is equivalent to '\\t', and as a pattern it matches the tab character.
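The escaping rules are easier to see in code. The following is a small sketch, with made-up sample strings, of how raw strings and escaped punctuation behave:

```python
import re

# Without the r prefix, a literal backslash has to be written as "\\".
print(r"\d+" == "\\d+")  # True

# r"\t" is two characters (backslash and t); "\t" is a single tab character.
print(len(r"\t"), len("\t"))  # 2 1

# Both spellings match a tab, because the regex engine also understands \t.
tab_raw = re.search(r"\t", "a\tb")
tab_plain = re.search("\t", "a\tb")
print(tab_raw.span(), tab_plain.span())  # (1, 2) (1, 2)

# An escaped dot matches only a literal dot; an unescaped dot matches any character.
escaped = re.findall(r"a\.b", "a.b axb")
unescaped = re.findall(r"a.b", "a.b axb")
print(escaped)    # ['a.b']
print(unescaped)  # ['a.b', 'axb']
```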
The following table lists the special elements of the regular expression pattern syntax. When a pattern is used together with the optional flags argument, the meaning of some pattern elements changes.
3.1 Matching function
(1) re.match function
Tries to match a pattern at the start of the string. On success it returns a match object, whose span() method gives the matched range as a half-open interval (start included, end excluded). If the pattern does not match at the start of the string, match() returns None.
re.match(pattern, string, flags=0)  # pattern: the regular expression to match; string: the string to search; flags: optional matching flags
import re  # import the module
print(re.match('www', 'www.runoob.com').span())  # matches at the start, prints (0, 3)
print(re.match('com', 'www.runoob.com'))  # no match at the start, prints None
Use the methods of the match object to retrieve what was matched:
- group(num): returns the string matched by group number num.
- group(): returns the entire match (same as group(0)).
- groups(): returns a tuple containing the strings of all groups, from group 1 up to the last group.
import re

line = "Cats are smarter than dogs"  # the string to search
matchObj = re.match(r'(.*) are (.*?) .*', line, re.M | re.I)
if matchObj:
    print("matchObj.groups() : ", matchObj.groups())  # ('Cats', 'smarter')
    print("matchObj.group() : ", matchObj.group())  # Cats are smarter than dogs
    print("matchObj.group(1) : ", matchObj.group(1))  # Cats
    print("matchObj.group(2) : ", matchObj.group(2))  # smarter
else:
    print("No match!!")
(2) re.search function
Scans the entire string and returns the first successful match. Use the match object's group(num) or groups() methods to retrieve what was matched.
re.search(pattern, string, flags=0)
import re

print(re.search('www', 'www.runoob.com').span())  # www is at (0, 3)
print(re.search('com', 'www.runoob.com').span())  # com is at (11, 14); the end index is excluded

line = "Cats are smarter than dogs"
searchObj = re.search(r'(.*) are (.*?) .*', line, re.M | re.I)
if searchObj:
    print("searchObj.group() : ", searchObj.group())  # Cats are smarter than dogs
    print("searchObj.group(1) : ", searchObj.group(1))  # Cats
    print("searchObj.group(2) : ", searchObj.group(2))  # smarter
else:
    print("Nothing found!!")
(3) The difference between re.match and re.search
re.match only matches at the beginning of the string: if the beginning of the string does not match the regular expression, the match fails and the function returns None. re.search scans the entire string until it finds a match, and returns a match object for the first one.
import re

line = "Cats are smarter than dogs"

# match() starts at the beginning of the string; the string does not start with "dogs", so there is no match
matchObj = re.match(r'dogs', line, re.M | re.I)
if matchObj:
    print("match --> matchObj.group() : ", matchObj.group())
else:
    print("No match!!")  # this branch runs

# search() scans the entire string, so "dogs" is found
matchObj = re.search(r'dogs', line, re.M | re.I)
if matchObj:
    print("search --> matchObj.group() : ", matchObj.group())  # this branch runs and prints dogs
else:
    print("No match!!")
3.2 Search and Replace
(1) The re module provides re.sub for replacing matches in a string.
re.sub(pattern, repl, string, count=0, flags=0)
- pattern: the regular expression pattern string.
- repl: the replacement string; it can also be a function.
- string: the original string to be searched and replaced.
- count: the maximum number of replacements; the default 0 means replace all matches.
import re

phone = "2004-959-559 # This is a foreign phone number"

# Remove the Python-style comment from the string
num = re.sub(r'#.*$', "", phone)
print("The phone number is: ", num)  # The phone number is:  2004-959-559

# Remove every non-digit character
num = re.sub(r'\D', "", phone)
print("The phone number is: ", num)  # The phone number is:  2004959559
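The parameter list above notes that repl can also be a function. A minimal sketch of that form follows; the helper name double and the sample string are made up for illustration. The function receives each match object and returns the replacement text.

```python
import re

def double(match):
    """Hypothetical helper: return the matched number doubled, as text."""
    return str(int(match.group()) * 2)

text = "A23G4HFD567"

# Every run of digits is passed to double() and replaced by its return value.
doubled = re.sub(r"\d+", double, text)
print(doubled)  # A46G8HFD1134

# count=1 limits the replacement to the first match only.
first_only = re.sub(r"\d+", double, text, count=1)
print(first_only)  # A46G4HFD567
```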
(2) re.compile function
Compiles a regular expression into a Pattern object, whose match() and search() methods can then be used. The optional flags argument accepts the following values:
- re.I: ignore case
- re.L: make the special character sets \w, \W, \b, \B, \s, \S depend on the current locale
- re.M: multiline mode
- re.S: make . match any character, including a newline (by default . does not match a newline)
- re.U: make the special character sets \w, \W, \b, \B, \d, \D, \s, \S depend on the Unicode character property database (already the default for str patterns in Python 3)
- re.X: ignore whitespace and comments after # inside the pattern, for readability
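To make the flags concrete, here is a small sketch of re.I, re.M, re.S and re.X. The sample text and patterns are assumptions chosen only to show each effect.

```python
import re

text = "Hello\nworld"

# re.I: case-insensitive matching.
print(re.findall(r"hello", text, re.I))  # ['Hello']

# re.M: ^ and $ also match at the start and end of each line.
print(re.findall(r"^\w+$", text))        # []
print(re.findall(r"^\w+$", text, re.M))  # ['Hello', 'world']

# re.S: . also matches a newline.
print(re.search(r"Hello.world", text))                # None
print(re.search(r"Hello.world", text, re.S) is None)  # False

# re.X: whitespace and # comments inside the pattern are ignored.
number = re.compile(r"""
    \d+   # integer part
    \.    # decimal point
    \d+   # fractional part
""", re.X)
print(number.search("pi is about 3.14").group())  # 3.14
```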
In the examples above, a Match object is returned when the match succeeds, where:
- The group([group1, …]) method returns the string matched by one or more groups. To get the entire matched substring, call group() or group(0).
- The start([group]) method returns the start position of the substring matched by the group within the whole string (the index of the substring's first character); the parameter defaults to 0.
- The end([group]) method returns the end position of the substring matched by the group within the whole string (the index of the substring's last character + 1); the parameter defaults to 0.
- The span([group]) method returns (start(group), end(group)).
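A short sketch of these Match methods with two groups may help. The address below is a made-up sample, and the pattern is deliberately simplistic.

```python
import re

text = "mail: alice@example.com"
m = re.search(r"(\w+)@(\w+)\.com", text)

print(m.group())           # alice@example.com
print(m.group(1, 2))       # ('alice', 'example')
print(m.start(), m.end())  # 6 23
print(m.span(2))           # (12, 19)

# Slicing with span() gives back exactly what the group matched.
start, end = m.span(2)
print(text[start:end])     # example
```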
(3) findall
Finds all substrings in the string that match the regular expression and returns them as a list. If the pattern contains more than one group, a list of tuples is returned; if no match is found, an empty list is returned.
Note: match and search return at most one match, whereas findall returns all of them.
findall(string[, pos[, endpos]])
- string: the string to match.
- pos: optional; the position in the string at which to start, default 0.
- endpos: optional; the position in the string at which to stop, default the length of the string.
import re

pattern = re.compile(r'\d+')  # look for digits only
result1 = pattern.findall('runoob 123 google 456')
result2 = pattern.findall('run88oob123google456', 0, 10)
print(result1)  # ['123', '456']
print(result2)  # ['88', '12']
Patterns with multiple groups:
result = re.findall(r'(\w+)=(\d+)', 'set width=20 and height=10')
print(result)  # [('width', '20'), ('height', '10')]
3.3 Regular Expression Object
re.compile() returns a Pattern object (called RegexObject in older documentation). Its match() and search() methods return a match object with the following methods:
- group() returns the string matched by the RE
- start() returns the position where the match starts
- end() returns the position where the match ends
- span() returns a tuple containing the position of the match (start, end)
2. Commonly used methods
import re

print(re.match('www', 'www.runoob.com').span())  # matches at the start: (0, 3)
print(re.match('com', 'www.runoob.com'))  # no match at the start: None

# ------ Find all the numbers in the string ------
pattern = re.compile(r'\d+')  # find digits
result1 = pattern.findall('runoob 123 google 456')
result2 = pattern.findall('run88oob123google456', 0, 10)
print(result1)  # ['123', '456']
print(result2)  # ['88', '12']

# ------ A pattern with multiple groups returns a list of tuples ------
result = re.findall(r'(\w+)=(\d+)', 'set width=20 and height=10')
print(result)  # [('width', '20'), ('height', '10')]
1. Matching a URL with a regular expression
import re

str1 = input()
result = re.match("https://www", str1)
if result:
    print(result.span())  # range of the match, (0, 11) when the input starts with https://www
else:
    print("No match")  # match() returned None, so calling span() would raise AttributeError
2. map() function:
from collections.abc import Iterator  # import the Iterator abstract base class

map_obj = map(lambda x: x * 2, [1, 2, 3, 4, 5])
print(isinstance(map_obj, Iterator))  # True
print(list(map_obj))  # [2, 4, 6, 8, 10]

def square(x):  # define a function square()
    return x ** 2  # return the value so that map() can collect it

print(list(map(square, [1, 2, 3, 4, 5])))  # [1, 4, 9, 16, 25]
3. filter() function:
filter_obj = filter(lambda x: x > 5, range(0, 10))
print(list(filter_obj))  # [6, 7, 8, 9]
4. isinstance(): determines whether an object is an instance of a specific type or custom class.
print(isinstance("hello world", str))  # True
print(isinstance(10, int))  # True
print(isinstance(10.0, float))  # True
print(isinstance(5, float))  # False
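5. hasattr(obj, attribute): determines whether the target object obj has an attribute named attribute; getattr(obj, attribute) returns the value of that attribute.
import json

aa = hasattr(json, "dumps")
print(aa)  # True
bb = getattr(json, "__path__")  # get the value of the attribute __path__
print(bb)  # a list with the package directory, e.g. ['D:\\Anacoda3\\lib\\json']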
6. callable(): determines whether an object is callable (functions and classes, for example, are callable objects).
print(callable("hello python"))  # False
print(callable(list))  # True
7. Module attributes
import json

print(json.__doc__)  # the module's docstring, which help() also displays
print(json.__name__)  # the name of the module json
print(json.__file__)  # the file path of the module json; built-in modules such as sys do not have this attribute, and accessing it raises AttributeError
print(json.__dict__)  # the namespace dictionary of the module json
3. Use of generic functions
Generic functions: singledispatch in Python calls a different implementation depending on the type of the first argument passed in.
from functools import singledispatch

@singledispatch
def age(obj):
    print('Please pass in a legal type of parameter!')

@age.register(int)
def _(age):
    print('I am {} years old.'.format(age))

@age.register(str)
def _(age):
    print('I am {} years old.'.format(age))

age(23)  # int: I am 23 years old.
age('twenty three')  # str: I am twenty three years old.
age(['23'])  # list: Please pass in a legal type of parameter!
(1) A generic concatenation function
from functools import singledispatch

def check_type(func):  # decorator: both arguments must have the same type
    def wrapper(*args):
        arg1, arg2 = args[:2]
        if type(arg1) != type(arg2):
            return '[Error]: The parameter types are different and cannot be spliced!!'
        return func(*args)
    return wrapper

@singledispatch
def add(obj, new_obj):
    raise TypeError

@add.register(str)
@check_type
def _(obj, new_obj):
    obj += new_obj  # string concatenation
    return obj

@add.register(list)
@check_type
def _(obj, new_obj):
    obj.extend(new_obj)  # list concatenation
    return obj

@add.register(dict)
@check_type
def _(obj, new_obj):
    obj.update(new_obj)  # dictionary merge
    return obj

@add.register(tuple)
@check_type
def _(obj, new_obj):
    return (*obj, *new_obj)  # tuple concatenation

print(add('hello', ', world'))  # hello, world
print(add([1, 2, 3], [4, 5, 6]))  # [1, 2, 3, 4, 5, 6]
print(add({'name': 'wangbm'}, {'age': 25}))  # {'name': 'wangbm', 'age': 25}
print(add(('apple', 'huawei'), ('vivo', 'oppo')))  # ('apple', 'huawei', 'vivo', 'oppo')
# A list and a string cannot be concatenated (different types)
print(add([1, 2, 3], '4,5,6'))  # [Error]: The parameter types are different and cannot be spliced!!
4. Context manager
Benefits of context managers: they improve code reuse, elegance, and readability.
(1) Read the file content
file = open(r"C:\Users\HY\Desktop\Autotest\1.txt")  # raw string, so the backslashes are not treated as escapes
print(file.readline())  # read the first line
print(file.read())  # read the rest of the file
file.close()  # close the file handle manually
(2) Use the with keyword to read the file; the file handle is closed automatically when the block ends.
- Context expression: with open('test.txt') as file:
- Context manager: open('test.txt')
- Resource object: file
with open(r"C:\Users\HY\Desktop\Autotest\1.txt") as file:
    print(file.read())
(3) Implement a context manager in a class: if a class defines the __enter__ and __exit__ methods, its instances are context managers.
class Resource():
    def __enter__(self):
        print("-----connect to resource-----")
        return self

    def __exit__(self, exc_type, exc_val, exc_tb):
        print("-----close resource connection------")

    def func(self):
        print("----Execute the logic inside the function")

with Resource() as result:
    result.func()
# output:
# -----connect to resource-----
# ----Execute the logic inside the function
# -----close resource connection------
When writing the __exit__ method, it must accept these three parameters:
- exc_type: the exception type
- exc_val: the exception value
- exc_tb: the exception's traceback information
When the main logic does not raise an exception, all three parameters are None.
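The following sketch shows what __exit__ receives in both cases. The class name Recorder is hypothetical. It also relies on one documented behaviour not mentioned above: returning a true value from __exit__ suppresses the exception.

```python
class Recorder:
    """Hypothetical context manager that records and swallows ZeroDivisionError."""

    def __enter__(self):
        self.seen = None
        return self

    def __exit__(self, exc_type, exc_val, exc_tb):
        # All three arguments are None when the with-body finished normally.
        self.seen = exc_type
        # Returning True tells Python the exception has been handled.
        return exc_type is not None and issubclass(exc_type, ZeroDivisionError)

with Recorder() as ok:
    pass
print(ok.seen)  # None

with Recorder() as failed:
    1 / 0  # suppressed by __exit__, so the program continues
print(failed.seen)  # <class 'ZeroDivisionError'>
```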
(4) Use contextlib to build a context manager (implement the context manager through a function instead of a class)
With contextlib, a generator function can follow the context manager protocol. The example below implements a context manager for opening files (like with open).
import contextlib

@contextlib.contextmanager
def open_func(file_name):
    # __enter__ part
    print("open file:", file_name, "in __enter__")
    file_handler = open(file_name, "r")  # open the file
    yield file_handler  # the generator pauses here while the with block runs
    # __exit__ part
    print("close file:", file_name, "in __exit__")
    file_handler.close()  # close the handle

with open_func(r"C:\Users\HY\Desktop\Autotest\1.txt") as file_in:
    for line in file_in:
        print(line)
The code above only achieves the first purpose of a context manager (managing resources), not the second (handling exceptions): if the with block raises, the code after yield never runs. To handle exceptions, change it to:
import contextlib

@contextlib.contextmanager
def open_func(file_name):
    # __enter__ part
    print("open file:", file_name, "in __enter__")
    file_handler = open(file_name, "r")  # open the file
    try:
        yield file_handler  # the generator pauses here while the with block runs
    except Exception as exc:
        print("the exception was thrown")
    finally:
        # __exit__ part
        print("close file:", file_name, "in __exit__")
        file_handler.close()  # close the handle

with open_func(r"C:\Users\HY\Desktop\Autotest\1.txt") as file_in:
    for line in file_in:
        1 / 0  # raises ZeroDivisionError, which open_func catches
        print(line)
5. Decorator
A decorator is essentially a Python function that lets you add functionality to other functions without changing their code. The return value of a decorator is also a function object.
Decorators are often used for cross-cutting concerns such as logging, performance testing, transaction processing, caching, and permission checks. With decorators, a large amount of identical code that has nothing to do with the function's own logic can be extracted and reused.
How to use a decorator:
First define a decorator (the hat)
Then define your business function or class (the person)
Finally, put the decorator (the hat) on the head of the function (the person)
# ===== Use of decorators =====
# define the decorator
def decorator(func):
    def wrapper(*args, **kw):
        return func(*args, **kw)  # forward the arguments to the decorated function
    return wrapper

# define the business function and decorate it
@decorator
def function():
    print("hello world")
(1) Ordinary decorator
# ===== Use of ordinary decorators =====
# Define the decorator function: logger is the decorator, and the parameter func is the decorated function
def logger(func):
    def wrapper(*args, **kw):
        print('Start executing {} function: '.format(func.__name__))
        # Function logic body: the logic that really needs to be executed
        func(*args, **kw)
        print('Execution completed')
    return wrapper

# The business function
@logger
def add(x, y):
    print(f"{x} + {y}={x + y}")  # both forms print the same text
    print("{} + {}={}".format(x, y, x + y))

add(20, 30)
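Performance testing is one of the use cases named above. The sketch below is one possible timing decorator. The name timed is hypothetical, and functools.wraps is an addition not covered in this article: it copies the name and docstring of the decorated function onto the wrapper. Unlike the logger example, this wrapper also returns the result of the decorated function.

```python
import functools
import time

def timed(func):
    """Hypothetical decorator: report how long func took and pass its result through."""
    @functools.wraps(func)  # keep func.__name__ and func.__doc__ on the wrapper
    def wrapper(*args, **kwargs):
        start = time.perf_counter()
        result = func(*args, **kwargs)
        elapsed = time.perf_counter() - start
        print(f"{func.__name__} took {elapsed:.6f}s")
        return result
    return wrapper

@timed
def add(x, y):
    """Return x + y."""
    return x + y

total = add(20, 30)   # also prints the elapsed time
print(total)          # 50
print(add.__name__)   # add
```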
6. Exceptions
Exception: an error that occurs during execution and, if the program does not handle it, aborts the program and is displayed as an error message (a traceback). All exceptions are instances of exception classes, whose names start with a capital letter.
- SyntaxError: syntax error.
- TypeError: type error, raised when an operation or function is applied to an object of an inappropriate type, such as adding an integer to a string.
- IndexError: index error, most commonly a subscript beyond the bounds of a sequence.
- KeyError: key error, mainly seen with dictionaries, for example when accessing a key that does not exist in the dictionary.
- ValueError: raised when a value is passed in that the callee does not expect even though its type is correct, such as asking for the index of a value that is not in a list.
- AttributeError: attribute error, raised when trying to access an attribute that an object does not have. (For example, a dictionary has a get method but a list does not, so calling get on a list raises this exception.)
- NameError: name error, raised for example when using a variable that has not been assigned or initialized.
- IOError: file I/O error, for example when trying to open a non-existent file for reading. In Python 3, IOError is an alias of OSError, and this case raises its subclass FileNotFoundError.
- StopIteration: iterator exhausted, raised when the next value is requested after the iterator has already returned its last value.
- AssertionError: assertion error, raised when the expression checked by an assert statement is false.
- IndentationError: indentation error.
- ImportError: raised when an import fails, for example because the package name or path is wrong or the package is not installed.
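Several of these exceptions can be triggered on purpose to see when each one occurs. This is a small sketch; the helper exception_name and the sample expressions are assumptions made for illustration.

```python
def exception_name(action):
    """Hypothetical helper: run action() and return the name of the exception it raises."""
    try:
        action()
    except Exception as exc:
        return type(exc).__name__
    return None

samples = {
    "TypeError": lambda: 1 + "1",               # integer plus string
    "IndexError": lambda: [1, 2, 3][5],         # subscript out of range
    "KeyError": lambda: {"a": 1}["b"],          # missing dictionary key
    "ValueError": lambda: [1, 2, 3].index(9),   # value not in the list
    "AttributeError": lambda: [].get("a"),      # list has no get method
    "NameError": lambda: undefined_variable,    # name never assigned
    "StopIteration": lambda: next(iter([])),    # iterator already exhausted
}

for expected, action in samples.items():
    print(expected, "->", exception_name(action))
```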
6.1 Throwing and catching exceptions
Exception handling includes throwing and catching.
Catching means wrapping specific statements in try...except, while raise actively throws an exception.
(1) Throw an exception
Exceptions come from two sources:
- Thrown automatically by the program: for example, 1/0 automatically throws ZeroDivisionError.
- Raised by the developer: use the raise keyword.
import os

def demo_func(filename):
    if not os.path.isfile(filename):
        raise Exception  # raise throws an exception
(2) Catch exceptions
There are four syntaxes for catching exceptions (code A, code B and code C stand for arbitrary statements):
# 1. Only catch the exception, without getting the exception information
try:
    code A
except [EXCEPTION]:
    code B
# 2. Catch the exception and bind it to e, for example to write the exception information to a log; if code A raises, code B runs
try:
    code A
except [EXCEPTION] as e:
    code B
# 3. If code A raises an exception, code B runs; if there is no exception, code C runs
try:
    code A
except [EXCEPTION] as e:
    code B
else:
    code C
# 4. If code A raises an exception, code B runs; code C always runs at the end, whether or not there was an exception
try:
    code A
except [EXCEPTION] as e:
    code B
finally:
    code C
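The else and finally clauses can also be combined in a single try statement. The runnable sketch below records which clauses execute; the function name divide and the steps list are illustrative.

```python
def divide(a, b):
    """Hypothetical helper that records which clauses of the try statement ran."""
    steps = []
    try:
        result = a / b
    except ZeroDivisionError:
        steps.append("except")   # runs only when the division fails
        result = None
    else:
        steps.append("else")     # runs only when no exception occurred
    finally:
        steps.append("finally")  # always runs
    return result, steps

print(divide(6, 3))  # (2.0, ['else', 'finally'])
print(divide(1, 0))  # (None, ['except', 'finally'])
```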
(3) Catch multiple exceptions: except can catch one or more exceptions
1) Each except catches one exception
A try statement may have multiple except clauses to specify handlers for different exceptions, but at most one handler will be executed.
try:
    1 / 0  # raises ZeroDivisionError, because the divisor cannot be 0
except IOError:
    print("IO read and write error")
except FloatingPointError:  # floating-point calculation error
    print("calculation error")
except ZeroDivisionError:  # the exception is caught here; the other except clauses do not run
    print("calculation error")
# Final output: calculation error
2) One except catches multiple exceptions
An except clause can name several exceptions as a parenthesized tuple. If the raised exception matches any one of them, it is caught and the corresponding branch runs.
try:
    1 / 0
except IOError:
    print("IO read and write error")
except (ZeroDivisionError, FloatingPointError):  # the exception is caught here
    print("calculation error")
(4) Custom exceptions
Custom exceptions should inherit from the Exception class, either directly or indirectly.
The custom exception class InputError below indicates that a problem occurred while accepting user input. Standard exception names end with Error, and custom exceptions should follow the same naming convention.
class InputError(Exception):
    def __init__(self, msg):
        self.message = msg

    def __str__(self):
        return self.message

def get_input():
    name = input("Please enter your name:")
    if name == '':
        raise InputError("No input content")
    return name

try:
    get_input()
except InputError as e:
    print(e)
(5) How to turn off automatic exception context chaining
If an exception is raised inside an exception handler or a finally block, Python by default implicitly attaches the previous exception to the new exception as its __context__ attribute. This automatic exception context chaining is enabled by default.
To control this context, add the from keyword to state which exception directly caused the new one. The syntax has one restriction: the expression after from must be another exception class or instance, or None.
try:
    print(1 / 0)  # raises ZeroDivisionError
except Exception as exc:
    # raises RuntimeError and records the ZeroDivisionError from 1/0 as its direct cause
    raise RuntimeError("Something bad happened") from exc
You can also use the with_traceback() method to attach a traceback object (the __traceback__ attribute) to the new exception, which can make the exception information shown in the traceback more complete. The method expects a traceback, not an exception, so pass exc.__traceback__.
try:
    print(1 / 0)
except Exception as exc:
    raise RuntimeError("bad thing").with_traceback(exc.__traceback__)
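To actually switch off the automatic context in the displayed traceback, Python accepts None after from. The sketch below compares the two cases; the function names chained and unchained are hypothetical. With from None the original exception is still stored in __context__, but __suppress_context__ is set so the traceback no longer shows it.

```python
def chained():
    try:
        1 / 0
    except ZeroDivisionError:
        # Implicit chaining: the ZeroDivisionError is stored in __context__.
        raise RuntimeError("Something bad happened")

def unchained():
    try:
        1 / 0
    except ZeroDivisionError:
        # "from None" hides the original exception when the traceback is printed.
        raise RuntimeError("Something bad happened") from None

results = {}
for func in (chained, unchained):
    try:
        func()
    except RuntimeError as err:
        results[func.__name__] = (type(err.__context__).__name__, err.__suppress_context__)

print(results["chained"])    # ('ZeroDivisionError', False)
print(results["unchained"])  # ('ZeroDivisionError', True)
```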
Summary:
- Only wrap statements that may actually throw exceptions, and avoid vague catch-all logic.
- Keep a module's exception classes at a consistent level of abstraction, and wrap lower-level exceptions when necessary.
- Repeated exception handling logic can be simplified by using a context manager.
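As an illustration of the last point, the sketch below moves a repeated try/except into one context manager. The helper ignore_and_log and its log parameter are hypothetical; for the simpler case of only ignoring exceptions, the standard library already provides contextlib.suppress.

```python
import contextlib

@contextlib.contextmanager
def ignore_and_log(*exceptions, log):
    """Hypothetical helper: swallow the given exceptions and record their names in log."""
    try:
        yield
    except exceptions as exc:
        log.append(type(exc).__name__)

errors = []

# The same handling logic is reused without repeating try/except each time.
with ignore_and_log(ZeroDivisionError, log=errors):
    1 / 0

with ignore_and_log(KeyError, log=errors):
    {}["missing"]

print(errors)  # ['ZeroDivisionError', 'KeyError']
```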