marshmallow official site
Define a test class:
    import datetime as dt

    class User:
        def __init__(self, name, email):
            self.name = name
            self.email = email
            self.created_time = dt.datetime.now()
1. Schema
Converting between objects and JSON data (that is, serializing and deserializing) requires an intermediate carrier. In marshmallow that carrier is the Schema, which can also be used to validate data.
    # A simple Schema
    from marshmallow import Schema, fields

    class UserSchema(Schema):
        name = fields.String()
        email = fields.Email()
        created_time = fields.DateTime()
2. Serializing
Use the schema's dump() method to serialize an object; it returns the data as a dict.
The schema's dumps() method also serializes the object, but returns a JSON-encoded string.
    user = User("lhh", "lhh@example.com")  # placeholder address
    schema = UserSchema()

    res = schema.dump(user)
    print(res)
    # dict, e.g. {'name': 'lhh', 'email': 'lhh@example.com', 'created_time': '2021-05-28T20:43:08.946112'}

    res2 = schema.dumps(user)
    print(res2)
    # JSON string, e.g. {"name": "lhh", "email": "lhh@example.com", "created_time": "2021-05-28T20:45:17.418739"}
3. Filter output
When you do not need to output all the fields, pass the only parameter when instantiating the Schema to specify which fields to output:
    summary_schema = UserSchema(only={"name", "email"})
    res = summary_schema.dump(user)
    print(res)
4. Deserializing
The schema's load() method is the inverse of dump(): it deserializes a dict, converting the input into application-level data structures and validating it along the way.
Similarly, loads() deserializes a JSON-encoded string.
By default, load() returns a dictionary and raises a ValidationError when an input value does not match its field type.
    user_data = {
        "name": "lhh",
        "email": "lhh@example.com",  # placeholder address
        "created_time": "2021-05-28 20:45:17.418739",
    }
    schema = UserSchema()
    res = schema.load(user_data)
    print(res)
    # {'name': 'lhh', 'email': 'lhh@example.com', 'created_time': datetime.datetime(2021, 5, 28, 20, 45, 17, 418739)}
For deserialization, it usually makes more sense to turn the incoming dict into an object. marshmallow does not do this for you: write a method that builds the object from the dict, and register it with the post_load decorator.
    from marshmallow import Schema, fields, post_load

    class UserSchema(Schema):
        name = fields.String()
        email = fields.Email()
        created_time = fields.DateTime()

        # marshmallow 3 passes extra keyword arguments (such as many and
        # partial) to hooks, so the method must accept **kwargs.
        @post_load
        def make_user(self, data, **kwargs):
            return User(**data)
With this in place, each call to load() returns a User object built by make_user.
    user_data = {
        "name": "lhh",
        "email": "lhh@example.com",  # placeholder address
    }
    schema = UserSchema()
    res = schema.load(user_data)
    print(res)
    # e.g. <__main__.User object at 0x0000027BE9678128>
    user = res
    print("name: {} email: {}".format(user.name, user.email))
    # name: lhh email: lhh@example.com
5. Handling collections of multiple objects
An iterable collection of objects can also be serialized or deserialized directly. Set many=True when instantiating the Schema class.
Alternatively, pass many=True when calling dump() instead of setting it at instantiation.
    user1 = User(name="lhh1", email="lhh1@example.com")  # placeholder addresses
    user2 = User(name="lhh2", email="lhh2@example.com")
    users = [user1, user2]

    # The first method
    schema = UserSchema(many=True)
    res = schema.dump(users)
    print(res)

    # The second method
    schema = UserSchema()
    res = schema.dump(users, many=True)
    print(res)
6. Validation
When invalid data is passed to Schema.load() or Schema.loads(), a ValidationError is raised. Its messages attribute holds the validation error messages, and its valid_data attribute holds the data that passed validation.
Catch this exception to handle the errors. ValidationError must be imported first.
    from marshmallow import Schema, fields, ValidationError

    class UserSchema(Schema):
        name = fields.String()
        email = fields.Email()
        created_time = fields.DateTime()

    try:
        res = UserSchema().load({"name": "lhh", "email": "lhh"})
    except ValidationError as e:
        print(f"Error message: {e.messages} Valid data: {e.valid_data}")

    # When validating a collection, the error messages are keyed by the
    # index of the invalid item.
    user_data = [
        {'email': 'lhh@example.com', 'name': 'lhh'},  # placeholder address
        {'email': 'invalid', 'name': 'Invalid'},
        {'name': 'wcy'},
        {'email': 'wcy@example.com'},  # placeholder address
    ]
    try:
        schema = UserSchema(many=True)
        res = schema.load(user_data)
        print(res)
    except ValidationError as e:
        print("Error message: {} Valid data: {}".format(e.messages, e.valid_data))
As the output above shows, the invalid email is reported, but fields that were not passed in are not checked: by default, no field is required.
To make a field mandatory, set required=True on it in the Schema.
6.1 Custom validation
When writing a Schema class, you can pass the validate parameter to the built-in fields to customize the validation logic. The value of validate can be a function, an anonymous function (lambda), or an object that defines __call__.
    from marshmallow import Schema, fields, ValidationError

    class UserSchema(Schema):
        # In marshmallow 3, a validator that returns False fails validation
        name = fields.String(required=True, validate=lambda s: len(s) < 6)
        email = fields.Email()
        created_time = fields.DateTime()

    user_data = {"name": "InvalidName", "email": "lhh@example.com"}  # placeholder address
    try:
        res = UserSchema().load(user_data)
    except ValidationError as e:
        print(e.messages)
To customize the error message, raise a ValidationError inside the validation function:
    from marshmallow import Schema, fields, ValidationError

    def validate_name(name):
        if len(name) <= 2:
            raise ValidationError("name must be longer than 2 characters")
        if len(name) >= 6:
            raise ValidationError("name must be shorter than 6 characters")

    class UserSchema(Schema):
        name = fields.String(required=True, validate=validate_name)
        email = fields.Email()
        created_time = fields.DateTime()

    user_data = {"name": "InvalidName", "email": "lhh@example.com"}  # placeholder address
    try:
        res = UserSchema().load(user_data)
    except ValidationError as e:
        print(e.messages)
NOTE: Validation only happens during deserialization! Data is not validated during serialization.
6.2 Defining validation methods on the Schema
A validation function can also be written as a method of the Schema and registered with the validates decorator.
    from marshmallow import Schema, fields, ValidationError, validates

    class UserSchema(Schema):
        name = fields.String(required=True)
        email = fields.Email()
        created_time = fields.DateTime()

        # **kwargs tolerates extra keyword arguments that some marshmallow
        # versions pass to validator methods.
        @validates("name")
        def validate_name(self, value, **kwargs):
            if len(value) <= 2:
                raise ValidationError("name must be longer than 2 characters")
            if len(value) >= 6:
                raise ValidationError("name must be shorter than 6 characters")

    user_data = {"name": "InvalidName", "email": "lhh@example.com"}  # placeholder address
    try:
        res = UserSchema().load(user_data)
    except ValidationError as e:
        print(e.messages)
6.3 Required fields
Customizing the required-field error message:
The message reported when a field with required=True is missing can be customized through the error_messages parameter:
    from marshmallow import Schema, fields, ValidationError, validates

    class UserSchema(Schema):
        name = fields.String(required=True, error_messages={"required": "The name field is required"})
        email = fields.Email()
        created_time = fields.DateTime()

        # **kwargs tolerates extra keyword arguments that some marshmallow
        # versions pass to validator methods.
        @validates("name")
        def validate_name(self, value, **kwargs):
            if len(value) <= 2:
                raise ValidationError("name must be longer than 2 characters")
            if len(value) >= 6:
                raise ValidationError("name must be shorter than 6 characters")

    user_data = {"email": "lhh@example.com"}  # placeholder address
    try:
        res = UserSchema().load(user_data)
    except ValidationError as e:
        print(e.messages)
Ignoring some fields:
Even when a field is declared with required=True, it can still be skipped when loading data by using the partial parameter.
    from marshmallow import Schema, fields

    class UserSchema(Schema):
        name = fields.String(required=True)
        age = fields.Integer(required=True)

    # Method 1: pass a tuple of field names as the partial parameter of
    # load(); the listed fields are not treated as required.
    schema = UserSchema()
    res = schema.load({"age": 42}, partial=("name",))
    print(res)
    # {'age': 42}

    # Method 2: set partial=True directly
    schema = UserSchema()
    res = schema.load({"age": 42}, partial=True)
    print(res)
    # {'age': 42}
The two methods give the same result here, but they differ: method 1 skips the required check only for the fields named in partial, while method 2 skips it for every field that is missing from the input.
6.4 Handling of unknown fields
By default, if an unknown field (a field that is not in the Schema) is passed in, executing the load() method will throw a ValidationError exception. This behavior can be modified by changing the unknown option.
unknown can take three values:
- EXCLUDE: discard the unknown fields
- INCLUDE: accept and include the unknown fields
- RAISE: raise a ValidationError if there are any unknown fields
The default behavior is RAISE. There are two ways to change it:
Method 1: Set it in class Meta when writing the Schema class
    from marshmallow import EXCLUDE, Schema, fields

    class UserSchema(Schema):
        name = fields.String(required=True, error_messages={"required": "The name field must be filled in"})
        email = fields.Email()
        created_time = fields.DateTime()

        class Meta:
            unknown = EXCLUDE
Method 2: Pass the unknown parameter when instantiating the Schema class
    from marshmallow import EXCLUDE, Schema, fields

    class UserSchema(Schema):
        name = fields.Str(required=True, error_messages={"required": "The name field must be filled in"})
        email = fields.Email()
        created_time = fields.DateTime()

    schema = UserSchema(unknown=EXCLUDE)
7. Schema.validate (validating data)
If you only want to use a Schema to validate data, without deserializing it into objects, use Schema.validate().
schema.validate() validates the data and returns a dict of error messages. If there are no errors it returns an empty dict, so the return value tells you whether validation passed.
    from marshmallow import Schema, fields

    class UserSchema(Schema):
        name = fields.Str(required=True, error_messages={"required": "The name field must be filled in"})
        email = fields.Email()
        created_time = fields.DateTime()

    user = {"name": "lhh", "email": "2432783449"}
    schema = UserSchema()
    res = schema.validate(user)
    print(res)
    # {'email': ['Not a valid email address.']}

    user = {"name": "lhh", "email": "lhh@example.com"}  # placeholder address
    schema = UserSchema()
    res = schema.validate(user)
    print(res)
    # {}
8. Specifying serialization/deserialization keys
The data_key parameter sets the external key name of a field, and applies to both serialization and deserialization.
    from marshmallow import Schema, fields

    class UserSchema(Schema):
        name = fields.Str(data_key="name_123")
        email = fields.Email(data_key="email_123")
        created_time = fields.DateTime()

    # dump() also accepts a plain dict; created_time is absent from the
    # input, so it is omitted from the output.
    user = {"name": "lhh", "email": "lhh@example.com"}  # placeholder address
    schema = UserSchema()
    res = schema.dump(user)
    print(res)
    # {'name_123': 'lhh', 'email_123': 'lhh@example.com'}

    user = {"name_123": "lhh", "email_123": "lhh@example.com"}
    schema = UserSchema()
    res = schema.load(user)
    print(res)
    # {'name': 'lhh', 'email': 'lhh@example.com'}
9. Refactoring: implicit field creation
When a Schema has many attributes, specifying a field type for each one can be repetitive, especially when many of the attributes are already native Python data types. The class Meta options let you list the attributes to be serialized, and marshmallow chooses an appropriate field type based on the type of each attribute.
    # Refactor the Schema
    class UserSchema(Schema):
        uppername = fields.Function(lambda obj: obj.name.upper())

        class Meta:
            fields = ("name", "email", "created_time", "uppername")
In the code above, name is automatically formatted as a String and created_time as a DateTime.
Use the additional option instead if you want to list the fields to include in addition to those explicitly declared:
    class UserSchema(Schema):
        uppername = fields.Function(lambda obj: obj.name.upper())

        class Meta:
            # No need to include 'uppername'
            additional = ("name", "email", "created_time")
10. Ordering output
For some use cases, it may be useful to maintain the field order of the serialized output. To enable ordering, set the ordered option to True in class Meta. This instructs marshmallow to serialize the data into a collections.OrderedDict.
    from collections import OrderedDict
    import datetime as dt
    from marshmallow import fields, Schema

    class User:
        def __init__(self, name, email):
            self.name = name
            self.email = email
            self.created_time = dt.datetime.now()

    class UserSchema(Schema):
        uppername = fields.Function(lambda obj: obj.name.upper())

        class Meta:
            fields = ("name", "email", "created_time", "uppername")
            ordered = True

    user = User("lhh", "lhh@example.com")  # placeholder address
    schema = UserSchema()
    res = schema.dump(user)
    print(isinstance(res, OrderedDict))  # check the type of the result
    # True
    print(res)
    # e.g. OrderedDict([('name', 'lhh'), ('email', 'lhh@example.com'), ('created_time', '2021-05-29T09:40:46.351382'), ('uppername', 'LHH')])
11. “Read-only” and “Write-only” fields
In the context of Web API, the serialization parameter dump_only and the deserialization parameter load_only are conceptually equivalent to read-only and write-only fields, respectively.
    from marshmallow import Schema, fields

    class UserSchema(Schema):
        name = fields.Str()
        password = fields.Str(load_only=True)  # equivalent to write-only
        created_at = fields.DateTime(dump_only=True)  # equivalent to read-only
When loading, dump_only fields are treated as unknown fields. If the unknown option is set to INCLUDE, the values for keys corresponding to these fields are therefore loaded without validation.
12. Specifying default values for serialization/deserialization
If a value is missing during serialization, default specifies the value to use instead. If a value is missing during deserialization, missing specifies it. (marshmallow 3.13 renamed these parameters to dump_default and load_default and deprecated the older names.)
    import uuid
    import datetime as dt
    from marshmallow import fields, Schema

    class UserSchema(Schema):
        id = fields.UUID(missing=uuid.uuid1)
        birthday = fields.DateTime(default=dt.datetime(1996, 11, 17))

    # Serialization
    res = UserSchema().dump({})
    print(res)
    # {'birthday': '1996-11-17T00:00:00'}

    # Deserialization
    res = UserSchema().load({'birthday': '1996-11-17T00:00:00'})
    print(res)
    # e.g. {'id': UUID('751d95db-c020-11eb-83eb-001a7dda7115'), 'birthday': datetime.datetime(1996, 11, 17, 0, 0)}