Deserialize a JSON string to an object in Python

I have the following string:

{"action":"print","method":"onData","data":"Madan Mohan"}

I want to deserialize it into an object of a class:

class payload
    string action
    string method
    string data

I am using Python 2.6 and 2.7.


You can specialize an encoder for serialization (and, for object creation, a matching object_hook or decoder): http://docs.python.org/2/library/json.html

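import json


class ComplexEncoder(json.JSONEncoder):
    def default(self, obj):
        # Turn complex numbers into a tagged dict; defer everything else
        # to the base class, which raises TypeError for unknown types.
        if isinstance(obj, complex):
            return {"real": obj.real,
                    "imag": obj.imag,
                    "__class__": "complex"}
        return json.JSONEncoder.default(self, obj)


# print(...) with a single argument works on both Python 2 and Python 3
print(json.dumps(2 + 1j, cls=ComplexEncoder))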
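The encoder above only covers the serialization direction. For the way back, a minimal sketch in the spirit of the json documentation could pass an object_hook that looks for the "__class__" marker written by ComplexEncoder. The name as_complex and the sample text are illustrative choices, not part of the original answer.

```python
import json


def as_complex(dct):
    # Rebuild a complex number from the tagged dict written by ComplexEncoder;
    # any other dict is returned unchanged.
    if dct.get("__class__") == "complex":
        return complex(dct["real"], dct["imag"])
    return dct


text = '{"real": 2.0, "imag": 1.0, "__class__": "complex"}'
value = json.loads(text, object_hook=as_complex)
print(value)  # (2+1j)
```

The same pattern extends to user-defined classes: tag the dict when encoding, and dispatch on the tag when decoding.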
>>> j = '{"action": "print", "method": "onData", "data": "Madan Mohan"}'
>>> import json
>>>
>>> class Payload(object):
...     def __init__(self, j):
...         self.__dict__ = json.loads(j)
...
>>> p = Payload(j)
>>>
>>> p.action
'print'
>>> p.method
'onData'
>>> p.data
'Madan Mohan'

To elaborate on Sami's answer:

From the docs:

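import json


class Payload(object):
    def __init__(self, action, method, data):
        self.action = action
        self.method = method
        self.data = data


def as_payload(dct):
    # Raises KeyError immediately if an expected field is missing
    return Payload(dct['action'], dct['method'], dct['data'])


# The string from the question
message = '{"action":"print","method":"onData","data":"Madan Mohan"}'
payload = json.loads(message, object_hook=as_payload)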

My objection to the .__dict__ solution is that, while it does the job and is concise, the Payload class becomes totally generic: it doesn't document its fields.

For example, if the Payload message had an unexpected format, instead of throwing a key not found error when the Payload was created, no error would be generated until the payload was used.
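To make the difference concrete, here is a small sketch that feeds the same unexpected message to both approaches. GenericPayload is a hypothetical name for the .__dict__ variant, and the bad message is made up for illustration.

```python
import json


class GenericPayload(object):
    # The .__dict__ approach: accepts whatever keys the JSON happens to have
    def __init__(self, text):
        self.__dict__ = json.loads(text)


class Payload(object):
    def __init__(self, action, method, data):
        self.action = action
        self.method = method
        self.data = data


def as_payload(dct):
    return Payload(dct["action"], dct["method"], dct["data"])


bad = '{"error": "404", "info": "page not found"}'

late_error = None
generic = GenericPayload(bad)      # construction succeeds
try:
    generic.action                 # the problem only shows up on first use
except AttributeError as exc:
    late_error = exc

early_error = None
try:
    json.loads(bad, object_hook=as_payload)   # fails while decoding
except KeyError as exc:
    early_error = exc

print(type(late_error).__name__, type(early_error).__name__)
```

With the explicit constructor the failure happens at the point where the message is parsed, which is usually where you want to handle it.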

If you want to save lines of code and leave the most flexible solution, we can deserialize the json string to a dynamic object:

import json

# A function object is used only because it accepts arbitrary attributes
p = lambda: None
p.__dict__ = json.loads('{"action": "print", "method": "onData", "data": "Madan Mohan"}')

>>> p.action
u'print'
>>> p.method
u'onData'
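On Python 3.3 and later (so not on the 2.6/2.7 versions the question targets), a similar dynamic object can be sketched with types.SimpleNamespace and object_hook. Unlike the lambda trick, this also converts nested objects. The nested "data" value below is an altered version of the question's string, used only to show the nesting.

```python
import json
from types import SimpleNamespace

# Assumption: "data" is made a nested object here purely for illustration
text = '{"action": "print", "method": "onData", "data": {"name": "Madan Mohan"}}'

# object_hook runs for every JSON object, so nested dicts become namespaces too
p = json.loads(text, object_hook=lambda d: SimpleNamespace(**d))

print(p.action)     # print
print(p.data.name)  # Madan Mohan
```

As with the .__dict__ approach, nothing here checks that the expected fields are present.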

I prefer to add some checking of the fields, e.g. so you can catch errors like when you get invalid json, or not the json you were expecting, so I used namedtuples:

from collections import namedtuple

payload = namedtuple('payload', ['action', 'method', 'data'])


def deserialize_payload(json):
    kwargs = dict([(field, json[field]) for field in payload._fields])
    return payload(**kwargs)

This will give you nice errors when the JSON you are parsing does not match what you expect:

>>> json = {"action":"print","method":"onData","data":"Madan Mohan"}
>>> deserialize_payload(json)
payload(action='print', method='onData', data='Madan Mohan')
>>> badjson = {"error":"404","info":"page not found"}
>>> deserialize_payload(badjson)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "<stdin>", line 2, in deserialize_payload
KeyError: 'action'

If you want to parse nested relations, e.g. '{"parent":{"child":{"name":"henry"}}}', you can still use namedtuples, and even a more reusable function:

Person = namedtuple("Person", ['parent'])
Parent = namedtuple("Parent", ['child'])
Child = namedtuple('Child', ['name'])


def deserialize_json_to_namedtuple(json, namedtuple):
    return namedtuple(**dict([(field, json[field]) for field in namedtuple._fields]))


def deserialize_person(json):
    # Convert from the innermost object outwards
    json['parent']['child'] = deserialize_json_to_namedtuple(json['parent']['child'], Child)
    json['parent'] = deserialize_json_to_namedtuple(json['parent'], Parent)
    person = deserialize_json_to_namedtuple(json, Person)
    return person

giving you

>>> deserialize_person({"parent":{"child":{"name":"henry"}}})
Person(parent=Parent(child=Child(name='henry')))
>>> deserialize_person({"error":"404","info":"page not found"})
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "<stdin>", line 2, in deserialize_person
KeyError: 'parent'

If you are embracing the type hints in Python 3.6, you can do it like this:

from typing import Dict, List


def from_json(data, cls):
    # __origin__ is typing.List / typing.Dict on Python 3.6 and the built-in
    # list / dict on Python 3.7+, so both are checked.
    origin = getattr(cls, '__origin__', None)
    if origin in (list, List):
        item_type = cls.__args__[0]
        return [from_json(value, item_type) for value in data]
    if origin in (dict, Dict):
        val_type = cls.__args__[1]
        # JSON object keys are always strings, so only the values are converted
        return {key: from_json(value, val_type) for key, value in data.items()}
    if not isinstance(data, dict):
        # Scalars (int, str, bool, None) are returned unchanged
        return data
    annotations = getattr(cls, '__annotations__', {})
    instance = cls()
    for name, value in data.items():
        field_type = annotations.get(name)
        if field_type is not None and isinstance(value, (dict, list)):
            setattr(instance, name, from_json(value, field_type))
        else:
            setattr(instance, name, value)
    return instance

Which then allows you to instantiate typed objects like this:

import json
from typing import List


class Bar:
    value: int


class Foo:
    x: int
    bar: List[Bar]


obj: Foo = from_json(json.loads('{"x": 123, "bar":[{"value": 3}, {"value": 2}, {"value": 1}]}'), Foo)
print(obj.x)
print(obj.bar[2].value)

This syntax requires Python 3.6 or later and does not cover all cases, for example typing.Any. But at least it does not pollute the classes that need to be deserialized with extra init/tojson methods.

I thought I would lose all my hair solving this 'challenge'. I faced the following problems:

  1. How to deserialize nested objects, lists etc.
  2. I like constructors with specified fields
  3. I don't like dynamic fields
  4. I don't like hacky solutions

I found a library called jsonpickle which has proven to be really useful.

Installation:

pip install jsonpickle

Here is a code example with writing nested objects to file:

import jsonpickle


class SubObject:
    def __init__(self, sub_name, sub_age):
        self.sub_name = sub_name
        self.sub_age = sub_age


class TestClass:
    def __init__(self, name, age, sub_object):
        self.name = name
        self.age = age
        self.sub_object = sub_object


john_junior = SubObject("John jr.", 2)
john = TestClass("John", 21, john_junior)

file_name = 'JohnWithSon' + '.json'

# Serialize the object graph to a JSON string and write it to a file
john_string = jsonpickle.encode(john)
with open(file_name, 'w') as fp:
    fp.write(john_string)

# Read the JSON back and rebuild the original objects
with open(file_name) as fp:
    john_from_file = fp.read()
test_class_2 = jsonpickle.decode(john_from_file)

print(test_class_2.name)
print(test_class_2.age)
print(test_class_2.sub_object.sub_name)

Output:

John
21
John jr.

Website: http://jsonpickle.github.io/

Hope it saves you time (and hair).

Another way is to simply pass the json string as a dict to the constructor of your object. For example your object is:

class Payload(object):
    def __init__(self, action, method, data, *args, **kwargs):
        self.action = action
        self.method = method
        self.data = data

And the following two lines of python code will construct it:

j = json.loads(yourJsonString)
payload = Payload(**j)

Basically, we first create a generic json object from the json string. Then, we pass the generic json object as a dict to the constructor of the Payload class. The constructor of Payload class interprets the dict as keyword arguments and sets all the appropriate fields.
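A short sketch of how this behaves when the message does not match the constructor exactly. The two sample messages (one with an extra "id" key, one with fields missing) are made up for illustration.

```python
import json


class Payload(object):
    def __init__(self, action, method, data, *args, **kwargs):
        self.action = action
        self.method = method
        self.data = data


# Extra keys are absorbed by **kwargs and silently ignored
extra = json.loads('{"action": "print", "method": "onData", "data": "Madan Mohan", "id": 7}')
p = Payload(**extra)
print(p.data)

# Missing keys fail right away: the constructor is missing required arguments
error = None
missing = json.loads('{"action": "print"}')
try:
    Payload(**missing)
except TypeError as exc:
    error = exc
print(type(error).__name__)
```

So the constructor signature doubles as a lightweight check on required fields, while *args and **kwargs make it tolerant of additions.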

In recent versions of python, you can use marshmallow-dataclass:

from marshmallow_dataclass import dataclass


@dataclass
class Payload:
    action: str
    method: str
    data: str


Payload.Schema().load({"action":"print","method":"onData","data":"Madan Mohan"})

While Alex's answer points us to a good technique, the implementation that he gave runs into a problem when we have nested objects.

class more_info
    string status


class payload
    string action
    string method
    string data
    class more_info

with the below code:

def as_more_info(dct):
    return MoreInfo(dct['status'])


def as_payload(dct):
    return Payload(dct['action'], dct['method'], dct['data'], as_more_info(dct['more_info']))


payload = json.loads(message, object_hook=as_payload)

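Because object_hook is called for every JSON object in the message, the nested more_info dict is also passed to as_payload as if it were a payload itself, which leads to parsing errors (here a KeyError, since it has no 'action' key).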

From the official docs:

object_hook is an optional function that will be called with the result of any object literal decoded (a dict). The return value of object_hook will be used instead of the dict.
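A quick way to see this behaviour is a hook that only records what it is called with. The record function and the sample message below are illustrative, not part of the original answer.

```python
import json

seen = []


def record(dct):
    # Remember the keys of every dict the decoder hands to the hook
    seen.append(sorted(dct))
    return dct


message = '{"action": "print", "method": "onData", "data": "x", "more_info": {"status": "ok"}}'
json.loads(message, object_hook=record)

print(seen)
# [['status'], ['action', 'data', 'method', 'more_info']]
```

The inner object reaches the hook first, and the outer dict then already contains whatever the hook returned for it. A single hook therefore has to recognise which kind of object it has been given.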

Hence, I would prefer to propose the following solution instead:

import json


class MoreInfo(object):
    def __init__(self, status):
        self.status = status

    @staticmethod
    def fromJson(mapping):
        if mapping is None:
            return None

        return MoreInfo(
            mapping.get('status')
        )


class Payload(object):
    def __init__(self, action, method, data, more_info):
        self.action = action
        self.method = method
        self.data = data
        self.more_info = more_info

    @staticmethod
    def fromJson(mapping):
        if mapping is None:
            return None

        return Payload(
            mapping.get('action'),
            mapping.get('method'),
            mapping.get('data'),
            MoreInfo.fromJson(mapping.get('more_info'))
        )


def toJson(obj, **kwargs):
    # Fall back to each object's __dict__ for types json cannot serialize
    return json.dumps(obj, default=lambda j: j.__dict__, **kwargs)


def fromJson(msg, cls, **kwargs):
    return cls.fromJson(json.loads(msg, **kwargs))


info = MoreInfo('ok')
payload = Payload('print', 'onData', 'better_solution', info)
pl_json = toJson(payload)
l1 = fromJson(pl_json, Payload)


There are different ways to deserialize a JSON string to an object. All of the methods above are acceptable, but I suggest using a library to avoid duplicate-key issues and to handle serializing/deserializing nested objects.

Pykson is a JSON serializer and deserializer for Python which can help you achieve this. Simply define the Payload class model as a JsonObject, then use Pykson to convert the JSON string to an object.

from pykson import Pykson, JsonObject, StringField


# JsonObject is imported by name above, so it is used directly here
class Payload(JsonObject):
    action = StringField()
    method = StringField()
    data = StringField()


json_text = '{"action":"print","method":"onData","data":"Madan Mohan"}'
payload = Pykson.from_json(json_text, Payload)

pydantic is an increasingly popular library for python 3.6+ projects. It mainly does data validation and settings management using type hints.

A basic example using different types:

from pydantic import BaseModel


class ClassicBar(BaseModel):
    count_drinks: int
    is_open: bool


data = {'count_drinks': '226', 'is_open': 'False'}
cb = ClassicBar(**data)
>>> cb
ClassicBar(count_drinks=226, is_open=False)

What I love about the lib is that you get a lot of goodies for free, like

>>> cb.json()
'{"count_drinks": 226, "is_open": false}'
>>> cb.dict()
{'count_drinks': 226, 'is_open': False}