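Convert a string representation of a dictionary to a dictionary

How can I convert the str representation of a dict, such as the following string, into a dict?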

s = "{'muffin' : 'lolz', 'foo' : 'kitty'}"

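I'd prefer not to use eval. What else can I use?

The main reason is that one of my coworkers wrote classes that convert all input into strings. I'm not in the mood to go and modify his classes to deal with this.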

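If the string can always be trusted, you could use eval (or literal_eval as suggested; it is safe no matter what the string is). Otherwise you need a parser. A JSON parser (such as simplejson) would work if he only ever stores content that fits the JSON scheme.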

You can use the built-in ast.literal_eval:

>>> import ast
>>> ast.literal_eval("{'muffin' : 'lolz', 'foo' : 'kitty'}")
{'muffin': 'lolz', 'foo': 'kitty'}

This is safer than using eval. As its own docs say:

>>> help(ast.literal_eval)
Help on function literal_eval in module ast:

literal_eval(node_or_string)
    Safely evaluate an expression node or a string containing a Python
    expression.  The string or node provided may only consist of the following
    Python literal structures: strings, numbers, tuples, lists, dicts, booleans,
    and None.

For example:

>>> eval("shutil.rmtree('mongo')")
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "<string>", line 1, in <module>
  File "/opt/Python-2.6.1/lib/python2.6/shutil.py", line 208, in rmtree
    onerror(os.listdir, path, sys.exc_info())
  File "/opt/Python-2.6.1/lib/python2.6/shutil.py", line 206, in rmtree
    names = os.listdir(path)
OSError: [Errno 2] No such file or directory: 'mongo'
>>> ast.literal_eval("shutil.rmtree('mongo')")
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/opt/Python-2.6.1/lib/python2.6/ast.py", line 68, in literal_eval
    return _convert(node_or_string)
  File "/opt/Python-2.6.1/lib/python2.6/ast.py", line 67, in _convert
    raise ValueError('malformed string')
ValueError: malformed string
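If the input might not be a dict at all, one option is to wrap the call in a small helper. The sketch below is only an illustration: the name parse_dict_literal is made up for this example, and it assumes that catching ValueError and SyntaxError is enough for your inputs (very large or deeply nested strings can raise other errors).

```python
import ast


def parse_dict_literal(text):
    """Hypothetical helper: parse a string that should hold a Python dict literal."""
    try:
        value = ast.literal_eval(text)
    except (ValueError, SyntaxError) as exc:
        # literal_eval rejects anything that is not a plain literal,
        # including function calls such as shutil.rmtree(...).
        raise ValueError("not a valid Python literal: %r" % (text,)) from exc
    if not isinstance(value, dict):
        raise ValueError("expected a dict literal, got %s" % type(value).__name__)
    return value


d = parse_dict_literal("{'muffin' : 'lolz', 'foo' : 'kitty'}")
print(d["muffin"])
```

Checking the result type matters because literal_eval will happily return a list, tuple, number, or string if that is what the text contains.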

https://docs.python.org/3.8/library/json.html

JSON can solve this problem, though its decoder wants double quotes around keys and values. If you don't mind a replace hack...

import json
s = "{'muffin' : 'lolz', 'foo' : 'kitty'}"
json_acceptable_string = s.replace("'", "\"")
d = json.loads(json_acceptable_string)
# d = {u'muffin': u'lolz', u'foo': u'kitty'}

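NOTE that if you have single quotes as part of your keys or values, this will fail due to improper character replacement. This solution is only recommended if you have a strong aversion to the eval solution.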

More about JSON single quotes: jQuery.parseJSON throws “Invalid JSON” error due to escaped single quote in JSON
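To make the failure mode concrete, here is a small sketch using a made-up value that contains an apostrophe. The sample string is an assumption for illustration and is not from the question.

```python
import ast
import json

# Hypothetical input: one value contains an apostrophe.
s = "{'muffin': \"it's\", 'foo': 'kitty'}"

# The replace hack also rewrites the apostrophe inside the value,
# which leaves a stray double quote and makes the text invalid JSON.
try:
    json.loads(s.replace("'", '"'))
    replace_hack_worked = True
except json.JSONDecodeError:
    replace_hack_worked = False

# ast.literal_eval reads the original string as a Python literal instead.
d = ast.literal_eval(s)
print(replace_hack_worked, d)
```

The replacement is purely textual, so it cannot tell a quote that delimits a string from one that is part of the data.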

Use json.loads:

>>> import json
>>> h = '{"foo":"bar", "foo2":"bar2"}'
>>> d = json.loads(h)
>>> d
{u'foo': u'bar', u'foo2': u'bar2'}
>>> type(d)
<type 'dict'>

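Use json. The ast library consumes a lot of memory and is slower. I have a process that needs to read a 156 MB text file. With ast, converting to a dictionary took 5 minutes; with json it took 1 minute and used 60% less memory!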
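If some inputs are valid JSON and others are Python-style literals, one possible compromise is to try the JSON path first and fall back to ast.literal_eval. This is a sketch only: loads_dict is a hypothetical name, and the speed and memory figures above are that answer's own measurements, not something this snippet verifies.

```python
import ast
import json


def loads_dict(text):
    """Hypothetical helper: try json.loads first, then fall back to Python literals."""
    try:
        return json.loads(text)
    except ValueError:  # json.JSONDecodeError is a subclass of ValueError
        # Single-quoted keys and values are not valid JSON, so they end up here.
        return ast.literal_eval(text)


print(loads_dict('{"foo":"bar", "foo2":"bar2"}'))
print(loads_dict("{'muffin' : 'lolz', 'foo' : 'kitty'}"))
```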

To take the OP's example:

s = "{'muffin' : 'lolz', 'foo' : 'kitty'}"

We can use YAML to deal with this kind of non-standard JSON in a string:

>>> import yaml
>>> s = "{'muffin' : 'lolz', 'foo' : 'kitty'}"
>>> s
"{'muffin' : 'lolz', 'foo' : 'kitty'}"
>>> yaml.load(s)
{'muffin': 'lolz', 'foo': 'kitty'}

string = "{'server1':'value','server2':'value'}"

# Now removing { and }
s = string.replace("{", "")
finalstring = s.replace("}", "")

# Splitting the string based on , we get key value pairs
pairs = finalstring.split(",")

dictionary = {}
for i in pairs:
    # Get key value pairs separately to store in dictionary
    keyvalue = i.split(":")

    # Removing the quotes around the key and the value
    m = keyvalue[0].strip('\'')
    m = m.replace("\"", "")
    dictionary[m] = keyvalue[1].strip('"\'')

print(dictionary)

Without using any libs (python2):

dict_format_string = "{'1':'one', '2' : 'two'}"
d = {}
elems  = filter(str.isalnum,dict_format_string.split("'"))
values = elems[1::2]
keys   = elems[0::2]
d.update(zip(keys,values))

NOTE: Because split("'") is hardcoded, this works only for strings where the data is single quoted.

NOTE2: In python3 you need to wrap filter() in list() to get a list.
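For reference, a Python 3 rendering of the same idea might look like the sketch below. It keeps the answer's approach unchanged apart from the list() wrapper mentioned in NOTE2.

```python
dict_format_string = "{'1':'one', '2' : 'two'}"

# In Python 3, filter() returns an iterator, so list() is needed before slicing.
elems = list(filter(str.isalnum, dict_format_string.split("'")))

keys = elems[0::2]    # every even-indexed token is a key
values = elems[1::2]  # every odd-indexed token is a value
d = dict(zip(keys, values))
print(d)  # {'1': 'one', '2': 'two'}
```

Because str.isalnum rejects anything that is not purely alphanumeric, a key or value containing a space or punctuation would be filtered out and the remaining pairs would shift, so this only suits very simple data.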

To summarize:

import ast, yaml, json, timeit

descs=['short string','long string']
strings=['{"809001":2,"848545":2,"565828":1}','{"2979":1,"30581":1,"7296":1,"127256":1,"18803":2,"41619":1,"41312":1,"16837":1,"7253":1,"70075":1,"3453":1,"4126":1,"23599":1,"11465":3,"19172":1,"4019":1,"4775":1,"64225":1,"3235":2,"15593":1,"7528":1,"176840":1,"40022":1,"152854":1,"9878":1,"16156":1,"6512":1,"4138":1,"11090":1,"12259":1,"4934":1,"65581":1,"9747":2,"18290":1,"107981":1,"459762":1,"23177":1,"23246":1,"3591":1,"3671":1,"5767":1,"3930":1,"89507":2,"19293":1,"92797":1,"32444":2,"70089":1,"46549":1,"30988":1,"4613":1,"14042":1,"26298":1,"222972":1,"2982":1,"3932":1,"11134":1,"3084":1,"6516":1,"486617":1,"14475":2,"2127":1,"51359":1,"2662":1,"4121":1,"53848":2,"552967":1,"204081":1,"5675":2,"32433":1,"92448":1}']
# Note: yaml.load without a Loader argument only works on PyYAML versions before 6.0.
funcs=[json.loads,eval,ast.literal_eval,yaml.load]

for desc,string in zip(descs,strings):
    print('***',desc,'***')
    print('')
    for func in funcs:
        print(func.__module__+' '+func.__name__+':')
        # %timeit is an IPython magic, so run this in IPython or Jupyter.
        %timeit func(string)
        print('')

Results:

*** short string ***

json loads:
4.47 µs ± 33.4 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)
builtins eval:
24.1 µs ± 163 ns per loop (mean ± std. dev. of 7 runs, 10000 loops each)
ast literal_eval:
30.4 µs ± 299 ns per loop (mean ± std. dev. of 7 runs, 10000 loops each)
yaml load:
504 µs ± 1.29 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)

*** long string ***

json loads:
29.6 µs ± 230 ns per loop (mean ± std. dev. of 7 runs, 10000 loops each)
builtins eval:
219 µs ± 3.92 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
ast literal_eval:
331 µs ± 1.89 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
yaml load:
9.02 ms ± 92.2 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)

Conclusion: prefer json.loads
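The benchmark above relies on the IPython %timeit magic. Outside IPython, a rough equivalent can be sketched with the standard timeit module. This version only covers the short string and leaves out YAML so that it runs with the standard library alone; the loop counts are arbitrary choices, and the numbers it prints will differ from the ones quoted above.

```python
import ast
import json
import timeit

string = '{"809001":2,"848545":2,"565828":1}'
funcs = [json.loads, eval, ast.literal_eval]

results = {}
for func in funcs:
    # Time 1000 calls, repeat that 5 times, and keep the fastest run.
    best = min(timeit.repeat(lambda: func(string), number=1000, repeat=5))
    results[func.__name__] = best / 1000  # seconds per call
    print("%s %s: %.2f us per call"
          % (func.__module__, func.__name__, results[func.__name__] * 1e6))
```

Taking the minimum of several repeats is a common way to reduce noise from other processes, though it reports a best case rather than a mean.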

Optimized version of Siva Kameswara Rao Munipalle's code:

s = s.replace("{", "").replace("}", "").split(",")
dictionary = {}
for i in s:
    dictionary[i.split(":")[0].strip('\'').replace("\"", "")] = i.split(":")[1].strip('"\'')
print(dictionary)

My string doesn't have quotes inside:
s = 'Date: 2022-11-29T10:57:01.024Z, Size: 910.11 KB'

My solution is to use str.split:
{k:v for k, v in map(lambda d: d.split(': '), s.split(', '))}
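A slightly more defensive variant, offered as a sketch, limits each item to a single split so that a value which itself contains ': ' does not break the unpacking. It assumes the same ', ' and ': ' separators as the example string.

```python
s = 'Date: 2022-11-29T10:57:01.024Z, Size: 910.11 KB'

# Split into "key: value" items, then split each item only at the first ': '.
d = dict(item.split(': ', 1) for item in s.split(', '))
print(d)  # {'Date': '2022-11-29T10:57:01.024Z', 'Size': '910.11 KB'}
```

All values stay strings here; converting the date or the size into richer types would be a separate step.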