Existence of mutable named tuple in Python?


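It seems the answer to this question is no.

The code below comes very close, but it is technically not mutable: it creates a new namedtuple() instance with an updated x value: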

from collections import namedtuple

Point = namedtuple('Point', ['x', 'y'])
p = Point(0, 0)
p = p._replace(x=10)  # builds a new Point and rebinds the name p to it
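To make the "new instance" point concrete, here is a small sketch that keeps both objects around. The names original and updated are only illustrative.

```python
from collections import namedtuple

Point = namedtuple('Point', ['x', 'y'])

original = Point(0, 0)
updated = original._replace(x=10)   # returns a copy with x changed

print(original)             # Point(x=0, y=0)
print(updated)              # Point(x=10, y=0)
print(updated is original)  # False

# Direct assignment is what a namedtuple refuses to do.
try:
    original.x = 10
except AttributeError as exc:
    print("cannot assign:", exc)
```

Any other reference to the old tuple still sees the old values, which is the practical difference from a truly mutable object.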

On the other hand, you can create a simple class using __slots__ that should work well for frequently updating class instance attributes:

class Point:
    __slots__ = ['x', 'y']
    def __init__(self, x, y):
        self.x = x
        self.y = y

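To add to this answer, I think __slots__ is a good fit here because it is memory efficient when you create lots of class instances. The only downside is that you cannot add new attributes to an instance beyond those listed in __slots__.

Here is one relevant thread that illustrates the memory efficiency - Dictionary vs Object - which is more efficient and why?

The content quoted in the answer of that thread explains very succinctly why __slots__ is more memory efficient - Python slots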
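A rough sketch of both effects mentioned above, comparing a slotted class with an ordinary one. SlottedPoint and PlainPoint are names made up for this illustration.

```python
class SlottedPoint:
    __slots__ = ('x', 'y')

    def __init__(self, x, y):
        self.x = x
        self.y = y


class PlainPoint:
    def __init__(self, x, y):
        self.x = x
        self.y = y


slotted = SlottedPoint(0, 0)
plain = PlainPoint(0, 0)

slotted.x = 10                        # existing fields stay mutable
print(hasattr(slotted, '__dict__'))   # False: no per-instance dict to pay for
print(hasattr(plain, '__dict__'))     # True

# The trade-off: attributes outside __slots__ are rejected.
try:
    slotted.z = 1
except AttributeError as exc:
    print("rejected:", exc)
```

The missing per-instance __dict__ is where the memory saving comes from; the exact number of bytes saved depends on the Python version, so none is quoted here.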


Tuples are by definition immutable.

You can, however, make a dictionary subclass whose items can be accessed as attributes with dot notation:

In [1]: %cpaste
Pasting code; enter '--' alone on the line to stop or use Ctrl-D.
:class AttrDict(dict):
:
:    def __getattr__(self, name):
:        return self[name]
:
:    def __setattr__(self, name, value):
:        self[name] = value
:--


In [2]: test = AttrDict()


In [3]: test.a = 1


In [4]: test.b = True


In [5]: test
Out[5]: {'a': 1, 'b': True}
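One caveat with the class above: a missing key surfaces as KeyError rather than AttributeError, which confuses hasattr() and getattr() with a default. A possible variant, offered only as a sketch, translates the exception:

```python
class AttrDict(dict):
    def __getattr__(self, name):
        # Only called when normal attribute lookup fails.
        try:
            return self[name]
        except KeyError:
            raise AttributeError(name) from None

    def __setattr__(self, name, value):
        self[name] = value


test = AttrDict(a=1)
test.b = True
print(test)                          # {'a': 1, 'b': True}
print(hasattr(test, 'missing'))      # False
print(getattr(test, 'missing', 0))   # 0
```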


If you want behavior similar to namedtuple but mutable, try namedlist.

Note that in order to be mutable it cannot be a tuple.


Let's implement this with dynamic type creation:

import copy

def namedgroup(typename, fieldnames):

    def init(self, **kwargs):
        attrs = {k: None for k in self._attrs_}
        for k in kwargs:
            if k in self._attrs_:
                attrs[k] = kwargs[k]
            else:
                raise AttributeError('Invalid Field')
        self.__dict__.update(attrs)

    def getattribute(self, attr):
        if attr.startswith("_") or attr in self._attrs_:
            return object.__getattribute__(self, attr)
        else:
            raise AttributeError('Invalid Field')

    def setattr(self, attr, value):
        if attr in self._attrs_:
            object.__setattr__(self, attr, value)
        else:
            raise AttributeError('Invalid Field')

    def rep(self):
        d = ["{}={}".format(v, self.__dict__[v]) for v in self._attrs_]
        return self._typename_ + '(' + ', '.join(d) + ')'

    def iterate(self):
        # A generator finishes by returning; an explicit `raise StopIteration`
        # here would turn into a RuntimeError on Python 3.7+ (PEP 479).
        for x in self._attrs_:
            yield self.__dict__[x]

    def setitem(self, *args, **kwargs):
        return self.__dict__.__setitem__(*args, **kwargs)

    def getitem(self, *args, **kwargs):
        return self.__dict__.__getitem__(*args, **kwargs)

    attrs = {"__init__": init,
             "__setattr__": setattr,
             "__getattribute__": getattribute,
             "_attrs_": copy.deepcopy(fieldnames),
             "_typename_": str(typename),
             "__str__": rep,
             "__repr__": rep,
             "__len__": lambda self: len(fieldnames),
             "__iter__": iterate,
             "__setitem__": setitem,
             "__getitem__": getitem,
             }

    return type(typename, (object,), attrs)

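This checks the attributes to see whether they are valid before allowing the operation to continue.

So is this pickleable? Yes, if (and only if) you do the following: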

>>> import pickle
>>> Point = namedgroup("Point", ["x", "y"])
>>> p = Point(x=100, y=200)
>>> p2 = pickle.loads(pickle.dumps(p))
>>> p2.x
100
>>> p2.y
200
>>> id(p) != id(p2)
True

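The definition has to be in your namespace, and it must exist long enough for pickle to find it. So if you define it at the top level of your package, it should work.

Point = namedgroup("Point", ["x", "y"])

Pickle will fail if you do the following, or if you make the definition temporary (for example, it goes out of scope when a function ends):

some_point = namedgroup("Point", ["x", "y"])

And yes, it does preserve the order of the fields listed at type creation.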
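The pickling constraint above comes from the fact that pickle stores a class by reference (module name plus class name) and looks it up again by that name. Here is a minimal sketch of the failing case; make_class() is a hypothetical stand-in for namedgroup(), not part of the answer's code.

```python
import pickle


def make_class(typename):
    # Hypothetical stand-in for namedgroup(): builds a bare class dynamically.
    return type(typename, (object,), {})


# The class calls itself "Point", but the module only knows it as some_point,
# so pickle cannot find it by name when it serializes an instance.
some_point = make_class("Point")

try:
    pickle.dumps(some_point())
    pickled_ok = True
except (pickle.PicklingError, AttributeError) as exc:
    pickled_ok = False
    print("pickle failed:", exc)

print(pickled_ok)  # False
```

Binding the class to a module-level name that matches its typename is what makes the lookup succeed.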


Best answer

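There is a mutable alternative to collections.namedtuple - recordclass. It can be installed from PyPI:

pip3 install recordclass

It has the same API and memory footprint as namedtuple, and it supports assignments (it should be faster as well). For example: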

from recordclass import recordclass


Point = recordclass('Point', 'x y')


>>> p = Point(1, 2)
>>> p
Point(x=1, y=2)
>>> print(p.x, p.y)
1 2
>>> p.x += 2; p.y += 3; print(p)
Point(x=3, y=5)

recordclass (since 0.5) supports type hints:

from recordclass import recordclass, RecordClass

class Point(RecordClass):
    x: int
    y: int


>>> Point.__annotations__
{'x':int, 'y':int}
>>> p = Point(1, 2)
>>> p
Point(x=1, y=2)
>>> print(p.x, p.y)
1 2
>>> p.x += 2; p.y += 3; print(p)
Point(x=3, y=5)

There is a more complete example (it also includes performance comparisons).

The recordclass library now provides another variant - the recordclass.make_dataclass factory function. It supports a dataclass-like API (there are module-level functions update, make, replace instead of the self._update, self._replace, self._asdict, cls._make methods).

from recordclass import dataobject, make_dataclass

Point = make_dataclass('Point', [('x', int), ('y',int)])
Point = make_dataclass('Point', {'x':int, 'y':int})

class Point(dataobject):
    x: int
    y: int


>>> p = Point(1, 2)
>>> p
Point(x=1, y=2)
>>> p.x = 10; p.y += 3; print(p)
Point(x=10, y=5)

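recordclass and make_dataclass can produce classes whose instances occupy less memory than __slots__-based instances. This can matter for instances whose attribute values are not intended to have reference cycles. It may help reduce memory usage if you need to create millions of instances. Here is an illustrative example.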


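Here is a good solution for Python 3: a minimal class using __slots__ and the Sequence abstract base class. It does no fancy error detection or the like, but it works, and it behaves mostly like a mutable tuple (except for type checking).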

# `from collections import Sequence` stopped working in Python 3.10
from collections.abc import Sequence

class NamedMutableSequence(Sequence):
    __slots__ = ()

    def __init__(self, *a, **kw):
        slots = self.__slots__
        for k in slots:
            setattr(self, k, kw.get(k))

        if a:
            for k, v in zip(slots, a):
                setattr(self, k, v)

    def __str__(self):
        clsname = self.__class__.__name__
        values = ', '.join('%s=%r' % (k, getattr(self, k))
                           for k in self.__slots__)
        return '%s(%s)' % (clsname, values)

    __repr__ = __str__

    def __getitem__(self, item):
        return getattr(self, self.__slots__[item])

    def __setitem__(self, item, value):
        return setattr(self, self.__slots__[item], value)

    def __len__(self):
        return len(self.__slots__)

class Point(NamedMutableSequence):
    __slots__ = ('x', 'y')

Example:

>>> p = Point(0, 0)
>>> p.x = 10
>>> p
Point(x=10, y=0)
>>> p.x *= 10
>>> p
Point(x=100, y=0)

If you wish, you can have a method to create the class too (though using an explicit class is more transparent):

def namedgroup(name, members):
    if isinstance(members, str):
        members = members.split()
    members = tuple(members)
    return type(name, (NamedMutableSequence,), {'__slots__': members})

Example:

>>> Point = namedgroup('Point', ['x', 'y'])
>>> Point(6, 42)
Point(x=6, y=42)
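What inheriting from Sequence buys you is easy to miss: iteration, unpacking, the `in` operator, index(), count() and reversed() all come from the base class once __getitem__ and __len__ exist. The following is a trimmed-down sketch of the class above (no __str__), just to show those inherited behaviours:

```python
from collections.abc import Sequence


class NamedMutableSequence(Sequence):
    __slots__ = ()

    def __init__(self, *a, **kw):
        for k in self.__slots__:
            setattr(self, k, kw.get(k))
        for k, v in zip(self.__slots__, a):
            setattr(self, k, v)

    def __getitem__(self, item):
        return getattr(self, self.__slots__[item])

    def __setitem__(self, item, value):
        setattr(self, self.__slots__[item], value)

    def __len__(self):
        return len(self.__slots__)


class Point(NamedMutableSequence):
    __slots__ = ('x', 'y')


p = Point(3, 4)
p[1] = 7                         # index-based assignment
x, y = p                         # unpacking, via Sequence.__iter__
print(x, y)                      # 3 7
print(7 in p)                    # True
print(p.index(7))                # 1
print(list(reversed(p)))         # [7, 3]
print(isinstance(p, Sequence))   # True
```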

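In Python 2 you need to adjust it slightly: if you inherit from Sequence, the class will have a __dict__ and __slots__ will stop working.

The solution in Python 2 is to inherit from object instead of Sequence. If you need isinstance(p, Sequence) to be True for a Point instance p, register NamedMutableSequence as a virtual subclass of Sequence: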

Sequence.register(NamedMutableSequence)


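As of 11 January 2016, the latest namedlist 1.7 passes all of your tests with both Python 2.7 and Python 3.5. It is a pure Python implementation, whereas recordclass is a C extension. Whether a C extension is preferable depends, of course, on your requirements.

Your tests (but also see the note below):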

from __future__ import print_function
import pickle
import sys
from namedlist import namedlist


Point = namedlist('Point', 'x y')
p = Point(x=1, y=2)


print('1. Mutation of field values')
p.x *= 10
p.y += 10
print('p: {}, {}\n'.format(p.x, p.y))


print('2. String')
print('p: {}\n'.format(p))


print('3. Representation')
print(repr(p), '\n')


print('4. Sizeof')
print('size of p:', sys.getsizeof(p), '\n')


print('5. Access by name of field')
print('p: {}, {}\n'.format(p.x, p.y))


print('6. Access by index')
print('p: {}, {}\n'.format(p[0], p[1]))


print('7. Iterative unpacking')
x, y = p
print('p: {}, {}\n'.format(x, y))


print('8. Iteration')
print('p: {}\n'.format([v for v in p]))


print('9. Ordered Dict')
print('p: {}\n'.format(p._asdict()))


print('10. Inplace replacement (update?)')
p._update(x=100, y=200)
print('p: {}\n'.format(p))


print('11. Pickle and Unpickle')
pickled = pickle.dumps(p)
unpickled = pickle.loads(pickled)
assert p == unpickled
print('Pickled successfully\n')


print('12. Fields\n')
print('p: {}\n'.format(p._fields))


print('13. Slots')
print('p: {}\n'.format(p.__slots__))

Output on Python 2.7

1. Mutation of field values
p: 10, 12


2. String
p: Point(x=10, y=12)


3. Representation
Point(x=10, y=12)


4. Sizeof
size of p: 64


5. Access by name of field
p: 10, 12


6. Access by index
p: 10, 12


7. Iterative unpacking
p: 10, 12


8. Iteration
p: [10, 12]


9. Ordered Dict
p: OrderedDict([('x', 10), ('y', 12)])


10. Inplace replacement (update?)
p: Point(x=100, y=200)


11. Pickle and Unpickle
Pickled successfully


12. Fields
p: ('x', 'y')


13. Slots
p: ('x', 'y')

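The only difference with Python 3.5 is that the namedlist has become smaller: the size is 56 (Python 2.7 reports 64).

Note that I have changed your test 10 for in-place replacement. The namedlist has a _replace() method that makes a shallow copy, which makes perfect sense to me because the namedtuple in the standard library behaves the same way. Changing the semantics of the _replace() method would be confusing. In my opinion, the _update() method should be used for in-place updates. Or maybe I failed to understand the intent of your test 10?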


types.SimpleNamespace was introduced in Python 3.3 and supports the requested requirements.

from types import SimpleNamespace
t = SimpleNamespace(foo='bar')
t.ham = 'spam'
print(t)      # namespace(foo='bar', ham='spam')
print(t.foo)  # bar
import pickle
with open('/tmp/pickle', 'wb') as f:
    pickle.dump(t, f)
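A short sketch of what SimpleNamespace does and does not give you compared with a namedtuple. The pickle round trip is done in memory here, rather than through /tmp/pickle, purely to keep the example self-contained.

```python
import pickle
from types import SimpleNamespace

t = SimpleNamespace(foo='bar')
t.ham = 'spam'

# Round-trip through pickle in memory.
restored = pickle.loads(pickle.dumps(t))
print(restored == t)   # True: namespaces compare by their attributes
print(vars(t))         # {'foo': 'bar', 'ham': 'spam'}

# Unlike namedtuple, there is no positional access or unpacking.
try:
    t[0]
except TypeError as exc:
    print("no indexing:", exc)
```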


If performance matters little, one could use a silly hack like:

from collections import namedtuple

Point = namedtuple('Point', 'x y z')
mutable_z = Point(1, 2, [3])
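To spell out how the hack is meant to be used: the tuple itself stays immutable, but the list stored in the z field can be changed in place. A small sketch, including one side effect worth knowing about:

```python
from collections import namedtuple

Point = namedtuple('Point', 'x y z')
mutable_z = Point(1, 2, [3])

mutable_z.z[0] = 42      # mutates the list held by the tuple
mutable_z.z.append(7)
print(mutable_z)         # Point(x=1, y=2, z=[42, 7])

# Rebinding the field itself is still forbidden.
try:
    mutable_z.z = [0]
except AttributeError as exc:
    print("cannot rebind:", exc)

# Side effect: a tuple that holds a list is no longer hashable.
try:
    hash(mutable_z)
except TypeError as exc:
    print("unhashable:", exc)
```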


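As a Pythonic alternative for this task, since Python 3.7 you can use the dataclasses module. A data class not only behaves like a mutable NamedTuple; because it uses a normal class definition, it also supports other class features.

From PEP-0557:

Although they use a very different mechanism, Data Classes can be thought of as "mutable namedtuples with defaults". Because Data Classes use normal class definition syntax, you are free to use inheritance, metaclasses, docstrings, user-defined methods, class factories, and other Python class features.

A class decorator is provided which inspects a class definition for variables with type annotations as defined in PEP 526, "Syntax for Variable Annotations". In this document, such variables are called fields. Using these fields, the decorator adds generated method definitions to the class to support instance initialization, a repr, comparison methods, and optionally other methods as described in the Specification section. Such a class is called a Data Class, but there's really nothing special about the class: the decorator adds generated methods to the class and returns the same class it was given.

This feature was introduced in PEP-0557; you can read about it in more detail at the documentation link provided.

Example:

In [20]: from dataclasses import dataclass

In [21]: @dataclass
    ...: class InventoryItem:
    ...:     '''Class for keeping track of an item in inventory.'''
    ...:     name: str
    ...:     unit_price: float
    ...:     quantity_on_hand: int = 0
    ...:
    ...:     def total_cost(self) -> float:
    ...:         return self.unit_price * self.quantity_on_hand
    ...:

Demo:

In [23]: II = InventoryItem('bisc', 2000)

In [24]: II
Out[24]: InventoryItem(name='bisc', unit_price=2000, quantity_on_hand=0)

In [25]: II.name = 'choco'

In [26]: II.name
Out[26]: 'choco'

In [27]: II.unit_price *= 3

In [28]: II.unit_price
Out[28]: 6000

In [29]: II
Out[29]: InventoryItem(name='choco', unit_price=6000, quantity_on_hand=0)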
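For readers coming from namedtuple, the nearest equivalents of tuple unpacking, _asdict() and _replace() live in the dataclasses module as plain functions. A brief sketch with a made-up Point class:

```python
from dataclasses import dataclass, astuple, asdict, replace


@dataclass
class Point:
    x: int
    y: int = 0            # default value, as with namedtuple defaults


p = Point(1)
p.x += 9                  # in-place mutation
print(p)                  # Point(x=10, y=0)

x, y = astuple(p)         # a data class is not iterable by itself
print(x, y)               # 10 0
print(asdict(p))          # {'x': 10, 'y': 0}

q = replace(p, y=5)       # copy with changes, like namedtuple._replace()
print(q, q is p)          # Point(x=10, y=5) False
print(p == Point(10, 0))  # True: __eq__ is generated
```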


I can't believe nobody has said this before, but it seems to me Python just wants you to write your own simple, mutable class instead of using a namedtuple whenever you need the "namedtuple" to be mutable.

Quick summary

Just jump straight down to Approach 5 below. It's short and to the point, and by far the best of these options.

Various, detailed approaches:

Approach 1 (good): simple, callable class using __call__()

Here is an example of a simple Point object for (x, y) points:

class Point():
    def __init__(self, x, y):
        self.x = x
        self.y = y

    def __call__(self):
        """
        Make `Point` objects callable. Print their contents when they
        are called.
        """
        print("Point(x={}, y={})".format(self.x, self.y))

Now use it:

p1 = Point(1,2)
p1()
p1.x = 7
p1()
p1.y = 8
p1()

Here is the output:

Point(x=1, y=2)
Point(x=7, y=2)
Point(x=7, y=8)

This is pretty similar to a namedtuple, except that it is fully mutable, unlike a namedtuple. Also, a namedtuple isn't callable, so to see its contents you just type the object instance name without parentheses after it (as p2 in the example below, instead of p2()). See this example and its output:

>>> from collections import namedtuple
>>> Point2 = namedtuple("Point2", ["x", "y"])
>>> p2 = Point2(1, 2)
>>> p2
Point2(x=1, y=2)
>>> p2()
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: 'Point2' object is not callable
>>> p2.x = 7
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
AttributeError: can't set attribute

Approach 2 (better): use __repr__() in place of __call__()

I just learned you can use __repr__() in place of __call__() to get more namedtuple-like behavior. Defining the __repr__() method allows you to define "the 'official' string representation of an object" (see the official documentation here). Now, just typing p1 at the interpreter prompt shows the result of the __repr__() method, and you get identical behavior to the namedtuple. Here is the new class:

class Point():
    def __init__(self, x, y):
        self.x = x
        self.y = y

    def __repr__(self):
        """
        Obtain the string representation of `Point`, so that just typing
        the instance name of an object of this type will call this method
        and obtain this string, just like `namedtuple` already does!
        """
        return "Point(x={}, y={})".format(self.x, self.y)

Now use it:

p1 = Point(1,2)
p1
p1.x = 7
p1
p1.y = 8
p1

Here is the output:

Point(x=1, y=2)
Point(x=7, y=2)
Point(x=7, y=8)

Approach 3 (better still, but a little awkward to use): make it a callable which returns an (x, y) tuple

The original poster (OP) would also like something like this to work (see his comment below my answer):

x, y = Point(x=1, y=2)

Well, for simplicity, let's just make this work instead:

x, y = Point(x=1, y=2)()

# OR
p1 = Point(x=1, y=2)
x, y = p1()

While we are at it, let's also condense this:

self.x = x
self.y = y

...into this (source where I first saw this):

self.x, self.y = x, y

Here is the class definition for all of the above:

class Point():
    def __init__(self, x, y):
        self.x, self.y = x, y

    def __repr__(self):
        """
        Obtain the string representation of `Point`, so that just typing
        the instance name of an object of this type will call this method
        and obtain this string, just like `namedtuple` already does!
        """
        return "Point(x={}, y={})".format(self.x, self.y)

    def __call__(self):
        """
        Make the object callable. Return a tuple of the x and y components
        of the Point.
        """
        return self.x, self.y

Here are some test calls:

p1 = Point(1,2)
p1
p1.x = 7
x, y = p1()
x2, y2 = Point(10, 12)()
x
y
x2
y2

I won't show pasting the class definition into the interpreter this time, but here are those calls with their output:

>>> p1 = Point(1,2)
>>> p1
Point(x=1, y=2)
>>> p1.x = 7
>>> x, y = p1()
>>> x2, y2 = Point(10, 12)()
>>> x
7
>>> y
2
>>> x2
10
>>> y2
12

Approach 4 (best so far, but a lot more code to write): make the class also an iterator

By making this into an iterator class, we can get this behavior:

x, y = Point(x=1, y=2)
# OR
x, y = Point(1, 2)
# OR
p1 = Point(1, 2)
x, y = p1

Let's get rid of the __call__() method, but to make this class an iterator we will add the __iter__() and __next__() methods. Read more about these things here:

https://treyhunner.com/2018/06/how-to-make-an-iterator-in-python/

How to build a basic iterator?

https://docs.python.org/3/library/exceptions.html#stopiteration

Here is the solution:

class Point():
    def __init__(self, x, y):
        self.x, self.y = x, y
        self._iterator_index = 0
        self._num_items = 2  # counting self.x and self.y

    def __repr__(self):
        """
        Obtain the string representation of `Point`, so that just typing
        the instance name of an object of this type will call this method
        and obtain this string, just like `namedtuple` already does!
        """
        return "Point(x={}, y={})".format(self.x, self.y)

    def __iter__(self):
        # Restart from the first item so the same object can be unpacked
        # more than once.
        self._iterator_index = 0
        return self

    def __next__(self):
        self._iterator_index += 1

        if self._iterator_index == 1:
            return self.x
        elif self._iterator_index == 2:
            return self.y
        else:
            raise StopIteration

Here are some test calls with their output:

>>> x, y = Point(x=1, y=2)
>>> x
1
>>> y
2
>>> x, y = Point(3, 4)
>>> x
3
>>> y
4
>>> p1 = Point(5, 6)
>>> x, y = p1
>>> x
5
>>> y
6
>>> p1
Point(x=5, y=6)

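Approach 5 (USE THIS ONE) (perfect! - best and cleanest/shortest approach): make the class an iterable, using the yield generator keyword

Study these references:

https://treyhunner.com/2018/06/how-to-make-an-iterator-in-python/

What does the "yield" keyword do?

Here is the solution. It relies on a fancy "iterable-generator" (AKA: just "generator") keyword/Python mechanism, called yield.

Basically, when something starts iterating over the object it calls the __iter__() method, which hands back a "generator" object; a generator is itself an iterator, so it can be stepped through. The first request for the next item runs the method body up to the first yield, then pauses and returns that yield's value (self.x in the code below). The next request picks up where it last left off (just after the first yield, in this case) and runs to the next yield, pausing and returning that value (self.y in the code below). Each new request for the next item continues this process, starting just after the most recently reached yield, until no more yield statements remain, at which point iteration ends. Therefore, once two items have been requested, both yield statements have been used up and the iterator finishes. The end result is that calls like this work perfectly, just as they did in Approach 4, but using yield:

x, y = Point(x=1, y=2)
# OR
x, y = Point(1, 2)
# OR
p1 = Point(1, 2)
x, y = p1

Here is the solution (part of this solution can also be found in the treyhunner.com reference just above). Notice how short and clean this solution is!

Just the class definition code; no docstrings, so you can truly see how short and simple this is: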

class Point():
    def __init__(self, x, y):
        self.x, self.y = x, y

    def __repr__(self):
        return "Point(x={}, y={})".format(self.x, self.y)

    def __iter__(self):
        yield self.x
        yield self.y

With descriptive docstrings:

class Point():
    def __init__(self, x, y):
        self.x, self.y = x, y

    def __repr__(self):
        """
        Obtain the string representation of `Point`, so that just typing
        the instance name of an object of this type will call this method
        and obtain this string, just like `namedtuple` already does!
        """
        return "Point(x={}, y={})".format(self.x, self.y)

    def __iter__(self):
        """
        Make this `Point` class an iterable. When used as an iterable, it will
        now return `self.x` and `self.y` as the two elements of a list-like,
        iterable object, "generated" by the usages of the `yield` "generator"
        keyword.
        """
        yield self.x
        yield self.y

Copy and paste the exact same test code as used in the previous approach (Approach 4) just above, and you will get exactly the same output as above!
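One more property of the yield-based version is worth a quick sketch: every call to __iter__() builds a fresh generator, so the same Point can be unpacked as often as you like, and each pass sees the current field values.

```python
class Point:
    def __init__(self, x, y):
        self.x, self.y = x, y

    def __repr__(self):
        return "Point(x={}, y={})".format(self.x, self.y)

    def __iter__(self):
        yield self.x
        yield self.y


p1 = Point(5, 6)
x, y = p1                        # first pass
p1.x = 50                        # mutate in place
x2, y2 = p1                      # second pass picks up the new value
print(x, y, x2, y2)              # 5 6 50 6
print(tuple(p1))                 # (50, 6)
print(type(iter(p1)).__name__)   # generator
```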

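References:

https://docs.python.org/3/library/collections.html#collections.namedtuple
Approach 1:
1. What is the difference between __init__ and __call__?
Approach 2:

Approach 4:
1. How to build a basic iterator?
2. https://docs.python.org/3/library/exceptions.html#stopiteration
Approach 5:
1. See links from Approach 4, plus:
2. What does the "yield" keyword do?
What is the meaning of single and double underscore before an object name?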


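The most elegant way I can think of requires no third-party library and lets you create a quick mock class constructor with default member variables, without the cumbersome type specification of dataclasses. So it is better suited to roughing out some code: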

# copy-paste 3 lines:
from inspect import getargvalues, stack
from types import SimpleNamespace
def DefaultableNS(): return SimpleNamespace(**getargvalues(stack()[1].frame)[3])


# then you can make classes with default fields on the fly in one line, eg:
def Node(value,left=None,right=None): return DefaultableNS()


node=Node(123)
print(node)
#[stdout] namespace(value=123, left=None, right=None)


print(node.value,node.left,node.right) # all fields exist

A plain SimpleNamespace is clumsier, and it breaks DRY:

def Node(value,left=None,right=None):
    return SimpleNamespace(value=value,left=left,right=right)
    # breaks DRY as you need to repeat the argument names twice
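If the stack inspection feels too magical, a possible middle ground (a different technique from the DefaultableNS helper above, shown here only as a sketch) is to pass locals() straight to SimpleNamespace. At the very start of a function, locals() contains just the arguments, so the names are still written only once.

```python
from types import SimpleNamespace


def Node(value, left=None, right=None):
    # Nothing else has been assigned yet, so locals() holds exactly
    # the three arguments.
    return SimpleNamespace(**locals())


node = Node(123)
node.left = Node(7)               # fields stay mutable
print(node.value, node.right)     # 123 None
print(node.left.value)            # 7
```

The catch is that the call has to be the first statement in the function; any local variable assigned before it would end up in the namespace too.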


If you want to be able to create classes "on-site", I find the following extremely convenient:

class Struct:
    def __init__(self, **kw):
        self.__dict__.update(**kw)

That allows me to write:

p = Struct(x=0, y=0)
p.x = 10

stats = Struct(count=0, total=0.0)
stats.count += 1
