If Python strings are immutable, why does the id stay the same when I use += to append to one?

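Strings in Python are immutable, which means their value cannot be changed. However, when appending to a string as in the example below, the original string's memory seems to be modified, because the id stays the same: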

>>> s = 'String'
>>> for i in range(5, 0, -1):
...     s += str(i)
...     print(f"{s:<11} stored at {id(s)}")
...
String5     stored at 139841228476848
String54    stored at 139841228476848
String543   stored at 139841228476848
String5432  stored at 139841228476848
String54321 stored at 139841228476848

In contrast, in the following example the id changes:

>>> a = "hello"
>>> id(a)
139841228475760
>>> a = "b" + a[1:]
>>> print(a)
bello
>>> id(a)
139841228475312

It's a CPython-specific optimization for the case when the str being appended to happens to have no other living references. The interpreter "cheats" in this case, allowing it to modify the existing string by reallocating (which can be in place, depending on heap layout) and appending the data directly, and often reducing the work significantly in loops that repeatedly concatenate (making it behave more like the amortized O(1) appends of a list rather than O(n) copy operations each time). It has no visible effect besides the unchanged id, so it's legal to do this (no one with an existing reference to a str ever sees it change unless the str was logically being replaced).
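A minimal sketch of the "no visible effect" point: once a second reference exists, appending with += leaves that other reference untouched. The function and variable names here are made up for illustration, and the string is built at runtime only so it is not a shared literal.

```python
def append_with_alias():
    # Build the string at runtime so it is an ordinary, non-literal str object.
    s = "".join(["Str", "ing"])
    alias = s          # a second reference to the same object
    s += "5"           # rebinds s to the concatenated result
    return s, alias

s, alias = append_with_alias()
print(s)      # String5
print(alias)  # String
```

Whatever the interpreter does internally, the alias still sees the original value, which is all that immutability promises.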

You're not actually supposed to rely on it (interpreters that don't use reference counting can't use this trick, since they can't know whether the str has other references), per PEP 8's very first programming recommendation:

Code should be written in a way that does not disadvantage other implementations of Python (PyPy, Jython, IronPython, Cython, Psyco, and such).

For example, do not rely on CPython’s efficient implementation of in-place string concatenation for statements in the form a += b or a = a + b. This optimization is fragile even in CPython (it only works for some types) and isn’t present at all in implementations that don’t use refcounting. In performance sensitive parts of the library, the ''.join() form should be used instead. This will ensure that concatenation occurs in linear time across various implementations.
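As a rough illustration of the ''.join() form recommended in the quote, here is the question's loop rewritten to collect the pieces in a list and join them once. The helper name build_with_join is hypothetical.

```python
def build_with_join(n):
    # Collect the pieces first, then concatenate once at the end.
    parts = ["String"]
    for i in range(n, 0, -1):
        parts.append(str(i))
    return "".join(parts)

def build_with_plus_equals(n):
    # Same result, but relies on repeated += concatenation.
    s = "String"
    for i in range(n, 0, -1):
        s += str(i)
    return s

print(build_with_join(5))  # String54321
```

Both functions return the same string; the join version does not depend on any interpreter-specific optimization to stay linear.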

If you want to break the optimization, there are all sorts of ways to do so, e.g. changing your code to:

>>> s = 'String'
>>> i = 5
>>> while i != 0:
...     s_alias = s  # keep a second reference to the current string
...     s += str(i)
...     print(s + " stored at " + str(id(s)))
...     i -= 1
...

breaks it by creating an alias, increasing the reference count and telling Python that the change would be visible somewhere other than s, so it can't apply it. Similarly, code like:

s = s + a + b

can't use it, because s + a occurs first, and produces a temporary that b must then be added to, rather than immediately replacing s, and the optimization is too brittle to try to handle that. Almost identical code like:

s += a + b

or:

s = s + (a + b)

restores the optimization by ensuring the final concatenation is always one where s is the left operand and the result is used to immediately replace s.
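One way to poke at the three forms above is to count how many distinct id() values s takes in each loop. This is only a sketch: the function names are made up, the loops live inside functions so that s is a local variable, and the counts depend on the interpreter version and memory allocator. Freed addresses can be reused, so even the chained form may repeat ids.

```python
def chained(n):
    s = ""
    ids = set()
    for i in range(n):
        a, b = str(i), ","
        s = s + a + b      # (s + a) produces a temporary first
        ids.add(id(s))
    return s, len(ids)

def augmented(n):
    s = ""
    ids = set()
    for i in range(n):
        a, b = str(i), ","
        s += a + b         # final concatenation has s on the left
        ids.add(id(s))
    return s, len(ids)

def parenthesized(n):
    s = ""
    ids = set()
    for i in range(n):
        a, b = str(i), ","
        s = s + (a + b)    # same shape as the += form
        ids.add(id(s))
    return s, len(ids)

for fn in (chained, augmented, parenthesized):
    text, distinct = fn(1000)
    print(f"{fn.__name__:<14} distinct ids: {distinct}")
```

All three build the same string; only the number of distinct ids observed may differ, and no particular count should be relied upon.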

Regardless of implementation details, the docs say:

… Two objects with non-overlapping lifetimes may have the same id() value.

The object previously referenced by s no longer exists after the +=, so the new object breaks no rules by having the same id.

If objects have non-overlapping lifetimes, their id values may be the same, but objects with overlapping lifetimes must have different id values.
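A small sketch of the lifetime rule. Two objects that are alive at the same time always have different ids, while two short-lived objects created one after the other may or may not reuse the same id (in CPython they often do, but that is not guaranteed, so no output is shown for that line).

```python
a = object()
b = object()
# Both objects are alive here, so their ids must differ.
live_ids_differ = id(a) != id(b)
print(live_ids_differ)  # True

# Each temporary object dies as soon as id() returns,
# so the second one is allowed to reuse the first one's id.
first = id(object())
second = id(object())
print(first == second)
```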

In C, Java, and some other programming languages, a variable is an identifier (a name) bound to a memory location.

In Python, by contrast, a variable is a tag attached to a value, and every value is an object. Memory is allocated for the value rather than for the variable, which can save memory because several names can share one object.

Rebinding a variable to a different value does not overwrite the old object; the name is simply attached to another object at another memory location. Once no variable refers to the old object (for example, the 10 from a = 10 after a has been rebound), it becomes eligible to be reclaimed by the garbage collector.

In C or Java, a memory location belongs to the variable; in Python, the memory location belongs to the value, which is an object.
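The "names are tags on objects" idea can be sketched as follows. The variable names are arbitrary.

```python
a = 10
b = a                  # b is a second tag on the same object as a
same_object = a is b
print(same_object)     # True

a = 20                 # rebinds a; the object tagged by b is untouched
print(b)               # 10

# With a mutable object, both tags see a change made through either one.
items = [1, 2]
other = items
other.append(3)
print(items)           # [1, 2, 3]
```

Assignment never copies the value; it only moves or adds a tag, which is why rebinding a leaves b alone while mutating the shared list is visible through both names.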