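Since you have to iterate over the QuerySet anyway, the results will be cached (unless you use iterator()), so it is better to use len(): this avoids hitting the database again, and also avoids the possibility of retrieving a different number of results.
If you are using iterator(), then for the same reason I would suggest keeping a counter variable while iterating (rather than calling count()).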
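A minimal sketch of the counting-while-iterating idea. `fake_iterator` and `process_and_count` are hypothetical names, and `fake_iterator` is only a stand-in for `queryset.iterator()` so that the snippet runs without Django; with a real queryset you would loop over `queryset.iterator()` instead.

```python
def fake_iterator():
    """Hypothetical stand-in for queryset.iterator(): yields rows once, no caching."""
    for pk in range(1, 6):
        yield {"pk": pk}


def process_and_count(rows):
    """Iterate once, handling each row and counting as we go."""
    count = 0
    processed = []
    for row in rows:
        processed.append(row["pk"])  # placeholder for the real per-row work
        count += 1
    return count, processed


count, processed = process_and_count(fake_iterator())
print(count)
```

The counter costs nothing extra, and it always matches the rows that were actually processed, which a separate count() query cannot guarantee if the table changes in between.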
DO: queryset.count() - this performs a single SELECT COUNT(*) FROM some_table query. All computation is carried out on the RDBMS side; Python only needs to retrieve the resulting number, at a fixed cost of O(1).
DON'T: len(queryset) - this performs a SELECT * FROM some_table query, fetching the whole table in O(N) and requiring an additional O(N) of memory to store it. This is the worst thing we can do when only the count is needed.
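To make the difference concrete outside of Django, here is a rough analogy using only the standard-library `sqlite3` module. The table name `some_table` comes from the text above; the `name` column and the row count are assumptions made up for illustration, and the SQL that Django really emits will differ in detail.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE some_table (id INTEGER PRIMARY KEY, name TEXT)")
conn.executemany(
    "INSERT INTO some_table (name) VALUES (?)",
    [(f"row {i}",) for i in range(1000)],
)

# Roughly what queryset.count() asks for: the database counts,
# and a single number travels back to Python.
(count_in_db,) = conn.execute("SELECT COUNT(*) FROM some_table").fetchone()

# Roughly what len(queryset) does on an unevaluated queryset: every row
# is transferred and held in memory just so Python can count them.
all_rows = conn.execute("SELECT * FROM some_table").fetchall()
count_in_python = len(all_rows)

conn.close()
```

Both approaches give the same number, but the second one pays for moving and storing all the rows.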
len() (one db query):
len(queryset)  # SELECT * fetching all the data - NO extra cost - data would be fetched anyway in the for loop
for obj in queryset:  # data is already fetched by len() - using cache
    pass
count() (two db queries!):
queryset.count()  # First db query: SELECT COUNT(*)
for obj in queryset:  # Second db query (fetching data): SELECT *
    pass
The reverse of the second case (when the queryset has already been fetched):
for obj in queryset:  # iteration fetches the data
    len(queryset)  # using already cached data - O(1), no extra cost
    queryset.count()  # using cache - O(1), no extra db query

len(queryset)  # the same: O(1)
queryset.count()  # the same: no query, O(1)
Everything becomes clear once you take a look "under the hood":
class QuerySet(object):

    def __init__(self, model=None, query=None, using=None, hints=None):
        # (...)
        self._result_cache = None

    def __len__(self):
        self._fetch_all()
        return len(self._result_cache)

    def _fetch_all(self):
        if self._result_cache is None:
            self._result_cache = list(self.iterator())
        if self._prefetch_related_lookups and not self._prefetch_done:
            self._prefetch_related_objects()

    def count(self):
        if self._result_cache is not None:
            return len(self._result_cache)

        return self.query.get_count(using=self.db)
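The caching logic above can be tried out without a database. The following is a hypothetical toy model (`ToyQuerySet` is not a Django class, and its `queries` log only simulates SQL) that mirrors the `_result_cache` checks and records which statements each usage pattern would trigger.

```python
class ToyQuerySet:
    """Hypothetical toy model of the caching logic; not Django's real class."""

    def __init__(self, rows):
        self._rows = list(rows)    # pretend these rows live in the database
        self._result_cache = None
        self.queries = []          # log of simulated SQL statements

    def _fetch_all(self):
        if self._result_cache is None:
            self.queries.append("SELECT *")
            self._result_cache = list(self._rows)

    def __iter__(self):
        self._fetch_all()
        return iter(self._result_cache)

    def __len__(self):
        self._fetch_all()
        return len(self._result_cache)

    def count(self):
        if self._result_cache is not None:
            return len(self._result_cache)  # served from the cache
        self.queries.append("SELECT COUNT(*)")
        return len(self._rows)


def len_then_iterate():
    qs = ToyQuerySet(range(3))
    len(qs)
    for _ in qs:
        pass
    return qs.queries


def count_then_iterate():
    qs = ToyQuerySet(range(3))
    qs.count()
    for _ in qs:
        pass
    return qs.queries


def iterate_then_count():
    qs = ToyQuerySet(range(3))
    for _ in qs:
        pass
    qs.count()
    len(qs)
    return qs.queries
```

In this model, `len_then_iterate()` and `iterate_then_count()` each log a single simulated `SELECT *`, while `count_then_iterate()` logs a `SELECT COUNT(*)` followed by a `SELECT *`, matching the cases described above.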