Django: select only rows with duplicate field values


Best answer

Try:

from django.db.models import Count

# Parentheses are needed so the chained calls can span several lines.
(
    Literal.objects.values('name')
    .annotate(Count('id'))
    .order_by()
    .filter(id__count__gt=1)
)

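This is as close as you can get with Django. The problem is that this returns a ValuesQuerySet (in current Django versions, a QuerySet of dictionaries) containing only name and the count, id__count. However, you can use it to build a regular QuerySet by feeding it back into another query: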

dupes = (
    Literal.objects.values('name')
    .annotate(Count('id'))
    .order_by()
    .filter(id__count__gt=1)
)
Literal.objects.filter(name__in=[item['name'] for item in dupes])
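
For reference, the two ORM steps above correspond roughly to a GROUP BY ... HAVING query followed by an IN filter. Here is a minimal sketch using Python's built-in sqlite3 module rather than Django. The table name app_literal is borrowed from the SQL shown elsewhere in this thread, the sample rows are made up, and the SQL Django actually generates may differ.

```python
import sqlite3

# In-memory database with a hypothetical app_literal table (sample data is made up).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE app_literal (id INTEGER PRIMARY KEY, name TEXT)")
conn.executemany(
    "INSERT INTO app_literal (name) VALUES (?)",
    [("a",), ("b",), ("a",), ("c",), ("b",), ("a",)],
)

# Step 1: roughly what values('name').annotate(Count('id')).filter(id__count__gt=1) asks for.
dupe_names = [
    row[0]
    for row in conn.execute(
        "SELECT name FROM app_literal GROUP BY name HAVING COUNT(id) > 1 ORDER BY name"
    )
]

# Step 2: roughly what filter(name__in=[...]) asks for: the full rows for those names.
placeholders = ", ".join("?" for _ in dupe_names)
dupe_rows = conn.execute(
    f"SELECT id, name FROM app_literal WHERE name IN ({placeholders}) ORDER BY id",
    dupe_names,
).fetchall()

print(dupe_names)
print(dupe_rows)
```

The name "c" appears only once, so it is dropped in step 1 and none of its rows come back in step 2.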


Try using aggregation:

Literal.objects.values('name').annotate(name_count=Count('name')).exclude(name_count=1)


This was rejected as an edit, so here it is as a better answer:

dups = (
    Literal.objects.values('name')
    .annotate(count=Count('id'))
    .values('name')
    .order_by()
    .filter(count__gt=1)
)

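This returns a ValuesQuerySet containing all of the duplicate names. However, you can use it to build a regular QuerySet by feeding it back into another query. The Django ORM is smart enough to combine the two into a single query: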

Literal.objects.filter(name__in=dups)

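The extra call to .values('name') after the annotate call looks a little odd, but without it the subquery fails. The extra values() tricks the ORM into selecting only the name column for the subquery.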
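
To illustrate why the subquery has to select a single column, here is a rough sketch using Python's built-in sqlite3 module instead of Django. The app_literal table and its rows are made up for the example, and the SQL that Django emits for your database may look different. The point is only that an IN subquery returning both the name and the count cannot be compared against a single name value.

```python
import sqlite3

# In-memory database with a hypothetical app_literal table (sample data is made up).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE app_literal (id INTEGER PRIMARY KEY, name TEXT)")
conn.executemany(
    "INSERT INTO app_literal (name) VALUES (?)",
    [("a",), ("b",), ("a",), ("c",)],
)

# Subquery that selects only the name column: usable on the right side of IN.
one_column = (
    "SELECT id, name FROM app_literal WHERE name IN ("
    "SELECT name FROM app_literal GROUP BY name HAVING COUNT(id) > 1"
    ") ORDER BY id"
)
rows = conn.execute(one_column).fetchall()

# Subquery that also selects the count: two columns cannot be matched against one name.
two_columns = (
    "SELECT id, name FROM app_literal WHERE name IN ("
    "SELECT name, COUNT(id) FROM app_literal GROUP BY name HAVING COUNT(id) > 1"
    ")"
)
try:
    conn.execute(two_columns).fetchall()
    two_column_error = None
except sqlite3.Error as exc:
    two_column_error = exc

print(rows)
print(two_column_error)
```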


If you only want a list of the names rather than the objects, you can use the following query:

repeated_names = Literal.objects.values('name').annotate(Count('id')).order_by().filter(id__count__gt=1).values_list('name', flat=True)


If you use PostgreSQL, you can do something like this:

from django.contrib.postgres.aggregates import ArrayAgg
from django.db.models import Func, Value


duplicate_ids = (
    Literal.objects.values('name')
    .annotate(ids=ArrayAgg('id'))
    .annotate(c=Func('ids', Value(1), function='array_length'))
    .filter(c__gt=1)
    .annotate(ids=Func('ids', function='unnest'))
    .values_list('ids', flat=True)
)

It results in this rather simple SQL query:

SELECT unnest(ARRAY_AGG("app_literal"."id")) AS "ids"
FROM "app_literal"
GROUP BY "app_literal"."name"
HAVING array_length(ARRAY_AGG("app_literal"."id"), 1) > 1
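
To make the result of that query concrete, here is a plain-Python sketch of the same three steps: group ids by name, keep the groups with more than one id, and flatten them. It does not touch PostgreSQL or Django, and the (id, name) rows are made up for illustration.

```python
from collections import defaultdict

# Made-up (id, name) rows standing in for the Literal table.
rows = [(1, "a"), (2, "b"), (3, "a"), (4, "c"), (5, "b")]

# GROUP BY name with ARRAY_AGG(id): collect the ids for each name.
ids_by_name = defaultdict(list)
for row_id, name in rows:
    ids_by_name[name].append(row_id)

# HAVING array_length(...) > 1, then unnest: flatten the groups with more than one id.
duplicate_ids = sorted(
    row_id
    for ids in ids_by_name.values()
    if len(ids) > 1
    for row_id in ids
)

print(duplicate_ids)
```

The result contains the id of every row whose name is shared with at least one other row, not just one id per duplicated name.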


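OK, for some reason none of the approaches above worked for me; they always returned <MultilingualQuerySet []>. I use the following approach instead, which is easier to understand but not as elegant: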

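dupes = []
uniques = []

dupes_query = MyModel.objects.values_list('field', flat=True)

# Iterate over the raw values, not set(...): a set has already dropped the repeats,
# so the else branch would never run.
for dupe in dupes_query:
    if dupe not in uniques:
        uniques.append(dupe)
    else:
        dupes.append(dupe)

print(set(dupes))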
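
The same in-Python idea can probably be written more compactly with collections.Counter. This is only a sketch: the values list below is made up and stands in for whatever MyModel.objects.values_list('field', flat=True) returns in your project.

```python
from collections import Counter

# Made-up values standing in for MyModel.objects.values_list('field', flat=True).
values = ["x", "y", "x", "z", "y", "x"]

# Count how often each value occurs, then keep the ones seen more than once.
counts = Counter(values)
dupes = {value for value, n in counts.items() if n > 1}

print(dupes)
```

Like the loop above, this pulls every value into Python, so it is likely only reasonable for tables of modest size.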