How to check whether two lists are circularly identical in Python

小开

Best answer

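First off, this can be done in O(n) in terms of the length of the list. Notice that if you duplicate your list 2 times ([1, 2, 3] becomes [1, 2, 3, 1, 2, 3]), the new list will definitely contain all possible cyclic lists.

So all you need to do is check whether the list you are searching for is inside the doubled version of your starting list. In Python you can achieve this in the following way (assuming that the lengths are the same).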

list1 = [1, 1, 1, 0, 0]
list2 = [1, 1, 0, 0, 1]
print(' '.join(map(str, list2)) in ' '.join(map(str, list1 * 2)))

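Some explanation about my one-liner: list * 2 combines a list with itself, map(str, [1, 2]) converts all numbers to strings, and ' '.join() converts the array ['1', '2', '111'] into the string '1 2 111'.

As pointed out by some people in the comments, the one-liner can potentially give some false positives, so to cover all the possible edge cases: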

def isCircular(arr1, arr2):
    if len(arr1) != len(arr2):
        return False

    str1 = ' '.join(map(str, arr1))
    str2 = ' '.join(map(str, arr2))
    if len(str1) != len(str2):
        return False

    return str1 in str2 + ' ' + str2

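Regarding time complexity, it is worth noting that O(n) is achieved only if the substring can be found in O(n) time. That is not always the case and depends on the implementation in your language (although it can potentially be done in linear time, with KMP for example).

For people who are afraid of string operations and for that reason think this answer is not good: what matters is complexity and speed. This algorithm potentially runs in O(n) time and O(n) space, which makes it much better than anything in the O(n^2) domain. To see this for yourself, you can run a small benchmark (create a random list, pop the first element and append it to the end, thus creating a cyclic list; you are free to do your own manipulations):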

from random import random
from datetime import datetime

bigList = [int(1000 * random()) for i in range(10**6)]
bigList2 = bigList[:]
bigList2.append(bigList2.pop(0))

# then test how much time it will take to come up with an answer
startTime = datetime.now()
print(isCircular(bigList, bigList2))
print(datetime.now() - startTime)
# please feel free to use timeit, but it will give similar results

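0.3 seconds. Not really long. Now try to compare this with O(n^2) solutions. While it is comparing, you can travel from the US to Australia (most probably by cruise ship).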

小开

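Building on Salvador Dali's very clever solution, the best way to handle it is to make sure all elements have the same length, and that both lists have the same length.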

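def is_circular_equal(lst1, lst2):
    if len(lst1) != len(lst2):
        return False
    if not lst1:
        return True
    # lists, not map iterators, so they can be traversed more than once
    lst1, lst2 = list(map(str, lst1)), list(map(str, lst2))
    len_longest_element = max(map(len, lst1 + lst2))
    template = "{{:{}}}".format(len_longest_element)
    # the trailing separator keeps the two copies apart at the wrap-around point
    circ_lst = (" ".join([template.format(el) for el in lst1]) + " ") * 2
    return " ".join([template.format(el) for el in lst2]) in circ_lst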

I have no idea whether this is faster or slower than the regex solution recommended by AshwiniChaudhary in the comments on Salvador Dali's answer, which reads:

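import re

def is_circular_equal(lst1, lst2):
    if len(lst1) != len(lst2):
        return False
    pattern = r"\b{}\b".format(re.escape(' '.join(map(str, lst2))))
    # double the list (not the joined string) so the wrap-around keeps its separator
    return bool(re.search(pattern, ' '.join(map(str, lst1 * 2))))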

小开

Piggybacking on @SalvadorDali's observation about looking for a match of a in any a-length slice of b + b, here is a solution using just list operations.

def rollmatch(a, b):
    bb = b * 2
    return any(not any(ax ^ bbx for ax, bbx in zip(a, bb[i:])) for i in range(len(a)))

l1 = [1, 0, 0, 1]
l2 = [1, 1, 0, 0]
l3 = [1, 0, 1, 0]

rollmatch(l1, l2)  # True
rollmatch(l1, l3)  # False

2nd approach: [removed]

小开

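I don't know enough Python to answer this in your requested language, but in C/C++, given the parameters of your question, I'd convert the zeros and ones to bits and push them onto the least significant bits of a uint64_t. This will allow you to compare all 55 bits in one fell swoop, 1 clock.

Wickedly fast, and the whole thing will fit in on-chip caches (209,880 bytes). Hardware support for shifting all 55 list members right simultaneously is available only in a CPU's registers. The same goes for comparing all 55 members simultaneously. This allows for a 1-for-1 mapping of the problem to a software solution (and using the SIMD/SSE 256-bit registers, up to 256 members if needed). As a result the code is immediately obvious to the reader.

You might be able to implement this in Python; I just don't know it well enough to know if that's possible or what the performance might be.

After sleeping on it, a few things became obvious, and all for the better.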

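1) It's so easy to spin the circularly linked list using bits that Dali's very clever trick isn't necessary. Inside a 64-bit register, standard bit shifting will accomplish the rotation very simply, and, in an attempt to make this all more Python friendly, I use arithmetic instead of bit ops.

2) Bit shifting can be accomplished easily by dividing by 2.

3) Checking whether the end of the list is 0 or 1 can be done easily with modulo 2.

4) "Moving" a 0 from the tail to the head of the list can be done by dividing by 2. This is because if the zero were actually moved, it would make the 55th bit false, which it already is by doing absolutely nothing.

5) "Moving" a 1 from the tail to the head of the list can be done by dividing by 2 and adding 18,014,398,509,481,984, which is the value created by marking the 55th bit true and all the rest false.

6) If a comparison of the anchor and the composed uint64_t is TRUE after any given rotation, break and return TRUE.

I would convert the entire array of lists into an array of uint64_ts right up front to avoid having to do the conversion repeatedly.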
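The arithmetic steps above translate fairly directly to Python integers. What follows is only a minimal sketch, not the answer's own code: the names `to_bits`, `rotate_right` and `is_circular_identical_bits` are hypothetical, and it assumes lists of 0/1 values with the fixed length of 55 from the question.

```python
LIST_LENGTH = 55                      # list size from the question (assumed fixed)
HEAD_BIT = 1 << (LIST_LENGTH - 1)     # value with only the 55th bit set


def to_bits(lst):
    """Pack a list of 0/1 values into one integer, first element as the highest bit."""
    value = 0
    for bit in lst:
        value = value * 2 + bit
    return value


def rotate_right(ring):
    """Move the tail bit to the head using only modulo, division and addition."""
    tail = ring % 2        # step 3: is the tail a 0 or a 1?
    ring //= 2             # step 2: right shift by dividing by 2
    if tail:
        ring += HEAD_BIT   # step 5: a 1 wraps around to the head
    return ring            # step 4: a 0 needs nothing more


def is_circular_identical_bits(anchor, test_ring):
    """Return True if some rotation of test_ring equals anchor (step 6)."""
    for _ in range(LIST_LENGTH):
        if anchor == test_ring:
            return True
        test_ring = rotate_right(test_ring)
    return False
```

Converting every list once with `to_bits` and then comparing the integers mirrors the "convert right up front" advice; how this performs in Python compared with the C version is not something this sketch measures.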

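After spending a few hours trying to optimize the code and studying the assembly language, I was able to shave 20% off the runtime. I should add that the O/S and MSVC compiler got updated mid-day yesterday as well. For whatever reason(s), the quality of the code the C compiler produced improved dramatically after the update (11/15/2014). Run-time is now ~70 clocks, 17 nanoseconds, to compose and compare an anchor ring with all 55 turns of a test ring, and NxN of all rings against all others is done in 12.5 seconds.

This code is so tight that all but 4 registers are sitting around doing nothing 99% of the time. The assembly language matches the C code almost line for line. Very easy to read and understand. A great assembly project if someone were teaching themselves that.

Hardware is Haswell i7, MSVC 64-bit, full optimizations.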

#include "stdafx.h"
#include <stdint.h>
#include <string>
#include <memory>
#include <stdio.h>
#include <time.h>

const uint8_t  LIST_LENGTH = 55;    // uint_8 supports full width of SIMD and AVX2
// max left shifts is 32, so must use right shifts to create head_bit
const uint64_t head_bit = (0x8000000000000000 >> (64 - LIST_LENGTH));
const uint64_t CPU_FREQ = 3840000000;   // turbo-mode clock freq of my i7 chip

const uint64_t LOOP_KNT = 688275225;    // 26235^2 // 1000000000;

// ----------------------------------------------------------------------------
__inline uint8_t is_circular_identical(const uint64_t anchor_ring, uint64_t test_ring)
{
    // By trial and error, try to synch 2 circular lists by holding one constant
    //   and turning the other 0 to LIST_LENGTH positions. Return compare count.

    // Return the number of tries which aligned the circularly identical rings,
    //  where any non-zero value is treated as a bool TRUE. Return a zero/FALSE,
    //  if all tries failed to find a sequence match.
    // If anchor_ring and test_ring are equal to start with, return one.

    for (uint8_t i = LIST_LENGTH; i; i--)
    {
        // This function could be made bool, returning TRUE or FALSE, but
        // as a debugging tool, knowing the try_knt that got a match is nice.
        if (anchor_ring == test_ring) {  // test all 55 list members simultaneously
            return (LIST_LENGTH + 1) - i;
        }

        if (test_ring % 2) {    // ring's tail is 1 ?
            test_ring /= 2;     // right-shift 1 bit
            // if the ring tail was 1, set head to 1 to simulate wrapping
            test_ring += head_bit;
        } else {                // ring's tail must be 0
            test_ring /= 2;     // right-shift 1 bit
            // if the ring tail was 0, doing nothing leaves head a 0
        }
    }
    // if we got here, they can't be circularly identical
    return 0;
}
// ----------------------------------------------------------------------------
int main(void)
{
    clock_t start = clock();
    uint64_t anchor, test_ring, i, milliseconds;
    uint8_t try_knt;

    anchor = 31525197391593472; // bits 55,54,53 set true, all others false
    // Anchor right-shifted LIST_LENGTH/2 represents the average search turns
    test_ring = anchor >> (1 + (LIST_LENGTH / 2)); // 117440512;

    printf("\n\nRunning benchmarks for %llu loops.", LOOP_KNT);
    start = clock();
    for (i = LOOP_KNT; i; i--) {
        try_knt = is_circular_identical(anchor, test_ring);
        // The shifting of test_ring below is a test fixture to prevent the
        //  optimizer from optimizing the loop away and returning instantly
        if (i % 2) {
            test_ring /= 2;
        } else {
            test_ring *= 2;
        }
    }
    // clock() ticks are treated as milliseconds here (true for MSVC)
    milliseconds = (uint64_t)(clock() - start);
    printf("\nET for is_circular_identical was %f milliseconds."
        "\n\tLast try_knt was %u for test_ring list %llu",
        (double)milliseconds, try_knt, test_ring);

    printf("\nConsuming %7.1f clocks per list.\n",
        (double)((milliseconds * (CPU_FREQ / 1000)) / (uint64_t)LOOP_KNT));

    getchar();
    return 0;
}

小开

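Repeat the first array, then use the Z algorithm (O(n) time) to find the second array inside the first.

(Note: you don't have to physically copy the first array. You can just wrap around during matching.)

The nice thing about the Z algorithm is that it's very simple compared to KMP, BM, etc.
However, if you're feeling ambitious, you could do string matching in linear time and constant space; strstr, for example, does this. Implementing it would be more painful, though.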
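As a rough illustration of the Z-algorithm approach, here is a sketch in Python. The function names `z_array` and `is_rotation_z` are made up for this example, and for simplicity it does physically build the doubled list rather than wrapping indices during matching.

```python
def z_array(seq):
    """z[i] = length of the longest common prefix of seq and seq[i:]."""
    n = len(seq)
    z = [0] * n
    if n:
        z[0] = n
    left = right = 0  # [left, right) is the rightmost window matching a prefix
    for i in range(1, n):
        if i < right:
            z[i] = min(right - i, z[i - left])
        while i + z[i] < n and seq[z[i]] == seq[i + z[i]]:
            z[i] += 1
        if i + z[i] > right:
            left, right = i, i + z[i]
    return z


def is_rotation_z(first, second):
    """Check whether `second` occurs inside `first` repeated twice."""
    if len(first) != len(second):
        return False
    n = len(second)
    if n == 0:
        return True
    sentinel = object()  # compares unequal to every list element
    combined = list(second) + [sentinel] + list(first) + list(first)
    z = z_array(combined)
    # a Z value of n inside the doubled part means a full match of `second`
    return any(length >= n for length in z[n + 1:])
```

For example, `is_rotation_z([1, 1, 1, 0, 0], [1, 1, 0, 0, 1])` evaluates to `True`. The sentinel stops a match from running across the boundary between the pattern and the text.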

小开

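Reading between the lines, it sounds as though you're trying to enumerate one representative of each circular equivalence class of strings with 3 ones and 52 zeros. Let's switch from a dense representation to a sparse one (a set of three numbers in range(55)). In this representation, the circular shift of s by k is given by the comprehension set((i + k) % 55 for i in s). The lexicographically minimal representative of a class always contains the position 0. Given a set of the form {0, i, j} with 0 < i < j, the other candidates for the minimum in the class are {0, j - i, 55 - i} and {0, 55 - j, 55 + i - j}. Hence, we need (i, j) <= min((j - i, 55 - i), (55 - j, 55 + i - j)) for the original to be the minimum. Here's some enumeration code.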

def makereps():
    reps = []
    for i in range(1, 55 - 1):
        for j in range(i + 1, 55):
            if (i, j) <= min((j - i, 55 - i), (55 - j, 55 + i - j)):
                reps.append('1' + '0' * (i - 1) + '1' + '0' * (j - i - 1) + '1' + '0' * (55 - j - 1))
    return reps

小开

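Given that you need to do so many comparisons, might it be worth your while to take an initial pass through your lists and convert them into some sort of canonical form that can be compared easily?

Are you trying to get a set of circularly unique lists? If so, you can throw them into a set after converting them to tuples.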

def normalise(lst):
    # Pick the 'maximum' out of all cyclic options
    return max([lst[i:] + lst[:i] for i in range(len(lst))])

# `a` is the list of lists from the question
a_normalised = map(normalise, a)
a_tuples = map(tuple, a_normalised)
a_unique = set(a_tuples)

Apologies to David Eisenstat for not spotting his very similar answer.

小开

You can roll one list like this:

list1, list2 = [0,1,1,1,0,0,1,0], [1,0,0,1,0,0,1,1]

str_list1 = "".join(map(str, list1))
str_list2 = "".join(map(str, list2))

def rotate(string_to_rotate):
    # collect every rotation of the string, starting with the string itself
    result = [string_to_rotate]
    for i in range(1, len(string_to_rotate)):
        result.append(result[-1][1:] + result[-1][0])
    return result

for x in rotate(str_list1):
    if x == str_list2:
        print("lists are rotationally identical")
        break

小开

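First convert every one of your list elements (in a copy if necessary) to the rotated version that is lexically greatest.

Then sort the resulting list of lists (retaining an index into the original list position) and unify the sorted list, marking all the duplicates in the original list as needed.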
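The two steps above could look roughly like this in Python. This is a sketch under assumptions: the helper names `max_rotation` and `find_duplicates` are invented here, the inputs are plain Python lists, and the lexically greatest rotation is found by brute force over all rotations.

```python
def max_rotation(lst):
    """Return the lexically greatest rotation of lst as a tuple."""
    if not lst:
        return ()
    return max(tuple(lst[i:] + lst[:i]) for i in range(len(lst)))


def find_duplicates(lists):
    """Return (index, index_of_first_equal_list) pairs for circular duplicates."""
    # sort the normalized lists, keeping each one's original position
    keyed = sorted((max_rotation(lst), idx) for idx, lst in enumerate(lists))
    duplicates = []
    group_key, group_first = None, None
    for key, idx in keyed:
        if key == group_key:
            duplicates.append((idx, group_first))  # same class as an earlier list
        else:
            group_key, group_first = key, idx      # a new class starts here
    return duplicates
```

With `[[1, 0, 0, 1], [1, 1, 0, 0], [1, 0, 1, 0], [0, 0, 1, 1]]` as input, `find_duplicates` returns `[(1, 0), (3, 0)]`: the lists at positions 1 and 3 are rotations of the list at position 0.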

小开

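Not a complete, free-standing answer, but on the topic of optimizing by reducing comparisons, I too was thinking of normalized representations.

Namely, if your input alphabet is {0, 1}, you could reduce the number of allowed permutations significantly. Rotate the first list to a (pseudo-)normalized form (given the distribution in your question, I would pick one where one of the 1 bits is on the extreme left and one of the 0 bits is on the extreme right). Now, before each comparison, successively rotate the other list through the possible positions with the same alignment pattern.

For example, if you have a total of four 1 bits, there can be at most 4 permutations with this alignment, and if you have clusters of adjacent 1 bits, each additional bit in such a cluster reduces the number of positions.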

List 1   1 1 1 0 1 0

List 2   1 0 1 1 1 0   1st permutation
         1 1 1 0 1 0   2nd permutation, final permutation, match, done

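This generalizes to larger alphabets and different alignment patterns; the main challenge is to find a good normalization with only a few possible representations. Ideally, it would be a proper normalization with a single unique representation, but given the problem, I don't think that's possible.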

小开

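Building further on RocketRoy's answer: convert all your lists up front to unsigned 64-bit numbers. For each list, rotate those 55 bits around to find the smallest numerical value.

You are now left with a single unsigned 64-bit value for each list that you can compare directly with the values of the other lists. The function is_circular_identical() is not needed any more.

(In essence, you create an identity value for your lists that is not affected by the rotation of the list elements.) This would even work if you have an arbitrary number of ones in your lists.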
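A small Python sketch of this identity value, assuming each list is given as a sequence of 0/1 values. The function name `canonical_value` is hypothetical, and the ring width is taken from the list length (55 in the question).

```python
def canonical_value(bits):
    """Smallest integer over all rotations of a ring given as a list of 0/1 values."""
    width = len(bits)
    value = 0
    for bit in bits:
        value = (value << 1) | bit
    mask = (1 << width) - 1
    best = value
    for _ in range(width - 1):
        # rotate right by one: the lowest bit wraps around to the highest position
        value = ((value >> 1) | (value << (width - 1))) & mask
        best = min(best, value)
    return best
```

Two lists are then circularly identical exactly when they have the same length and the same `canonical_value`, so the values can go straight into a set or dictionary.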

小开

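This is the same idea as Salvador Dali's, but it doesn't need the string conversion. Behind it is the same KMP recover idea to avoid inspecting impossible shifts. You only call KmpModified's Search(list1, list2 + list2).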

using System;
using System.Collections.Generic;
using System.Linq;
using Xunit;

public class KmpModified
{
    public int[] CalculatePhi(int[] pattern)
    {
        var phi = new int[pattern.Length + 1];
        phi[0] = -1;
        phi[1] = 0;

        int pos = 1, cnd = 0;
        while (pos < pattern.Length)
            if (pattern[pos] == pattern[cnd])
            {
                cnd++;
                phi[pos + 1] = cnd;
                pos++;
            }
            else if (cnd > 0)
                cnd = phi[cnd];
            else
            {
                phi[pos + 1] = 0;
                pos++;
            }

        return phi;
    }

    public IEnumerable<int> Search(int[] pattern, int[] list)
    {
        var phi = CalculatePhi(pattern);

        int m = 0, i = 0;
        while (m < list.Length)
            if (pattern[i] == list[m])
            {
                i++;
                if (i == pattern.Length)
                {
                    yield return m - i + 1;
                    i = phi[i];
                }
                m++;
            }
            else if (i > 0)
            {
                i = phi[i];
            }
            else
            {
                i = 0;
                m++;
            }
    }

    [Fact]
    public void BasicTest()
    {
        var pattern = new[] { 1, 1, 10 };
        var list = new[] {2, 4, 1, 1, 1, 10, 1, 5, 1, 1, 10, 9};
        var matches = Search(pattern, list).ToList();

        Assert.Equal(new[] {3, 8}, matches);
    }

    [Fact]
    public void SolveProblem()
    {
        var random = new Random();
        var list = new int[10];
        for (var k = 0; k < list.Length; k++)
            list[k] = random.Next();

        var rotation = new int[list.Length];
        for (var k = 1; k < list.Length; k++)
            rotation[k - 1] = list[k];
        rotation[rotation.Length - 1] = list[0];

        Assert.True(Search(list, rotation.Concat(rotation).ToArray()).Any());
    }
}

Hope this helps!

小开

To stick to the most Pythonic way to do it, use sets!

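# Note: a set ignores order and multiplicity, so this only checks that the
# same values occur in both lists, not that one is a rotation of the other.
a = set([1, 1, 1, 0, 0])
b = set([0, 1, 1, 1, 0])
c = set([1, 0, 0, 1, 1])

a == b       # True
a == b == c  # True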

小开

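Simplifying the Problem

The problem consists of a list of ordered items

The domain of values is binary (0,1)

We can reduce the problem by mapping consecutive 1s into a count

and consecutive 0s into a negative count

Example

A = [ 1, 1, 1, 0, 0, 1, 1, 0 ]
B = [ 1, 1, 0, 1, 1, 1, 0, 0 ]
~
A = [ +3, -2, +2, -1 ]
B = [ +2, -1, +3, -2 ]

This process requires that the first item and the last item be different

This will reduce the number of comparisons overall

Checking Process

If we assume that they are duplicates, then we can assume what we are looking for

Basically, the first item of the first list must exist somewhere in the other list

Followed by what follows it in the first list, and in the same manner

The preceding items should be the last items of the first list

Since it's circular, the order is the same

The Grip

The question here is where to start, technically known as lookup and look-ahead

We will just check where the first element of the first list exists in the second list

The probability of a frequent element is lower, given that we mapped the lists into histograms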

Pseudo-Code

FUNCTION IS_DUPLICATE (LIST L1, LIST L2) : BOOLEAN

    LIST A = MAP_LIST(L1)
    LIST B = MAP_LIST(L2)
    LIST ALPHA = LOOKUP_INDEX(B, A[0])

    IF A.SIZE != B.SIZE
       OR COUNT_CHAR(A, 0) != COUNT_CHAR(B, ALPHA[0]) THEN
        RETURN FALSE
    END IF

    FOR EACH INDEX IN ALPHA
        IF ALPHA_NGRAM(A, B, INDEX, 1) THEN
            IF IS_DUPLICATE(A, B, INDEX) THEN
                RETURN TRUE
            END IF
        END IF
    END FOR

    RETURN FALSE

END FUNCTION

FUNCTION IS_DUPLICATE (LIST L1, LIST L2, INTEGER INDEX) : BOOLEAN

    INTEGER I = 0

    WHILE I < L1.SIZE DO
        IF L1[I] != L2[(INDEX+I)%L2.SIZE] THEN
            RETURN FALSE
        END IF
        I = I + 1
    END WHILE

    RETURN TRUE

END FUNCTION

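Functions

MAP_LIST(LIST A):LIST maps consecutive elements to counts in a new list

LOOKUP_INDEX(LIST A, INTEGER E):LIST returns the list of indices where the element E exists in the list A

COUNT_CHAR(LIST A, INTEGER E):INTEGER counts how many times an element E occurs in a list A

ALPHA_NGRAM(LIST A, LIST B, INTEGER I, INTEGER N):BOOLEAN checks whether B[I] is equivalent to the A[0] N-gram in both directions

Finally

If the list size is going to be pretty large, or if the element we start checking the cycle from occurs very frequently, then we can do the following:

Look for the least frequent item in the first list to start with

Increase the n-gram N parameter to lower the probability of going through the linear check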
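The MAP_LIST step described above can be sketched in Python as follows. This is an illustration under assumptions, not the answer's own implementation: `map_list` and `same_cycle` are hypothetical names, the input is a list of 0/1 values, and the final comparison simply tries every rotation of the mapped list instead of the lookup and n-gram checks.

```python
from itertools import groupby


def map_list(bits):
    """Map runs of 1s to positive counts and runs of 0s to negative counts.

    The list is first rotated so that its first and last items differ,
    as the answer requires; a constant list maps to a single run.
    """
    n = len(bits)
    # find a position where a new run starts (bits[i - 1] != bits[i])
    start = next((i for i in range(n) if bits[i - 1] != bits[i]), 0)
    rotated = bits[start:] + bits[:start]
    return [len(list(run)) if value else -len(list(run))
            for value, run in groupby(rotated)]


def same_cycle(a, b):
    """Compare two 0/1 lists through their (shorter) run-length maps."""
    ma, mb = map_list(a), map_list(b)
    if len(a) != len(b) or len(ma) != len(mb):
        return False
    return any(ma == mb[i:] + mb[:i] for i in range(max(len(mb), 1)))
```

For the example lists, `map_list([1, 1, 1, 0, 0, 1, 1, 0])` gives `[3, -2, 2, -1]` and `map_list([1, 1, 0, 1, 1, 1, 0, 0])` gives `[2, -1, 3, -2]`, so only four items per list need to be compared instead of eight.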

小开

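An efficient, quick-to-compute "canonical form" for the lists in question can be derived as follows:

Count the number of zeros between the ones (ignoring wrap-around), to get three numbers.

Rotate the three numbers so that the biggest number is first.

The first number (a) must be between 18 and 52 (inclusive). Re-encode it as between 0 and 34.

The second number (b) must be between 0 and 26, but it doesn't matter much.

Drop the third number, since it's just 52 - (a + b) and adds no information.

The canonical form is the integer b * 35 + a, which is between 0 and 936 (inclusive), which is fairly compact (there are 477 circularly unique lists in total).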
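The steps above might be sketched in Python like this. The function name `canonical_code` is invented for the example, and two details are assumptions rather than part of the answer: the three gaps are counted around the ring (so the gap across the wrap is one of the three numbers), and ties for the biggest number are broken by taking the lexicographically greatest rotation of the three gaps.

```python
def canonical_code(bits):
    """Encode a 55-item 0/1 list with exactly three 1s as b * 35 + (a - 18)."""
    n = len(bits)
    ones = [i for i, bit in enumerate(bits) if bit]
    # number of zeros between consecutive ones, going around the ring
    gaps = [(ones[(k + 1) % 3] - ones[k] - 1) % n for k in range(3)]
    # biggest gap first; on ties the rotation with the larger second gap wins
    a, b, _ = max(gaps[k:] + gaps[:k] for k in range(3))
    return b * 35 + (a - 18)
```

Because the code depends only on the cyclic sequence of gaps, every rotation of a list yields the same integer, which can then be used directly as a dictionary or set key.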

小开

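I wrote a straightforward solution which compares both lists and just increases (and wraps around) the index of the compared value on each iteration.

I don't know Python well, so I wrote it in Java, but it's really simple, so it should be easy to adapt to any other language.

With this approach you can also compare lists of other types.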

public class Main {

    public static void main(String[] args) {
        int[] a = {0,1,1,1,0};
        int[] b = {1,1,0,0,1};

        System.out.println(isCircularIdentical(a, b));
    }

    public static boolean isCircularIdentical(int[] a, int[] b) {
        if (a.length != b.length) {
            return false;
        }

        // The outer loop is for the increase of the index of the second list
        outer:
        for (int i = 0; i < a.length; i++) {
            // Loop through the list and compare each value to the according value of the second list
            for (int k = 0; k < a.length; k++) {
                // I use modulo length to wrap around the index
                if (a[k] != b[(k + i) % a.length]) {
                    // If the values do not match I continue and shift the index one further
                    continue outer;
                }
            }
            return true;
        }
        return false;
    }
}

小开

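As others have mentioned, once you find the normalized rotation of a list, you can compare them.

Here is some working code that does this. The basic method is to find a normalized rotation for each list and compare:

Calculate a normalized rotation index on each list.

Loop over both lists with their offsets, comparing each item and returning if they mismatch.

Note that this method doesn't depend on numbers; you can pass in lists of strings (any values which can be compared).

Instead of doing a list-in-list search, we know we want the list to start with the minimum value, so we can loop over the minimum values, searching until we find which one has the lowest successive values, and storing this for further comparisons until we have the best.

There are many opportunities to exit early when calculating the index; details on some optimizations:

Skip searching for the best minimum value when there is only one.

Skip searching minimum values when the previous value is also a minimum value (it will never be a better match).

Skip searching when all values are the same.

Fail early when the lists have different minimum values.

Use regular comparison when the offsets match.

Adjust offsets to avoid wrapping the index values on one of the lists during comparison.

Note that in Python a list-in-list search may well be faster; however, I was interested in finding an efficient algorithm which could be used in other languages too. Also, there is some advantage to avoiding creating new lists.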

def normalize_rotation_index(ls, v_min_other=None):
    """ Return the index or -1 (when the minimum is above `v_min_other`) """

    if len(ls) <= 1:
        return 0

    def compare_rotations(i_a, i_b):
        """ Return True when i_a is smaller.
            Note: unless there are large duplicate sections of identical values,
            this loop will exit early on.
        """
        for offset in range(1, len(ls)):
            v_a = ls[(i_a + offset) % len(ls)]
            v_b = ls[(i_b + offset) % len(ls)]
            if v_a < v_b:
                return True
            elif v_a > v_b:
                return False
        return False

    v_min = ls[0]
    i_best_first = 0
    i_best_last = 0
    i_best_total = 1
    for i in range(1, len(ls)):
        v = ls[i]
        if v_min > v:
            v_min = v
            i_best_first = i
            i_best_last = i
            i_best_total = 1
        elif v_min == v:
            i_best_last = i
            i_best_total += 1

    # all values match
    if i_best_total == len(ls):
        return 0

    # exit early if we're not matching another lists minimum
    if v_min_other is not None:
        if v_min != v_min_other:
            return -1
    # simple case, only one minimum
    if i_best_first == i_best_last:
        return i_best_first

    # otherwise find the minimum with the lowest values compared to all others.
    # start looking after the first we've found
    i_best = i_best_first
    for i in range(i_best_first + 1, i_best_last + 1):
        if (ls[i] == v_min) and (ls[i - 1] != v_min):
            if compare_rotations(i, i_best):
                i_best = i

    return i_best


def compare_circular_lists(ls_a, ls_b):
    # sanity checks
    if len(ls_a) != len(ls_b):
        return False
    if len(ls_a) <= 1:
        return (ls_a == ls_b)

    index_a = normalize_rotation_index(ls_a)
    index_b = normalize_rotation_index(ls_b, ls_a[index_a])

    if index_b == -1:
        return False

    if index_a == index_b:
        return (ls_a == ls_b)

    # cancel out 'index_a'
    index_b = (index_b - index_a)
    if index_b < 0:
        index_b += len(ls_a)
    index_a = 0  # ignore it

    # compare rotated lists
    for i in range(len(ls_a)):
        if ls_a[i] != ls_b[(index_b + i) % len(ls_b)]:
            return False
    return True


assert(compare_circular_lists([0, 9, -1, 2, -1], [-1, 2, -1, 0, 9]) == True)
assert(compare_circular_lists([2, 9, -1, 0, -1], [-1, 2, -1, 0, 9]) == False)
assert(compare_circular_lists(["Hello", "Circular", "World"], ["World", "Hello", "Circular"]) == True)
assert(compare_circular_lists(["Hello", "Circular", "World"], ["Circular", "Hello", "World"]) == False)

See: this snippet for some more tests/examples.

小开

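You can check pretty easily whether a list A is equal to a cyclic shift of a list B in expected O(N) time.

I would use a polynomial hash function to compute the hash of list A and of every cyclic shift of list B. Where a shift of list B has the same hash as list A, I'd compare the actual elements to see if they are equal.

The reason this is fast is that with polynomial hash functions (which are extremely common!), you can calculate the hash of each cyclic shift from the previous one in constant time, so you can calculate the hashes of all the cyclic shifts in O(N) time.

It works like this: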

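Let's say B has N elements; then the hash of B using prime P is:

Hb=0;
for (i=0; i<N ; i++)
{
    Hb = Hb*P + B[i];
}

This is an optimized way to evaluate a polynomial in P, and is equivalent to:

Hb=0;
for (i=0; i<N ; i++)
{
    Hb += B[i] * P^(N-1-i);  //^ is exponentiation, not XOR
}

Notice how every B[i] is multiplied by P^(N-1-i). If we shift B to the left by 1, then every B[i] will be multiplied by an extra P, except the first one. Since multiplication distributes over addition, we can multiply all the components at once just by multiplying the whole hash, and then fix up the factor for the first element.

The hash of the left shift of B is just

Hb1 = Hb*P + B[0]*(1-(P^N))

The second left shift:

Hb2 = Hb1*P + B[1]*(1-(P^N))

and so on.

Note: all the math above is performed modulo some machine word size, and you only have to calculate P^N once.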
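Here is a rough Python sketch of the rolling-hash check described above. The function name `is_rotation_hash`, the prime `p` and the modulus are illustrative choices of this sketch, not values from the answer; it works modulo a large prime instead of the machine word size, and it assumes integer list elements.

```python
def is_rotation_hash(a, b, p=1_000_003, mod=(1 << 61) - 1):
    """Expected O(N) check that list a equals some cyclic shift of list b."""
    n = len(a)
    if n != len(b):
        return False
    if n == 0:
        return True

    def poly_hash(seq):
        h = 0
        for x in seq:
            h = (h * p + x) % mod
        return h

    ha = poly_hash(a)
    hb = poly_hash(b)
    fix = (1 - pow(p, n, mod)) % mod  # the (1 - P^N) factor, computed once
    for shift in range(n):
        # on a hash hit, compare the actual elements to rule out a collision
        if hb == ha and b[shift:] + b[:shift] == a:
            return True
        # hash of the next left shift: Hb*P + B[shift]*(1 - P^N)
        hb = (hb * p + b[shift] * fix) % mod
    return False
```

The element-by-element comparison only runs when the hashes agree, so a genuine match returns immediately and only a hash collision costs an extra O(N) pass.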